Shahab Bakhtiari
@shahabbakht.bsky.social
6.1K followers 1K following 1.2K posts
|| assistant prof at University of Montreal || leading the systems neuroscience and AI lab (SNAIL: https://www.snailab.ca/) 🐌 || associate academic member of Mila (Quebec AI Institute) || #NeuroAI || vision and learning in brains and machines
Pinned
shahabbakht.bsky.social
So excited to see this preprint released from the lab into the wild.

Charlotte has developed a theory of how the learning curriculum influences the generalization of learning.
Our theory makes straightforward neural predictions that can be tested in future experiments. (1/4)

🧠🤖 🧠📈 #MLSky
charlottevolk.bsky.social
🚨 New preprint alert!

🧠🤖
We propose a theory of how learning curriculum affects generalization through neural population dimensionality. Learning curriculum is a determining factor of neural dimensionality - where you start from determines where you end up.
🧠📈

A 🧵:

tinyurl.com/yr8tawj3
The curriculum effect in visual learning: the role of readout dimensionality
Generalization of visual perceptual learning (VPL) to unseen conditions varies across tasks. Previous work suggests that training curriculum may be integral to generalization, yet a theoretical explan...
tinyurl.com
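For a concrete handle on the "neural population dimensionality" mentioned in the quoted post, here is a minimal sketch, not the preprint's code: the participation ratio is one standard way to quantify the effective dimensionality of population activity or of a readout. The toy low-D vs. high-D comparison below is an illustrative assumption, not an analysis from the paper.

```python
# Minimal sketch (not the preprint's code): participation ratio as a standard
# measure of the effective dimensionality of neural population activity.
import numpy as np

def participation_ratio(activity):
    """activity: (n_trials, n_neurons). Returns the effective dimensionality."""
    centered = activity - activity.mean(axis=0, keepdims=True)
    eigvals = np.clip(np.linalg.eigvalsh(np.cov(centered, rowvar=False)), 0, None)
    return eigvals.sum() ** 2 / (eigvals ** 2).sum()

# Hypothetical comparison: activity confined to a ~2-D readout vs. a ~40-D one.
rng = np.random.default_rng(0)
low_d  = rng.normal(size=(500, 2))  @ rng.normal(size=(2, 100))
high_d = rng.normal(size=(500, 40)) @ rng.normal(size=(40, 100))
print(participation_ratio(low_d), participation_ratio(high_d))  # roughly 2 vs. 30+
```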
shahabbakht.bsky.social
I don't see a direct causal path, but pessimistically speaking, when bubbles burst they often leave subconscious biases against the bubbled topic, e.g., in evaluation committees. In other words, the current abundance of AI funding (relative to other fields) might not last.
shahabbakht.bsky.social
What if the bubble collapse also takes down our funding so we can't even afford H100s at half price?! :)
Reposted by Shahab Bakhtiari
drlaschowski.bsky.social
Imagine a brain decoding algorithm that could generalize across different subjects and tasks. Today, we’re one step closer to achieving that vision.

Introducing the flagship paper of our brain decoding program: www.biorxiv.org/content/10.1...
#neuroAI #compneuro @utoronto.ca @uhn.ca
Reposted by Shahab Bakhtiari
sushrutthorat.bsky.social
and the low-D part has been on the horizon for a while now - proceedings.neurips.cc/paper/2019/h... - given complex numbers you can go loooowwww haha (O(1)). Also this is linked to top-down attention: arxiv.org/abs/1907.12309 , arxiv.org/abs/2502.15634 - which is a low-D modulation (O(N) vs O(N^2)).
Superposition of many models into one
proceedings.neurips.cc
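To make the O(N) vs O(N^2) contrast concrete, a rough sketch (my own illustration, not code from the linked papers): modulating a fixed layer with a per-unit gain, attention-style, introduces only N parameters, whereas rewriting the weight matrix itself would cost N^2.

```python
# Rough sketch of low-D (O(N)) modulation of a fixed layer, vs. the O(N^2)
# cost of changing the full weight matrix. Illustrative only.
import numpy as np

N = 512
W = np.random.randn(N, N) / np.sqrt(N)   # fixed weights: N^2 parameters
gain = np.random.rand(N)                 # per-unit top-down gain: N parameters

def layer(x, g):
    # The gain multiplicatively modulates the layer's output units.
    return g * np.tanh(W @ x)

x = np.random.randn(N)
baseline = layer(x, np.ones(N))   # unmodulated response
modulated = layer(x, gain)        # same weights, low-D change in behavior
```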
shahabbakht.bsky.social
Yeah, it all makes sense in hindsight. I think the low-rank structure of weight updates was actually the rationale behind LoRA when it was proposed.
Reposted by Shahab Bakhtiari
sgray.bsky.social
This seems to imply that, with a large enough context and the right prompt, you could “prototype” a LoRA in-context before creating it?

As someone removed from the low-level implementation details, this speaks to something I've long wondered:

Could you “freeze” a context to create a LoRA from it?
shahabbakht.bsky.social
Interesting paper suggesting a mechanism for why in-context learning happens in LLMs.

They show that LLMs implicitly apply an internal low-rank weight update adjusted by the context. It's cheap (due to the low rank) but effective for adapting the model's behavior.

#MLSky

arxiv.org/abs/2507.16003
Learning without training: The implicit dynamics of in-context learning
One of the most striking features of Large Language Models (LLM) is their ability to learn in context. Namely at inference time an LLM is able to learn new patterns without any additional weight updat...
arxiv.org
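For intuition only, a toy sketch of the general idea (not the paper's derivation): a frozen linear layer can be adapted at inference time by a context-dependent low-rank update A @ B with rank r much smaller than the hidden size, so forming and applying the adjustment stays cheap. The `contextual_update` helper below is a hypothetical stand-in for whatever function of the prompt the model implicitly computes.

```python
# Toy illustration (not the paper's construction): adapting a frozen linear
# map at inference time with a context-dependent low-rank update W + A @ B.
import numpy as np

d, r = 1024, 4                                   # hidden size, update rank (r << d)
W = np.random.randn(d, d) / np.sqrt(d)           # frozen "pretrained" weights

def contextual_update(context_vecs, rank=r):
    # Hypothetical stand-in: derive low-rank factors from a few context vectors.
    A = context_vecs[:, :rank]                   # (d, r)
    B = context_vecs[:, :rank].T / rank          # (r, d)
    return A, B

ctx = np.random.randn(d, 16)                     # pretend these summarize the prompt
A, B = contextual_update(ctx)
x = np.random.randn(d)
y = W @ x + A @ (B @ x)   # behaves like (W + A @ B) @ x, with no weight training
```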
shahabbakht.bsky.social
Interesting connection to the recent Thinking Machines blog post on LoRA: thinkingmachines.ai/blog/lora/

Both seem to suggest that low-rank weight adjustments are sufficient for model adaptation, whether explicitly (LoRA fine-tuning) or implicitly (in-context learning).
LoRA Without Regret
How LoRA matches full training performance more broadly than expected.
thinkingmachines.ai
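By contrast, LoRA makes the same low-rank structure explicit and trainable. A minimal sketch of a LoRA-style layer, assuming PyTorch and not taken from the blog post: the base weight stays frozen and only the rank-r factors A and B are learned.

```python
# Minimal LoRA-style layer sketch (illustrative, not the blog's code): the
# pretrained weight is frozen; only the low-rank factors A and B are trained.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, d_in, d_out, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)        # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))  # zero init: adapter starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(768, 768)
out = layer(torch.randn(2, 768))   # gradients flow only through A and B
```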
shahabbakht.bsky.social
To be fair, most people don’t understand biology that well when it comes to the computational role of evolution, but then again, most people don’t make such strong claims either.
shahabbakht.bsky.social
The bitter lesson of the bitter lesson :)
shahabbakht.bsky.social
The charitable take would be that he’s arguing against the typical audience of Patel’s podcast, who might need to move away from LLM extremism a bit.
shahabbakht.bsky.social
It does feel a lot like that in this interview actually. He seems to be pushing a strong position for purely experience-dependent intelligence.

Though I just remembered this bit from their ‘reward is enough’ paper, which makes the notion of reward so wide it becomes almost meaningless.
shahabbakht.bsky.social
Though it’s surprising how much he downplays the role of evolution in bootstrapping animal learning and intelligence.
shahabbakht.bsky.social
The way Sutton himself interprets the “bitter lesson” in this interview definitely caught a lot of bitter lesson enthusiasts off guard.
LLMs not actually being an example of the bitter lesson was quite a nuance no one saw coming.

youtu.be/21EYKqUsPfg?...
Richard Sutton – Father of RL thinks LLMs are a dead end
YouTube video by Dwarkesh Patel
youtu.be
shahabbakht.bsky.social
Lone scientists are fictional creatures.
marspidermonkey.bsky.social
When we hear about a lone scientist who made groundbreaking discoveries on their own, it’s usually erasing the truth that science is a team sport, and field research builds on the local knowledge and expertise of the people that live there (8/10)
Reposted by Shahab Bakhtiari
marspidermonkey.bsky.social
As a primatologist, Jane Goodall was a huge inspiration to me. I admired the way she describes chimpanzee behavior with such detail and empathy, and she’s inspired so many people and advocated for chimpanzee conservation and welfare.

However, I'm dismayed at what her narrative leaves out (1/10)
Photo of Jane Goodall in the center, signing a book, with three women standing slightly hunched behind her. A very young Michelle is to the right, smiling.
Reposted by Shahab Bakhtiari
charlottevolk.bsky.social
🚨 New preprint alert!

🧠🤖
We propose a theory of how learning curriculum affects generalization through neural population dimensionality. Learning curriculum is a determining factor of neural dimensionality - where you start from determines where you end up.
🧠📈

A 🧵:

tinyurl.com/yr8tawj3
The curriculum effect in visual learning: the role of readout dimensionality
Generalization of visual perceptual learning (VPL) to unseen conditions varies across tasks. Previous work suggests that training curriculum may be integral to generalization, yet a theoretical explan...
tinyurl.com
shahabbakht.bsky.social
This is also a very special paper to me: it's the first project I initiated in my lab, motivated by a long-standing curiosity to compare continual visual learning in humans and ANNs.

Big thanks to @charlottevolk.bsky.social for getting it to the finish line. (4/4)
shahabbakht.bsky.social
I'd value proven-wrong predictions even more, as they open up a path to 'what should be changed in the ANN' – a new direction of change in architectures, objectives, or learning rules that can evolve *independently* of advances in AI, but who knows… maybe with implications for AI. (3/4)
shahabbakht.bsky.social
On a broader note: Our ANN-based models should generate bold predictions with direct prescriptions for how to test them. This is how I believe #NeuroAI can avoid becoming an isolated, self-referential domain of neuroscience, and the only way to flywheel NeuroAI into a theory of the brain. (2/4)
shahabbakht.bsky.social
So excited to see this preprint released from the lab into the wild.

Charlotte has developed a theory of how the learning curriculum influences the generalization of learning.
Our theory makes straightforward neural predictions that can be tested in future experiments. (1/4)

🧠🤖 🧠📈 #MLSky
Reposted by Shahab Bakhtiari
cpaxton.bsky.social
The length of tasks that AI can complete at a 50% success rate continues to increase
Reposted by Shahab Bakhtiari
neurograce.bsky.social
FYI, if you're an educator looking for videos about the brain and/or how neuroscience is done, check out
@brainfacts.org's YouTube page: youtube.com/@brainfactsorg
It includes submissions to their Brain Awareness Week video contest, which can be quite fun! #neuroskyence
BrainFacts.org
BrainFacts.org is an authoritative source of information about the brain and nervous system for the public. The site is a public information initiative of The Kavli Foundation, the Gatsby Charitable F...
youtube.com