Shahab Bakhtiari
@shahabbakht.bsky.social
6.1K followers 1.1K following 1.2K posts
|| assistant prof at University of Montreal || leading the systems neuroscience and AI lab (SNAIL: https://www.snailab.ca/) 🐌 || associate academic member of Mila (Quebec AI Institute) || #NeuroAI || vision and learning in brains and machines
Pinned
shahabbakht.bsky.social
So excited to see this preprint released from the lab into the wild.

Charlotte has developed a theory of how the learning curriculum influences the generalization of learning.
Our theory makes straightforward neural predictions that can be tested in future experiments. (1/4)

🧠🤖 🧠📈 #MLSky
charlottevolk.bsky.social
🚨 New preprint alert!

🧠🤖
We propose a theory of how learning curriculum affects generalization through neural population dimensionality. Learning curriculum is a determining factor of neural dimensionality - where you start from determines where you end up.
🧠📈

A 🧵:

tinyurl.com/yr8tawj3
The curriculum effect in visual learning: the role of readout dimensionality
Generalization of visual perceptual learning (VPL) to unseen conditions varies across tasks. Previous work suggests that training curriculum may be integral to generalization, yet a theoretical explan...
tinyurl.com
shahabbakht.bsky.social
It's definitely not 50/50 for me. More like 10/90 ;)
shahabbakht.bsky.social
This feels a lot like systems neuro, honestly. You could hear similar advice there, especially from the more experimentally-oriented minds.
shahabbakht.bsky.social
I guess the whole predictive circuit finding approach can be seen as a convergent evolution, which probably doesn’t scale and generalize outside of the experimental setting?
shahabbakht.bsky.social
Having full observation and control over the studied system is definitely the main advantage of MI. But the unintuitive mess of high-d computation is their shared problem, which seems to need more theories than experiments.
shahabbakht.bsky.social
A systems neuroscientist turned mech interp researcher should write a paper on what the field should absolutely avoid, then observe how thoroughly they’ll be ignored :)

Though what I find intriguing in this domain (watching from afar): its much slower rate of progress compared to the rest of AI.
shahabbakht.bsky.social
Regardless of what explainability/mech interp in AI is actually after, and whether or not they know what they’re searching for, we can confidently say they’re pursuing what systems neuroscience has pursued for decades, with very similar puzzles and confusions.
bayesianboy.bsky.social
What problem is explainability/interpretability research trying to solve in ML, and do you have a favorite paper articulating what that problem is?
shahabbakht.bsky.social
I don't see a direct causal path, but pessimistically speaking, when bubbles burst they often leave subconscious biases against the bubbled topic, e.g., in evaluation committees. In other words, the current abundance of AI funding (relative to other fields) might not last.
shahabbakht.bsky.social
What if the bubble collapse also takes down our funding so we can't even afford H100s at half price?! :)
Reposted by Shahab Bakhtiari
drlaschowski.bsky.social
Imagine a brain decoding algorithm that could generalize across different subjects and tasks. Today, we’re one step closer to achieving that vision.

Introducing the flagship paper of our brain decoding program: www.biorxiv.org/content/10.1...
#neuroAI #compneuro @utoronto.ca @uhn.ca
Reposted by Shahab Bakhtiari
sushrutthorat.bsky.social
and the low-D part has been on the horizon for a while now - proceedings.neurips.cc/paper/2019/h... - given complex numbers you can go loooowwww haha (O(1)). Also this is linked to top-down attention: arxiv.org/abs/1907.12309 , arxiv.org/abs/2502.15634 - which is a low-D modulation (O(N) vs O(N^2)).
Superposition of many models into one
proceedings.neurips.cc
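(A rough NumPy sketch, mine and not from either linked paper, of what the O(N) vs O(N^2) contrast means when modulating a single linear layer; the layer size and gain magnitudes are arbitrary.)

```python
# Illustrative parameter-count comparison for modulating a linear map y = W x,
# with W of shape (N, N). Not code from either linked paper.
import numpy as np

N = 512
rng = np.random.default_rng(0)
W = rng.standard_normal((N, N)) / np.sqrt(N)   # base weights: O(N^2) parameters
x = rng.standard_normal(N)

# Full modulation: a separate multiplicative factor per weight -> O(N^2) extra parameters.
M_full = 1.0 + 0.1 * rng.standard_normal((N, N))
y_full = (W * M_full) @ x

# Low-D (feature-wise) modulation: one gain per unit -> O(N) extra parameters,
# the flavor of top-down attentional gain being discussed in the thread.
g = 1.0 + 0.1 * rng.standard_normal(N)
y_lowd = np.diag(g) @ (W @ x)                  # equivalently g * (W @ x)

print(M_full.size, g.size)                     # 262144 vs 512 modulation parameters
```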
shahabbakht.bsky.social
Yeah, it all makes sense in hindsight. I think the low-d structure of weights was actually the rationale behind LoRA when it was proposed.
Reposted by Shahab Bakhtiari
sgray.bsky.social
This seems to imply that, with a large enough context and the right prompt, you could “prototype” a LoRA in-context before creating it?

As someone removed from the low-level implementation details, this speaks to something I’ve long wondered:

Could you “freeze” a context to create a LoRA from it?
shahabbakht.bsky.social
Interesting paper suggesting a mechanism for why in-context learning happens in LLMs.

They show that LLMs implicitly apply an internal low-rank weight update adjusted by the context. It’s cheap (due to the low rank) but effective for adapting the model’s behavior.

#MLSky

arxiv.org/abs/2507.16003
Learning without training: The implicit dynamics of in-context learning
One of the most striking features of Large Language Models (LLM) is their ability to learn in context. Namely at inference time an LLM is able to learn new patterns without any additional weight updat...
arxiv.org
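(A toy numerical sketch of the flavor of result being described: the change that context induces in an attention output can be folded into a rank-1 update of the downstream weight matrix. This is an illustrative paraphrase with arbitrary dimensions, not the paper's exact construction.)

```python
# Toy check: the effect of context on an attention output can be absorbed into
# a rank-1 update of the weight matrix that consumes it. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
d = 64
W = rng.standard_normal((d, d)) / np.sqrt(d)   # downstream weight matrix
a_no_ctx = rng.standard_normal(d)              # attention output, query alone
a_ctx = rng.standard_normal(d)                 # attention output, query + context

# Rank-1 update built from the context-induced change in the attention output.
delta = a_ctx - a_no_ctx
dW = np.outer(W @ delta, a_no_ctx) / (a_no_ctx @ a_no_ctx)   # rank 1, cheap to form

# Feeding the context-free activation through the "updated" weights reproduces
# the effect of actually having the context present.
assert np.allclose((W + dW) @ a_no_ctx, W @ a_ctx)
print("rank of dW:", np.linalg.matrix_rank(dW))              # 1
```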
shahabbakht.bsky.social
Interesting connection to the recent Thinking Machines blog post on LoRA: thinkingmachines.ai/blog/lora/

Both seem to suggest that low-rank weight adjustments are sufficient for model adaptation, whether explicitly (LoRA fine-tuning) or implicitly (in-context learning).
LoRA Without Regret
How LoRA matches full training performance more broadly than expected.
thinkingmachines.ai
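(For reference, a minimal generic LoRA-style layer, illustrative only and not code from either link: the frozen weight gets an explicit trainable low-rank adjustment B A, the explicit counterpart of the implicit context-driven update above. The rank and scaling are arbitrary choices.)

```python
# Generic LoRA-style linear layer: frozen base weight plus a trainable
# low-rank correction. A sketch, not the code behind either linked source.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, d_in, d_out, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)               # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)   # trainable, rank r
        self.B = nn.Parameter(torch.zeros(d_out, r))          # trainable, starts at zero
        self.scale = alpha / r

    def forward(self, x):
        # y = W x + (alpha/r) * B A x : only O(r * (d_in + d_out)) new parameters
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(1024, 1024, r=8)
n_trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(n_trainable)   # 16384 trainable adapter parameters vs 1,048,576 frozen ones
```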
shahabbakht.bsky.social
To be fair, most people don’t understand biology that well when it comes to the computational role of evolution, but then again, most people don’t make such strong claims either.
shahabbakht.bsky.social
The bitter lesson of the bitter lesson :)
shahabbakht.bsky.social
The charitable take would be that he’s arguing against the typical audience of Patel’s podcast, who might need to move away from LLM extremism a bit.
shahabbakht.bsky.social
It does feel a lot like that in this interview actually. He seems to be pushing a strong position for purely experience-dependent intelligence.

Though I just remembered this bit from their ‘reward is enough’ paper, which makes the notion of reward so wide it becomes almost meaningless.
shahabbakht.bsky.social
Though it’s surprising how much he downplays the role of evolution in bootstrapping animal learning and intelligence.
shahabbakht.bsky.social
The way Sutton himself interprets the “bitter lesson” in this interview definitely caught a lot of bitter lesson enthusiasts off guard.
LLMs not actually being an example of the bitter lesson was quite a nuance no one saw coming.

youtu.be/21EYKqUsPfg?...
Richard Sutton – Father of RL thinks LLMs are a dead end
YouTube video by Dwarkesh Patel
youtu.be