Sushrut Thorat
@sushrutthorat.bsky.social
1.1K followers 190 following 240 posts
Recurrent computations and lifelong learning. Postdoc at IKW-UOS@DE with @timkietzmann.bsky.social Prev. Donders@NL‬, ‪CIMeC@IT‬, IIT-B@IN
sushrutthorat.bsky.social
Also, regarding your "not present in the training distribution" point: neither the Geirhos stimuli nor your diagnostic stimuli are part of the training set either. Extreme generalization is what we usually resort to for interpretability, and that is fine, no?
sushrutthorat.bsky.social
Regardless, the claim that it's not shape (outline, bulk, etc.) but something related to color, texture, etc. that ANNs rely on for classification is a hard one to refute, wouldn't you say? (Because people reading your paper's title would think you're speaking directly against this claim.)
sushrutthorat.bsky.social
careful about declaring that ANNs are not biased toward textures. Also, it is worth mentioning that texture, in the Geirhos setting, is much more loosely defined (it includes color, etc.) than how you or psychophysics expts refer to it. This is exactly where your controlled expts are a great addition.
sushrutthorat.bsky.social
I think your controlled analysis is great and important — sorry if that is not reflected in the directness of my arguments. What I'm trying to say is: given the stark demonstration of what seems to be an inability to rely on global shape when it is embedded in a conflicting texture, I'd be (cont..)
sushrutthorat.bsky.social
Preference vs reliability, yes. But let's say shape was more salient: OK, humans got it but ANNs did not. Let's say texture was more salient: humans still latch onto shape, and ANNs do not (or latch onto texture). Either way, "humans lean more towards shape OR ANNs cannot / do not rely on shape" is the conclusion, no?
sushrutthorat.bsky.social
CNNs relying on the low-freq, non-contour, etc. components can't use them anymore?
sushrutthorat.bsky.social
This could be because of reliance on the non-contour, low-freq components I mentioned earlier, no? Yes, the relative comparison signals a difference, but it could be signaling a different difference: humans biased towards shape might switch to local-texture-based decisions, whereas (cont..)
sushrutthorat.bsky.social
But what remains are shape parts AND texture, and humans/models could rely on either or both of them - hard to disentangle, no?
sushrutthorat.bsky.social
Thanks for engaging :)
In Geirhos's cue-conflict images, the texture doesn't have to be only high-freq. The Gram matrices are aligned across all layers - in the later layers the RF sizes are huge, so the correlations needn't only reflect small-scale variation, as seen in my post.
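For reference, a minimal sketch of the Gram-matrix statistic matched in that style-transfer setup (the feature-map shape here is illustrative, standing in for a VGG-like layer, not the exact setup used in the paper):

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of a conv feature map.

    features: array of shape (C, H, W) from some layer of a CNN.
    Spatial positions are summed out, so the statistic is orderless over
    space; but at deep layers each 'position' already has a large
    receptive field, so matching these correlations also constrains
    fairly large-scale (low-frequency) structure, not just fine texture.
    """
    C, H, W = features.shape
    F = features.reshape(C, H * W)   # flatten spatial dimensions
    return F @ F.T / (H * W)         # (C, C) Gram matrix

# Toy usage with a random tensor standing in for a layer's activations.
feats = np.random.randn(64, 28, 28)
G = gram_matrix(feats)
print(G.shape)  # (64, 64)
```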
sushrutthorat.bsky.social
Would be cool to discuss this with the authors of the paper. The only author I could find was @paolorota.bsky.social though. I'm curious what we come out with as a conclusion when all these observations are taken into account.
sushrutthorat.bsky.social
5. another example highlighting a texture/background bias comes from a forest-vs-trees-like benchmark - bsky.app/profile/timk...
... idk, given the above concerns and these observations, I'm not convinced that "ImageNet-trained CNNs are not biased towards texture"
timkietzmann.bsky.social
Result 2: DVD-training enabled abstract shape recognition in cases where AI frontier models, despite being explicitly prompted, fail spectacularly.

t-SNE nicely visualises the fundamentally different approach of DVD-trained models. 6/
sushrutthorat.bsky.social
4. the famous cat-elephant cue-conflict image DOES tell us about the ability and preference of ANNs (another example w/ then-SOTA VLMs - bsky.app/profile/sush...). There's clearly a reliance on texture. Now, if the controlled analyses employed are not in line with this, perhaps we need better controls?
sushrutthorat.bsky.social
... (Claude/Gemini do the same)
sushrutthorat.bsky.social
2. the global shape manipulation also preserves textures so humans and ANNs might be relying on different cues to solve the task.
3. no one says humans cannot/do not rely on texture (as seen w/ the local shape condition), but the Geirhos stimuli are about gauging a shape "bias" - preference vs ability
sushrutthorat.bsky.social
hmm,
1. the way they quantify "texture" is based solely on high-freq components. But there are low-freq components which do not signal meaningful information about shape either and could influence classification (suppl. fig. from an upcoming rev. of arxiv.org/abs/2507.03168)
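To make the low-freq/high-freq distinction concrete, here's a toy decomposition of an image via Gaussian blurring (a generic illustration of the split being discussed, not the quantification used in the paper); the low-pass residue still carries non-contour information that a classifier could use:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def frequency_split(image, sigma=5.0):
    """Split a grayscale image into low- and high-frequency parts.

    The low-pass part (Gaussian blur) removes fine texture and sharp
    contours but keeps coarse, blob-like structure; equating 'texture'
    with only the high-pass part ignores this low-freq component.
    """
    low = gaussian_filter(image, sigma=sigma)
    high = image - low
    return low, high

# Toy usage on a random 'image'.
img = np.random.rand(224, 224)
low, high = frequency_split(img)
print(np.allclose(img, low + high))  # True: the split is exact
```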
Reposted by Sushrut Thorat
brialong.bsky.social
We’re recruiting a postdoctoral fellow to join our team! 🎉

I’m happy to share that I’ve opened back up the search for this position (it was temporarily closed due to funding uncertainty).

See lab page and doc below for details!
sushrutthorat.bsky.social
Thanks! Will check it out 😇
sushrutthorat.bsky.social
Also, I was referring to what LoRA et al. end up doing - modifying the transformation with a low-rank adapter - which is similar to the work I linked, wherein a top-down attention signal, which is low-rank from the perspective of the weights, modifies the transformation.
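For context, a minimal sketch of that kind of low-rank modification (a generic LoRA-style adapter on a linear map; the dimensions and names are illustrative, not taken from either paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 512, 512, 8              # r << d: the low-rank bottleneck

W = rng.standard_normal((d_out, d_in))    # frozen base transformation
A = rng.standard_normal((r, d_in)) * 0.01 # adapter down-projection
B = np.zeros((d_out, r))                  # adapter up-projection

def adapted_forward(x):
    """y = W x + B A x: the base map plus a low-rank modulation.

    Only A and B (O(r * d) parameters) are learned or contextually set,
    rather than the full O(d^2) weight matrix W.
    """
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
print(adapted_forward(x).shape)  # (512,)
```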
sushrutthorat.bsky.social
counterpoint #1: bsky.app/profile/nico...
nicolecrust.bsky.social
I get it. Similar thoughts inspired me to write a book. I began pessimistic, like this author, but I came out on the other side with renewed optimism.

As a counterpoint to the blog below, this podcast could be titled, "Why Nicole Rust stayed"

www.fchampalimaud.org/news/episode...
sushrutthorat.bsky.social
Oh? Which one is that? I'm unaware of it. Also, note that the aim of the work I linked above wasn't explicitly to do what LoRA/ICL does, but I think the spirit is similar - contextual modulation one way or the other.
sushrutthorat.bsky.social
"Also: It is my opinion that neuroscience has stagnated over the past one or two decades."
"My field is spinning wheels with new data and new publications without new findings or conclusions."
relatable and guilty
Reposted by Sushrut Thorat
shahabbakht.bsky.social
Interesting paper suggesting a mechanism for why in-context learning happens in LLMs.

They show that LLMs implicitly apply an internal low-rank weight update adjusted by the context. It's cheap (due to the low rank) but effective for adapting the model's behavior.

#MLSky

arxiv.org/abs/2507.16003
Learning without training: The implicit dynamics of in-context learning
One of the most striking features of Large Language Models (LLM) is their ability to learn in context. Namely at inference time an LLM is able to learn new patterns without any additional weight updat...
arxiv.org
sushrutthorat.bsky.social
and the low-D part has been on the horizon for a bit now - proceedings.neurips.cc/paper/2019/h... - given complex numbers you can go loooowwww haha (O(1)). Also, this is linked to top-down attention: arxiv.org/abs/1907.12309 , arxiv.org/abs/2502.15634 - which is a low-D modulation (O(N) vs O(N^2)).
Superposition of many models into one
proceedings.neurips.cc
sushrutthorat.bsky.social
In a way, I'm wondering how else, functionally, this would work. There's, of course, an equivalence b/w ICL and finetuning from the perspective of the feedforward processing of the current token. The crazy/hard bit is showing HOW exactly this "contextual modulation" manifests.
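A toy numerical check of the flavour of that equivalence (my own illustration of the gist, not the linked paper's derivation): the effect of the context on the post-attention input to a linear layer can always be absorbed into a rank-1 update of that layer's weights.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
W = rng.standard_normal((d, d))    # a (frozen) linear layer after attention

a_no_ctx = rng.standard_normal(d)  # attention output for the query alone
a_ctx = rng.standard_normal(d)     # attention output when context is present

# Rank-1 'implicit weight update' that reproduces the context's effect
# when the layer is fed only the context-free activation.
delta = np.outer(W @ (a_ctx - a_no_ctx), a_no_ctx) / (a_no_ctx @ a_no_ctx)

lhs = W @ a_ctx               # frozen weights, context included
rhs = (W + delta) @ a_no_ctx  # rank-1-updated weights, context removed
print(np.allclose(lhs, rhs))  # True
```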