Lightnews — Scholar-powered news

Isabel Papadimitriou @isabelpapad.bsky.social · 21d

It’s a very exciting time to be thinking about the interaction of vision and language, and what we can find in (and learn from) VLMs. Looking forward to talking to people about this at COLM, and thanks to everyone doing awesome research on this topic!

1 1

Isabel Papadimitriou @isabelpapad.bsky.social · 21d

Lastly, we didn’t just go blindly into batchtopk SAEs, we tried other SAEs and a semi-NMF, but they don’t work as well: batchtopk dominates the reconstruction-sparsity tradeoff

1 1

Isabel Papadimitriou @isabelpapad.bsky.social · 21d

Check out our interactive demo (by the amazing @napoolar), where bridges illustrate our BridgeScore metric: a combination of geometric alignment (cosine) and statistical alignment (coactivation on image-caption pairs): vlm-concept-visualization.com

1 2
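
A rough sketch of the BridgeScore idea from the post above, in plain Python. The post only says the metric is "a combination" of cosine and coactivation. Combining them by multiplication, the function names, and the way coactivation is averaged over pairs are all assumptions made for illustration, not the paper's definition.

```python
import math

def cosine(u, v):
    """Geometric alignment: cosine similarity of two concept directions.
    Assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def coactivation(image_acts, caption_acts):
    """Statistical alignment: mean product of concept i's activation on each
    image and concept j's activation on the paired caption."""
    pairs = list(zip(image_acts, caption_acts))
    return sum(a * b for a, b in pairs) / len(pairs)

def bridge_score(dir_i, dir_j, image_acts_i, caption_acts_j):
    # ASSUMPTION: the two alignments are combined by multiplication.
    return cosine(dir_i, dir_j) * coactivation(image_acts_i, caption_acts_j)
```

Under this sketch, two concepts get a high score only if their directions point the same way and they also fire together on matching image-caption pairs.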

Isabel Papadimitriou @isabelpapad.bsky.social · 21d

And they’re stable ~across training data mixtures~! If we train the SAEs with a 5:1 ratio of text to images, we get a lot more text concepts (makes sense!). But if we weight the points by activation scores (bottom), we see basically the same concepts across very different mixtures

1 1

Isabel Papadimitriou @isabelpapad.bsky.social · 21d

But, are the SAEs even stable? It wouldn’t be very enlightening if we were just analyzing a fluke of the SAE seed. Across seeds, we find that frequently-used concepts (the ones that take up 99% of activation weights) are remarkably stable, but the rest are pretty darn unstable.

1

Isabel Papadimitriou @isabelpapad.bsky.social · 21d

How can this be? Because of the projection effect in SAEs! When we impose sparsity, the inputs that are activated don’t necessarily reflect the whole story of what inputs align with that direction. Here, the batchtopk cutoff (dotted line) hides a multimodal story

1 1
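
To make the cutoff effect above concrete, here is a minimal sketch of a BatchTopK-style activation in plain Python. It assumes the usual description of BatchTopK (keep the k × batch-size largest values across the whole batch, zero the rest). The toy numbers and the image/text labelling of the rows are invented for illustration.

```python
def batch_topk(pre_acts, k):
    """pre_acts: one row per input, one column per concept.
    Keeps the k * len(pre_acts) largest values across the whole batch and
    zeroes everything else. Ties at the cutoff are all kept."""
    n_keep = k * len(pre_acts)
    flat = sorted((v for row in pre_acts for v in row), reverse=True)
    if n_keep <= 0 or not flat:
        return [[0.0 for _ in row] for row in pre_acts]
    cutoff = flat[min(n_keep, len(flat)) - 1]
    return [[v if v >= cutoff else 0.0 for v in row] for row in pre_acts]

# Hypothetical batch: rows 0-1 are images, rows 2-3 are text.
pre_acts = [[0.9, 0.1],
            [0.8, 0.2],
            [0.4, 0.7],
            [0.3, 0.6]]
sparse = batch_topk(pre_acts, k=1)
# sparse == [[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]]
```

Concept 0 ends up firing only on the image rows, even though the text rows still project onto it at 0.4 and 0.3. The cutoff hides that alignment.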

Isabel Papadimitriou @isabelpapad.bsky.social · 21d

At first blush, however, the concepts look pretty single-modality: see here their modality scores (how many of the top-activating inputs are images vs text). The classifier results above show us that the actual geometry is often much closer to modality-agnostic.

1 1
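
One way to read the modality score described above is as the fraction of a concept's top-activating inputs that are images. The sketch below assumes that reading. The `top_n` parameter and the function name are hypothetical, since the post does not say how many inputs count as "top-activating".

```python
def modality_score(activations, is_image, top_n):
    """Fraction of the top_n most strongly activating inputs that are images.
    1.0 means image-only, 0.0 means text-only, 0.5 means evenly mixed."""
    ranked = sorted(zip(activations, is_image), key=lambda p: p[0], reverse=True)
    top = ranked[:top_n]
    if not top:
        raise ValueError("need at least one input to score")
    return sum(1 for _, img in top if img) / len(top)
```

For example, with activations `[0.9, 0.8, 0.1, 0.7]` and labels `[True, True, False, False]`, the top two inputs are both images, so the concept looks single-modality at `top_n=2` even though a text input sits just below them.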

Isabel Papadimitriou @isabelpapad.bsky.social · 21d

In fact, they often can’t even act as good modality classifiers: if we take the SAE concept direction and see how well projecting onto that direction separates modality, we see that many of the concepts don’t get great accuracy

1 1
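
The classifier test above can be sketched as a one-dimensional threshold classifier: project each embedding onto the concept direction and see how well a single cut separates images from text. How the threshold is actually chosen is not stated in the post. Picking the best threshold on the same data, with either side allowed to be "image", is an assumption here.

```python
def project(x, direction):
    """Scalar projection (dot product) of an embedding onto a concept direction."""
    return sum(a * b for a, b in zip(x, direction))

def best_threshold_accuracy(embeddings, is_image, direction):
    """Best accuracy of any single threshold on the projected values."""
    scores = [project(x, direction) for x in embeddings]
    best = 0.0
    for t in sorted(set(scores)):
        preds = [s >= t for s in scores]
        acc = sum(p == y for p, y in zip(preds, is_image)) / len(scores)
        # 1 - acc covers the flipped rule (values below t count as "image").
        best = max(best, acc, 1 - acc)
    return best
```

A direction that lines up with the image/text split scores near 1.0. A direction orthogonal to it stays near chance, which is the "don’t get great accuracy" case.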

Isabel Papadimitriou @isabelpapad.bsky.social · 21d

We trained SAEs on the embedding spaces of four VLMs, and analyzed the resulting dictionaries of concepts. Even though image and text concepts lie on separate anisotropic cones, the SAE concepts don’t lie within those cones.

1

Isabel Papadimitriou @isabelpapad.bsky.social · 21d

Are there conceptual directions in VLMs that transcend modality? Check out our COLM oral spotlight 🔦 paper! We use SAEs to analyze the multimodality of linear concepts in VLMs

with @chloesu07.bsky.social, @thomasfel.bsky.social, @shamkakade.bsky.social and Stephanie Gil
arxiv.org/abs/2504.11695

1 6 25

Isabel Papadimitriou @isabelpapad.bsky.social · 21d

@avzaagzonunaada.bsky.social

1

Reposted by Isabel Papadimitriou

Ben Prystawski @benpry.bsky.social · Aug 1

How do people trade off between speed and accuracy in reasoning tasks without easy heuristics? Come to my talk, "Thinking fast, slow, and everywhere in between in humans and language models," in the Reasoning session this afternoon #CogSci2025 to find out!
paper: escholarship.org/uc/item/5td9...

Thinking fast, slow, and everywhere in between in humans and language models

Author(s): Prystawski, Ben; Goodman, Noah | Abstract: How do humans adapt how they reason to varying circumstances? Prior research has argued that reasoning comes in two types: a fast, intuitive type ...

escholarship.org

1 4

Reposted by Isabel Papadimitriou

Ben Prystawski @benpry.bsky.social · Aug 1

When people form conventions in reference games, how easy are they for outsiders to interpret? (for values of "outsider" that include naïve humans and vision-language models) Check out @vboyce.bsky.social's poster today at #CogSci2025 to find out.
paper: escholarship.org/uc/item/16c4...

Idiosyncratic but not opaque: Linguistic conventions formed in reference games are interpretable by naïve humans and vision–language models

Author(s): Boyce, Veronica; Prystawski, Ben; Tan, Alvin Wei Ming; Frank, Michael C. | Abstract: When are in-group linguistic conventions opaque to non-group members (teen slang like "rizz") or general...

escholarship.org

3 5

Isabel Papadimitriou @isabelpapad.bsky.social · Jun 18

@antararb.bsky.social is applying for PhDs this fall! She’s super impressive and awesome to work with, and conceived of this project independently and carried it out very successfully! Keep an eye out 🙂

1

Isabel Papadimitriou @isabelpapad.bsky.social · Jun 18

More in the preprint! arxiv.org/abs/2506.13886 This project was led by Antara, with @dmelis.bsky.social and Kate Davidson

Investigating the interaction of linguistic and mathematical reasoning in language models using multilingual number puzzles

Across languages, numeral systems vary widely in how they construct and combine numbers. While humans consistently learn to navigate this diversity, large language models (LLMs) struggle with linguist...

arxiv.org

1 2

Isabel Papadimitriou @isabelpapad.bsky.social · Jun 18

So is it really this implicit operators thing that’s tripping them up? We try many other ablations, looking at the effect of giving extra context in the prompt, using numbers vs words, left-to-right ordering, and subtractive systems, and none of them seem to affect the models that much.

1 1

Isabel Papadimitriou @isabelpapad.bsky.social · Jun 18

Our experiments are based on Linguistics Olympiad problems that deal with number systems, like the one here. We created additional hand-standardized versions of each puzzle in order to be able to do all of the operator ablations.

1 1

Isabel Papadimitriou @isabelpapad.bsky.social · Jun 18

This shows the types of reasoning and variable binding jumps that are hard for LMs. It’s hard to go one level up, and bind a variable to have the meaning of an operator, or to understand that an operator is implicit.

1 1

Isabel Papadimitriou @isabelpapad.bsky.social · Jun 18

If we alter the problems to make the operators explicit, the models can solve these problems pretty easily. But it’s still harder to bind a random symbol or word to mean an operator like +. It’s much easier when we use the familiar symbols for the operators, like + and x.

1 1

Isabel Papadimitriou @isabelpapad.bsky.social · Jun 18

Our main finding: LMs find it hard when *operators* are implicit. We don’t say “5 times 100 plus 20 plus 3”, we say “five hundred and twenty-three”. The Linguistics Olympiad puzzles are pretty simple systems of equations that an LM should solve – but the operators aren’t explicit.

1 1
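
A toy illustration of the implicit-operator point above: reading "five hundred and twenty-three" means supplying the multiplication and additions yourself. The tiny vocabulary and the parsing rule below are assumptions that cover only this example, not a general numeral parser and not the paper's setup.

```python
import re

# Hypothetical mini-vocabulary, just enough for the example phrase.
UNITS = {"three": 3, "five": 5, "twenty": 20}
MULTIPLIERS = {"hundred": 100}

def numeral_value(phrase):
    """Evaluate an English numeral by inserting the operators it leaves out."""
    total, current = 0, 0
    for word in re.split(r"[\s-]+", phrase.lower().strip()):
        if word == "and":
            continue                      # carries no value; the + is implicit
        if word in MULTIPLIERS:
            # implicit "times": "five hundred" means 5 * 100
            total += max(current, 1) * MULTIPLIERS[word]
            current = 0
        else:
            current += UNITS[word]        # implicit "plus"
    return total + current

explicit = 5 * 100 + 20 + 3               # the version nobody actually says
implicit = numeral_value("five hundred and twenty-three")
# both are 523
```

The explicit form hands the solver every operator. The spoken form makes the solver infer where the times and plus go, which is the step the thread says models struggle with.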

Isabel Papadimitriou @isabelpapad.bsky.social · Jun 18

Why can’t LMs solve puzzles about the number systems of languages, when they can solve really complex math problems? Our new paper, led by @antararb.bsky.social, looks at why this intersection of language and math is difficult, and what this means for LM reasoning! arxiv.org/abs/2506.13886

4 5 29

Reposted by Isabel Papadimitriou

Naomi Saphra @nsaphra.bsky.social · Jun 12

ACL paper alert! What structure is lost when using linearizing interp methods like Shapley? We show the nonlinear interactions between features reflect structures described by the sciences of syntax, semantics, and phonology.

3 12 55

Reposted by Isabel Papadimitriou

Mike Frank @mcxfrank.bsky.social · May 30

Congrats to Veronica Boyce on her dissertation defense! That’s three amazing talks by three great students in 8 days!

1 31

Isabel Papadimitriou @isabelpapad.bsky.social · Mar 6

(the unfortunate truth is that I am really enjoying this mac and its battery life oops)

2

Isabel Papadimitriou @isabelpapad.bsky.social · Mar 6

This work Mac (my first ever) is great because every time something seriously breaks, instead of becoming distressed and despondent like I usually do, it's just like "ooooooh yeahhh, yet another win for team Linux 😎😎😎🎉🐧"

1 7