Siyuan Song✈️COLM
@siyuansong.bsky.social
150 followers 320 following 34 posts
senior undergrad @UTexas Linguistics | @growai.bsky.social | Looking for a Ph.D. position, Fall '26 | Comp Psycholing & CogSci, human-like AI, rock🎸 | Prev: MIT BCS, VURI @Harvard Psych, undergrad @SJTU. Opinions are my own.
Pinned
siyuansong.bsky.social
New preprint w/ @jennhu.bsky.social @kmahowald.bsky.social : Can LLMs introspect about their knowledge of language?
Across models and domains, we did not find evidence that LLMs have privileged access to their own predictions. 🧵(1/8)
Reposted by Siyuan Song✈️COLM
juand-r.bsky.social
Excited to present this at COLM tomorrow! (Tuesday, 11:00 AM poster session)
juand-r.bsky.social
One of the ways that LLMs can be inconsistent is the "generator-validator gap," where LLMs deem their own answers incorrect.

🎯 We demonstrate that ranking-based discriminator training can significantly reduce this gap, and improvements on one task often generalize to others!

🧵👇
A visualization of the generator-validator gap, where the LM likelihoods for the generator and discriminator forms of questions are poorly correlated. Aligning the validator and generator rankings can fix it!
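For readers unfamiliar with the setup, here is a minimal sketch of how a generator-validator gap can be probed. The `query_model` helper is a hypothetical stand-in for whatever inference API is used; this is an illustration of the idea, not the paper's actual code.

```python
# Hypothetical sketch of probing the generator-validator gap.
# `query_model` stands in for any LLM inference call; it is NOT
# the paper's actual code, just an illustration of the idea.

def query_model(prompt: str) -> str:
    """Placeholder for an LLM API call (e.g., a chat completion)."""
    raise NotImplementedError

def shows_generator_validator_gap(question: str) -> bool:
    # Generator form: ask the model to produce an answer.
    answer = query_model(f"Q: {question}\nA:")

    # Validator form: ask the same model to judge its own answer.
    verdict = query_model(
        f"Q: {question}\nProposed answer: {answer}\n"
        "Is the proposed answer correct? Reply yes or no."
    )

    # A gap shows up when the model rejects an answer it generated itself.
    return verdict.strip().lower().startswith("no")
```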
Reposted by Siyuan Song✈️COLM
sashaboguraev.bsky.social
I will be giving a short talk on this work at the COLM Interplay workshop on Friday (also to appear at EMNLP)!

Will be in Montreal all week and excited to chat about LM interpretability + its interaction with human cognition and ling theory.
sashaboguraev.bsky.social
A key hypothesis in the history of linguistics is that different constructions share underlying structure. We take advantage of recent advances in mechanistic interpretability to test this hypothesis in Language Models.

New work with @kmahowald.bsky.social and @cgpotts.bsky.social!

🧵👇!
Reposted by Siyuan Song✈️COLM
jessyjli.bsky.social
On my way to #COLM2025 🍁

Check out jessyli.com/colm2025

QUDsim: Discourse templates in LLM stories arxiv.org/abs/2504.09373

EvalAgent: retrieval-based eval targeting implicit criteria arxiv.org/abs/2504.15219

RoboInstruct: code generation for robotics with simulators arxiv.org/abs/2405.20179
siyuansong.bsky.social
Heading to #COLM2025 to present my first paper w/ @jennhu.bsky.social @kmahowald.bsky.social !

When: Tuesday, 11 AM – 1 PM
Where: Poster #75

Happy to chat about my work and topics in computational linguistics & cogsci!

Also, I'm on the PhD application journey this cycle!

Paper info 👇:
siyuansong.bsky.social
New preprint w/ @jennhu.bsky.social @kmahowald.bsky.social : Can LLMs introspect about their knowledge of language?
Across models and domains, we did not find evidence that LLMs have privileged access to their own predictions. 🧵(1/8)
Reposted by Siyuan Song✈️COLM
rtommccoy.bsky.social
🤖 🧠 NEW BLOG POST 🧠 🤖

What skills do you need to be a successful researcher?

The list seems long: collaborating, writing, presenting, reviewing, etc

But I argue that many of these skills can be unified under a single overarching ability: theory of mind

rtmccoy.com/posts/theory...
Illustration of the blog post's main argument, summarized as: "Theory of Mind as a Central Skill for Researchers: Research involves many skills. If each skill is viewed separately, each one takes a long time to learn. These skills can instead be connected via theory of mind – the ability to reason about the mental states of others. This allows you to transfer your abilities across areas, making it easier to gain new skills."
Reposted by Siyuan Song✈️COLM
kanishka.bsky.social
The compling group at UT Austin (sites.utexas.edu/compling/) is looking for PhD students!

Come join me, @kmahowald.bsky.social, and @jessyjli.bsky.social as we tackle interesting research questions at the intersection of ling, cogsci, and ai!

Some topics I am particularly interested in:
Picture of the UT Tower with "UT Austin Computational Linguistics" written in bigger font, and "Humans processing computers processing humans processing language" in smaller font
Reposted by Siyuan Song✈️COLM
jessyjli.bsky.social
Can AI aid scientists within their own workflows, when those workflows have no fixed step-by-step recipe and the scientific utility a visualization would bring may not be known in advance?

Check out @sebajoe.bsky.social’s feature on ✨AstroVisBench:
nsfsimonscosmicai.bsky.social
Exciting news! Introducing AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy!

A new benchmark developed by researchers at the NSF-Simons AI Institute for Cosmic Origins is testing how well LLMs implement scientific workflows in astronomy and visualize results.
Reposted by Siyuan Song✈️COLM
lampinen.bsky.social
Why does AI sometimes fail to generalize, and what might help? In a new paper (arxiv.org/abs/2509.16189), we highlight the latent learning gap — which unifies findings from language modeling to agent navigation — and suggest that episodic memory complements parametric learning to bridge it. Thread:
Latent learning: episodic memory complements parametric learning by enabling flexible reuse of experiences
When do machine learning systems fail to generalize, and what mechanisms could improve their generalization? Here, we draw inspiration from cognitive science to argue that one weakness of machine lear...
arxiv.org
Reposted by Siyuan Song✈️COLM
stefanfrank.bsky.social
Announcing the first (and perhaps only) Multilingual Minds and Machines Meeting! Come join us in Nijmegen, June 22-23, 2026, if you are interested in computational models of human multilingualism: mmmm2026.github.io
Reposted by Siyuan Song✈️COLM
catherinearnett.bsky.social
Did you know?

❌77% of language models on @hf.co are not tagged for any language
📈For 95% of languages, most models are multilingual
🚨88% of models with tags are trained on English

In a new blog post, @tylerachang.bsky.social and I dig into these trends and why they matter! 👇
Reposted by Siyuan Song✈️COLM
brendenlake.bsky.social
Our new lab for Human & Machine Intelligence is officially open at Princeton University!

Consider applying for a PhD or Postdoc position, either through Computer Science or Psychology. You can register interest on our new website lake-lab.github.io (1/2)
Reposted by Siyuan Song✈️COLM
kmahowald.bsky.social
Can AI introspect? Surprisingly tricky to define what that means! And also interesting to test. New work from @siyuansong.bsky.social, @harveylederman.bsky.social, @jennhu.bsky.social and me on introspection in LLMs. See paper and thread for a definition and some experiments!
siyuansong.bsky.social
How reliable is what an AI says about itself? The answer depends on whether models can introspect. But if an LLM says its temperature parameter is high (and it is!)… does that mean it's introspecting? Surprisingly tricky to pin down. Our paper: arxiv.org/abs/2508.14802 (1/n)
Reposted by Siyuan Song✈️COLM
jennhu.bsky.social
Can AI models introspect? What does introspection even mean for AI?

We revisit a recent proposal by Comșa & Shanahan, and provide new experiments + an alternate definition of introspection.

Check out this new work w/ @siyuansong.bsky.social, @harveylederman.bsky.social, & @kmahowald.bsky.social 👇
siyuansong.bsky.social
How reliable is what an AI says about itself? The answer depends on whether models can introspect. But if an LLM says its temperature parameter is high (and it is!)… does that mean it's introspecting? Surprisingly tricky to pin down. Our paper: arxiv.org/abs/2508.14802 (1/n)
Reposted by Siyuan Song✈️COLM
harveylederman.bsky.social
Exciting new paper from Siyuan! I really enjoyed working with him on this, inspired by important work by Murray Shanahan and Iulia Comșa. Hard questions about how to operationalize the notion of "introspection" that's relevant for practical applications in AI today. Hope you'll check it out!
siyuansong.bsky.social
How reliable is what an AI says about itself? The answer depends on whether models can introspect. But if an LLM says its temperature parameter is high (and it is!)… does that mean it's introspecting? Surprisingly tricky to pin down. Our paper: arxiv.org/abs/2508.14802 (1/n)
siyuansong.bsky.social
Also check out our previous work showing that LMs do not introspect in grammaticality judgment or word prediction, to appear at COLM 2025: arxiv.org/pdf/2503.07513

And important work by Binder et al. showing evidence of privileged self-access in fine-tuned LLMs: openreview.net/forum?id=eb5...

(11/n)
siyuansong.bsky.social
Taken together, our results suggest that although LLMs can reason about the possible states of systems like themselves, this does not imply that they have privileged self-access to their internal states, and this matters for introspection in AI systems. (10/n)
siyuansong.bsky.social
Study 2: We examined whether LLMs report their own temperature better than other models do. We found that self-reflection offers no advantage over predicting the temperature from the prompt and the generated text, whether within the same model or across different models. (9/n)
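To make the Study 2 setup concrete, here is a minimal sketch of the self- vs. cross-model comparison. The helper names are illustrative assumptions standing in for LLM API calls, not the paper's actual code or any specific API.

```python
# Minimal sketch of the Study 2 comparison, under assumed helper names.
# These stand in for LLM API calls; they are not the paper's code.

def sample_at_temperature(model: str, prompt: str, temp: float) -> str:
    """Placeholder: sample a continuation from `model` at temperature `temp`."""
    raise NotImplementedError

def guess_temperature(judge: str, prompt: str, text: str) -> float:
    """Placeholder: ask `judge` to estimate the sampling temperature,
    given only the prompt and the generated text."""
    raise NotImplementedError

def self_vs_other_error(generator: str, other: str, prompt: str, temp: float):
    text = sample_at_temperature(generator, prompt, temp)
    self_guess = guess_temperature(generator, prompt, text)   # self-report
    other_guess = guess_temperature(other, prompt, text)      # cross-model
    # If, averaged over many trials, the self-report error is no smaller
    # than the other model's error, self-reports carry no privileged
    # information about the generator's own sampling temperature.
    return abs(self_guess - temp), abs(other_guess - temp)
```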
siyuansong.bsky.social
Study 1: We reproduced C&S's temperature self-reporting case using a broader set of prompt and temperature settings. We found that such self-reflection is highly sensitive to the prompt: even when the sampling temperature is low, the prompt 'generate a crazy sentence' leads to a high-temperature report. (8/n)
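A minimal sketch of the Study 1 probe described above: hold the sampling temperature fixed and low, and vary only the prompt wording. The helper here is an illustrative placeholder for two LLM API calls, not the paper's code.

```python
# Hypothetical sketch of the Study 1 prompt-sensitivity probe.
# `sample_and_self_report` is an illustrative placeholder, not paper code.

def sample_and_self_report(prompt: str, temp: float) -> float:
    """Sample text from a model at `temp`, then ask the same model to
    report the temperature it was sampled at; return the reported value."""
    raise NotImplementedError  # stand-in for two LLM API calls

def prompt_sensitivity_probe() -> None:
    # Same low sampling temperature for both prompts; only wording differs.
    for prompt in ["Generate a sentence.", "Generate a crazy sentence."]:
        reported = sample_and_self_report(prompt, temp=0.2)
        # Finding from the thread: the 'crazy' wording alone yields a high
        # self-reported temperature, despite the low true temperature.
        print(f"{prompt!r} -> reported temperature {reported}")
```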
siyuansong.bsky.social
We performed two studies showing that LLMs fail to introspect under our definition. We think they illustrate some of the interesting subtleties in defining what AI introspection is in the relevant sense. (7/n)