Lightnews — Scholar-powered news

Reposted by Juan Diego Rodriguez (@ COLM 2025)

thebes @vgel.me · 14h

if you're interested in gaining a better intuition for how llms behave at inference time, you should try logitloom🌱, the open-source tool i made for exploring token trajectory trees (aka looming) on base and instruct models! more info in thread

🌱 vgel.me/logitloom
💻 github.com/vgel/logitloom

5 22 95

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Nishant Subramani @ ACL @nsubramani23.bsky.social · 1d

At @colmweb.org all week 🥯🍁! Presenting 3 mechinterp + actionable interp papers at @interplay-workshop.bsky.social

1. BERTology in the Modern World w/ @bearseascape.bsky.social
2. MICE for CATs
3. LLM Microscope w/ Jiarui Liu, Jivitesh Jain, @monadiab77.bsky.social

Reach out to chat! #COLM2025

2 7

Juan Diego Rodriguez (@ COLM 2025) @juand-r.bsky.social · 1d

Excited to present this at #COLM2025 tomorrow! (Tuesday, 11:00 AM poster session)

Juan Diego Rodriguez (@ COLM 2025) @juand-r.bsky.social · Apr 16

One of the ways that LLMs can be inconsistent is the "generator-validator gap," where LLMs deem their own answers incorrect.

🎯 We demonstrate that ranking-based discriminator training can significantly reduce this gap, and improvements on one task often generalize to others!

🧵👇

A visualization of the generator-validator gap, where the LM likelihoods for the generator and discriminator forms of questions are poorly correlated.

Aligning the validator and generator rankings can fix it!

4 10

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Maria Antoniak @mariaa.bsky.social · 1d

Here’s a #COLM2025 feed!

Pin it 📌 to follow along with the conference this week!

2 17 24

Juan Diego Rodriguez (@ COLM 2025) @juand-r.bsky.social · 1d

en.m.wikipedia.org/wiki/Strahle...

Strahler number - Wikipedia

en.m.wikipedia.org

1

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Jessy Li @jessyjli.bsky.social · 2d

On my way to #COLM2025 🍁

Check out jessyli.com/colm2025

QUDsim: Discourse templates in LLM stories arxiv.org/abs/2504.09373

EvalAgent: retrieval-based eval targeting implicit criteria arxiv.org/abs/2504.15219

RoboInstruct: code generation for robotics with simulators arxiv.org/abs/2405.20179

4 12

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Kyle Mahowald (COLM 2025) @kmahowald.bsky.social · 2d

I’m at #COLM2025 from Wed with:

@siyuansong.bsky.social Tue am introspection arxiv.org/abs/2503.07513

@qyao.bsky.social Wed am controlled rearing: arxiv.org/abs/2503.20850

@sashaboguraev.bsky.social INTERPLAY ling interp: arxiv.org/abs/2505.16002

I’ll talk at INTERPLAY too. Come say hi!

Language Models Fail to Introspect About Their Knowledge of Language

There has been recent interest in whether large language models (LLMs) can introspect about their own internal states. Such abilities would make LLMs more interpretable, and also validate the use of s...

arxiv.org

1 6 20

Juan Diego Rodriguez (@ COLM 2025) @juand-r.bsky.social · 2d

Excited to present this at COLM tomorrow! (Tuesday, 11:00 AM poster session)

Juan Diego Rodriguez (@ COLM 2025) @juand-r.bsky.social · Apr 16

One of the ways that LLMs can be inconsistent is the "generator-validator gap," where LLMs deem their own answers incorrect.

🎯 We demonstrate that ranking-based discriminator training can significantly reduce this gap, and improvements on one task often generalize to others!

🧵👇

2 3

Juan Diego Rodriguez (@ COLM 2025) @juand-r.bsky.social · 2d

Yes, smartphones are a great example.
As for computer technology more generally, it is often invisible to many people... They do not realize that our modern world would just stop working without it.

1 1

Juan Diego Rodriguez (@ COLM 2025) @juand-r.bsky.social · 2d

(honest question, genuinely curious about your opinion)-- do you think text/image/video generation has improved people's well-being directly in certain ways they are ignoring? (people who are not programmers or researchers)

1 1

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Social Media Lab @socialmedialab.ca · 2d

🤖 Yeah, this place, like most of social media, is crawling with Russians, Chinese, Israelis, and others running their games. (See: readsludge.com/2025/09/15/d... and www.voanews.com/a/bluesky-co...)

Wish we had more time to hunt the bots. Stay sharp out there.

7 15

Juan Diego Rodriguez (@ COLM 2025) @juand-r.bsky.social · 2d

👀🥳

Grace Lindsay @neurograce.bsky.social · 2d

If people have interest in Marr's levels and AI, we're working on updating this: arxiv.org/abs/2408.12664

Multilevel Interpretability Of Artificial Neural Networks: Leveraging Framework And Methods From Neuroscience

As deep learning systems are scaled up to many billions of parameters, relating their internal structure to external behaviors becomes very challenging. Although daunting, this problem is not new: Neu...

arxiv.org

1

Juan Diego Rodriguez (@ COLM 2025) @juand-r.bsky.social · 3d

👀

Shravan Vasishth @shravanvasishth.bsky.social · Mar 20

Computational Psycholinguistics Meeting 2025

cpl2025.sites.uu.nl

When: December 18–19, 2025

Where: Utrecht, the Netherlands

Abstract submission deadline: June 15, 2025

Organizers: Jakub Dotlačil, Lena Jäger, Bruno Nicenboim, Ece Takmaz

2

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Bruno J. Navarro @brunojnavarro.bsky.social · 3d

The Nobel prize winner Maria Ressa has said Americans are like “deer in the headlights” amid the collapse of US institutions and free speech under the Trump administration, particularly after Jimmy Kimmel’s suspension.

Americans are ‘deer in the headlights’ in face of Trump assault on free speech, Maria Ressa tells Jon Stewart

Nobel prize winner says US institutions have collapsed much quicker than expected under the Trump administration

www.theguardian.com

6 110 280

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Kristina Cooke @kristinacooke.bsky.social · 4d

I spoke to a Venezuelan woman who was arrested in this raid and later released with her 4yo son. She said agents broke down their door, pointed guns at them and made sexualized remarks about Venezuelan women. When she returned to her apartment it was boarded up and all her possessions were gone.

Ted Hesson @tedhesson.bsky.social · 4d

US Border Patrol raid sweeps in citizens, families as Chicago crackdown intensifies, w/ @reneehickman.bsky.social @kristinacooke.bsky.social www.reuters.com/world/us/us-...

US Border Patrol raid sweeps in citizens, families as Chicago crackdown intensifies

U.S. Border Patrol agents deployed to Chicago led a late-night raid on an apartment building this week, rappelling from helicopters onto rooftops and breaking down doors in an operation authorities said targeted gang members but which swept up U.S. citizens and families.

www.reuters.com

240 3.3K 5.6K

Reposted by Juan Diego Rodriguez (@ COLM 2025)

clyde bruckman's ghost's final repost says free link @ultralaser.bsky.social · 4d

terrible things are happening outside. poor helpless people are being dragged out of their homes. families are torn apart. men, women, and children are separated. children come home from school to find that their parents have disappeared.

diary of anne frank, january 13, 1943.

18 1.5K 2.9K

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Aaron Roth @aaroth.bsky.social · 3d

One more thought: AI tools are a very useful research accelerator for an expert, and I plan to use them whenever I can. But at the moment it is very easy to be led down false paths if you let them get ahead of you and lure you too far from your expertise.

1 2 10

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Sheridan Feucht @ COLM @sfeucht.bsky.social · Jun 24

Nikhil's recent paper is a tour de force in causal analysis! They show that LLMs keep track of what characters know in a story using "pointer" mechanisms. Definitely worth checking out.

nikhil07prakash.bsky.social @nikhil07prakash.bsky.social · Jun 24

How do language models track mental states of each character in a story, often referred to as Theory of Mind?

We reverse-engineered how LLaMA-3-70B-Instruct handles a belief-tracking task and found something surprising: it uses mechanisms strikingly similar to pointer variables in C programming!

2 4

Juan Diego Rodriguez (@ COLM 2025) @juand-r.bsky.social · 4d

I’m excited for COLM this week!

Looking forward to chatting with people about interpretability, data-efficient training, cog sci, and LLM consistency.

1 4

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Chris Bertram @crookedfootball.bsky.social · 4d

Stefan Zweig, The World of Yesterday, p. 436

1 21 89

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Ethan Mollick @emollick.bsky.social · 14d

Some important findings in this paper:
1) Working with AI boosts the performance of people solving math, science & ethics questions
2) The biggest boost is for the hardest problems
3) High performers remain highest performing, but low performers gain more
4) People who are good with AI gain most

2 24 97

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Acyn @acyn.bsky.social · 4d

Abughazaleh: I think Kristi Noem should be tried at The Hague. And if the response from ICE to people exercising their first amendment right is to drive vehicles through them, they should not be an agency in the US.

670 6.9K 25K

Reposted by Juan Diego Rodriguez (@ COLM 2025)

𝕮 @chrisshank.com · 5d

The best writing I’ve seen on this topic is the essay “Technically Radical: On the Unrecognized Potential of Tech Workers and Hackers” by @mutual-a.bsky.social

wedontagree.net/technically-...

“Given all this, I posit that the crux of the conflict today is, contra Karl Marx, not over wage relations. Rather it’s a conflict over what technology is developed and how it is deployed (conflicts over wage relations are merely a subset of this broader struggle). And while anyone who wants can play a part, those with technical skills and scientific knowledge have a key role to play.”

1 2 12

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Tom (kid codger) O'Neill @doctecazoid.bsky.social · 6d

Gift 🎁 Article

www.nytimes.com/2025/09/30/t...

12 30

Reposted by Juan Diego Rodriguez (@ COLM 2025)

Jason Koebler @jasonkoebler.bsky.social · 6d

i made this meme which is better than the article:

trade meme. open ai receives: total sum of creative output from all humanity, $500 billion valuation
you receive: polluted internet, polluted world, collapse of society and nature of truth, no jobs, can put your face in my slop app

5 220 910