Raj Movva
@rajmovva.bsky.social
230 followers 130 following 46 posts
NLP, ML & society, healthcare. PhD student at Berkeley, previously CS at MIT. https://rajivmovva.com/
Pinned
rajmovva.bsky.social
💡New preprint & Python package: We use sparse autoencoders to generate hypotheses from large text datasets.

Our method, HypotheSAEs, produces interpretable text features that predict a target variable, e.g. features in news headlines that predict engagement. 🧵1/
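A minimal conceptual sketch of the pipeline the post describes (this is not the HypotheSAEs package API; class and function names here are hypothetical): embed each text, train a sparse autoencoder on the embeddings, then keep the sparse features whose activations predict the target variable.

```python
# Conceptual sketch only; not the HypotheSAEs API. Assumes you already have a
# (n_texts, d_embed) tensor of text embeddings and a target variable.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_embed: int, n_features: int, l1_coef: float = 1e-3):
        super().__init__()
        self.encoder = nn.Linear(d_embed, n_features)
        self.decoder = nn.Linear(n_features, d_embed)
        self.l1_coef = l1_coef

    def forward(self, x):
        z = torch.relu(self.encoder(x))   # sparse, nonnegative feature activations
        x_hat = self.decoder(z)           # reconstruction of the embedding
        loss = ((x_hat - x) ** 2).mean() + self.l1_coef * z.abs().mean()
        return z, loss

def train_sae(embeddings: torch.Tensor, n_features: int = 512,
              epochs: int = 100, lr: float = 1e-3) -> SparseAutoencoder:
    """Fit a sparse autoencoder on precomputed text embeddings."""
    sae = SparseAutoencoder(embeddings.shape[1], n_features)
    opt = torch.optim.Adam(sae.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        _, loss = sae(embeddings)
        loss.backward()
        opt.step()
    return sae

# Downstream (not shown): regress the target (e.g. engagement) on the feature
# activations, keep the most predictive features, and label each one by
# inspecting its top-activating texts to get a human-readable hypothesis.
```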
Reposted by Raj Movva
vauhinivara.bsky.social
I've been working for many months on this article on Silicon Valley's under-the-radar role in bringing AI into schools across the US. I really hope you'll read it — here's a gift link — but I'll tell you some of the highlights in this thread. (1/x)
How Chatbots and AI Are Already Transforming Kids' Classrooms
Educators across the country are bringing chatbots into their lesson plans. Will it help kids learn or is it just another doomed ed-tech fad?
www.bloomberg.com
Reposted by Raj Movva
emmapierson.bsky.social
🚨 New postdoc position in our lab at Berkeley EECS! 🚨

(please reshare)

We seek applicants with experience in language modeling who are excited about high-impact applications in the health and social sciences!

More info in thread

1/3
rajmovva.bsky.social
What a crossover!
rajmovva.bsky.social
This is great, & there's a clear analogy to the burgeoning mechanism design community for AI alignment: who is providing RLHF votes? Do their preferences reflect yours? Discussions about social choice and collective constitutions are interesting, but "what and who is in the data" is just as important.
nkgarg.bsky.social
New piece, out in the SIGecom Exchanges! It's my first solo-author piece, and the closest thing I've written to a "manifesto." #econsky #ecsky
arxiv.org/abs/2507.03600
Screenshot of paper abstract, with text: "A core ethos of the Economics and Computation (EconCS) community is that people have complex private preferences and information of which the central planner is unaware, but which an appropriately designed mechanism can uncover to improve collective decisionmaking. This ethos underlies the community’s largest deployed success stories, from stable matching systems to participatory budgeting. I ask: is this choice and information aggregation “worth it”? In particular, I discuss how such systems induce heterogeneous participation: those already relatively advantaged are, empirically, more able to pay time costs and navigate administrative burdens imposed by the mechanisms. I draw on three case studies, including my own work – complex democratic mechanisms, resident crowdsourcing, and school matching. I end with lessons for practice and research, challenging the community to help reduce participation heterogeneity and design and deploy mechanisms that meet a “best of both worlds” north star: use preferences and information from those who choose to participate, but provide a “sufficient” quality of service to those who do not."
rajmovva.bsky.social
This is amazing
rajmovva.bsky.social
They're in their move fast and break things era 🙃
rajmovva.bsky.social
This capability of discovering unknown concepts opens many opportunities for applied machine learning. We can design better white-box predictors, better audit high-stakes models for bias, and generate hypotheses for computational social science research. More broadly, SAEs can help bridge the "prediction-explanation" gap.
rajmovva.bsky.social
These tasks stand in contrast to probing, where we try to predict the presence of a *known* concept, and steering, where we try to include a *known* concept in an LLM's output. SAEs lose to simple baselines on these tasks. (Two good papers on this: "AxBench" and Kantamneni, Engels et al. 2025)
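For contrast with the discovery setting above, here is a minimal sketch of linear probing for a *known* concept, assuming you have already extracted model activations and binary concept labels (all variable names are placeholders):

```python
# Minimal linear-probe sketch: predict the presence of a known concept from
# model activations. Inputs are assumed to be precomputed numpy arrays.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def linear_probe_accuracy(activations: np.ndarray, concept_labels: np.ndarray) -> float:
    """Held-out accuracy of a logistic-regression probe for the concept."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        activations, concept_labels, test_size=0.2, random_state=0
    )
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)
```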
rajmovva.bsky.social
How do we reconcile our view with recent negative results? Our key distinction is that SAEs are useful when you don't know what you're looking for: how does my text classifier predict which headlines will go viral? How does my LLM perform addition? These are "unknown unknowns".
rajmovva.bsky.social
📢New POSITION PAPER: Use Sparse Autoencoders to Discover Unknown Concepts, Not to Act on Known Concepts

Despite recent results, SAEs aren't dead! They can still be useful for mech interp, and also much more broadly: across FAccT, computational social science, and ML4H. 🧵
Reposted by Raj Movva
kennypeng.bsky.social
Are LLMs correlated when they make mistakes? In our new ICML paper, we answer this question using responses of >350 LLMs. We find substantial correlation. On one dataset, LLMs agree on the wrong answer ~2x more than they would at random. 🧵(1/7)

arxiv.org/abs/2506.07962
Heat map showing that more accurate models have more correlated errors.
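A rough sketch of the kind of measurement behind that "~2x more than random" claim (not necessarily the paper's exact metric): on multiple-choice items that both models get wrong, compare how often they give the same wrong answer with the agreement rate that independent, uniform errors would produce.

```python
# Hedged sketch: agreement on wrong answers vs. an independence baseline.
# ans_a, ans_b, gold are hypothetical arrays of answer indices per item.
import numpy as np

def wrong_answer_agreement(ans_a, ans_b, gold, n_choices):
    """Return (observed agreement on jointly-wrong items, uniform-error baseline)."""
    ans_a, ans_b, gold = map(np.asarray, (ans_a, ans_b, gold))
    both_wrong = (ans_a != gold) & (ans_b != gold)
    observed = (ans_a[both_wrong] == ans_b[both_wrong]).mean()
    # If wrong answers were picked independently and uniformly over the
    # n_choices - 1 incorrect options, agreement would be:
    baseline = 1.0 / (n_choices - 1)
    return observed, baseline

# A ratio observed / baseline around 2 would match the "~2x more than random" claim.
```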
rajmovva.bsky.social
ARR question: If I submit to a cycle, how long do those reviews "last"? e.g. if I submit to the July cycle but can't go to AACL, can I commit my July reviews to the conference associated with the next (October) cycle? @aclrollingreview.bsky.social
Reposted by Raj Movva
dmshanmugam.bsky.social
New work 🎉: conformal classifiers return sets of classes for each example, with a probabilistic guarantee the true class is included. But these sets can be too large to be useful.

In our #CVPR2025 paper, we propose a method to make them more compact without sacrificing coverage.
A gif explaining the value of test-time augmentation to conformal classification. The video begins with an illustration of TTA reducing the size of the  predicted set of classes for a dog image, and goes on to explain that this is because TTA promotes the true class's predicted probability to be higher, even when it's predicted to be unlikely.
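For background, a minimal sketch of the standard split conformal classifier that work like this builds on (not the paper's test-time-augmentation method): calibrate a nonconformity-score threshold on held-out data so that prediction sets contain the true class with probability at least 1 - alpha.

```python
# Split conformal classification sketch; cal_probs/test_probs are predicted
# class probabilities from any base classifier (assumed precomputed).
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """cal_probs: (n, K) probabilities on a held-out calibration set."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]   # nonconformity scores
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, min(q_level, 1.0), method="higher")

def prediction_set(test_probs, qhat):
    """Return the set of class indices included for one test example."""
    return np.where(1.0 - test_probs <= qhat)[0]
```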
rajmovva.bsky.social
I would like to spend 5-10 hours learning basic macroeconomics (I know it's maybe fake, but setting that aside for a moment...). Does anyone have any recommendations?
rajmovva.bsky.social
Huge congrats, Marianne!!
rajmovva.bsky.social
I find that I've actually gone out of my way to stop using bullet points in reviews now because Any Review With Bullet Points is a Bot 🥲
rajmovva.bsky.social
People love to hate on the transition 3-pointer as evidence of how the 3 has ruined basketball, but I think it's usually just the right play... if you have numbers in transition, your teammate can easily get a putback off a miss, so might as well try the 3
rajmovva.bsky.social
We'll present HypotheSAEs at ICML this summer! 🎉
Draft: arxiv.org/abs/2502.04382

We're continuing to cook up new updates for our Python package: github.com/rmovva/Hypot...

(Most recently: "Matryoshka SAEs", which help extract coarse and granular concepts with less hyperparameter fiddling.)
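My rough paraphrase of the Matryoshka idea, not the package's implementation: compute the reconstruction loss on nested prefixes of the latent vector, so the earliest latents are pushed toward coarse concepts and later latents refine them. The prefix sizes below are illustrative.

```python
# Sketch of a nested-prefix reconstruction loss (Matryoshka-style), assuming a
# decoder weight of shape (n_features, d_embed) and sparse codes z.
import torch

def matryoshka_recon_loss(z, decoder_weight, decoder_bias, x,
                          prefix_sizes=(64, 256, 1024)):
    """z: (batch, n_features) sparse codes; x: (batch, d_embed) targets."""
    loss = 0.0
    for m in prefix_sizes:
        # Reconstruct using only the first m latent features.
        x_hat = z[:, :m] @ decoder_weight[:m, :] + decoder_bias
        loss = loss + ((x_hat - x) ** 2).mean()
    return loss / len(prefix_sizes)
```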
rajmovva.bsky.social
So awesome, congrats Lucy!!! 🧀
rajmovva.bsky.social
Did you take the hot air balloon pic?!
rajmovva.bsky.social
Check out Erica's nice work. They not only develop a well-grounded model for disparities in disease progression, but also conduct experiments with real NYP cardiology data! (Anyone who works in healthcare knows how much of a feat it is to use data other than MIMIC)
ericachiang.bsky.social
I’m really excited to share the first paper of my PhD, “Learning Disease Progression Models That Capture Health Disparities” (accepted at #CHIL2025)! ✨ 1/

📄: arxiv.org/abs/2412.16406