Raj Movva
@rajmovva.bsky.social
230 followers 130 following 46 posts
NLP, ML & society, healthcare. PhD student at Berkeley, previously CS at MIT. https://rajivmovva.com/
Pinned
rajmovva.bsky.social
💡New preprint & Python package: We use sparse autoencoders to generate hypotheses from large text datasets.

Our method, HypotheSAEs, produces interpretable text features that predict a target variable, e.g. features in news headlines that predict engagement. 🧵1/
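A minimal conceptual sketch of the pipeline the post describes (this is not the HypotheSAEs package API; class and function names here are hypothetical): embed each text, train a sparse autoencoder on the embeddings, then keep the sparse features whose activations predict the target variable.

```python
# Conceptual sketch only; not the HypotheSAEs API. Assumes you already have a
# (n_texts, d_embed) tensor of text embeddings and a target variable.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_embed: int, n_features: int, l1_coef: float = 1e-3):
        super().__init__()
        self.encoder = nn.Linear(d_embed, n_features)
        self.decoder = nn.Linear(n_features, d_embed)
        self.l1_coef = l1_coef

    def forward(self, x):
        z = torch.relu(self.encoder(x))   # sparse, nonnegative feature activations
        x_hat = self.decoder(z)           # reconstruction of the embedding
        loss = ((x_hat - x) ** 2).mean() + self.l1_coef * z.abs().mean()
        return z, loss

def train_sae(embeddings: torch.Tensor, n_features: int = 512,
              epochs: int = 100, lr: float = 1e-3) -> SparseAutoencoder:
    """Fit a sparse autoencoder on precomputed text embeddings."""
    sae = SparseAutoencoder(embeddings.shape[1], n_features)
    opt = torch.optim.Adam(sae.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        _, loss = sae(embeddings)
        loss.backward()
        opt.step()
    return sae

# Downstream (not shown): regress the target (e.g. engagement) on the feature
# activations, keep the most predictive features, and label each one by
# inspecting its top-activating texts to get a human-readable hypothesis.
```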
Reposted by Raj Movva
vauhinivara.bsky.social
I've been working for many months on this article on Silicon Valley's under-the-radar role in bringing AI into schools across the US. I really hope you'll read it — here's a gift link — but I'll tell you some of the highlights in this thread. (1/x)
How Chatbots and AI Are Already Transforming Kids' Classrooms
Educators across the country are bringing chatbots into their lesson plans. Will it help kids learn or is it just another doomed ed-tech fad?
www.bloomberg.com
Reposted by Raj Movva
emmapierson.bsky.social
🚨 New postdoc position in our lab at Berkeley EECS! 🚨

(please reshare)

We seek applicants with experience in language modeling who are excited about high-impact applications in the health and social sciences!

More info in thread

1/3
rajmovva.bsky.social
What a crossover!
rajmovva.bsky.social
This is great, & there's a clear analogy to the burgeoning mechanism design community for AI alignment: who is providing RLHF votes? Do their preferences reflect yours? Discussions about social choice and collective constitutions are interesting, but "what and who is in the data" is just as important.
nkgarg.bsky.social
New piece, out in the SIGecom Exchanges! It's my first solo-author piece, and the closest thing I've written to a "manifesto." #econsky #ecsky
arxiv.org/abs/2507.03600
Screenshot of paper abstract, with text: "A core ethos of the Economics and Computation (EconCS) community is that people have complex private preferences and information of which the central planner is unaware, but which an appropriately designed mechanism can uncover to improve collective decisionmaking. This ethos underlies the community’s largest deployed success stories, from stable matching systems to participatory budgeting. I ask: is this choice and information aggregation “worth it”? In particular, I discuss how such systems induce heterogeneous participation: those already relatively advantaged are, empirically, more able to pay time costs and navigate administrative burdens imposed by the mechanisms. I draw on three case studies, including my own work – complex democratic mechanisms, resident crowdsourcing, and school matching. I end with lessons for practice and research, challenging the community to help reduce participation heterogeneity and design and deploy mechanisms that meet a “best of both worlds” north star: use preferences and information from those who choose to participate, but provide a “sufficient” quality of service to those who do not."
rajmovva.bsky.social
This is amazing
rajmovva.bsky.social
They're in their move fast and break things era 🙃
rajmovva.bsky.social
This capability of discovering unknown concepts opens many opportunities for applied machine learning. We can design better white-box predictors, better audit high-stakes models for bias, and generate hypotheses for computational social science research. More broadly, SAEs can help bridge the "prediction-explanation" gap.
rajmovva.bsky.social
These tasks stand in contrast to probing, where we try to predict the presence of a *known* concept, and steering, where we try to include a *known* concept in an LLM's output. SAEs lose to simple baselines on these tasks. (Two good papers on this: "AxBench" and Kantamneni, Engels et al. 2025)
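For contrast with the discovery setting above, here is a minimal sketch of linear probing for a *known* concept, assuming you have already extracted model activations and binary concept labels (all variable names are placeholders):

```python
# Minimal linear-probe sketch: predict the presence of a known concept from
# model activations. Inputs are assumed to be precomputed numpy arrays.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def linear_probe_accuracy(activations: np.ndarray, concept_labels: np.ndarray) -> float:
    """Held-out accuracy of a logistic-regression probe for the concept."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        activations, concept_labels, test_size=0.2, random_state=0
    )
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)
```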
rajmovva.bsky.social
How do we reconcile our view with recent negative results? Our key distinction is that SAEs are useful when you don't know what you're looking for: how does my text classifier predict which headlines will go viral? How does my LLM perform addition? These are "unknown unknowns".
rajmovva.bsky.social
📢New POSITION PAPER: Use Sparse Autoencoders to Discover Unknown Concepts, Not to Act on Known Concepts

Despite recent results, SAEs aren't dead! They can still be useful for mech interp, and also much more broadly: across FAccT, computational social science, and ML4H. 🧵
Reposted by Raj Movva
kennypeng.bsky.social
Are LLMs correlated when they make mistakes? In our new ICML paper, we answer this question using responses of >350 LLMs. We find substantial correlation. On one dataset, LLMs agree on the wrong answer ~2x more than they would at random. 🧵(1/7)

arxiv.org/abs/2506.07962
Heat map showing that more accurate models have more correlated errors.
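A rough sketch of the kind of measurement behind that "~2x more than random" claim (not necessarily the paper's exact metric): on multiple-choice items that both models get wrong, compare how often they give the same wrong answer with the agreement rate that independent, uniform errors would produce.

```python
# Hedged sketch: agreement on wrong answers vs. an independence baseline.
# ans_a, ans_b, gold are hypothetical arrays of answer indices per item.
import numpy as np

def wrong_answer_agreement(ans_a, ans_b, gold, n_choices):
    """Return (observed agreement on jointly-wrong items, uniform-error baseline)."""
    ans_a, ans_b, gold = map(np.asarray, (ans_a, ans_b, gold))
    both_wrong = (ans_a != gold) & (ans_b != gold)
    observed = (ans_a[both_wrong] == ans_b[both_wrong]).mean()
    # If wrong answers were picked independently and uniformly over the
    # n_choices - 1 incorrect options, agreement would be:
    baseline = 1.0 / (n_choices - 1)
    return observed, baseline

# A ratio observed / baseline around 2 would match the "~2x more than random" claim.
```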
rajmovva.bsky.social
ARR question: If I submit to a cycle, how long do those reviews "last"? e.g. if I submit to the July cycle but can't go to AACL, can I commit my July reviews to the conference associated with the next (October) cycle? @aclrollingreview.bsky.social
Reposted by Raj Movva
dmshanmugam.bsky.social
New work 🎉: conformal classifiers return sets of classes for each example, with a probabilistic guarantee the true class is included. But these sets can be too large to be useful.

In our #CVPR2025 paper, we propose a method to make them more compact without sacrificing coverage.
A gif explaining the value of test-time augmentation to conformal classification. The video begins with an illustration of TTA reducing the size of the  predicted set of classes for a dog image, and goes on to explain that this is because TTA promotes the true class's predicted probability to be higher, even when it's predicted to be unlikely.
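For background, a minimal sketch of the standard split conformal classifier that work like this builds on (not the paper's test-time-augmentation method): calibrate a nonconformity-score threshold on held-out data so that prediction sets contain the true class with probability at least 1 - alpha.

```python
# Split conformal classification sketch; cal_probs/test_probs are predicted
# class probabilities from any base classifier (assumed precomputed).
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """cal_probs: (n, K) probabilities on a held-out calibration set."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]   # nonconformity scores
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, min(q_level, 1.0), method="higher")

def prediction_set(test_probs, qhat):
    """Return the set of class indices included for one test example."""
    return np.where(1.0 - test_probs <= qhat)[0]
```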
rajmovva.bsky.social
I would like to spend 5-10 hours learning basic macroeconomics (I know it's maybe fake, but setting that aside for a moment...). Does anyone have any recommendations?
rajmovva.bsky.social
Huge congrats, Marianne!!
rajmovva.bsky.social
I find that I've actually gone out of my way to stop using bullet points in reviews now because Any Review With Bullet Points is a Bot 🥲
rajmovva.bsky.social
People love to hate on the transition 3-pointer as evidence of how the 3 has ruined basketball, but I think it's usually just the right play... if you have numbers in transition, your teammate can easily get a putback off a miss, so might as well try the 3
rajmovva.bsky.social
We'll present HypotheSAEs at ICML this summer! 🎉
Draft: arxiv.org/abs/2502.04382

We're continuing to cook up new updates for our Python package: github.com/rmovva/Hypot...

(Most recently: "Matryoshka SAEs", which help extract coarse and granular concepts with less hyperparameter fiddling.)
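My rough paraphrase of the Matryoshka idea, not the package's implementation: compute the reconstruction loss on nested prefixes of the latent vector, so the earliest latents are pushed toward coarse concepts and later latents refine them. The prefix sizes below are illustrative.

```python
# Sketch of a nested-prefix reconstruction loss (Matryoshka-style), assuming a
# decoder weight of shape (n_features, d_embed) and sparse codes z.
import torch

def matryoshka_recon_loss(z, decoder_weight, decoder_bias, x,
                          prefix_sizes=(64, 256, 1024)):
    """z: (batch, n_features) sparse codes; x: (batch, d_embed) targets."""
    loss = 0.0
    for m in prefix_sizes:
        # Reconstruct using only the first m latent features.
        x_hat = z[:, :m] @ decoder_weight[:m, :] + decoder_bias
        loss = loss + ((x_hat - x) ** 2).mean()
    return loss / len(prefix_sizes)
```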
rajmovva.bsky.social
So awesome, congrats Lucy!!! 🧀
rajmovva.bsky.social
Did you take the hot air balloon pic?!
rajmovva.bsky.social
Check out Erica's nice work. They not only develop a well-grounded model for disparities in disease progression, but also conduct experiments with real NYP cardiology data! (Anyone who works in healthcare knows how much of a feat it is to use data other than MIMIC)
ericachiang.bsky.social
I’m really excited to share the first paper of my PhD, “Learning Disease Progression Models That Capture Health Disparities” (accepted at #CHIL2025)! ✨ 1/

📄: arxiv.org/abs/2412.16406