Ekdeep Singh @ ICML
@ekdeepl.bsky.social
260 followers 380 following 48 posts
Postdoc at CBS, Harvard University (New around here)
ekdeepl.bsky.social
Tübingen just got ultra-exciting :D
maksym-andr.bsky.social
🚨 Incredibly excited to share that I'm starting my research group focusing on AI safety and alignment at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025! 🚨

1/n
ekdeepl.bsky.social
Submit your latest and greatest papers to the hottest workshop on the block---on cognitive interpretability! 🔥
jennhu.bsky.social
Excited to announce the first workshop on CogInterp: Interpreting Cognition in Deep Learning Models @ NeurIPS 2025! 📣

How can we interpret the algorithms and representations underlying complex behavior in deep learning models?

🌐 coginterp.github.io/neurips2025/

1/4
First Workshop on Interpreting Cognition in Deep Learning Models (NeurIPS 2025)
coginterp.github.io
ekdeepl.bsky.social
I'll be at ICML beginning this Monday---hit me up if you'd like to chat!
ekdeepl.bsky.social
Our recent paper may be relevant (arxiv.org/abs/2506.17859)! We take a rational analysis lens to argue that beyond simplicity bias, we must model how well a hypothesis explains the data to yield a *predictive* account of behavior in neural nets! This helps explain learning of more complex functions.
In-Context Learning Strategies Emerge Rationally
Recent work analyzing in-context learning (ICL) has identified a broad set of strategies that describe model behavior in different experimental conditions. We aim to unify these findings by asking why...
arxiv.org
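A minimal sketch of the rational-analysis idea from the post: treat behavior as a posterior-weighted mixture over candidate strategies, trading off a simplicity prior against how well each strategy explains the data. The function names and numbers here are illustrative assumptions, not from the paper.

```python
import numpy as np

def posterior_weights(log_priors, log_likelihoods):
    """Combine a simplicity prior and data fit into posterior weights."""
    log_post = np.array(log_priors) + np.array(log_likelihoods)
    log_post -= log_post.max()            # numerical stability
    w = np.exp(log_post)
    return w / w.sum()

# Strategy 0: simple (high prior) but fits the data poorly;
# Strategy 1: complex (low prior) but explains the data well.
w = posterior_weights(log_priors=[-1.0, -5.0],
                      log_likelihoods=[-50.0, -10.0])
print(w)  # the better-fitting strategy dominates despite its lower prior
```

This is why simplicity bias alone isn't predictive: once the likelihood term is included, a more complex hypothesis can win whenever it explains the data sufficiently better.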
ekdeepl.bsky.social
I am definitely not biased :)
ekdeepl.bsky.social
Check out one of the most exciting papers of the year! :D
danielwurgaft.bsky.social
🚨New paper! We know models learn distinct in-context learning strategies, but *why*? Why generalize instead of memorize to lower loss? And why is generalization transient?

Our work explains this & *predicts Transformer behavior throughout training* without its weights! 🧵

1/
ekdeepl.bsky.social
I'll be attending NAACL in New Mexico beginning today---hit me up if you'd like to chat!
ekdeepl.bsky.social
Check out our new work on the duality between SAEs and how concepts are organized in model representations!
sumedh-hindupur.bsky.social
New preprint alert!
Do Sparse Autoencoders (SAEs) reveal all concepts a model relies on? Or do they impose hidden biases that shape what we can even detect?
We uncover a fundamental duality between SAE architectures and concepts they can recover.
Link: arxiv.org/abs/2503.01822
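For context, a bare-bones sparse autoencoder over model activations looks like the sketch below. The architectural choices (ReLU encoder, L1 sparsity penalty, dictionary size) are illustrative assumptions, not the specific architectures analyzed in the paper; the paper's point is precisely that such choices bias which concepts can be recovered.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_dict = 64, 512  # activation dim, (overcomplete) dictionary size

W_enc = rng.normal(size=(d_model, d_dict)) / np.sqrt(d_model)
W_dec = rng.normal(size=(d_dict, d_model)) / np.sqrt(d_dict)
b_enc = np.zeros(d_dict)

def sae_forward(x):
    """Encode activations into sparse codes, then reconstruct."""
    z = np.maximum(x @ W_enc + b_enc, 0.0)   # ReLU -> nonnegative, sparse-ish codes
    return z @ W_dec, z

x = rng.normal(size=(8, d_model))            # a batch of activations
x_hat, z = sae_forward(x)
recon_loss = ((x_hat - x) ** 2).mean()       # reconstruction term
sparsity = np.abs(z).sum(axis=-1).mean()     # L1 penalty term
```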
ekdeepl.bsky.social
Oh god, I had no clue this happened. :/

Does bsky not do GIFs?
Reposted by Ekdeep Singh @ ICML
lampinen.bsky.social
Very nice paper; quite aligned with the ideas in our recent perspective on the broader spectrum of ICL. In large models, there's probably a complicated, context dependent mixture of strategies that get learned, not a single ability.
ekdeepl.bsky.social
New paper–accepted as *spotlight* at #ICLR2025! 🧵👇

We show a competition dynamic between several algorithms splits a toy model’s ICL abilities into four broad phases of train/test settings! This means ICL is akin to a mixture of different algorithms, not a monolithic ability.
ekdeepl.bsky.social
The dynamics of attention maps are particularly striking in this competition: e.g., with high diversity, we see a bigram counter forming, but memorization eventually occurs and the pattern becomes uniform! This means models can remove learned components once they are no longer useful!
ekdeepl.bsky.social
Beyond corroborating our phase diagram, LIA confirms a persistent competition underlies ICL: once the induction head forms, with enough diversity bigram-based inference takes over; under low diversity, memorization occurs faster, yielding a bigram-based retrieval solution!
ekdeepl.bsky.social
Given optimization and data diversity are independent axes, we then ask if these forces race against each other to yield our observed algorithmic phases. We propose a tool called LIA (linear interpolation of algorithms) for this analysis.
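A hedged sketch of the idea behind LIA: linearly interpolate two candidate algorithms' next-token distributions and find the mixture weight that best matches the model's distribution. The fitting criterion (KL divergence on a grid) and all numbers are illustrative assumptions.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    p, q = np.clip(p, eps, 1), np.clip(q, eps, 1)
    return float((p * np.log(p / q)).sum())

def best_interpolation(p_model, p_alg1, p_alg2, n_grid=101):
    """Grid-search alpha in p = alpha*alg1 + (1-alpha)*alg2."""
    alphas = np.linspace(0, 1, n_grid)
    kls = [kl(p_model, a * p_alg1 + (1 - a) * p_alg2) for a in alphas]
    return alphas[int(np.argmin(kls))]

p_alg1 = np.array([0.7, 0.2, 0.1])   # e.g. bigram-based inference
p_alg2 = np.array([0.1, 0.3, 0.6])   # e.g. retrieval of a memorized chain
p_model = 0.8 * p_alg1 + 0.2 * p_alg2  # model sits between the two
alpha_hat = best_interpolation(p_model, p_alg1, p_alg2)
print(alpha_hat)  # 0.8
```

Tracking the recovered weight over training steps is what lets one read off which algorithm is currently "winning" the race.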
ekdeepl.bsky.social
The tests check out! Before/after a critical number of train steps (where the induction head emerges), the model relies on unigram/bigram stats respectively. With few chains (less diversity), we see retrieval behavior: we can literally reconstruct transition matrices from MLP neurons!
ekdeepl.bsky.social
To test the above claim, we compute the effect of shuffling a sequence on next-token probs: this breaks bigram stats but preserves unigrams. We then measure how "retrieval-like" (memorization-based) the model's behavior is by comparing its predicted transitions' KL to a random set of chains.
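The shuffle test rests on a simple fact, sketched below with toy data (the sequence is illustrative): permuting a sequence preserves its unigram statistics exactly but generally destroys its bigram statistics, so a bigram-reliant predictor should change its outputs under shuffling while a unigram-reliant one should not.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
seq = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 2, 0]
shuffled = list(rng.permutation(seq))

def unigrams(s):
    """Token frequency statistics."""
    c = Counter(s)
    return {k: v / len(s) for k, v in c.items()}

def bigrams(s):
    """Transition (pair) frequency statistics."""
    c = Counter(zip(s, s[1:]))
    return {k: v / (len(s) - 1) for k, v in c.items()}

print(unigrams(seq) == unigrams(shuffled))  # True: unigrams preserved
print(bigrams(seq) == bigrams(shuffled))    # typically False: bigrams broken
```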
ekdeepl.bsky.social
We claim four algorithms explain the model's behavior in different train/test settings. These algorithms compute uni-/bi-gram frequency statistics of an input to either *retrieve* a memorized chain or to in-context *infer* the chain used to define the input: the latter performs better OOD!
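The "in-context inference" family can be sketched as follows: estimate a transition matrix from the in-context sequence itself, then predict the next token from the current state. The additive-smoothing choice and toy sequence are illustrative assumptions.

```python
import numpy as np

def infer_transition_matrix(seq, n_states, alpha=0.1):
    """Estimate a Markov transition matrix from one sequence's bigram counts."""
    counts = np.full((n_states, n_states), alpha)   # additive smoothing
    for a, b in zip(seq, seq[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

seq = [0, 1, 2, 0, 1, 2, 0, 1]                      # a 0 -> 1 -> 2 cycle
T = infer_transition_matrix(seq, n_states=3)
print(T[1].argmax())  # 2: after state 1, state 2 is most likely
```

A retrieval-style algorithm would instead match these statistics against memorized training chains and output the stored chain's transitions, which is why it fails OOD while inference generalizes.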
ekdeepl.bsky.social
We analyze models trained on a fairly simple task: learning to simulate a *finite mixture* of Markov chains. The sequence-modeling nature of this task makes it a better abstraction for studying ICL abilities in LMs (compared to abstractions of few-shot learning like linear regression).
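A data-generating sketch of that task: sample one of K fixed Markov chains, then roll it out into a training sequence. The mixture size, state count, and uniform chain-sampling are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K, n_states, seq_len = 4, 5, 32

# A finite mixture: K random transition matrices, fixed across training.
chains = rng.dirichlet(np.ones(n_states), size=(K, n_states))

def sample_sequence():
    """Pick a chain from the mixture, then roll it out."""
    T = chains[rng.integers(K)]
    s = [int(rng.integers(n_states))]
    for _ in range(seq_len - 1):
        s.append(int(rng.choice(n_states, p=T[s[-1]])))
    return s

seq = sample_sequence()
print(len(seq))  # 32
```

At test time, one can probe OOD by rolling out a chain *not* in the mixture: retrieval-style solutions snap to a memorized chain, while inference-style solutions track the novel one.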
ekdeepl.bsky.social
ajyl.bsky.social
New paper <3
Interested in inference-time scaling? In-context Learning? Mech Interp?
LMs can solve novel in-context tasks, with sufficient examples (longer contexts). Why? Bc they dynamically form *in-context representations*!
1/N
ekdeepl.bsky.social
Some threads about recent works ;-)

bsky.app/profile/ekde...
ekdeepl.bsky.social
Paper alert—accepted as a *Spotlight* at NeurIPS!🧵

Building on our work relating emergent abilities to task compositionality, we analyze the *learning dynamics* of compositional abilities & find there exist latent interventions that can elicit them well before input prompting works! 🤯
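A hedged sketch of what a "latent intervention" means here: nudge a hidden representation along a concept direction to elicit a capability, rather than prompting at the input. The direction, layer, and scale below are all illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 16
h = rng.normal(size=d_model)                  # hidden state at some layer
concept_direction = rng.normal(size=d_model)
concept_direction /= np.linalg.norm(concept_direction)  # unit norm

def intervene(h, direction, scale=2.0):
    """Steer a hidden state along a concept direction."""
    return h + scale * direction

h_steered = intervene(h, concept_direction)
shift = (h_steered - h) @ concept_direction   # displacement along the direction
print(round(shift, 6))  # 2.0
```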
ekdeepl.bsky.social
Our group, funded by NTT Research, Inc., uniquely bridges industry and academia. We integrate approaches from physics, neuroscience, and psychology while grounding our work in empirical AI research.

Apply via the "Physics of AI Group Research Intern" posting: careers.ntt-research.com