Ekdeep Singh @ ICML
@ekdeepl.bsky.social
260 followers 380 following 48 posts
Postdoc at CBS, Harvard University (New around here)
ekdeepl.bsky.social
Tübingen just got ultra-exciting :D
maksym-andr.bsky.social
🚨 Incredibly excited to share that I'm starting my research group focusing on AI safety and alignment at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025! 🚨

1/n
ekdeepl.bsky.social
Submit your latest and greatest papers to the hottest workshop on the block---on cognitive interpretability! 🔥
jennhu.bsky.social
Excited to announce the first workshop on CogInterp: Interpreting Cognition in Deep Learning Models @ NeurIPS 2025! 📣

How can we interpret the algorithms and representations underlying complex behavior in deep learning models?

🌐 coginterp.github.io/neurips2025/

1/4
First Workshop on Interpreting Cognition in Deep Learning Models (NeurIPS 2025)
coginterp.github.io
ekdeepl.bsky.social
I'll be at ICML beginning this Monday---hit me up if you'd like to chat!
ekdeepl.bsky.social
Our recent paper may be relevant (arxiv.org/abs/2506.17859)! We take a rational analysis lens to argue that beyond simplicity bias, we must model how well a hypothesis explains the data to yield a *predictive* account of behavior in neural nets! This helps explain learning of more complex functions.
In-Context Learning Strategies Emerge Rationally
Recent work analyzing in-context learning (ICL) has identified a broad set of strategies that describe model behavior in different experimental conditions. We aim to unify these findings by asking why...
arxiv.org
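A minimal sketch of the rational-analysis idea from the post: treat behavior as a posterior-weighted mixture over candidate strategies, trading off a simplicity prior against how well each strategy explains the data. The function names and numbers here are illustrative assumptions, not from the paper.

```python
import numpy as np

def posterior_weights(log_priors, log_likelihoods):
    """Combine a simplicity prior and data fit into posterior weights."""
    log_post = np.array(log_priors) + np.array(log_likelihoods)
    log_post -= log_post.max()            # numerical stability
    w = np.exp(log_post)
    return w / w.sum()

# Strategy 0: simple (high prior) but fits the data poorly;
# Strategy 1: complex (low prior) but explains the data well.
w = posterior_weights(log_priors=[-1.0, -5.0],
                      log_likelihoods=[-50.0, -10.0])
print(w)  # the better-fitting strategy dominates despite its lower prior
```

This is why simplicity bias alone isn't predictive: once the likelihood term is included, a more complex hypothesis can win whenever it explains the data sufficiently better.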
ekdeepl.bsky.social
I am definitely not biased :)
ekdeepl.bsky.social
Check out one of the most exciting papers of the year! :D
danielwurgaft.bsky.social
🚨New paper! We know models learn distinct in-context learning strategies, but *why*? Why generalize instead of memorize to lower loss? And why is generalization transient?

Our work explains this & *predicts Transformer behavior throughout training* without its weights! 🧵

1/
ekdeepl.bsky.social
I'll be attending NAACL in New Mexico beginning today---hit me up if you'd like to chat!
ekdeepl.bsky.social
Check out our new work on the duality between SAEs and how concepts are organized in model representations!
sumedh-hindupur.bsky.social
New preprint alert!
Do Sparse Autoencoders (SAEs) reveal all concepts a model relies on? Or do they impose hidden biases that shape what we can even detect?
We uncover a fundamental duality between SAE architectures and concepts they can recover.
Link: arxiv.org/abs/2503.01822
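For context, a bare-bones sparse autoencoder over model activations looks like the sketch below. The architectural choices (ReLU encoder, L1 sparsity penalty, dictionary size) are illustrative assumptions, not the specific architectures analyzed in the paper; the paper's point is precisely that such choices bias which concepts can be recovered.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_dict = 64, 512  # activation dim, (overcomplete) dictionary size

W_enc = rng.normal(size=(d_model, d_dict)) / np.sqrt(d_model)
W_dec = rng.normal(size=(d_dict, d_model)) / np.sqrt(d_dict)
b_enc = np.zeros(d_dict)

def sae_forward(x):
    """Encode activations into sparse codes, then reconstruct."""
    z = np.maximum(x @ W_enc + b_enc, 0.0)   # ReLU -> nonnegative, sparse-ish codes
    return z @ W_dec, z

x = rng.normal(size=(8, d_model))            # a batch of activations
x_hat, z = sae_forward(x)
recon_loss = ((x_hat - x) ** 2).mean()       # reconstruction term
sparsity = np.abs(z).sum(axis=-1).mean()     # L1 penalty term
```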
ekdeepl.bsky.social
Oh god, I had no clue this happened. :/

Does bsky not do GIFs?
Reposted by Ekdeep Singh @ ICML
lampinen.bsky.social
Very nice paper; quite aligned with the ideas in our recent perspective on the broader spectrum of ICL. In large models, there's probably a complicated, context dependent mixture of strategies that get learned, not a single ability.
ekdeepl.bsky.social
New paper–accepted as *spotlight* at #ICLR2025! 🧵👇

We show a competition dynamic between several algorithms splits a toy model’s ICL abilities into four broad phases of train/test settings! This means ICL is akin to a mixture of different algorithms, not a monolithic ability.
ekdeepl.bsky.social
The dynamics of attention maps are particularly striking in this competition: e.g., with high diversity, we see a bigram counter forming, but memorization eventually occurs and the pattern becomes uniform! This means models can remove learned components once they are no longer useful!
ekdeepl.bsky.social
Beyond corroborating our phase diagram, LIA confirms a persistent competition underlies ICL: once the induction head forms, with enough diversity bigram-based inference takes over; under low diversity, memorization occurs faster, yielding a bigram-based retrieval solution!
ekdeepl.bsky.social
Given optimization and data diversity are independent axes, we then ask if these forces race against each other to yield our observed algorithmic phases. We propose a tool called LIA (linear interpolation of algorithms) for this analysis.
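A hedged sketch of the idea behind LIA: linearly interpolate two candidate algorithms' next-token distributions and find the mixture weight that best matches the model's distribution. The fitting criterion (KL divergence on a grid) and all numbers are illustrative assumptions.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    p, q = np.clip(p, eps, 1), np.clip(q, eps, 1)
    return float((p * np.log(p / q)).sum())

def best_interpolation(p_model, p_alg1, p_alg2, n_grid=101):
    """Grid-search alpha in p = alpha*alg1 + (1-alpha)*alg2."""
    alphas = np.linspace(0, 1, n_grid)
    kls = [kl(p_model, a * p_alg1 + (1 - a) * p_alg2) for a in alphas]
    return alphas[int(np.argmin(kls))]

p_alg1 = np.array([0.7, 0.2, 0.1])   # e.g. bigram-based inference
p_alg2 = np.array([0.1, 0.3, 0.6])   # e.g. retrieval of a memorized chain
p_model = 0.8 * p_alg1 + 0.2 * p_alg2  # model sits between the two
alpha_hat = best_interpolation(p_model, p_alg1, p_alg2)
print(alpha_hat)  # 0.8
```

Tracking the recovered weight over training steps is what lets one read off which algorithm is currently "winning" the race.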
ekdeepl.bsky.social
The tests check out! Before/after a critical number of train steps (where the induction head emerges), the model relies on unigram/bigram stats respectively. With few chains (less diversity), we see retrieval behavior: we can literally reconstruct transition matrices from MLP neurons!
ekdeepl.bsky.social
To test the above claim, we compute the effect of shuffling a sequence on next-token probs: this breaks bigram stats but preserves unigrams. We then measure how "retrieval-like" (memorization-based) the model's behavior is by comparing its predicted transitions' KL to a random set of chains.
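The shuffle test rests on a simple fact, sketched below with toy data (the sequence is illustrative): permuting a sequence preserves its unigram statistics exactly but generally destroys its bigram statistics, so a bigram-reliant predictor should change its outputs under shuffling while a unigram-reliant one should not.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
seq = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 2, 0]
shuffled = list(rng.permutation(seq))

def unigrams(s):
    """Token frequency statistics."""
    c = Counter(s)
    return {k: v / len(s) for k, v in c.items()}

def bigrams(s):
    """Transition (pair) frequency statistics."""
    c = Counter(zip(s, s[1:]))
    return {k: v / (len(s) - 1) for k, v in c.items()}

print(unigrams(seq) == unigrams(shuffled))  # True: unigrams preserved
print(bigrams(seq) == bigrams(shuffled))    # typically False: bigrams broken
```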
ekdeepl.bsky.social
We claim four algorithms explain the model's behavior in different train/test settings. These algorithms compute uni-/bi-gram frequency statistics of an input to either *retrieve* a memorized chain or to in-context *infer* the chain used to define the input: the latter performs better OOD!
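The "in-context inference" family can be sketched as follows: estimate a transition matrix from the in-context sequence itself, then predict the next token from the current state. The additive-smoothing choice and toy sequence are illustrative assumptions.

```python
import numpy as np

def infer_transition_matrix(seq, n_states, alpha=0.1):
    """Estimate a Markov transition matrix from one sequence's bigram counts."""
    counts = np.full((n_states, n_states), alpha)   # additive smoothing
    for a, b in zip(seq, seq[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

seq = [0, 1, 2, 0, 1, 2, 0, 1]                      # a 0 -> 1 -> 2 cycle
T = infer_transition_matrix(seq, n_states=3)
print(T[1].argmax())  # 2: after state 1, state 2 is most likely
```

A retrieval-style algorithm would instead match these statistics against memorized training chains and output the stored chain's transitions, which is why it fails OOD while inference generalizes.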
ekdeepl.bsky.social
We analyze models trained on a fairly simple task: learning to simulate a *finite mixture* of Markov chains. The sequence-modeling nature of this task makes it a better abstraction for studying ICL abilities in LMs (compared to abstractions of few-shot learning like linear regression).
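A data-generating sketch of that task: sample one of K fixed Markov chains, then roll it out into a training sequence. The mixture size, state count, and uniform chain-sampling are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K, n_states, seq_len = 4, 5, 32

# A finite mixture: K random transition matrices, fixed across training.
chains = rng.dirichlet(np.ones(n_states), size=(K, n_states))

def sample_sequence():
    """Pick a chain from the mixture, then roll it out."""
    T = chains[rng.integers(K)]
    s = [int(rng.integers(n_states))]
    for _ in range(seq_len - 1):
        s.append(int(rng.choice(n_states, p=T[s[-1]])))
    return s

seq = sample_sequence()
print(len(seq))  # 32
```

At test time, one can probe OOD by rolling out a chain *not* in the mixture: retrieval-style solutions snap to a memorized chain, while inference-style solutions track the novel one.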
ekdeepl.bsky.social
ajyl.bsky.social
New paper <3
Interested in inference-time scaling? In-context Learning? Mech Interp?
LMs can solve novel in-context tasks, with sufficient examples (longer contexts). Why? Bc they dynamically form *in-context representations*!
1/N
ekdeepl.bsky.social
Some threads about recent works ;-)

bsky.app/profile/ekde...
ekdeepl.bsky.social
Paper alert—accepted as a *Spotlight* at NeurIPS!🧵

Building on our work relating emergent abilities to task compositionality, we analyze the *learning dynamics* of compositional abilities & find there exist latent interventions that can elicit them well before input prompting works! 🤯
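A hedged sketch of what a "latent intervention" means here: nudge a hidden representation along a concept direction to elicit a capability, rather than prompting at the input. The direction, layer, and scale below are all illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 16
h = rng.normal(size=d_model)                  # hidden state at some layer
concept_direction = rng.normal(size=d_model)
concept_direction /= np.linalg.norm(concept_direction)  # unit norm

def intervene(h, direction, scale=2.0):
    """Steer a hidden state along a concept direction."""
    return h + scale * direction

h_steered = intervene(h, concept_direction)
shift = (h_steered - h) @ concept_direction   # displacement along the direction
print(round(shift, 6))  # 2.0
```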
ekdeepl.bsky.social
Our group, funded by NTT Research, Inc., uniquely bridges industry and academia. We integrate approaches from physics, neuroscience, and psychology while grounding our work in empirical AI research.

Apply via the "Physics of AI Group Research Intern" posting: careers.ntt-research.com