Lightnews — Scholar-powered news

Reposted by Koyena Pal

Aaron Mueller @amuuueller.bsky.social · 7d

What's the right unit of analysis for understanding LLM internals? We explore in our mech interp survey (a major update from our 2024 ms).

We’ve added more recent work and more immediately actionable directions for future work. Now published in Computational Linguistics!

Koyena Pal @koyena.bsky.social · Jun 30

🚨 Registration is live! 🚨

The New England Mechanistic Interpretability (NEMI) Workshop is happening Aug 22nd 2025 at Northeastern University!

A chance for the mech interp community to nerd out on how models really work 🧠🤖

🌐 Info: nemiconf.github.io/summer25/
📝 Register: forms.gle/v4kJCweE3UUH...

Reposted by Koyena Pal

Sheridan Feucht @ COLM @sfeucht.bsky.social · Apr 7

[📄] Are LLMs mindless token-shifters, or do they build meaningful representations of language? We study how LLMs copy text in-context, and physically separate out two types of induction heads: token heads, which copy literal tokens, and concept heads, which copy word meanings.

Koyena Pal @koyena.bsky.social · Mar 5

The other key tasks are model search and benchmarking, with important applications like document generation and auditing.

Read more in our paper (with @davidbau.bsky.social and Renée Miller) here: arxiv.org/abs/2403.02327

Excited to share that this is accepted to #EDBT2025! 🎉

🧵5/5

Model Lakes

Given a set of deep learning models, it can be hard to find models appropriate to a task, understand the models, and characterize how models are different one from another. Currently, practitioners re...

arxiv.org

Koyena Pal @koyena.bsky.social · Mar 5

The second one is model versioning — where the aim is to map a model’s position within a lake of models, capturing these relationships using directed model graphs.

Other tasks, like model tree heritage recovery and differentiating outputs from various LLMs, are part of model versioning.

🧵4/5

Koyena Pal @koyena.bsky.social · Mar 5

We see four major tasks for Model Lakes.

The first is model attribution — tracing & understanding a model's output through attack techniques like model inversion (recovering user inputs) and interpretability methods like reverse engineering to analyze model behavior.

bsky.app/profile/srus...

🧵3/5

Sasha Rush @srushnlp.bsky.social · Nov 20

Talk: Inverting Language Models

youtube.com/watch?v=lguT...

Techniques for extracting text from vector databases and prompts from LLM APIs.

Inverting Language Models: Raw Text from Vectors and LLM APIs

Talk on inverting language models, i.e. * Can we extract the raw text out of vector databases? * Can we extract system prompts out of LLM APIs, even when jai...

https://youtube.com/watch?v=lguThumFUl4…

Koyena Pal @koyena.bsky.social · Mar 5

A Model Lake is a system containing numerous heterogeneous pre-trained models and related data in their natural formats. The concept is inspired by data lakes, which collect raw, unstructured data at scale.

By addressing shared challenges across research, we can unlock meaningful solutions. 👇

🧵2/5

Koyena Pal @koyena.bsky.social · Mar 5

🚀 How would you know what model to use? 🤗

With millions of models emerging rapidly, how do we verify, track, and find the right one?

We survey and formalize Model Lakes 🌊🤖 — a framework to structure, navigate, and make sense of this landscape.

Website: lakes.baulab.info

#AI #Database

🧵1/5

Model Lakes Design. A model lake stores models and processes them using techniques such as inference, interpretability, weight-space modeling, and indexing to support various user interactions. It generates outputs such as version graphs, model cards, and ranked models, refining them into human-readable results, as shown on the right side of the figure.

Reposted by Koyena Pal

David Bau @davidbau.bsky.social · Dec 7

PhD Applicants: remember that the Northeastern Computer Science PhD application deadline is Dec 15.

It's a terrific time to do a PhD, with so many interesting things happening in AI.

Apply here:

www.khoury.northeastern.edu/apply/phd-ap...

PhD Apply - Khoury College of Computer Sciences

www.khoury.northeastern.edu

Reposted by Koyena Pal

NDIF Team @ndif-team.bsky.social · Dec 10

More big news! Applications are open for the NDIF Summer Engineering Fellowship—an opportunity to work on cutting-edge AI research infrastructure this summer in Boston! 🚀
