Karen Ullrich (s/h) ✈️ COLM
@karen-ullrich.bsky.social
4.2K followers 120 following 26 posts
Research scientist at FAIR NY ❤️ LLMs + Information Theory. Previously, PhD at UoAmsterdam, intern at DeepMind + MSRC.
Pinned
karen-ullrich.bsky.social
#Tokenization is undeniably a key player in the success story of #LLMs but we poorly understand why.
I want to highlight progress we made in understanding the role of tokenization, identifying its core issues, and mitigating its problems. 🧵👇
karen-ullrich.bsky.social
Y’all, I am at #COLM this week, very excited to learn and to meet old and new friends. Please reach out on Whova!
karen-ullrich.bsky.social
Plus, we generate importance maps showing where in the transformer the concept is encoded — providing interpretable insights into model internals.
karen-ullrich.bsky.social
SAMI: Diminishes or amplifies these modules to control the concept's influence

With SAMI, we can scale the importance of these modules — either amplifying or suppressing specific concepts.
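A minimal sketch of that scaling step (my own illustration under assumed tensor shapes, not the paper's implementation): given per-head attention outputs and the indices of the concept heads, multiply those heads by a factor s, with s < 1 suppressing and s > 1 amplifying the concept.

```python
# Minimal SAMI-style sketch (illustrative only): scale selected attention heads.
import torch

def scale_heads(head_outputs, selected, s):
    """head_outputs: (batch, n_heads, seq, head_dim) per-head attention outputs.
    selected: indices of heads tied to the concept (e.g. found by SAMD).
    s: scaling factor; s < 1 suppresses the concept, s > 1 amplifies it."""
    scaled = head_outputs.clone()
    scaled[:, selected] = s * scaled[:, selected]
    return scaled

# Example: suppress heads 3 and 7 of a 12-head layer before the output projection.
x = torch.randn(1, 12, 16, 64)                 # (batch, heads, seq, head_dim)
x_suppressed = scale_heads(x, [3, 7], s=0.1)
```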
karen-ullrich.bsky.social
SAMD: Finds the attention heads most correlated with a concept

Using SAMD, we find that only a few attention heads are crucial for a wide range of concepts—confirming the sparse, modular nature of knowledge in transformers.
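A toy sketch of that selection step (my own illustration, not the paper's code), assuming we have pooled per-head activations and a binary concept label for each prompt: score every (layer, head) pair by its correlation with the label and keep the top-k.

```python
# Toy SAMD-style sketch (illustrative only): rank attention heads by how strongly
# their pooled activations correlate with a concept label across prompts.
import torch

def top_concept_heads(head_acts, concept, k=5):
    """head_acts: (n_examples, n_layers, n_heads) pooled per-head activations.
    concept: (n_examples,) concept signal, e.g. 1 if the prompt is about dogs.
    Returns the k (layer, head) pairs with the largest |correlation|."""
    n, L, H = head_acts.shape
    flat = head_acts.reshape(n, L * H)
    c = concept - concept.float().mean()
    a = flat - flat.mean(dim=0)
    corr = (a * c[:, None]).mean(dim=0) / (a.std(dim=0) * c.std() + 1e-8)
    idx = corr.abs().topk(k).indices
    return [(int(i) // H, int(i) % H) for i in idx]

# Example with random data: 100 prompts, 12 layers x 12 heads.
acts = torch.randn(100, 12, 12)
labels = torch.randint(0, 2, (100,))
print(top_concept_heads(acts, labels, k=3))
```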
karen-ullrich.bsky.social
How would you make an LLM "forget" the concept of dog — or any other arbitrary concept? 🐶❓

We introduce SAMD & SAMI — a novel, concept-agnostic approach to identify and manipulate attention modules in transformers.
karen-ullrich.bsky.social
Aligned Multi-Objective Optimization (A-🐮) has been accepted at #ICML2025! 🎉
We explore optimization scenarios where objectives align rather than conflict, introducing new scalable algorithms with theoretical guarantees. #MachineLearning #AI #Optimization
karen-ullrich.bsky.social
🎉🎉 Our paper just got accepted to #ICLR2025! 🎉🎉

Byte-level LLMs without training and guaranteed performance? Curious how? Dive into our work! 📚✨

Paper: arxiv.org/abs/2410.09303
Github: github.com/facebookrese...
Screenshot of arxiv paper "EXACT BYTE-LEVEL PROBABILITIES FROM TOKENIZED LANGUAGE MODELS FOR FIM-TASKS AND MODEL ENSEMBLES."
karen-ullrich.bsky.social
Thursday is busy:
9-11am I will be at the Meta AI Booth
12.30-2pm
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs (neurips.cc/virtual/2024...)
OR
End-To-End Causal Effect Estimation from Unstructured Natural Language Data (neurips.cc/virtual/2024...)
NeurIPS Poster: Mission Impossible: A Statistical Perspective on Jailbreaking LLMs (NeurIPS 2024, neurips.cc)
karen-ullrich.bsky.social
Starting with Fei-Fei Li’s talk at 2.30; after that I will mostly be meeting people and wandering the poster sessions.
karen-ullrich.bsky.social
Folks, I am posting my NeurIPS schedule daily in hopes of seeing folks, thanks @tkipf.bsky.social for the idea ;)

11-12.30 WiML round tables
1.30-4 Beyond Decoding, Tutorial
karen-ullrich.bsky.social
I will be at #Neurips2024 next week to talk about these two papers and host a workshop on #NeuralCompression.
karen-ullrich.bsky.social
🎉 Exciting News! 🎉
Two papers have been accepted at #NeurIPS2024 ! 🙌🏼 These papers are the first outcomes of my growing focus on LLMs. 🍾 Cheers to Nikita Dhawan and Jingtong Su + all involved collaborators: @cmaddis.bsky.social Leo Cotta, Rahul Krishnan, Julia Kempe
karen-ullrich.bsky.social
next one on the list is Yury Polyanskiy's "Information Theory: From Coding to Learning", which will hopefully hit the shelves in February... can't wait
karen-ullrich.bsky.social
Pro-tip: Use massive Black Friday deals at scientific publishing houses to, for example, buy a copy of the book on generative modeling by @jmtomczak.bsky.social (long overdue)
karen-ullrich.bsky.social
What do you think: do we need to sharpen our understanding of tokenization? Or will we soon be rid of it by developing models such as "MegaByte" by Yu et al.?
And add more papers to the thread!
karen-ullrich.bsky.social
Phan et al. found a method to mitigate some of the tokenization problems Karpathy mentioned by projecting tokens into byte space. The key to their method is a map between statistically equivalent token-level and byte-level models.
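A simplified sketch of that projection (my own illustration, not the exact algorithm from the paper, which also handles token/byte boundary effects): marginalize the next-token distribution into a next-byte distribution by summing the probability of every token whose byte expansion starts with a given byte.

```python
# Toy token-to-byte projection (illustrative only, ignores boundary subtleties).
from collections import defaultdict

def next_byte_distribution(next_token_probs, token_bytes):
    """next_token_probs: dict token_id -> probability from the token-level LM.
    token_bytes: dict token_id -> bytes (the token's byte expansion)."""
    byte_probs = defaultdict(float)
    for tok, p in next_token_probs.items():
        b = token_bytes[tok]
        if b:                              # skip empty expansions
            byte_probs[b[:1]] += p
    return dict(byte_probs)

# Tiny example with a 3-token vocabulary.
probs = {0: 0.5, 1: 0.3, 2: 0.2}
vocab = {0: b"the", 1: b"to", 2: b" a"}
print(next_byte_distribution(probs, vocab))   # {b't': 0.8, b' ': 0.2}
```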
karen-ullrich.bsky.social
In "The Foundations of Tokenization:
Statistical and Computational Concerns", Gastaldi et al. try to make first steps towards defining what a tokenizer should be and define properties it ought to have.
karen-ullrich.bsky.social
In "Toward a Theory of Tokenization in LLMs" Rajaraman et al., the authors discuss why we can think of tokenization to cause lower perplexity/ a better entropy bound.
karen-ullrich.bsky.social
A must-watch entry point is @karpathy.bsky.social's "Let's build the GPT Tokenizer" video, where he discusses some tokenization problems.
karen-ullrich.bsky.social
🚨 Internship Opportunity at FAIR NY 🚨

I got one PhD internship position available for 2025!

Interested in exploring the intersection of information theory, probabilistic reasoning, and LLMs?

📩 Send me a DM with your CV, website, and GScholar profile by October 14th.