Ambroise Odonnat
@ambroiseodt.bsky.social
84 followers 130 following 36 posts
Ph.D. student in Machine Learning at Inria. Website: https://ambroiseodt.github.io/ Blog: https://logb-research.github.io
Pinned
ambroiseodt.bsky.social
🚨So, you want to predict your model's performance at test time?🚨

💡Our NeurIPS 2024 paper proposes 𝐌𝐚𝐍𝐨, a training-free and SOTA approach!

📑 arxiv.org/pdf/2405.18979
🖥️https://github.com/Renchunzi-Xie/MaNo

1/🧵(A surprise at the end!)
Reposted by Ambroise Odonnat
rflamary.bsky.social
SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation On Diverse Modalities, has been published in TMLR today 🚀. It was a huge team effort to design (and publish) an open-source, fully reproducible DA benchmark 🧵1/n. openreview.net/forum?id=k9F...
SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods...
Unsupervised Domain Adaptation (DA) consists of adapting a model trained on a labeled source domain to perform well on an unlabeled target domain with some data distribution shift. While many...
openreview.net
ambroiseodt.bsky.social
🚀 We are happy to organize the BERT²S workshop @neuripsconf.bsky.social 2025 on Recent Advances in Time Series Foundation Models.
🌐 berts-workshop.github.io
📜Submit by August 22
🎓Speakers and panelists: Chenghao Liu, Mingsheng Long, Zoe Piran, Danielle C. Maddix, Ameet Talwalkar, Qingsong Wen
ambroiseodt.bsky.social
🚀 Very happy to be presenting Large Language Models as Markov Chains at Cohere Labs on June 19th at 6 pm CET (Paris time)!!

Huge thanks to Andrej Jovanović @cohere.com @cohereforai.bsky.social for the invitation 🤗

Paper: arxiv.org/pdf/2410.02724
Learn more: cohere.com/events/Coher...
Reposted by Ambroise Odonnat
tgnassou.bsky.social
Skada Sprint Alert: Contribute to Domain Adaptation in Python

📖 Machine learning models often fail when the data distribution changes between training and testing. That’s where Domain Adaptation comes in — helping models stay reliable across domains.
ambroiseodt.bsky.social
🤗Thanks a lot @haeggee.bsky.social and @mjaggi.bsky.social for having me in the MLO group at EPFL @icepfl.bsky.social to present "Large Language Models as Markov Chains".

Slides are available on my website (link in thread).

🎉 New experiments with Llama and Gemma models in the updated paper!
ambroiseodt.bsky.social
🤗 Very happy to have (humbly) contributed to this work!

This is a collab with the usual open-source suspects from Inria, @polytechniqueparis.bsky.social and @univparissaclay.bsky.social.

Check it out if you are interested in open-source reproducible research 😇
tgnassou.bsky.social
🚀 I’m pleased to announce a new preprint!

"SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation On Diverse Modalities"

📢 Check it out & contribute!
📜 Paper: arxiv.org/abs/2407.11676
💻 Code: github.com/scikit-adapt...
Reposted by Ambroise Odonnat
ozekri.bsky.social
🚀 Policy gradient methods like DeepSeek’s GRPO are great for finetuning LLMs via RLHF.

But what happens when we swap autoregressive generation for discrete diffusion, a rising paradigm promising faster & more controllable LLMs?

Introducing SEPO !

📑 arxiv.org/pdf/2502.01384

🧵👇
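
Background, not from the post and not SEPO itself: a minimal PyTorch sketch of the vanilla score-function (REINFORCE) estimator that policy-gradient methods such as GRPO build on, shown on a toy categorical policy.

```python
# Background sketch, NOT SEPO: vanilla score-function (REINFORCE) policy gradient
# on a toy 5-action categorical policy. Policy-gradient fine-tuning methods
# (PPO, GRPO, ...) build on this estimator: grad J = E[ R * grad log pi(a) ].
import torch

torch.manual_seed(0)
logits = torch.zeros(5, requires_grad=True)      # policy parameters over 5 actions
optimizer = torch.optim.SGD([logits], lr=0.1)

def reward(action: torch.Tensor) -> float:
    return 1.0 if action.item() == 3 else 0.0    # toy reward: prefer action 3

for _ in range(500):
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()
    loss = -reward(action) * dist.log_prob(action)   # minimize -R * log-prob
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(torch.softmax(logits, dim=0))              # probability mass concentrates on action 3
```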
ambroiseodt.bsky.social
Finally, I can't thank you enough, Wes and @viviencabannes.bsky.social, for this collab: you are a rare combination of super-smart and fun to work with!

Hopefully, more to come soon🤠

"Moi, si je devais résumer ma vie aujourd’hui avec vous, je dirais que c’est d’abord des rencontres."
ambroiseodt.bsky.social
We want to thank Elvis Dohmatob, Eshaan Nichani, @giupaolo.bsky.social , Faniriana Rakoto Endor, and Ievgen Redko for fruitful discussions during the elaboration of this work 😇
ambroiseodt.bsky.social
On the theoretical side, we show that clustering heads can be learned via gradient descent and provide insights into the two-stage learning observed in practice.
6/🧵
ambroiseodt.bsky.social
We investigate loss spikes and suggest mitigation strategies that could lead to more stable training. We also look into the transferability of circuits, showcasing the usefulness of curriculum learning and data curation.
5/🧵
ambroiseodt.bsky.social
In the second, we unveil "𝑪𝒍𝒖𝒔𝒕𝒆𝒓𝒊𝒏𝒈 𝑯𝒆𝒂𝒅𝒔", circuits that learn the invariance of the task. Their training dynamics unfold in two phases: 1) clustering of the attention embeddings according to the task's invariance and 2) classifier fitting.
4/🧵
ambroiseodt.bsky.social
In the first paper, we show how gradient descent (GD) reinforces useful circuits in Transformers while pruning others, creating sub-circuits that help solve complex tasks by breaking them down into intermediate reasoning steps.

3/🧵
ambroiseodt.bsky.social
We consider the 𝒔𝒑𝒂𝒓𝒔𝒆 𝒎𝒐𝒅𝒖𝒍𝒂𝒓 𝒂𝒅𝒅𝒊𝒕𝒊𝒐𝒏 problem, where the inputs are sequences of L tokens in the ring of integers modulo p and the corresponding targets are the sum of the first k terms modulo p. Formally, we aim to learn the mapping f : (x_1, …, x_L) ↦ x_1 + … + x_k mod p, with each x_i in {0, …, p − 1}.

2/🧵
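
Not from the thread: a minimal NumPy sketch of the data-generating process defined above, together with a check of one of the task's invariances (to the last L − k tokens).

```python
# Toy sketch, not the paper's code: generate sparse modular addition data
# (sequences of L tokens in Z/pZ, target = sum of the first k tokens mod p)
# and check the task's invariance to the last L - k tokens.
import numpy as np

p, L, k = 11, 8, 3                     # modulus, sequence length, number of summed tokens
rng = np.random.default_rng(0)

def sample_batch(n: int):
    x = rng.integers(0, p, size=(n, L))            # token sequences in {0, ..., p-1}
    y = x[:, :k].sum(axis=1) % p                   # target: sparse modular sum
    return x, y

x, y = sample_batch(4)

# Invariance check: resampling the tokens after position k leaves the target unchanged.
x_pert = x.copy()
x_pert[:, k:] = rng.integers(0, p, size=(len(x), L - k))
assert np.array_equal(y, x_pert[:, :k].sum(axis=1) % p)
```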
ambroiseodt.bsky.social
🚀Proud to share our work on the training dynamics in Transformers with Wassim Bouaziz & @viviencabannes.bsky.social @Inria @MetaAI

📝Easing Optimization Paths arxiv.org/pdf/2501.02362 (accepted @ICASSP 2025 🥳)

📝Clustering Heads 🔥https://arxiv.org/pdf/2410.24050

🖥️ github.com/facebookrese...

1/🧵
ambroiseodt.bsky.social
🎤Presenting our work on Unsupervised Accuracy Estimation at #NeurIPS2024 this week!

✋🏾Poster Session 4 West - on Thu. at 4:30 pm

📍 Poster #4310 - East Exhibit Hall A-C

DM me if you'd like to chat :)
ambroiseodt.bsky.social
Check out the new version of this awesome domain adaptation library! So nice to work with such good people 🤗
tgnassou.bsky.social
🚀 Skada v0.4.0 is out!

Skada is an open-source Python library built for domain adaptation (DA), helping machine learning models to adapt to distribution shifts.
Github: github.com/scikit-adapt...
Doc: scikit-adaptation.github.io
DOI: doi.org/10.5281/zeno...
Installation: `pip install skada`
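
Not from the post: a minimal usage sketch assuming Skada's scikit-learn-style API. The names CORALAdapter and make_da_pipeline and the sample_domain convention below are taken from my reading of the docs and should be checked against the documentation linked above.

```python
# Minimal sketch, NOT from the post: how a Skada DA pipeline is typically set up.
# The names CORALAdapter / make_da_pipeline and the sample_domain convention
# (positive = source, negative = target, target labels masked with -1) are
# assumptions based on the scikit-adaptation docs; verify against the links above.
import numpy as np
from sklearn.linear_model import LogisticRegression
from skada import CORALAdapter, make_da_pipeline

rng = np.random.default_rng(0)

# Toy source/target data with a distribution shift.
X_src = rng.normal(0.0, 1.0, size=(200, 2))
y_src = (X_src[:, 0] > 0).astype(int)
X_tgt = rng.normal(0.5, 1.5, size=(200, 2))  # shifted target domain, unlabeled

X = np.concatenate([X_src, X_tgt])
y = np.concatenate([y_src, -np.ones(len(X_tgt), dtype=int)])        # -1 marks unlabeled target
sample_domain = np.concatenate([np.ones(len(X_src)), -np.ones(len(X_tgt))])

# Align source to target (CORAL), then fit a standard classifier.
pipe = make_da_pipeline(CORALAdapter(), LogisticRegression())
pipe.fit(X, y, sample_domain=sample_domain)
preds = pipe.predict(X_tgt)  # some adapters may also need sample_domain at predict time
```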
ambroiseodt.bsky.social
Hi @vickiboykis.com, thanks for your interest. Don't hesitate to reach out if you have any questions about the paper; @ozekri.bsky.social and I would be happy to help :)
ambroiseodt.bsky.social
Ahah, thanks, still a lot to learn before that 😅
ambroiseodt.bsky.social
🤗This is joint work with Renchunzi Xie, Vasilii Feofanov, Weijian Deng, Jianfeng Zhang, and Bo An.

Finally, I want to thank @ramealexandre.bsky.social and Youssef Attia El Hili for fruitful discussions during the elaboration of this work.

🧵/🧵
ambroiseodt.bsky.social
🥳Finally, the awaited surprise!
Our work includes a result akin to that of
@petar-v.bsky.social in “softmax is not enough (for sharp out-of-distribution)” (arxiv.org/pdf/2410.01104). We discuss its implications in the context of unsupervised accuracy estimation.

12/🧵
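
Not from the thread: a toy numerical illustration of the dispersion effect that paper formalizes. With logits bounded in [-B, B], the largest softmax probability is at most 1 / (1 + (n − 1)·e^(−2B)), so it shrinks towards 0 as the number of items n grows.

```python
# Toy illustration, not from the paper: softmax over bounded logits cannot stay sharp.
# With logits in [-B, B], the max probability is at most 1 / (1 + (n - 1) * exp(-2B)).
import numpy as np

B = 5.0                                 # bound on the logits
for n in [10, 100, 1_000, 10_000, 100_000, 1_000_000]:
    logits = np.full(n, -B)
    logits[0] = B                       # best case: one logit at +B, the rest at -B
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    print(f"n = {n:>9,d}   max softmax prob = {probs.max():.4f}")   # decays towards 0
```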