Alexandre Ramé
@ramealexandre.bsky.social
280 followers 89 following 3 posts
Research Scientist at DeepMind. PhD from Sorbonne Université. Merging and aligning Gemmas. https://alexrame.github.io/
Reposted by Alexandre Ramé
damienteney.bsky.social
Coming up at ICML: 🤯Distribution shifts are still a huge challenge in ML. There's already a ton of algorithms to address specific conditions. So what if the challenge was just selecting the right algorithm for the right conditions?🤔🧵
Reposted by Alexandre Ramé
natolambert.bsky.social
ChatBotArena is far from the first eval to be overfit to. It's becoming underrated. Likely the single most impactful evaluation project since ChatGPT. The labs are the ones releasing these slightly off models.
Reposted by Alexandre Ramé
serge.belongie.com
Would you present your next NeurIPS paper in Europe instead of traveling to San Diego (US) if this was an option? Søren Hauberg (DTU) and I would love to hear the answer through this poll: (1/6)
NeurIPS participation in Europe
We seek to understand if there is interest in being able to attend NeurIPS in Europe, i.e. without travelling to San Diego, US. In the following, assume that it is possible to present accepted papers ...
docs.google.com
ramealexandre.bsky.social
Hiring two student researchers for Gemma post-training team at @GoogleDeepMind Paris! First topic is about diversity in RL for LLMs (merging, generalization, exploration & creativity), second is about distillation. Ideal if you're finishing PhD. DMs open!
Reposted by Alexandre Ramé
jeffdean.bsky.social
🥁Introducing Gemini 2.5, our most intelligent model with impressive capabilities in advanced reasoning and coding.

Now integrating thinking capabilities, 2.5 Pro Experimental is our most performant Gemini model yet. It’s #1 on the LM Arena leaderboard. 🥇
Reposted by Alexandre Ramé
natolambert.bsky.social
This is a very tidy little RL paper for reasoning. Their GRPO changes:
1 Two different clip hyperparams, so positive clipping can uplift more unexpected tokens
2 Dynamic sampling -- remove samples w flat reward in batch
3 Per token loss
4 Managing too long generations in loss
dapo-sia.github.io
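A minimal sketch of the first three changes listed in the post above, using plain Python floats rather than tensors. The function names, the epsilon defaults, and the reward-group layout are illustrative assumptions, not taken from the post or from the paper's code.

```python
def clipped_objective(ratio, advantage, eps_low=0.2, eps_high=0.28):
    """PPO-style clipped surrogate for one token, with separate lower and
    upper clip ranges (the epsilon defaults here are illustrative)."""
    clipped_ratio = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped_ratio * advantage)


def keep_informative_groups(reward_groups):
    """Dynamic sampling: drop groups in which every sample got the same
    reward, since a flat reward gives a zero group-relative advantage."""
    return [g for g in reward_groups if max(g) != min(g)]


def token_level_loss(per_sample_token_objectives):
    """Per-token loss: average over all tokens in the batch, so longer
    samples contribute more terms than shorter ones."""
    tokens = [t for sample in per_sample_token_objectives for t in sample]
    return -sum(tokens) / len(tokens)
```

With a larger upper range, a positive-advantage token whose probability ratio rises above 1 + eps_low still receives gradient until it reaches 1 + eps_high.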
ramealexandre.bsky.social
Welcome Gemma 3, Google’s new open-weight LLM. All sizes (1B, 4B, 12B and 27B) excel on benchmarks, but the key result may be the 27B reaching 1338 on LMSYS. For this, we scaled post-training, with our novel distillation, RL and merging strategies.

Report: storage.googleapis.com/deepmind-med...
Reposted by Alexandre Ramé
andrewgwils.bsky.social
My new paper "Deep Learning is Not So Mysterious or Different": arxiv.org/abs/2503.02113. Generalization behaviours in deep learning can be intuitively understood through a notion of soft inductive biases, and formally characterized with countable hypothesis bounds! 1/12
ramealexandre.bsky.social
Modern post-training is essentially distillation then RL. While reward hacking is well-known and feared, could there be such a thing as teacher hacking? Our latest paper confirms it. Fortunately, we also show how to mitigate it! The secret: diversity and onlineness! arxiv.org/abs/2502.02671
Reposted by Alexandre Ramé
honualx.bsky.social
We just released the Helium-1 model, a 2B multi-lingual LLM which @exgrv.bsky.social and @lmazare.bsky.social have been crafting for us! Best model so far under 2.17B params on multi-lingual benchmarks 🇬🇧🇮🇹🇪🇸🇵🇹🇫🇷🇩🇪
On HF, under CC-BY licence: huggingface.co/kyutai/heliu...
Reposted by Alexandre Ramé
natolambert.bsky.social
ILYA: "PRETRAINING IS DONE. WE ARE NOW IN THE POST TRAINING ERA."
Reposted by Alexandre Ramé
natolambert.bsky.social
Of all of OpenAI's days, the RL API is still the most revealing of the state of AI research trends. Lots of open doors for those looking at RL.

OpenAI's Reinforcement Finetuning and RL for the masses
The cherry on Yann LeCun’s cake has finally been realized.
buff.ly
Reposted by Alexandre Ramé
ambroiseodt.bsky.social
🚨So, you want to predict your model's performance at test time?🚨

💡Our NeurIPS 2024 paper proposes 𝐌𝐚𝐍𝐨, a training-free and SOTA approach!

📑 arxiv.org/pdf/2405.18979
🖥️https://github.com/Renchunzi-Xie/MaNo

1/🧵(A surprise at the end!)
Reposted by Alexandre Ramé
lchoshen.bsky.social
The right place for your phd:
With Colin Raffel, UofT works on decentralizing, democratizing, and derisking large-scale AI. Wanna work on model m(o)erging, collaborative/decentralized learning, identifying & mitigating risks, etc. Apply (deadline is Monday!)
web.cs.toronto.edu/graduate/how...
🤖📈
How to Apply — Department of Computer Science, University of Toronto
web.cs.toronto.edu
Reposted by Alexandre Ramé
eloizablocki.bsky.social
🥐 Building a Computer Vision FR Starter Pack!

👉 Who else should be included?

Comment below or DM me to be added

go.bsky.app/dfvcLZ
Reposted by Alexandre Ramé
douillard.bsky.social
distributed learning for LLM?

recently, @primeintellect.bsky.social have announced finishing their 10B distributed learning, trained across the world.

what is it exactly?

🧵
Reposted by Alexandre Ramé
abursuc.bsky.social
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling by Bairu Hou et al. #ICML2024
tl;dr: generate multiple clarifications of input txt w/ external LLM then forward:
>disagreement btw outputs -> data uncertainty
>avg uncertainty in each output -> model uncertainty
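A minimal sketch of the split described in the tl;dr above. The post does not say how disagreement is measured; this sketch assumes the common entropy-based decomposition, where total uncertainty is the entropy of the averaged prediction and disagreement is what remains after subtracting the average per-output entropy. The function names and the input layout (one answer distribution per clarification) are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def decompose_uncertainty(dists):
    """dists: one answer distribution per clarified input, all over the
    same answer set. Returns total, data, and model uncertainty."""
    n = len(dists)
    k = len(dists[0])
    mean = [sum(d[i] for d in dists) / n for i in range(k)]
    total = entropy(mean)
    # Average uncertainty within each output -> model uncertainty.
    model = sum(entropy(d) for d in dists) / n
    # Disagreement between outputs -> data uncertainty (clamped at 0
    # to absorb floating-point rounding).
    data = max(0.0, total - model)
    return {"total": total, "data": data, "model": model}
```

If two clarifications each yield a confident but different answer, all of the uncertainty lands in the data term; if they yield the same uncertain distribution, it all lands in the model term.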
Reposted by Alexandre Ramé
davidpicard.bsky.social
This year, there are 16 positions at CNRS in computer science (8 in "applied" domains → ask me - 8 on "fundamental" domains → ask the other David).

@mathurinmassias.bsky.social has a good list of advice mathurinm.github.io/cnrs_inria_a...

Official 🔗 www.ins2i.cnrs.fr/en/cnrsinfo/...

Don't wait!
Reposted by Alexandre Ramé
davidpicard.bsky.social
Feeling that Paris is "The Place To Be" for computer vision and AI in general.
Reposted by Alexandre Ramé
douillard.bsky.social
Min-p Sampling: arxiv.org/abs/2407.01082

1. Get max prob
2. Find min prob: a threshold in [0, 1] times that max prob
3. Gather only token probs above that min prob
4. Sample in that pool, according to renormalized probs

More robust to change in temperature!
Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs
Large Language Models (LLMs) generate text by sampling the next token from a probability distribution over the vocabulary at each decoding step. However, popular sampling methods like top-p (nucleus…
arxiv.org
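The four steps in the post above can be sketched in a few lines of standard-library Python. The names `min_p_filter`, `min_p_sample`, and `p_base` are illustrative, and keeping tokens whose probability equals the cutoff (so the top token always survives) is an assumption of this sketch.

```python
import random


def min_p_filter(probs, p_base):
    """Zero out tokens below p_base * max(probs), then renormalize."""
    if not 0.0 <= p_base <= 1.0:
        raise ValueError("p_base must lie in [0, 1]")
    p_max = max(probs)                     # 1. get max prob
    min_prob = p_base * p_max              # 2. scale threshold by max prob
    kept = [p if p >= min_prob else 0.0 for p in probs]  # 3. keep the pool
    total = sum(kept)
    return [p / total for p in kept]       # 4. renormalize


def min_p_sample(probs, p_base, rng=random):
    """Sample one token index from the min-p filtered distribution."""
    filtered = min_p_filter(probs, p_base)
    return rng.choices(range(len(filtered)), weights=filtered, k=1)[0]
```

Because the cutoff scales with the top probability, a peaked distribution keeps few tokens while a flat one keeps many, which is one way to read the post's claim about robustness to temperature.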
Reposted by Alexandre Ramé
csprofkgd.bsky.social
My growing list of #computervision researchers on Bsky.

Missed you? Let me know.

go.bsky.app/M7HGC3Y
Reposted by Alexandre Ramé
jay.bsky.team
Bluesky now has over 20M people!! 🎉

We've been adding over a million users per day for the last few days. To celebrate, here are 20 fun facts about Bluesky: