Nathan Lambert
natolambert.bsky.social
An LLN - large language Nathan - (RL, RLHF, society, robotics), athlete, yogi, chef
Writes http://interconnects.ai
At Ai2 via HuggingFace, Berkeley, and normal places
Opus?
Sorry, living under rocks today.
November 24, 2025 at 10:41 PM
I asked (on ChinaTalk) the head of product at Z.ai, one of the leading Chinese companies building open models, how long it takes them to get their model out the door once it's done training. Incredible stuff:

"a few hours" and the model is on HuggingFace.
www.chinatalk.media/p/the-zai-pl...
November 21, 2025 at 5:05 PM
People: "You must be so relaxed, proud, and happy that the model you worked on all year is out."

Me:
November 21, 2025 at 12:29 AM
The Epstein files have been trending on HuggingFace.

> This dataset is provided for...
> Evaluating information retrieval and retrieval augmented generation (RAG) systems.
> It is not intended for: Fine-tuning language models.

??
November 20, 2025 at 9:49 PM
Happy Olmo day to all who celebrate.
Sorry to all who delayed releases today to get out of our way.
We're hiring.
November 20, 2025 at 6:40 PM
We present Olmo 3, our next family of fully open, leading language models.
This family of 7B and 32B models represents:

1. The best 32B base model.
2. The best 7B Western thinking & instruct models.
3. The first 32B (or larger) fully open reasoning model.
November 20, 2025 at 2:32 PM
Chinese models are enabling AI research. US progress needs to be accelerated.
November 20, 2025 at 1:16 AM
A new tab on Google Scholar???
scholar.google.com/scholar_labs...
November 18, 2025 at 5:48 PM
I'm excited to announce my RLHF Book is now in pre-order for the @manning.com Early Access Program (MEAP), and for this milestone it's 50% off.

Excited to land in print in early 2026! Lots of improvements coming soon.

Thanks for the support!
hubs.la/Q03Tc37Q0
November 14, 2025 at 9:02 PM
Many people are sleeping on, or even making fun of, this plot in the GPT 5.1 release. This is a crucial plot for anyone serving a thinking model in real-world use cases. Latency to an answer is a huge cause of user churn, and not thinking enough is a fast track to having your model's output be bad.
November 13, 2025 at 7:18 PM
OpenAI showing very clearly why you should care about Character Training with GPT 5.1: It's the leading selling point of the release.
November 13, 2025 at 2:45 AM
Lol currently every X account with 2FA enabled is locked out, while accounts without 2FA can use the app as usual. Iconic levels of broken.
November 12, 2025 at 6:28 PM
I’m starting a new series of interviews with all the leading open model labs around the world to show why people are doing this, how people train great models, and where the ecosystem is going.
November 12, 2025 at 3:12 PM
New bike day!
November 9, 2025 at 1:18 AM
I appreciate the shoutout from @simonwillison.net

I'm building up a much richer (and direct) understanding of Chinese AI labs. Excited to share more here soon :)
November 7, 2025 at 6:13 PM
Thoughts on Kimi K2 Thinking
Congrats to the Moonshot AI team on the awesome open release. For close followers of Chinese AI models, this isn't shocking, but more inflection points are coming. Pressure is building on US labs with more expensive models.
www.interconnects.ai/p/kimi-k2-th...
November 6, 2025 at 6:53 PM
The Great Lock In
November 6, 2025 at 1:07 AM
We're starting to hire for our 2026 Olmo interns! Looking for excellent students to do research to help build our best models (primarily students enrolled in a Ph.D. program with experience or interest in any area of the language modeling pipeline).
job-boards.greenhouse.io/thealleninst...
November 5, 2025 at 11:27 PM
The first research on the fundamentals of character training, i.e., applying modern post-training techniques to ingrain specific character traits into models.

All models, datasets, code, etc. are released.
Really excited about this project! Sharan, the lead student author, was a joy to work with.
November 4, 2025 at 4:51 PM
Interesting chart: service-based sectors are using AI more. Even though, e.g., the US has far less trust in or optimism about AI than a place like China, this could be a resounding advantage in willingness to fund the endeavor as it gets even more expensive over the next couple of years.
November 4, 2025 at 2:54 AM
refreshing wrap to the weekend
November 3, 2025 at 2:07 AM
too real
November 1, 2025 at 4:03 PM
I'm a total sucker for nice RL training scaling plots.
They're very neglected vis-à-vis the much easier inference-time scaling plots.
October 29, 2025 at 5:30 PM
Cursor announced some new coding models. I'd put money on this being a fine-tune of one of the large Chinese MoE models.

Excited to see more companies able to train models that suit their needs. Bodes very well for the ecosystem that specific data is stronger than a bigger, general model.
October 29, 2025 at 5:22 PM
Most people working in the cutting edge of AI seem to have no long-term plan for their unsustainable work habits.
October 25, 2025 at 5:54 PM