Nathan Lambert
@natolambert.bsky.social
13K followers 270 following 1.6K posts
A LLN - large language Nathan - (RL, RLHF, society, robotics), athlete, yogi, chef Writes http://interconnects.ai At Ai2 via HuggingFace, Berkeley, and normal places
Pinned
natolambert.bsky.social
First draft online version of The RLHF Book is DONE. Recently I've been creating the advanced discussion chapters on everything from Constitutional AI to evaluation and character training, but I also sneak in consistent improvements to the RL-specific chapter.

rlhfbook.com
natolambert.bsky.social
no it's how AI works these days
natolambert.bsky.social
For folks at COLM, my talk is in 524C @ 12:00PM to share the various things that go into building a reasoning model from scratch. See you soon!

Will not be recorded and slides will only be released when we can get models out that we're happy with.
natolambert.bsky.social
When talks start with "we can all likely agree that reasoning is the path to AGI" you know it's going to be a doozy
natolambert.bsky.social
Talk from Wenting Zhao of Qwen on their plans during COLM. Seems like 1 word is the plan still: scaling training up! Let’s go.
natolambert.bsky.social
TODAY, so in like 80 minutes
natolambert.bsky.social
Open Models Talk at COLM 2025 is happening at 524C (end of the conference center) at 2pm.
natolambert.bsky.social
Perfect timing for COLM 2025 here in Montreal.
natolambert.bsky.social
Augmentation of humans and a restructuring of the research org is far more likely.

Among other thoughts on a great conference!

buff.ly/mzrRumA
Thoughts on The Curve
The conference and the trajectory.
natolambert.bsky.social
The Curve is a new style of mini AI conference to debate AI progress.

Here I reflect on it and explain why the argument that AI will fully replace human research engineers, and then scientists, is far-fetched in the years of compute scarcity.
natolambert.bsky.social
I can't believe that we, as a general public, are actually going to need to use "quadrillion" all the time in everyday discourse, starting with tokens processed.

We'll blink and measuring in trillions won't cut it anymore.
natolambert.bsky.social
Oh yeah I do this but wear many hats and things change fast these days…
natolambert.bsky.social
You should be spending 10+ minutes on slides per minute of your talk.
Doing too many talks makes it so you don't have time for top quality.
natolambert.bsky.social
One of the mind viruses we may never fully squash is that people think you should and CAN actually ban open-weight AI models in the U.S.

Good luck with that buckaroo. We have plenty of time to prepare for any potentially dangerous open models of the future.
natolambert.bsky.social
I gave a talk today at The Curve on the state of open models.
Here are the slides, recording soon.

Topics include: the Chinese ecosystem, reflections on DeepSeek, the demise of Llama, who will fill the U.S. market, what local models do, the ATOM Project & Ai2, and more
buff.ly/8BiC67C
natolambert.bsky.social
A ton of attention over the years goes to plots comparing open to closed models.
The real trend that matters for AI impacts on society is the gap between closed frontier models and local consumer models.
Local models passing major milestones will have major repercussions.
buff.ly/ccMJydQ
natolambert.bsky.social
What changed? Despite many wonderful models, Anthropic's success never really translated to LMArena.

The core question -- have LMArena's users shifted, or Anthropic's models? Or both?
natolambert.bsky.social
Mostly releasing base models, but it's not a substantive analysis, just a tweet
natolambert.bsky.social
Seeing Qwen as trying to build the Android of AI models (cheap, everywhere, powerful, modifiable) should give you a good sense of their strategy.
natolambert.bsky.social
Dean's recent piece: buff.ly/n2ygaU5
My original post: buff.ly/rUgJUxQ
More on character training with a model spec update: buff.ly/YwVvjiM
On sycophancy in GPT-4o, and how it related to the model spec: buff.ly/p1b4M1H
"Be It Enacted"
A Proposal for Federal AI Preemption
natolambert.bsky.social
As Olmo gets better, this has been on my list: to craft one and share the process, the difficulties of following it, and so on. I welcome pressure to deliver this in order to set a better example.
natolambert.bsky.social
The behavior of these models is actually remarkably steerable (sharing more research I'm involved with on this soon!) and the lack of model specs is pretty awful as a community standard.

Links to Dean's piece, and my older pieces on model specs are all below.
natolambert.bsky.social
Largely these seem to be blocked on politics -- both internal, where teams actually can't agree on what the model should do, and external, where labs fear pushback.
natolambert.bsky.social
The model spec sets the intentions for how the model should behave, regardless of whether it succeeds.

Again, I'm happy to discuss this with labs as a free consult, as I think it's great for the world.
natolambert.bsky.social
I haven't posted about model specs in a while, but Dean gave me a shoutout on my earlier writing on them, so it's time to say definitively again that every frontier lab should have a model spec. It builds long-term trust with users, developers, and regulators.