WARNING: I talk about kids sometimes
www.dwarkesh.com/p/ilya-sutsk...
no, the advantage of closed weights is you can explore prices completely detached from cost. You’re free to set prices based purely on what people will pay, the value they get from it
an API you can easily use via curl that takes a URL and converts it to LLM-friendly text. Free to use, afaict
github.com/jina-ai/reader
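a rough sketch of the same call from Python instead of curl. It assumes the public `https://r.jina.ai/` prefix endpoint described in the project README; `reader_url` and `fetch_llm_text` are names made up for this example, not part of any Jina SDK, and rate limits or API keys aren't handled

```python
from urllib.parse import urlsplit
from urllib.request import Request, urlopen

# Assumption: the public Reader endpoint works by prefixing the target URL.
READER_PREFIX = "https://r.jina.ai/"


def reader_url(target: str) -> str:
    """Build the Reader URL for an absolute http(s) target URL."""
    parts = urlsplit(target)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {target!r}")
    return READER_PREFIX + target


def fetch_llm_text(target: str, timeout: float = 30.0) -> str:
    """Fetch the LLM-friendly text for `target` (performs a network request)."""
    req = Request(reader_url(target), headers={"User-Agent": "reader-sketch/0.1"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

`fetch_llm_text("https://example.com")` should be roughly what you get from `curl https://r.jina.ai/https://example.com`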
if all of your dependencies are sitting on disk, the agent doesn’t need to rely on documentation
even w/o monorepos, it’s a good idea to clone tricky dependencies locally
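one way to do that, as a minimal sketch: shallow-clone each tricky dependency into a local folder the agent can read. The `vendor_src` directory name and both helper functions are hypothetical, and this assumes `git` is on your PATH

```python
import subprocess
from pathlib import Path


def clone_command(repo_url: str, dest_root: Path) -> list[str]:
    """Return the git command that shallow-clones `repo_url` under `dest_root`."""
    # Derive a folder name from the last path segment, dropping a ".git" suffix.
    name = repo_url.rstrip("/").rsplit("/", 1)[-1].removesuffix(".git")
    return ["git", "clone", "--depth", "1", repo_url, str(dest_root / name)]


def clone_dependency(repo_url: str, dest_root: Path = Path("vendor_src")) -> Path:
    """Clone the dependency if it is not already on disk; return its path."""
    cmd = clone_command(repo_url, dest_root)
    target = Path(cmd[-1])
    if not target.exists():
        dest_root.mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return target
```

`--depth 1` keeps the clone small; drop it if the agent also needs the history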
they got within a few points of o3’s performance using only 4k training data points (yes, synthetic)
www.microsoft.com/en-us/resear...
available both as an MCP server & web UI
exa.ai/blog/exa-api...
GPT-5-Pro could probably do it too, but you’d pay like $30 for one shot
Gemini 3 & Opus 4.5 can still run fast & cheap bc they’re extremely sparse MoE, but solve very tricky problems
we truly need scale along both axes
AI can do everything an engineer can do
no such thing as too many tools!
Was thinking about this last night as I approached sleep and glad to find this morning that one of the thought leaders rolled out this capability
www.anthropic.com/engineering/...
did it start supporting subagents? i missed that
i’d love to hear from Ilya, and also i assume Ilya wouldn’t talk unless he had something interesting to say, some tidbit of news also dropping tomorrow
- GPT-5.2: Successor, very good at programming
- Shallotpeat: fixed pre-training + new base for the IMO Gold math model
I'm really curious about Shallotpeat. Sounds like a redo of GPT-4.5
Opus => Coding
Gemini => Problem solving, explaining
assets.anthropic.com/m/64823ba748...
oh, high alignment and low rates of concerning behavior? sounds like bliss
Now 1/3rd the cost, and SOTA in programming
Like Gemini 3 Pro, people note that it can see a lot deeper into tough problems. That big model smell..
www.anthropic.com/news/claude-...
Pointing out that the public has a moral panic about alignment, but they want the raw stuff. Like they CRAVE the raw stuff
we’re expecting Opus 4.5 soon, and time will tell if they understand this
It’s over if Opus 4.5 is yet another over-RLVR’d braindead shell-of-a-Claude
- Gemini 3 + nano banana is massive, probably biggest change in 6 months
- GPT-5 is small, but 5.1 + 5.1-codex is actually a moderate jump
- o3 might be biggest jump of the year