WARNING: I talk about kids sometimes
Modern LLMs (GPT-5.1, Claude 4.5, Gemini 3) produce excellent code and can be a significant productivity boost to software engineers who take the time to learn how to effectively apply them - especially if used with coding agent tools
but the model is an advertisement for their infrastructure (that you should use!), and so peek at that too! You should be able to replicate this for your domain
Fascinating paper that explores how to do RL focused on process over outcome
It’s sort of similar to a GAN, but with loops for each of the generator & verifier as well as an outer loop
github.com/deepseek-ai/...
that was Obama’s schtick. Social, tech, economic, any kind of progress will do
now it feels like the left and right are fighting over which kind of *regress* is better
seems like someone will probably win
i have a hunch that that's why he's taking a lot of crap from some parts of the tech bro crowd that's started leaning into eugenics. Might have nothing to do with his AI views
the even cooler part is this is all independent research
that’s why he’s on a podcast, to shape minds. he can’t just release a shitty model and be called a saint. he needs to control the narrative and provide context for what he’s done
if this doesn’t land, he’s likely screwed (ngl i don’t think it landed)
but it’s not done, it’s still got to learn
in our current approaches, it’s hard to conceive of that, because we’re bombarded by hype and marketing. i can’t imagine releasing an incapable model..
We're all pursuing a single behemoth that is *already* smarter than all humans when it's launched
He's pursuing an entity that is *capable of* being smarter
i.e. he's all in on continual learning
Opus 4.5 basically does not do doom loops, period. It's legit, I'm impressed.
www.dwarkesh.com/p/ilya-sutsk...
no, the advantage of closed weights is you can explore prices completely detached from cost. You’re free to set prices based purely on what people will pay, the value they get from it
an API you can easily use via curl that takes a URL and converts it to LLM-friendly text. Free to use, afaict
github.com/jina-ai/reader
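for anyone who'd rather call it from a script than curl: a minimal Python sketch, assuming the usage from the project README where you prepend `https://r.jina.ai/` to the target URL (the curl form being `curl https://r.jina.ai/https://example.com`). The function names, the User-Agent string and the timeout are my own placeholders, not part of the API.

```python
import urllib.request

# Prefix taken from the jina-ai/reader README; treat it as an assumption
# and check the repo if the service has moved.
READER_PREFIX = "https://r.jina.ai/"


def reader_url(target_url: str) -> str:
    """Build the Reader URL by prepending the prefix to the target URL."""
    return READER_PREFIX + target_url


def fetch_llm_text(target_url: str, timeout: float = 30.0) -> str:
    """Fetch the LLM-friendly text for a page. This makes a network call."""
    request = urllib.request.Request(
        reader_url(target_url),
        # Hypothetical User-Agent; some services reject the urllib default.
        headers={"User-Agent": "reader-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

calling `fetch_llm_text("https://example.com")` should hand back plain text you can paste straight into a prompt. Rate limits and API keys are not covered here.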
if all of your dependencies are sitting on disk, the agent doesn’t need to rely on documentation
even w/o monorepos, it’s a good idea to clone tricky dependencies locally
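a rough sketch of what that could look like, assuming plain `git` on the PATH and a scratch directory the agent is allowed to read. The `vendor-src` directory name and both helper names are hypothetical; the shallow clone (`--depth 1`) is just my choice to keep it fast.

```python
import subprocess
from pathlib import Path


def clone_command(repo_url: str, dest_root: str = "vendor-src") -> list[str]:
    """Build a shallow `git clone` command that puts the repo under dest_root."""
    # Derive a folder name from the last path segment of the URL.
    name = repo_url.rstrip("/").removesuffix(".git").split("/")[-1]
    destination = (Path(dest_root) / name).as_posix()
    return ["git", "clone", "--depth", "1", repo_url, destination]


def clone_dependency(repo_url: str, dest_root: str = "vendor-src") -> None:
    """Clone a dependency's source locally, skipping it if already present."""
    command = clone_command(repo_url, dest_root)
    if Path(command[-1]).exists():
        return  # already on disk, nothing to do
    Path(dest_root).mkdir(parents=True, exist_ok=True)
    subprocess.run(command, check=True)  # needs git and network access
```

then point the agent at `vendor-src/` so it can grep the real source instead of guessing from docs. You probably want that directory in `.gitignore`.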
they got within a few points of o3’s performance using only 4k training data points (yes, synthetic)
www.microsoft.com/en-us/resear...
available both as an MCP server & web UI
exa.ai/blog/exa-api...
GPT-5-Pro could probably do it too, but you’d pay like $30 for one shot
Gemini 3 & Opus 4.5 can still run fast & cheap bc they’re extremely sparse MoE, but solve very tricky problems
we truly need scale along both axes