Sancus
@sancus.bsky.social
Principal Engineer @ Thunderbird, former Firefox contributor, likes vidya games, tea, and cocktails.
OpenAI, specifically, clearly has an unsustainable plan for future training. I just think their plans are commercially unnecessary. A model with an order of magnitude higher training costs won't come close to breaking even. But that doesn't mean much for the viability of all models.
December 29, 2025 at 1:47 AM
Now if your business model is only barely profitable at current API costs or even requires cheaper, good luck with that. I think you're fucked.
December 29, 2025 at 1:40 AM
Basically I'm saying that even once you account for companies going out of business, the ones left standing will still be operating pretty good models. And perhaps training will become research only. Plenty of technologies exist only because of public research spending that was never profitable.
December 29, 2025 at 1:39 AM
Of course. But my point is the models that have already been trained are pretty good and there's no long term commercial need to burn $2T/year training. Those costs *will* drive companies bankrupt. But they'll be bought and some companies will operate and tune models much closer to inference cost.
December 29, 2025 at 1:37 AM
So even if the cost of an H100 worth of compute never goes down, inference is worth it. This analysis ignores the cost of training, which is the big question mark and enormously hard to calculate. But training costs are already sunk for existing models and we don't need 1000 new models per week.
December 29, 2025 at 1:32 AM
Now yes, Sonnet 4.5 is more capable. But the output token cost is $15/M. There is just no way the cost of inference is many times that amount. And I think that's a very fair price for Sonnet 4.5 right now as a coding assistant.
December 29, 2025 at 1:29 AM
It's definitely hard to say, but let's say we are renting H100s at $3/hr, which is about 50% higher than the going price. Using DeepSeek R1, for example, you're going to get maybe 150 output tokens/s per H100, or a bit more. That is ~$6/1M *output* tokens, which is actually cheaper than Sonnet 4.5.
December 29, 2025 at 1:27 AM
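The back-of-envelope math in the post above can be sketched as follows. The $3/hr H100 rate and ~150 output tokens/s throughput are the post's own assumptions, not measured figures, and the $15/M Sonnet 4.5 output price is as quoted in the thread:

```python
# Back-of-envelope cost per million output tokens for self-hosted inference.
# Assumed inputs (from the post): H100 rental at $3/hr, ~150 sustained
# output tokens/s per GPU (e.g. serving DeepSeek R1).

def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """Dollar cost per 1M output tokens at a given GPU rate and throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000

self_hosted = cost_per_million_tokens(3.0, 150)  # ~ $5.56 per 1M output tokens
sonnet_output = 15.0  # Sonnet 4.5 output price from the thread, $/1M tokens

print(f"self-hosted: ${self_hosted:.2f}/M vs Sonnet 4.5: ${sonnet_output:.2f}/M")
```

At these assumed numbers the self-hosted figure lands around $5.56/M, i.e. the "~$6/1M" in the post, which is the basis for the claim that $15/M output pricing comfortably clears raw inference cost.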
I don't think it's necessarily true they'll need to get more expensive. We run open LLMs in the cloud and have done the math on the costs. Sure, training costs can expand infinitely, but models that are not on the bleeding edge will only get closer to the frontier as LLMs plateau. Which I personally believe has begun.
December 29, 2025 at 12:39 AM
"the prompt" means what exactly? The whole conversation with all context? I don't think much code is generated by one-shot prompts anymore; for me the context and conversation is always many times longer than the PR itself. Nobody is gonna read all that.
December 14, 2025 at 5:40 PM
I can see this being popular with some devs, but does everybody even want to code with voice? If you actually type fast there's not a significant speed gain from it, and I would personally not enjoy talking constantly all day.
October 29, 2025 at 2:09 PM
Problem is the predatory techniques are always more lucrative and we've chosen not to regulate this stuff. If it was legal to open a casino on every street corner, there would be a lot of those. Mobile games are the unavoidable consequence of anything goes monetization.
October 9, 2025 at 9:25 AM
Canada's not bad :)
September 2, 2025 at 4:04 AM