Might only need internet access when you want to gossip.
But more params mean more cost! And in a lot of applications, I predict people will choose cheap/local/fast AI.
Params needed to achieve a certain performance level halve every ~4 months:
arxiv.org/abs/2412.04315
So we'll run modern frontier models on our laptops in 2029?? With way lower latency.
Running a 70B param model today would take two 3090s. Costs about $2K. So 700B -> $20K?
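The extrapolation above can be sketched in a few lines. A minimal back-of-envelope script, assuming the ~4-month halving period from the linked paper and a 700B-param frontier model as today's baseline (both numbers are from the posts, not verified):

```python
# Back-of-envelope: if params needed for a fixed capability level
# halve every ~4 months, how small is "frontier capability" in 5 years?
# Assumed inputs (from the thread, not verified): 4-month halving,
# 700B params today.

HALVING_MONTHS = 4      # assumed halving period for params-at-fixed-capability
START_PARAMS_B = 700    # assumed frontier model size today, in billions
YEARS = 5               # roughly 2024 -> 2029

months = YEARS * 12
halvings = months / HALVING_MONTHS           # 15 halvings over 5 years
params_needed_b = START_PARAMS_B / 2**halvings

print(f"{halvings:.0f} halvings -> ~{params_needed_b * 1e3:.1f}M params")
# → 15 halvings -> ~21.4M params
```

Taken literally, the trend implies absurdly small models; the more modest reading in the thread is just that today's frontier capability fits on a laptop well before the curve runs that far.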
Good-faith replies are more virtuous in some sense, but not sustainable for me.
Angry replies are a step backwards and I wish people would avoid them.
bsky.app/profile/hars...
The fact that everyone serves models in 4-bit quant now really muddles comparisons. Perhaps model size should be measured in memory rather than params.
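The memory-based measure is easy to compute: for weights alone, it's just params × bits ÷ 8. A small sketch, assuming quantization applies uniformly to all weights and ignoring KV cache and activations:

```python
# Model "size" as weight memory instead of param count.
# Assumes uniform quantization of all weights; ignores KV cache,
# activations, and any runtime overhead.

def weight_memory_gb(params_b: float, bits: int) -> float:
    """Memory for weights alone, in GB: params * (bits / 8) bytes."""
    return params_b * 1e9 * bits / 8 / 1e9

for bits in (16, 8, 4):
    print(f"70B @ {bits}-bit: {weight_memory_gb(70, bits):.0f} GB")
# → 70B @ 16-bit: 140 GB
# → 70B @ 8-bit: 70 GB
# → 70B @ 4-bit: 35 GB
```

At 4-bit, a 70B model's weights take ~35 GB, which is why two 3090s (48 GB combined) suffice, while the same model at 16-bit would need 140 GB and a very different price tag.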
bsky.app/profile/segy...
"Probably some of these won’t replicate, and in a few years we’ll be left with a thinner and more believable profile of GLP-1 effects."
I'd rather people publish Bold Theories than only utter what can be supported with a dozen footnotes.