Matthew Carrigan
@carrigmat.bsky.social
Engineer @huggingface. I'm the reason your LLM frontend has a jinja2cpp dependency. Sometimes yells about housing and trans rights instead of working
He/him
PRs and issues on @hf.co have gotten a lot sloppier and weirder since the advent of code agents, but the weirdest ones still have an inexplicable human touch
November 3, 2025 at 1:40 PM
In particular, this bit suggests that if you inject a concept too weakly the model doesn't notice it, and if you inject it too strongly the model just talks about the concept rather than 'introspecting'. But maybe that just means a medium-strength injection biases the model towards the concept without totally overriding the original question?
October 29, 2025 at 7:32 PM
Yup, you can very clearly see a halving of stock value right after GPT-4 is released
June 15, 2025 at 9:06 PM
the betting markets are asking the real questions today
April 27, 2025 at 4:09 PM
If all goes well, you should witness a short load period followed by the stream of consciousness as a state-of-the-art local LLM begins to ponder your question.
January 28, 2025 at 2:40 PM
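For anyone following along at home, here's a minimal sketch of that moment, assuming you've installed the llama-cpp-python bindings and already downloaded a GGUF of your model. The filename and settings are placeholders, not a recommendation:

```python
# Minimal local-inference sketch using llama-cpp-python
# (assumed installed via `pip install llama-cpp-python`).
from llama_cpp import Llama

llm = Llama(
    model_path="./your-model-Q8_0.gguf",  # hypothetical path; use your GGUF
    n_ctx=8192,     # context window; raise it if you have the RAM to spare
    n_threads=32,   # roughly one thread per physical core works well
)

# stream=True yields tokens as they're generated, so you watch the
# "stream of consciousness" live instead of waiting for the full answer.
for chunk in llm("Why is the sky blue?", max_tokens=512, stream=True):
    print(chunk["choices"][0]["text"], end="", flush=True)
```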
You'll also need heatsinks - 4U/tower heatsinks for Socket SP5 EPYC processors are hard to find, but some lunatic is making them in China and selling them on eBay/AliExpress. Buy two.
January 17, 2025 at 5:04 PM
Here's a sample setup you can use:

Motherboard: Gigabyte MZ73-LM1
CPU: 2x AMD EPYC 9015
RAM: 24x 32GB DDR5 RDIMM
PSU: Corsair HX1000i (You don't need all that power; you just need lots of CPU power cables for 2 sockets!)
Case: PHANTEKS Enthoo Pro 2 Server
January 17, 2025 at 5:04 PM
Server motherboards can have way more than the 2 channels you get on consumer boards. The current leader is AMD EPYC, which has 12 channels of DDR5 *per socket*!

1 EPYC CPU = 12 RAM channels = 600GB/sec
2 EPYC CPUs = 24 RAM channels = 1200GB/sec!

This should be enough to squeeze ~9 tok/sec out of our model!
January 17, 2025 at 5:04 PM
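As a sanity check, here's the back-of-the-envelope arithmetic behind those figures, assuming DDR5-6000 RDIMMs (the transfer rate is an assumption; slower DIMMs scale everything down proportionally):

```python
# Back-of-the-envelope bandwidth math for a dual-socket EPYC build.
# DDR5-6000 is an assumed speed; each DDR5 channel is 64 bits (8 bytes) wide.
channels_per_socket = 12
bytes_per_transfer = 8           # 64-bit channel
transfers_per_sec = 6.0e9        # DDR5-6000 = 6000 MT/s

per_socket = channels_per_socket * bytes_per_transfer * transfers_per_sec
print(f"1 socket:  {per_socket / 1e9:,.0f} GB/s")      # ~576 GB/s, i.e. ~600
print(f"2 sockets: {2 * per_socket / 1e9:,.0f} GB/s")  # ~1,152 GB/s, i.e. ~1200

# Decode speed on a memory-bound CPU rig is roughly
# effective_bandwidth / bytes_read_per_token. NUMA effects and imperfect
# streaming keep the real number well below the theoretical ceiling,
# which is how you land at an estimate like ~9 tok/sec.
```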
What about CPU RAM? Consumer RAM is quite cheap, but to fit 400+GB in a single motherboard, we're probably looking at a server board, which means we need RDIMM server memory. Right now, you can get 64GB of this for about $300-400, so 500GB should be $3000 or less.
January 17, 2025 at 5:04 PM
The highest-memory datacenter GPUs (A100/H100) are 80GB. We would need ~6 of these. Although we could fit that many in a single case, the cost at current prices will probably be >$60,000.
January 17, 2025 at 5:04 PM
Let's work it out: The upcoming 5090 has 32GB of VRAM. We would need about 16 of them for this. This would cost >$30,000, and you wouldn't even be able to fit them in a single case anyway.
January 17, 2025 at 5:04 PM
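Putting the three capacity options from these posts side by side (the unit prices are rough assumptions back-solved from the ballpark totals quoted above, not quotes):

```python
# Rough cost tally for ~500GB of model memory, using the approximate
# figures from the posts above. Unit prices are assumptions.
builds = [
    # (option, units, GB per unit, approx $ per unit)
    ("RTX 5090",        16, 32, 2_000),   # ">$30,000"
    ("A100/H100",        6, 80, 10_000),  # ">$60,000"
    ("64GB DDR5 RDIMM",  8, 64, 350),     # "$3000 or less"
]

for name, units, gb, price in builds:
    print(f"{name:16s}: {units:2d} x {gb}GB = {units * gb}GB, ~${units * price:,}")
```

Same ~500GB either way; the RDIMM route costs roughly a tenth of the cheapest GPU option, which is the whole argument for a CPU build.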