Matthew Carrigan
@carrigmat.bsky.social
Engineer @huggingface. I'm the reason your LLM frontend has a jinja2cpp dependency. Sometimes yells about housing and trans rights instead of working
He/him
PRs and issues on @hf.co have gotten a lot sloppier and weirder since the advent of code agents, but the weirdest ones still have an inexplicable human touch
November 3, 2025 at 1:40 PM
In particular, this bit suggests that if you inject a concept too weakly the model doesn't notice it, and if you inject it too strongly the model just talks about the concept rather than 'introspecting'. But maybe that just means a medium-strength injection biases the model towards the concept without totally overriding the original question?
October 29, 2025 at 7:32 PM
Yup, you can very clearly see a halving of stock value right after GPT-4 is released
June 15, 2025 at 9:06 PM
the betting markets are asking the real questions today
April 27, 2025 at 4:09 PM
If all goes well, you should witness a short load period followed by the stream of consciousness as a state-of-the-art local LLM begins to ponder your question.
January 28, 2025 at 2:40 PM
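For anyone following along at home, here's a minimal sketch of that moment, assuming you've installed the llama-cpp-python bindings and already downloaded a GGUF of your model. The filename and settings are placeholders, not a recommendation:

```python
# Minimal local-inference sketch using llama-cpp-python
# (assumed installed via `pip install llama-cpp-python`).
from llama_cpp import Llama

llm = Llama(
    model_path="./your-model-Q8_0.gguf",  # hypothetical path; use your GGUF
    n_ctx=8192,     # context window; raise it if you have the RAM to spare
    n_threads=32,   # roughly one thread per physical core works well
)

# stream=True yields tokens as they're generated, so you watch the
# "stream of consciousness" live instead of waiting for the full answer.
for chunk in llm("Why is the sky blue?", max_tokens=512, stream=True):
    print(chunk["choices"][0]["text"], end="", flush=True)
```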
You'll also need heatsinks - 4U/tower heatsinks for Socket SP5 EPYC processors are hard to find, but some lunatic is making them in China and selling them on eBay/AliExpress. Buy two.
January 17, 2025 at 5:04 PM
Here's a sample setup you can use:

Motherboard: Gigabyte MZ73-LM1
CPU: 2x AMD EPYC 9015
RAM: 24x 32GB DDR5 RDIMM
PSU: Corsair HX1000i (You don't need all that power; you just need lots of CPU power cables for 2 sockets!)
Case: PHANTEKS Enthoo Pro 2 Server
January 17, 2025 at 5:04 PM
Server motherboards can have way more than the 2 channels you get on consumer boards. The current leader is AMD EPYC, which has 12 channels of DDR5 *per socket*!

1 EPYC CPU = 12 RAM channels = 600GB/sec
2 EPYC CPUs = 24 RAM channels = 1200GB/sec!

This should be enough to squeeze ~9 tok/sec out of our model!
January 17, 2025 at 5:04 PM
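As a sanity check, here's the back-of-the-envelope arithmetic behind those figures, assuming DDR5-6000 RDIMMs (the transfer rate is an assumption; slower DIMMs scale everything down proportionally):

```python
# Back-of-the-envelope bandwidth math for a dual-socket EPYC build.
# DDR5-6000 is an assumed speed; each DDR5 channel is 64 bits (8 bytes) wide.
channels_per_socket = 12
bytes_per_transfer = 8           # 64-bit channel
transfers_per_sec = 6.0e9        # DDR5-6000 = 6000 MT/s

per_socket = channels_per_socket * bytes_per_transfer * transfers_per_sec
print(f"1 socket:  {per_socket / 1e9:,.0f} GB/s")      # ~576 GB/s, i.e. ~600
print(f"2 sockets: {2 * per_socket / 1e9:,.0f} GB/s")  # ~1,152 GB/s, i.e. ~1200

# Decode speed on a memory-bound CPU rig is roughly
# effective_bandwidth / bytes_read_per_token. NUMA effects and imperfect
# streaming keep the real number well below the theoretical ceiling,
# which is how you land at an estimate like ~9 tok/sec.
```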
What about CPU RAM? Consumer RAM is quite cheap, but to fit 400+GB in a single motherboard, we're probably looking at a server board, which means we need RDIMM server memory. Right now, you can get 64GB of this for about $300-400, so 500GB should be $3000 or less.
January 17, 2025 at 5:04 PM
The highest-memory datacenter GPUs (A100/H100) are 80GB. We would need ~6 of these. Although we could fit that many in a single case, the cost at current prices will probably be >$60,000.
January 17, 2025 at 5:04 PM
Let's work it out: The upcoming 5090 has 32GB of VRAM. We would need about 16 of them for this. This would cost >$30,000, and you wouldn't even be able to fit them in a single case anyway.
January 17, 2025 at 5:04 PM
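Putting the three capacity options from these posts side by side (the unit prices are rough assumptions back-solved from the ballpark totals quoted above, not quotes):

```python
# Rough cost tally for ~500GB of model memory, using the approximate
# figures from the posts above. Unit prices are assumptions.
builds = [
    # (option, units, GB per unit, approx $ per unit)
    ("RTX 5090",        16, 32, 2_000),   # ">$30,000"
    ("A100/H100",        6, 80, 10_000),  # ">$60,000"
    ("64GB DDR5 RDIMM",  8, 64, 350),     # "$3000 or less"
]

for name, units, gb, price in builds:
    print(f"{name:16s}: {units:2d} x {gb}GB = {units * gb}GB, ~${units * price:,}")
```

Same ~500GB either way; the RDIMM route costs roughly a tenth of the cheapest GPU option, which is the whole argument for a CPU build.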