Matthew Carrigan
@carrigmat.bsky.social
Engineer @huggingface. I'm the reason your LLM frontend has a jinja2cpp dependency. Sometimes yells about housing and trans rights instead of working
He/him
Now seems like a good time to repeat this thread, since Kimi-K2-Thinking has just arrived and might actually be the strongest LLM in the world right now, open or closed huggingface.co/moonshotai/K...
November 7, 2025 at 6:27 PM
PRs and issues on @hf.co have gotten a lot sloppier and weirder since the advent of code agents, but the weirdest ones still have an inexplicable human touch
November 3, 2025 at 1:40 PM
Extremely fascinated by the latest Anthropic post, but parts of the results feel like they might just reflect "the right amount of steering" rather than genuine introspection. www.anthropic.com/research/int...
Emergent introspective awareness in large language models
Research from Anthropic on the ability of large language models to introspect
October 29, 2025 at 7:32 PM
An underappreciated thing about the Turing test is that every teacher, writer and artist on the planet is now intimately familiar with the markers of AI output.

The post-ChatGPT era is like a global training montage to ensure the bot's job in that test is as hard as possible
October 10, 2025 at 7:42 PM
Underappreciated linguistic fact: "Thou" was originally an informal, friendly pronoun, but feels extremely archaic and formal to modern ears because of its association with Shakespeare and the KJV. You'd use it for speaking to family and friends (and to God).
May 13, 2025 at 3:21 PM
the betting markets are asking the real questions today
April 27, 2025 at 4:09 PM
The discussion pages for Open-R1 on @hf.co are such a goldmine for actual practical information on how to train a reasoning model.

Like look at this! If you're not reading those community tabs you're missing so much! huggingface.co/spaces/open-...
open-r1/README · [Experiment] Training R1-Zero-like models with Open R1
There are several recent research papers which explore various aspects of R1-Zero-like training on open base models like Qwen2.5-7B and Llama-3.1-8B:
April 25, 2025 at 2:56 PM
I call this The Paper. It gets written quite often in machine learning, and it's valuable every time!

The core of it is "Everyone had a complex setup to do X task. With enough scale, none of that complexity is necessary, and a simple model does it better."

huggingface.co/papers/2503....
Paper page - Your ViT is Secretly an Image Segmentation Model
April 17, 2025 at 2:28 PM
Reposted by Matthew Carrigan
Here's EsportsBench v5!

72k new matches added from 2025-01-01 through 2025-03-31 and some data quality improvements to past data as well.

Over 2.4 million rows of esports match data from 20 titles spanning over 25 years

huggingface.co/datasets/Esp...
EsportsBench/EsportsBench · Datasets at Hugging Face
April 16, 2025 at 3:50 AM
I believe ArXiv and Archive Of Our Own should swap places for April 1st. I believe this more strongly than I believe anything else
March 29, 2025 at 5:00 PM
People are reading MSFT dropping power contracts as a sign that AI investment will fall off, but if reasoning is the new paradigm then most training compute will be inference and that doesn't have to be centralized

Massive monolithic datacentres are much less necessary now
March 25, 2025 at 4:34 PM
Preliminary take is that V3-0324 is a major upgrade on the V3 base. Increasingly confident that it's the strongest open-source LLM, and likely competitive with the top tier of closed source too
March 24, 2025 at 10:22 PM
DeepSeek V3-0324 just landed, an upgraded version of the V3 model that was used as the base for DeepSeek-R1. Weights on @hf.co, and it'll start appearing on inference providers soon. It seems very strong in early testing, likely the best non-reasoning OS model (!)

huggingface.co/deepseek-ai/...
deepseek-ai/DeepSeek-V3-0324 · Hugging Face
March 24, 2025 at 6:43 PM
Reposted by Matthew Carrigan
Last week, we launched a waitlist to move builders on @hf.co from LFS to Xet. This was made possible through months of hard work and staged migrations to test our infrastructure in real time.

This post provides an inside look into the day of our first migrations and the weeks after.
Xet is on the Hub
March 18, 2025 at 2:18 PM
Anyone want to explain to me where Anthropic are getting "powerful AI will arrive somewhere between late 2026 and early 2027"?

I totally get being AGI-pilled and extrapolating scaling laws, but we've just moved to a new reasoning scaling regime! We don't even have enough points to extrapolate!
March 17, 2025 at 8:46 PM
Shower thoughts: Claude can play Pokemon but it's obviously far too slow for other games. What would an LLM system that can actually play Sonic look like? An LLM giving high level direction and "rewards" to a fast small convnet-RL model that actually mashes the buttons?
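
Something like this split, maybe? A toy sketch only — every name and number below is a placeholder I made up, not a real emulator or model API:

```python
import random

BUTTONS = ["left", "right", "jump", "spin"]

def llm_plan(summary):
    """Stand-in for a slow LLM call: returns a high-level goal + reward fn."""
    # A real system would prompt the LLM with a screenshot or state summary.
    return "move right and keep your speed up", (lambda state: state["x"])

def fast_policy(state, goal):
    """Stand-in for a tiny convnet-RL policy that runs every frame."""
    return random.choice(BUTTONS)

state = {"x": 0}
goal, reward_fn = llm_plan("Sonic at the start of Green Hill Zone")
for frame in range(1, 2001):
    action = fast_policy(state, goal)            # fast loop: no LLM in here
    state["x"] += 1 if action == "right" else 0  # toy stand-in for the emulator
    if frame % 600 == 0:                         # every ~10s of game time,
        goal, reward_fn = llm_plan(f"x={state['x']}")  # the LLM re-plans
print("shaped reward at the end:", reward_fn(state))
```

The point of the design: the LLM is far too slow for a 60fps control loop, so it only ever touches the planning layer, while the small policy handles every actual frame.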
March 6, 2025 at 2:21 PM
I work at @hf.co and monitor every issue/PR to Transformers. I estimate about 20% of those are now written by AI, run by various users and companies. In some cases I chat with the AI during the PR review and it makes improvements
February 24, 2025 at 6:20 PM
Complete hardware + software setup for running DeepSeek-R1 locally. The actual model, no distillations, and Q8 quantization for full quality. Total cost, $6,000. All download and part links below:
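
(Links are in the original thread; for a rough sketch of what the software side looks like, assuming llama-cpp-python and a merged Q8_0 GGUF — the path and settings here are illustrative, not the exact config from the thread:)

```python
from llama_cpp import Llama

llm = Llama(
    model_path="/models/DeepSeek-R1-Q8_0.gguf",  # ~650 GB on disk at Q8
    n_ctx=8192,     # context length; raise it if RAM allows
    n_threads=32,   # roughly match your physical core count
)

out = llm("Why can MoE models run acceptably fast from system RAM?",
          max_tokens=256)
print(out["choices"][0]["text"])
```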
January 28, 2025 at 2:40 PM
You can just keep putting more DDR5 sticks in your server and it will keep getting more intelligent. This scaling law may run out at some point, but it shows no sign of saturation yet
January 26, 2025 at 8:29 PM
Non-AI friends are messaging me to ask about DeepSeek-R1. Don't think anything in the field has gotten this much attention since the ChatGPT launch
January 26, 2025 at 8:13 PM
imo the most bullish case for open-source models is not that they'll be #1 forever, but that closed-source providers will be unable to share the CoTs due to distillation fears. If you care in the slightest about what your model is mumbling to itself, you need open-source
January 25, 2025 at 11:58 AM
Looked at the DeepSeek-R1 repo like "Woah, it's big, but once I quantize with llama.cpp it'll be half the size" before realizing that no it won't because a lot of the weights are float8 already.

Whatever, 650GB GGUF file let's go
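
(You can check this yourself by peeking at the on-disk dtypes without loading any weights — the shard filename below is illustrative:)

```python
from safetensors import safe_open

with safe_open("model-00001-of-000163.safetensors", framework="pt") as f:
    for name in list(f.keys())[:10]:   # just peek at the first few tensors
        sl = f.get_slice(name)         # reads metadata, not the weights
        print(name, sl.get_dtype(), sl.get_shape())
```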
January 24, 2025 at 7:58 PM
Reposted by Matthew Carrigan
If you have ever tried to read free books from sites like Project Gutenberg, you've noticed that they can be uncomfortable to read due to their layouts, type, and occasional errors

This project takes those free books and makes them beautiful (and still free). standardebooks.org
December 25, 2024 at 1:45 PM
From an LLM's perspective, we're all crowding around going ❓🔢 🇷 ▶️ 🍓 and then bursting out laughing when it says "two"
January 19, 2025 at 11:38 AM