Matt Wallace
m9e.bsky.social
CTO building #AI
Love this detail. Reuters on the ruling: www.reuters.com/legal/litiga...
June 25, 2025 at 12:41 AM
as an AI founder who is ridiculously all in, when I’m like flabbergasted by the temerity of your product I just can’t even…
June 5, 2025 at 6:18 PM
Reposted by Matt Wallace
How much faster would the science of large-scale AI advance if we could open-source the *process* of building a frontier model?

Not just the final models/code/data, but also negative results, toy experiments, and even spontaneous discussions.

That's what we're trying @ marin.community
May 19, 2025 at 7:05 PM
Suno is so fucking good. Wow.
May 12, 2025 at 4:47 PM
The year is 2030. NVIDIA announces a new DGX with attached fusion reactor.
May 7, 2025 at 3:48 PM
Claude did not understand the mission and tried to write his system prompt to Notion. haha!
May 6, 2025 at 4:50 PM
Home setup: RTX 3090 + RTX 4090. Testing AWQ Qwen3: both 32B alone and 32B w/ a 4B draft for speculative decoding

TL;DR: ⚠️ on spec in vanilla vllm. w/ a 75% token acceptance rate, batch-4 generation went from 150 t/s (no speculative) -> 50 t/s (w/ speculative)

I'd get 400 t/s+ max throughput w/out speculative at larger batch sizes
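Back-of-the-envelope on why that happens (my own sketch, not vLLM internals; `draft_cost` is an assumed draft-to-target per-token cost ratio, and the function names are mine):

```python
def expected_tokens_per_round(p: float, k: int) -> float:
    """Expected tokens emitted per target-model verification pass,
    assuming each of the k draft tokens is accepted independently
    with probability p, plus the one token the target always adds."""
    if p >= 1.0:
        return float(k + 1)
    return (1.0 - p ** (k + 1)) / (1.0 - p)

def est_speedup(p: float, k: int, draft_cost: float) -> float:
    """Rough single-stream speedup: tokens per round divided by round
    cost, where a round is k draft steps (each costing draft_cost of a
    target step) plus one target verification pass."""
    return expected_tokens_per_round(p, k) / (k * draft_cost + 1.0)

# At ~75% acceptance with a 16-token draft, each verify pass yields
# only ~4 tokens on average, so single-stream gains are modest.
# In batched, compute-bound serving the verify pass also does k+1
# tokens of work per sequence, which is how aggregate throughput can
# actually drop (matching the 150 -> 50 t/s observation above).
```

This is the ceiling under an independence assumption; real acceptance is correlated and hardware-dependent.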
May 6, 2025 at 1:36 PM
Wild and wonderful watching a keynote demo 100ft wide and being able to literally picture the code in your head. 😁
April 1, 2025 at 5:04 PM
Every schema for content should have a human vs ai flag (or perhaps an enum with human, ai, and then some hybrid roles).
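Rough sketch of what I mean, as a Python enum plus a record carrying it (field and value names are illustrative, not from any spec):

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(str, Enum):
    HUMAN = "human"
    AI = "ai"
    # hypothetical hybrid roles:
    AI_ASSISTED = "ai_assisted"        # human-authored, AI-assisted
    HUMAN_EDITED = "human_edited_ai"   # AI-generated, human-edited

@dataclass
class ContentRecord:
    body: str
    provenance: Provenance

# the str mixin keeps the flag serialization-friendly
rec = ContentRecord(body="hello", provenance=Provenance.AI)
```

The str mixin means the flag round-trips cleanly through JSON or a database column.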
March 26, 2025 at 3:53 PM
Jensen: "I'd never buy a Hopper!"

Azure: "We don't have any Ampere GPUs to turn up even."

Me: 😠
March 19, 2025 at 6:45 PM
640KB ought to be enough for anybody.
March 15, 2025 at 6:55 AM
It's interesting because the dials (draft model, draft model quant, --draft length) all play into the sweet spot, and it's clearly not super consistent. Like, I was playing with 32B-Q8 w/ 3B vs 7B 4KL drafts; with those the default --draft 16 seems like the sweet spot (>12, >32, informal test).
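e.g. a llama-server launch for a setup like this (filenames are placeholders for your own GGUF paths, and flag spellings have shifted across llama.cpp versions, e.g. --draft vs --draft-max):

```shell
# -m is the target model, -md the draft model; --draft-max caps the
# number of draft tokens proposed per round (the --draft dial above).
# -ngl / -ngld offload target / draft layers to the GPU.
./llama-server \
  -m  Qwen2.5-Coder-32B-Instruct-Q8_0.gguf \
  -md Qwen2.5-Coder-7B-Instruct-Q4_K_L.gguf \
  --draft-max 16 \
  -ngl 99 -ngld 99
```

Config fragment only; which draft size wins depends entirely on your hardware.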
March 11, 2025 at 1:01 PM
32B Coder-Q8 w/ and w/out 7B-Q4_K_L draft - PSA: speculative decoding is in llamacpp and works. (depending on your hardware, experiment w/ diff model sizes; ymmv wildly)
March 11, 2025 at 12:50 PM
Reward function engineer
February 28, 2025 at 2:56 PM
ok. told claude 3.7 to extract some settings mgmt components from an app layer into a generic fastapi router+react component. I expected that to work. Him writing this beautiful readme with emojis I did *NOT* expect.
February 25, 2025 at 12:08 AM
continue

continue

continue

continue

continue, and btw, I think you were interrupted and may have already written some stuff you can't see

continue

continue

ifykyk
February 25, 2025 at 12:02 AM
Agent-Oriented Programming
February 15, 2025 at 4:07 AM
Asked ChatGPT to compare and contrast Coconut (reasoning in latent space with reasoning tokens & passing hidden states) to the new recurrent-depth paper.

chatgpt.com/share/67aa59...

Coolest thing was GPT suggesting a hybrid approach, which I had been thinking about before I even got to the bottom
February 10, 2025 at 7:58 PM
I live in a world of language all day and startup land is not conducive to a lot of distraction, but damn if every time I pop into Suno I'm not blown away.
February 7, 2025 at 5:04 AM
Based purely on the speed I feel like o3-mini is getting a solid reception. The tps is like 1/5th what it was last week
February 4, 2025 at 3:53 PM
deep research seems immediately useful and may have just solved an annoying sglang issue with a particular model for me.
February 3, 2025 at 2:40 PM
🤣❤️
January 31, 2025 at 12:18 AM
heheh.
January 24, 2025 at 2:01 PM
Feel like the "peak of inflated expectations" call may age poorly, although the engineering to really run AI apps is bootstrapping fast into reality, so maybe it isn't that expectations go down but that eng catches up some?
January 20, 2025 at 3:33 PM