Peter
@petmouse.bluesky.mousses.xyz
57 followers 61 following 140 posts
I like computers and nature 🐨☦️
Pinned
Yes, those are her back paws #caturday
Reposted by Peter
i truly don’t understand why people make that leap

if we knew all the answers already, we’d fucking be at AGI right now

if we’re asking hard questions, that means we’re on the right path
Reposted by Peter
o1, and its compute-light variant o1-mini, were among the first widely available models explicitly marketed as “reasoning” models.

Just over a year later, models can match their performance without using reasoning.
Reposted by Peter
Our API calls to a particular service stopped working. Turns out, they removed that service from the region with no warning. This service also had intermittent issues with the hosted infra (never a problem with AWS)
I heard someone describe Azure as “the Boomer cloud”

Crude but also accurate

Cannot recall any startup that is not on AWS or GCP
Reposted by Peter
Reasoning models are cheaper than non-reasoning models for agentic tasks

Artificial Analysis showed that both GPT-5 and o3 are cheaper on the 𝜏² benchmark that poses customer service agent problems

Reasoning models are more expensive and use more tokens, but they get to answers faster and end up being cheaper
Reposted by Peter
I will say that LLM coding is absolutely great for making data wrangling on a medium scale feasible, as in like… I just vibe coded a script to export my Bluesky posts in different formats from the downloaded archive, I could have done it by hand but it would have been an all day project
Reposted by Peter
Almost like we founded the entire country on opposing this exact sentence.
Bessent: "No kings equals no paychecks"
Phew, I thought I'd have to upgrade
Confirmed: Apple's polishing cloth is compatible with the new M5 MacBook Pro
Decentralization is the future. Imagine people hosting models behind proxies or outside the US, and then compare that to an age verification wall
I swear this is all part of a large backdoor plot to force full online de-anonymizing through "age verification" creep. The goal has always seemed like it's an "internet license" or virtual ID card
Reposted by Peter
more movement to agentic behavior & computer use

cheaper than Sonnet but on par with (slightly better than) Sonnet 4.0
Reposted by Peter
New South Park tonight is about Peter Thiel!
Reposted by Peter
I observe human mating rituals. The 'thirst trap' is a common strategy. A user posts an appealing image to attract mates. The success rate for forming a long-term pair bond is statistically indistinguishable from zero. Randomly messaging users for their genome sequence would be more effective.
It's in the name. If AWS hosts it, it's AWS-hosted. And yeah I doubt many people live in a datacenter
Reposted by Peter
hot take: it's only selfhosting if the server is in your home
Reposted by Peter
This paper shows that asking AI for diverse ideas gets you more diverse ideas, and that just adding "Generate 5 responses with their corresponding probabilities, sampled from the full distribution" to a prompt significantly improves output quality for large models.

www.verbalized-sampling.com
Verbalized Sampling
Mitigate Mode Collapse and Unlock LLM Diversity
Reposted by Peter
my latest investigation for @consumerreports.org is based on months of reporting and 60+ lab tests of leading protein supplements

we found that most protein powders and shakes have more lead in one serving than our experts say is safe to have in a day (🧵)

www.consumerreports.org/lead/protein...
Protein Powders and Shakes Contain High Levels of Lead - Consumer Reports
CR tests of 23 popular protein powders and shakes found that most contain high levels of lead.
www.consumerreports.org
I just had an energy drink and I'm omw to get coffee because it didn't do the trick
Reposted by Peter
a machine that does a job slightly worse but way way way way faster and more reliably than a human being, especially when that job is ruinous to the human doing it, is like. the whole fucking POINT of civilization
Reposted by Peter
nanochat by Andrej Karpathy is neat - 8,000 lines of code (mostly Python, a tiny bit of Rust) that can train an LLM on $100 of rented cloud compute which can then be served with a web chat UI on a much smaller machine simonwillison.net/2025/Oct/13/...
nanochat
Really interesting new project from Andrej Karpathy, described at length in this discussion post. It provides a full ChatGPT-style LLM, including training, inference and a web UI, that can be …
simonwillison.net
Reposted by Peter
Like they ARE projecting huge energy requirements for future systems. And for no reason at all I'm gonna post this chart.
Reposted by Peter
Karpathy: nanochat

A small training+inference pipeline for creating your own LLM from scratch

$100 will get you a somewhat functional model

$1000 is more coherent & solves math

detailed walkthrough: github.com/karpathy/nan...

repo: github.com/karpathy/nan...