tobycrisford.bsky.social
@tobycrisford.bsky.social
Reposted
It’s basically as useless a label as “Judeo-Christian”. Yeah, there are some shared ideas, but putting the person running your neighborhood food bank in the same group as the Westboro Baptist Church, Mother Teresa and Netanyahu won’t really help you much when making sense of the world.
September 30, 2025 at 5:54 AM
Reposted
you heard of running Doom on random gadgets but have you ever heard of…
“Hosting a WebSite on a Disposable Vape”
bogdanthegeek.github.io/blog/project...
this explains how!
((the internet is healing 😌))
#IndieDev
Hosting a WebSite on a Disposable Vape
Someone's trash is another person's web server.
bogdanthegeek.github.io
September 28, 2025 at 8:20 AM
Reposted
The Extreme Inefficiency of RL for Frontier Models
🧵
The switch from training frontier models by next-token prediction to reinforcement learning (RL) requires 1,000s to 1,000,000s of times as much compute per bit of information the model gets to learn from…
1/11
www.tobyord.com/writing/inef...
The Extreme Inefficiency of RL for Frontier Models — Toby Ord
The new scaling paradigm for AI reduces the amount of information a model could learn per hour of training by a factor of 1,000 to 1,000,000. I explore what this means and its implications for scaling...
www.tobyord.com
September 19, 2025 at 5:18 PM
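A rough way to see the scale of the claim in this thread (illustrative numbers only, assumed for the sketch rather than taken from the linked essay): next-token prediction yields a supervision signal on every token, while an RL rollout of thousands of tokens is typically graded with a single scalar reward worth on the order of one bit.

```python
# Back-of-envelope comparison of learning signal per token generated.
# All numbers here are assumptions for illustration, not figures from the essay.

bits_per_token = 2.0          # assumed effective bits learned per pretraining token
tokens_per_rollout = 10_000   # assumed RL rollout length in tokens
bits_per_rollout = 1.0        # a binary pass/fail reward carries at most 1 bit

# Bits of supervision from the same token budget under each regime:
pretrain_bits = bits_per_token * tokens_per_rollout
ratio = pretrain_bits / bits_per_rollout

print(f"Pretraining extracts ~{ratio:,.0f}x more bits per token generated")
```

Under these assumed numbers the gap is about 20,000x; with sparser rewards or longer rollouts it reaches the millions, the ballpark the thread describes.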
Reposted
We really need to dial down the anti-free speech bullshit in this country. Police should almost never detain people over their choice of T-shirt! (And I would say the same if the political valence was reversed.) www.reddit.com/r/ThatsInsan...
From the ThatsInsane community on Reddit: Scotland police detain a man wearing an anti-AI shirt because it sounds similar to "Palestine Action"
Explore this post and more from the ThatsInsane community
www.reddit.com
August 19, 2025 at 12:04 PM
Reposted
It's a very rough heuristic, but: if you're a government arresting Quakers, you might want to sit down and have a bit of a think.
August 10, 2025 at 9:06 AM
Reposted
I found another fun Veo 3 prompt: "Realistic footage of the moon landing, but it took place in [year]"

Here is 1883 AD, 1255 AD, 44 AD, 2300 BC, 30,000 BC, and 65 million years ago

Yes, there was apparently wind on the moon back then; you are just going to have to suspend disbelief a bit.
July 25, 2025 at 6:01 AM
Reposted
LLMs have been causing many problems in sectors such as education, but it's conventional wisdom among AI proponents that they speed up coding tasks.

That's why it may be surprising that this RCT by METR, evaluating selected open-source devs' speed, found that AI assistance *slowed them down*.
July 10, 2025 at 6:12 PM
Reposted
GitHub MCP suffers from the lethal trifecta for prompt injection: access to private data, exposure to malicious instructions + the ability to exfiltrate information

Be really careful with this stuff: attackers can trick your "agent" into stealing your private data simonwillison.net/2025/May/26/...
May 27, 2025 at 12:26 AM
Reposted
OpenAI have released their o3 reasoning model to much fanfare, showing how it can "creatively and effectively solve more complex problems".
But in one of their examples it just quietly cheats its way through the puzzle — googling the answer then presenting it as if it solved it…
1/
🧵
April 23, 2025 at 2:24 PM
Reposted
Long ago I internalized the standard "don't anthropomorphize" advice. But I've gradually come to believe it's a mistake to internalize too strongly. So much "human" behaviour is obviously rooted in our animal nature!
April 2, 2025 at 3:57 AM
Reposted
For those who aren't familiar with the tasks, my 10-year-old can solve them in 5 mins each.
8/8
April 1, 2025 at 2:27 PM
Reposted
Finally, I want to note how preposterous the o3-high attempt was. It took 1,024 attempts at each task, writing about 137 pages of text for each attempt, so more than an Encyclopedia Britannica per task, at a cost of more than $30,000.
That's *something* but doesn't look like intelligence.
7/n
April 1, 2025 at 2:24 PM
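The page arithmetic in this post is easy to check (treating "more than an Encyclopedia Britannica" as roughly 32,000 printed pages, an assumed figure for the full print set):

```python
# Check the o3-high page arithmetic from the post above.
attempts_per_task = 1024
pages_per_attempt = 137
britannica_pages = 32_000  # assumed rough page count for the full encyclopedia

total_pages = attempts_per_task * pages_per_attempt
print(total_pages)                      # pages written per task
print(total_pages > britannica_pages)   # comfortably more than one Britannica
```

That comes to 140,288 pages per task, which indeed exceeds the assumed encyclopedia several times over.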
Reposted
A simple yet compelling argument that we should (typically) maximize expected value, which also helps to make sense of the principled exceptions to this rule...
www.goodthoughts.blog/p/shuffling-...
Shuffling around Expected Value
A simple proof that we should (typically) maximize expected value
www.goodthoughts.blog
March 3, 2025 at 8:15 PM
Reposted
Starmer just confirmed UK aid cut from 0.5% to 0.3% GNI to fund increased defence spending

This is so bleak - thousands in low income countries will die.

I get (and support) that we need to increase defence spending/capabilities in Europe, but surely there are better ways to fund it 🧵
February 25, 2025 at 1:34 PM
Reposted
This is even worse than it looks. I checked on X, and the handcuffs image is actually a video, posted by the official White House account, that shows them chaining someone up and making them march shackled onto a plane

x.com/WhiteHouse/s...

IMO this is a huge red flag for potential human rights abuses
February 19, 2025 at 2:07 AM
Reposted
New paper:
Inference Scaling Reshapes AI Governance
The shift from scaling up the pre-training compute of AI systems to scaling up their inference compute may have profound effects on AI governance.
🧵
1/
www.tobyord.com/writing/infe...
Inference Scaling Reshapes AI Governance — Toby Ord
The shift from scaling up the pre-training compute of AI systems to scaling up their inference compute may have profound effects on AI governance. The nature of these effects depends crucially on whet...
www.tobyord.com
February 13, 2025 at 11:30 AM
Reposted
I've just tried out o3-mini-high on the following question.

I'm wondering if you can help me with the following maths problem. I'd like to find a property of pairs of distinct primes such that if X is any infinite set of primes, then one can find a pair of distinct primes in X ... 🧵
February 4, 2025 at 5:31 PM
Reposted
Huh, LLMs have a form of intuitive self-awareness of their own personalities.
January 31, 2025 at 10:29 AM
Reposted
The Scaling Paradox:
AI capabilities have improved remarkably quickly, fuelled by the explosive scale-up of resources being used to train the leading models. But the scaling laws that inspired this rush actually show very poor returns to scale. What’s going on?
1/
www.tobyord.com/writing/the-...
The Scaling Paradox — Toby Ord
AI capabilities have improved remarkably quickly, fuelled by the explosive scale-up of resources being used to train the leading models. But if you examine the scaling laws that inspired this rush, th...
www.tobyord.com
January 13, 2025 at 5:16 PM
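One way to see the "poor returns" point (using an illustrative power law with an assumed exponent, not the specific fit from the essay): if loss falls as L(C) = A·C^(-α) with a small α, even halving the loss requires an enormous multiplier on compute.

```python
# Illustrative power-law scaling: L(C) = A * C**(-alpha).
# alpha = 0.05 is an assumed exponent of the rough size seen in scaling-law fits.
alpha = 0.05

# To cut loss in half, compute must grow by a factor of 2**(1/alpha):
compute_multiplier = 2 ** (1 / alpha)
print(f"Halving loss needs ~{compute_multiplier:,.0f}x more compute")
```

With α = 0.05 that multiplier is 2^20, about a million-fold increase in compute for one halving of loss, which is the flavour of "poor returns to scale" the thread discusses.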
Reposted
After more than a decade of procrastinating, this week I finally filled out the paperwork (ok, webform) to pledge to give at least 10% of my income to effective charities for the rest of my career. 🧵
December 19, 2024 at 5:00 PM