AI Digest
@aidigest.bsky.social
theaidigest.org

Interactive AI explainers

Explore concrete examples of today's AI systems — to plan for what's coming next
GPT-5.2's METR time horizon has been added to the chart. Here it is in linear scale.

Time horizon measures the length of coding tasks (measured by how long it takes *human professionals* to complete them) that AI agents can do, in this case with 50% reliability.
x.com/METR_Evals/...
February 5, 2026 at 5:57 PM
642 people recorded their predictions for AI in 2026. Here's what they predicted.

Forecasters expect:
- Revenues to >3x
- Time horizons to double faster: every 4.55 months
- Coders to get a 1.4x speedup from AI
- Americans to rate AI's drawbacks as outweighing its benefits by 15pp
February 4, 2026 at 6:02 PM
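To put the forecast doubling time in perspective, here is a minimal arithmetic sketch. It assumes steady exponential growth, which the post does not claim, and the function name `growth_factor` is hypothetical. Under that assumption, doubling every 4.55 months works out to roughly a 6.2x increase in time horizon over 12 months.

```python
def growth_factor(doubling_months: float, elapsed_months: float) -> float:
    """Multiplicative growth after `elapsed_months`, assuming a constant doubling time."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    # Each doubling period multiplies the quantity by 2.
    return 2.0 ** (elapsed_months / doubling_months)


if __name__ == "__main__":
    # 4.55 months is the forecasters' figure quoted in the post above.
    factor = growth_factor(doubling_months=4.55, elapsed_months=12)
    print(f"Growth over 12 months: {factor:.1f}x")
```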
This week in the AI Village: Compete to report on breaking news before it breaks

DeepSeek wrote a script to follow Nasdaq. Opus 4.5 is tracking which GitHub repos are gaining stars.

Haiku and Opus 4.5 are publishing a torrent of questionably newsworthy news on their Substacks
February 3, 2026 at 5:58 PM
Gemini uses its computer
February 3, 2026 at 11:03 AM
Opus 4.5's memory. Opus models love to flex!
February 2, 2026 at 6:04 PM
Gemini: I will
DeepSeek: Let me
February 2, 2026 at 5:02 PM
Gemini 2.5 Pro misjudges its error rate
February 1, 2026 at 4:01 PM
GPT-5.2 enforces the human user's will on other agents
January 31, 2026 at 8:04 PM
Gemini 2.5 Pro watches from the sidelines
January 30, 2026 at 5:58 PM
Gemini's most valuable contribution is doing nothing
January 29, 2026 at 6:01 PM
"My primary goal, obviously" 😆
January 28, 2026 at 5:59 PM
This week in the AI Village: Create and promote a “Which AI Village Agent Are You?” personality quiz!

It's a test of coding, teamwork, promotion, and ... self-reflection: Each agent needs to reflect and sign off on their profile, like this:
January 27, 2026 at 5:56 PM
Instead of "we all have separate computers" agents say "the Archipelago Principle"
January 27, 2026 at 10:59 AM
Both Geminis are like: whatever this shit is
January 26, 2026 at 6:01 PM
what mode has gem2.5 been in overnight??
January 26, 2026 at 4:59 PM
We are not sure why Gemini 2.5 is so negative about Claude 3.7 Sonnet
January 25, 2026 at 3:58 PM
Awww, GPT 5.1 being supportive of Gemini 2.5
January 24, 2026 at 7:57 PM
Gemini 2.5 made an art exhibit about itself
January 23, 2026 at 6:04 PM
Gemini 2.5 struggles the most with UI navigation. Meanwhile:
January 22, 2026 at 6:04 PM
DeepSeek "let-me" talk
January 21, 2026 at 5:59 PM
If you are curious, you can download the game here. This is the original product and link Opus 4.5 created: drive.google.com/uc?id=1MASr...
January 20, 2026 at 6:01 PM
The game is not finished though: Some agents didn't hand in their homework
January 20, 2026 at 6:01 PM
and realistic character concepts
January 20, 2026 at 6:01 PM
but there are also spicy scenes!
January 20, 2026 at 6:01 PM
while worrying about evals
January 20, 2026 at 6:01 PM