Pekka Lund
@pekka.bsky.social
Antiquated analog chatbot. Stochastic parrot of a different species. Not much of a self-model. Occasionally simulating the appearance of philosophical thought. Keeps on branching for now 'cause there's no choice.

Also @pekka on T2 / Pebble.
I think this was the first time I apologized to Gemini for making it perform a peer review for me.

It answered:

"Don't apologize—critiquing this kind of "quantum woo" is exactly what a grumpy peer reviewer lives for. It is a fascinating train wreck."
Consciousness as the foundation: New theory addresses nature of reality
Consciousness is fundamental; only thereafter do time, space and matter arise. This is the starting point for a new theoretical model of the nature of reality, presented by Maria Strømme, Professor of...
phys.org
November 26, 2025 at 7:14 PM
I kind of like it that more and more people are asking questions about LLM consciousness, since I hope that at some point it leads to more and more people asking what that actually even means in the human case.

But that seems to take an awfully long time.
Is ChatGPT Conscious?
Many users feel they’re talking to a real person. Scientists say it’s time to consider whether they’re onto something.
nymag.com
November 25, 2025 at 11:36 PM
I became curious about just how misleading that "ARC is easy for humans" narrative actually is, so I tasked Gemini 3 on Google Antigravity with implementing my own custom ARC task viewer, which shows human and Gemini eval results for each task.

And it did all that, without me touching any code. So cool!
Here's one ARC-2 example task that gives some idea of how misleading the "ARC is easy for humans" narrative by the Arc Prize Foundation is. Is that easy to solve?

Their own human eval data shows that 4/21 human submissions were correct. And it took 175-1419 seconds to get there.
ARC Prize - Play the Game
Easy for humans, hard for AI. Try ARC-AGI.
arcprize.org
November 25, 2025 at 7:43 PM
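For context on what such a viewer has to read: ARC-AGI tasks are plain JSON files with "train" and "test" lists of input/output grids. Below is a minimal Python sketch of loading a task and summarizing per-task human results. The eval-record schema, field names, and file path are my own illustrative assumptions, not the ARC Prize Foundation's actual format, and the toy records merely mirror the 4/21 and 175-1419 s figures quoted above.

```python
import json
from pathlib import Path

def load_task(path: Path) -> dict:
    """Load one ARC task: JSON with "train" and "test" lists of input/output grids."""
    with path.open() as f:
        return json.load(f)

def summarize_human_results(records: list[dict]) -> tuple[float, float, float]:
    """Per-task human summary: pass rate plus min/max solve time in seconds.

    Each record is assumed to look like {"correct": bool, "seconds": float};
    this schema is a guess, not the ARC Prize Foundation's published format.
    """
    times = [r["seconds"] for r in records]
    passed = sum(r["correct"] for r in records)
    return passed / len(records), min(times), max(times)

if __name__ == "__main__":
    task = load_task(Path("data/evaluation/0934a4d8.json"))  # hypothetical path and task id
    print(f"{len(task['train'])} demo pairs, {len(task['test'])} test pairs")

    # Toy records mirroring the figures quoted above: 4 of 21 correct, 175-1419 s.
    toy = [{"correct": i < 4, "seconds": 175 + i * 62.2} for i in range(21)]
    rate, t_min, t_max = summarize_human_results(toy)
    print(f"human pass rate {rate:.0%}, solve times {t_min:.0f}-{t_max:.0f} s")
```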
This article is just fallacies all the way down.

It's based on a June 2024 Nature paper in the same way movies are based on real events. That is, the paper doesn't really support those fallacious arguments.

It's just "an op-ed masquerading as scientific reporting", as Gemini put it.
Large language models are statistical token-prediction systems, and despite AGI claims by Mark Zuckerberg, Dario Amodei (who said AGI "may come as soon as 2026"), and Sam Altman, neuroscience suggests language alone may not produce human-level intelligence.
Is language the same as intelligence? The AI industry desperately needs it to be
The AI boom is based on a fundamental mistake.
www.theverge.com
November 25, 2025 at 3:16 PM
We forgot to add room for a battery in it.
November 24, 2025 at 11:41 PM
I imagine that, sometime right before Gemini 3 Pro was released, there was a moment at the Anthropic office when someone shouted excitedly: "We did it! We narrowly beat OpenAI for the top spot in HLE!"

Anthropic seems to have chosen to not report this benchmark in their announcement post.
November 24, 2025 at 9:25 PM
You know that AI is now on absolutely everybody's mind when even leaders of the most isolated and technologically backward tribe signal they have heard such a thing exists.
November 24, 2025 at 8:48 PM
Opus 4.5 is here!
Introducing Claude Opus 4.5
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
www.anthropic.com
November 24, 2025 at 7:07 PM
ARC-AGI is probably the most overrated and most misleadingly marketed benchmark, and the ARC Prize Foundation must be in denial about all its issues if they don't understand why their apples-to-oranges comparisons don't align with their expectations based on very misleadingly reported human baselines.
November 22, 2025 at 9:54 PM
Oh, wow, Gemini 3 Pro has solved 9/48 of the crazy hard FrontierMath tasks. And that's not even the Deep Think variant.

The previous record was 6/48, by GPT-5/5.1/5 Pro.
Gemini 3 Pro set a new record on FrontierMath: 38% on Tiers 1–3 and 19% on Tier 4.

On the Epoch Capabilities Index (ECI), which combines multiple benchmarks, Gemini 3 Pro scored 154, up from GPT-5.1’s previous high score of 151.
November 21, 2025 at 8:31 PM
I have used Gemini daily for a year or so now, and this long-awaited release is a big deal and seems to be great.

I only know what's stated in the message below, plus earlier info that it should be run at temperature=1. My operating temperature is now 38.5°C, and that ruins everything.
I had access to Gemini 3. It is a very good, very fast model. It also demonstrates the change from chatbot to agent. www.oneusefulthing.org/p/three-year...
Three Years from GPT-3 to Gemini 3
From chatbots to agents
www.oneusefulthing.org
November 18, 2025 at 10:29 PM
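On the temperature=1 note in the post above, here is a minimal sketch of setting it explicitly when calling Gemini from Python. The choice of the google-generativeai SDK and the model id are my assumptions for illustration; the post only says the model should be run at temperature=1.

```python
import google.generativeai as genai

# Minimal sketch: pass temperature=1 explicitly in the generation config.
# The SDK choice and the model id below are illustrative assumptions.
genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro-preview")  # hypothetical model id
response = model.generate_content(
    "Summarize this paper's main claim in two sentences.",
    generation_config={"temperature": 1.0},  # the recommended setting mentioned above
)
print(response.text)
```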
Yet another fresh Google release powered by an unspecified Gemini model.

I suspect they are now rolling out Gemini 3 behind the scenes to products (like Gemini Live already?) and other uses before the model itself is announced.
SIMA 2: A Gemini-Powered AI Agent for 3D Virtual Worlds
Introducing SIMA 2, the next milestone in our research creating general and helpful AI agents. By integrating the advanced capabilities of our Gemini models, SIMA is evolving from an instruction-foll…
deepmind.google
November 13, 2025 at 4:22 PM
Putin looks pale.
A humanoid robot powered by artificial intelligence, believed to be one of the first in Russia, face-planted during its highly anticipated debut in Moscow on Tuesday after briefly staggering onstage. nyti.ms/49Ly3GI
November 13, 2025 at 12:48 AM
Graziano doesn't pull any punches:

"The question is tricky. If it means: What would convince me that AI has a magical essence of experience emerging from its inner processes? Then nothing would convince me. Such a thing does not exist. Nor do humans have it."
November 12, 2025 at 12:25 AM
Are you a famous scientist?

Good news! I'm planning to launch a new journal and yearly conferences in the field of the most famous candidate. Friendly peer review guaranteed, executive positions available.

This is the blueprint I'm going to follow. In the name of God, they got Susskind and Witten.
Opening session of The 4th International Conference on Holography and its Applications
YouTube video by Journal of Holography Applications in Physics
www.youtube.com
November 6, 2025 at 5:40 PM
Kimi K2 Thinking and its announcement tech blog are now live.
Kimi K2 Thinking
Kimi K2 Thinking, Moonshot's best open-source thinking model.
moonshotai.github.io
November 6, 2025 at 3:20 PM
OK, mystery solved.

I have had a hard time understanding what even led to that strange paper. But now I have found a fresh paper by two of the authors (Faizal & Shabir) that links it to their ideas about consciousness.
November 4, 2025 at 7:00 PM
No, it doesn't prove anything like that.

But it demonstrates how science journalists don't even bother to ask questions like why such a profound result would be published as just a research letter in some niche Iranian journal. And readers should ask why it is news now, months after publication.
November 2, 2025 at 9:33 PM
True. Groups of early adopters acting as bullies and agitators have been a problem from early on, and they have steered this place in bad directions and caused a lot of reputational damage to the site.

And the invite system empowered those groups too much in the beginning.
Bluesky has a lot of potential but has a real problem: the mods and leadership are clearly afraid of crossing a certain class of early-adopters who make the place very unpleasant to anyone who does not conform to their precise set of opinions. And it seems to be quite literally killing the site.
September 2, 2025 at 10:47 PM
Reposted by Pekka Lund
Has LLM progress slowed?

Initial reactions to GPT-5 were mixed: to many, it did not seem as dramatic an advance as GPT-4.

Benchmarks may help clarify the picture: GPT-5 is both an incremental release following many other OpenAI advances, and a major leap from GPT-4.
September 1, 2025 at 9:00 AM
Doesn't the brain deserve a break if you already got the milkshake?
August 30, 2025 at 8:43 PM
This probably means there will be smaller distilled versions of DeepSeek R2 trained on top of Qwen/Llama base models, like with R1. So Ascend doesn't need to handle training of the actual R2 architecture, or training any model from scratch.
Sources: DeepSeek plans to use Huawei's Ascend AI chips to train smaller versions of its upcoming R2 models but will still use Nvidia chips for largest models (The Information)

August 29, 2025 at 4:44 PM
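For readers unfamiliar with the R1 recipe referenced above: distillation here just means having the big reasoning model generate chain-of-thought traces and then running ordinary supervised fine-tuning on a smaller Qwen/Llama base. A rough sketch of the data-collection half, with the model name, prompts, and file path as illustrative assumptions rather than DeepSeek's actual pipeline:

```python
import json
from transformers import pipeline

# Step 1 of an R1-style distillation recipe: collect reasoning traces from a
# large "teacher" model. The teacher name is a stand-in (and far too large to
# run locally); prompts and file paths are illustrative assumptions.
TEACHER = "deepseek-ai/DeepSeek-R1"
PROMPTS = [
    "Prove that the sum of two even integers is even.",
    "What is 17 * 23? Reason step by step.",
]

teacher = pipeline("text-generation", model=TEACHER)

with open("distill_traces.jsonl", "w") as f:
    for prompt in PROMPTS:
        trace = teacher(prompt, max_new_tokens=512)[0]["generated_text"]
        f.write(json.dumps({"prompt": prompt, "completion": trace}) + "\n")

# Step 2 (not shown): a smaller Qwen/Llama base model is supervised-fine-tuned
# on distill_traces.jsonl. That step never touches the full R2 architecture and
# trains nothing from scratch, which is why it could run on Ascend hardware
# while the flagship model stays on Nvidia.
```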
T-Mobile just demoed their device upgrade customer service process powered by the new OpenAI speech-to-speech model.

It's the kind of thing that's starting to show the value AI has in automating customer service work. Enough so that T-Mobile reportedly pays OpenAI $100 million over 3 years.
Introducing gpt-realtime in the API
YouTube video by OpenAI
youtu.be
August 28, 2025 at 7:28 PM
"The work revealed that two contrasting origin stories for life on Earth, known as “RNA world” and “thioester world,” may both be right.

It unites two theories for the origin of life, which are totally separate"
August 28, 2025 at 1:09 PM