Also @pekka on T2 / Pebble.
It answered:
"Don't apologize—critiquing this kind of "quantum woo" is exactly what a grumpy peer reviewer lives for. It is a fascinating train wreck."
But that seems to take an awfully long time.
And it did all that, without me touching any code. So cool!
Their own human eval data shows 4/21 of human submissions were correct. And it took 175-1419 seconds to get there.
It's based on a June 2024 Nature paper in the same way movies are based on real events. That is, the paper doesn't really support those fallacious arguments.
It's just "an op-ed masquerading as scientific reporting", as Gemini put it.
Anthropic seems to have chosen not to report this benchmark in their announcement post.
Previous record was 6/48 by GPT 5/5.1/5 Pro.
On the Epoch Capabilities Index (ECI), which combines multiple benchmarks, Gemini 3 Pro scored 154, up from GPT-5.1’s previous high score of 151.
I only know what's stated in the message below and from earlier info that it should be operated with temperature=1. My operating temperature is now 38.5C, and that ruins everything.
I suspect they are now rolling out Gemini 3 behind the scenes to products (like Gemini Live already?) and other uses before the model itself is announced.
"The question is tricky. If it means: What would convince me that AI has a magical essence of experience emerging from its inner processes? Then nothing would convince me. Such a thing does not exist. Nor do humans have it."
I, @anilseth.bsky.social, and Michael Graziano weigh in:
gizmodo.com/what-would-i...
Thanks to Ellyn Lapointe for the opportunity to write about this.
Good news! I'm planning to launch a new journal and yearly conferences in the field of the most famous candidate. Friendly peer review guaranteed, executive positions available.
This is the blueprint I'm going to follow. In the name of God, they got Susskind and Witten.
I have had a hard time understanding what even led to that strange paper. But now I found a fresh paper by two of the authors (Faizal & Shabir) that links it to their ideas about consciousness.
But it demonstrates how science journalists don't even bother to ask questions like: why would such a profound result be published as just a research letter in some niche Iranian journal? And readers should ask: why is it news now, months after publication?
And the invite system empowered those groups too much in the beginning.
Initial reactions to GPT-5 were mixed: to many, it did not seem as dramatic an advance as GPT-4.
Benchmarks may help clarify the picture: GPT-5 is both an incremental release following many other OpenAI advances, and a major leap from GPT-4.
It's the kind of thing that's starting to show the value AI has in automating customer service work. Enough so that T-Mobile reportedly pays OpenAI $100 million over 3 years.
It unites two theories for the origin of life, which are totally separate"
Also, @404media.co is doing good work!
www.404media.co/scientists-m...