@spellbanisher.bsky.social
This one wasn't too bad
December 1, 2025 at 12:35 AM
Same prompt, but on a foggy car window instead of a chalkboard
November 29, 2025 at 2:13 PM
Doubling the length of a video quadruples its cost. At 10 cents per second for a 10-second video, an 80-minute video (a short movie) would cost about $260,000. That's just for the base model at 720p. If you wanted pro at 1024p, about $1.3 million.
October 15, 2025 at 7:29 PM
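A rough sketch of the arithmetic behind the estimate in the post above. It assumes the post's premise that cost grows with the square of length (that scaling is the post's claim, not a published pricing rule), and it uses 10 cents per second at 720p and 50 cents per second for pro at 1024p. The function names are made up for illustration. Nine doublings from 10 seconds gives 5,120 seconds (about 85 minutes), which is where a figure of roughly $260,000 comes from; at exactly 80 minutes the same assumption gives about $230,000.

```python
def quadratic_cost(target_seconds, base_seconds=10.0, rate_per_second=0.10):
    """Dollar cost if cost scales with the square of video length.

    Assumption taken from the post: doubling length quadruples cost,
    anchored at a base clip costing base_seconds * rate_per_second.
    """
    base_cost = base_seconds * rate_per_second
    return base_cost * (target_seconds / base_seconds) ** 2


def cost_after_doublings(doublings, base_seconds=10.0, rate_per_second=0.10):
    """Return (length in seconds, cost in dollars) after n doublings."""
    length = base_seconds * 2 ** doublings
    return length, quadratic_cost(length, base_seconds, rate_per_second)


# Exactly 80 minutes (4,800 s) at the base 720p rate.
exact_80_min = quadratic_cost(80 * 60)

# Nine doublings from a 10 s clip: 5,120 s, roughly 85 minutes.
length_9, cost_9 = cost_after_doublings(9)

# Same length at the pro 1024p rate of 50 cents per second.
_, pro_cost_9 = cost_after_doublings(9, rate_per_second=0.50)

print(f"80 min, base 720p:        ${exact_80_min:,.0f}")
print(f"{length_9:.0f} s, base 720p:       ${cost_9:,.0f}")
print(f"{length_9:.0f} s, pro 1024p:       ${pro_cost_9:,.0f}")
```

Under linear per-second pricing the same 80 minutes would instead be 4,800 times the per-second rate, so the whole estimate depends on the quadratic assumption.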
It doesn't fundamentally change your point, but OpenAI released API prices for Sora 2. It is 10 cents per second for 720p video on the base model, 30 cents per second for the pro model at 720p, and 50 cents per second for pro at 1024p.
October 8, 2025 at 2:48 AM
OpenAI doesn't create any original content, so basically Sora is trained on almost nothing but copyrighted content.
October 2, 2025 at 3:17 AM
15 cents per second or .15 cents?
October 2, 2025 at 12:33 AM
Before AGI they should try to make AI that can reliably take drive-thru fast food orders

gizmodo.com/taco-bell-sa...
Taco Bell Says 'No Más' to AI Drive-Thru Experiment
If you think humans get your order wrong, wait until you try AI.
gizmodo.com
August 28, 2025 at 7:39 PM
Not even the fast food checkout AI is reliable enough.

gizmodo.com/taco-bell-sa...
Taco Bell Says 'No Más' to AI Drive-Thru Experiment
If you think humans get your order wrong, wait until you try AI.
gizmodo.com
August 28, 2025 at 7:36 PM
Too big to fail
August 5, 2025 at 6:28 PM
Using AI to write this article made it tedious to read. The 'rule of three' (where you have lists of attributes in 3 clauses) is especially egregious here, where it is used 5 times in 4 sentences.
July 5, 2025 at 1:41 PM
I've seen several videos of walking humanoids being kicked and shoved without falling over. Why or how is it that they are still too unsafe?
April 11, 2025 at 2:38 AM
$17-20 for the high-efficiency mode. The low-efficiency mode used about 172x as much compute.
December 29, 2024 at 1:36 AM
I don't think they have individuals take the entire test. Rather, they'll have turkers complete like 5 tasks and after they get a sufficient sample size they can estimate what an average person would score if they took the whole test.
December 26, 2024 at 4:47 PM
They have a private test, but it can only be taken after the benchmark is beaten on the public and semi-private test sets. O3 did not beat the benchmark. It met the score requirement (85% or above), but not the cost requirement (less than $10k).
December 26, 2024 at 4:42 PM
Are Amazon turkers considered above or below American average?
December 25, 2024 at 7:11 PM
It is actually based on the training set. Another study found that the two-shot average on the evaluation set was 60%, with a high of 98%. They also estimated that an ensemble of 10 randomly selected people online would score 100%.
arxiv.org/html/2409.01...
December 23, 2024 at 9:06 PM
A smaller open source model running on less than $0.10 per task managed 56% on ARC-AGI. O3 used 30,000x as much compute to get 88%. Wouldn't be surprised if it used similar methods, with the difference being compute. OpenAI did train the model for this domain.
December 21, 2024 at 2:58 PM
OpenAI did say they fine-tuned o3 on the 400 public eval questions. Since GPT-4o and o1-preview were tested on the semi-private evaluations, OpenAI would also have those questions if they wanted to fine-tune their models on them as well.
December 21, 2024 at 2:08 AM
"Semi-private" refers to the test set ARC-AGI created specifically for frontier models. The problems can't be seen beforehand (hence private), but because frontier models run on the developer's API, the developers have the test set once their model takes the test.
December 21, 2024 at 2:05 AM
From Google Labs ImageFX
December 18, 2024 at 2:54 AM