Their low-compute version satisfies the <$10k cost cap for the public leaderboard. The public leaderboard is separate from the main prize and allows closed models.
So their “official” public leaderboard score is 75.7%, which is still much higher than any other model's.
OpenAI’s o3 got 75.7% under this requirement, so that is the official score on the leaderboard.
OpenAI then tried the high-compute version to target the 85% goal, and they spent millions to get 87.5% on that.
So the open-source requirement is only relevant for the prize money.
OpenAI targeted the public leaderboard, which allows closed models to participate.
An interesting bifurcation: SOF seems to focus more on firearms and mobility, while regular infantry also needs protection against shrapnel.
x.com/madebyollin/...
bsky.app/profile/sedi...