gaoyuezhou.bsky.social
@gaoyuezhou.bsky.social
The object and spatial understanding priors of DINOv2 features enable robust scene understanding, which is essential for navigation and manipulation tasks. With these priors, DINO-WM outperforms state-of-the-art world models by 45% in downstream task performance on our hardest tasks.
January 31, 2025 at 7:24 PM
Can we extend the power of world models beyond just online model-based learning? Absolutely!

We believe the true potential of world models lies in enabling agents to reason at test time.
Introducing DINO-WM: World Models on Pre-trained Visual Features for Zero-shot Planning.
January 31, 2025 at 7:24 PM