@saurabhsaxena.bsky.social
440 followers 47 following 2 posts
Posts Media Videos Starter Packs
Reposted
mtschannen.bsky.social
Have you ever wondered how to train an autoregressive generative transformer on text and raw pixels, without a pretrained visual tokenizer (e.g. VQ-VAE)?

We have been pondering this during summer and developed a new model: JetFormer 🌊🤖

arxiv.org/abs/2411.19722

A thread 👇

1/
saurabhsaxena.bsky.social
SfM failing on dynamic videos? 😠 RoMo to the rescue! 💪 Our simple method uses epipolar cues and semantic features for robustly estimating motion masks, boosting dynamic SfM performance 🚀 Plus, a new dataset of dynamic scenes with ground truth cameras! 🤯 #computervision

🧵👇
lilygoli.bsky.social
Hello everyone!! 👋
Excited to be here and share our latest work to get started!

RoMo: Robust Motion Segmentation Improves Structure from Motion

romosfm.github.io

Boost the performance of your SfM pipeline on dynamic scenes! 🚀 RoMo masks dynamic objects in a video, in a zero-shot manner.