Haofei Xu
@haofeixu.bsky.social
76 followers 150 following 7 posts
PhD student at ETH Zurich & University of Tübingen, working on 3D Vision https://haofeixu.github.io/
haofeixu.bsky.social
Check out Frano's amazing work on multi-view 3D point tracking! 🚀 Code, models, datasets, and interactive results — all available!
franorajic.bsky.social
1/4 🚀 We’re excited to release MVTracker (ICCV 2025 Oral), the first data-driven multi-view 3D point tracker. MVTracker tracks arbitrary 3D points across multiple cameras, handling occlusions and varied camera setups without per-sequence optimization.
Reposted by Haofei Xu
haoyuhe.bsky.social
🚀 Introducing our new paper, MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models.

📄 Paper: www.scholar-inbox.com/papers/He202...
arxiv.org/pdf/2508.13148
💻 Code: github.com/autonomousvi...
🌐 Project Page: cli212.github.io/MDPO/
haofeixu.bsky.social
Project page: chengzhag.github.io/publication/...
Code: github.com/chengzhag/Pa...
Interact with the scene in this video (best viewed in a desktop browser or the YouTube app): www.youtube.com/watch?v=9bKZ...
haofeixu.bsky.social
Wanna scale your feed-forward Gaussian Splatting model to 4K resolution? Come check out our #CVPR2025 poster PanSplat today 10:30–12:30 (June 14) at ExHall D, Poster #74!
haofeixu.bsky.social
Catch us on Saturday, June 14 at 5 PM, ExHall D Poster #58!
Reposted by Haofei Xu
andreasgeiger.bsky.social
Your personalized CVPR 25 @cvprconference.bsky.social conference program is now available!
www.scholar-inbox.com/conference/c...
haofeixu.bsky.social
DepthSplat: Connecting Gaussian Splatting and Depth
Project page: haofeixu.github.io/depthsplat/
Code, models, data: github.com/cvg/depthsplat
haofeixu.bsky.social
Excited to present our #CVPR2025 paper DepthSplat next week!
DepthSplat is a feed-forward model that achieves high-quality Gaussian reconstruction and view synthesis in just 0.6 seconds.
Looking forward to great conversations at the conference!
andreasgeiger.bsky.social
🏠 Introducing DepthSplat: a framework that connects Gaussian splatting with single- and multi-view depth estimation. This enables robust depth modeling and high-quality view synthesis with state-of-the-art results on ScanNet, RealEstate10K, and DL3DV.
🔗 haofeixu.github.io/depthsplat/
Reposted by Haofei Xu
katrinrenz.bsky.social
📣 Excited to share our #CVPR2025 Spotlight paper and my internship project at @wayve: SimLingo, a Vision-Language-Action (VLA) model that achieves state-of-the-art driving performance with language capabilities.

Code: github.com/RenzKa/simli...
Paper: arxiv.org/abs/2503.09594
Reposted by Haofei Xu
s-esposito.bsky.social
📢 New paper CVPR 25!
Can meshes capture fuzzy geometry? Volumetric Surfaces uses adaptive textured shells to model hair and fur without the overhead of splatting or volume rendering. It’s fast, looks great, and runs in real time even on budget phones.
🔗 autonomousvision.github.io/volsurfs/
📄 arxiv.org/pdf/2409.02482
Reposted by Haofei Xu
bernhard-jaeger.bsky.social
Introducing CaRL: Learning Scalable Planning Policies with Simple Rewards
We show how simple rewards enable scaling up PPO for planning.
CaRL outperforms all prior learning-based approaches on nuPlan Val14 and CARLA longest6 v2, using less inference compute.
arxiv.org/abs/2504.17838
Reposted by Haofei Xu
haiwen-huang.bsky.social
Excited to introduce LoftUp!

A stronger-than-ever, lightweight feature upsampler for vision encoders that can boost performance on dense prediction tasks by 20%–100%!

Easy to plug into models like DINOv2, CLIP, SigLIP — simple design, big gains. Try it out!

github.com/andrehuang/l...
Reposted by Haofei Xu
kashyap7x.bsky.social
🐎 Centaur, our first foray into test-time training for end-to-end driving. No retraining needed, just plug-and-play at deployment given a trained model. And with some clever use of buffers, it adds nearly no latency overhead. Surprising how effective this is! arxiv.org/abs/2503.11650
Reposted by Haofei Xu
andreasgeiger.bsky.social
This week we had our winter retreat jointly with Daniel Cremers' group in Montafon, Austria. 46 talks, 100 km of slopes, and night sledding, with a few of us occasionally lost and found. It was fun!