PhD from CUHK. 3D vision, SLAM, SfM, Image Matching (https://github.com/ericzzj1989/Awesome-Image-Matching).
🔹 GlobustVP
Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World
🧱 Global optimality
💥 Tolerates up to 70% outliers
⚡ Fast runtime
📄 Paper: arxiv.org/abs/2505.04788
💻 Code: github.com/WU-CVGL/GlobustVP
1/
Yu Hu, Chong Cheng, Sicheng Yu, Xiaoyang Guo, Hao Wang
tl;dr: VGGT global attention->gram similarity statistics->gradient-aware refinement->dynamic masks->VGGT shallow attentions
arxiv.org/abs/2511.19971
Hengyi Wang, Lourdes Agapito
tl;dr: VGGT+scale head->pointmaps+geometric features->sparse voxels->1D sequence->transformer->fused features->zero-convolution->VGGT decoder
arxiv.org/abs/2511.20343
Zhimin Shao, Abhay Yadav, Rama Chellappa, Cheng Peng
tl;dr: 3D VFM+2D ConvNet->feature extraction backbone; 3D descriptor head (for geometry)+2D warp head (for pattern) fusion
arxiv.org/abs/2511.17750
Jungho Lee, Minhyeok Lee, Sunghun Yang, Minseok Kang, Sangyoun Lee
tl;dr: depth/scale-guided point sampling->non-iterative Sim(3) alignment; DINO patch token->loop closure
arxiv.org/abs/2511.18290
Haonan Wang, Hanyu Zhou, Haoyue Liu, Luxin Yan
tl;dr: 4D version of VGGT
arxiv.org/abs/2511.18416
Kuan Wei Huang, Brandon Li, @bharathhariharan.bsky.social, @snavely.bsky.social
tl;dr: in title; paired floor plans and ground-view photos with annotated correspondences and poses
arxiv.org/abs/2511.18559
Carl Lindström, Mahan Rafidashti, Maryam Fatemi, Lars Hammarstrand, @martin-r-oswald.bsky.social, Lennart Svensson
tl;dr: coherent instances->dynamic objects
arxiv.org/abs/2511.19235
Kehua Chen, et al.
tl;dr: 2DGS+dense enhancement with π3+monocular&PatchMatch-based multi-view opt.+depth-guided appearance modeling with Tri-MipRF
arxiv.org/abs/2511.19172
Xueyu Du, Lilian Zhang, Fuan Duan, Xincan Luo, Maosong Wang, Wenqi Wu, Jun Mao
tl;dr: implicit environment map with keyframes and 2D keypoints->loop closure
arxiv.org/abs/2511.18756
Samuel Cerezo, Seong Hun Lee, @jcivera.bsky.social
tl;dr: in title; not local solver; small-rotation and constant-velocity approximations->analytical solver->VI states
arxiv.org/abs/2511.18910
Yan Xu, Yixing Wang, Stella X. Yu
tl;dr: GS init.+interpolated poses->guidance images with uncertainties->video diffusion->pseudo views->GS supervision
arxiv.org/abs/2511.17932
Yiming Wang, Shaofei Wang, Marko Mihajlovic, Siyu Tang
tl;dr: tri-plane+neural decoder->local RGBA texture fields
arxiv.org/abs/2511.18873
Seunghun Oh, Jaesung Choe, Dongjae Lee, Daeun Lee, Seunghoon Jeong, Yu-Chiang Frank Wang, Jaesik Park
tl;dr: SDF->sparse voxel rasterization; initialization+loss improve spatial coherence
arxiv.org/abs/2511.17364
Kunyi Li, @miniemeyer.bsky.social, Sen Wang, Stefano Gasperini, Nassir Navab, Federico Tombari
tl;dr: local 3D reconstruction+global GS
arxiv.org/abs/2511.17207
@idris1.bsky.social, @ericzzj.bsky.social, Samuel Schmidgall, Yumeng Wang, Paul Maria Scheikl, Axel Krieger
tl;dr: Gaussian Surfels->dynamic surgical scenes
arxiv.org/abs/2503.04079
Zijian Wu, Mingfeng Jiang, Zidian Lin, Ying Song, Hanjie Ma, Qun Wu, Dongping Zhang, Guiyang Pu
tl;dr: real views+multiple perturbation magnitudes->pseudo-views->optimization
arxiv.org/abs/2511.16030
@parskatt.bsky.social and 11 others
tl;dr: in title.
Predict covariance per-pixel, more datasets, use DINOv3, adjust architecture.
arxiv.org/abs/2511.15706
Here are the main improvements we made since RoMa:
The International Workshop on AI4Robotics by @naverlabseurope
2 days of Spatial AI, SLAM, robot learning, HRI, autonomy
This AM CET: @martinhumenberger.bsky.social @marcpollefeys.bsky.social Andrea Vedaldi Cordelia Schmid & @andrewdavidson.bsky.social ⬇️
Hoang Chuong Nguyen, Wei Mao, Jose M. Alvarez, Miaomiao Liu
tl;dr: base color from 3DGS rendering and learned residual inferred from nearby training images->pixel color
arxiv.org/abs/2511.14357
Yutian Chen, @yuhengqiu.bsky.social, Ruogu Li, Ali Agha, Shayegan Omidshafiei, Jay Patrikar, @smash0190.bsky.social
tl;dr: ViT->distillation->per-token confidence->rank tokens->selective merging
arxiv.org/abs/2511.14751
Xinrui Li, Qi Cai, Yuanxin Wu
tl;dr: pose-only->decouple translation from rotation->rotation-only; reprojection error on rotation manifold
arxiv.org/abs/2511.12415
Yuqi Zhang, Guanying Chen, Jiaxing Chen, Chuanyu Fu, Chuan Huang, Shuguang Cui
tl;dr: enhance the quality of conditioning images
arxiv.org/abs/2511.13121
Haosong Peng, Hao Li, Yalun Dai, @yushi-lan.bsky.social, Yihang Luo, Tianyu Qi, Zhengshen Zhang, Yufeng Zhan, Junfei Zhang, Wenchao Xu, Ziwei Liu
tl;dr: depth and camera intrinsics/extrinsics->VGGT
arxiv.org/abs/2511.10560