PhD from CUHK. 3D vision, SLAM, SfM, Image Matching (https://github.com/ericzzj1989/Awesome-Image-Matching).
🔹 GlobustVP
Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World
🧱 Global optimality
💥 Tolerates up to 70% outliers
⚡ Fast runtime
📄 Paper: arxiv.org/abs/2505.04788
💻 Code: github.com/WU-CVGL/GlobustVP
1/
Haodi Yao, Fenghua He, Ning Hao, Yao Su
tl;dr: different feature scales->sampled offsets->sampled features->aggregated descriptor
arxiv.org/abs/2601.09230
Sheng-Chi Hsu, Ting-Yu Yen, Shih-Hsuan Hung, Hung-Kuo Chu
tl;dr: anisotropic texture->primitive; gradient-based adaptive texture control->resolution and aspect ratio
arxiv.org/abs/2601.09243
Sooyeun Yang, Cheyul Im, Jee Won Lee, Jongseong Brad Choi
tl;dr: evidence and context->cleanup->3DGS; monocular depth regularizer
arxiv.org/abs/2601.09291
Yuchen Wu, Jiahe Li, Xiaohan Yu, Lina Yu, Jin Zheng, Xiao Bai
tl;dr: DPVO+flow/scene coordinate branches; spatially nearby historical patches->attention->scale; scene coordinate BA->pose+scale
arxiv.org/abs/2601.09665
Christopher Thirgood, Oscar Mendez, Erin Ling, Jon Storey, Simon Hadfield
tl;dr: SAM2+GS SLAM
arxiv.org/abs/2601.05738
Edgar Sucar, Eldar Insafutdinov, Zihang Lai, Andrea Vedaldi
tl;dr: VGGT+ time-variant/invariant point maps
arxiv.org/abs/2601.09499
Zirui Wu, Zeren Jiang, @martin-r-oswald.bsky.social, Jie Song
tl;dr: context views->MapAnything->depth maps->rasterizing->point cloud projection image->fine-tuning
arxiv.org/abs/2601.05116
Zichen Wang, Ang Cao, Liam J. Wang, Jeong Joon Park
tl;dr: multiple depth predictions and weights->softmax weighting-based fusion->depth estimation
arxiv.org/abs/2601.05208
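A minimal sketch of what softmax weighting-based fusion of several depth predictions could look like for a single pixel. The function names (`softmax`, `fuse_depths`) and the per-pixel list layout are assumptions for illustration, not the paper's code:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_depths(depths, logits):
    """Fuse K candidate depths for one pixel with softmax-normalized weights.

    depths: K depth predictions for the pixel (hypothetical input layout).
    logits: K unnormalized weights, one per prediction.
    """
    weights = softmax(logits)
    return sum(w * d for w, d in zip(weights, depths))

# Equal logits reduce to a plain average of the candidates.
fused = fuse_depths([1.0, 3.0], [0.0, 0.0])
```

With a strongly dominant logit the fused value approaches the corresponding candidate, so the weights act as a soft selection between predictions.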
Wei Long, Haifeng Wu, Shiyin Jiang, Jinhua Zhang, Xinchun Ji, Shuhang Gu
tl;dr: iterative warp->multiple epipolar attention maps->refined depth map->Gaussian means
arxiv.org/abs/2601.03824
Jiaxin Huang, Yuanbo Yang, Bangbang Yang, Lin Ma, Yuewen Ma, @yiyiliao.bsky.social
tl;dr: geometric latents from VGGT as VAE + appearance latents from video diffusion
arxiv.org/abs/2601.04090
@xudongjiang.bsky.social, @fangjinhuawang.bsky.social, Silvano Galliani, Christoph Vogel, @marcpollefeys.bsky.social
arxiv.org/abs/2601.04185
Zunhai Su, Weihao Ye, Hansen Feng, Keyu Fan, Jing Zhang, Dahai Yu, Zhengwu Liu, Ngai Wong
tl;dr: token importance->pruning->KV quantization
arxiv.org/abs/2601.01204
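A rough sketch of the pruning-then-quantization order named in the tl;dr: keep the highest-scoring tokens, then store the survivors in int8. The importance scores are taken as given, and the symmetric per-vector int8 scheme and all function names are assumptions rather than the paper's method:

```python
def prune_tokens(values, importance, keep):
    """Keep the `keep` highest-importance tokens, preserving original order."""
    ranked = sorted(range(len(values)), key=lambda i: importance[i], reverse=True)
    kept = sorted(ranked[:keep])
    return kept, [values[i] for i in kept]

def quantize_int8(vec):
    """Symmetric per-vector int8 quantization; returns (ints, scale)."""
    peak = max(abs(x) for x in vec)
    scale = peak / 127.0 if peak > 0 else 1.0
    return [round(x / scale) for x in vec], scale

def dequantize(ints, scale):
    """Map int8 codes back to approximate float values."""
    return [q * scale for q in ints]

# Toy cache of three token vectors with hypothetical importance scores.
cache = [[1.0, -2.0], [0.5, 0.5], [4.0, 0.0]]
kept_idx, kept_vals = prune_tokens(cache, [0.9, 0.1, 0.5], keep=2)
compressed = [quantize_int8(v) for v in kept_vals]
```

Pruning first means the quantizer only runs on tokens that survive, and each retained vector's reconstruction error stays within half a quantization step.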
Shuai Yuan, Yantai Yang, Xiaotian Yang, Xupeng Zhang, Zhonghao Zhao, Lingming Zhang, Zhipeng Zhang
tl;dr: key cosine similarity->attention-independent proxy for token importance
arxiv.org/abs/2601.02281
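One way to picture a key-similarity score that needs no attention maps: compare each token's key vector to the other keys by cosine similarity. The averaging scheme and the names below are assumptions, and whether high or low similarity marks a token as important is left open because the post does not say:

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def mean_key_similarity(keys):
    """For each token, mean cosine similarity of its key to all other keys."""
    n = len(keys)
    scores = []
    for i in range(n):
        others = [cosine(keys[i], keys[j]) for j in range(n) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    return scores

# Two identical keys and one orthogonal key.
scores = mean_key_similarity([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

The score depends only on the keys, so it can be computed without materializing attention weights.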
Mengfei Li, Peng Li, Zheng Zhang, Jiahao Lu, Chengfeng Zhao, Wei Xue, Qifeng Liu, Sida Peng, Wenxiao Zhang, Wenhan Luo, Yuan Liu, Yike Guo
arxiv.org/abs/2601.01222
Xiaopeng Guo, Yinzhe Xu, Huajian Huang, Sai-Kit Yeung
tl;dr: SphereNet+residual blocks->features from omnidirectional image->DPVO->omnidirectional BA
arxiv.org/abs/2601.02309
Jiewen Chan, @ericzzj.bsky.social, Yu-Lun Liu
tl;dr: extend Gaussians to frequency domain->hybrid Gabor & Gaussian->novel video representation->adaptive high & low-frequency balance
arxiv.org/abs/2601.00796
Wei-Tse Cheng, Yen-Jen Chiou, Yuan-Fu Yang
tl;dr: keyframe->DINOv3->dense matching->triangulation->one-shot initialization->Gaussian seed prior
arxiv.org/abs/2601.00705
Samuel Cerezo, @jcivera.bsky.social
tl;dr: embedded deformation graph->non-rigid warp; decouple rigid & non-rigid; observability analysis->visual-inertial deformable odometry
arxiv.org/abs/2601.00702
Yuchen Wu, Jiahe Li, Fabio Tosi, Matteo Poggi, Jin Zheng, Xiao Bai
tl;dr: foundation depth models->flow matching
arxiv.org/abs/2512.25008
Yi-Chuan Huang, Hao-Jen Chien, Chin-Yang Lin, Ying-Huan Chen, Yu-Lun Liu
tl;dr: multi-view outpainting->sparse-view reconstruction
arxiv.org/abs/2512.25073
@marwantaher.bsky.social, Ignacio Alzugaray, @makezur.bsky.social, Xin Kong, @ajdavison.bsky.social
tl;dr: KV-cache is underlying implicit scene representation
arxiv.org/abs/2512.22581
Zhenbao Yu, Shirong Ye, Ronghe Jin, Shunkun Liang, Zibin Liu, Huiyun Zhang, Banglei Guan
tl;dr: 2 AC+vertical direction->relative pose+focal length
arxiv.org/abs/2512.22833
tl;dr: survey in title
arxiv.org/abs/2512.22983
Huan Li, Longjun Luo, Yuling Shi, Xiaodong Gu
tl;dr: in title
arxiv.org/abs/2512.21691
Tianchen Deng, Wenhua Wu, Kunzhen Wu, Guangming Wang, Siting Zhu, Shenghai Yuan, Xun Chen, Guole Shen, Zhe Liu, Hesheng Wang
tl;dr: multi-frame relocalization using VGGT
arxiv.org/abs/2512.21883