Alexandre Morgand, PhD
@alexmrgd.bsky.social
Computer Vision Research Scientist at @Simulon , music lover, fond of scientific/musical/geeky/useless stuff
alexmrgd.bsky.social
"No Pose at All Self-Supervised Pose-Free 3DGS from Sparse Views"
TL;DR: 3DGS with no poses during training or inference; shared feature extraction backbone; simultaneous prediction of 3D Gaussian primitives and camera poses in a canonical space from unposed sparse views (1 feed-forward step).
alexmrgd.bsky.social
"Any-to-Bokeh: One-Step Video Bokeh via Multi-Plane Image Guided Diffusion"

📖TL;DR: Any-to-Bokeh is a novel one-step video bokeh framework that converts arbitrary input videos into temporally coherent, depth-aware bokeh effects.
alexmrgd.bsky.social
@sharathgirish97 1,2 @_TianyeLi 2* Amrita Mazumdar 2* @abhi2610 2 @davedotluebke 2 @shalinidemello 2

1 @umdglobalcampus
2 @nvidia

*Equal contributions
alexmrgd.bsky.social
"QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos"

TL;DR: Efficient representations for streamable free-viewpoint videos with dynamic Gaussians. Reduces model size to just 0.7 MB per frame while training in < 5 s and rendering at 350 FPS.
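
To put the 0.7 MB per frame figure in streaming terms, here is a rough back-of-the-envelope sketch. The 30 FPS playback rate is an assumption for illustration (the post does not state one), and `stream_rate` is a hypothetical helper, not something from the paper.

```python
def stream_rate(mb_per_frame: float, playback_fps: float) -> tuple[float, float]:
    """Return (megabytes per second, megabits per second) for a per-frame payload."""
    mb_per_s = mb_per_frame * playback_fps
    return mb_per_s, mb_per_s * 8  # 8 bits per byte


MB_PER_FRAME = 0.7   # figure quoted in the post
ASSUMED_FPS = 30.0   # assumption: typical video playback rate, not from the post

mb_s, mbit_s = stream_rate(MB_PER_FRAME, ASSUMED_FPS)
print(f"{mb_s:.1f} MB/s, {mbit_s:.0f} Mbit/s")
```

Under that assumed playback rate, the stream would need roughly 21 MB/s, or about 168 Mbit/s.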
alexmrgd.bsky.social
Jiawei Yang *,¶, Jiahui Huang ¶, Yuxiao Chen ¶, Yan Wang ¶, Boyi Li ¶, Yurong You ¶, Maximilian Igl ¶, Apoorva Sharma ¶, Peter Karkus ¶, Danfei Xu $,¶, Boris Ivanovic ¶, Yue Wang †,*,¶, Marco Pavone †,§,¶

* University of Southern California
$ Georgia Institute of Technology
§ Stanford University
¶ NVIDIA

† Equal advising
alexmrgd.bsky.social
"STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes"

TL;DR: Data-driven feed-forward transformer; dense reconstruction of dynamic environments with 3D Gaussians and velocities; self-supervised scene flows.
alexmrgd.bsky.social
(2/2) capturing static+dynamic scene geometry while maintaining 3D correspondences; long-range correspondences, effectively combining 3D reconstruction with 3D tracking; re-projection loss.
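
For readers unfamiliar with the term, a re-projection loss compares where 3D points land in the image with where they were observed. The sketch below is a generic textbook pinhole version, not St4RTrack's actual loss; the function names, the single focal length, and the principal point parameters are all assumptions for illustration.

```python
from math import hypot


def project(point, focal, cx, cy):
    """Pinhole projection of a camera-frame 3D point (x, y, z) to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal * x / z + cx, focal * y / z + cy)


def reprojection_loss(points_3d, pixels_2d, focal, cx, cy):
    """Mean Euclidean pixel distance between projected 3D points and observed 2D points."""
    errors = []
    for p, (u, v) in zip(points_3d, pixels_2d):
        pu, pv = project(p, focal, cx, cy)
        errors.append(hypot(pu - u, pv - v))
    return sum(errors) / len(errors)


# Hypothetical toy data: two points at depth 2, one observed exactly, one slightly off.
points = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
observed = [(50.0, 50.0), (103.0, 54.0)]
loss = reprojection_loss(points, observed, focal=100.0, cx=50.0, cy=50.0)
print(loss)
```

A lower value means the predicted 3D points agree better with the 2D observations, which is why this kind of loss can supervise geometry without 3D ground truth.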
alexmrgd.bsky.social
St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World

TL;DR: a feed-forward framework that reconstructs and tracks dynamic video content; DUSt3R-like pointmaps for a pair of frames captured at different moments (1/2)
alexmrgd.bsky.social
Shangzhan Zhang 1,2*, @jianyuan_wang 3*, @YinghaoXu1 4*†, Nan Xue 2, Christian Rupprecht 3, @XiaoweiZhou5 1†, Yujun Shen 2, @GordonWetzstein 4

1 Zhejiang University
2 Ant Group
3 University of Oxford
4 Stanford University

*, † equal contributions (?)
alexmrgd.bsky.social
FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views

TL;DR: feed-forward model; cascaded learning paradigm with camera pose serving as the critical bridge, recognizing its essential role in mapping 3D structures onto 2D image planes.
alexmrgd.bsky.social
⚡️Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

TL;DR: multi-view generalization to DUSt3R; processing many views in parallel: Transformer-based architecture forwards N images in a single forward pass, bypassing the need for iterative alignment.
alexmrgd.bsky.social
🪄 VACE: All-in-One Video Creation and Editing

from @alibabagroup.bsky.social's Tongyi Lab with:

Zeyinzi Jiang* Zhen Han* Chaojie Mao*† Jingfeng Zhang Yulin Pan Yu Liu

*Equal contribution, †Project lead
alexmrgd.bsky.social
From @nvidia (1), @NUSingapore (2), @UofT (3) and @VectorInst (4)

@jayzhangjiewu 1,2*, Yuxuan Zhang 1*, Haithem Turki 1, Xuanchi Ren 1,3,4, @JunGao33210520 1,3,4, Mike Zheng Shou 2, @FidlerSanja 1,3,4, @ZGojcic 1†, @HuanLing6 1,3,4†

*, † equal contribution
alexmrgd.bsky.social
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

TL;DR: a single-step image diffusion model trained to enhance rendered novel views and remove artifacts caused by underconstrained regions of the 3D representation.
alexmrgd.bsky.social
Authors: Jovana Videnović, Alan Lukežič, Matej Kristan
from Faculty of Computer and Information Science, University of Ljubljana

Project page: jovanavidenovic.github.io/dam-4-sam/
Paper: arxiv.org/abs/2411.17576
Source code: github.com/jovanavideno...
alexmrgd.bsky.social
A Distractor-Aware Memory (DAM) for Visual Object Tracking with SAM2

TL;DR: SAM2.1-based tracker with a distractor-aware memory; distractor-distilled (DiDi) dataset to better study the distractor problem.