- 19x faster convergence ⚡
- 370x fewer FLOPs than FLUX-dev 📉
It outperforms CLIP-like models (SigLIP 2, finetuned StreetCLIP)… and that’s shocking 🤯
Why? CLIP-style models have an innate advantage: they literally learn to pair place names with images. DINOv3 never sees any text.
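To make the contrast concrete, here is a minimal sketch (not the paper's code) of the two setups: CLIP can be queried zero-shot with place-name prompts because it was pretrained on image-text pairs, while a DINOv3-style vision-only backbone only produces features that a separately trained head must map to locations. The checkpoint names and image path below are illustrative assumptions.

```python
# Hedged sketch, not the paper's pipeline: why CLIP-like models start with an
# advantage on geolocation. Checkpoint ids and the image path are assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor, AutoModel, AutoImageProcessor

image = Image.open("street_scene.jpg")  # any street-level photo

# --- CLIP: place names are "baked in" via image-text pretraining ---------------
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
clip_proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
prompts = [f"a street photo taken in {c}" for c in ["France", "Japan", "Brazil"]]
inputs = clip_proc(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    # Zero-shot country scores straight out of the box, no extra training.
    country_probs = clip(**inputs).logits_per_image.softmax(dim=-1)

# --- DINOv3-style backbone: vision only, no text, no place names ----------------
backbone_id = "facebook/dinov3-vitb16-pretrain-lvd1689m"  # assumed checkpoint id
proc = AutoImageProcessor.from_pretrained(backbone_id)
backbone = AutoModel.from_pretrained(backbone_id)
with torch.no_grad():
    # CLS-token features; to geolocate, these must feed a separately trained
    # head (e.g. a regression or generative head over GPS coordinates).
    feats = backbone(**proc(images=image, return_tensors="pt")).last_hidden_state[:, 0]
```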
Cc @loicland.bsky.social @davidpicard.bsky.social @vickykalogeiton.bsky.social
How far can we go with ImageNet for Text-to-Image generation? w. @arrijitghosh.bsky.social @lucasdegeorge.bsky.social @nicolasdufour.bsky.social @vickykalogeiton.bsky.social
TL;DR: Train a text-to-image model with 1000x less data in 200 GPU hrs!
📜https://arxiv.org/abs/2502.21318
🧵👇
☑️With MAtCha, we leverage a pretrained depth model to recover sharp meshes from sparse views including both foreground and background, within mins!🧵
🌐Webpage: anttwo.github.io/matcha/
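For intuition only, here is a hedged sketch of the general idea of lifting a depth map (in practice predicted by a pretrained monocular depth model) into a triangle mesh by back-projecting pixels with camera intrinsics and triangulating the pixel grid. This is not MAtCha's actual method; the depth map and intrinsics below are synthetic stand-ins.

```python
# Illustrative sketch, NOT MAtCha's pipeline: turn a per-pixel depth map into a
# triangle mesh by lifting pixels to 3D and connecting neighbours into triangles.
import numpy as np

H, W = 120, 160                       # toy image resolution
fx = fy = 100.0                       # assumed focal lengths (made-up intrinsics)
cx, cy = W / 2, H / 2                 # principal point

# Stand-in depth map; a real pipeline would use a pretrained depth network here.
ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
depth = 2.0 + 0.3 * np.sin(xs / 20.0)  # synthetic, smoothly varying depth (metres)

# Back-project every pixel (u, v, d) to a 3D point (X, Y, Z) in camera coordinates.
X = (xs - cx) * depth / fx
Y = (ys - cy) * depth / fy
vertices = np.stack([X, Y, depth], axis=-1).reshape(-1, 3)   # (H*W, 3)

# Triangulate the pixel grid: each 2x2 block of pixels yields two triangles.
idx = np.arange(H * W).reshape(H, W)
tl, tr = idx[:-1, :-1], idx[:-1, 1:]
bl, br = idx[1:, :-1], idx[1:, 1:]
faces = np.concatenate([
    np.stack([tl, bl, tr], axis=-1).reshape(-1, 3),
    np.stack([tr, bl, br], axis=-1).reshape(-1, 3),
])

print(vertices.shape, faces.shape)    # (19200, 3) (37842, 3)
```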
🗺️ Paper, code, and demo: nicolas-dufour.github.io/plonk