Nicolas Dufour
@nicolasdufour.bsky.social
400 followers 430 following 42 posts
PhD student at IMAGINE (ENPC) and GeoVic (Ecole Polytechnique). Working on image generation. http://nicolas-dufour.github.io
Pinned
nicolasdufour.bsky.social
🌍 Guessing where an image was taken is a hard and often ambiguous problem. Introducing diffusion-based geolocation: we predict global locations by refining random guesses into trajectories across the Earth's surface!

🗺️ Paper, code, and demo: nicolas-dufour.github.io/plonk
Reposted by Nicolas Dufour
davidpicard.bsky.social
Today is Antoine Guedon's PhD! Already pretty cool visuals right at the start.
Reposted by Nicolas Dufour
davidpicard.bsky.social
Annnnnd it's a reject!

Scale is a religion and if you go against it, you're a heretic and you should burn, "despite [the reviewers] final ratings".

But scale is still not necessary!

Side note: this is the first time that swinging reviews up (from 2,2,4,4 to 2,4,4,5) has not gotten the paper accepted. Strange days.
davidpicard.bsky.social
Dear bsky friends, I have a question: Do you really think that the visual quality of these images is so bad that the research that produced them is deeply flawed?
And if I told you that the model was mostly trained on ImageNet with a bit of artistic fine-tuning at 1024 resolution, still really bad?
Reposted by Nicolas Dufour
francois-rozet.bsky.social
Does a smaller latent space lead to worse generation in latent diffusion models? Not necessarily! We show that LDMs are extremely robust to a wide range of compression rates (10-1000x) in the context of physics emulation.

We got lost in latent space. Join us 👇
Reposted by Nicolas Dufour
davidpicard.bsky.social
Next week, I'll be in Strasbourg for the GRETSI (@gretsi-info.bsky.social) to present a small discovery on transformers generalization we made with Simon and Jérémie while working on generative recommender systems. I love these "phase transition" plots.

📜: arxiv.org/abs/2508.03934

Short summary 👇
nicolasdufour.bsky.social
Makes me think of StyleGAN3 visualizations
nicolasdufour.bsky.social
Congrats to the Dino team for the DinoV3 release!

Seeing it outperform CLIP on "cultural knowledge" based tasks like geoloc makes me very hopeful that it will work really well in VLMs!
nicolasdufour.bsky.social
🌍 Geoloc is a fantastic downstream benchmark:

- Requires fine-grained visual understanding (textures, vegetation, road signs, architecture)

- Tests global generalization

- Forces models to pick up real-world cues

That’s why DinoV3 shining here is such a big deal 🚀
nicolasdufour.bsky.social
Even crazier 🤯 DinoV3 works in some out-of-distribution setups too — as long as there are geographical cues 🌄🗺️

(Remember: the network is trained only on road images!)

Where DinoV2 totally failed, DinoV3 is holding up 👊
nicolasdufour.bsky.social
The setup 👉 We use our Riemannian flow matching model PLONK (CVPR25: Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation) 🌍

We simply swap StreetCLIP with DinoV3 as a drop-in backbone, and train on OpenStreetView-5M.

And boom 💥 — DinoV3 wins.
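
Editor's sketch (not from the PLONK code base): the idea of "refining random guesses into trajectories across the Earth's surface" can be illustrated with the great-circle path that flow matching on a sphere is typically trained to follow. The helper names (`latlon_to_unit`, `geodesic_point`, `trajectory`) and the choice of 80 steps are assumptions made for illustration. In the actual model a learned velocity field, conditioned on image features from the backbone, would drive the steps; here the target location is simply given.

```python
import math
import random


def random_unit_vector(rng):
    """Sample a random starting guess uniformly on the unit sphere."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        if n > 1e-12:
            return [c / n for c in v]


def latlon_to_unit(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a 3D unit vector."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return [math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat)]


def unit_to_latlon(p):
    """Convert a 3D unit vector back to latitude/longitude in degrees."""
    lat = math.degrees(math.asin(max(-1.0, min(1.0, p[2]))))
    lon = math.degrees(math.atan2(p[1], p[0]))
    return lat, lon


def geodesic_point(x0, x1, t):
    """Point at fraction t along the great circle from x0 to x1.

    Assumes x0 and x1 are not antipodal (the geodesic is not unique there).
    """
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(x0, x1))))
    theta = math.acos(dot)
    if theta < 1e-9:  # start and target coincide
        return list(x1)
    s = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    return [w0 * a + w1 * b for a, b in zip(x0, x1)]


def trajectory(x0, x1, steps=80):
    """Discretize the geodesic from x0 to x1 into steps + 1 points."""
    return [geodesic_point(x0, x1, k / steps) for k in range(steps + 1)]


if __name__ == "__main__":
    start = random_unit_vector(random.Random(0))  # random initial guess
    target = latlon_to_unit(48.85, 2.35)          # illustrative target location
    path = trajectory(start, target)
    print(unit_to_latlon(path[0]), unit_to_latlon(path[-1]))
```

Every point of the path stays on the sphere, which is the property a Riemannian formulation preserves and a plain Euclidean interpolation of latitude/longitude would not.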
nicolasdufour.bsky.social
🚀 DinoV3 just became the new go-to backbone for geoloc!
It outperforms CLIP-like models (SigLip2, finetuned StreetCLIP)… and that’s shocking 🤯
Why? CLIP models have an innate advantage — they literally learn place names + images. DinoV3 doesn’t.
nicolasdufour.bsky.social
I had the privilege of being invited to talk about our work "Around the World in 80 Timesteps" on the French podcast Underscore! If you speak French, I highly recommend it; they did a great job with the editing!

If you want to learn more nicolas-dufour.github.io/plonk

www.youtube.com/watch?v=s5oH...
He designed the first OSINT AI (terrifying… and brilliant)
YouTube video by Underscore_
www.youtube.com
Reposted by Nicolas Dufour
abursuc.bsky.social
1/ Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with a ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, and DINOv2 on various benchmarks, setting a new standard for open-source research.
nicolasdufour.bsky.social
Really cool work! I've seen that you haven't used registers but seem to have smooth latents anyway. Is this a consequence of the matryoshka loss, which requires both global and local knowledge?
Have you tried using registers?
Reposted by Nicolas Dufour
pierremarion.bsky.social
✨Thrilled to see EurIPS launch — the first officially endorsed European NeurIPS presentation venue!

👀 But NeurIPS now requires at least one author to attend in San Diego or Mexico (and not just virtually as before). This is detrimental to many. Why not allow presenting at EurIPS or online?
1/4
euripsconf.bsky.social
EurIPS is coming! 📣 Mark your calendar for Dec. 2-7, 2025 in Copenhagen 📅

EurIPS is a community-organized conference where you can present accepted NeurIPS 2025 papers, endorsed by @neuripsconf.bsky.social and @nordicair.bsky.social and is co-developed by @ellis.eu

eurips.cc
nicolasdufour.bsky.social
Sadly, helium doesn't carry enough weight 😭
Reposted by Nicolas Dufour
imagineenpc.bsky.social
Some of our IMAGINE members at #CVPR2025
Reposted by Nicolas Dufour
davidpicard.bsky.social
Come on! Who else has a hot air balloon on their poster?

(fun fact: there is no hot air balloon emoji, but @loicland.bsky.social made a tikz macro for it! 😅)
nicolasdufour.bsky.social
Come see us at poster 186 for our paper Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation!

Cc @loicland.bsky.social @davidpicard.bsky.social @vickykalogeiton.bsky.social
Reposted by Nicolas Dufour
elliotvincent.bsky.social
A bit disappointed by the PAMI TC meeting, mostly repetitions of what’s been said at the opening, the "open discussion" slide was really just there to *exist* but no discussion/vote took place, no topic was debated. What space is left to reflect on our community and what we stand for as scientists?
Reposted by Nicolas Dufour
vincentlepetit.bsky.social
I am heartbroken that I am not at the conference, but seeing what the government is doing to its people and the world, I simply couldn't go there.
Reposted by Nicolas Dufour
elliotvincent.bsky.social
I will also be presenting CoDeX at the same workshop between 1:15PM and 1:45PM.

Abhishek Kuriyal, Mathieu Aubry, @loicland.bsky.social and I improve the performance of deep learning models in challenging domain shift settings by learning how to combine spatial domain experts.
Reposted by Nicolas Dufour
elliotvincent.bsky.social
Discover DAFA-LS, a dataset of SITS centered on Afghan archeological sites and annotated with preservation classification labels.

🎤 1:45PM Oral (room 208 B)
📰 4:30PM Poster (poster boards #419 – #443)
Reposted by Nicolas Dufour
elliotvincent.bsky.social
I will be presenting our work on the detection of archaeological looting with satellite image time series at CVPR 2025 EarthVision workshop tomorrow!

Honored and grateful that this paper received the best student paper award!