Spyros Gidaris
@spyrosgidaris.bsky.social
150 followers 130 following 5 posts
Senior Research Scientist at Valeo.ai (@valeoai.bsky.social) https://gidariss.github.io/
Reposted by Spyros Gidaris
valeoai.bsky.social
Congratulations to our lab colleagues who have been named Outstanding Reviewers at #ICCV2025 👏

Andrei Bursuc @abursuc.bsky.social
Anh-Quan Cao @anhquancao.bsky.social
Renaud Marlet
Eloi Zablocki @eloizablocki.bsky.social

@iccv.bsky.social
iccv.thecvf.com/Conferences/...
2025 ICCV Program Committee
Reposted by Spyros Gidaris
gillespuy.bsky.social
Update: ResearchGate has investigated the case, and, as far as I can see, all the suspicious papers (~200) have now been removed. Many thanks to the @researchgate.bsky.social team!
gillespuy.bsky.social
Discovered that our RangeViT paper keeps being cited in what might be LLM-generated papers. The number of citations has increased rapidly in recent weeks. Too good to be true.

Papers popped up on different platforms, but mainly on ResearchGate with ~80 papers in just 3 weeks.
[1/]
spyrosgidaris.bsky.social
Three papers accepted to #NeurIPS2025 (one spotlight)! 🎉

Awesome works in generative modeling, multi-token prediction, and future prediction.

Congratulations to all collaborators!
@nasosger.bsky.social, @sta8is.bsky.social, @nicolabourbaki.bsky.social, @ikakogeorgiou.bsky.social & N. Komodakis!
Reposted by Spyros Gidaris
abursuc.bsky.social
1/ Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, DINOv2 on various benchmarks, setting a new standard for open-source research.
Reposted by Spyros Gidaris
abursuc.bsky.social
1/ New & old work on self-supervised representation learning (SSL) with ViTs:
MOCA ☕ - Predicting Masked Online Codebook Assignments w/ @spyrosgidaris.bsky.social, O. Simeoni, A. Vobecky, @matthieucord.bsky.social, N. Komodakis, @ptrkprz.bsky.social #TMLR #ICLR2025
Grab a ☕ & brace for a story & a 🧵
Reposted by Spyros Gidaris
ssirko.bsky.social
1/n 🚀New paper out - accepted at #ICCV2025!

Introducing DIP: unsupervised post-training that enhances dense features in pretrained ViTs for dense in-context scene understanding

Below: Low-shot in-context semantic segmentation examples. DIP features outperform DINOv2!
Reposted by Spyros Gidaris
paulcouairon.bsky.social
🚀Thrilled to introduce JAFAR—a lightweight, flexible, plug-and-play module that upsamples features from any Foundation Vision Encoder to any desired output resolution (1/n)

Paper: arxiv.org/abs/2506.11136
Project Page: jafar-upsampler.github.io
Github: github.com/PaulCouairon...
Reposted by Spyros Gidaris
gkordo.bsky.social
Are you at @cvprconference.bsky.social? Come by our poster!
📅 Sat 14/6, 10:30-12:30
📍 Poster #395, ExHall D
spyrosgidaris.bsky.social
I am at #CVPR2025 this week in Nashville!

Presenting "Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers" on multi-modal semantic future prediction.

Come discuss!

Fri 13 Jun 10:30-12:30, poster #345
bsky.app/profile/sta8...
sta8is.bsky.social
🧵 Excited to share our latest work: FUTURIST - A unified transformer architecture for multimodal semantic future prediction, is accepted to #CVPR2025! Here's how it works (1/n)
👇 Links to the arxiv and github below
Reposted by Spyros Gidaris
nicolabourbaki.bsky.social
1/n Introducing ReDi (Representation Diffusion): a new generative approach that leverages a diffusion model to jointly capture
– Low-level image details (via VAE latents)
– High-level semantic features (via DINOv2)🧵
Reposted by Spyros Gidaris
abursuc.bsky.social
The @valeoai.bsky.social team is presenting a few exciting works @iclr-conf.bsky.social this year on masked generative transformers, adaptation of VLMs, self-supervised representation learning, neural solvers. #iclr2025
Check them out 👇
valeoai.bsky.social
Our recent research will be presented at #ICLR2025 @iclr_conf: VLMs, LLMs, diffusion models, self-supervised learning, physics-informed learning…

Find out more below 🧵

valeoai.github.io/posts/2025-0...
Reposted by Spyros Gidaris
lebellig.bsky.social
Nice research work from @nicolabourbaki.bsky.social et al. Enhances latent generative models by regularizing the VAE's latent space with an equivariance loss. The finetuning process is straightforward + demonstrates improvements in just 5 epochs!

📄 arxiv.org/abs/2502.09509
🐍 github.com/zelaki/eqvae
Reposted by Spyros Gidaris
abursuc.bsky.social
Still mesmerized by this work and its results: a mid-to-end driving agent trained with self-play on just 8 maps for 1.6B km of driving (9500 years of subjective driving experience) smashes all existing benchmarks (nuPlan, CARLA, Waymax) off the shelf 😮
abursuc.bsky.social
Crazily amazing work by @eugenevinitsky.bsky.social @senerozan.bsky.social & team, setting the bar so high for anyone working in autonomous driving these days. Check it out arxiv.org/abs/2502.03349
Reposted by Spyros Gidaris
abursuc.bsky.social
EQ-VAE: Such a simple & cool trick to regularize multiple kinds of autoencoders: align reconstruction of transformed latents w/ the corresponding transformed inputs.
🚀REPA: 4x training speedup
🚀MaskGIT: 2x training speedup
🚀DiT-XL/2: 7x faster convergence

Kudos @nicolabourbaki.bsky.social et al.
nicolabourbaki.bsky.social
1/n🚀If you’re working on generative image modeling, check out our latest work! We introduce EQ-VAE, a simple yet powerful regularization approach that makes latent representations equivariant to spatial transformations, leading to smoother latents and better generative models.👇
Reposted by Spyros Gidaris
eugenevinitsky.bsky.social
The things I've found hardest about research have all been non-technical: maintaining confidence and self-esteem, not abandoning the work when it's too hard or stressful, finding time to learn new things. In comparison, the technical parts are much easier.
Reposted by Spyros Gidaris
davidpicard.bsky.social
🚨 Just a quick note that following requests, we trained a 512px version of our Coherence-Aware Diffusion model (CVPR'24) and updated the paper on arxiv: arxiv.org/abs/2405.20324

It has a package and pretrained models!

🖥️ nicolas-dufour.github.io/cad.html
🤖 github.com/nicolas-dufo...
Reposted by Spyros Gidaris
sta8is.bsky.social
1/n 🚀 Excited to share our latest work: DINO-Foresight, a new framework for predicting the future states of scenes using Vision Foundation Model features!
Links to the arXiv and Github 👇
Reposted by Spyros Gidaris
abursuc.bsky.social
This amazing team ❤️
valeoai.bsky.social
We've just had our annual gathering to get together and brainstorm on new exciting ideas and projects ahead -- stay tuned!
This is also an excellent occasion to fit all team members in a photo 📸
Reposted by Spyros Gidaris
abursuc.bsky.social
Thrilled to announce our workshop on Embodied Intelligence for Autonomous Systems on the Horizon @cvprconference.bsky.social featuring a crazy line-up of speakers and challenges.
Mark it in your agendas and also in your registration #cvpr2025
opendrivelab.com/cvpr2025/wor...