David Picard
@davidpicard.bsky.social
Professor of Computer Vision/Machine Learning at Imagine/LIGM, École nationale des Ponts et Chaussées @ecoledesponts.bsky.social Music & overall happiness 🌳🪻 Born well below 350ppm 📍Paris 🔗 https://davidpicard.github.io/
Pinned
davidpicard.bsky.social
🚨Updated: "How far can we go with ImageNet for Text-to-Image generation?"

TL;DR: train a text2image model from scratch on ImageNet only and beat SDXL.

Paper, code, data available! Reproducible science FTW!
🧵👇

📜 arxiv.org/abs/2502.21318
💻 github.com/lucasdegeorg...
💽 huggingface.co/arijitghosh/...
Reposted by David Picard
vickykalogeiton.bsky.social
Very proud of our recent work, kudos to the team! Read @davidpicard.bsky.social’s excellent post for more details or the paper arxiv.org/pdf/2502.21318
davidpicard.bsky.social
And of course, all the authors who worked really hard to produce these valuable contributions:
@lucasdegeorge.bsky.social
@arrijitghosh.bsky.social
@nicolasdufour.bsky.social
@vickykalogeiton.bsky.social
davidpicard.bsky.social
Final note: I'm (we're) tempted to organize a challenge on that topic as a workshop at a CV conf. ImageNet is the only source of images allowed and then you compete to get the bold numbers.

Do you think there would be people in for that? Do you think it would make for a nice competition?
davidpicard.bsky.social
The paper has a gigantic amount of supplementary material (owing to its reviewing curse): arxiv.org/abs/2502.21318

It's now 29 pages. All you ever need to know to train your own T2I model and then fine-tune/LoRA it to whatever you need.
We show you don't need to start from SDXL or Flux: you can be much more frugal.
davidpicard.bsky.social
We release everything:
The training code: github.com/lucasdegeorg...
The data (captions, cutmix, all): huggingface.co/arijitghosh/...
And even some models (eventually all, once it's user-friendly).

You're 500hrs away from training your T2I model from scratch! Can you wrap your head around that?🤯
GitHub - lucasdegeorge/T2I-ImageNet: Code for "How far can we go with ImageNet for Text-to-Image generation?" paper
davidpicard.bsky.social
But there's more: that checkpoint has all you can expect from a good pretrained model.

We take the checkpoint, upscale it to 1k² and fine-tune it on Laion-POP (400k imgs) for high aesthetics targets.

I would have never bet that you could get those images with ImageNet pretraining + a bit of FT.
davidpicard.bsky.social
We train models at 256² resolution and then finetune at 512² to get competitive results on composition benchmarks.

This shows that a rather small model (400M) trained on few but curated data has good understanding and generative capabilities.

Contrary to popular belief: scale is not required!
davidpicard.bsky.social
To enable training T2I on ImageNet, we:
- augment the entire dataset with rich, detailed captions (TA)
- remove the object-centric bias with CutMix augmentations (IA)

Using both augmentations is sufficient to successfully train a model producing the images in the teaser (first post), using only ImageNet😲
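For readers who haven't met CutMix before, here is a minimal sketch of the generic idea: paste a rectangular crop of one image onto another so the result is no longer a single centered object. This is not the paper's exact recipe. The function name `cutmix`, the `patch_frac` parameter and the uniform patch placement are assumptions made for illustration, and how captions get combined for mixed images is not covered here (see the released code and data for that).

```python
import numpy as np


def cutmix(background, foreground, patch_frac=0.5, rng=None):
    """Paste a random rectangular crop of `foreground` onto a copy of `background`.

    Both images are H x W x C arrays with identical shapes. `patch_frac` is the
    patch side length relative to the image side (hypothetical parameter).
    Returns the mixed image and the patch box as (top, left, height, width).
    """
    if background.shape != foreground.shape:
        raise ValueError("images must have the same shape")
    if rng is None:
        rng = np.random.default_rng()

    h, w = background.shape[:2]
    ph = max(1, int(h * patch_frac))
    pw = max(1, int(w * patch_frac))

    # Pick the top-left corner so the patch stays fully inside the image.
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))

    mixed = background.copy()  # leave the input image untouched
    mixed[top:top + ph, left:left + pw] = foreground[top:top + ph, left:left + pw]
    return mixed, (top, left, ph, pw)
```

Called as `mixed, box = cutmix(img_a, img_b, patch_frac=0.5)`, it returns the mixed image plus the box that was replaced, which is handy if you want to record which region came from which source image.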
davidpicard.bsky.social
Main motivation:
- T2I research is impossible to reproduce because massive datasets are kept secret and open datasets (LAION) are decaying
- Methods are impossible to compare because of the lack of common data

We proposed a reproducible setup: train on ImageNet! It's affordable and ticks all boxes
davidpicard.bsky.social
You know, we're in the middle of switching HR software¹ and I wouldn't be surprised if some people don't get paid because of migration problems. Digital tools are always a disaster.

¹We're going to use RenoiRH. Yes, you read that right. And nobody batted an eye at the name during development.
Reposted by David Picard
nholzschuch.bsky.social
To paraphrase Douglas Adams, the US had always considered it was vastly superior to the EU because it had generated billionaires like Sam Altman, Elon Musk or Marc Andreessen, and the EU had not.
And the EU considered it was vastly superior to the US for exactly the same reason.
davidpicard.bsky.social
Which in itself doesn't seem too hard to find.
davidpicard.bsky.social
Yes, same here! Honestly, I wasn't thrilled at first, but compared with MGEN it's night and day. I don't know how long it will last, but it's much appreciated.
davidpicard.bsky.social
Give or take 3 euros, that will be the maximum anyway.
Reposted by David Picard
cnrsinformatics.bsky.social
#Distinction 🏆| Charlotte Pelletier, awarded an #IUF chair, develops artificial intelligence methods applied to satellite image time series.
➡️ www.ins2i.cnrs.fr/fr/cnrsinfo/...
🤝 @irisa-lab.bsky.social @cnrs-bretagneloire.bsky.social
davidpicard.bsky.social
It's autumn 🍂🍁🍄🪾
Reposted by David Picard
poischiche.bsky.social
A hundred years ago, between late September and early October 1925, fascist barbarism methodically descended on Florence, killing, burning and beating in order to "purify", through its anti-Masonic "holy violence", Italy of part of its living Enlightenment heritage.
agone.org/les-crimes-f...
Les crimes fascistes (Florence, 3 octobre 1925)
From the Great War to the Spanish Civil War, Camillo Berneri (1897–1937) fought fascism with both pen and action. This Italian anarchist intellectual and activist is one of the first to ...
agone.org
davidpicard.bsky.social
High dimensional wolves or low dimensional wolves? That changes quite a lot what I think.
davidpicard.bsky.social
That's the best way of catching pneumonia. Not sure I would call that fun. 🥶
davidpicard.bsky.social
If you were to make the most potent and addictive drug ever, how would you know? Because if you test it on yourself...😅
Reposted by David Picard
archambault-avocat.fr
Imagine the government decided to require La Poste to open all the mail sent by French people, to check that it contains no child sexual abuse material.

Yet that is what 🇪🇺 is potentially about to do, while saying "don't worry, we'll seal it back up right after".
Reposted by David Picard
t-martyniuk.bsky.social
Another great event for @valeoai.bsky.social team: a PhD defense of Corentin Sautier.

His thesis «Learning Actionable LiDAR Representations w/o Annotations» covers the papers BEVContrast (learning self-sup LiDAR features), SLidR, ScaLR (distillation), UNIT and Alpine (solving tasks w/o labels).