Benjamin Lefaudeux 🇺🇦
bentheegg.bsky.social
Back to France after some time in sunny California and happy Copenhagen. Mistral, Photoroom, Meta (xformers, FairScale, R&D), EyeTribe (acq.). Mostly writing about AI
Reposted by Benjamin Lefaudeux 🇺🇦
Limits of vector search

a new GDM paper shows that embeddings can’t represent combinations of concepts well

e.g. Dave likes blue trucks AND Ford trucks

even k=2 sub-predicates make SOTA embedding models fall apart

www.alphaxiv.org/pdf/2508.21038
On the Theoretical Limitations of Embedding-Based Retrieval | alphaXiv
View recent discussion. Abstract: Vector embeddings have been tasked with an ever-increasing set of retrieval tasks over the years, with a nascent rise in using them for reasoning, instruction-followi...
www.alphaxiv.org
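A toy illustration of the failure mode (my own sketch, not the paper's construction): scoring a conjunction with a single query vector collapses the AND into one dot product, so a document that is extreme on one predicate can outrank one that satisfies both.

```python
import numpy as np

# Two attribute directions in a toy embedding space.
blue = np.array([1.0, 0.0])
ford = np.array([0.0, 1.0])

docs = {
    "very blue, not Ford": 3.0 * blue,               # [3, 0]
    "blue Ford truck":     1.0 * blue + 1.0 * ford,  # [1, 1]
}

# The natural single-vector encoding of "blue AND Ford".
query = blue + ford

scores = {name: float(v @ query) for name, v in docs.items()}
# [3, 0] @ [1, 1] = 3.0 beats [1, 1] @ [1, 1] = 2.0:
# the document satisfying both sub-predicates loses.
```

The paper's point is that this is not fixable by a better encoder: for enough documents and combinations, no fixed-dimension embedding can realize all the top-k sets at once.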
August 31, 2025 at 11:07 AM
Reposted by Benjamin Lefaudeux 🇺🇦
Longcat-Flash-Chat (560B)

uh, holy shit this one is intriguing. bare minimum they compare themselves to all the (actual) top models and do okay

but inside.. damn this one has some cool ideas

huggingface.co/meituan-long...
August 31, 2025 at 11:20 AM
Reposted by Benjamin Lefaudeux 🇺🇦
In 2012 when I had to clean data it seemed natural to look for rules I could use to clean it.

Now it seems natural to model the noise, find new clean data it can destroy, and then train a model to reverse the process.

Machine learning makes you a sicko.
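The joke lands because that is literally the modern recipe. A minimal sketch (all choices illustrative): corrupt clean data with a known noise process, then fit a model to reverse the corruption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean signal we want to recover.
clean = np.sin(np.linspace(0, 2 * np.pi, 200))

# Model the noise: destroy the clean data with a known process.
noisy = clean + rng.normal(scale=0.3, size=clean.shape)

# Train a model to reverse the process: a ridge-regularized linear map
# from a window of noisy samples to the clean center sample.
window = 9
half = window // 2
X = np.stack([noisy[i - half:i + half + 1]
              for i in range(half, len(noisy) - half)])
y = clean[half:len(clean) - half]
w = np.linalg.solve(X.T @ X + 1e-2 * np.eye(window), X.T @ y)

denoised = X @ w  # lower error than the noisy input it started from
```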
July 27, 2025 at 11:16 AM
Reposted by Benjamin Lefaudeux 🇺🇦
Three things to note about this:

1) AI has obvious utility to many, this is a tremendous amount of use already
2) There is room for multiple frontier model providers, at least for now
3) Any losses from subsidizing cost of AI use (and it is not clear this is happening) are now relatively small
July 26, 2025 at 7:33 PM
"The Serial Scaling Hypothesis" (arxiv.org/abs/2507.125..., Liu et al.) is interesting, I think. Not as new as it looks (autoregressive models are used serially, models have depth, ...), but it feels like a good formalization and intuition for where current GPT-based LLMs will typically fail
July 26, 2025 at 9:58 PM
Reposted by Benjamin Lefaudeux 🇺🇦
1/ Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, DINOv2 on various benchmarks setting a new standard for open-source research.
July 21, 2025 at 2:47 PM
In the coming age of agents, I think vibe coding will die out, with the same lasting power as prompt engineering. For things LLMs excel at, you might as well stick to higher-level directives and let them own the work; Claude Code is a good example. 1/2
July 18, 2025 at 9:52 AM
Reposted by Benjamin Lefaudeux 🇺🇦
this is probably why Meta was able to poach OpenAI ppl

aside from the absolute piles of cash, Sama is very SV-minded and can’t imagine building apart from a product

a lot of accelerationists see things differently, more broadly, and it's dissatisfying to be forced into a product box
explaining why they open sourced — to ensure that it’s broadly useful

OpenAI self-admits that they optimize their models for ChatGPT, o3 was made for DeepResearch

Moonshot was dissatisfied with that
July 13, 2025 at 10:27 PM
Still not a lot of ML talk on bsky (at least in my feed), hence paper Sunday: my two most interesting recent reads
- H Nets arxiv.org/abs/2507.07955
- Energy Based Transformers arxiv.org/abs/2507.02092
Dynamic Chunking for End-to-End Hierarchical Sequence Modeling
Despite incredible progress in language models (LMs) in recent years, largely resulting from moving away from specialized models designed for specific tasks to general models based on powerful archite...
arxiv.org
July 13, 2025 at 6:14 AM
Little bit of personal news, shared in other circles already: I'm moving to Mistral in August, after three years at Photoroom. I'm really proud of what we built in the ML team with relatively limited means, lasting SOTA on the existing foundations (saliency segmentation) while growing a lot on genAI
July 12, 2025 at 8:01 AM
Reposted by Benjamin Lefaudeux 🇺🇦
𝗗𝗲𝗽𝘁𝗵 𝗔𝗻𝘆𝘁𝗵𝗶𝗻𝗴 𝗮𝘁 𝗔𝗻𝘆 𝗖𝗼𝗻𝗱𝗶𝘁𝗶𝗼𝗻
Boyuan Sun, Modi Jin, Bowen Yin, Qibin Hou
arxiv.org/abs/2507.01634
Trending on www.scholar-inbox.com
July 7, 2025 at 6:00 AM
Reposted by Benjamin Lefaudeux 🇺🇦
kyutai open sources its TTS model as well as Unmute, a framework for building audio AI apps

notable:
- high accuracy
- actually streaming (can use streaming text input)
- serves 32 simultaneous users on a single GPU
- voice cloning
- supports all 24 official EU languages

kyutai.org/next/tts
A text-to-speech optimized for real-time usage.
kyutai.org
July 7, 2025 at 11:07 AM
Alex Nichol is one of the rare many-hits researchers in the field, with, on top of that, a track record of practical models that ship and reach the public. That Meta wouldn't target him is pretty rich
Kinda offended that meta didn't try to recruit me 😂
June 29, 2025 at 8:12 PM
Automatically generating a fused megakernel in Triton... diving in, but if it works half as well as it reads, it would already be quite something. Aligns with torch.compile, of course

github.com/mirage-proje...
GitHub - mirage-project/mirage: Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA - mirage-project/mirage
github.com
June 24, 2025 at 7:49 PM
Sharing that Photoroom open sourced _Dataroom_, as promised some time ago.

Accompanying blog post and mini thread
github.com/photoroom/da...
www.photoroom.com/inside-photo...

1/N
Photoroom Visual Ads Automation & GenerateBanners Acquisition
Photoroom launches Visual Ads Automation, a GenAI API turning product catalogs into branded ad creatives; GenerateBanners acquisition adds text automation.
www.photoroom.com
June 23, 2025 at 10:00 AM
Still haven't tried Cursor, but I recently moved from Github Copilot to Continue with Codestral (free API), and it's absurd how much better Continue with Codestral is (vs. Copilot with expensive and slow models).

Made me realize that there is zero moat in this field, at least for Copilot.
June 19, 2025 at 8:14 AM
Reposted by Benjamin Lefaudeux 🇺🇦
In the last 2 weeks:

- Slack locked down its messages data.
- X locked down its post data.
- Anthropic cut off OpenAI's Windsurf.
- Google will stop using Scale.

The dream of unfettered MCP interconnects is a mirage.

www.dbreunig.com/2025/06/16/d...
The Drawbridges Go Up
The AI era is speedrunning the Web 2.0 story. Open and accessible MCPs are not our future. Integrations will be tightly governed.
www.dbreunig.com
June 16, 2025 at 5:38 PM
Reposted by Benjamin Lefaudeux 🇺🇦
While framed as a critique of Apple’s recent paper, I found this article mostly interesting because it made me think about reasoning in general: mikecaulfield.substack.com/p/the-apple-...
The Apple "Reasoning Collapse" Paper Is Even Dumber Than You Think
We're this far into reasoners and neither hypesters nor skeptics really understand their significance. Also: Read Toulmin.
mikecaulfield.substack.com
June 14, 2025 at 7:01 PM
Self-adapting language models: still early, but fascinating prospects. There's a dimensionality curse, of course: the number of dimensions the LLM can touch per generated token is very small, so it needs a massive lever / dimension reduction to be able to self-improve.

arxiv.org/pdf/2506.10943
June 14, 2025 at 8:30 AM
Great write-up of AMD's new offerings, catching the Nvidia train on the software side it seems. 3x speedup on MI300X since release: it was required, but still great to grab

morethanmoore.substack.com/p/amds-ai-fu...
AMD's AI Future is Rack Scale 'Helios'
Key Announcements from AMD Advancing AI 2025
morethanmoore.substack.com
June 12, 2025 at 7:53 PM
Reposted by Benjamin Lefaudeux 🇺🇦
Got nerdsniped into printing this a little while ago.
June 6, 2025 at 4:40 AM
datago now available with webdataset compatibility (streaming tarballs, so you get the data as it arrives). Just pip install datago and give it a whirl if you'd like. Speed without the dataloader processes, and typical ViT/DiT pre-processing baked in.
example code here github.com/Photoroom/da...
datago/python/benchmark_webdataset.py at main · Photoroom/datago
A Rust-based data loader which can be used from Python. Processing data per sample at GB/s speeds, covering various use cases eventually. - Photoroom/datago
github.com
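For context on what "streaming tarballs" means here, a stdlib-only sketch of the webdataset layout (this is the on-disk format, not the datago API): files sharing a basename inside a tar form one sample, and sequential tar reading means samples come out as the bytes arrive.

```python
import io
import tarfile

def iter_webdataset(fileobj):
    """Yield (key, {extension: bytes}) samples from a webdataset-style tar.

    Files sharing a basename (000001.jpg + 000001.json) form one sample;
    mode "r|*" reads the archive strictly sequentially, which is what
    makes streaming work: no seeking, samples emitted as data arrives.
    """
    current_key, sample = None, {}
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            key, _, ext = member.name.partition(".")
            if key != current_key and sample:
                yield current_key, sample
                sample = {}
            current_key = key
            sample[ext] = tar.extractfile(member).read()
    if sample:
        yield current_key, sample

# Build a two-sample archive in memory and read it back as a stream.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name, data in [("000000.txt", b"hello"), ("000000.cls", b"3"),
                       ("000001.txt", b"world")]:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
buf.seek(0)
samples = dict(iter_webdataset(buf))
```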
June 4, 2025 at 8:37 PM
Great link with bsky.app/profile/dbre...
Some interesting work suggesting that recent shocking RL+LLM results are due to incorrect baselines and most of the gains are from better format following: safe-lip-9a8.notion.site/Incorrect-Ba...
Incorrect Baseline Evaluations Call into Question Recent LLM-RL Claims | Notion
Authors*: Nikhil Chandak, Shashwat Goel, Ameya Prabhu
safe-lip-9a8.notion.site
May 31, 2025 at 5:31 AM
The SageAttention3 paper reads great, and it looks like B200s just got a good value boost. QAT- and PTQ-free use of FP4; I expected this to be much more complicated or come later, to be honest. It's only at the attention level, and LLMs are most often MLP-bottlenecked, but still
arxiv.org/abs/2505.11594
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training
The efficiency of attention is important due to its quadratic time complexity. We enhance the efficiency of attention through two key contributions: First, we leverage the new FP4 Tensor Cores in Blac...
arxiv.org
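For intuition on the "microscaling" part, a toy numpy sketch (my own illustration, not the SageAttention3 kernel): each small block of values shares one scale factor so its max magnitude lands on the largest FP4 (E2M1) code, 6.0, and each value is rounded to that tiny grid.

```python
import numpy as np

# Non-negative magnitudes representable in FP4 E2M1 (plus a sign bit).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_mx_fp4(x, block=16):
    """Fake-quantize x with per-block microscaling FP4 (quantize + dequantize)."""
    x = x.reshape(-1, block)
    # One shared scale per block: map the block max onto the top code.
    scale = np.abs(x).max(axis=1, keepdims=True) / FP4_GRID[-1]
    scale = np.where(scale == 0, 1.0, scale)  # avoid division by zero
    scaled = x / scale
    # Round each magnitude to the nearest representable FP4 value.
    idx = np.abs(np.abs(scaled)[..., None] - FP4_GRID).argmin(axis=-1)
    dequant = np.sign(scaled) * FP4_GRID[idx] * scale
    return dequant.reshape(-1)

x = np.random.default_rng(0).normal(size=64).astype(np.float32)
x_hat = quantize_mx_fp4(x)
# Per-block scales keep the error bounded despite only 4-bit codes.
```

The point of the per-block scale is that outliers only hurt their own block of 16 values, instead of blowing up the dynamic range of the whole tensor.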
May 29, 2025 at 9:58 PM
Reposted by Benjamin Lefaudeux 🇺🇦
Big Marigold update!
Last year, we showed how to turn Stable Diffusion 2 into a SOTA depth estimator with a few synthetic samples and 2–3 days on just 1 GPU.
Today's release features:
🏎️ 1-step inference
🔢 New modalities
🫣 High resolution
🧨 Diffusers support
🕹️ New demos
🧶👇
May 15, 2025 at 4:23 PM