Valérie Castin
@vcastin.bsky.social
PhD student in machine learning at Ecole Normale Supérieure, Paris. My webpage: https://vcastin.github.io/
Reposted by Valérie Castin
francois.fleuret.org
I asked "on the other platform" what were the most important improvements to the original 2017 transformer.

That was quite popular and here is a synthesis of the responses:
Reposted by Valérie Castin
pierreablin.bsky.social
Excited to share Soup-of-Experts, a new neural network architecture that, for any given specific task, can instantiate in a flash a small model that is very good on it.

Made with ❤️ at Apple

Thanks to my co-authors David Grangier, Angelos Katharopoulos, and Skyler Seto!

arxiv.org/abs/2502.01804
Reposted by Valérie Castin
gabrielpeyre.bsky.social
A cute result from Valérie’s work is that Gaussian distributions remain closed under evolution by attention layers, allowing one to study an ODE in the (mean, covariance) space. In particular, this enables the analysis of the “clustering of tokens” toward low-rank covariances.
vcastin.bsky.social
How do tokens evolve as they are processed by a deep Transformer?

With José A. Carrillo, @gabrielpeyre.bsky.social and @pierreablin.bsky.social, we tackle this in our new preprint: A Unified Perspective on the Dynamics of Deep Transformers arxiv.org/abs/2501.18322

ML and PDE lovers, check it out!
Reposted by Valérie Castin
gabrielpeyre.bsky.social
The Mathematics of Artificial Intelligence: In this introductory and highly subjective survey, aimed at a general mathematical audience, I showcase some key theoretical concepts underlying recent advancements in machine learning. arxiv.org/abs/2501.10465
Reposted by Valérie Castin
carl-allen.bsky.social
Machine learning has made incredible breakthroughs, but our theoretical understanding lags behind.

We take a step towards unravelling its mystery by explaining why the phenomenon of disentanglement arises in generative latent variable models.

Blog post: carl-allen.github.io/theory/2024/...
Reposted by Valérie Castin
carissaveliz.bsky.social
It's like when Google decided to fund itself through ads, but worse, because chatbots are already much more misleading and anthropomorphic than search engines. #AIEthics www.ft.com/content/9350...
OpenAI explores advertising as it steps up revenue drive
ChatGPT maker hires advertising talent from big tech rivals