Valérie Castin
@vcastin.bsky.social
94 followers
53 following
1 post
PhD student in Machine learning at Ecole Normale Supérieure, Paris
My webpage: https://vcastin.github.io/
Reposted by Valérie Castin
François Fleuret
@francois.fleuret.org
· Apr 28
Reposted by Valérie Castin
Pierre Ablin
@pierreablin.bsky.social
· Jan 24
Theory, Analysis, and Best Practices for Sigmoid Self-Attention
Attention is a key part of the transformer architecture. It is a sequence-to-sequence mapping that transforms each sequence element into a weighted sum of values. The weights are typically obtained as...
arxiv.org