Berfin Simsek
@bsimsek.bsky.social
Deep Learning x {Symmetries, Structures, Randomness} 🦄

Researcher at Flatiron Computational Maths in NYC. PhD from EPFL. https://www.bsimsek.com/
Below this threshold, mild overparameterization (a log k factor) with k index vectors suffices for the learned neurons to match the ideal vectors, by a coupon-collector argument. 🤠
May 3, 2025 at 10:06 PM
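A toy illustration of the coupon-collector scaling mentioned above (an assumption for illustration, not the paper's actual training dynamics): if each neuron independently lands near one of k index vectors uniformly at random, covering all k directions takes about k log k neurons in expectation, which is where the log k overparameterization factor comes from.

```python
import math
import random

def draws_to_collect(k, rng):
    """Number of uniform draws over k 'index vectors' until every one
    has been hit at least once (the classic coupon-collector count)."""
    seen = set()
    n = 0
    while len(seen) < k:
        seen.add(rng.randrange(k))
        n += 1
    return n

k = 50
rng = random.Random(0)
trials = [draws_to_collect(k, rng) for _ in range(200)]
mean = sum(trials) / len(trials)

# Theory: expected draws = k * H_k ~ k * (log k + 0.577...)
harmonic = sum(1.0 / i for i in range(1, k + 1))
print(f"empirical mean: {mean:.1f}, k*H_k: {k * harmonic:.1f}")
```

The empirical mean tracks k·H_k ≈ k log k, so matching k targets needs roughly a log k factor more random neurons than k.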
When the ideal vectors form an equiangular frame, all learned weights converge to their average (no matter how much overparameterization is used): past a certain threshold of the pairwise dot product, that average turns from a saddle into a local minimum. 😲
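For readers unfamiliar with the term: an equiangular frame is a set of unit vectors whose pairwise dot products are all equal. A minimal construction (the simplex frame, chosen here for illustration; the thread does not specify which frame is used) makes the definition concrete:

```python
import numpy as np

def simplex_frame(k):
    """k unit vectors in R^k with equal pairwise dot products -1/(k-1):
    center the standard basis vectors, then normalize the rows."""
    V = np.eye(k) - np.ones((k, k)) / k
    return V / np.linalg.norm(V, axis=1, keepdims=True)

k = 5
F = simplex_frame(k)
G = F @ F.T  # Gram matrix of the frame
off_diagonal = G[~np.eye(k, dtype=bool)]
print(np.allclose(off_diagonal, -1.0 / (k - 1)))  # True: equiangular
```

Note that for this particular frame the average of the vectors is exactly zero, so "all weights converge to their average" is a strong collapse: every learned neuron ends up at the same point regardless of width.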
Searching for an exact inverse map from learned weights back to the ideal (concept) vectors is an intricate geometry question, even for "simple" idealized models. 🧐