Excited to share our ICML Oral paper on learning dynamics in linear RNNs!
with @clementinedomine.bsky.social @mpshanahan.bsky.social and Pedro Mediano
openreview.net/forum?id=KGO...
arxiv.org/abs/2409.14623
A thread on how relative weight initialization shapes learning dynamics in deep networks. 🧵 (1/9)
We study how task abstractions emerge in gated linear networks and how they support cognitive flexibility.