Lukas Galke
lukasgalke.bsky.social
Assistant Professor @SDU tracing connectionist mechanisms.

https://lgalke.github.io
What can we conclude? Humans and deep nets are not so different after all when learning a new language. The simplicity bias of overparameterized models seems to guide them toward learning compositional structure, even though they could easily memorize all the different combinations.
December 30, 2024 at 6:34 PM
When analyzing the learning trajectories of RNNs throughout training, we make several other interesting observations: medium-structured languages have a learnability advantage early in training (likely because the same word is used for multiple meanings) but fall behind highly structured languages later.
December 30, 2024 at 6:34 PM
We find a similar effect when looking at memorization errors. In the memorization test, the task for in-context LLMs boils down to copying a word that is present earlier in the prompt. But even here, we can see an advantage of language structure.
December 30, 2024 at 6:34 PM
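A minimal sketch of what such a memorization probe could look like, assuming a simple "meaning: word" prompt layout; the function names, prompt format, and exact-match scoring here are illustrative assumptions, not the paper's exact protocol.

```python
# Hypothetical memorization probe: list form-meaning pairs of an artificial
# language in the prompt, then query a meaning that already appeared earlier.
# A model that memorizes correctly only needs to copy the earlier word.

def build_memorization_prompt(lexicon: dict[str, str], query_meaning: str) -> str:
    """Lay out the known (meaning -> word) pairs, then query one seen meaning."""
    lines = [f"{meaning}: {word}" for meaning, word in lexicon.items()]
    lines.append(f"{query_meaning}:")  # the model should copy the earlier word
    return "\n".join(lines)

def score_memorization(response: str, lexicon: dict[str, str], query_meaning: str) -> bool:
    """Exact-match check: did the model reproduce the word seen in context?"""
    return response.strip() == lexicon[query_meaning]

# Toy artificial language (illustrative):
lexicon = {"blue circle": "tupa", "red circle": "kipa", "blue square": "tumo"}
prompt = build_memorization_prompt(lexicon, "red circle")
print(prompt)
print(score_memorization("kipa", lexicon, "red circle"))  # True for a correct copy
```

In this framing, "language structure" would show up as systematic sharing of word parts across related meanings (e.g. "tu-" for blue things), which is what could make copying easier even in a pure memorization test.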
All these learning systems -- small RNNs, pre-trained LLMs, and humans -- show *very* similar memorization and generalization behavior, with more structured languages leading to generalizations that are more systematic and closer to those of the human participants.
December 30, 2024 at 6:34 PM
Investigating the relationship between language learning and language structure, we find striking similarities between humans and language models: both small recurrent neural networks trained from scratch and large pre-trained language models learning in context.
December 30, 2024 at 6:34 PM
Thanks!
November 19, 2024 at 10:55 PM