bowphs.bsky.social
@bowphs.bsky.social
Looking at Bruegel's Tower of Babel in Vienna makes you wonder: How can multilingual language models overcome language barriers? Find out tomorrow!
📍 Level 1 (ironic, right?), Room 1.15-1
🕐 2 PM
#ACL2025NLP
July 27, 2025 at 9:11 PM
This phenomenon has a visible effect on text generation: In BLOOM-560m, activating 'earthquake' neurons derived from Spanish data at checkpoint 10,000 generates Spanish text. At checkpoint 400,000, the same method yields English text!
June 7, 2025 at 10:12 AM
This is not a bug, it's a feature! These layers are repurposing the space to form cross-lingual abstractions.
We track this by examining how specific concepts (like "earthquake" or "joy") align across languages.
June 7, 2025 at 10:12 AM
How and when do multilingual LMs achieve cross-lingual generalization during pre-training? And why do later, supposedly more advanced checkpoints lose some language identification abilities in the process? Our #ACL2025 paper investigates.
June 7, 2025 at 10:12 AM