bowphs.bsky.social
@bowphs.bsky.social
September 14, 2025 at 9:13 AM
Read the full paper here: arxiv.org/pdf/2506.01629

Reach out if you have any questions or if you are attending ACL and want to say hi. 🙋
arxiv.org
June 7, 2025 at 10:12 AM
This phenomenon has a visible effect on text generation: in BLOOM-560m, activating "earthquake" neurons derived from Spanish data at checkpoint 10,000 produces Spanish text. At checkpoint 400,000, the same intervention yields English text!
June 7, 2025 at 10:12 AM
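The intervention described above can be sketched with a toy model: clamp a chosen set of hidden neurons to a large value during the forward pass and observe that the output changes. Everything here (the tiny two-layer MLP, the neuron indices, the boost value) is illustrative only; it is not BLOOM-560m or the paper's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer MLP standing in for one transformer FFN block.
W1 = rng.normal(size=(8, 16))   # input dim 8 -> hidden dim 16
W2 = rng.normal(size=(16, 8))   # hidden dim 16 -> output dim 8

def forward(x, boost_neurons=None, boost_value=10.0):
    """Forward pass; optionally clamp selected hidden neurons to a large
    value, mimicking the 'activate concept neurons, then generate' idea."""
    h = np.maximum(x @ W1, 0.0)              # ReLU hidden activations
    if boost_neurons is not None:
        h[..., boost_neurons] = boost_value  # overwrite chosen neurons
    return h @ W2

x = rng.normal(size=(1, 8))
baseline = forward(x)
steered = forward(x, boost_neurons=[3, 7])

# The intervention shifts the model's output.
print(np.allclose(baseline, steered))  # expected: False
```

In a real model the same effect is usually obtained with forward hooks on a specific layer rather than by editing the forward function.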
This is not a bug; it's a feature! These layers are repurposing the space to form cross-lingual abstractions.
We track this by examining how specific concepts (like "earthquake" or "joy") align across languages.
June 7, 2025 at 10:12 AM
We ask a probing classifier: "Given this hidden state from layer l, what is the language of the source text?" The results are striking: earlier checkpoints consistently solve this with high accuracy across layers. Later checkpoints, however, exhibit clear performance drops.
June 7, 2025 at 10:12 AM
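A minimal sketch of the probing setup described above, using synthetic stand-ins for layer-l hidden states and a logistic-regression probe (a common choice for probing classifiers; the paper's exact probe and data are not reproduced here). Early in training, representations of different languages are assumed to occupy separable regions, so the probe recovers the language easily:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for layer-l hidden states of two languages.
# Early checkpoints: the two languages sit in clearly separable
# regions of representation space, so the probe succeeds.
d = 32
es = rng.normal(loc=+1.0, size=(200, d))   # "Spanish" hidden states
en = rng.normal(loc=-1.0, size=(200, d))   # "English" hidden states
X = np.vstack([es, en])
y = np.array([0] * 200 + [1] * 200)        # language label per state

probe = LogisticRegression(max_iter=1000).fit(X, y)
acc = probe.score(X, y)
print(f"probe accuracy: {acc:.2f}")        # near 1.00 on separable states
```

Re-running this with overlapping clusters (e.g. both means at 0) would drive accuracy toward chance, which is the signature the thread reports for later checkpoints.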
Read the full paper: aclanthology.org/2025.finding...

Work by Creston Brooks, Johannes Haubold, Charlie Cowen-Breen, Jay White, Desmond DeVaul, me, Karthik Narasimhan, and Barbara Graziosi
An Annotated Dataset of Errors in Premodern Greek and Baselines for Detecting Them
Creston Brooks, Johannes Haubold, Charlie Cowen-Breen, Jay White, Desmond DeVaul, Frederick Riemenschneider, Karthik R Narasimhan, Barbara Graziosi. Findings of the Association for Computational Lingu...
aclanthology.org
May 1, 2025 at 11:29 AM
Our work brings new computational methods to a field traditionally dominated by manual scholarship, potentially accelerating the discovery of textual errors that have remained hidden for centuries.
May 1, 2025 at 11:29 AM
Perhaps most surprising: even powerful models like GPT-4 performed barely above random chance on this specialized task! This highlights the limitations of general-purpose LLMs when dealing with ancient text restoration.
May 1, 2025 at 11:29 AM
We tested several error-detection methods and found that our discriminator-based approach outperforms all others. Interestingly, scribal errors (the oldest type) are consistently harder to detect than print or digitization errors, across ALL methods.
May 1, 2025 at 11:29 AM
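To make the idea of a learned error discriminator concrete, here is a deliberately simplified sketch: a character n-gram classifier trained to separate clean strings from synthetically corrupted ones. The word list, the corruption scheme, and the feature choice are all invented for illustration; this is not the paper's discriminator or its evaluation protocol.

```python
import random
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

random.seed(0)

# Hypothetical transliterated word list, invented for this sketch.
words = ["logos", "anthropos", "polis", "sophia", "kosmos", "psyche"] * 30

def corrupt(w):
    """Single-character substitution: a crude stand-in for a
    scribal/print/digitization error."""
    i = random.randrange(len(w))
    return w[:i] + random.choice("qxzj") + w[i + 1:]

clean = words
noisy = [corrupt(w) for w in words]
X = clean + noisy
y = [0] * len(clean) + [1] * len(noisy)   # 1 = contains an error

# Discriminator: character 1-2-gram features + logistic regression.
clf = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(X, y)
acc = clf.score(X, y)
print(f"training accuracy: {acc:.2f}")
```

Real scribal errors are much harder than this toy corruption precisely because, as the thread notes, they tend to look linguistically plausible rather than introducing out-of-distribution characters.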
Prior work has only evaluated error detection on artificially generated errors. Our dataset contains REAL errors that naturally accumulated over centuries: subtle mistakes that survived precisely because they often look perfectly reasonable.
May 1, 2025 at 11:29 AM
Creating this dataset was painstaking! Our domain expert spent over 100 hours reviewing potential errors, categorizing each as a scribal error (from manuscript copying), a print error (from producing editions), or a digitization error (from converting print to digital text).
May 1, 2025 at 11:29 AM
In "An Annotated Dataset of Errors in Premodern Greek and Baselines for Detecting Them," we introduce the first expert-labeled dataset of real errors in ancient texts, enabling proper evaluation of error detection methods on authentic textual problems.
May 1, 2025 at 11:29 AM