Computational Linguistics @UPF
@colt-upf.bsky.social
690 followers 350 following 18 posts
Gemma Boleda, Marco Baroni, Thomas Brochhagen, Iria de Dios Flores | Computational Linguistics and Linguistic Theory Universitat Pompeu Fabra. upf.edu/web/colt Barcelona
colt-upf.bsky.social
Do you use a pronoun more often when the entity you’re talking about is more predictable?

Previous work offers diverging answers so we conducted a meta-analysis, combining data from 20 studies across 8 different languages.

Now out in Language: muse.jhu.edu/article/969615
Reposted by Computational Linguistics @UPF
traduccioupf.bsky.social
📢 Research seminar organized by COLT-URLING, "LLM and human language: representations, judgments, and historical change".

📆 29/09/2025
🕦 15:30
🎤 Adele Goldberg (Princeton University)
🚩55.410, Tànger building, Poblenou Campus - UPF
ℹ️ ja.cat/wi2t7

@colt-upf.bsky.social
Reposted by Computational Linguistics @UPF
traduccioupf.bsky.social
📢 Research seminar organized by COLT-URLING, "Associative memory in psycholinguistics and in AI architectures".

📆 01/10/2025
🕦 12:00
🎤 Jakub Dotlačil
🚩55.410, Tànger building, Poblenou Campus - UPF
ℹ️ ja.cat/U5xH2

@colt-upf.bsky.social
Reposted by Computational Linguistics @UPF
gboleda.bsky.social
New paper! 🚨 I argue that LLMs represent a synthesis between distributed and symbolic approaches to language, because, when exposed to language, they develop highly symbolic representations and processing mechanisms in addition to distributed ones.
arxiv.org/abs/2502.11856
Sigmoid function. Non-linearities in a neural network allow it to behave in both distributed and near-symbolic fashions.
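A rough illustration of the image description above, not code from the paper: the sketch below assumes a single hypothetical unit (`unit`) whose input is scaled by a hypothetical `gain` parameter before the sigmoid. With a small gain the outputs stay graded, which is one loose way to picture "distributed" behavior; with a large gain the same function saturates and acts almost like a binary switch.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def unit(x: float, gain: float) -> float:
    """Hypothetical single unit: sigmoid applied to a scaled input."""
    return sigmoid(gain * x)


inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]

# Low gain: outputs vary smoothly with the input (graded regime).
graded = [unit(x, gain=1.0) for x in inputs]

# High gain: outputs are pushed toward 0 or 1 (near-binary regime).
near_binary = [unit(x, gain=20.0) for x in inputs]

for x, g, b in zip(inputs, graded, near_binary):
    print(f"x={x:+.1f}  gain=1: {g:.3f}  gain=20: {b:.5f}")
```

The gain values 1.0 and 20.0 are arbitrary choices for the illustration; in a trained network the effective scale would come from learned weights.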
Reposted by Computational Linguistics @UPF
delliott.bsky.social
📢I am hiring a Postdoc to work on post-training methods for low-resource languages. Apply by August 15 employment.ku.dk/faculty/?sho....
Let's talk at #ACL2025NLP in Vienna if you want to know more about the position and life in Denmark.
Postdoc in Natural Language Processing
employment.ku.dk
Reposted by Computational Linguistics @UPF
alexanderhoyle.bsky.social
Evaluating topic models (and document clustering methods) is hard. In fact, since our paper critiquing standard evaluation practices four years ago, there hasn't been a good replacement metric.

That ends today (we hope)! Our new ACL paper introduces an LLM-based evaluation protocol 🧵
Screenshot of first page of paper. It is here: https://arxiv.org/pdf/2507.00828

Abstract: Topic model and document-clustering evaluations either use automated metrics that align poorly with human preferences or require expert labels that are intractable to scale. We design a scalable human evaluation protocol and a corresponding automated approximation that reflect practitioners' real-world usage of models. Annotators -- or an LLM-based proxy -- review text items assigned to a topic or cluster, infer a category for the group, then apply that category to other documents. Using this protocol, we collect extensive crowdworker annotations of outputs from a diverse set of topic models on two datasets. We then use these annotations to validate automated proxies, finding that the best LLM proxies are statistically indistinguishable from a human annotator and can therefore serve as a reasonable substitute in automated evaluations.
colt-upf.bsky.social
🎉New paper "Prediction Hubs are Context-Informed Frequent Tokens in LLMs" from our lab, accepted at ACL 2025!

If you're interested in representational geometry, come find Beatrix Nielsen and Marco Baroni at the poster :)
beatrixmgn.bsky.social
Our paper "Prediction Hubs are Context-Informed Frequent Tokens in LLMs" has been accepted at ACL 2025!

Main points:
1. Hubness is not a problem when language models do next-token prediction.
2. Nuisance hubness can appear when other comparisons are made.
colt-upf.bsky.social
Today at UPF Campus de la Ciutadella at 2:30 pm! Come slightly earlier to check in!

Sala Polivalent 24S18

maps.app.goo.gl/n1hBxiviKcLW...
colt-upf.bsky.social
⭐ Registration open til May 27th! ⭐
Website: www.upf.edu/web/colt/sym...

June 2nd, UPF

𝗦𝗽𝗲𝗮𝗸𝗲𝗿 𝗹𝗶𝗻𝗲𝘂𝗽:
Arianna Bisazza (language acquisition with NNs)
Naomi Saphra (emergence in LLM training dynamics)
Jean-Rémi King (TBD)
Louise McNally (pitfalls of contextual/formal accounts of semantics)
colt-upf.bsky.social
Announcing the COLT Symposium on June 2nd!

𝗘𝗺𝗲𝗿𝗴𝗲𝗻𝘁 𝗳𝗲𝗮𝘁𝘂𝗿𝗲𝘀 𝗼𝗳 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗶𝗻 𝗺𝗶𝗻𝗱𝘀 𝗮𝗻𝗱 𝗺𝗮𝗰𝗵𝗶𝗻𝗲𝘀

What properties of language are emerging from work in experimental and theoretical linguistics, neuroscience & LLM interpretability?

Info: tinyurl.com/colt-site
Register: tinyurl.com/colt-register

🧵1/3
colt-upf.bsky.social
📢 𝗟𝗼𝗰𝗮𝘁𝗶𝗼𝗻 𝗰𝗵𝗮𝗻𝗴𝗲📢

UPF Campus de la Ciutadella
**Sala Polivalent 24.S18**

Thank you for bearing with us!
colt-upf.bsky.social
Last day to sign up for the COLT Symposium!
Register: tinyurl.com/colt-register

📢 𝗟𝗼𝗰𝗮𝘁𝗶𝗼𝗻 𝗰𝗵𝗮𝗻𝗴𝗲📢
June 2nd, 14:30 - 19:00

UPF Campus de la Ciutadella
Room 40.101

maps.app.goo.gl/1216LJRsWmTE...
colt-upf.bsky.social
𝗚𝗲𝘁𝘁𝗶𝗻𝗴 𝘁𝗵𝗲𝗿𝗲:

𝗪𝗵𝗲𝗻: 2nd June 2025, 14:30 - 19:00
𝗪𝗵𝗲𝗿𝗲: UPF Poblenou, Auditori (enter via Roc Boronat building) maps.app.goo.gl/2WMt21hR5L9r...

In-person only, with mandatory registration:
tinyurl.com/colt-register

See you there!

🧵3/3
colt-upf.bsky.social
Our speakers span a wide range of expertise between AI, linguistics, and neuroscience.

14:30 Arianna Bisazza (Uni. Groningen)
15:30 Naomi Saphra (Harvard)

-- coffee break --

17:00 Jean-Rémi King (Meta AI)
18:00 Louise McNally (UPF)

Abstracts: tinyurl.com/colt-site

🧵2/3
colt-upf.bsky.social
Please find us at #ICLR2025! We will present our work on intrinsic dimension as a cue for stages of language processing in LLMs.

Saturday morning, Poster session 5
Hall 3 + Hall 2B #563
iclr.cc/virtual/2025...

Arxiv: arxiv.org/abs/2405.15471
Reposted by Computational Linguistics @UPF
ercbravenewword.bsky.social
📢 Upcoming Seminar

Words are weird? On the role of lexical ambiguity in language
🗣 Gemma Boleda (Universitat Pompeu Fabra, Spain)
Why is language so ambiguous? Discover how ambiguity balances cognitive simplicity and communicative complexity through large-scale studies.
📍 UniMiB, Room U6-01C, Milan
Reposted by Computational Linguistics @UPF
gboleda.bsky.social
This year, CoNLL will be accepting *non-archival* (as well as archival) submissions! www.conll.org #CoNLL2025

Follow CoNLL at
@conll-conf.bsky.social
CoNLL 2025 | CoNLL
www.conll.org
Reposted by Computational Linguistics @UPF
emcheng.bsky.social
Here's our work accepted to #ICLR2025!

We look at how intrinsic dimension evolves over LLM layers, spotting a universal high-dimensional phase.

This ID peak is where:

- linguistic features are built
- different LLMs are most similar,

with implications for task transfer

🧵 1/6
Reposted by Computational Linguistics @UPF
dlbcnai.bsky.social
What is deep learning?

@marionamec.bsky.social of @neurofregides.bsky.social explains it to us on the occasion of the Deep Learning Barcelona Symposium 2024 (@dlbcn.ai), this Thursday, December 19.

#deeplearning #ciencia #català #barcelona

www.youtube.com/shorts/R4u_Z...
What is deep learning? - La Dimoni de Maxwell #deeplearning #ciencia #català #barcelona
YouTube video by Deep Learning Barcelona
www.youtube.com
colt-upf.bsky.social
Conclusion: for communication in-context,

Lexical systems with a soft mapping between referents and names let speakers maximize communication accuracy while minimizing complexity.

Paper: aclanthology.org/2024.emnlp-m...

3/3
colt-upf.bsky.social
We explored, for a color naming task, why a soft mapping between referents and words is a good solution for communication...

...by taking into account
1⃣ in-context communication
2⃣ the hierarchical structure of the lexicon

2/3