Jelle Zuidema 🟥
@wzuidema.bsky.social
Associate Professor of Natural Language Processing & Explainable AI, University of Amsterdam, ILLC
Pinned
wzuidema.bsky.social
Perhaps of interest to some: we made our own series of videos on interpretability. They are visually not quite as spectacular, but they add some more depth and some important context. E.g., Sandro Pezzelle and Michael Hanna discuss *how* you can find key 'circuits'

m.youtube.com/watch?v=Jfuk...

2/n
Transformer Interpretability 5: What is a circuit, and how does it explain LLM behavior?
YouTube video by ILLC Science
m.youtube.com
Reposted by Jelle Zuidema 🟥
profsimonfisher.bsky.social
Twenty-four years ago today, our paper “A forkhead-domain gene is mutated in a severe speech and language disorder” was published: www.nature.com/articles/350....
A personal thread about the ups & downs of the journey we took to get to that point....1/n
🗣️🧬🧪
Image shows the first two printed pages of the paper “A forkhead-domain gene is mutated in a severe speech and language disorder” by Cecilia Lai and colleagues, published in Nature in 2001 (volume 413, pages 519-523). The abstract reads as follows:
Individuals affected with developmental disorders of speech and language have substantial difficulty acquiring expressive and/or receptive language in the absence of any profound sensory or neurological impairment and despite adequate intelligence and opportunity. Although studies of twins consistently indicate that a significant genetic component is involved, most families segregating speech and language deficits show complex patterns of inheritance, and a gene that predisposes individuals to such disorders has not been identified. We have studied a unique three-generation pedigree, KE, in which a severe speech and language disorder is transmitted as an autosomal-dominant monogenic trait. Our previous work mapped the locus responsible, SPCH1, to a 5.6-cM interval of region 7q31 on chromosome 7. We also identified an unrelated individual, CS, in whom speech and language impairment is associated with a chromosomal translocation involving the SPCH1 interval. Here we show that the gene FOXP2, which encodes a putative transcription factor containing a polyglutamine tract and a forkhead DNA-binding domain, is directly disrupted by the translocation breakpoint in CS. In addition, we identify a point mutation in affected members of the KE family that alters an invariant amino-acid residue in the forkhead domain. Our findings suggest that FOXP2 is involved in the developmental process that culminates in speech and language.
Reposted by Jelle Zuidema 🟥
stefanfrank.bsky.social
Announcing the first (and perhaps only) Multilingual Minds and Machines Meeting! Come join us in Nijmegen, June 22-23, 2026, if you are interested in computational models of human multilingualism: mmmm2026.github.io
wzuidema.bsky.social
I look forward to listening to it! Critical reflection on journalism by journalists is much appreciated.
wzuidema.bsky.social
Honest question: why do you say he has no mandate? What would a mandate be? Given how difficult it is to change an existing system (and yet how important), I'm wondering if there ever is a practical route to transitioning to PR, if a supermajority like Labour's is not enough.
wzuidema.bsky.social
Let's seek shelter under those trees over there! 😱
liefhebberv.bsky.social
Photographer Debbie Parker captured this "lightning" in West Virginia.
wzuidema.bsky.social
This is embarrassing reporting, where AI-is-almost-sentient is the "nuanced view", and @mustafasuleymanai.bsky.social the radical.

One of @robert-booth.bsky.social's sources: Ufair "a small, undeniably fringe organisation, led by three humans and seven AIs with names such as Aether & Buzz."😱
Reposted by Jelle Zuidema 🟥
togelius.bsky.social
New blog post: AI Allergy.

On my increasing disgust with the AI discourse, even though I still like the technical and philosophical. And how I wish I could be excited about AI again.

togelius.blogspot.com/2025/08/ai-a...
AI Allergy
I remember being excited about AI. I remember 20 years ago, being excited about neuroevolutionary methods for learning adaptive behaviors in...
togelius.blogspot.com
Reposted by Jelle Zuidema 🟥
mdhk.net
Had such a great time presenting our tutorial on Interpretability Techniques for Speech Models at #Interspeech2025! 🔍

For anyone looking for an introduction to the topic, we've now uploaded all materials to the website: interpretingdl.github.io/speech-inter...
wzuidema.bsky.social
I, for one, was happy that you skip existential-threat concerns and focus instead on real dangers that current human populations face.

* The remaining issue is of course whether your concerns will really translate into real actions/guardrails at MS, even if they'd run against big economic interests

* Thx! 2/2
wzuidema.bsky.social
Since you ask for comments:
* It's a really good essay. I share your concerns, and mostly agree with your analysis that building SCAI will be possible soon.
* I don't know why you expect that many will see it as "ungrounded, more science fiction than reality" or "unnecessarily alarmist". (1/2)
wzuidema.bsky.social
This was an interesting exercise (not published yet), unearthing many methodological problems when you want to use an LLM to explain something about *novel* human behaviour, while acknowledging that it has been trained on gigantic datasets of human behaviour.

www.arxiv.org/abs/2407.02136
Black Big Boxes: Do Language Models Hide a Theory of Adjective Order?
In English and other languages, multiple adjectives in a complex noun phrase show intricate ordering patterns that have been a target of much linguistic theory. These patterns offer an opportunity to ...
www.arxiv.org
wzuidema.bsky.social
I don't know exactly what you have in mind w/ stress testing, but perhaps this ACL'25 tutorial is an interesting resource:
acl2025-eyetracking-and-nlp.github.io

I've been racking my brain over what LLMs really tell us about cognitive/linguistic theories (beyond the lazy "nothing, because of size")
ACL 2025 Tutorial: Eye Tracking and NLP
ACL 2025 Tutorial on Eye Tracking and NLP
acl2025-eyetracking-and-nlp.github.io
wzuidema.bsky.social
Congrats, Verna, Dieuwke, Elia and Mathijs!
Reposted by Jelle Zuidema 🟥
q.pheevr.ca
The heartbreaking thing about this
is that there’s already a proven way
to invest lots of money in a knowledge machine
that produces unforeseeable results
that include fantastically profitable ideas
(and some life-saving ones)
and generally benefit society
and this machine is called
a university
warrenterra.bsky.social
From the replies (bsky.app/profile/dasb...) here's Sam Altman doing what the quoted post described. He seems serious about it.
wzuidema.bsky.social
Thought-provoking work from my Amsterdam/ILLC colleagues
pettertornberg.com
We built the simplest possible social media platform. No algorithms. No ads. Just LLM agents posting and following.

It still became a polarization machine.

Then we tried six interventions to fix social media.

The results were… not what we expected.

arxiv.org/abs/2508.03385
Can We Fix Social Media? Testing Prosocial Interventions using Generative Social Simulation
Social media platforms have been widely linked to societal harms, including rising polarization and the erosion of constructive debate. Can these problems be mitigated through prosocial interventions?...
arxiv.org
wzuidema.bsky.social
Bakenessegracht? Very special
wzuidema.bsky.social
Human individual judgements correlate even more strongly with the difference between a model's scores, but that says nothing about a model's abilities *in the wild*! This is contra Hu et al. '24 (www.pnas.org/doi/10.1073/...), & most importantly, provides a fresh dataset for use in this debate.
wzuidema.bsky.social
My favourite image from the paper, illustrating that LLMs are surprisingly weak at judging grammaticality. Human judgment correlates quite strongly with the *difference* in likelihood (or SLOR) that LLMs assign to pairs of grammatical & ungrammatical sentences, but that's the wrong measure.
wzuidema.bsky.social
I'll be in Vienna only from tomorrow, but today my star PhD student Marianne is already presenting some of our work:

BLiMP-NL, in which we create a large new dataset for syntactic evaluation of Dutch LLMs, and learn a lot about dataset creation, LLM evaluation and grammatical abilities on the way.
mdhk.net
Next week I’ll be in Vienna for my first *ACL conference! 🇦🇹✨

I will present our new BLiMP-NL dataset for evaluating language models on Dutch syntactic minimal pairs and human acceptability judgments ⬇️

🗓️ Tuesday, July 29th, 16:00-17:30, Hall X4 / X5 (Austria Center Vienna)
The BLiMP-NL dataset consists of 84 Dutch minimal pair paradigms covering 22 syntactic phenomena, and comes with graded human acceptability ratings & self-paced reading times. 

An example minimal pair:
A. Ik bekijk de foto van mezelf in de kamer (I watch the photograph of myself in the room; grammatical)
B. Wij bekijken de foto van mezelf in de kamer (We watch the photograph of myself in the room; ungrammatical)

Differences in human acceptability ratings between sentences correlate with differences in model syntactic log-odds ratio scores.
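
For readers unfamiliar with the measure mentioned above, here is a minimal Python sketch of how a syntactic log-odds ratio (SLOR) difference for a minimal pair could be computed and correlated with human rating differences. It uses the common definition of SLOR (sentence log-probability under the model minus its unigram log-probability, divided by sentence length). The function names and the per-token log-probability inputs are assumptions for illustration, not the BLiMP-NL authors' code.

```python
import math
from typing import Sequence


def slor(model_logprobs: Sequence[float], unigram_logprobs: Sequence[float]) -> float:
    """SLOR = (log p_model(s) - log p_unigram(s)) / |s|, from per-token log-probs."""
    if not model_logprobs or len(model_logprobs) != len(unigram_logprobs):
        raise ValueError("need one model and one unigram log-prob per token")
    return (sum(model_logprobs) - sum(unigram_logprobs)) / len(model_logprobs)


def slor_difference(grammatical, ungrammatical) -> float:
    """SLOR of the grammatical sentence minus SLOR of the ungrammatical one.

    Each argument is a (model_logprobs, unigram_logprobs) tuple (hypothetical format).
    """
    return slor(*grammatical) - slor(*ungrammatical)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Plain Pearson correlation between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

Given one SLOR difference and one human rating difference per minimal pair, `pearson(model_diffs, human_diffs)` gives the kind of correlation the image describes. How the per-token log-probabilities are obtained from a particular language model is left open here.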
wzuidema.bsky.social
Here's a 2017 review paper by Monica Tamariz on all kinds of experiments with humans learning to agree on the meaning of signals. Much of it in the context of origins of language; some experiments use gestures to try to avoid biases from existing language.

www.annualreviews.org/content/jour...
Experimental Studies on the Cultural Evolution of Language | Annual Reviews
Why are languages the way they are? The biases and constraints that explain why languages display the traits they do—instead of other possible ones—include human cognition, social dynamics, communicat...
www.annualreviews.org
wzuidema.bsky.social
I don't get that from the text (the text reads like a clumsy compromise, but one I can live with); I'm only saying that it could be an outcome of the investigation, if that investigation is carried out with proper expertise.