Myrthe Reuver
@myrthereuver.bsky.social
PhD #NLProc from CLTL, Vrije Universiteit Amsterdam || Interests: Computational Argumentation, Responsible AI, interdisciplinarity, cats || I express my own views
Pinned
myrthereuver.bsky.social
Last week, I defended my dissertation "𝘈 𝘗𝘶𝘻𝘻𝘭𝘦 𝘰𝘧 𝘗𝘦𝘳𝘴𝘱𝘦𝘤𝘵𝘪𝘷𝘦𝘴: 𝘐𝘯𝘵𝘦𝘳𝘥𝘪𝘴𝘤𝘪𝘱𝘭𝘪𝘯𝘢𝘳𝘺 𝘓𝘢𝘯𝘨𝘶𝘢𝘨𝘦 𝘛𝘦𝘤𝘩𝘯𝘰𝘭𝘰𝘨𝘺 𝘧𝘰𝘳 𝘙𝘦𝘴𝘱𝘰𝘯𝘴𝘪𝘣𝘭𝘦 𝘕𝘦𝘸𝘴 𝘙𝘦𝘤𝘰𝘮𝘮𝘦𝘯𝘥𝘢𝘵𝘪𝘰𝘯" at the Vrije Universiteit Amsterdam. *the* moment: #PhDone! 🎓✨🎉

I couldn’t have asked for better supervisors than Antske Fokkens & @suzanv.bsky.social 💖
Reposted by Myrthe Reuver
eugenevinitsky.bsky.social
For folks considering grad school in ML, my advice is to explore programs that mix ML with a domain interest. ML programs are wildly oversubscribed while a lot of the fun right now is in figuring out what you can do with it
Reposted by Myrthe Reuver
djoerd.idf.social.ap.brid.gy
So, what *is* the @ecir2026.eu Information Retrieval for Good track? by Maria Heuss and Bhaskar Mitra:

https://bhaskar-mitra.github.io/posts/2025/09/01/what-is-ir-for-good/
myrthereuver.bsky.social
Super important paper, and what a nice interdisciplinary group of co-authors!!! 😁
Reposted by Myrthe Reuver
joachimbaumann.bsky.social
🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.

Paper: arxiv.org/pdf/2509.08825
We present our new preprint titled "Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation".
We quantify LLM hacking risk through systematic replication of 37 diverse computational social science annotation tasks.
For these tasks, we use a combined set of 2,361 realistic hypotheses that researchers might test using these annotations.
Then, we collect 13 million LLM annotations across plausible LLM configurations.
These annotations feed into 1.4 million regressions testing the hypotheses. 
For a hypothesis with no true effect (ground truth p > 0.05), different LLM configurations yield conflicting conclusions.
Checkmarks indicate correct statistical conclusions matching ground truth; crosses indicate LLM hacking -- incorrect conclusions due to annotation errors.
Across all experiments, LLM hacking occurs in 31-50% of cases even with highly capable models.
Since minor configuration changes can flip scientific conclusions, from correct to incorrect, LLM hacking can be exploited to present anything as statistically significant.
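A toy illustration of the effect described above, not the paper's method: the preprint runs regressions over real LLM annotations, while this sketch swaps in a two-proportion z-test over simulated labels. The configuration names, flip rates, sample size, and base rate are all hypothetical. Two groups share the same true label rate (no true effect), and each simulated "LLM configuration" flips labels at its own group-specific error rate.

```python
import math
import random


def two_proportion_p_value(pos_a, n_a, pos_b, n_b):
    """Two-sided p-value of a two-proportion z-test (normal approximation)."""
    pooled = (pos_a + pos_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (pos_a / n_a - pos_b / n_b) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(configs, n_per_group=500, base_rate=0.3, seed=0):
    """Return one p-value per hypothetical annotation configuration."""
    rng = random.Random(seed)
    # Ground truth: both groups share the same base rate, so there is no true effect.
    truth = {
        g: [rng.random() < base_rate for _ in range(n_per_group)]
        for g in ("a", "b")
    }
    results = {}
    for name, flip_rates in configs.items():
        positives = {}
        for g in ("a", "b"):
            # Each configuration flips true labels at a group-specific error rate.
            noisy = [(not y) if rng.random() < flip_rates[g] else y for y in truth[g]]
            positives[g] = sum(noisy)
        results[name] = two_proportion_p_value(
            positives["a"], n_per_group, positives["b"], n_per_group
        )
    return results


# Hypothetical configurations: equal error rates vs. error rates that differ by group.
configs = {
    "config_1": {"a": 0.05, "b": 0.05},
    "config_2": {"a": 0.05, "b": 0.20},
    "config_3": {"a": 0.20, "b": 0.05},
}
for name, p in simulate(configs).items():
    print(name, round(p, 4), "significant" if p < 0.05 else "not significant")
```

Comparing the printed verdicts across configurations shows how annotation errors that differ between groups can change a statistical conclusion even though the underlying data contains no effect.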
myrthereuver.bsky.social
Curious about my PhD research?
▶️ Watch a 10-min talk + my defense: lnkd.in/ej_MWDtt
📘 Read the dissertation: lnkd.in/efBW97WB
📰 Or read the short news article: lnkd.in/eizZg5VN
myrthereuver.bsky.social
Amazing co-authors broadened my perspective and made me a better scientist. Thank you so much for that! 🙏

Also to my doctoral committee: @damiantrilling.net, Annette Hautli-Janisz, Reshmi G Pillai, @Khalid Al Khatib & Antal van den Bosch: thank you for your thoughtful (and fun!) questions.
myrthereuver.bsky.social
And huge thanks to my incredible paranymphs @urjakh.bsky.social and Selene Baez Santamaria 👯‍♀️. From Zoom rooms to the stage, our journey has been full of growth, laughter, and mutual support. ❤️

In fact, all PhDs from @cltl.bsky.social were a great community of support. 💖
myrthereuver.bsky.social
It's the final countdown 🎶🎤 (I am re-reading my dissertation for my defense next week), and actually I realized I had some fun findings hidden in some papers that I myself forgot about! 😂 I don’t know if that’s a good or bad sign for my defense.. 😂
myrthereuver.bsky.social
But then working as a (university) researcher also comes with a lot of downsides, including insecurity and pressure in random “which grant or paper wins” arenas which I do not vibe well with.

But what then? What do?
myrthereuver.bsky.social
Btw I’m serious about this career change comment.

I’m having a sort of post-PhD career reflection where I realize that these kinds of things don’t spark joy for me but seem to be a big part of being an AI dev in industry.
myrthereuver.bsky.social
I mean, I have heard people say they enjoy the puzzling aspect and feeling accomplished when they fix it.

Personally, for me that never weighs up against the annoyance and what feels like endless wasted time.
myrthereuver.bsky.social
Also, I realize some people really love the “puzzle” aspect but I don’t like these kinds of puzzles. It makes me stressed and annoyed. Maybe I should find another field to work in. 😛
myrthereuver.bsky.social
I also really hate it when people who do not work in NLP/LLMs then say “oh no but with conda and a requirements.txt it’s easy, right?”, not realizing the morass of ever-new models and architectures I live in.
myrthereuver.bsky.social
Realization: I really, really, really hate the part of my job that is managing conda environments and going through a deep deep cave of issue reports trying to find why something randomly doesn’t work.
Reposted by Myrthe Reuver
astrokatie.com
Chatbots — LLMs — do not know facts and are not designed to be able to accurately answer factual questions. They are designed to find and mimic patterns of words, probabilistically. When they’re “right” it’s because correct things are often written down, so those patterns are frequent. That’s all.
Reposted by Myrthe Reuver
gabriellalapesa.bsky.social
Deadline approaching! Workshop on Computational Linguistics for the Political and Social Sciences #KONVENS2025: archival long and short papers (ACL Anthology) & non-archival abstracts and PhD project descriptions (get feedback from a great community!). Deadline: June 13th.
gabriellalapesa.bsky.social
Remember the deadline for the CPSS (Computational Linguistics for the Political and Social Sciences) #KONVENS2025 is approaching! And also: besides archival submissions which will appear in the ACL anthology, we also have ...
gabriellalapesa.bsky.social
Interested in the application of NLP to research questions from the Political and Social Sciences? This is the workshop for you! CPSS at #KONVENS2025 cpss-sig.github.io/CPSS-2025/cf... . Archival papers in the ACL anthology, deadline: June 13th!
Reposted by Myrthe Reuver
bramvanroy.bsky.social
Yay, so happy to host CLIN in Leuven this year! It'll take place on September 12th. Abstract submission deadline on June 13th!
clin35-2025.bsky.social
📅 Don't forget! The deadline for submitting your abstract to the #CLIN conference in Leuven is coming: 13th of June! Submitting is easy: name, title of your work, 500-word abstract, done! #nlp #nlproc #compling #llm #ai #dutch clin35.ccl.kuleuven.be
CLIN35
Computational Linguistics in The Netherlands (CLIN) is a yearly conference on computational linguistics. Each year the conference is organized by a different institution in the Dutch-speaking region. ...
clin35.ccl.kuleuven.be
myrthereuver.bsky.social
My love language is sending my academic friends the papers/datasets/posts on social media that I know align with their research interest. 💖
Reposted by Myrthe Reuver
gesistraining.bsky.social
Unlock the power of large language models for your research!
Join this #GESISworkshop with Julia Romberg, @vigneshwaran-s.bsky.social, and @mmmaurer.bsky.social to explore adapters — an efficient alternative to fine-tuning your models.

🔗 Book now ➡️ t1p.de/adapters-lig...

@gesis.org
GESIS Workshop
Adapters: Lightweight Machine Learning for Social Science Research
02 to 04 June 2025 | Hybrid (Cologne | Online)
Julia Romberg, Vigneshwaran Shankaran, Maximilian Maurer (all GESIS)
myrthereuver.bsky.social
While I am not at #NAACL, I gave a talk about this paper (and more work in my dissertation) last Friday at @annarogers.bsky.social ’s lab, very nice discussion there! 😃

Paper: lnkd.in/eBBSi6_p
Code: lnkd.in/ezwRGpjP
Slides: lnkd.in/erPP5fpV

Want to know more? Message me!
aclanthology.org
myrthereuver.bsky.social
💡We find that:
- Experts use 𝗱𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝘁 𝘀𝘁𝗿𝗮𝘁𝗲𝗴𝗶𝗲𝘀 to assess the LLM;
- Surprisingly, 𝗹𝗼𝗻𝗴𝗲𝗿 𝗮𝗻𝗱 𝗺𝗼𝗿𝗲 𝗻𝘂𝗮𝗻𝗰𝗲𝗱 𝗱𝗲𝗳𝗶𝗻𝗶𝘁𝗶𝗼𝗻𝘀 𝗼𝗳 𝘀𝗲𝘅𝗶𝘀𝗺 developed via LLM-human collaboration;
- Some experts improve zero-shot performance with their improved definition.

#NLProc #CSS #computationalsocialscience
myrthereuver.bsky.social
Our study consisted of four components:

1) a survey of sexism researchers;
2) an interactive experiment in which experts assess the LLM;
3) an interactive experiment in which experts co-create sexism definitions with the LLM;
4) using these definitions in zero-shot detection with LLMs on five sexism datasets: 👩‍🔬 + 🤖
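The zero-shot step in component 4 could look roughly like the sketch below. It is only an illustration: the prompt wording, the function names `build_prompt` and `parse_label`, and the yes/no answer format are assumptions, not the setup used in the paper, and no actual LLM call is made.

```python
from typing import Optional


def build_prompt(definition: str, text: str) -> str:
    """Combine a sexism definition and a text into a zero-shot prompt (hypothetical wording)."""
    return (
        "Definition of sexism:\n"
        f"{definition.strip()}\n\n"
        "Based only on this definition, is the following text sexist? "
        "Answer with 'yes' or 'no'.\n\n"
        f"Text: {text.strip()}\nAnswer:"
    )


def parse_label(reply: str) -> Optional[bool]:
    """Map a free-text model reply to True (sexist), False (not sexist), or None if unclear."""
    words = reply.strip().lower().split()
    if not words:
        return None
    # Only the first word is inspected, with surrounding punctuation removed.
    first = words[0].strip(".,!:;\"'")
    if first == "yes":
        return True
    if first == "no":
        return False
    return None


# Placeholder inputs; a real run would send the prompt to an LLM of choice.
prompt = build_prompt("<expert-written definition>", "<text to classify>")
print(prompt)
print(parse_label("Yes, this matches the definition."))
```

Swapping in a different definition while keeping the rest of the prompt fixed is what makes it possible to compare definitions on the same dataset.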
myrthereuver.bsky.social
This work was the outcome of my Junior Research Visit grant at @gesis.org last year, and is the final chapter of my dissertation! 🤩

Our method allowed us to measure connections between experts, sexism definition, dataset, & classification performance in zero-shot sexism classification. 🔍🔬
myrthereuver.bsky.social
Expert + LLM = Better Sexism Detection? ✨

Paper:
𝘛𝘦𝘭𝘭 𝘔𝘦 𝘞𝘩𝘢𝘵 𝘠𝘰𝘶 𝘒𝘯𝘰𝘸 𝘈𝘣𝘰𝘶𝘵 𝘚𝘦𝘹𝘪𝘴𝘮: 𝘌𝘹𝘱𝘦𝘳𝘵-𝘓𝘓𝘔 𝘐𝘯𝘵𝘦𝘳𝘢𝘤𝘵𝘪𝘰𝘯 𝘚𝘵𝘳𝘢𝘵𝘦𝘨𝘪𝘦𝘴 𝘢𝘯𝘥 𝘊𝘰-𝘊𝘳𝘦𝘢𝘵𝘦𝘥 𝘋𝘦𝘧𝘪𝘯𝘪𝘵𝘪𝘰𝘯𝘴 𝘧𝘰𝘳 𝘡𝘦𝘳𝘰-𝘚𝘩𝘰𝘵 𝘚𝘦𝘹𝘪𝘴𝘮 𝘋𝘦𝘵𝘦𝘤𝘵𝘪𝘰𝘯

w: @indiiigo.bsky.social, @matteo-mls.bsky.social & @gabriellalapesa.bsky.social

@ Findings #NAACL2025 !🤩
A visual description of how our expert survey led to two interactive experiments and finally to definitions that were used in zero-shot sexism detection.