Suzan Verberne
@suzanv.bsky.social
350 followers 100 following 12 posts
1980 | Professor of Natural Language Processing, @LIACS @UniLeiden | Lives in #Nijmegen | Mother of 2 | 👩🏻‍💻👩🏻‍🏫🤹‍♀️🌱 🎼
Reposted by Suzan Verberne
myr55.bsky.social
Since not everyone has a subscription to de Volkskrant, here is a screenshot of this spot-on column.
Reposted by Suzan Verberne
joachimbaumann.bsky.social
🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.

Paper: arxiv.org/pdf/2509.08825
We present our new preprint titled "Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation".
We quantify LLM hacking risk through systematic replication of 37 diverse computational social science annotation tasks.
For these tasks, we use a combined set of 2,361 realistic hypotheses that researchers might test using these annotations.
Then, we collect 13 million LLM annotations across plausible LLM configurations.
These annotations feed into 1.4 million regressions testing the hypotheses. 
For a hypothesis with no true effect (ground truth p > 0.05), different LLM configurations yield conflicting conclusions.
Checkmarks indicate correct statistical conclusions matching ground truth; crosses indicate LLM hacking -- incorrect conclusions due to annotation errors.
Across all experiments, LLM hacking occurs in 31-50% of cases even with highly capable models.
Since minor configuration changes can flip scientific conclusions, from correct to incorrect, LLM hacking can be exploited to present anything as statistically significant.
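
A toy illustration of the mechanism described in this post, not the preprint's method (which runs regressions over real LLM annotations). It assumes two hypothetical annotator configurations, made-up counts and false-positive rates, and a simple two-proportion z-test. The point is only that annotation errors distributed unevenly across groups can turn a null result into a "significant" one.

```python
import math

# Hypothetical ground truth: both groups have the same positive rate (no true effect).
N_PER_GROUP = 500
TRUE_POSITIVES = {"group_a": 100, "group_b": 100}

# Hypothetical annotator configurations: false-positive rate per group.
# config_b makes more errors on group_b than on group_a.
CONFIGS = {
    "config_a": {"group_a": 0.05, "group_b": 0.05},
    "config_b": {"group_a": 0.02, "group_b": 0.12},
}


def two_proportion_p_value(pos_a, n_a, pos_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (pos_a + pos_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (pos_a / n_a - pos_b / n_b) / se
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def annotated_positives(true_pos, n, false_positive_rate):
    """Expected number of items labelled positive, given a false-positive rate."""
    return round(true_pos + (n - true_pos) * false_positive_rate)


def significant_by_config(configs, alpha=0.05):
    """Map each configuration name to whether it reports a significant group difference."""
    results = {}
    for name, fp in configs.items():
        pos_a = annotated_positives(TRUE_POSITIVES["group_a"], N_PER_GROUP, fp["group_a"])
        pos_b = annotated_positives(TRUE_POSITIVES["group_b"], N_PER_GROUP, fp["group_b"])
        p_value = two_proportion_p_value(pos_a, N_PER_GROUP, pos_b, N_PER_GROUP)
        results[name] = p_value < alpha
    return results


if __name__ == "__main__":
    print(significant_by_config(CONFIGS))
```

Under these assumed numbers, config_a agrees with the ground truth (no significant difference), while config_b reports a significant difference that exists only in its annotation errors.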
Reposted by Suzan Verberne
myrthereuver.bsky.social
Last week, I defended my dissertation "𝘈 𝘗𝘶𝘻𝘻𝘭𝘦 𝘰𝘧 𝘗𝘦𝘳𝘴𝘱𝘦𝘤𝘵𝘪𝘷𝘦𝘴: 𝘐𝘯𝘵𝘦𝘳𝘥𝘪𝘴𝘤𝘪𝘱𝘭𝘪𝘯𝘢𝘳𝘺 𝘓𝘢𝘯𝘨𝘶𝘢𝘨𝘦 𝘛𝘦𝘤𝘩𝘯𝘰𝘭𝘰𝘨𝘺 𝘧𝘰𝘳 𝘙𝘦𝘴𝘱𝘰𝘯𝘴𝘪𝘣𝘭𝘦 𝘕𝘦𝘸𝘴 𝘙𝘦𝘤𝘰𝘮𝘮𝘦𝘯𝘥𝘢𝘵𝘪𝘰𝘯" at the Vrije Universiteit Amsterdam. *the* moment: #PhDone! 🎓✨🎉

I couldn’t have asked for better supervisors than Antske Fokkens & @suzanv.bsky.social 💖
Reposted by Suzan Verberne
ecir2026.eu
The organization of #ECIR2026 has started! We just had our first call with all track chairs. With the calls now finalized, online and distributed across mailing lists, we’re moving on to the rest of the conference preparation!

@ecir2026.eu
📍 Delft, 30 Mar – 2 Apr 2026
👉 ecir2026.eu
Reposted by Suzan Verberne
bmitra.bsky.social
Announcing the call for papers for #ECIR2026 IR-for-Good Track: ecir2026.eu/calls/call-f...

Abstracts due: Oct 21
Papers due: Oct 28

This year, we are revamping the #IR4Good track as a core track at the conference. A short thread on the changes we are introducing this year... 🧵
1/
suzanv.bsky.social
I was interviewed for the Dutch newspaper NRC. The article (about Grok) came out 2 weeks ago. Yesterday (back from vacation) I found the paper copy.

Kudos to Toon Beemsterboer who asked critical questions and gave me space to explain the steps of LLM-chatbot training

www.nrc.nl/nieuws/2025/...
Reposted by Suzan Verberne
ecir2026.eu
📢 The website for ECIR 2026 is live – and so is our first call for papers!

🔗 ecir2026.eu
📍 Delft, The Netherlands
🗓️ March 30 – April 1, 2026 (main conference)
🎓 Tutorials: March 29
🛠 Workshops: April 2
ECIR2026
The 48th European Conference on Information Retrieval (ECIR 2026) is Europe's premier forum for cutting-edge research in
ecir2026.eu
Reposted by Suzan Verberne
metaror.bsky.social
Today's MetaROR paper discusses "the use of document data networks to control the topic clustering of a science map" with "rigorous methodology" & "careful presentation".

Reviewers consider it "a valuable contribution to the literature", and provide the authors detailed feedback.

👇 Read on MetaROR
Use of diverse data sources to control which topics emerge in a science map
metaror.org
Reposted by Suzan Verberne
acmsigir-ap.bsky.social
🎉 Exciting news for #SIGIRAP2025! This year, we're continuing our tradition of offering both face-to-face and remote participation options. Remote presentations are fully supported! Join us to participate from anywhere! 📡 #HybridConference
Reposted by Suzan Verberne
lisawesterveld.bsky.social
Political priorities in pictures.

On the left, the room where the debate on disability policy was to take place.
Photo of an empty committee room in the Tweede Kamer: a gray room with a semicircle of tables. One seat in a corner on the left is occupied; on the other side of the table stand Thibault and Thijs. The photo was taken from some distance so that the emptiness is clearly visible.
Screenshot of the NOS website: Yesilgoz, Van der Plas and Van Vroonhoven being interviewed by various journalists with colored microphones. Below it the headline: asylum portfolio divided among three ministers from VVD, NSC and BBB.
Reposted by Suzan Verberne
djoerd.idf.social.ap.brid.gy
#dir2025, the 22nd Dutch-Belgian Information Retrieval workshop, will take place at Radboud University Nijmegen on 27 October 2025!

https://informagus.nl/dir2025/
DIR 2025
informagus.nl
suzanv.bsky.social
If, due to exceptional circumstances, you are not able to travel to Padua for #SIGIR2025 to present your paper, please contact us at [email protected] by May 30th at the latest to discuss possible solutions.

sigir2025.dei.unipd.it/inpresence-p...
Reposted by Suzan Verberne
marieke.bsky.social
You may have missed this news, because it appeared only as a short item in the NOS live blog: a ship carrying civilians who wanted to bring aid to Gaza was attacked by drones in international waters. Greta Thunberg was reportedly also due to sail with it.
www.theguardian.com/world/2025/m...
Gaza humanitarian aid ship ‘bombed by drones’ in waters off Malta
Freedom Flotilla Coalition claims Israel to blame for attack on unarmed civilian vessel in international waters
www.theguardian.com
suzanv.bsky.social
Of course.
But I hadn’t understood your message was about that concern. It seemed to be about genAI not being impressive.
suzanv.bsky.social
In Retrieval-Augmented Generation (RAG) the sources are retrieved before generation. The parametric knowledge is fixed, but external sources are retrieved to base the response on. This helps the user verify the information. See en.wikipedia.org/wiki/Retriev... Example:
suzanv.bsky.social
In retrieval-augmented generation, the LLM relies on a retrieval engine to provide sources based on which the answer is generated. These sources come from an index external to the LLM.

GPT-4o uses RAG for quite a few request types and shows the sources in the output.
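
The retrieve-then-generate flow described in these two posts can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how GPT-4o or any particular system implements it: the tiny `INDEX`, the word-overlap scoring, and the `stub_llm` placeholder are all hypothetical stand-ins for a real search index, a real ranker, and a real model call.

```python
import re

# Hypothetical external index; a real system would use a search engine or vector store.
INDEX = {
    "ecir2026": "ECIR 2026 takes place in Delft, The Netherlands.",
    "dir2025": "DIR 2025 takes place at Radboud University Nijmegen on 27 October 2025.",
    "sigir2025": "SIGIR 2025 takes place in Padua.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, index, k=2):
    """Return up to k (doc_id, text) pairs ranked by word overlap with the query."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(text)), doc_id) for doc_id, text in index.items()]
    ranked = sorted((pair for pair in scored if pair[0] > 0), key=lambda p: (-p[0], p[1]))
    return [(doc_id, index[doc_id]) for _, doc_id in ranked[:k]]


def build_prompt(query, sources):
    """Put the retrieved sources in front of the question so the answer can cite them."""
    lines = ["Answer the question using only the numbered sources."]
    for i, (doc_id, text) in enumerate(sources, start=1):
        lines.append(f"[{i}] ({doc_id}) {text}")
    lines.append(f"Question: {query}")
    return "\n".join(lines)


def stub_llm(prompt):
    """Placeholder for a real model call; the model's parameters stay fixed."""
    return "(model output would go here)"


def rag_answer(query, index, llm, k=2):
    """Retrieve first, then generate; return the answer together with its sources."""
    sources = retrieve(query, index, k)
    answer = llm(build_prompt(query, sources))
    return {"answer": answer, "sources": [doc_id for doc_id, _ in sources]}


if __name__ == "__main__":
    print(rag_answer("Where does ECIR 2026 take place?", INDEX, stub_llm))
```

Returning the source identifiers alongside the answer is what lets the user check the response against the retrieved documents.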
Reposted by Suzan Verberne
arxiv-cs-cl.bsky.social
Ali Satvaty, Suzan Verberne, Fatih Turkmen
Undesirable Memorization in Large Language Models: A Survey
https://arxiv.org/abs/2410.02650
Reposted by Suzan Verberne
arxiv-cs-cl.bsky.social
I-Fan Lin, Faegheh Hasibi, Suzan Verberne
Generate then Refine: Data Augmentation for Zero-shot Intent Detection
https://arxiv.org/abs/2410.01953
Reposted by Suzan Verberne
arxiv-cs-ir.bsky.social
Yumeng Wang, Xiuying Chen, Suzan Verberne
QUIDS: Query Intent Generation via Dual Space Modeling
https://arxiv.org/abs/2410.12400
Reposted by Suzan Verberne
arxiv-cs-cl.bsky.social
Amin Abolghasemi, Leif Azzopardi, Seyyed Hadi Hashemi, Maarten de Rijke, Suzan Verberne
Evaluation of Attribution Bias in Retrieval-Augmented Large Language Models
https://arxiv.org/abs/2410.12380
Reposted by Suzan Verberne
arxiv-cs-cl.bsky.social
I-Fan Lin, Faegheh Hasibi, Suzan Verberne
SPILL: Domain-Adaptive Intent Clustering based on Selection and Pooling with Large Language Models
https://arxiv.org/abs/2503.15351
Reposted by Suzan Verberne
arxiv-cs-ir.bsky.social
Jujia Zhao, Wenjie Wang, Chen Xu, Xiuying Wang, Zhaochun Ren, Suzan Verberne
Unifying Search and Recommendation: A Generative Paradigm Inspired by Information Theory
https://arxiv.org/abs/2504.06714