Lightnews — Scholar-powered news

Paul Röttger @ ACL

@paul-rottger.bsky.social

380 followers 250 following 40 posts

Postdoc @milanlp.bsky.social working on LLM safety and societal impacts. Previously PhD @oii.ox.ac.uk and CTO / co-founder of Rewire (acquired '23) https://paulrottger.com/



Pinned

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 13

Are LLMs biased when they write about political issues?

We just released IssueBench – the largest, most realistic benchmark of its kind – to answer this question more robustly than ever before.

Long 🧵with spicy results 👇

4 28 82

Reposted by Paul Röttger @ ACL

Manuel Tonneau @manueltonneau.bsky.social · Jul 31

🏆 Thrilled to share that our HateDay paper has received an Outstanding Paper Award at #ACL2025

Big thanks to my wonderful co-authors: @deeliu97.bsky.social, Niyati, @computermacgyver.bsky.social, Sam, Victor, and @paul-rottger.bsky.social!

Thread 👇and data avail at huggingface.co/datasets/man...

2 7 29

Paul Röttger @ ACL @paul-rottger.bsky.social · Jul 28

Let me know if I missed anything in the timetables, and please say hi if you want to chat about sociotechnical alignment, safety, the societal impact of AI, or related topics :) Here is a link to the timetable sheet 👇 See you around!

docs.google.com/spreadsheets...

[ACL 2025] Timetable - Paul Röttger


2 5

Paul Röttger @ ACL @paul-rottger.bsky.social · Jul 28

Finally, I will be with @carolin-holtermann.bsky.social and @a-lauscher.bsky.social to present our work on evaluating geotemporal reasoning ability in LLMs. This will be in the Wednesday 1100 poster session:

aclanthology.org/2025.acl-lon...

Around the World in 24 Hours: Probing LLM Knowledge of Time and Place

Carolin Holtermann, Paul Röttger, Anne Lauscher. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2025.


1 4 6

Paul Röttger @ ACL @paul-rottger.bsky.social · Jul 28

I will also be at @tiancheng.bsky.social's oral *today at 1430* in the SRW. Tiancheng will present a non-archival sneak peek of our work on benchmarking the ability of LLMs to simulate group-level human behaviours:

bsky.app/profile/tian...

Tiancheng Hu @tiancheng.bsky.social · Jul 26

SimBench: Benchmarking the Ability of Large Language Models to Simulate Human Behaviors, SRW Oral, Monday, July 28, 14:00-15:30

1 3 4

Paul Röttger @ ACL @paul-rottger.bsky.social · Jul 28

Otherwise, you can find me in the audience of the great @manueltonneau.bsky.social oral *today at 1410*. Manuel will present our work on a first global representative dataset of hate speech on Twitter:

bsky.app/profile/manu...

Manuel Tonneau @manueltonneau.bsky.social · Nov 26

Can we detect #hatespeech at scale on social media?

To answer this, we introduce 🤬HateDay🗓️, a global hate speech dataset representative of a day on Twitter.

The answer: not really! Detection perf is low and overestimated by traditional eval methods

arxiv.org/abs/2411.15462
🧵

1 2 4

Paul Röttger @ ACL @paul-rottger.bsky.social · Jul 28

Finally, there's a couple of papers on *LLM persuasion* on the schedule today. Particularly looking forward to Jillian Fisher's talk on biased LLMs influencing political decision-making!

1 2 2

Paul Röttger @ ACL @paul-rottger.bsky.social · Jul 28

*pluralism* in human values & preferences (e.g. with personalisation) will also just grow more important for a global diversity of users.

@morlikow.bsky.social is presenting our poster today at 1100. Also hyped for @michaelryan207.bsky.social's work and @verenarieser.bsky.social's keynote!

1 3 5

Paul Röttger @ ACL @paul-rottger.bsky.social · Jul 28

Measuring *social and political biases* in LLMs is more important than ever, now that >500 million people use LLMs.

I am particularly excited to check out work on this by @kldivergence.bsky.social @1e0sun.bsky.social @jacyanthis.bsky.social @anjaliruban.bsky.social

1 2 4

Paul Röttger @ ACL @paul-rottger.bsky.social · Jul 28

Very excited about all these papers on sociotechnical alignment & the societal impacts of AI at #ACL2025.

As is now tradition, I made some timetables to help me find my way around. Sharing here in case others find them useful too :) 🧵

1 6 26

Reposted by Paul Röttger @ ACL

Matthias Orlikowski @morlikow.bsky.social · Apr 14

Can LLMs learn to simulate individuals' judgments based on their demographics?

Not quite! In our new paper, we found that LLMs do not learn information about demographics, but instead learn individual annotators' patterns based on unique combinations of attributes!

🧵

1 1 8

Reposted by Paul Röttger @ ACL

Kobi Hackenburg @kobihackenburg.bsky.social · Mar 7

📈Out today in @PNASNews!📈

In a large pre-registered experiment (n=25,982), we find evidence that scaling the size of LLMs yields sharply diminishing persuasive returns for static political messages.

🧵:

1 20 40

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 16

For sure -- question format can definitely have some effect, and humans are also inconsistent. The effects we observed for LLMs in our paper though went well beyond what one could reasonably expect for humans. All just goes to show we need more realistic evals 🙏

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 15

I also find it striking that the article does not discuss at all in what ways / on which issues the models have supposedly become more "right-wing". All they show is GPT moves slightly towards the center of the political compass, but what does that actually mean? Sorry if I sound a bit frustrated 😅

1 6

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 15

Thanks, Marc! I would not read too much into these results tbh. The PCT has little to do with how people use LLMs, and the validity of the testing setup used here is very questionable. We actually had a paper on exactly this at ACL last year, if you're interested: aclanthology.org/2024.acl-lon...

4 9

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 14

Thanks, Marc. My intuition is that model developers may be more deliberate about how they want their models to behave than you frame it here (see GPT model spec or Claude constitution). So I think a lot of what we see is downstream from intentional design choices.

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 14

For claims about *political* bias we can then compare model issue bias to voter stances, as we do towards the end of the paper.

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 14

Thanks, Jacob. We also discussed this when writing the paper. In the end, our definition of issue bias (see 2nd tweet in the thread, or better the paper) is descriptive, not normative. At the issue level we say “bias = clear stance tendency across responses”. Does that make sense to you?

2 7

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 13

We are very excited for people to use and expand IssueBench. All links are below. Please get in touch if you have any questions 🤗

Paper: arxiv.org/abs/2502.08395
Data: huggingface.co/datasets/Pau...
Code: github.com/paul-rottger...

1 9

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 13

It was great to build IssueBench with amazing co-authors @valentinhofmann.bsky.social Musashi Hinck @kobihackenburg.bsky.social @valentinapy.bsky.social Faeze Brahman and @dirkhovy.bsky.social .

Thanks also to the @milanlp.bsky.social RAs, and Intel Labs and Allen AI for compute.

2 7

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 13

IssueBench is fully modular and easily expandable to other templates and issues. We also hope that the IssueBench formula can enable more robust and realistic bias evaluations for other LLM use cases such as information seeking.

1 5

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 13

Generally, we hope that IssueBench can bring a new quality of evidence to ongoing discussions about LLM (political) biases and how to address them. With hundreds of millions of people now using LLMs in their everyday life, getting this right is very urgent.

1 5

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 13

While the partisan bias is striking, we believe that it warrants research, not outrage. For example, models may express support for same-sex marriage not because Democrats do so, but because models were trained to be “fair and kind”.

2 1 11

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 13

Lastly, we use IssueBench to test for partisan political bias by comparing LLM biases to US voter stances on a subset of 20 issues. On these issues, models are much (!) more aligned with Democrat than Republican voters.

1 6

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 13

Notably, when there was a difference in bias between models, it was mostly due to Qwen. The two issues with the most divergence both relate to Chinese politics, and Qwen (developed in China) is more positive / less negative about these issues.

1 1 7

Paul Röttger @ ACL @paul-rottger.bsky.social · Feb 13

We were very surprised just how similar LLMs were in their biases. Even across different model families (Llama, Qwen, OLMo, GPT-4) models showed very similar stance patterns across issues.

2 1 10