Petter Törnberg
pettertornberg.com
@pettertornberg.com
Assistant Professor in Computational Social Science at University of Amsterdam

Studying the intersection of AI, social media, and politics.

Polarization, misinformation, radicalization, digital platforms, social complexity.
If you are feeling unsafe, I would encourage you to contact either 988 or 911. They will be able to help you.
January 8, 2026 at 1:09 PM
Perfect Holiday gift! 🎁

Worth every penny! (It’s open access 😉)
December 18, 2025 at 11:37 AM
No, sorry - in person only!
Really happy you liked the book! :)
December 9, 2025 at 4:06 PM
To be fair, the fact that my paper has a higher Altmetric than Marx's Das Kapital might be taken to imply that the limitations in the methodology are to my paper's benefit... ;)
November 28, 2025 at 8:36 AM
Thanks Jonathan.
Democrats still have accounts, but they rarely visit the site.
I would point you to the preprint I put up for more details and better versions of the figures: www.arxiv.org/abs/2510.25417
Shifts in U.S. Social Media Use, 2020-2024: Decline, Fragmentation, and Enduring Polarization
Using nationally representative data from the 2020 and 2024 American National Election Studies (ANES), this paper traces how the U.S. social media landscape has shifted across platforms, demographics,...
www.arxiv.org
November 15, 2025 at 10:02 PM
Find my co-authors on Bluesky: @chrisbail.bsky.social @cbarrie.bsky.social

Colleagues who do excellent work in this field, and might find these results interesting:
@mbernst.bsky.social
@robbwiller.bsky.social
@joon-s-pk.bsky.social
@janalasser.bsky.social
@dgarcia.eu
@aaronshaw.bsky.social
November 7, 2025 at 11:19 AM
This was carried out by the amazing Nicolò Pagan, together with Chris Bail, Chris Barrie, and Anikó Hannák.

Paper (preprint): arxiv.org/abs/2511.04195

Happy to share prompts, configs, and analysis scripts.
Computational Turing Test Reveals Systematic Differences Between Human and AI Language
Large language models (LLMs) are increasingly used in the social sciences to simulate human behavior, based on the assumption that they can generate realistic, human-like text. Yet this assumption rem...
arxiv.org
November 7, 2025 at 11:13 AM
Takeaways for researchers:
• LLMs are worse stand-ins for humans than they may appear.
• Don’t rely on human judges.
• Measure detectability and meaning.
• Expect a style–meaning trade-off.
• Use examples + context, not personas.
• Affect is still the biggest giveaway.
November 7, 2025 at 11:13 AM
We also found some surprising trade-offs:
🎭 When models sound more human, they drift from what people actually say.
🧠 When they match meaning better, they sound less human.

Style or meaning — you have to pick one.
November 7, 2025 at 11:13 AM
So what actually helps?
Not personas. And fine-tuning? Not always.

The real improvements came from:
✅ Providing stylistic examples of the user
✅ Adding context retrieval from past posts

Together, these reduced detectability by 4-16 percentage points.
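For a sense of what the retrieval step could look like, here is a minimal sketch of embedding-based context retrieval, assuming the sentence-transformers package; the model name and top-k value are illustrative choices, not the paper's released configuration.

```python
# Hypothetical sketch of embedding-based context retrieval from a user's past posts;
# the paper's actual prompts and configs may differ. Assumes sentence-transformers.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

def retrieve_context(conversation: str, past_posts: list[str], k: int = 5) -> list[str]:
    """Return the user's k past posts most similar to the current conversation."""
    query_emb = encoder.encode(conversation, convert_to_tensor=True)
    post_embs = encoder.encode(past_posts, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, post_embs)[0]   # cosine similarity to each post
    top_k = scores.topk(k=min(k, len(past_posts)))
    return [past_posts[i] for i in top_k.indices.tolist()]

context = retrieve_context(
    "What do you think of the new transfer window signings?",
    ["Big match tonight!", "My sourdough finally worked", "That referee was terrible"],
    k=2,
)
```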
November 7, 2025 at 11:13 AM
Some findings surprised us:
⚙️ Instruction-tuned models — the ones fine-tuned to follow prompts — are easier to detect than their base counterparts.
📏 Model size doesn’t help: even 70B models don’t sound more human.
November 7, 2025 at 11:13 AM
Where do LLMs give themselves away?

❤️ Affective tone and emotion — the clearest tell.
✍️ Stylistic markers — average word length, toxicity, hashtags, emojis.
🧠 Topic profiles — especially on Reddit, where conversations are more diverse and nuanced.
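As a rough illustration, stylistic markers like these can be computed with a few lines of Python; the feature set below is a simplified stand-in for the paper's pipeline, and toxicity scoring would need an external classifier (omitted here).

```python
# Illustrative feature extraction for stylistic markers; not the paper's exact pipeline.
import re
import statistics

def style_features(post: str) -> dict:
    """Compute simple interpretable stylistic markers for a single post."""
    words = re.findall(r"\w+", post)
    return {
        "avg_word_length": statistics.mean(len(w) for w in words) if words else 0.0,
        "n_hashtags": post.count("#"),
        "n_mentions": post.count("@"),
        "n_emojis": sum(1 for ch in post if ord(ch) > 0x1F000),  # crude emoji heuristic
        "n_words": len(words),
        # Toxicity would require an external classifier (e.g. Detoxify); omitted here.
    }

print(style_features("Loving this paper! 🎉 #compsocsci @pettertornberg.com"))
```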
November 7, 2025 at 11:13 AM
The results were clear — and surprising.
Even short social media posts written by LLMs are readily distinguishable from human-written ones.

Our BERT-based classifier spots AI with 70–80% accuracy across X, Bluesky, and Reddit.

LLMs are much less human-like than they may seem.
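For readers who want to picture the setup, below is a hedged sketch of a BERT-based human-vs-AI classifier using Hugging Face transformers; the model choice, hyperparameters, and toy data are illustrative, not the authors' code.

```python
# Illustrative sketch of a BERT-based human-vs-AI classifier (not the authors' code).
# Assumes transformers and datasets are installed; labels: 0 = human, 1 = AI.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import Dataset

texts = ["example human post", "example LLM-generated post"]  # toy placeholder data
labels = [0, 1]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased",
                                                           num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

ds = Dataset.from_dict({"text": texts, "label": labels}).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="detector", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=ds,
)
trainer.train()  # held-out accuracy is then the "detectability" of the LLM (0.5 = chance)
```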
November 7, 2025 at 11:13 AM
We test the state-of-the-art methods for calibrating LLMs — and then push further, using advanced fine-tuning.

We benchmark 9 open-weight LLMs across 5 calibration strategies:
👤 Persona
✍️ Stylistic examples
🧩 Context retrieval
⚙️ Fine-tuning
🎯 Post-generation selection
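To make the prompt-level strategies concrete, here is a hypothetical sketch of how persona, stylistic examples, and context retrieval might be expressed as prompts; these placeholders are not the paper's released prompts, and fine-tuning and post-generation selection operate outside the prompt.

```python
# Hypothetical prompt construction for the three prompt-level strategies; the
# paper's released prompts and configs may differ from these placeholders.

def persona_prompt(persona: str, conversation: str) -> str:
    """Persona: describe the user in natural language."""
    return f"You are {persona}. Write a reply to this conversation:\n{conversation}"

def style_examples_prompt(examples: list[str], conversation: str) -> str:
    """Stylistic examples: show the model posts the user has actually written."""
    shots = "\n".join(f"- {e}" for e in examples)
    return (f"Here are posts previously written by this user:\n{shots}\n\n"
            f"Write this user's reply to the conversation:\n{conversation}")

def context_retrieval_prompt(examples: list[str], retrieved: list[str],
                             conversation: str) -> str:
    """Context retrieval: add the user's most relevant past posts to the examples."""
    return style_examples_prompt(examples + retrieved, conversation)

# Fine-tuning and post-generation selection are not prompt-level: they involve
# training on the user's posts, or sampling several replies and keeping the best one.
```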
November 7, 2025 at 11:13 AM
We use our Computational Turing Test to see whether LLMs can produce realistic social media conversations.

We use data from X (Twitter), Bluesky, and Reddit.

This task is arguably what LLMs should do best: they are literally trained on this data!
November 7, 2025 at 11:13 AM
We introduce a Computational Turing Test — a validation framework that compares human and LLM text using:

🕵️‍♂️ Detectability — can an ML classifier tell AI from human?

🧠 Semantic fidelity — does it mean the same thing?

✍️ Interpretable linguistic features — style, tone, topics.
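As an illustration of the second dimension, semantic fidelity can be approximated as embedding similarity between the human reply and the LLM reply to the same conversation; the embedding model named below is an assumption, not necessarily what the paper uses.

```python
# Illustrative semantic-fidelity scoring: cosine similarity between the human reply
# and the LLM reply to the same conversation. Embedding model choice is an assumption.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative choice

def semantic_fidelity(human_reply: str, llm_reply: str) -> float:
    """Higher = the LLM reply means roughly the same thing as the human one."""
    embs = encoder.encode([human_reply, llm_reply], convert_to_tensor=True)
    return util.cos_sim(embs[0], embs[1]).item()

print(semantic_fidelity(
    "Congrats on the paper, the retrieval results are really interesting!",
    "Great work! I found the context-retrieval findings especially interesting.",
))
```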
November 7, 2025 at 11:13 AM
Most prior work validated "human-likeness" with human judges. Basically, do people think it looks human?

But humans are actually really bad at this task: we are subjective, we scale poorly, and we are very easy to fool.

We need something more rigorous.
November 7, 2025 at 11:13 AM