Johannes Breuer
@johannesbreuer.com
2.5K followers 1.9K following 200 posts
Professor of Digital Social Science @unidue.bsky.social and head of the team Research Data & Methods @cais-research.bsky.social
Interested in digital traces | computational social science | reproducibility | open science | #rstats
www.johannesbreuer.com
Reposted by Johannes Breuer
dingdingpeng.the100.ci
A lot of psych is already conducted with online convenience samples & ppl are probably excited about silicon samples bc it would allow them to crank out more studies for even less 💸

How about we reconsider the idea that sciencey science involves collecting own data.
www.science.org/content/arti...
AI-generated ‘participants’ can lead social science experiments astray, study finds
Data produced by “silicon samples” depends on researchers’ exact choice of models, prompts, and settings
www.science.org
Reposted by Johannes Breuer
cais-research.bsky.social
Join us for a #LunchtimeTalk at CAIS by Laura Vodden from QUT Digital Media Research Centre on Oct 1, 1:30–2:30 PM.
🔎 Topic: “AI-assisted frame analysis, and reflections on the value of disagreement in human-LLM collaboration”
🔗 www.cais-research.de/en/event/lun...
Visual announcing the talk on "AI-assisted frame analysis, and reflections on the value of disagreement in human-LLM collaboration" by speaker Laura Vodden, including the date, time, and location.
Reposted by Johannes Breuer
medem.bsky.social
They say: once there’s a paper 📝, it’s real…
🎉 #MEDem in #EPS @ecpr.bsky.social

The paper outlines how MEDem will strengthen #OpenScience by making data #FAIR in #DemocracyResearch – and how scholars across Europe are joining forces to build a truly open #ResearchInfrastructure ⚙️
Screenshot of the article “Open science in democracy research: the research infrastructure ‘Monitoring Electoral Democracy’ (MEDem)” published in European Political Science. The header shows the journal name, DOI link, and the label “DEBATE.” Below, the title is followed by the author list: Hajo Boomgaarden, Alexia Katsanidou, Sylvia Kritzinger, Georg Lutz, Johanna Willmann, and Jakob-Moritz Eberl. The abstract explains the aims of MEDem as a European research infrastructure to make democracy research data FAIR (findable, accessible, interoperable, reusable). Keywords listed include open science, FAIR data, ESFRI roadmap, research infrastructure, democracy research, data harmonization, data linking, and data set search. DOI: https://doi.org/10.1057/s41304-025-00534-8
Reposted by Johannes Breuer
weizenbauminstitut.bsky.social
Out now: Our latest Policy Paper dives into the key provisions of the Digital Services Act #DSA on data access, highlighting its goals, procedures, limits, & external factors impacting implementation: 🔗 doi.org/10.34669/WI....
Cover: LK Seiling, Clara Iglesias Keller, Jakob Ohme, Ulrike Klinger, Claes de Vreese. Data Access for Researchers under the Digital Services Act: From Policy to Practice. Weizenbaum Policy Paper 15, September 2025.
Reposted by Johannes Breuer
cais-research.bsky.social
Registration is open for #DigiMeet2025 on Nov 6!🎉
Join 16 talks & discussions on platform regulation, community building & governance with #EarlyCareerResearchers
🗣️Keynote by Tobias Mast
👉Program + Registration: www.cais-research.de/en/event/dig...
@bidt.bsky.social @weizenbauminstitut.bsky.social
Reposted by Johannes Breuer
cais-research.bsky.social
In this interview, Johannes Breuer and Marco Wähner explain how #TeamRDM supports researchers, uses digital data, drives methodological innovation & aims to strengthen transparency and trust in research.
👉 www.cais-research.de/news/intervi...
@marco-waehner.bsky.social @johannesbreuer.com
Photo of Marco Wähner and Johannes Breuer sitting at a table and looking into the camera. The graphic carries the text: "Interview: Marco Wähner & Johannes Breuer on Research Data & Methods at CAIS".
johannesbreuer.com
It does not earn me any money, but at least I can say I am part of the top 0.1% somewhere 😄
conradhackett.bsky.social
If you have two followers, you are ahead of half of all Bluesky accounts.
If you have 400 followers, you are in the top 1%.
bsky.jazco.dev/stats
Chart showing share of followers per user percentile. A user in the 50th percentile has 1 follower. A user in the 99.99th percentile has 11,241 followers.
Reposted by Johannes Breuer
jamiecummins.bsky.social
Can large language models stand in for human participants?
Many social scientists seem to think so, and are already using "silicon samples" in research.

One problem: depending on the analytic decisions made, you can basically get these samples to show any effect you want.

THREAD 🧵
The threat of analytic flexibility in using large language models to simulate human data: A call to attention
Social scientists are now using large language models to create "silicon samples" - synthetic datasets intended to stand in for human respondents, aimed at revolutionising human subjects research. How...
arxiv.org
Reposted by Johannes Breuer
cos.io
On Oct. 4, the OSF will launch a refreshed design that makes the platform easier to navigate & use. Updates include:

✨ Streamlined design for easier navigation
⏱️ Faster load times & improved performance
📁 The return of file access on project overview pages!

Read more:
Coming Soon: A New Look for OSF
In early October 2025, the Open Science Framework (OSF) will launch a refreshed design that makes the platform easier to navigate and use. The update modernizes the interface and improves performance, while preserving the familiar workflows that users are accustomed to.
www.cos.io
Reposted by Johannes Breuer
econ4ua.bsky.social
📈Did you know that Economists for Ukraine has created a repository of research papers on #Ukraine and the Russian invasion? It includes categories such as finance, sanctions, trade, agriculture, energy, labor markets, and governance.

#EconSky

See our database: econ4ua.org/research-rep...
Research Repository – Economists for Ukraine
Economists for Ukraine has compiled a repository of research papers on topics related to Russia’s war on Ukraine. To submit a paper that’s not on our list, please fill out this form.
econ4ua.org
Reposted by Johannes Breuer
liamflannery.bsky.social
We're making an automation game set in Windows 95 about making PowerPoint factories #gamedev #indiegames
Reposted by Johannes Breuer
claudiasalowski.bsky.social
When you watch TV interviews about the NRW election, one thing stands out: the AfD term "Altparteien" ("old parties") has by now arrived in everyday language.
This is what political scientists mean by normalization (which we have been writing our fingers bloody about for ages 🙄).
Reposted by Johannes Breuer
jenniferhenke.bsky.social
Waiting for the outcome of professorial appointment procedures
#IchBinHanna #IchBinReyhan #PDprekär
Reposted by Johannes Breuer
sixtus.net
Cologne heads solidly into the runoff! 🌻
Cologne mayoral election 2025 – results/projections so far (rounded):
Berîvan Aymaz (Greens): 28.1 percent
Markus Greitemann (CDU): 19.5 percent
Torsten Burmester (SPD): 21.3 percent
Heiner Kockerbeck (Die Linke): 6.1 percent
Volker Görzel (FDP): 3 percent
Lars Wolfram (Volt): 2.5 percent
Reposted by Johannes Breuer
joachimbaumann.bsky.social
🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.

Paper: arxiv.org/pdf/2509.08825
We present our new preprint titled "Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation".
We quantify LLM hacking risk through systematic replication of 37 diverse computational social science annotation tasks.
For these tasks, we use a combined set of 2,361 realistic hypotheses that researchers might test using these annotations.
Then, we collect 13 million LLM annotations across plausible LLM configurations.
These annotations feed into 1.4 million regressions testing the hypotheses. 
For a hypothesis with no true effect (ground truth p > 0.05), different LLM configurations yield conflicting conclusions.
Checkmarks indicate correct statistical conclusions matching ground truth; crosses indicate LLM hacking -- incorrect conclusions due to annotation errors.
Across all experiments, LLM hacking occurs in 31-50% of cases even with highly capable models.
Since minor configuration changes can flip scientific conclusions, from correct to incorrect, LLM hacking can be exploited to present anything as statistically significant.
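As a toy illustration of the mechanism described in this post, the sketch below assumes two groups of texts with identical true label rates (so there is no true effect) and two hypothetical annotator configurations. All names (`annotate`, `false_positive_every`, `config_1`, `config_2`), the sample sizes, and the error pattern are assumptions made for illustration. None of this is taken from the preprint, which runs regressions on real annotation tasks rather than this simple two-proportion z-test.

```python
import math


def two_proportion_z(pos_a, n_a, pos_b, n_b):
    """z statistic for the difference between two proportions (pooled SE)."""
    p_a, p_b = pos_a / n_a, pos_b / n_b
    pooled = (pos_a + pos_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se


def annotate(labels, false_positive_every=None):
    """Toy 'annotator': copies the true labels, optionally flipping
    every k-th negative (0) to a positive (1). Purely hypothetical."""
    out = []
    negatives_seen = 0
    for y in labels:
        if y == 0:
            negatives_seen += 1
            if false_positive_every and negatives_seen % false_positive_every == 0:
                out.append(1)  # annotation error: false positive
                continue
        out.append(y)
    return out


# Ground truth: both groups have exactly 50% positives, so no true effect.
truth_a = [1] * 100 + [0] * 100
truth_b = [1] * 100 + [0] * 100

# Two hypothetical configurations; only config_2 errs, and only in group B.
configs = {
    "config_1": {"a": None, "b": None},
    "config_2": {"a": None, "b": 3},
}


def significant(config):
    """Return (is_significant_at_5_percent, z) for one configuration."""
    ann_a = annotate(truth_a, config["a"])
    ann_b = annotate(truth_b, config["b"])
    z = two_proportion_z(sum(ann_a), len(ann_a), sum(ann_b), len(ann_b))
    return abs(z) > 1.96, z


for name, config in configs.items():
    print(name, significant(config))
```

In this sketch `config_1` reproduces the ground truth (z = 0), while `config_2`, which mislabels every third negative in group B only, pushes |z| above 1.96 and would be reported as a significant group difference.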
Reposted by Johannes Breuer
dingdingpeng.the100.ci
Ever stared at a table of regression coefficients & wondered what you're doing with your life?

Very excited to share this gentle introduction to another way of making sense of statistical models (w @vincentab.bsky.social)
Preprint: doi.org/10.31234/osf...
Website: j-rohrer.github.io/marginal-psy...
Models as Prediction Machines: How to Convert Confusing Coefficients into Clear Quantities

Abstract
Psychological researchers usually make sense of regression models by interpreting coefficient estimates directly. This works well enough for simple linear models, but is more challenging for more complex models with, for example, categorical variables, interactions, non-linearities, and hierarchical structures. Here, we introduce an alternative approach to making sense of statistical models. The central idea is to abstract away from the mechanics of estimation, and to treat models as “counterfactual prediction machines,” which are subsequently queried to estimate quantities and conduct tests that matter substantively. This workflow is model-agnostic; it can be applied in a consistent fashion to draw causal or descriptive inference from a wide range of models. We illustrate how to implement this workflow with the marginaleffects package, which supports over 100 different classes of models in R and Python, and present two worked examples. These examples show how the workflow can be applied across designs (e.g., observational study, randomized experiment) to answer different research questions (e.g., associations, causal effects, effect heterogeneity) while facing various challenges (e.g., controlling for confounders in a flexible manner, modelling ordinal outcomes, and interpreting non-linear models).
Figure illustrating model predictions. On the x-axis the predictor, annual gross income in Euro. On the y-axis the outcome, predicted life satisfaction. A solid line marks the curve of predictions on which individual data points are marked as model-implied outcomes at incomes of interest. Comparing two such predictions gives us a comparison. We can also fit a tangent to the line of predictions, which illustrates the slope at any given point of the curve.
A figure illustrating various ways to include age as a predictor in a model. On the x-axis age (predictor), on the y-axis the outcome (model-implied importance of friends, including confidence intervals).

Illustrated are:
1. age as a categorical predictor, resulting in the predictions bouncing around a lot with wide confidence intervals,
2. age as a linear predictor, which forces a straight line through the data points that has a very tight confidence band, and
3. age splines, which lie somewhere in between, as they smoothly follow the data but have more uncertainty than the straight line.
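The three quantities named in the first figure description (predictions, comparisons, slopes) can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: the log-income model and its coefficients (`INTERCEPT`, `SLOPE_LOG_INCOME`) are invented for illustration, and the helper functions are hypothetical stand-ins, not the API of the marginaleffects package.

```python
import math

# Hypothetical coefficients of an already-fitted model:
# life_satisfaction = INTERCEPT + SLOPE_LOG_INCOME * log(income)
INTERCEPT = 2.0
SLOPE_LOG_INCOME = 0.5


def predict(income):
    """Model-implied outcome at a given annual gross income (in Euro)."""
    return INTERCEPT + SLOPE_LOG_INCOME * math.log(income)


def comparison(low, high):
    """Difference between two predictions (e.g. doubling income)."""
    return predict(high) - predict(low)


def slope(income, eps=1.0):
    """Tangent slope of the prediction curve, via a central finite difference."""
    return (predict(income + eps) - predict(income - eps)) / (2 * eps)


# Query the 'prediction machine' at incomes of interest.
for income in (20_000, 40_000, 80_000):
    print(income, round(predict(income), 3), slope(income))

print("comparison 20k -> 40k:", round(comparison(20_000, 40_000), 3))
```

Because this assumed model is linear in log income, every doubling of income yields the same comparison, while the slope shrinks as income grows. That is the kind of pattern that is hard to read off a coefficient table but easy to read off queried predictions.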
Reposted by Johannes Breuer
cais-research.bsky.social
💡 As part of the pre-conference program, CAIS hosted the workshop “AI (Tools) for Research in Media Psychology (and Beyond)” led by Prof. Johannes Breuer. Participants explored current methods and applications in media psychology. #MediaPsych #MePsy25 @johannesbreuer.com
Prof. Johannes Breuer introducing the workshop
View of the workshop room during the introduction round of the participants
Prof. Johannes Breuer discussing the responses provided by participants in the preparatory survey, which he summarized using Google Gemini
Reposted by Johannes Breuer
cais-research.bsky.social
📢 The 14th Media Psychology Conference is underway! Together with Uni Duisburg-Essen, we are co-organizers and our researchers are presenting exciting talks. Over 230 participants from 21 countries are exploring AI, smartphone use, misinformation, and more. #MePsy25 @unidue.bsky.social
Visual of the conference
Reposted by Johannes Breuer
gesis.org
New post on the GESIS #Blog about #language as a factor in reliable #science

Why is language so crucial in the context of scientific #replicability, and how can researchers address language-related challenges?
doi.org/10.34879/ges...