Johannes Wachs
@johanneswachs.bsky.social
200 followers 280 following 19 posts
Researching social computing, crowds, and networks at Corvinus University of Budapest and HUN-REN CERS. More at: https://johanneswachs.com/
Pinned
johanneswachs.bsky.social
Our paper on the effect of ChatGPT on activity on @stackoverflow.com.web.brid.gy is out: academic.oup.com/pnasnexus/ar...

@maria-drc.bsky.social, Nadzeya Laurentsyeva & I find a 25% decrease in activity on SO within 6 months of #ChatGPT 's release vs counterfactuals.

Why does it matter?
Reposted by Johannes Wachs
anetilab.bsky.social
In a recent piece for the @wsj.com commentator @greg_ip cited a 2024 PNAS Nexus study by @johanneswachs.bsky.social + coauthors @maria-drc.bsky.social & N.Laurentsyeva showing that LLMs can be a potential substitute for human-generated data & knowledge 📚

www.wsj.com/tech/ai/will...
Will AI Choke Off the Supply of Knowledge?
More people turn to ChatGPT and other large language models for answers, but they don’t add to the stock of knowledge.
www.wsj.com
Reposted by Johannes Wachs
anetilab.bsky.social
In light of yet another scorching summer ☀️, new research by H. Schuster, A. Polleres, A. Anjomshoaa & @johanneswachs.bsky.social reveals how climate change 🌡️ + demographic aging 📈 intersect to shape health risks across Austrian districts.

Read the full story 👉 rdcu.be/eD5XV
Heat, health, and habitats: analyzing the intersecting risks of climate and demographic shifts in Austrian districts
Scientific Reports - Heat, health, and habitats: analyzing the intersecting risks of climate and demographic shifts in Austrian districts
rdcu.be
Reposted by Johannes Wachs
davidhuang.blog
“Our conservative model finds that going from 0→30 % AI share (US 2020-24) predicts 2.4 % increase in commits. Using task & wage data on occupations, this implies genAI creates $9-14 bill/year in US software alone. Larger estimates of effects from RCTs imply $100 billion.”

arxiv.org/abs/2506.08945
Who is using AI to code? Global diffusion and impact of generative AI
Generative coding tools promise big productivity gains, but uneven uptake could widen skill and income gaps. We train a neural classifier to spot AI-generated Python functions in 80 million GitHub com...
arxiv.org
Reposted by Johannes Wachs
emollick.bsky.social
Lots of neat stuff in this paper showing 30% of US python commits use AI

As of the end of 2024: “the annual value of AI-assisted coding in the United States at $9.6−14.4 billion, rising to 64−96 billion if we assume higher estimates of productivity effects reported by randomized control trials”
Reposted by Johannes Wachs
johanneswachs.bsky.social
How much code now comes from AI? With @simonedaniotti.bsky.social, @xfeng.bsky.social & Frank Neffke we estimate that by end-2024 30% of Python functions pushed by US devs on GitHub are AI-generated. Adoption is rapid but diffusion lags globally. How did we do it? arxiv.org/abs/2506.08945
johanneswachs.bsky.social
Our conservative model finds that going from 0→30 % AI share (US 2020-24) predicts 2.4 % increase in commits. Using task & wage data on occupations, this implies genAI creates $9-14 bill/year in US software alone. Larger estimates of effects from RCTs imply upwards of $100 billion in value / year.
johanneswachs.bsky.social
Besides the adoption results, we find newer devs take up AI fastest. We see no gender gap. In fixed-effects models, higher user AI share predicts more commits, and the use of novel code libraries and library pairs. AI extends capabilities and supports exploration.
johanneswachs.bsky.social
The resulting classifier scores an out-of-sample AUC of 0.96. We applied it to 80 million commit snapshots from 2019-24, spanning tens of thousands of public repos and developers, to track how the share of AI-authored code evolves over time and across countries.
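For readers unfamiliar with the metric, here is a minimal sketch of how an out-of-sample AUC can be computed. It is not the authors' evaluation code: the function name `roc_auc` and the toy labels and scores are assumptions made up for illustration. AUC is the probability that a randomly chosen AI-written example receives a higher classifier score than a randomly chosen human-written one, with ties counted as half.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical held-out set: 1 = AI-generated, 0 = human-written.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The quadratic loop is fine for a toy example; on a large evaluation set a rank-based implementation from an established library would be the more practical choice.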
johanneswachs.bsky.social
First we built an AI-code detector & gathered data to train it. Human code came from 2018 Python functions & HumanEval 21/23. To create AI-written code examples we had one LLM describe each human example in English then a 2nd LLM coded that description.
Reposted by Johannes Wachs
sandorjuhasz.bsky.social
🎉 New publication in PNAS: Urban highways are barriers to social ties
www.pnas.org/doi/10.1073/...

We illustrate from numerous aspects that highways are physical barriers that cut opportunities for social connections—in the 50 largest metropolitan areas in the US.
Reposted by Johannes Wachs
mszll.datasci.social.ap.brid.gy
🎉 New paper in PNAS: Urban highways are barriers to social ties
https://www.pnas.org/doi/10.1073/pnas.2408937122

Highways are barriers that cut opportunities for social ties. We quantify this effect by overlaying the US highway network with millions of social ties from Twitter.
Map showing a highway section in red and social ties in space crossing the highway. Wherever a tie crosses the highway, there is a cross. There are 94 crosses.
Reposted by Johannes Wachs
lajello.bsky.social
"Urban Highways Are Barriers to Social Ties" out on PNAS!
The 1st large-scale measure of how highways weaken social connections between the communities they separate. This barrier effect is strong in the 50 largest US cities--especially for low-income Black communities.
www.pnas.org/doi/10.1073/...
Stylized map of Detroit (MI) showing the highway network, and the network of social connections between urban residents. The connections intersecting highways are sparser than elsewhere. Image credit Karo Berghuber (Insta: @kariot.lines)
Reposted by Johannes Wachs
nerdsitu.bsky.social
Urban highways are barriers to social connections, report by @itu.dk :
en.itu.dk/About-ITU/Pr...
3 photos of researchers
johanneswachs.bsky.social
The published version of that preprint has a slightly longer descriptive time series in the discussion, see below. We can't extend the counterfactual (comparing SO vs Russian and Chinese platforms) because other LLMs came out.

academic.oup.com/pnasnexus/ar...
Reposted by Johannes Wachs
rlsdvm.bsky.social
23/250 is Large language models reduce public knowledge sharing on online Q&A platforms

This makes me wonder if people are answering Stack Overflow questions with ChatGPT answers . . .
Large language models reduce public knowledge sharing on online Q&A platforms
Abstract. Large language models (LLMs) are a potential substitute for human-generated data and knowledge resources. This substitution, however, can present
doi.org
Reposted by Johannes Wachs
spintheory.bsky.social
🧵🧪 New paper alert! We studied how firms consume information by analyzing online reading patterns across millions of organizations. Some fascinating patterns emerged... (1/7)
Reposted by Johannes Wachs
jmateosgarcia.bsky.social
AI for science could be more impactful than chatbots. It is already helping win Nobel prizes and accelerating drug development and materials discovery.
Today we published an essay about it: why it matters, how it’s happening and its implications. Here is a summary from an econ / social sci lens.
Reposted by Johannes Wachs
johanneswachs.bsky.social
New preprint on innovation in OSS w/ Gabor Meszaros: arxiv.org/abs/2411.14894

We extract library import statements from Stack Overflow posts in 12 languages. These elementary building blocks of code appear at a slower rate as ecosystems grow. But novel combos of libraries grow linearly.
johanneswachs.bsky.social
We also find that most novel library imports and combinations are made by less-experienced users, suggesting how important new blood is for long-run ecosystem health.

Feedback warmly welcome!
johanneswachs.bsky.social
[Mirrors results on over 200 years of novelties in US patents by Youn et al: royalsocietypublishing.org/doi/full/10.... ].

Two implications for maintenance:
- single libraries will be widely used as ecosystems grow (see plot)
- the many co-used libraries need to stay compatible with each other
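
The import-extraction idea from the thread above can be sketched as follows. This is only an illustration under stated assumptions, not the preprint's pipeline: it handles Python snippets only (the preprint covers 12 languages), and the helper names `extract_imports`, `library_pairs`, and `track_novelty` are hypothetical. It parses each snippet with the standard-library `ast` module, collects top-level library names, and counts how many libraries and library pairs are new at each step.

```python
import ast
from itertools import combinations


def extract_imports(source: str) -> set[str]:
    """Return top-level module names imported by a Python snippet."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return set()  # Q&A snippets are often not valid standalone code
    modules = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports, which do not name an external library.
            if node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return modules


def library_pairs(modules: set[str]) -> set[tuple[str, str]]:
    """All unordered pairs of libraries used together in one snippet."""
    return set(combinations(sorted(modules), 2))


def track_novelty(snippets):
    """For each snippet in order, count never-before-seen libraries and pairs."""
    seen_libs, seen_pairs, history = set(), set(), []
    for snippet in snippets:
        libs = extract_imports(snippet)
        pairs = library_pairs(libs)
        history.append((len(libs - seen_libs), len(pairs - seen_pairs)))
        seen_libs |= libs
        seen_pairs |= pairs
    return history


# Made-up snippets standing in for code blocks from posts, in time order.
snippets = [
    "import numpy as np\nimport pandas as pd",
    "import numpy\nfrom scipy.stats import norm",
    "import pandas\nimport scipy",
]
history = track_novelty(snippets)
print(history)
```

In this toy sequence the third snippet introduces no new library but still yields a new pair, which is the kind of pattern the thread describes: single libraries appear more slowly while novel combinations keep accumulating.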