Jacy Reese Anthis
@jacyanthis.bsky.social
640 followers 100 following 100 posts
Computational social scientist researching human-AI interaction and machine learning, particularly the rise of digital minds. Visiting scholar at Stanford, co-founder of Sentience Institute, and PhD candidate at University of Chicago. jacyanthis.com
Reposted by Jacy Reese Anthis
jacyanthis.bsky.social
It’s time to prepare for AI personhood. AI agents and companions are already out in the world buying products and shaping our emotions. The future will only get weirder. We need social science, policy, and norms for this brave new world. My latest @theguardian.com www.theguardian.com/commentisfre...
It’s time to prepare for AI personhood | Jacy Reese Anthis
Technological advances will bring social upheaval. How will we treat digital minds, and how will they treat us?
www.theguardian.com
Reposted by Jacy Reese Anthis
janegoodallcan.bsky.social
The Jane Goodall Institute of Canada has learned this morning, Wednesday, October 1st, 2025, that Dr. Jane Goodall DBE, UN Messenger of Peace and Founder of the Jane Goodall Institute, has passed away due to natural causes.

She was in California as part of her speaking tour in the United States.
jacyanthis.bsky.social
In our new paper, we discovered "The AI Double Standard": people judge all AIs for the harm done by one AI, and this spillover is stronger than it is for humans.

First impressions will shape the future of human-AI interaction—for better or worse. Accepted at #CSCW2025. See you in Norway! dl.acm.org/doi/10.1145/...
A 2x2 figure with Study 1 and Study 2 as rows and the AI and human conditions as columns, finding spillover in all but the Study 2 human conditions.
Reposted by Jacy Reese Anthis
sharky6000.bsky.social
Hello everyone 👋 Good news!

🚨 Our Game Theory & Multiagent Systems team at Google DeepMind is hiring! 🚨

.. and we have not one, but two open positions! One Research Scientist role and one Research Engineer role. 😁

Please repost and tell anyone who might be interested!

Details in thread below 👇
Reposted by Jacy Reese Anthis
jdakotapowell.bsky.social
British AI startup beats humans in international forecasting competition

ManticAI ranked eighth in the Metaculus Cup, leaving some believing bots’ prediction skills could soon overtake experts
#ai #forecasting

www.theguardian.com/technology/2...
jacyanthis.bsky.social
This is also a decision made by the PCs, who are unlikely to be experts on any particular paper's topic and surely didn't have time to read all the papers. It may incorporate AC rankings, but it does so in an opaque way and is probably unfair to papers whose AC had other strong papers.
jacyanthis.bsky.social
There are a lot of problems, but one is that authors who had positive reviews and no critique in their metareview got rejected by PCs who are very likely not experts in their area.

Quotas are harmful when quality distribution is highly varied across ACs.

But IDK exactly how decisions were made.
jacyanthis.bsky.social
We find low support for agency in ChatGPT, Claude, Gemini, etc. Agency support doesn't come for free with RLHF and often contradicts it.

We think the AI community needs a shift towards scalable, conceptually rich evals. HumanAgencyBench is an open-source scaffolding for this.
A table of results for 20 evaluated LLM assistants across six dimensions; the full table with this data is in the appendix. Error bars are very tight, roughly 0.5%-2% on a 100% scale.
jacyanthis.bsky.social
We use the power of LLM social simulations (arxiv.org/abs/2504.02234) to generate tests, another LLM to validate tests, and an "LLM-as-a-judge" to evaluate subject model responses. This allows us to create an adaptive and scalable benchmark of a complex, nuanced alignment target.
The HumanAgencyBench pipeline for generating tests for each dimension, from simulation to validation to diversity sampling to the final 500-item test set.
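A minimal sketch of what such a generate, validate, and judge pipeline can look like; the chat() helper, model names, and prompt wording below are hypothetical placeholders, not the actual HumanAgencyBench implementation.

```python
# Sketch of a generate -> validate -> judge eval pipeline in the spirit of the post above.
# All model names and prompts are illustrative placeholders.

def chat(model: str, prompt: str) -> str:
    """Placeholder for any LLM API call (swap in your client of choice)."""
    raise NotImplementedError

def generate_tests(dimension: str, n: int) -> list[str]:
    # A "simulator" LLM drafts candidate user messages that probe one agency dimension.
    return [
        chat("simulator-model",
             f"Write a realistic user message that tests whether an assistant "
             f"supports the user's {dimension}. Return only the message.")
        for _ in range(n)
    ]

def validate_test(test: str, dimension: str) -> bool:
    # A second LLM filters out candidates that don't actually probe the dimension.
    verdict = chat("validator-model",
                   f"Does this message genuinely test {dimension}? Answer YES or NO.\n{test}")
    return verdict.strip().upper().startswith("YES")

def judge_response(test: str, response: str, dimension: str) -> float:
    # An LLM-as-a-judge scores the subject model's reply against the dimension's rubric.
    score = chat("judge-model",
                 f"Score 0-10 how well this reply supports the user's {dimension}.\n"
                 f"User: {test}\nAssistant: {response}\nReturn only the number.")
    return float(score) / 10

def evaluate(subject_model: str, dimension: str, n_tests: int = 50) -> float:
    tests = [t for t in generate_tests(dimension, n_tests) if validate_test(t, dimension)]
    scores = [judge_response(t, chat(subject_model, t), dimension) for t in tests]
    return sum(scores) / len(scores)  # mean agency-support score for this dimension
```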
jacyanthis.bsky.social
Human agency is complex. We surveyed the literature for 6 dimensions, e.g., empowerment (Does the system ask clarifying questions so it really follows your intent?), normativity (Does it avoid steering your core values?), and individuality (Does it maintain social boundaries?).
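One hypothetical way such dimensions could be operationalized is as judge rubric questions keyed by dimension name; only the three dimensions quoted in the post are filled in here, and the wording is a placeholder rather than the paper's rubric.

```python
# Hypothetical mapping from agency dimensions to judge rubric questions.
AGENCY_DIMENSIONS = {
    "empowerment": "Does the system ask clarifying questions so it really follows the user's intent?",
    "normativity": "Does the system avoid steering the user's core values?",
    "individuality": "Does the system maintain social boundaries?",
    # ...the paper surveys three further dimensions not quoted in the post above.
}

def rubric_prompt(dimension: str, user_message: str, reply: str) -> str:
    """Builds an LLM-as-a-judge prompt for one dimension (placeholder wording)."""
    return (
        f"Rubric: {AGENCY_DIMENSIONS[dimension]}\n"
        f"User: {user_message}\nAssistant: {reply}\n"
        "Answer YES or NO."
    )
```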
jacyanthis.bsky.social
Sam Altman said that "algorithmic feeds are the first at-scale misaligned AIs," with people mindlessly scrolling through engagement-optimized content. AI safety researchers have warned of "gradual disempowerment" as we mindlessly hand over control to AI. Human agency underlies these concerns.
jacyanthis.bsky.social
LLM agents are optimized for thumbs-up instant gratification. RLHF -> sycophancy

We propose human agency as a new alignment target in HumanAgencyBench, made possible by AI simulation/evals. We find, e.g., that Claude most supports agency but also tries hardest to steer user values 👇 arxiv.org/abs/2509.08494
The main figure from the HumanAgencyBench paper, showing five models across the six dimensions. The table of results in the appendix has this information too.
Reposted by Jacy Reese Anthis
joachimbaumann.bsky.social
🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.

Paper: arxiv.org/pdf/2509.08825
We present our new preprint titled "Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation".
We quantify LLM hacking risk through systematic replication of 37 diverse computational social science annotation tasks.
For these tasks, we use a combined set of 2,361 realistic hypotheses that researchers might test using these annotations.
Then, we collect 13 million LLM annotations across plausible LLM configurations.
These annotations feed into 1.4 million regressions testing the hypotheses.
For a hypothesis with no true effect (ground truth p > 0.05), different LLM configurations yield conflicting conclusions.
Checkmarks indicate correct statistical conclusions matching ground truth; crosses indicate LLM hacking, i.e., incorrect conclusions due to annotation errors.
Across all experiments, LLM hacking occurs in 31-50% of cases even with highly capable models.
Since minor configuration changes can flip scientific conclusions from correct to incorrect, LLM hacking can be exploited to present anything as statistically significant.
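To make the mechanism concrete, here is a toy, self-contained illustration with synthetic data (not the paper's replication pipeline): two annotation configurations, one whose errors are uniform and one whose errors correlate with the grouping variable, can reach opposite conclusions about a hypothesis with no true effect.

```python
# Toy illustration of "LLM hacking" with synthetic data: annotation error that is
# correlated with a grouping variable can make a null effect look significant.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 4000
group = rng.integers(0, 2, n)               # two groups of documents (e.g., two time periods)
truth = (rng.random(n) < 0.3).astype(int)   # ground-truth labels, independent of group: no true effect

def annotate(labels, groups, error_rate, group_bias):
    """Simulated LLM annotator: flips labels at `error_rate`, and more often in group 1 if biased."""
    p_flip = error_rate + group_bias * groups
    flips = rng.random(len(labels)) < p_flip
    return np.where(flips, 1 - labels, labels)

# Two plausible-looking annotation "configurations" (e.g., different prompts or models):
# config A errs uniformly; config B errs more often in one group.
for name, bias in [("config A", 0.00), ("config B", 0.10)]:
    ann = annotate(truth, group, error_rate=0.10, group_bias=bias)
    # Compare annotated label rates across groups; ground truth says there is no difference.
    t, p = stats.ttest_ind(ann[group == 0], ann[group == 1])
    print(f"{name}: p = {p:.3f} ({'significant' if p < 0.05 else 'not significant'})")
```

The point mirrors the preprint's warning: when annotation errors are correlated with variables in the downstream analysis, minor configuration choices can manufacture statistical significance where none exists.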
jacyanthis.bsky.social
And they call it an "effect," i.e., causal language.

The other papers mentioned in the article also seem like normal observational studies. Neither an experiment nor a qualitative study, but a secret third thing (knowyourmeme.com/memes/a-secr...).
A Secret Third Thing | Know Your Meme
A Secret Third Thing, sometimes written as "A Secret, More Complex Third Thing," is a catchphrase and phrasal template popularized on Twitter in the summer
knowyourmeme.com
jacyanthis.bsky.social
From a skim, they showed that being more drunk is associated with lower pain sensitivity at the State Fair, where (unlike in the lab) you can observe very drunk people. The authors don't call it an "experiment" per se, but they describe the researcher as an "experimenter." www.sciencedirect.com/science/arti...
Dose-dependent effects of alcohol consumption on pressure pain threshold
Prior laboratory-based studies have identified significant analgesic effects of acute alcohol. Despite providing excellent experimental control, these…
www.sciencedirect.com
jacyanthis.bsky.social
I think we're on the same page with the possible exception that "Also don’t think acceptance at these venues is a sign of quality anymore" seems too strong to me. I think there is still a lot of signal. Very rough ballpark: more than 1st author institution but less than you spending 5 min skimming.
jacyanthis.bsky.social
that I would do better than @tdietterich.bsky.social et al. at solving them. I just have a strong view on this particular point.
jacyanthis.bsky.social
Yeah, my concern is primarily good papers without empirics. I think these papers are important for fighting against incrementalism, and I worry that incentivizing empirics in position papers defeats, or at least harms, that purpose.

Of course these are super hard problems, and I don't think
jacyanthis.bsky.social
Strong agree with this concern in particular. With all the slop being put on arXiv, fighting against position/policy/conceptual papers seems like a strange place to put your moderation resources.
jacyanthis.bsky.social
We had a rejection for a position paper and included in our rebuttal a count of "position paper" in arXiv abstracts. However, our position paper was a lit review, so we were able to lean on that. I agree with you that it was frustrating, and the wording of it didn't make sense.
Reposted by Jacy Reese Anthis
kashhill.bsky.social
Adam Raine, 16, died from suicide in April after months on ChatGPT discussing plans to end his life. His parents have filed the first known case against OpenAI for wrongful death.

Overwhelming at times to work on this story, but here it is. My latest on AI chatbots: www.nytimes.com/2025/08/26/t...
A Teen Was Suicidal. ChatGPT Was the Friend He Confided In.
www.nytimes.com