Sydney Levine
@sydneylevine.bsky.social
360 followers 180 following 22 posts
Cognitive scientist working at the intersection of moral cognition and AI safety. Currently: Google DeepMind. Soon: Assistant Prof at NYU Psychology. More at sites.google.com/site/sydneymlevine.
Pinned
sydneylevine.bsky.social
🔥 New position piece! 🔥 In this paper we lay out our vision for AI Alignment as guided by "Resource Rational Contractualism" (RRC).

But wait -- what's that? A 🧵.
Reposted by Sydney Levine
ashleyjthomas.bsky.social
Interested in applying to graduate programs or research positions in psychology? Want more information and feedback on your submission materials? Then Harvard’s Prospective Ph.D. & RA Event in Psychology (PPREP) is for you!! psychology.fas.harvard.edu/pprep
More info in the link!! Please retweet!
PPREP Program | Department of Psychology
psychology.fas.harvard.edu
Reposted by Sydney Levine
jfbonnefon.bsky.social
This is the kind of outcome we foretold six years ago in a paper aptly titled Drivers Are Blamed More Than Their Automated Cars When Both Make Mistakes, with @awad.bsky.social @sohandsouza.info @sydneylevine.bsky.social @maxkw.bsky.social @azimshariff.bsky.social @iyadrahwan.bsky.social
jfbonnefon.bsky.social
❝Neither the driver of the Tesla sedan nor the Autopilot software braked in time for an intersection. The jury assigned Tesla one-third of the blame and assigned two-thirds to the driver, who was reaching for his cell phone at the time of the crash❞ www.nbcnews.com/news/us-news...
Tesla hit with $243 million in damages after jury finds its Autopilot feature contributed to fatal crash
The verdict follows a three-week trial that threw a spotlight on how Tesla and CEO Elon Musk have marketed their driver-assistance software.
www.nbcnews.com
Reposted by Sydney Levine
jennhu.bsky.social
Excited to announce the first workshop on CogInterp: Interpreting Cognition in Deep Learning Models @ NeurIPS 2025! 📣

How can we interpret the algorithms and representations underlying complex behavior in deep learning models?

🌐 coginterp.github.io/neurips2025/

1/4
Home
First Workshop on Interpreting Cognition in Deep Learning Models (NeurIPS 2025)
coginterp.github.io
sydneylevine.bsky.social
Excited to be part of this vision!
ryantlowe.bsky.social
Introducing: Full-Stack Alignment 🥞

A research program dedicated to co-aligning AI systems *and* institutions with what people value.

It's the most ambitious project I've ever undertaken.

Here's what we're doing: 🧵
Reposted by Sydney Levine
mcxfrank.bsky.social
memo is a new probabilistic programming language for modeling social inferences quickly. Looks like a real advance over previous approaches: fast, Python-based, easily integrated into data analysis. Super cool!

pypi.org/project/memo...
and
osf.io/preprints/ps...
memo-lang
A language for mental models
pypi.org
sydneylevine.bsky.social
Wonderful to work out this vision in conversation with @sethlazar.org @xuanalogue.bsky.social @matijafranklin.bsky.social @yejinchoinka.bsky.social @noahdgoodman.bsky.social, Iason Gabriel, Secil Yanik Guyot, Lionel Wong, Daniel Kilov, Josh Tenenbaum.
sydneylevine.bsky.social
This framework was inspired by our work on the cogsci of human morality, which explains how humans use RRC approximations to make moral judgments. So, a virtue of an RRC-aligned AI system would be the ability to interpret human morality and update the interpretation as the world changes.
sydneylevine.bsky.social
We present experimental evidence that current frontier models can be steered to efficiently trade off compute for accuracy on a series of morally charged cases.
[Figure: experimental design]
sydneylevine.bsky.social
We lay out what those mechanisms would look like along a continuum of resource intensity. E.g., rule-following is the most heuristic. Consulting humans or simulating a bargain are more compute-intensive. Mechanisms can be selected dynamically depending on the needs of the situation.
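To make that selection idea concrete, here is a minimal toy sketch (my own illustration, not code from the paper; the mechanism names, costs, and accuracy numbers are invented placeholders): pick the cheapest mechanism whose expected fidelity to the contractualist ideal meets the stakes of the situation.

```python
# Hypothetical sketch of resource-rational mechanism selection.
# Costs and accuracies are arbitrary placeholders, not values from the paper.

from dataclasses import dataclass

@dataclass
class Mechanism:
    name: str
    cost: float      # compute/time/information required (arbitrary units)
    accuracy: float  # expected fidelity to the full contractualist ideal (0-1)

# Ordered from most heuristic to most compute-intensive, per the post.
MECHANISMS = [
    Mechanism("rule_following", cost=1.0, accuracy=0.70),
    Mechanism("consult_humans", cost=10.0, accuracy=0.90),
    Mechanism("simulate_bargain", cost=100.0, accuracy=0.98),
]

def select_mechanism(stakes: float, budget: float) -> Mechanism:
    """Choose the cheapest affordable mechanism whose accuracy meets the
    bar implied by the stakes (toy mapping: higher stakes, higher bar).
    Assumes the cheapest mechanism is always affordable."""
    required = min(0.99, 0.6 + 0.4 * stakes)
    for m in MECHANISMS:  # cheapest first
        if m.accuracy >= required and m.cost <= budget:
            return m
    # Nothing meets the bar within budget: use the best affordable option.
    affordable = [m for m in MECHANISMS if m.cost <= budget]
    return max(affordable, key=lambda m: m.accuracy)

print(select_mechanism(stakes=0.2, budget=50).name)   # rule_following
print(select_mechanism(stakes=0.9, budget=500).name)  # simulate_bargain
```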
sydneylevine.bsky.social
But the problem is that contractualism, in its ideal form, requires lots of resources – time, information, and compute – that are often impractical to obtain. Resource Rational Contractualism proposes that AI systems should use *approximations* to the contractualist ideal.
sydneylevine.bsky.social
Contractualism is a solution to the problem of how people with different values and interests can live together peacefully and productively. Contractualism is a useful guide for AI alignment, given that AI systems need to make decisions that impact lots of people with diverse values.
Reposted by Sydney Levine
traceym.bsky.social
If you’ll be at #CogSci2025, consider (or at least consider considering) attending our @cogscisociety.bsky.social workshop on meta-reasoning
🤔🤨🧐
We’ll be discussing problem selection through various lenses represented by a great lineup of speakers!
Meta-reasoning @ CogSci
Workshop Description People are general purpose problem solvers. We obtain food and shelter, manage companies, solve moral dilemmas, spend years toiling away at thorny math problems, and even adopt a...
sites.google.com
sydneylevine.bsky.social
🔆 I'm hiring! 🔆

There are two open positions:

1. Summer research position (best for a master's or graduate student); focus on computational social cognition.
2. Postdoc (currently interviewing!); focus on computational social cognition and AI safety.

sites.google.com/corp/site/sy...
Sydney Levine - Open Positions
Summer Research Position I am seeking a part-time or full-time researcher for the summer (starting asap) to bring a project to completion. The project asks the question: do people around the world u...
sites.google.com
sydneylevine.bsky.social
Awesome result!
xrg.bsky.social
the functional form of moral judgment is (sometimes) the nash bargaining solution

new preprint👇
figure 2 from our preprint, reporting the results from two experiments 

we measure moral judgments about dividing money between two parties and manipulate the degree of asymmetry in the outside options each party has

we find that moral judgments track predictions from rational bargaining models like the nash bargaining solution and the kalai-smorodinsky solution in a negotiation context

by contrast, in a donation context, moral intuitions completely reverse, instead tracking redistributive and egalitarian principles

preprint link: https://osf.io/preprints/psyarxiv/3uqks_v1
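For readers unfamiliar with the model being tested: a minimal sketch (my own, with illustrative numbers rather than the preprint's stimuli) of the Nash bargaining solution for dividing a pot of money when the two parties' outside options are asymmetric.

```python
# Toy illustration of the Nash bargaining solution for dividing a fixed
# pot of money between two parties with asymmetric outside options
# (disagreement payoffs). Numbers are illustrative only.

def nash_bargaining_split(pot: float, d1: float, d2: float) -> tuple[float, float]:
    """Maximize the Nash product (x1 - d1) * (x2 - d2) subject to
    x1 + x2 = pot. With linear utility this has a closed form: each
    party gets its outside option plus half the remaining surplus."""
    surplus = pot - d1 - d2
    assert surplus >= 0, "no agreement beats the outside options"
    return d1 + surplus / 2, d2 + surplus / 2

# Symmetric outside options -> an even split.
print(nash_bargaining_split(100, 0, 0))   # (50.0, 50.0)
# Asymmetric outside options -> the advantaged party gets more, the
# qualitative pattern the experiments manipulate.
print(nash_bargaining_split(100, 40, 0))  # (70.0, 30.0)
```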
sydneylevine.bsky.social
In the meantime, I'll be spending the next little while collaborating with the incredible folks at Google DeepMind, working on pluralistic alignment. Extremely excited to have the opportunity to concretely contribute to AI safety at this incredibly important moment.
sydneylevine.bsky.social
The NYU psych department is on 🔥 and this is your chance to join what is slated to be the world's top computational social cognitive psychology sub-sub-sub department. ☺️
sydneylevine.bsky.social
And: I'm hiring a post-doc! I'm looking for someone with a strong computational background who can start in the summer (or sooner). Details here: apply.interfolio.com/165122

Feel free to reach out with questions (email is best: [email protected]) and please share widely!
sydneylevine.bsky.social
🔆 Announcement time!! 🔆 In Spring 2026, I will be joining the NYU Psychology department as an Assistant Professor! My lab will study the computational cognitive science of moral judgment and how we can use that knowledge to build AI systems that are safe and aligned with human values.
Reposted by Sydney Levine
knightcolumbia.org
📍EVENT: RSVP to join our April symposium examining the risks that advanced #AI systems pose to democratic freedoms, co-hosted with the Institute's Senior AI Advisor @sethlazar.org. www.eventbrite.com/e/artificial...
Reposted by Sydney Levine
hyunwoo-kim.bsky.social
🚨New Paper! So o3-mini and R1 seem to excel on math & coding. But how good are they on other domains where verifiable rewards are not easily available, such as theory of mind (ToM)? Do they show similar behavioral patterns? 🤔 What if I told you it's...interesting, like the below?🧵
sydneylevine.bsky.social
This paper was an intellectual labor of love, the result of years of conversation with my mentors, @fierycushman.bsky.social, Josh Tenenbaum, and Nick Chater. Excited to be sending it out into the world!