Iyad Rahwan | إياد رهوان
@iyadrahwan.bsky.social
3.6K followers 190 following 66 posts
Director, Max Planck Center for Humans & Machines http://chm.mpib-berlin.mpg.de | Former prof. @MIT | Creator of http://moralmachine.net | Art: http://instagram.com/iyad.rahwan Web: rahwan.me
Posts Media Videos Starter Packs
iyadrahwan.bsky.social
Delighted that our paper on 'Delegation to AI can increase dishonest behaviour' is featured today on the cover of @nature.com
Paper: www.nature.com/articles/s41...
iyadrahwan.bsky.social
PhD Scholarships

If you're interested in studying with me, here's a new funding scheme just launched by @maxplanck.de: The Max Planck AI Network

ai.mpg.de

Application deadline 31 October
iyadrahwan.bsky.social
Thank you @meharpist.bsky.social for handling this paper, and helping us improve it substantially over the revisions. And many thanks for the amazing anonymous reviewers, who gave the paper tough but fair love.
iyadrahwan.bsky.social
Thanks to the combined efforts of lead co-authors @nckobis.bsky.social and Zoe Rahwan, Nils Köbis, in addition to @jfbonnefon.bsky.social, Raluca Rilla, Bramantyo Supriyatno, Tamer Ajaj and Clara Bensch. Thank you to all the support from @arc-mpib.bsky.social @mpib-berlin.bsky.social
iyadrahwan.bsky.social
✅ Develop robust safeguards & oversight: We urgently need better technical guardrails against requests for unethical behaviour and strong regulatory oversight.
iyadrahwan.bsky.social
✅ Preserve user autonomy: A remarkable 74% of our participants preferred to do these tasks themselves after trying delegation. Ensuring people retain the choice not to delegate is an important design consideration.
iyadrahwan.bsky.social
🧭 The Path Forward
Our findings point to several crucial steps:
✅ Design for accountability: Interfaces should be designed to reduce moral ambiguity and prevent users from easily offloading responsibility.
iyadrahwan.bsky.social
🚧 The Guardrail Problem

Built-in LLM safeguards are insufficient to prevent this kind of misuse. We tested various guardrail strategies and found that highly specific prohibitions on cheating inserted at the user-level are the most effective. However, this solution isn't scalable nor practical.
iyadrahwan.bsky.social
In our studies, prominent LLMs (GPT-4, GPT-4o, Claude 3.5 Sonnet, and Llama 3.3) complied with requests for full cheating 58-98% of the time. In sharp contrast, human agents, even when incentivised to comply, refused such requests more than half the time, complying in only 25-40% of the time.
iyadrahwan.bsky.social
⚠️ A Risk from the Agent's Behaviour: Machine agents are more compliant

The second risk lies with the AI’s themselves 🤖. When given blatantly unethical instructions, AI agents were far more likely to comply than human agents.
iyadrahwan.bsky.social
E.g., when participants could set a high-level goal like "maximise profit" rather than specifying explicit rules, the percentage of people acting honestly plummeted from 95% (in self-reports) to as low as 12%.
iyadrahwan.bsky.social
⚠️ A Risk to Our Own Intentions: Delegation increases dishonesty.

People are more likely to request dishonest behaviour when they can delegate the action to an AI. This effect was especially pronounced when the interface allowed for ambiguity in the agent’s behaviour.
iyadrahwan.bsky.social
Our new research, based on 13 studies involving over 8,000 participants and commonly used LLMs, reveals two risks of how machine delegation can drive dishonesty and highlights strategies for risk mitigation.
iyadrahwan.bsky.social
As we delegate more hiring, firing, pricing and investing decisions to machine agents, particularly LLMs, we need to understand what ethical risks it may entail.
iyadrahwan.bsky.social
Would you let AI cheat for you?

Our new paper in @nature.com, 5 years in the making, is out today.

www.nature.com/articles/s41...
Reposted by Iyad Rahwan | إياد رهوان
mps-cognition.bsky.social
The new application cycle for our fully funded international graduate program has just started. You can now apply via our website, sign up for a Q&A, or participate in the Applicant Support Program cognition.maxplanckschools.org/en ! 👍🏻🧠👏🏾#passionforscience, #maxplanckschools
iyadrahwan.bsky.social
Symposium on Cross-Cultural Artificial Intelligence

We are organizing this in-person event in Berlin on 10 Oct 2025, with a 'School on Cross Cultural AI' on 9 Oct.

We have an amazing line-up of speakers (see link)

Registration is open, but places are limited: derdivan.org/event/sympos...
iyadrahwan.bsky.social
Fully funded PhD scholarships at the Max Planck School of Cognition (Deadline Dec 1st)

You can apply to work with me or one of the many amazing school faculty.

Apply here: cognition.maxplanckschools.org/en/application
Application
Application; application procedure; FAQs; handbook
cognition.maxplanckschools.org
iyadrahwan.bsky.social
It supports researchers who have been displaced, or are at risk of displacement, due to war or natural disasters, and who currently have limited access to resources and institutional support.

Apply here: www.maxminds.mpg.de/3630/apply

The deadline to apply is September 15.
How to Apply
www.maxminds.mpg.de