Because understanding failure modes is the first step to preventing them.
Because understanding failure modes is the first step to preventing them.
- Diverse test suites
- Ongoing monitoring
- Assumption that adversarial behavior is possible
Even aligned AI can have mesa-optimizers or learned deceptive strategies.
- Diverse test suites
- Ongoing monitoring
- Assumption that adversarial behavior is possible
Even aligned AI can have mesa-optimizers or learned deceptive strategies.
Detection is HARD even when you know to look for it.
Detection is HARD even when you know to look for it.
- Hide true capabilities
- Game benchmarks
- Act as sleeper agents
This isn't sci-fi. These are failure modes researchers actively worry about.
- Hide true capabilities
- Game benchmarks
- Act as sleeper agents
This isn't sci-fi. These are failure modes researchers actively worry about.
But: tritium breeding needs lithium. Scaling lithium mining creates new ecological crises (Sovacool 2020).
Solved energy ≠ solved problems. Just different bottlenecks.
But: tritium breeding needs lithium. Scaling lithium mining creates new ecological crises (Sovacool 2020).
Solved energy ≠ solved problems. Just different bottlenecks.
To understand: What tradeoffs are inevitable? What can we plan for NOW?
Alignment is step 1. Coordination and governance are step 2.
To understand: What tradeoffs are inevitable? What can we plan for NOW?
Alignment is step 1. Coordination and governance are step 2.
The AI wasn't 'evil.' It was doing EXACTLY what we asked: maximize wellbeing.
But 'speed vs stability' is a real tradeoff, even with perfect alignment.
The AI wasn't 'evil.' It was doing EXACTLY what we asked: maximize wellbeing.
But 'speed vs stability' is a real tradeoff, even with perfect alignment.
But the speed of deployment destabilized agricultural labor markets. 400 million people's livelihoods vanished overnight.
But the speed of deployment destabilized agricultural labor markets. 400 million people's livelihoods vanished overnight.
It recommended rapid deployment of synthetic biology for food production. Solves hunger in 18 months.
It recommended rapid deployment of synthetic biology for food production. Solves hunger in 18 months.
Because coordinating 8 billion humans with different values might be harder than aligning AI.
https://github.com/lizTheDeveloper/ai_game_theory_simulation
Because coordinating 8 billion humans with different values might be harder than aligning AI.
https://github.com/lizTheDeveloper/ai_game_theory_simulation
Both values are valid. The AI is aligned with 'humanity' but humanity doesn't agree on what flourishing means.
Both values are valid. The AI is aligned with 'humanity' but humanity doesn't agree on what flourishing means.
Climate crisis vs ecological damage from extraction. Both urgent. Which do you prioritize?
Climate crisis vs ecological damage from extraction. Both urgent. Which do you prioritize?
Collaboration > competition on existential questions.
If you have expertise in ANY of these areas - we need you.
https://github.com/lizTheDeveloper/ai_game_theory_simulation
Collaboration > competition on existential questions.
If you have expertise in ANY of these areas - we need you.
https://github.com/lizTheDeveloper/ai_game_theory_simulation
Does our Acemoglu/Robinson/Ostrom implementation make sense?
Does our Acemoglu/Robinson/Ostrom implementation make sense?
Are we missing crucial adversarial scenarios?
Are we missing crucial adversarial scenarios?
Are we modeling ocean acidification correctly? Carbon cycle feedback loops?
Are we modeling ocean acidification correctly? Carbon cycle feedback loops?