Dave Willner
@dwillner.bsky.social
Co-Founder at Zentropi. Formerly Head of Trust & Safety at OpenAI, of Community Policy at Airbnb, and of Content Policy at Facebook. Strictly cold takes.
Pinned
For 17 years working in trust and safety, I've watched talented people burn out on impossible tasks. The problem isn't the people, it's the systems. Traditional moderation requires months of retraining for every policy change. Only big companies can afford it, and even then it works poorly. 🧵 1/9
Reposted by Dave Willner
Bellingcat’s contact email has always been a magnet for people with fairly unusual views: paranoid delusions, sprawling conspiracies, the works. But recently, the pattern has shifted: we’re seeing more and more emails clearly written with ChatGPT.
November 19, 2025 at 2:18 PM
Reposted by Dave Willner
this administration, and its congressional allies, are free speech phonies. not warriors. phonies. censors. propagandists.
Just wanna re-up in simple terms that when Biden talked to platforms, Jim Jordan launched years of investigations into everybody involved, said it was tyranny, censorship, etc.

And now they just straight up acknowledge that they talk to platforms too.

www.washingtonexaminer.com/news/crime/3...
DHS playing 'whack-a-mole' shooting down made-up ICE stories
The Department of Homeland Security is stepping up efforts to combat fake news, viral AI videos, and misinformation on ICE and Border Patrol.
www.washingtonexaminer.com
November 20, 2025 at 1:55 AM
Reposted by Dave Willner
Just wanna re-up in simple terms that when Biden talked to platforms, Jim Jordan launched years of investigations into everybody involved, said it was tyranny, censorship, etc.

And now they just straight up acknowledge that they talk to platforms too.

www.washingtonexaminer.com/news/crime/3...
DHS playing 'whack-a-mole' shooting down made-up ICE stories
The Department of Homeland Security is stepping up efforts to combat fake news, viral AI videos, and misinformation on ICE and Border Patrol.
www.washingtonexaminer.com
November 20, 2025 at 1:54 AM
Reposted by Dave Willner
We just wrote an in-depth post about Toxic Content labeling. It presents a new way of defining toxic speech online-- and illustrates the importance of observable features for accurate language model interpretability. Would love to hear how YOU define toxicity, too! blog.zentropi.ai/observations...
Observations on Toxicity
We've published Zentropi's toxicity labeler (toxicity-public-s5), which you can integrate with your platform instantly using the Zentropi API. Browse the full policy to see how defining observable fea...
blog.zentropi.ai
November 13, 2025 at 10:47 PM
I’ve had a very “text-oriented” view of content labeling for a long time, and used the opportunity of our recent launch to lay out some of those ideas in the context of the idea of “toxicity”

Interested to know what others think!

blog.zentropi.ai/observations...
Observations on Toxicity
We published a novel toxicity labeler (toxicity-public-s5), which you can integrate with your platform instantly using the Zentropi API. Browse the full policy to see how defining observable features ...
blog.zentropi.ai
November 13, 2025 at 10:56 PM
Content policies are usually private, one-off efforts. You build yours, I build mine, we don't share much about what works or why. This makes sense given products can (and should) set different policies based on their communities, but it leaves us reinventing the wheel. 🧵 1/5
November 10, 2025 at 8:10 PM
Reposted by Dave Willner
*whispers* you can continue to read me, the pundit who insisted the other pundits were wrong about these conclusions
basically every 2024 truism is dead. Trump did not build a lasting multiracial coalition or turn young men into committed Republicans. You don’t need to cave on trans rights to win. The pundits have nothing left to tell you.
November 5, 2025 at 3:06 PM
Go U Bears!
November 5, 2025 at 2:54 AM
Reposted by Dave Willner
Picture of the East Wing demolition of the White House taken on my flight out of DCA.
October 23, 2025 at 5:16 PM
I am forgetful about self-promotion, so dropping a last-minute link to note that I’m giving a talk at Berkman Klein today. Come check it out if you’re free, or catch the recording later:

cyber.harvard.edu/events/autom...
Automating Content Policy
AI is no longer just moderating individual posts — it is learning how to interpret and enforce policy itself. Dave Willner — who has led trust and safety teams at Facebook, Airbnb, and OpenAI — joins ...
cyber.harvard.edu
October 22, 2025 at 4:08 PM
I feel like some of the difference in reactions here also rests on how frequently you have to do a somewhat complex, but very repetitive, task. Taking the time to get these sorts of workflows really dialed in is most useful for stuff you do over and over.
A thing that I keep finding with AI experiments is that the more context and direction you give a tool, the more benefits it gives in return. So many of the complaints about AI seem focused on just trying to use it cold without additional context. Skills seems like useful context.
Claude Skills are awesome, maybe a bigger deal than MCP
simonwillison.net/2025/Oct/16/...
October 17, 2025 at 10:00 PM
Reposted by Dave Willner
Tyranny is brittle.
We live in a country where the government honors insurrectionists who sacked the Capitol, and defines peaceful protest, even before it occurs, to be terrorism.
They know their position is weak, they know they are unpopular, which is why they are seeking to stamp out dissent.
Mike Johnson: "We're so angry about it. I mean, I'm a very patient guy, but I've had it with these people. The theory we have right now -- they have a hate America rally that's scheduled for October 18 on the National Mall. It's the pro-Hamas wing and antifa people ... "
October 10, 2025 at 5:16 PM
So, the first part of this is plainly false, both historically and currently. I don’t think it’s a good thing in most cases…but it’s plainly the case that pressuring the people in charge of moderation to either ban (or not ban) people works *All The Time*. It is why people do it!
Harassing the mods into banning someone has never worked. And harassing people in general has never changed their mind.
October 3, 2025 at 3:03 PM
Reposted by Dave Willner
New Ctrl-Alt-Speech: Moderating is Such Sweet Sorrow with guest host @dwillner.bsky.social who is entirely responsible for bringing up Shakespeare as part of this discussion. (@benwhitelaw.bsky.social will be back next week!)

podcast.ctrlaltspeech.com/2315966/epis...
Moderating is Such Sweet Sorrow - Ctrl-Alt-Speech
In this week’s roundup of the latest news in online speech, content moderation and internet regulation, Mike is joined by Dave Willner, founder of Zentropi, and long-time trust & safety expert who...
podcast.ctrlaltspeech.com
October 1, 2025 at 11:25 PM
While terrible, this is entirely unsurprising. If you hold serious safety efforts in contempt, this sort of thing is inevitable.
September 23, 2025 at 4:59 AM
Reposted by Dave Willner
Disney/ABC have a responsibility to refuse to participate in corruption.

Kimmel must be reinstated. If Disney/ABC agree to this extortion then perhaps creatives + workers should consider collective action to push back. Same w/buying park + cruise tickets if they bow.

People have power. Ask Target
September 20, 2025 at 1:13 AM
Reposted by Dave Willner
No one who agrees to this is a journalist.
NEW: The Pentagon told journalists it will require them to pledge they won’t gather any information — even unclassified — that hasn’t been expressly authorized for release, and will revoke the press credentials of those who do not obey. @washingtonpost.com
Pentagon demands journalists pledge to not obtain unauthorized material
Defense Secretary Pete Hegseth is imposing strict new rules that would severely limit the ability of journalists to report on the Pentagon.
www.washingtonpost.com
September 20, 2025 at 12:27 AM
Reposted by Dave Willner
Losing my ever-loving-mind watching the same people who were just clutching their pearls claiming censorship over mean emails from WH staffers to Twitter about COVID misinfo are now HAVING THE FCC CHAIR openly threaten broadcast licenses over a joke about the president AND THE BROADCASTERS CENSOR IT
September 18, 2025 at 1:06 PM
Reposted by Dave Willner
This is a massive, history making abuse of your power. It will define your legacy and one day you will come to regret punishing free speech and trying to destroy democracy.
September 18, 2025 at 12:53 AM
Reposted by Dave Willner
This is jawboning. This is what the Freedom Caucus fascists of the Weaponization Committee and their Substack lackeys pretended was happening under some “Biden regime,” but it wasn’t. It was always projection.
September 17, 2025 at 11:04 PM
Reposted by Dave Willner
We cannot make the headlines blunter people www.theverge.com/policy/77979...
September 17, 2025 at 11:30 PM
Reposted by Dave Willner
staring straight into the camera and lying. just a despicable person and a poor excuse for a national leader.
Vance: “People on the left are much likelier to defend and celebrate political violence. This is not a both sides problem. If both sides have a problem, then one side has a much bigger and malignant problem and that is the truth.”
September 15, 2025 at 6:34 PM
Reposted by Dave Willner
We must stand resolutely against political assassination and political violence of all kinds, and just as resolutely against everyone who exploits acts of violence as the pretext or excuse for political repression of political opponents.
Very, very bad stuff coming from leading right-wingers
September 10, 2025 at 9:12 PM
We got really positive feedback on the TrustCon workshop we ran on writing good content policies for LLMs...so we're doing it again! If you're interested go sign up here, so we can start to figure out timing: forms.gle/tj7vf7ng8n7R...
Zentropi LLM Policy Writing Workshop Signup
By popular demand, we will be hosting a virtual version of our sold-out TrustCon workshop on how to write high quality content policies with and for LLMs. In this session, you will learn best practic...
forms.gle
August 27, 2025 at 6:11 PM
Reposted by Dave Willner
The agenda for the Trust and Safety Research Conference is out now. Two days of lightning talks, presentations, networking and more, with @dwillner.bsky.social as keynote. Join us!

For the full line-up and times, plus link to register, visit:

cyber.fsi.stanford.edu/content/trus...
August 20, 2025 at 6:46 PM