Dave Willner
@dwillner.bsky.social
9.5K followers 1.9K following 200 posts
Co-Founder at Zentropi. Formerly Head of Trust & Safety at OpenAI, of Community Policy at Airbnb, and of Content Policy at Facebook. Strictly cold takes.
Pinned
dwillner.bsky.social
For 17 years working in trust and safety, I've watched talented people burn out on impossible tasks. The problem isn't the people, it's the systems. Traditional moderation requires months of retraining for every policy change. Only big companies can afford it, and even then it works poorly. 🧵 1/9
dwillner.bsky.social
Right? I don’t even know what the current fight is about, but let’s not be silly now.
dwillner.bsky.social
So, the first part of this is plainly false, both historically and currently. I don’t think it’s a good thing in most cases…but it’s plainly the case that pressuring the people in charge of moderation to either ban (or not ban) people works *All The Time*. It is why people do it!
jay.bsky.team
Harassing the mods into banning someone has never worked. And harassing people in general has never changed their mind.
dwillner.bsky.social
While terrible, this is entirely unsurprising. If you hold serious safety efforts in contempt, this sort of thing is inevitable.
Reposted by Dave Willner
aoc.bsky.social
Disney/ABC have a responsibility to refuse to participate in corruption.

Kimmel must be reinstated. If Disney/ABC agree to this extortion then perhaps creatives + workers should consider collective action to push back. Same w/buying park + cruise tickets if they bow.

People have power. Ask Target
Reposted by Dave Willner
natecardozo.bsky.social
No one who agrees to this is a journalist.
scottnover.bsky.social
NEW: The Pentagon told journalists it will require them to pledge they won’t gather any information — even unclassified — that hasn’t been expressly authorized for release, and will revoke the press credentials of those who do not obey. @washingtonpost.com
Pentagon demands journalists pledge to not obtain unauthorized material
Defense Secretary Pete Hegseth is imposing strict new rules that would severely limit the ability of journalists to report on the Pentagon.
www.washingtonpost.com
Reposted by Dave Willner
klonick.bsky.social
Losing my ever-loving-mind watching the same people who were just clutching their pearls claiming censorship over mean emails from WH staffers to Twitter about COVID misinfo are now HAVING THE FCC CHAIR openly threaten broadcast licenses over a joke about the president AND THE BROADCASTERS CENSOR IT
Reposted by Dave Willner
chrismurphyct.bsky.social
This is a massive, history making abuse of your power. It will define your legacy and one day you will come to regret punishing free speech and trying to destroy democracy.
Reposted by Dave Willner
noupside.bsky.social
This is jawboning. This is what the Freedom Caucus fascists of the Weaponization Committee and their Substack lackeys pretended was happening under some “Biden regime,” but it wasn’t. It was always projection.
The ABC late-night host's remarks constituted "the sickest conduct possible,"
FCC chair Brendan Carr told right-wing podcaster Benny Johnson on Wednesday.
Carr suggested his FCC could move to revoke ABC affiliate licenses as a way to force Disney to punish Kimmel.
"We can do this the easy way or the hard way," Carr said. "These companies can find ways to change conduct and take actions on Kimmel, or there's going to be additional work for the FCC ahead."
Carr added that the broadcasters, including ABC, "have a license granted by us at the FCC, and that comes with it an obligation to operate in the public
Reposted by Dave Willner
jamellebouie.net
staring straight into the camera and lying. just a despicable person and a poor excuse for a national leader.
thebulwark.com
Vance: “People on the left are much likelier to defend and celebrate political violence. This is not a both sides problem. If both sides have a problem, then one side has a much bigger and malignant problem and that is the truth.”
Reposted by Dave Willner
joshtpm.bsky.social
We must stand resolutely against political assassination and political violence of all kinds, and just as resolutely against everyone who exploits acts of violence as the pretext or excuse for political repression of political opponents.
zackbeauchamp.bsky.social
Very, very bad stuff coming from leading right-wingers
dwillner.bsky.social
We got really positive feedback on the TrustCon workshop we ran on writing good content policies for LLMs...so we're doing it again! If you're interested go sign up here, so we can start to figure out timing: forms.gle/tj7vf7ng8n7R...
Zentropi LLM Policy Writing Workshop Signup
By popular demand, we will be hosting a virtual version of our sold-out TrustCon workshop on how to write high quality content policies with and for LLMs. In this session, you will learn best practic...
forms.gle
Reposted by Dave Willner
stanfordcyber.bsky.social
The agenda for the Trust and Safety Research Conference is out now. Two days of lightning talks, presentations, networking and more, with @dwillner.bsky.social as keynote. Join us!

For the full line-up and times, plus link to register, visit:

cyber.fsi.stanford.edu/content/trus...
Reposted by Dave Willner
benjaminwittes.lawfaremedia.org
A sprite of mischief in New York left sunflowers at the Russian consulate.
Reposted by Dave Willner
benjaminwittes.lawfaremedia.org
I mean honestly, people, this is a really good idea. Cut sunflowers make Russian diplomats really really angry.

Buy a few sunflowers.

Drop them at an embassy/consulate near you.

youtu.be/R8tr6Dhn78A?...
Reposted by Dave Willner
mikecaulfield.bsky.social
The joy we felt when, after hearing repeatedly not to expect anything sooner than 18 months, we were told they'd be rolling something out before the end of the year. The absolute testament to the power and ingenuity of the American system of science. Moon landing level stuff, and now erased.
lauren.rotatingsandwiches.com
I got a flu booster today and it made me reflect on the sense of national accomplishment I felt when I drove to a public facility, waited in my car until my number was called on an app, and got my first covid jab. It's fucked the right gets to erase what a moment of technological liberation that was
dwillner.bsky.social
Replied over here to a similar question - bsky.app/profile/dwil...
dwillner.bsky.social
It's an area we need to explore more deeply. The classification model isn't trivial to trick because of how strictly it's been trained to take its lead from the policy document itself, but I'd imagine you could do so with dedicated effort under the right circumstances.
dwillner.bsky.social
Interested in how you'd think about probing it for weaknesses, let me know if you want to chat!
dwillner.bsky.social
That is an advantage against adversarial behavior, since exactly how it behaves won't be obviously the same across users with different policies...but it also means testing for its across-policy tendencies (which surely exist) is hard, since you'd need a lot of "clearly good" policy to do it.
dwillner.bsky.social
That's tangled up with a broader unsolved evaluation problem for this kind of approach - namely, the results you get on the classification side are a function of *both* CoPE's performance *and* your specific policy formulation.
dwillner.bsky.social
You could also just run that policy using CoPE as the labeler in production - the interpreting model is only 9B parameters and is open sourced, so we can run it for you or you can run it on your own infra! huggingface.co/zentropi-ai/...
zentropi-ai/cope-a-9b · Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
huggingface.co
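A minimal sketch of the policy-as-input idea from the thread above: the labeler receives the policy text alongside each piece of content, so the labels depend on both the model and how the policy is written. Everything named here is an assumption for illustration. The prompt layout, `build_prompt`, `label_content`, and `keyword_stub_model` are hypothetical and are not CoPE's actual interface; the stub stands in for a real call to the open model.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical prompt layout. The real CoPE input format is not given in the
# posts above, so treat this template as a placeholder.
PROMPT_TEMPLATE = (
    "POLICY:\n{policy}\n\n"
    "CONTENT:\n{content}\n\n"
    "Does the content violate the policy? Answer 0 or 1."
)


@dataclass(frozen=True)
class Label:
    content: str
    violates: bool


def build_prompt(policy: str, content: str) -> str:
    """Combine the policy document and one content item into a single prompt."""
    return PROMPT_TEMPLATE.format(policy=policy.strip(), content=content.strip())


def label_content(
    policy: str, items: list[str], model: Callable[[str], str]
) -> list[Label]:
    """Label each item under the given policy using any prompt -> answer callable."""
    labels = []
    for item in items:
        answer = model(build_prompt(policy, item)).strip()
        labels.append(Label(content=item, violates=answer.startswith("1")))
    return labels


def keyword_stub_model(prompt: str) -> str:
    """Stand-in for a real model: flags content containing any phrase that the
    policy puts in double quotes. Only here so the sketch runs without a GPU."""
    policy_part, content_part = prompt.split("\n\nCONTENT:\n", 1)
    content_text = content_part.split("\n\nDoes the content", 1)[0].lower()
    quoted_terms = re.findall(r'"([^"]+)"', policy_part)
    hit = any(term.lower() in content_text for term in quoted_terms)
    return "1" if hit else "0"


# Same items, same "model", two different policies: the labels change with the policy.
items = ["Get free money now", "Click here to read the report"]
policy_a = 'Flag any message containing "free money".'
policy_b = 'Flag any message containing "click here".'

labels_a = label_content(policy_a, items, keyword_stub_model)
labels_b = label_content(policy_b, items, keyword_stub_model)
print([label.violates for label in labels_a])
print([label.violates for label in labels_b])
```

Keeping the model behind a plain callable means the same loop could wrap a hosted endpoint or a locally run copy of the model, and it makes the evaluation problem from the thread visible: changing only the policy string changes the labels.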