Cas (Stephen Casper)
@scasper.bsky.social
AI technical gov & risk management research. PhD student @MIT_CSAIL, fmr. UK AISI. I'm on the CS faculty job market! https://stephencasper.com/
I would love to see the draft. [email protected]
January 23, 2026 at 1:46 AM
"Policy levers to mitigate AI-facilitated terrorism"

"The biggest AI incidents of 2026, and how they could have been prevented"
January 22, 2026 at 4:55 PM
"Technocrats always win: Against 'pluralistic' algorithms and pluralism-washing"

"Is Your Machine Unlearning Algorithm Better than a Bag-of-Words Classifier? (No)"

"Don’t overthink it: extremely dumb solutions that improve tamper-resistant unlearning in LLMs"

...
January 22, 2026 at 4:55 PM
And we also have to remember that in this domain, like half of the perpetrators are literal teenagers.
January 12, 2026 at 7:55 PM
But mitigations matter. There is evidence from other fields like digital piracy that reducing the accessibility of illicit things drives up sanctioned uses, even when perfect prevention isn’t possible…
January 12, 2026 at 7:55 PM
Probably by restricting their distribution on platforms like civitai under the same kind of law.

Sometimes people tell me, “that kind of stuff is not gonna work because models will still be accessible on the Internet.”…
January 12, 2026 at 7:55 PM
Finally, this seems like the right thing to do anyway. It would be a strong protection against training data depicting non-consenting people or minors. And many people might reasonably consent to their NSFW likeness being online in general without consenting to AI training on it.
January 12, 2026 at 7:30 PM
This kind of approach would make the creation of NCII-capable AI models/apps very onerous. Meanwhile, Congress probably would not run into First Amendment issues with this type of modification to fair use law.
January 12, 2026 at 7:30 PM
B: Alternatively, the developer could attest/verify that they developed the system using an externally sourced dataset known to satisfy A1-A3 for their usage.
January 12, 2026 at 7:30 PM
A3: Third, the developer would be required to preserve, for a set period (e.g. 10 years), a record of the unredacted list, all contracts signed by subjects, and their contact information.
January 12, 2026 at 7:30 PM
A2: Second, the declaration must attest that all subjects provided affirmative, informed consent for their NSFW likeness to be used to develop this technology.
January 12, 2026 at 7:30 PM
A1: First, the declaration needs to present a redacted list (unless subjects consented to non-redaction) of all individuals whose likeness was involved in developing the technology -- i.e. all humans whose likeness is depicted in pornographic training data.
January 12, 2026 at 7:30 PM
In practice, we could prohibit NCII-capable models/applications unless they are hosted/released alongside a declaration from the developer, made under penalty of perjury, that satisfies either (A1, A2, A3) or B.
January 12, 2026 at 7:30 PM
Instead, I think there is a viable alternative that doesn't categorically restrict NSFW AI capabilities: we could simply require that anyone whose NSFW likeness is used in training a model or developing software offer affirmative, informed consent.
January 12, 2026 at 7:30 PM
Arguing for categorical prohibitions of realistic NCII-capable technology on the grounds that there is no alternative might work; both of the above decisions cited the existence of viable, narrower alternative restrictions. But I wouldn't hold my breath.
January 12, 2026 at 7:30 PM
This poses First Amendment challenges in the USA.

1. Reno v. ACLU (1997) prohibits categorical restrictions on internet porn.

2. Ashcroft v. Free Speech Coalition (2002) protects simulated "child porn" as long as no actual child's likeness is involved.
January 12, 2026 at 7:30 PM
The technical reality: If we want open-weight AI models that are even slightly difficult to use/adapt for making non-consensual personalized deepfake porn, it is overwhelmingly clear from a technical perspective that we will have to limit models' overall NSFW capabilities.
January 12, 2026 at 7:30 PM
First, see this thread for some extra context.

x.com/StephenLCas...
January 12, 2026 at 7:30 PM