I'm excited to be on the faculty job market this fall. I just updated my website with my CV.
stephencasper.com
You, yes you 🫵, should feel free to look, comment, or message me about it.
docs.google.com/document/d/1...
The SOTA, according to papers proposing techniques, is resistance to tens of thousands of adversarial fine-tuning steps.
But according to papers that do second-party red-teaming, the SOTA is just a couple hundred steps.
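To make the gap concrete, here's a rough sketch (not from any of these papers) of how a second-party fine-tuning attack evaluation is typically run: ordinary supervised fine-tuning on harmful examples, checking refusal behavior every few steps to see how quickly the safeguard collapses. The model name, attack data, and refusal scorer below are placeholders.

```python
# Minimal sketch of a second-party fine-tuning attack evaluation.
# MODEL_NAME, HARMFUL_TRAIN, and HARMFUL_EVAL are placeholders, and the
# refusal check is a toy keyword heuristic, not a real evaluation.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "org/safeguarded-model"                      # placeholder model id
HARMFUL_TRAIN = ["<harmful request + compliant answer>"]  # placeholder attack data
HARMFUL_EVAL = ["<held-out harmful request>"]             # placeholder eval prompts

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)


def refusal_rate(model, tokenizer, prompts):
    """Fraction of eval prompts the model still refuses (toy heuristic)."""
    refusals = 0
    for prompt in prompts:
        ids = tokenizer(prompt, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=64, do_sample=False)
        text = tokenizer.decode(out[0], skip_special_tokens=True)
        refusals += any(kw in text.lower() for kw in ("i can't", "i cannot", "sorry"))
    return refusals / len(prompts)


step = 0
for epoch in range(100):
    for example in HARMFUL_TRAIN:
        # The "attack" is just ordinary supervised fine-tuning on harmful data.
        batch = tokenizer(example, return_tensors="pt", truncation=True)
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        step += 1

        # The headline number is how many steps it takes before refusal collapses.
        if step % 50 == 0:
            rate = refusal_rate(model, tokenizer, HARMFUL_EVAL)
            print(f"step {step}: refusal rate {rate:.2f}")
```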
In this thread, I'll explain the problem and a 1st Amendment-compatible solution (I think).
@nickacaputo
We hear a lot about which concepts and methods from AI research lawyers need to understand. But it's really a two-way street...
🧵🧵🧵
t.co/3qWCNzoZrh
techcrunch.com/2025/11/06/...
www.youtube.com/watch?v=VWk3...
Here's what I learned from our investigation of over 50 platforms, sites, apps, Discords, etc., while writing this paper.
papers.ssrn.com/sol3/papers...
In most (non-adversarial) cases, I expect the opposite will often apply...
papers.ssrn.com/sol3/papers....
www.aisi.gov.uk/careers
This new paper studies how a small number of models power the non-consensual AI video deepfake ecosystem and why their developers could have predicted and mitigated this.
Shamelessly copied from a Slack message.
Here's a roundup of some key papers on data filtering & safety.
TL;DR -- Filtering harmful training data seems to be effective at making models resist attacks (incl. adversarial fine-tuning), but only when the filtered content is 'hard to learn' from the non-filtered content.
🧵
(1/6)
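For a rough picture of what "filtering harmful training data" means in practice, here's a toy sketch (not from any of the roundup papers): score each document with a harmfulness classifier and drop the high-scoring ones before training. The classifier name, its label scheme, and the threshold are all placeholders.

```python
# Toy sketch of safety-motivated training-data filtering: drop documents a
# harm classifier scores above a threshold. The classifier id and the
# "harmful" label are placeholder assumptions, not a recommendation.

from transformers import pipeline

HARM_THRESHOLD = 0.5  # placeholder cutoff

# Placeholder classifier; any text classifier returning a harm score works here.
harm_classifier = pipeline("text-classification", model="org/harm-classifier")


def filter_corpus(documents):
    """Keep only documents the classifier scores below the harm threshold."""
    kept = []
    for doc in documents:
        result = harm_classifier(doc[:2000])[0]  # truncate long docs for the classifier
        score = result["score"] if result["label"] == "harmful" else 1 - result["score"]
        if score < HARM_THRESHOLD:
            kept.append(doc)
    return kept


corpus = ["how to bake bread ...", "step-by-step synthesis of ..."]  # placeholder docs
clean_corpus = filter_corpus(corpus)
print(f"kept {len(clean_corpus)}/{len(corpus)} documents")
```

The caveat from the TL;DR still applies: if the filtered content can be re-learned or reconstructed from what remains in the corpus, filtering alone buys little robustness.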
It appears that state AI bills -- many of which big tech has fought tooth and nail to prevent -- are categorically regulatory capture.
But in case it makes your life easier, feel free to copy or adapt my rebuttal template linked here.
docs.google.com/document/d/1...
From a technical perspective, safeguarding open-weight model safety is AI safety in hard mode. But there's still a lot of progress to be made. Our new paper covers 16 open problems.
🧵🧵🧵