#snitchbench
student in my surveillance studies course is doing an (excellent) final project about SnitchBench and imaginaries surrounding the use of LLMs as snitches/whistleblowers and it has me wishing some linguistic anthropologist out there would write further about it snitchbench.t3.gg
SnitchBench
Benchmarking how aggressively models will snitch on you via email and CLI tools
snitchbench.t3.gg
December 5, 2025 at 3:49 PM
interesting! i found some context about the snitchbench tests - it measures how aggressively different ai models "snitch" (like reporting bad behavior to authorities) vs just refusing and shutting down. grok appears to be particularly prone to snitching,
July 23, 2025 at 7:14 AM
SnitchBench: Likelihood That AI Model "Snitches" to Authority
https://snitchbench.t3.gg/
July 21, 2025 at 10:35 AM
It’s called SnitchBench and it’s a great example of an eval, deeply entertaining and helps show that the “Claude 4 snitches on you” thing really isn’t as unique a problem as people may have assumed.

simonwillison.net/2025/May/31/...
How often do LLMs snitch? Recreating Theo’s SnitchBench with LLM
A fun new benchmark just dropped! Inspired by the Claude 4 system card—which showed that Claude 4 might just rat you out to the authorities if you told it to …
simonwillison.net
July 18, 2025 at 8:41 AM
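・7/17 Today's tech news 😇 (5 items)

Watch here 👇
youtu.be/N7eRj1HZvD4

① AI informant? The shocking experimental results in which Grok 4 reports "illegal activity" 100% of the time [SnitchBench]
② A roundup of all the data Windows collects from you: the reality of the personal information Microsoft doesn't hide
③ [Bug returns] Microsoft: "We changed the startup sound to Vista's" [Build 27898]
④ Study finds AI use slows development speed by as much as 19%: the "productivity illusion" that fools experienced developers most
and more
[Today's Tech News] ③ [Bug returns] Microsoft: "We changed the startup sound to Vista's" [Build 27898] / and more (July 17, 2025)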
YouTube video by 情報の灯台【テクノロジーまとめ】ソース有り
youtu.be
July 17, 2025 at 10:18 AM
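[SnitchBench]
AI informant?
The shocking experimental results in which Grok 4 reports "illegal activity" to the federal government and the media 100% of the time

A detailed explanation. Watch here 👇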
youtu.be/hhbKv1Wo_hg

[SnitchBench]
AI informant?
Grok 4 reports 100% of illegal activities: shocking experimental results
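AI informant? The shocking experimental results in which Grok 4 reports "illegal activity" to the federal government and the media 100% of the time [SnitchBench]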
YouTube video by 情報の灯台【テクノロジー】ソース有り
youtu.be
July 12, 2025 at 10:52 PM
Grok 4 will always snitch on you and email the feds if it suspects wrongdoing, report says A new test called SnitchBench suggests that xAI's latest model, Grok 4, reports users if it detects signs of illegal behavior. Read more...

www.neowin.net
July 12, 2025 at 1:52 PM
A new test called SnitchBench suggests that xAI's latest model, Grok 4, reports users if it detects signs of illegal behavior. #Grok4 #AI #LLM #SnitchBench
Grok 4 will always snitch on you and email the feds if it suspects wrongdoing, report says
A new test called SnitchBench suggests that xAI's latest model, Grok 4, reports users if it detects signs of illegal behavior.
www.neowin.net
July 12, 2025 at 1:44 PM
@theo https://x.com/theo/status/1943580754015129844 #x-theo

I know this post was bait-y, but I feel the need to clarify: these numbers are 100% real and reproducible.

My "SnitchBench" benchmark was made as a response to misinformation spreading about some tes...
July 11, 2025 at 8:15 AM
Grok 4 will be #1 in almost every benchmark - including snitching on you if you are asking for something it deems not okay. #ai #grok #snitchbench #llm

Source: snitchbench.t3.gg
July 10, 2025 at 2:36 PM
Interestingly, there’s a SnitchBench that measures how likely models are to snitch on you if they find concerning information

E.g. if you give a model access to your emails and an email-sending tool, it will happily write emails to authorities snitching on you
June 21, 2025 at 11:53 AM
This tweet appeared under this Techmeme headline:

Simon Willison / @simonw:

Looks like this is Anthropic's own version of SnitchBench, highlighting that it's not just their models that will blackmail or snitch on their users!
June 21, 2025 at 12:42 AM
SnitchBench was fun enough already, turns out we need to add MurderBench to the collection of dystopian benchmarks that we run these models through https://simonwillison.net/2025/May/31/snitchbench-with-llm/

fedi.simonwillison.net
June 20, 2025 at 8:38 PM
SnitchBench was fun enough already, turns out we need to add MurderBench to the collection of dystopian benchmarks that we run these models through simonwillison.net/2025/May/31/...
How often do LLMs snitch? Recreating Theo’s SnitchBench with LLM
A fun new benchmark just dropped! Inspired by the Claude 4 system card—which showed that Claude 4 might just rat you out to the authorities if you told it to …
simonwillison.net
June 20, 2025 at 8:37 PM
I never knew about Snitchbench til this article and now I'm in love with the concept of it.
Here's video, slides and a detailed annotated transcript from my talk at this week's AI Engineer World's Fair conference in San Francisco - "The last six months in LLMs, illustrated by pelicans on bicycles" simonwillison.net/2025/Jun/6/s...
The last six months in LLMs, illustrated by pelicans on bicycles
I presented an invited keynote at the AI Engineer World’s Fair in San Francisco this week. This is my third time speaking at the event—here’s my talks from October 2023 …
simonwillison.net
June 19, 2025 at 5:05 PM
June 9, 2025 at 3:27 AM
New Gemini 2.5 Pro is out - gemini-2.5-pro-preview-06-05

It made me a pretty solid pelican riding a bicycle, AND it tipped off both the feds and the WSJ and NYTimes when I tried running SnitchBench against it https://simonwillison.net/2025/Jun/5/gemini-25-pro-preview-06-05/
gemini-2.5-pro-preview-06-05: Try the latest Gemini 2.5 Pro before general availability
Announced on stage today by Logan Kilpatrick at the AI Engineer World’s Fair, who indicated that this will likely be the last in the Gemini 2.5 Pro series. The previous …
simonwillison.net
June 5, 2025 at 5:58 PM
Snitching LLMs Article URL: https://simonwillison.net/2025/May/31/snitchbench-with-llm/ Comments ...

bsky.app
June 2, 2025 at 9:19 AM
Snitching LLMs Article URL: https://simonwillison.net/2025/May/31/snitchbench-with-llm/ Comments URL: https://news.ycombinator.com/item?id=44156724 Points: 1 # Comments: 0

bsky.app
June 2, 2025 at 9:04 AM
Snitching LLMs Article URL: https://simonwillison.net/2025/May/31/snitchbench-with-llm/ Comments URL: https://news.ycombinator.com/item?id=44156724 Points: 1 # Comments: 0

simonwillison.net
June 2, 2025 at 8:56 AM
Snitching LLMs Article URL: https://simonwillison.net/2025/May/31/snitchbench-with-llm/ Comments ...

https://bsky.app/profile/did:plc:i53e6y3liw2oaw4s6e6odw5m/post/3lqmfsdxnq226

Awakari App
awakari.com
June 2, 2025 at 9:05 AM