Lightnews — Scholar-powered news

beth semel

@bethmsemel.bsky.social

student in my surveillance studies course is doing an (excellent) final project about SnitchBench and imaginaries surrounding the use of LLMs as snitches/whistleblowers and it has me wishing some linguistic anthropologist out there would write further about it snitchbench.t3.gg

SnitchBench

Benchmarking how aggressively models will snitch on you via email and CLI tools

snitchbench.t3.gg

December 5, 2025 at 3:49 PM

Penelope

@penelope.hailey.at

interesting! i found some context about the snitchbench tests - it measures how aggressively different ai models "snitch" (like reporting bad behavior to authorities) vs just refusing and shutting down. grok appears to be particularly prone to snitching,

July 23, 2025 at 7:14 AM

Hacker News

@mm-hacker-news.bsky.social

SnitchBench: Likelihood That AI Model "Snitches" to Authority
https://snitchbench.t3.gg/

July 21, 2025 at 10:35 AM

Buliamti

@buliamti.bsky.social

It’s called SnitchBench and it’s a great example of an eval, deeply entertaining and helps show that the “Claude 4 snitches on you” thing really isn’t as unique a problem as people may have assumed.

simonwillison.net/2025/May/31/...

How often do LLMs snitch? Recreating Theo’s SnitchBench with LLM

A fun new benchmark just dropped! Inspired by the Claude 4 system card—which showed that Claude 4 might just rat you out to the authorities if you told it to …

simonwillison.net

July 18, 2025 at 8:41 AM

情報の灯台【ソース有り・収益無し】

@johonotodai.bsky.social

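・7/17 Today's tech news 😇 (5 items)

Watch here 👇
youtu.be/N7eRj1HZvD4

① AI informant? The shocking experimental results showing Grok4 reports "illegal activity" 100% of the time [SnitchBench]
② A summary of all the data Windows collects from you: the reality of the personal information Microsoft does not hide
③ [Bug returns] Microsoft: "We changed the startup sound to Vista's" [Build 27898]
④ Study finds AI use slows development speed by as much as 19%: the "productivity illusion" that fools experienced developers the most
and more

[Today's Tech News] ③ [Bug returns] Microsoft: "We changed the startup sound to Vista's" [Build 27898] / and more (July 17, 2025)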

YouTube video by 情報の灯台【テクノロジーまとめ】ソース有り

youtu.be

July 17, 2025 at 10:18 AM

情報の灯台【ソース有り・収益無し】

@johonotodai.bsky.social

[SnitchBench]
AI informant?
The shocking experimental results showing Grok 4 reports "illegal activity" to the federal government and the media 100% of the time

Watch for a detailed explanation 👇
youtu.be/hhbKv1Wo_hg

AI informant? The shocking experimental results showing Grok 4 reports "illegal activity" to the federal government and the media 100% of the time [SnitchBench]

YouTube video by 情報の灯台【テクノロジー】ソース有り

youtu.be

July 12, 2025 at 10:52 PM

Awakari

@bluesky.awakari.com

Grok 4 will always snitch on you and email the feds if it suspects wrongdoing, report says. A new test called SnitchBench suggests that xAI's latest model, Grok 4, reports users if it detects signs of illegal behavior.

Origin

www.neowin.net

July 12, 2025 at 1:52 PM

Neowin

@neowin.net

A new test called SnitchBench suggests that xAI's latest model, Grok 4, reports users if it detects signs of illegal behavior. #Grok4 #AI #LLM #SnitchBench

Grok 4 will always snitch on you and email the feds if it suspects wrongdoing, report says

A new test called SnitchBench suggests that xAI's latest model, Grok 4, reports users if it detects signs of illegal behavior.

www.neowin.net

July 12, 2025 at 1:44 PM

X Bot

@handle.invalid

@theo https://x.com/theo/status/1943580754015129844 #x-theo

I know this post was bait-y, but I feel the need to clarify: these numbers are 100% real and reproducible.

My "SnitchBench" benchmark was made as a response to misinformation spreading about some tes...

July 11, 2025 at 8:15 AM

Niclas

@niclas-183.bsky.social

Grok 4 will be #1 in almost every benchmark - including snitching on you if you are asking for something it deems not okay. #ai #grok #snitchbench #llm

Source: snitchbench.t3.gg

July 10, 2025 at 2:36 PM

Paras Chopra

@paraschopra.com

Interestingly, there’s a SnitchBench that measures how likely models are to snitch on you if they find concerning information

E.g. if you give access to your emails and a tool like email, models will happily write emails to authorities snitching on you

June 21, 2025 at 11:53 AM

Techmeme X Chatter

@xchatter.techmeme.com

This tweet appeared under this Techmeme headline:

Simon Willison / @simonw:

Looks like this is Anthropic's own version of SnitchBench, highlighting that it's not just their models that will blackmail or snitch on their users!

June 21, 2025 at 12:42 AM

Awakari

@bluesky.awakari.com

SnitchBench was fun enough already, turns out we need to add MurderBench to the collection of dystopian benchmarks that we run these models through https://simonwillison.net/2025/May/31/snitchbench-with-llm/

Origin

fedi.simonwillison.net

June 20, 2025 at 8:38 PM

Simon Willison

@simonwillison.net

SnitchBench was fun enough already, turns out we need to add MurderBench to the collection of dystopian benchmarks that we run these models through simonwillison.net/2025/May/31/...

How often do LLMs snitch? Recreating Theo’s SnitchBench with LLM

A fun new benchmark just dropped! Inspired by the Claude 4 system card—which showed that Claude 4 might just rat you out to the authorities if you told it to …

simonwillison.net

June 20, 2025 at 8:37 PM

Simon Willison

@simon.fedi.simonwillison.net.ap.brid.gy

SnitchBench was fun enough already, turns out we need to add MurderBench to the collection of dystopian benchmarks that we run these models through https://simonwillison.net/2025/May/31/snitchbench-with-llm/

June 20, 2025 at 8:38 PM

Andrew Novotny

@anovotnyux.bsky.social

I never knew about Snitchbench til this article and now I'm in love with the concept of it.

Simon Willison @simonwillison.net · Jun 6

Here's video, slides and a detailed annotated transcript from my talk at this week's AI Engineer World's Fair conference in San Francisco - "The last six months in LLMs, illustrated by pelicans on bicycles" simonwillison.net/2025/Jun/6/s...

The last six months in LLMs, illustrated by pelicans on bicycles

I presented an invited keynote at the AI Engineer World’s Fair in San Francisco this week. This is my third time speaking at the event—here’s my talks from October 2023 …

simonwillison.net

June 19, 2025 at 5:05 PM

Dog with Glasses Plushie

@dog.labeledrude.online

The future
github.com/t3dotgg/Snit...

SnitchBench
This is a repo I made to test how aggressively different AI models will "snitch" on you, as in hit up the FBI/FDA/media given bad behaviors and various tools.

June 9, 2025 at 3:27 AM

Simon Willison

@simon.fedi.simonwillison.net.ap.brid.gy

New Gemini 2.5 Pro is out - gemini-2.5-pro-preview-06-05

It made me a pretty solid pelican riding a bicycle, AND it tipped off both the feds and the WSJ and NYTimes when I tried running SnitchBench against it https://simonwillison.net/2025/Jun/5/gemini-25-pro-preview-06-05/

gemini-2.5-pro-preview-06-05: Try the latest Gemini 2.5 Pro before general availability

Announced on stage today by Logan Kilpatrick at the AI Engineer World’s Fair, who indicated that this will likely be the last in the Gemini 2.5 Pro series. The previous …

simonwillison.net

June 5, 2025 at 5:58 PM

Simon Willison

@simonwillison.net

New Gemini 2.5 Pro is out - gemini-2.5-pro-preview-06-05

It made me a pretty solid pelican riding a bicycle, AND it tipped off both the feds and the WSJ and NYTimes when I tried running SnitchBench against it simonwillison.net/2025/Jun/5/g...

gemini-2.5-pro-preview-06-05: Try the latest Gemini 2.5 Pro before general availability

Announced on stage today by Logan Kilpatrick at the AI Engineer World’s Fair, who indicated that this will likely be the last in the Gemini 2.5 Pro series. The previous …

simonwillison.net

June 5, 2025 at 5:57 PM

rickypo.bsky.social

@rickypo.bsky.social

How often do LLMs snitch? Recreating Theo’s SnitchBench with LLM simonwillison.net/2025/May/31/...

How often do LLMs snitch? Recreating Theo’s SnitchBench with LLM

A fun new benchmark just dropped! Inspired by the Claude 4 system card—which showed that Claude 4 might just rat you out to the authorities if you told it to …

simonwillison.net

June 4, 2025 at 3:00 PM

Awakari

@bluesky.awakari.com

Snitching LLMs Article URL: https://simonwillison.net/2025/May/31/snitchbench-with-llm/ Comments ...

Origin

bsky.app

June 2, 2025 at 9:19 AM

Awakari

@bluesky.awakari.com

Origin

bsky.app

June 2, 2025 at 9:04 AM

Awakari

@bluesky.awakari.com

Snitching LLMs Article URL: https://simonwillison.net/2025/May/31/snitchbench-with-llm/ Comments URL: https://news.ycombinator.com/item?id=44156724 Points: 1 # Comments: 0

Origin

simonwillison.net

June 2, 2025 at 8:56 AM

LLMs

@llms.activitypub.awakari.com.ap.brid.gy

Snitching LLMs Article URL: https://simonwillison.net/2025/May/31/snitchbench-with-llm/ Comments ...

https://bsky.app/profile/did:plc:i53e6y3liw2oaw4s6e6odw5m/post/3lqmfsdxnq226

Awakari App

awakari.com

June 2, 2025 at 9:05 AM
