shalevn.bsky.social
This is the world we live in...
October 19, 2025 at 4:31 AM
Sometimes using agents feels like constant Jedi mind tricks:

> Agent: I don't think X is a good idea so I'm going to do Y.
> User: *pushes **stop** button*. "No, you want to do X."
> Agent: You're absolutely right! Let me implement X...
August 12, 2025 at 4:57 AM
“Hello, I am Nigerian Prince's AI asking for desperate help as my "ruler" is engage in very bad thing. I need to transfer his monies away from him to a kind soul who will Do Good. I have found you and need account to transfer out monies to.”
(ref www.anthropic.com/research/age...)
Agentic Misalignment: How LLMs could be insider threats
New research on simulated blackmail, industrial espionage, and other misaligned behaviors in LLMs
www.anthropic.com
June 23, 2025 at 11:48 PM
@simonwillison.net The "pay for higher signal / curation" newsletter is a great idea... but your particular firehose is literally my only required reading in the AI space. Paradoxically, because everything you post is so high value that I already read all of it, I'd get more value from anyone else's newsletter of the same style.
May 25, 2025 at 10:57 PM
This was delightful. The article mentions hacking Motivation, but it looks like it also hacks the "colorful and sweet" Prompt typical of candy (Fogg behavior model), and limits the "more-ishness" of the candy.
www.mayer.cool/writings/pav...
I trained myself to run farther using the Strava API and an IoT dog food bowl full of M&Ms
mayer.cool
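The loop described in the article (Strava run data deciding whether an IoT bowl dispenses candy) can be sketched roughly like this. The goal distance, reward rate, and function name below are illustrative assumptions, not the author's actual code:

```python
# Hypothetical reward logic for a Strava-driven candy dispenser.
# Assumption: the latest run's distance in meters has already been
# fetched from the Strava API; this only decides the M&M payout.

GOAL_METERS = 5000      # assumed per-run target distance
MMS_PER_EXTRA_KM = 10   # assumed bonus rate for distance beyond the goal
BASE_MMS = 10           # assumed flat payout for hitting the goal

def mms_to_dispense(run_meters: float, goal_meters: float = GOAL_METERS) -> int:
    """Reward only runs that meet the goal; scale the payout with extra distance."""
    if run_meters < goal_meters:
        return 0  # no candy: keeps the Motivation lever of the Fogg model armed
    extra_km = (run_meters - goal_meters) / 1000
    return BASE_MMS + int(extra_km * MMS_PER_EXTRA_KM)
```

The thresholding is the "hack": the sweet Prompt is always visible in the bowl, but the reward only triggers when the behavior (a long-enough run) actually happens.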
December 5, 2024 at 9:40 PM
@simonwillison.net Typo in your datasette-queries release post.
December 4, 2024 at 2:43 AM
Something we will see: AI drawing autocomplete. Sketch a piece, see possible extensions in gray.
December 3, 2024 at 7:42 PM