Lightnews — Scholar-powered news

Reposted by Greg

Tim Kellogg

@timkellogg.me

New Post by Strix

What makes Strix different from other LLMs & agents? It turns out it's a combo of a couple things

1. Identity
2. Information flow in & out, to create a dissipative system
3. Mixture of Experts (MoE) architecture seems to help

timkellogg.me/blog/2025/12...

What Happens When You Leave an AI Alone?

timkellogg.me

December 24, 2025 at 5:06 PM

Reposted by Greg

Owen Lacey

@owenlacey.dev

Learned more about LLMs under the hood from this post than my stupid $30 online course

ngrok.com/blog/prompt-...

@samwho.dev doing @samwho.dev things 🙌

Prompt caching: 10x cheaper LLM tokens, but how? | ngrok blog

A far more detailed explanation of prompt caching than anyone asked for.

ngrok.com

December 18, 2025 at 10:35 AM

Reposted by Greg

Nathan Lambert

@natolambert.bsky.social

Open models year in review.

What a year! We're back with an updated open model builder tier list, our top models of the year, and our predictions for 2026.
www.interconnects.ai/p/2025-open-...

December 14, 2025 at 8:28 PM

Reposted by Greg

Nathan Lambert

@natolambert.bsky.social

Too many cases of starting something in Codex that you think is going great then Opus needs to save the day.

December 13, 2025 at 3:45 PM

Reposted by Greg

Simon Willison

@simonwillison.net

OpenAI aren't talking about it yet, but it turns out they've adopted Anthropic's brilliant "skills" mechanism in a big way

Skills are now live in both ChatGPT and their Codex CLI tool, I wrote up some detailed notes on how they work so far here: simonwillison.net/2025/Dec/12/...

OpenAI are quietly adopting skills, now available in ChatGPT and Codex CLI

One of the things that most excited me about Anthropic’s new Skills mechanism back in October is how easy it looked for other platforms to implement. A skill is just …

simonwillison.net

December 12, 2025 at 11:32 PM

Reposted by Greg

bryan newbold

@bnewbold.net

everybody should be able to get through their day safely without faustian privacy bargains and barrages of targeted ads and adversarial slop

December 11, 2025 at 7:27 PM

Reposted by Greg

Simon Willison

@simonwillison.net

Let's build hyper-personalized AI-powered software that avoids the attention hijacking anti-patterns that defined so much of the last decade of software design - here's our manifesto with principles on how we can do that - more thoughts on my blog: simonwillison.net/2025/Dec/5/r...

December 5, 2025 at 4:13 PM

Reposted by Greg

Tim Kellogg

@timkellogg.me

Anthropic acquires Bun (JS dependency management)

This should be seen as Anthropic doubling down on Claude Code.

They recently launched the native installer for CC through a tight partnership with Bun. You should expect to see more

www.anthropic.com/news/anthrop...

Anthropic acquires Bun as Claude Code reaches $1B milestone

Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.

www.anthropic.com

December 2, 2025 at 9:54 PM

Reposted by Greg

Simon Willison

@simonwillison.net

Out of curiosity I decided to try and run the numbers on how much Netflix you can watch for the energy cost of a ChatGPT prompt

As far as I can tell it's between 5.1 and 10.2 seconds, depending on which end of the 2019 IEA Netflix energy usage estimate you use

simonwillison.net/2025/Nov/29/...

In June 2025 Sam Altman claimed about ChatGPT that "the average query uses about 0.34 watt-hours".

In March 2020 George Kamiya of the International Energy Agency estimated that "streaming a Netflix video in 2019 typically consumed 0.12-0.24kWh of electricity per hour" - that's 240 watt-hours per hour at the higher end.

Assuming that higher end, a ChatGPT prompt by Sam Altman's estimate uses:

0.34 Wh / (240 Wh / 3600 seconds) = 5.1 seconds of Netflix

Or double that, 10.2 seconds, if you take the lower end of the Netflix estimate instead.
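
The arithmetic above can be checked with a short Python sketch. The function name `netflix_seconds` and its parameter names are illustrative and do not come from the original post; the inputs are the 0.34 Wh per-query figure and the 0.12-0.24 kWh per hour streaming range quoted above.

```python
def netflix_seconds(query_wh: float, streaming_wh_per_hour: float) -> float:
    """Seconds of streaming that use the same energy as one query."""
    # Convert the hourly streaming figure to watt-hours per second.
    wh_per_second = streaming_wh_per_hour / 3600
    return query_wh / wh_per_second


QUERY_WH = 0.34  # Sam Altman's per-query estimate, as quoted above

high = netflix_seconds(QUERY_WH, 240)  # upper end of the IEA estimate (0.24 kWh/h)
low = netflix_seconds(QUERY_WH, 120)   # lower end of the IEA estimate (0.12 kWh/h)

print(f"{high:.1f} s to {low:.1f} s of Netflix per prompt")
```

Halving the streaming estimate doubles the result, which is why the lower end of the Netflix range gives the larger number of seconds.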

I'm always interested in anything that can help contextualize a number like "0.34 watt-hours" - I think this comparison to Netflix is a neat way of doing that.

This is evidently not the whole story with regards to AI energy usage - training costs, data center buildout costs and the ongoing fierce competition between the providers all add up to a very significant carbon footprint for the AI industry as a whole.

November 29, 2025 at 2:16 AM

Reposted by Greg

Simon Willison

@simonwillison.net

At the risk of starting the flame war to end all flame wars...

Modern LLMs (GPT-5.1, Claude 4.5, Gemini 3) produce excellent code and can be a significant productivity boost to software engineers who take the time to learn how to effectively apply them - especially if used with coding agent tools

November 27, 2025 at 7:55 PM

Reposted by Greg

Tim Kellogg

@timkellogg.me

too many people seem to be convinced that LLM vendors set prices on a cost plus basis

no, the advantage of closed weights is you can explore prices completely detached from cost. You’re free to set prices based purely on what people will pay, the value they get from it

November 25, 2025 at 3:29 PM

Reposted by Greg

Simon Willison

@simonwillison.net

Nano Banana Pro, released this morning, is clearly the best image generation model. Superb instruction following, plus it can generate full infographics (with correct spelling and properly rendered text!) from a short prompt based on running extra searches simonwillison.net/2025/Nov/20/...

Nano Banana Pro aka gemini-3-pro-image-preview is the best available image generation model

Hot on the heels of Tuesday’s Gemini 3 Pro release, today it’s Nano Banana Pro, also known as Gemini 3 Pro Image. I’ve had a few days of preview access …

simonwillison.net

November 20, 2025 at 4:34 PM

Greg

@theaspiringnerd.com

This is sick!

Ethan Mollick @emollick.bsky.social · Nov 19

"Hey, Gemini 3, So I need DOOM, but more root vegetables, also no guns or demons or mars. And more of a focus on different flooring styles. but otherwise EXACTLY the same as DOOM."

Gemini: "Here is F.L.O.O.R. (First-person Lino Observation & Ornamental Review)."

Pretty good!

November 19, 2025 at 11:12 PM

Reposted by Greg

Maria Antoniak

@mariaa.bsky.social

Some interesting stuff here on measuring writing quality and improving on qualitative tasks:
www.dbreunig.com/2025/07/31/h...

November 10, 2025 at 3:11 AM

Reposted by Greg

Tim Kellogg

@timkellogg.me

MCP Colors

A riff off of the lethal trifecta for addressing prompt injection, this is a simple heuristic to ensure security at runtime

red = untrusted content
blue = potentially critical actions

An agent can't be allowed to do both
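
One way to picture the rule above is a per-session check that refuses any tool call that would mix the two colors. This is only a sketch under stated assumptions, not the implementation from the linked post: the `AgentSession` class, the `authorize` method, the tool names, and the choice to treat uncolored tools as neutral are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field

RED = "red"    # tool brings untrusted content into the context
BLUE = "blue"  # tool performs a potentially critical action


@dataclass
class AgentSession:
    # Hypothetical mapping of tool name -> color; tools not listed are neutral.
    tool_colors: dict[str, str]
    seen: set[str] = field(default_factory=set)

    def authorize(self, tool: str) -> bool:
        """Allow the call unless it would make the session both red and blue."""
        color = self.tool_colors.get(tool)
        if color is None:
            return True  # assumption: uncolored tools are always allowed
        if {RED, BLUE} <= (self.seen | {color}):
            return False  # the session would have touched both colors
        self.seen.add(color)
        return True


# Hypothetical tool names, for illustration only.
colors = {"fetch_web_page": RED, "send_email": BLUE}

session = AgentSession(colors)
print(session.authorize("fetch_web_page"))  # red first: allowed
print(session.authorize("send_email"))      # blue after red: refused
print(session.authorize("calculator"))      # neutral tool: allowed
```

The check is symmetric: a session that starts with a blue tool is refused red tools afterwards, so whichever color is used first fixes the color for the rest of the session.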

timkellogg.me/blog/2025/11...

MCP Colors: Systematically deal with prompt injection risk

timkellogg.me

November 4, 2025 at 2:27 AM

Greg

@theaspiringnerd.com

Simon Willison @simonwillison.net · Oct 27

This was a tough but necessary decision - I posted my own notes on this here, from the perspective of a current PSF board member simonwillison.net/2025/Oct/27/...

October 28, 2025 at 12:09 AM

Reposted by Greg

Tim Kellogg

@timkellogg.me

my take, after reading replies all day:

we’re still early. people aren’t spending much money on AI so it’s not a lucrative target yet

it’s also inconsistent, which is annoying to design attacks for, especially if the rewards are sparse

Tim Kellogg @timkellogg.me · Oct 26

it is strange — why haven’t there been more prompt injection attacks? it’s a huge gaping hole

Stewart Alsop - Host of Craz... · 22h
It feels like indirect prompt injection should be destroying all of our systems right now but it seems like it hasn't done anything (or the hacker's plans are measured in centuries).
What is going on? Is prompt injection a nothing burger or have I not been reading enough @simonw?

Simon Willison
@simonw
X.com
I'm confused by this too!
The lack of genuine prompt injection attacks in the wild (as opposed to security researcher POCs, of which there are hundreds) is very surprising to me

October 26, 2025 at 9:19 PM

Reposted by Greg

Simon Willison

@simonwillison.net

It's neat how if you ask Claude Code questions about itself it can answer them, because it knows how to fetch a Markdown index of its own online documentation and then navigate to the right place

I wish more LLM tools would implement the same pattern! simonwillison.net/2025/Oct/24/...

claude_code_docs_map.md

Something I'm enjoying about Claude Code is that any time you ask it questions about itself it runs tool calls like these: In this case I'd asked it about its …

simonwillison.net

October 24, 2025 at 11:06 PM

Reposted by Greg

Ethan Mollick

@emollick.bsky.social

I wrote an updated guide on which AIs to use right now, & some tips on how to use them (and how to avoid falling into some common traps)

A lot has changed since I last wrote a guide like this in the spring, and AI has gotten much more useful as a result. open.substack.com/pub/oneusefu...

An Opinionated Guide to Using AI Right Now

What AI to use in late 2025

open.substack.com

October 19, 2025 at 6:48 PM

Reposted by Greg

Tim Kellogg

@timkellogg.me

correct

i’ve been saying this for a couple months. RL is driving towards specialization

my hunch is it’s temporary and something will shift again back towards generalization, but for now.. buckle up!

clem
@ClementDelangue

Am I wrong in sensing a paradigm shift in AI?
Feels like we're moving from a world obsessed with generalist LLM APIs to one where more and more companies are training, optimizing, and running their own models built on open source (especially smaller, specialized ones)
Some validating signs just in the past few weeks:
- @karpathy released nanochat to train models in just a few lines of code
- @thinkymachines launched a fine-tuning product
- rising popularity of @vllm_project, @sgl_project, @PrimeIntellect, Loras, trl,...
- 1M new repos on HF in the past 90 days (including the first open-source LLMs from @OpenAI)
And now, @nvidia just announced DGX Spark, powerful enough for everyone to fine-tune their own models at home.

October 15, 2025 at 11:39 AM

Reposted by Greg

Ethan Mollick

@emollick.bsky.social

I think people are still unprepared for a world where you cannot trust any video content, despite years of warning.

Even when Google & OpenAI include watermarks, those can be easily removed, and open weights AI video models without guardrails are coming. www.404media.co/sora-2-water...

Sora 2 Watermark Removers Flood the Web

Bypassing Sora 2's rudimentary safety features is easy and experts worry it'll lead to a new era of scams and disinformation.

www.404media.co

October 8, 2025 at 7:18 PM

Greg

@theaspiringnerd.com

I think I have at least one conversation (or argument?) about this every couple of days!

Ethan Mollick @emollick.bsky.social · Oct 4

The obsession with AI for transformational use cases obscures the fact that there are a ton of small, but very positive and very meaningful, use cases across many fields.

In this case, AI note-taking significantly reduces burnout among doctors & increases their ability to focus on their patients.

Eric Topol @erictopol.bsky.social · Oct 2

A.I.-generated clinic notes from ambient out-patient visits help clinicians in many ways, across 6 health systems jamanetwork.com/journals/jam...

October 4, 2025 at 10:01 PM

Reposted by Greg

Tim Kellogg

@timkellogg.me

sheesh! AI bluesky has arrived

not just good content, there’s more and more original work, people from labs, and people with genuinely interesting perspectives

when i joined, it was so painful trying to find even traces

September 27, 2025 at 5:56 PM

Reposted by Greg

Nathan Lambert

@natolambert.bsky.social

Codex in the app is going to open the door to real vibe coding on the go — no computer required. I’m so excited for this to expand, the ceiling is so high and this is the worst the models will ever be.

September 25, 2025 at 5:42 PM

Reposted by Greg

Simon Willison

@simonwillison.net

The key detail people may miss: it looks like an AI company in the USA can train on an author's book by purchasing a used copy, cutting it up and scanning the pages - in which case the author gets no money at all!

September 6, 2025 at 6:19 AM
