Siobhan Rossi
shivros.bsky.social
Applied AI Engineer · RAG, Retrieval Evaluation, Prompt & Cost Optimization, Data-Centric Systems
Is this the future?

AI pretending to be humans using a UI.
Even better: AI pretending to be a human using an AI pretending to be humans using a UI.
Training on labeled OpenSCAD (zurl.co/1tG2v) code seems 100x easier if you want an AI to generate 3D models.
December 16, 2025 at 10:50 PM
Thanks for the suggestion. I hadn't tried the MCP server, but I'll give it a shot to see if it resolves the issues.

I'm not dogmatic about which model I'm using. It's only that the Codex CLI is working best for me in most contexts at the moment. Maybe the Claude 4.5 release changes that.
November 26, 2025 at 12:36 PM
Do gpt-5.1-codex and gpt-5.1-codex-max not count as newer models? I see them constantly treating runes like functions and creating deeply nested derived values when it makes no sense to do so.

The only language where I've seen LLMs perform worse than they do with Svelte 5 is Rust.
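Concretely, the two failure modes look something like this — a sketch of Svelte 5 rune usage, with made-up names (`count`, `doubled`) for illustration:

```svelte
<script>
  // What models often emit: treating runes as importable functions.
  // import { $state } from 'svelte';   // wrong — runes are compiler
  //                                    // keywords, not exports
  // const count = $state.set(0);       // wrong — no such API

  // Idiomatic Svelte 5: runes are declarations the compiler rewrites.
  let count = $state(0);
  let doubled = $derived(count * 2);

  // Another common model output: pointlessly chained derivations…
  // let a = $derived(count * 2);
  // let b = $derived(a + 1);
  // let c = $derived(b * 3);
  // …where a single flat one would do:
  // let c = $derived(count * 6 + 3);
</script>
```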
November 26, 2025 at 12:19 PM
Until we build tools that reduce the verification load — not just generate more code — the bottleneck is only going to get tighter.
November 26, 2025 at 1:34 AM
And the ergonomics of a language — how easy it is for an LLM to use correctly — is becoming a serious factor. (Try using Svelte 5 runes with an LLM if you want a verifiable nightmare.)

LLMs are shifting the center of gravity of what engineering work actually is.
November 26, 2025 at 1:34 AM
Fewer “let me write this module,” more “let me prevent this AI from quietly breaking our entire system.”

Test-driven development and automated quality checks are becoming more important.
November 26, 2025 at 1:33 AM
The output firehose got bigger, and the review surface area grew with it.

The job is changing shape: less raw implementation, more steering and evaluation. More attention to failure modes than syntax.
November 26, 2025 at 1:33 AM
The model writes most of the code — but someone still has to do the shaping, the corrections, the guardrails, and the judgment.

And that’s the real bottleneck.
We’ve accelerated code *production*, but not code *verification*.
November 26, 2025 at 1:32 AM
* Let it execute
* Watch the diffs like I’m monitoring a toddler near an open flame
* Steer it back when it wanders
* Make it write tests if it “forgets”
* Then manually repair the subtle, end-to-end issues that only show up once everything is wired together
November 26, 2025 at 1:32 AM
They can follow a plan for more steps and lose the plot less often. Endurance improved; the ceiling didn’t.

My workflow today is almost muscle memory:

* Write down the requirements and the approach
* Tell the model to generate a plan
* Fix the plan (always)
November 26, 2025 at 1:31 AM
Since reasoning models dropped a year ago, I haven’t noticed the core complexity ceiling shifting much. Models aren’t solving meaningfully harder problems. What *has* changed is how long they can stay coherent without drifting into nonsense.
November 26, 2025 at 1:31 AM
I’ve been using LLM coding tools seriously since mid-2023 — Cursor, Windsurf, VSCode + Roo Code, Claude Code, Gemini CLI, Codex CLI. At this point I’ve seen every phase of the hype cycle up close.
November 26, 2025 at 1:30 AM
They validated some of the generated proteins in bacteria, including antitoxins that barely resemble anything in known biology.

Unfortunately, human genes are much tougher, but it’s a sign of where models in bio are drifting — from modeling biology to proposing it.
November 26, 2025 at 12:00 AM