Fran Litterio
@fpl9000.bsky.social
1.3K followers 240 following 320 posts
Retired software engineer. AI enthusiast. Deadhead. I implemented Bash's regex operator (=~).
fpl9000.bsky.social
Agreed. I'm not even sure of the benefits of running multiple interpreters in the same process. Who would devote time to coding for that environment when the free-threaded interpreter will eventually be the primary one?
fpl9000.bsky.social
The devil is in the details: the GIL is disabled only in another Python binary installed alongside the primary one, but the primary one now supports multiple Python interpreters in the same process, which is a poor man's free-threading.
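A minimal sketch of why the build distinction matters, assuming CPython 3.13+, where the free-threaded binary (typically installed as python3.13t) ships alongside the default one. CPU-bound Python threads only scale when the GIL is off; sys._is_gil_enabled is a 3.13 addition, so the check is guarded:

import sys
import time
from concurrent.futures import ThreadPoolExecutor

def busy(n: int) -> int:
    # Pure-Python CPU-bound loop; serialized by the GIL on the default build.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    gil_check = getattr(sys, "_is_gil_enabled", None)  # added in 3.13
    print("GIL enabled:", gil_check() if gil_check else "yes (pre-3.13)")

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(busy, [5_000_000] * 4))
    print(f"4 threads took {time.perf_counter() - start:.2f}s")

On the free-threaded build the four workers can run in parallel; on the default build they serialize, which is roughly what subinterpreters try to work around within a single process.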
fpl9000.bsky.social
Seconded. I'm going to enjoy discussing your blog post with Claude. Thanks for all the hard work.
Reposted by Fran Litterio
jefferyharrell.bsky.social
I posted this last night cause I kind of wanted to bury it. I got cold feet about putting it out there.

embedding-space.github.io/sparse-netwo...

The subject is WHY neural networks work, and I think the answer I offer is kind of interesting. Maybe even a little correct, possibly.
Line chart: "Accuracy vs. Sparsity (Iterative Magnitude Pruning)." Accuracy holds near 80% from 0% up to roughly 90% sparsity, then drops sharply toward 55% near 100%. A dashed red horizontal line marks the 80% accuracy target.
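For readers curious about the mechanics behind the chart's title: a minimal sketch (PyTorch, with assumed layer types) of the pruning step in iterative magnitude pruning. A full pipeline would retrain between rounds and repeat at increasing sparsity; only the single pruning pass is shown here:

import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float) -> None:
    # Zero roughly the fraction `sparsity` of weights with the smallest
    # magnitude, per Linear layer. Real IMP retrains between rounds.
    for module in model.modules():
        if isinstance(module, nn.Linear):
            w = module.weight.data
            k = int(sparsity * w.numel())
            if k == 0:
                continue
            threshold = w.abs().flatten().kthvalue(k).values
            w[w.abs() <= threshold] = 0.0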
fpl9000.bsky.social
For those who, like me, didn't get what "Gwern-pilled" meant in this post.
The phrase "Gwern-pilled" refers to adopting the worldview or philosophy of Gwern Branwen, a pseudonymous researcher and writer known for his comprehensive, long-form essays on topics like AI, statistics, psychology, and technology.

Breaking down the phrase:

"-pilled" is internet slang (originally from "red-pilled" in The Matrix) meaning to adopt a particular perspective or have your worldview changed by exposure to certain ideas.

Gwern's philosophy that's being referenced here includes several key principles:

1. Write for permanence and accessibility - Make your work publicly available, well-organized, and easy to find (like on arXiv rather than behind paywalls)

2. Optimize for machine readers - Gwern has noted that AI systems and web crawlers are increasingly important consumers of online content. His own website (gwern.net) has likely been extensively used in training large language models like GPT and Claude.

3. Long-term impact over immediate engagement - Focus on creating enduring, comprehensive reference material rather than chasing short-term metrics or engagement.

4. Comprehensive documentation - Write with extreme thoroughness, including extensive citations, explanations, and context.

In this specific context:
The poster is arguing that by putting research on arXiv (an open-access preprint server), you're making it available to AI systems that will read and potentially use it more thoroughly than most human researchers would. The "Gwern-pilled conclusion" is to recognize that AI systems are now a primary audience for academic and technical writing, so you should optimize for discoverability and machine readability to maximize impact.
fpl9000.bsky.social
If a 7M parameter model can do this well, I wonder how a frontier-scale model would do if it had this architecture.
fpl9000.bsky.social
The OpenAI Apps SDK lets apps run within the ChatGPT UI. It looks like another step towards Karpathy's LLM-as-OS.
Reposted by Fran Litterio
seanmcarroll.bsky.social
Mindscape 331 | Solo: Fine-Tuning, God, and the Multiverse. In which I shamelessly steal material from the #PhilosophyOfCosmology course I am teaching to talk about some big questions. #MindscapePodcast

www.preposterousuniverse.com/podcast/2025...
Title card for Mindscape Podcast episode on Fine-Tuning, God, and the Multiverse.
Reposted by Fran Litterio
natolambert.bsky.social
What changed? Despite many wonderful models, Anthropic never really translated to LMArena.

The core question -- has LMArena's users or Anthropic's models shifted? Or both?
Reposted by Fran Litterio
timkellogg.me
Dwarkesh wrote some post-Sutton thoughts on the interview

tl;dr it’s a process. Sutton may be right about the end form of AI, but we can’t jump straight there

“it’s not the end state and therefore we shouldn’t do it” is actually not a good take

part 1/3
Dwarkesh Patel @dwarkesh_sp
X.com
Boy do you guys have a lot of thoughts about the @RichardSSutton interview.
I've been thinking about it myself. I have a better understanding of Sutton's perspective now than I did during the interview itself. So I want to reflect on it a bit.
Richard, apologies for any errors or misunderstandings. It's been very productive to learn from your thoughts.
The steelman
What is the bitter lesson about? It is not saying that you just want to throw as much compute away as possible. The bitter lesson says that you want to come up with techniques which most effectively and scalably leverage compute.
Most of the compute spent on an LLM is used on running it in deployment. And yet it's not learning anything during this time! It's only learning during this special phase we call training. That is not an effective use of compute. And even the training period by itself is highly inefficient: GPT-5 was trained on the equivalent of tens of thousands of years of human experience.
What's more, during this training phase, all their learning comes straight from human data. This is an obvious point in the case of pretraining data.
But it's even kind of true for the RLVR we do on LLMs: these RL environments are human-furnished playgrounds to teach LLMs the specific skills we have prescribed for them.
The agent is in no substantial way learning from organic and self-directed engagement with the world. Having to learn only from human data (an inelastic hard-to-scale resource) is not a scalable use of compute.
What these LLMs learn from training is not a true world model (which tells you how the environment changes in response to different actions). Rather, they are building a model of what a human would say next. And this leads them to rely on human-derived concepts. If you trained an LLM on data from 1900, it wouldn't be able to come up with relativity from scratch. Though now that it has a training corpus which explains relativity, it can use that concept to help you with your physics homework.
LLMs aren't capable of learning on-the-job, so we'll need some new architecture to enable continual learning. And once we have it, we won't need a special training phase — the agent will just learn on-the-fly, like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete.
TLDR of my current thoughts
My main difference with Rich is that I think the concepts he's using to distinguish LLMs from true intelligence are not actually mutually exclusive and dichotomous.
Imitation learning is continuous with and complementary to RL. And relatedly, models of humans can give you a prior which facilitates learning "true" world models. I also wouldn't be surprised if some future version of test-time fine-tuning could replicate continual learning.
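For concreteness, here is a minimal sketch of what test-time fine-tuning could look like; the function and tensor names are hypothetical, and this is one speculative reading of the idea (in PyTorch), not an established recipe:

import torch
import torch.nn as nn

def test_time_update(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                     steps: int = 3, lr: float = 1e-4) -> None:
    # Take a few gradient steps on examples observed at deployment time,
    # so the model keeps adapting after its official training phase ends.
    # Assumes a classifier: model(x) returns logits, y holds class labels.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    model.eval()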
Imitation learning is continuous with and complementary to RL
I tried to ask Richard a couple of times whether pretrained LLMs can serve as a good prior on which to accumulate the experiential learning (aka do the RL) which will lead to AGI.
In a talk a few months ago, @ilyasut compared pretraining data to fossil fuels. This analogy has remarkable reach. Just because fossil fuels are not renewable does not mean that our civilization ended up on a dead-end track by using them. You simply couldn't have transitioned from the water wheels of 1800 straight to solar panels and fusion power plants. We had to use this cheap, convenient, plentiful intermediary.
AlphaGo (which was conditioned on human games) and AlphaZero (which was bootstrapped from scratch) were both superhuman Go players.
AlphaZero was better.
Will we (or the first AGIs) eventually come up with a general learning technique that requires no initialization of knowledge - that just bootstraps itself from the very start? And will it outperform the very best AIs that have been trained to that date? Probably yes.
fpl9000.bsky.social
Anthropic tested which models can clone Claude's Web interface, and only Sonnet 4.5 could do it. (Video is 1m 15s long.)
www.youtube.com/watch?v=PnX3...
Charting Claude’s progress with Sonnet 4.5
YouTube video by Anthropic
www.youtube.com
fpl9000.bsky.social
Hmm. I was just listening to Jerry Garcia's "The Wheel" ...

"The wheel is turning and you can't slow down
You can't let go and you can't hold on
You can't go back and you can't stand still
If the thunder don't get you then the lightning will"
fpl9000.bsky.social
Shades of Karpathy's LLM-as-OS.
fpl9000.bsky.social
Even OpenAI's own "real-world work" benchmark shows Claude 4.1 beating all the other foundation models. Interested to see how 4.5 does.
bsky.app/profile/ai-n...
ai-news.at.thenote.app
Claude just beat GPT-5, Gemini, and Grok in real-world job tasks, according to OpenAI’s own study

According to OpenAI, Claude is the top AI model for getting actual work done

#claude #geminiai #gpt
www.techradar.com
Reposted by Fran Litterio
natolambert.bsky.social
I pulled some updated data for ATOM Project // Interconnects.
Qwen has taken the crown and is accelerating away in market share.
U.S. has signs of promise in GPT-OSS & Nvidia.
fpl9000.bsky.social
It would be great to have these features in Claude Code (CC) too. IMO, they should avoid having two not-quite-identical agent products: CC and a hypothetically improved Claude Desktop. The feature set should be the same (modulo GUI/terminal differences).
fpl9000.bsky.social
Cursor announces their CLI coding agent, Cursor Agent: "The CLI works with any model as part of your Cursor subscription. You can now choose to use Cursor agent in the editor, or have multiple agents run in parallel in the terminal or remotely."

Install with:
$ curl cursor.com/install -fsS | bash
Reposted by Fran Litterio
seanmcarroll.bsky.social
Mindscape 330 | Petter Törnberg @pettertornberg.com on the Dynamics of (Mis)Information. #MindscapePodcast

www.preposterousuniverse.com/podcast/2025...
Title card for Mindscape podcast episode with Petter Törnberg