Anthropic’s recent article about building agents (www.anthropic.com/research/bui...) suggests avoiding such frameworks, at least initially. As somebody who rolled their own and tried the frameworks later, I agree. You will learn a lot by rolling your own, and that’s very important.
February 7, 2025 at 5:38 AM
I was excited about LangGraph, but the approaches to managing context and memory aren’t compatible with my own goals, and a lot of the value was lost when I worked around them, such that I just ended up unhappy with the core classes.
February 7, 2025 at 5:38 AM
In the JS/TS version, the typing decisions are pretty unhelpful, and serialisation is harder than with the OpenAI client classes. In general, the DX is quite disappointing.
February 7, 2025 at 5:38 AM
I think code use is the ultimate action layer. I’ll give my agents a runtime to use. They’ll get smarter, not dumber, so they need control, not orchestration.
November 26, 2024 at 5:23 PM
Then there is a bunch of work that simply takes effort: putting together templates and examples for agents to train on and work with, and building software with abstractions that play to the strengths of the agents building it, rather than being designed for humans.
November 26, 2024 at 11:47 AM
Orchestration is getting better. There are so many good new tools - I’m trying LangGraph (www.langchain.com/langgraph) next. That’s going to enable more complex workflows (TDD, outside-in) and - crucially - cognitive architectures.
The models themselves are improving: new planning-oriented models, better reasoning, better efficiency, output quality that scales with test-time compute, faster inference... Any one of these would be great for a software engineering agent, and they are all happening at the same time.
November 26, 2024 at 11:46 AM
Agents are also getting better at interacting with their environments. Anthropic just launched the Model Context Protocol (www.anthropic.com/news/model-c...) - a framework for connectivity to new data sources - and the ChatGPT app can already see your editor and terminal.
Consider the improvements we’re seeing in related areas. Tools like Cody (sourcegraph.com/cody) and Aider (github.com/Aider-AI/aider) show how understanding codebases as graphs makes LLMs radically more helpful.
November 26, 2024 at 11:45 AM
Start with where we are today. Replit Agent (docs.replit.com/replitai/agent) is my favourite so far. The way I see it, Replit Agent and friends already have an advantage over me in speed and cost on some time horizon, and that time horizon will get longer.
More on agentic approaches and how they influenced my work in the next post 🙏 Including some specific tools I have enjoyed, for people who want to try things out.
November 22, 2024 at 2:35 PM
But developments in agentic approaches made me feel that I was still undershooting the future - that this would be superseded by an agentic, general purpose software engineer, even though that is a much harder version of the problem to solve!
November 22, 2024 at 2:34 PM
If you can get the system to use the same type-based delegation recursively, maybe you can specify pretty challenging high-level Generables in a high-level orchestration layer and leave it at that.
November 22, 2024 at 2:34 PM
You can use code you haven’t written yet and let the system implement it in the background. No need to write special graphs - if you’re writing code, you’re already writing a graph. What a great design language for engineers!
November 22, 2024 at 2:34 PM
In this approach, an entry point is identified and described briefly with a special type that extends a custom Generable type, a VS Code extension detects and implements Generables, and a tiny custom dependency injection framework injects them at runtime.
November 22, 2024 at 2:33 PM
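A minimal sketch of what the Generable idea could look like in TypeScript. The post names `Generable` but does not show its shape, so the `description` field, the `SlugifyTitle` entry point, and the `Container` class are all assumptions made for illustration. The implementation is registered by hand here, where the post describes a VS Code extension producing it.

```typescript
// Hypothetical marker type: the post names `Generable` but does not show its shape.
interface Generable {
  readonly description: string;
}

// Hypothetical entry point: described briefly, with the implementation left to the system.
interface SlugifyTitle extends Generable {
  run(title: string): string;
}

// Tiny DI container (an assumption): implementations are registered by name
// and injected at runtime.
class Container {
  private readonly impls = new Map<string, Generable>();

  register<T extends Generable>(name: string, impl: T): void {
    this.impls.set(name, impl);
  }

  inject<T extends Generable>(name: string): T {
    const impl = this.impls.get(name);
    if (impl === undefined) {
      throw new Error(`No implementation generated yet for "${name}"`);
    }
    return impl as T;
  }
}

const container = new Container();

// Stand-in for what the background generator would produce.
container.register<SlugifyTitle>("SlugifyTitle", {
  description: "Turn a post title into a URL-safe slug",
  run: (title) =>
    title
      .toLowerCase()
      .trim()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-+|-+$/g, ""),
});

// Calling code depends only on the type, not on who wrote the implementation.
const slugify = container.inject<SlugifyTitle>("SlugifyTitle");
const slug = slugify.run("Hello, Agentic World!");
console.log(slug);
```

The call site only ever sees the `SlugifyTitle` type, which is what would let an implementation be swapped in later without touching the code that uses it.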
Project 2 was orchestrating outside-in graph traversal for code generation, using syntax trees of your existing code to find entry points for generable software.
November 22, 2024 at 2:33 PM
But I realised I didn’t need my custom graphs to traverse code - I was working in TypeScript, and its compiler already understands the code as a syntax tree, which is already a graph.
November 22, 2024 at 2:31 PM
We could have a dynamic runtime where models run code ad-hoc to achieve their goals. Code can be used rather than developed, deployed, and executed, collapsing the space between development and runtime. There’s no technical obstacle, really. We just need time, compute, and creativity.
November 22, 2024 at 8:05 AM
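A rough sketch of the "code is used rather than developed, deployed, and executed" idea, under heavy assumptions: the `Tools` type, the `runAdHoc` helper, and the hard-coded `modelCode` string are all hypothetical, and the string stands in for whatever a model would emit. `new Function` is not a sandbox, so a real runtime would need proper isolation.

```typescript
// Hypothetical tool surface exposed to the model; names are illustrative only.
type Tools = {
  readFile: (path: string) => string;
};

// Run a snippet of model-written JavaScript against the tools and return its result.
// NOTE: `new Function` gives no isolation; this only illustrates the control flow.
function runAdHoc(source: string, tools: Tools): unknown {
  const fn = new Function("tools", `"use strict"; ${source}`);
  return fn(tools);
}

// In-memory stand-in for a file system.
const files: Record<string, string> = { "notes.txt": "alpha\nbeta\ngamma" };

const tools: Tools = {
  readFile: (path) => {
    const content = files[path];
    if (content === undefined) throw new Error(`No such file: ${path}`);
    return content;
  },
};

// Stand-in for code a model might emit when asked "how many lines are in notes.txt?"
const modelCode = `return tools.readFile("notes.txt").split("\\n").length;`;

const lineCount = runAdHoc(modelCode, tools);
console.log(lineCount);
```

Nothing is built or deployed ahead of time here: the snippet exists only for the duration of the call, which is the collapse between development and runtime that the post describes.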
I thought deeply about this paradigm we are swimming in - where software must be developed and deployed well in advance of being executed. I felt like a fish seeing water.
November 22, 2024 at 8:04 AM