Mathis Lucka
ma-this.bsky.social
AI engineering at deepset.
Everyone’s talking about memory or state for agents. I thought that was the important part, but after starting to build my own agents I realized that state for tools is equally important. Chaining LLM tool calls and passing the output of one tool into another is a fundamental need for any compound AI system.
January 21, 2025 at 8:06 AM
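A minimal sketch of what "state for tools" could look like, assuming a shared key-value store and a `$key` reference convention. `ToolState`, `run_tool`, `load_rows` and `count_rows` are hypothetical names made up for illustration, not part of any framework. The idea is that the LLM only passes a short reference, and the program resolves it to the real object produced by an earlier tool.

```python
from typing import Any, Callable


class ToolState:
    """Hypothetical key-value store shared by all tools in one agent run."""

    def __init__(self) -> None:
        self._values: dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        self._values[key] = value

    def get(self, key: str) -> Any:
        return self._values[key]


def run_tool(state: ToolState, tool: Callable[..., Any], output_key: str, **kwargs: Any) -> str:
    # Arguments written as "$key" are replaced by the stored output of an earlier tool.
    resolved = {
        name: state.get(value[1:]) if isinstance(value, str) and value.startswith("$") else value
        for name, value in kwargs.items()
    }
    state.put(output_key, tool(**resolved))
    # The agent only sees a short text confirmation, not the raw object.
    return f"Stored result of '{tool.__name__}' under '{output_key}'."


def load_rows(source: str) -> list[dict[str, Any]]:
    # Stand-in for a tool that returns a large, non-text object.
    return [{"source": source, "id": i} for i in range(3)]


def count_rows(rows: list[dict[str, Any]]) -> int:
    return len(rows)


state = ToolState()
msg1 = run_tool(state, load_rows, "rows", source="demo")
msg2 = run_tool(state, count_rows, "count", rows="$rows")
```

Here the second call receives the full list from the first one even though the model only ever wrote the string `"$rows"`.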
Let’s say agency is a continuum and it is defined as ceding control of the program to an LLM. The level of agency is not just determined by the code you build around the LLM. Modern LLMs are so good at instruction following that the prompt effectively controls the program. Prompting vs. agency?
January 3, 2025 at 11:25 AM
I’ve come to appreciate standardization in terminology.

Terminology, not the hype-driven trash pile of buzzwords that others in the AI industry seem to like so much (anyone building multi-agent cognitive architectures?).

Some terms I like in the 🧵
December 28, 2024 at 11:59 AM
Still pondering tools for agents vs. tools for developers. Can they be the same? This time: outputs.

Agents: always text, maybe audio, video or images.

Developers: any type that you want to pass to the next step in a program.

Here’s a podcast discussing the topic: open.spotify.com/episode/1zUb...
Language Agents: From Reasoning to Acting
Latent Space: The AI Engineer Podcast · Episode
December 27, 2024 at 11:28 AM
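One way to serve both audiences from the same tool, sketched under my own assumptions: return a small result object that carries a typed value for the program and a text rendering for the agent. `ToolResult` and `search_docs` are hypothetical names, and the corpus is a toy in-memory dict.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolResult:
    value: Any  # typed output a developer can pass to the next step in a program
    text: str   # text rendering that is shown to the agent


def search_docs(query: str) -> ToolResult:
    # Hypothetical in-memory corpus, only here to keep the sketch runnable.
    corpus = {
        "agents": "Agents cede control to an LLM.",
        "tools": "Tools are functions an LLM can call.",
    }
    hits = [(key, body) for key, body in corpus.items() if query.lower() in body.lower()]
    text = "\n".join(f"[{key}] {body}" for key, body in hits) or "No documents matched."
    return ToolResult(value=hits, text=text)


result = search_docs("llm")
empty = search_docs("xyz")
```

A developer would read `result.value` (a list of tuples), while the agent loop would only forward `result.text` to the model.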
Key lesson on designing tools for agents vs. designing tools for humans:

Never raise an exception (at least when the agent is running in the same runtime). You have to catch exceptions and return them as error messages with as much additional context as possible.

Humans: fail early.
December 26, 2024 at 10:23 PM
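The lesson above can be sketched as a decorator that catches exceptions and hands them back to the agent as text. This assumes the tool runs in the same runtime as the agent loop. `agent_safe` and `divide` are hypothetical names, and the wording of the error message is just one possible choice.

```python
import functools


def agent_safe(tool):
    """Wrap a tool so that failures come back as text instead of raising."""

    @functools.wraps(tool)
    def wrapper(*args, **kwargs):
        try:
            return str(tool(*args, **kwargs))
        except Exception as exc:
            # Give the agent as much context as possible so it can retry sensibly.
            return (
                f"ERROR: tool '{tool.__name__}' failed with "
                f"{type(exc).__name__}: {exc}. "
                f"Arguments: args={args!r}, kwargs={kwargs!r}. "
                "Check the arguments and try again."
            )

    return wrapper


@agent_safe
def divide(a: float, b: float) -> float:
    return a / b


ok = divide(6, 3)
failed = divide(1, 0)
```

The undecorated function still fails early for human callers. Only the wrapped version that is handed to the agent turns the exception into a message.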
Noticed yesterday that Claude.ai doesn’t regenerate full artifacts anymore. It seems to make a series of edits instead. Wondering if @anthropic.com is using the edit tool described in their SWE-bench blog post: www.anthropic.com/research/swe...
Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet
A post for developers about the new Claude 3.5 Sonnet and the SWE-bench eval
December 26, 2024 at 9:12 AM
I need to dig into all things agents. I’ll microblog about my journey here. Feels like social media in 2010.
December 26, 2024 at 9:08 AM