Lightnews — Scholar-powered news

Mathis Lucka

@ma-this.bsky.social

71 followers 290 following 10 posts

AI engineering at deepset.


Mathis Lucka

@ma-this.bsky.social

Everyone’s talking about memory or state for agents. I thought this was really important, but after starting to make my own agents I realized that state for tools is equally important. Combining LLM tool calls and passing outputs from other tools is a fundamental need for any compound AI system.
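One way to picture "state for tools" is a shared store that tools write their outputs to and read some of their inputs from, so the LLM only supplies the arguments it actually has to decide on. The sketch below is a minimal illustration under that assumption; `run_tool`, `inputs_from_state`, `output_key`, and both example tools are hypothetical names, not taken from any particular framework.

```python
from typing import Any, Callable, Dict, List, Optional


def search_docs(query: str) -> List[str]:
    # Stand-in for a retrieval tool; returns Python objects, not text.
    return [f"doc about {query}", f"another doc about {query}"]


def count_docs(documents: List[str]) -> int:
    # Consumes the output of another tool rather than LLM-written text.
    return len(documents)


def run_tool(
    tool: Callable[..., Any],
    llm_args: Dict[str, Any],
    state: Dict[str, Any],
    inputs_from_state: Optional[Dict[str, str]] = None,
    output_key: Optional[str] = None,
) -> Any:
    """Call a tool with LLM-chosen arguments plus arguments pulled from state."""
    kwargs = dict(llm_args)
    # Map tool parameter names to keys in the shared state.
    for param, key in (inputs_from_state or {}).items():
        kwargs[param] = state[key]
    result = tool(**kwargs)
    # Optionally store the raw result so later tools can use it.
    if output_key is not None:
        state[output_key] = result
    return result


state: Dict[str, Any] = {}
# The LLM picks the query; the documents land in state.
run_tool(search_docs, {"query": "agents"}, state, output_key="documents")
# The LLM passes no arguments here; the documents come from state.
n_docs = run_tool(count_docs, {}, state, inputs_from_state={"documents": "documents"})
```

The point of the design is that large or structured outputs never have to round-trip through the model's context as text just to reach the next tool.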

January 21, 2025 at 8:06 AM

Mathis Lucka

@ma-this.bsky.social

Let’s say agency is a continuum and it is defined as ceding control of the program to an LLM. The level of agency is not just determined by the code you build around the LLM. Modern LLMs are so good at instruction following that the prompt effectively controls the program. Prompting vs. agency?

January 3, 2025 at 11:25 AM

Mathis Lucka

@ma-this.bsky.social

I’ve come to appreciate standardization in terminology.

Terminology, not the hype-driven trash pile of buzzwords that others in the AI industry seem to like so much (anyone building multi-agent cognitive architectures?).

Some terms I like in the 🧵

December 28, 2024 at 11:59 AM

Mathis Lucka

@ma-this.bsky.social

Still pondering tools for agents vs. tools for developers. Can they be the same? This time: outputs.

Agents: always text, maybe audio, video or images.

Developers: any type that you want to pass to the next step in a program.

Here’s a podcast discussing the topic: open.spotify.com/episode/1zUb...

Language Agents: From Reasoning to Acting

Latent Space: The AI Engineer Podcast · Episode

open.spotify.com
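A small sketch of how one tool might serve both audiences: it returns a typed value for the developer and can render the same value as text for the agent. `ToolResult`, `to_text`, and `lookup_user` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolResult:
    # What a developer passes to the next step in a program: any type.
    value: Any

    def to_text(self) -> str:
        # What an agent receives: always a string.
        return json.dumps(self.value, default=str)


def lookup_user(user_id: int) -> ToolResult:
    # Stand-in tool with hard-coded data.
    return ToolResult(value={"id": user_id, "name": "example"})


result = lookup_user(7)
for_developer = result.value      # a dict, usable directly in code
for_agent = result.to_text()      # a JSON string for the model's context
```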

December 27, 2024 at 11:28 AM

Mathis Lucka

@ma-this.bsky.social

Key lesson on designing tools for agents vs. designing tools for humans:

Never raise an exception (at least when the agent is running in the same runtime). You have to catch exceptions and return them as error messages with as much additional context as possible.

Humans: fail early.
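A minimal sketch of that lesson, assuming tools are plain Python functions running in the same process as the agent. The `agent_safe` decorator and the `divide` tool are hypothetical names for illustration: the undecorated function still fails early for human callers, while the wrapped version returns the error as text with context.

```python
import functools
from typing import Any, Callable


def agent_safe(tool: Callable[..., Any]) -> Callable[..., str]:
    """Wrap a tool so failures come back as text instead of raising."""

    @functools.wraps(tool)
    def wrapper(*args: Any, **kwargs: Any) -> str:
        try:
            return str(tool(*args, **kwargs))
        except Exception as exc:
            # Give the model enough context to correct its next call.
            return (
                f"ERROR in tool '{tool.__name__}': {type(exc).__name__}: {exc}\n"
                f"Called with args={args!r}, kwargs={kwargs!r}\n"
                "Hint: check the arguments and try again."
            )

    return wrapper


def divide(a: float, b: float) -> float:
    # Developer-facing version: raises immediately on bad input.
    return a / b


# Agent-facing version: never raises, always returns a string.
safe_divide = agent_safe(divide)

ok = safe_divide(6, 3)
failed = safe_divide(1, 0)
```

Keeping the raising function and the wrapped function separate lets the same tool be used in ordinary code and in an agent loop.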

December 26, 2024 at 10:23 PM

Mathis Lucka

@ma-this.bsky.social

Noticed yesterday that Claude.ai doesn’t regenerate full artifacts anymore. It seems to make a series of edits instead. Wondering if @anthropic.com is using the edit tool described in their SWE-bench blog post: www.anthropic.com/research/swe...

Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet

A post for developers about the new Claude 3.5 Sonnet and the SWE-bench eval

www.anthropic.com

December 26, 2024 at 9:12 AM

Mathis Lucka

@ma-this.bsky.social

I need to dig into all things agents. I’ll microblog about my journey here. Feels like social media in 2010.

December 26, 2024 at 9:08 AM
