letta.com
Read our breakdown of the benchmark at letta.com/blog/context...
See the live leaderboard at leaderboard.letta.com
GPT-5 has lower per-token cost than Sonnet 4.5, but costs more in the benchmark because GPT-5 agents are more "token hungry".
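The arithmetic behind this can be sketched in a few lines. This is a minimal illustration, assuming a single blended price per million tokens (ignoring the input/output price split); the function name `run_cost` and every number below are hypothetical placeholders, not measured benchmark figures or real prices.

```python
def run_cost(tokens_used: int, price_per_million_tokens: float) -> float:
    """Total dollar cost of one benchmark run at a blended per-token price."""
    return tokens_used / 1_000_000 * price_per_million_tokens


# Hypothetical figures for illustration only (not real prices or token counts).
# Model A: cheaper per token, but the agent consumes many more tokens.
model_a_cost = run_cost(tokens_used=10_000_000, price_per_million_tokens=2.0)
# Model B: pricier per token, but the agent consumes fewer tokens.
model_b_cost = run_cost(tokens_used=4_000_000, price_per_million_tokens=4.0)

# The lower per-token price does not guarantee the lower total cost.
print(model_a_cost > model_b_cost)
```

Under these made-up inputs, the model with half the per-token price still ends up with the higher total, because total cost is the product of price and tokens consumed.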
In its present state, the benchmark is far from saturated: the top model (Sonnet 4.5) scores 74%.
• Setting up the memory tool
• Agent self-improvement cycles
• Testing it on technical questions
• Complete memory architecture redesign