Alan engineering
@alanengineering.bsky.social
All things engineering
@avec_alan

https://medium.com/alan/tagged/engineering
Don't chase leaderboards. There's always a score-latency tradeoff, and LLM providers often hide it.
We reproduced Tau-bench to see for ourselves. Here's what we learned:

medium.com/alan/benchma...
Benchmarking AI Agents: Stop Trusting Headline Scores, Start Measuring Trade-offs
Don’t chase leaderboards. Run the benchmark yourself and map your score–latency–cost frontier to choose models for production.
medium.com
November 3, 2025 at 7:28 PM
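The score-latency frontier from the post above can be sketched in a few lines. This is a minimal illustration, not the method from the linked article: the `Run` type, the model names, and every number are made up, and cost is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    model: str        # hypothetical model label
    score: float      # task success rate, higher is better
    latency_s: float  # mean seconds per task, lower is better


def pareto_frontier(runs: list[Run]) -> list[Run]:
    """Keep the runs that no other run beats on both score and latency."""

    def dominated(r: Run) -> bool:
        # r is dominated if another run is at least as good on both axes
        # and strictly better on at least one of them.
        return any(
            o.score >= r.score
            and o.latency_s <= r.latency_s
            and (o.score > r.score or o.latency_s < r.latency_s)
            for o in runs
        )

    return sorted((r for r in runs if not dominated(r)), key=lambda r: r.latency_s)


# Made-up numbers for illustration only, not results from the post.
runs = [
    Run("model-a", score=0.62, latency_s=4.0),
    Run("model-b", score=0.70, latency_s=9.0),
    Run("model-c", score=0.60, latency_s=7.0),  # slower and worse than model-a
    Run("model-d", score=0.75, latency_s=20.0),
]

for r in pareto_frontier(runs):
    print(f"{r.model}: score={r.score:.2f}, latency={r.latency_s:.1f}s")
```

The highest-scoring run is always on the frontier, but so are faster runs with lower scores. Which point to pick depends on the latency budget of the product, which a single leaderboard number does not show.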
There are many LLM benchmarks, such as MMLU and GSM8K, but they're useless for AI agents.
Real agents need to handle database state, tool calling, and multi-turn conversations. Stateful benchmarks show the path forward.

New post on agent evaluation 👇
Benchmarking AI Agents: The Challenge of Real-World Evaluation
AI agents need stateful benchmarks. Unlike LLMs, agents interact with databases and users. We explore why and how to evaluate them…
medium.com
October 15, 2025 at 10:36 AM
You may have heard that SGS recently certified Alan to ISO 27001:2022. Want to know more about our journey towards certification? Head over to Maxime's post and enjoy the read!
medium.com/alan/our-iso...
Our ISO 27001 journey: From security blueprint to certification success
Hey 👋 I’m Maxime, the ISMS lead at Alan, and I’d like to tell you about our ISO journey 🗺️
medium.com
July 2, 2025 at 3:05 PM
Static chatbots couldn't handle complex support tickets about insurance claims. So we built something different with tool calls and the ReAct framework.

Our Claim Agent investigates dynamically, just like human agents, but faster. It now automates 30% of the tickets it receives.
June 24, 2025 at 7:48 AM
After four years at Alan, moving from Engineering to Security, I’ve seen what makes security succeed. Here is what I learned: medium.com/alan/how-i-f...
How I found my security calling at Alan
This is the story of my journey from engineering to security at Alan, and what I’ve learned about building security that scales with our…
medium.com
June 2, 2025 at 10:36 AM
🛠️ How we tamed the "works on my machine" chaos at Alan Engineering!

Our new blog post reveals how Devbox transformed our dev experience, slashed onboarding time, and created consistent environments across our entire team.

Check it out: medium.com/alan/from-ch...
From Chaos to Consistency: How Alan Transformed Developer Experience with Devbox
medium.com
April 30, 2025 at 10:36 AM
In late January, DeepSeek shocked the world by dropping an open-weight successor to OpenAI's o1: R1. Their tech report discusses how to incentivize reasoning capability in LLMs. We share our learnings at:
DeepSeek R1: Demystifying LLM’s Reasoning Capabilities
DeepSeek shocked the world by dropping an open-weight successor to OpenAI’s o1: R1. This post summarizes our learnings from the tech…
medium.com
March 3, 2025 at 8:11 PM
In 2024, we automated 20% of customer contacts with AI (and the number is still growing!) while maintaining the same customer satisfaction 🚀
Learn about our journey:
How We Built Alan’s AI Assistant for Customer Support
At Alan, exceptional customer care isn’t just a service — it’s a core part of what sets us apart in the insurance industry.
medium.com
January 8, 2025 at 4:03 PM