Sascha Buehrle
sascha.buehrle.io
@sascha.buehrle.io
Founder Lemony.ai | building cascadeflow
AI × Product × Technology
Building LangChain agents? Here's the cost problem nobody talks about:

40-70% of agent tool calls don't need expensive flagship models.

A customer support agent making 1000 calls/day = $150/mo wasted on overkill routing.
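
A rough back-of-the-envelope version of that figure, as a sketch only. The per-call prices below are hypothetical, picked so the numbers line up with the post, and the 50% overkill share is just one point inside the 40-70% range:

```python
def monthly_overspend(calls_per_day: int, overkill_share: float,
                      flagship_cost: float, small_cost: float,
                      days: int = 30) -> float:
    """Monthly spend on calls that a cheaper model could have handled."""
    overkill_calls = calls_per_day * days * overkill_share
    return overkill_calls * (flagship_cost - small_cost)


# Assumed prices (not from the post): $0.0105 per flagship call,
# $0.0005 per small-model call, i.e. about $0.01 difference per call.
wasted = monthly_overspend(1000, 0.5, flagship_cost=0.0105, small_cost=0.0005)
print(f"~${wasted:.0f}/mo")
```

With these assumptions, 15,000 overkill calls a month at about one cent each come to roughly $150.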

Solution: Smart cascading in 3 lines.

github.com/lemony-ai/cascadeflow
November 20, 2025 at 3:55 PM
Launch Day 7 🔄 Multi-stage pipeline + domain models

cascadeflow detects your domain (SQL, code, medical etc.) → routes to specialized small model first → cascades to larger models only if needed.

Learns YOUR domains automatically.

80% stay on cheap specialists.
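
The detect, route, cascade flow could look something like this. It is a minimal sketch and not cascadeflow's actual API: the keyword table, `detect_domain`, `route`, and the 0.7 threshold are all hypothetical, and a model here is just any callable that returns an answer plus a confidence score.

```python
from typing import Callable, Dict, Tuple

# A "model" is any callable returning (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]

# Hypothetical keyword table; a real system would learn domains from traffic.
DOMAIN_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "sql": ("select", "join", "table"),
    "code": ("function", "python", "bug"),
    "medical": ("dosage", "symptom", "diagnosis"),
}


def detect_domain(query: str) -> str:
    """Return the first domain whose keywords appear in the query."""
    q = query.lower()
    for domain, words in DOMAIN_KEYWORDS.items():
        if any(w in q for w in words):
            return domain
    return "general"


def route(query: str, specialists: Dict[str, Model], fallback: Model,
          threshold: float = 0.7) -> Tuple[str, str]:
    """Try the domain specialist first; cascade to the larger model if unsure."""
    domain = detect_domain(query)
    specialist = specialists.get(domain)
    if specialist is not None:
        answer, confidence = specialist(query)
        if confidence >= threshold:
            return answer, f"specialist:{domain}"
    answer, _ = fallback(query)
    return answer, "fallback"


# Stub models standing in for real LLM calls.
specialists = {"sql": lambda q: ("SELECT 1;", 0.9),
               "code": lambda q: ("not sure", 0.3)}
fallback = lambda q: ("answer from the larger model", 1.0)

print(route("Write a SELECT with a JOIN", specialists, fallback))
```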

github.com/lemony-ai/ca...

#LLM
GitHub - lemony-ai/cascadeflow: Smart AI model cascading for cost optimization
Smart AI model cascading for cost optimization. Contribute to lemony-ai/cascadeflow development by creating an account on GitHub.
github.com
November 13, 2025 at 1:07 PM
Release Day 6 🤖 cascadeflow for Agents

Tool calls multiply costs. 49% cite ROI as #1 adoption barrier.

cascadeflow's drafter/verifier pattern saves 20-60% on agent systems:
- Tool call cost optimization
- Per-agent budget tracking
- Real-time spend visibility
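
Per-agent budget tracking can be sketched in a few lines. This is an illustration under assumptions, not cascadeflow's implementation: `BudgetTracker` and its methods are hypothetical names, and costs are plain USD floats supplied by the caller.

```python
from collections import defaultdict
from typing import Dict


class BudgetTracker:
    """Tracks spend per agent against a fixed limit (hypothetical helper)."""

    def __init__(self, limits: Dict[str, float]) -> None:
        self.limits = dict(limits)
        self.spent: Dict[str, float] = defaultdict(float)

    def record(self, agent: str, cost: float) -> None:
        """Add the cost of one call to the agent's running total."""
        self.spent[agent] += cost

    def remaining(self, agent: str) -> float:
        """Budget left for the agent; agents without a limit get 0."""
        return self.limits.get(agent, 0.0) - self.spent[agent]

    def can_spend(self, agent: str, cost: float) -> bool:
        """True if the call fits in the agent's remaining budget."""
        return self.remaining(agent) >= cost


tracker = BudgetTracker({"support": 1.0})
tracker.record("support", 0.75)
print(tracker.remaining("support"), tracker.can_spend("support", 0.5))
```

Checking `can_spend` before each tool call is one way to get both the budget cap and the live spend view from the same counter.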

github.com/lemony-ai/ca...
November 12, 2025 at 7:06 PM
Release Day 5 🔌 cascadeflow for edge devices

Where it all started. Run AI on performance-limited hardware:
- Fully local: vLLM/Ollama support, models under 10B parameters handle most tasks
- Hybrid: escalate to cloud only when needed
- Domain-specific models outperform flagships

Examples
github.com/lemony-ai/ca...

#EdgeAI
November 11, 2025 at 8:38 AM
Day 4 🎯 User-tier cascading

cascadeflow now assigns different AI models & cascading strategies per user tier:

- different cascade rules per user/group
- enforce spending limits and routes
- available via presets
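
One way to picture tier rules, as a hedged sketch: the `TierPolicy` type, tier names, model names, and limits below are all made up for illustration and are not cascadeflow's presets.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TierPolicy:
    models: Tuple[str, ...]      # cascade order, cheapest first
    monthly_limit_usd: float     # spend cap for the tier


# Hypothetical tiers and limits.
POLICIES: Dict[str, TierPolicy] = {
    "free": TierPolicy(("small-model",), 5.0),
    "pro": TierPolicy(("small-model", "flagship-model"), 50.0),
}


def allowed_models(tier: str, spent_usd: float) -> Tuple[str, ...]:
    """Models a user may cascade through, given tier and spend so far."""
    policy = POLICIES.get(tier, POLICIES["free"])  # unknown tiers get "free"
    if spent_usd >= policy.monthly_limit_usd:
        return policy.models[:1]  # over the cap: cheapest model only
    return policy.models


print(allowed_models("pro", 12.0))
print(allowed_models("pro", 50.0))
```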

Your strategy. Your rules. Full transparency.

github.com/lemony-ai/ca...

#AI
November 10, 2025 at 10:45 AM
Day 2 of release sprint 🎯

n8n integration is live on GitHub! cascadeflow now plugs into your workflows.

Connect 2 models (cheap drafter + flagship verifier) instead of 1 expensive model. 70-80% of queries never touch the flagship = 40-85% cost savings.
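
The drafter/verifier idea in its simplest form, as a sketch under assumptions. This is not the n8n node or cascadeflow's API: `cascade`, `accept`, and the stub models are hypothetical, and the quality check is a toy length test.

```python
from typing import Callable, Tuple


def cascade(query: str,
            drafter: Callable[[str], str],
            verifier: Callable[[str], str],
            accept: Callable[[str, str], bool]) -> Tuple[str, str]:
    """Return the cheap draft if it passes the check, else ask the flagship."""
    draft = drafter(query)
    if accept(query, draft):
        return draft, "drafter"
    return verifier(query), "verifier"


# Stubs standing in for a cheap model, a flagship model, and a quality check.
cheap = lambda q: "42" if "answer" in q else ""
flagship = lambda q: "a careful, expensive answer"
non_empty = lambda q, draft: len(draft) > 0

print(cascade("what is the answer?", cheap, flagship, non_empty))
print(cascade("explain quantum gravity", cheap, flagship, non_empty))
```

Every query that the check accepts never reaches the flagship, which is where the savings come from.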

⭐ us: github.com/lemony-ai/ca...

#n8n #AI
November 7, 2025 at 4:40 PM
Day 1 of our release sprint 🚀
cascadeflow has shipped: open-source AI cascading that cuts costs 30-65% in 3 lines of code.
Small models handle 80% of workflows. Python + TypeScript. MIT licensed.
Try it & leave a ⭐
github.com/lemony-ai/ca...
#OpenSource #AI
November 6, 2025 at 4:17 PM
Excited to launch cascadeflow on @github.com!
Smart AI model cascading that cuts your AI costs by 30-65% in just 3 lines of code.
Small, domain-specific models can handle 80% of workflows, and we're making this knowledge available to the entire AI community.
github.com/lemony-ai/ca...
November 6, 2025 at 3:54 PM