Galileo.ai
@rungalileo.bsky.social
The fastest way to ship reliable AI apps - Evaluation, Experimentation, and Observability Platform
Reposted by Galileo.ai
✨Here's why your AI is lying!✨
The other week at @devrelcon.bsky.social I sat down to chat with Joseph Petty from @appsmith.bsky.social about AI, why you need evaluations, and how @rungalileo.bsky.social can help you.
Oh, and 🌶️ Jim's spicy take on AI 👀
youtu.be/I2vRx5Ieak8?...
Why Every AI Company Needs an AI to Test Their AI
YouTube video by Appsmith
youtu.be
August 11, 2025 at 6:35 PM
Success for AI agents varies greatly by domain and requires nuanced, domain-specific metrics.
@erinmikail.bsky.social's new tutorial shows how to build and track tailored custom metrics using Galileo for reliable AI evaluation.
Read Erin's blog here: galileo.ai/blog/silly-s...
July 3, 2025 at 6:37 PM
Deploying an LLM without the right infrastructure in place is like casting spells without a spellbook.
#AI #LLM #AIEvaluation #MLOps #DataQuality #Cohesity #GalileoAI #ChainOfThought #Podcast
June 11, 2025 at 9:52 PM
We’re excited to release 2 new AI agent interfaces that make agent observability & evaluations even more effective.
- Timeline View: No more guessing where your agent gets stuck. See execution flow & bottlenecks quickly.
- Conversation View: Debug from the user's perspective, not just the system's.
June 11, 2025 at 6:01 PM
What do LLM evals and comedy have in common? Timing.
Join @erinmikail.bsky.social at the #databricks #DataAISummit as she breaks down what it really takes to test LLMs in unexpected domains—like generating humor.
Come for the eval benchmarks. Stay for the chaos.
#GenAI #LLMevals #AIUX #LLMops
June 11, 2025 at 3:27 PM
Make sure to stop by booth #120 to say hi to @erinmikail.bsky.social and the Galileo team during the #DataAISummit this week!
June 9, 2025 at 9:52 PM
Next week, Galileo is headed to San Francisco for the Databricks Data + AI Summit!
If you’re building with LLMs, testing agents, or just trying to trust what your models are doing in production, come find us at Booth #120
June 6, 2025 at 6:04 PM
Reposted by Galileo.ai
On my way to SF. If you’re attending the AI Engineer World’s Fair and want to learn why your AI needs reliability and evaluations, come say hi at the @rungalileo.bsky.social booth.
June 3, 2025 at 6:24 PM
Siva Surendira, CEO of Lyzr, perfectly captures why enterprises need robust AI evaluation:
"I recommend Galileo as the antivirus equivalent for your AI system - you need these checks & balances. A MacBook is secure by nature, having that additional layer catches things the core system might miss."
June 2, 2025 at 4:23 PM
🚨 Heading to the AI Engineer World’s Fair in SF next week?
I’ll (@JimBobBennett) be there with the Galileo crew—booth, talks, party, and all. I’m giving a talk on “Taming Your AI Agents with Evaluations”, aka how to stop your AI from making up entire book reports (Chicago Sun-Times, we see you 👀).
May 30, 2025 at 10:33 PM
We just dropped a new walkthrough showing you how to build powerful agents by combining MongoDB Atlas with Galileo.
May 29, 2025 at 4:16 PM
The AI Agent Evaluation Blueprint: Part 1
May 27, 2025 at 7:10 PM
In less than three years, your new coworker might not be human. 🤖
@poolsideai co-founders @JasoncWarner and @EisoKant believe AI will soon collaborate with teams inside high-consequence environments such as banking, energy, and healthcare-grade software.
May 23, 2025 at 6:01 PM
Agentic AI isn't just reactive, it's a proactive partner.
On the Chain of Thought podcast with @ConorBronsdon, @Amplitude_HQ's Chief Engineering Officer, @Wade Chambers, explains how systems like Ask Amplitude transform AI from a tool into a team of PhDs embedded in your product.
May 22, 2025 at 6:01 PM
AI isn't a toy anymore. It's in production.
Wade Chambers, Chief Engineering Officer at Amplitude, explains on the Chain of Thought podcast with @conorbronsdon.bsky.social that we've passed the leap-of-faith phase. The proof:
✅ Better decision-making
✅ Faster research
✅ Real agentic solutions
May 21, 2025 at 10:03 PM
We’re live at the Microsoft for Startups booth until 2 PM. Come say hi! 🙌
It’s been amazing connecting with builders, developers, and curious minds. If you’re interested in agents, LLM apps, or just want to see how we’re helping to ship reliable AI apps, stop by for a Galileo demo.
#MSbuild
May 21, 2025 at 6:37 PM
We're mentioned in Jensen Huang's slides at COMPUTEX in Taipei! 🤩
Delighted to be part of NVIDIA's AI Factory Validated Designs!
May 21, 2025 at 6:02 PM
"We started Poolside because we saw AGI differently."
While everyone rushed to scale next-token prediction post-ChatGPT, Jason Warner and Eiso Kant (poolside's CEO and CTO) bet on a different path: reinforcement learning as the deeper scaling axis of intelligence itself.
May 21, 2025 at 1:02 AM
Reposted by Galileo.ai
You at #msbuild? Here’s all the @rungalileo.bsky.social fun you can expect today.
May 20, 2025 at 7:38 PM
Galileo partners with NVIDIA's Enterprise AI Factory, enhancing AI systems with its reliability platform. This collaboration, launched at COMPUTEX, provides tools for guardrails, synthetic data, model evaluation, and experiment tracking.
Explore more: buff.ly/uYDp76M
#NVIDIACOMPUTEX #AgenticAI
May 20, 2025 at 4:01 PM
Say hi to @jimbobbennett.dev, Roie Schwaber-Cohen, and @conorbronsdon.bsky.social at the @msft4startups.bsky.social
Find us today at #MSbuild:
- 1:30p - 2:05p at Hub Theater A
- 2:30p - 6:30p Galileo's AI Reliability + Evals platform on Azure
- 3:30p - 5p at the NVIDIA Inception Partner Showcase
May 20, 2025 at 3:05 PM
“I’d never ask a customer to send me their source code & just say ‘trust me.’” — @jcw.bsky.social, CEO @poolsideai.bsky.social
AI infra shouldn’t trade privacy for performance.
✅ Trust through transparency
✅ No overreach
We’re here for it.
May 20, 2025 at 1:03 AM