Lightnews — Scholar-powered news

Reposted by Galileo.ai

Jim Bennett

@jimbobbennett.dev

✨Here's why your AI is lying!✨

The other week at @devrelcon.bsky.social I sat down to chat with Joseph Petty from @appsmith.bsky.social about AI, why you need evaluations, and how @rungalileo.bsky.social can help you.

Oh, and 🌶️ Jim's spicy take on AI 👀

youtu.be/I2vRx5Ieak8?...

Why Every AI Company Needs an AI to Test Their AI

YouTube video by Appsmith

August 11, 2025 at 6:35 PM

Galileo.ai

@rungalileo.bsky.social

Success for AI agents varies greatly by domain and requires nuanced, domain-specific metrics.

@erinmikail.bsky.social's new tutorial shows how to build and track tailored custom metrics using Galileo for reliable AI evaluation.

Read Erin's blog here: galileo.ai/blog/silly-s...

July 3, 2025 at 6:37 PM

Galileo.ai

@rungalileo.bsky.social

Deploying an LLM without the right infrastructure in place is like casting spells without a spellbook.

#AI #LLM #AIEvaluation #MLOps #DataQuality #Cohesity #GalileoAI #ChainOfThought #Podcast

June 11, 2025 at 9:52 PM

Galileo.ai

@rungalileo.bsky.social

We’re excited to release 2 new AI agent interfaces that make agent observability & evaluations even more effective.

- Timeline View: No more guessing where your agent gets stuck. See execution flow & bottlenecks quickly.

- Conversation View: Debug from the user's perspective, not just the system's.

June 11, 2025 at 6:01 PM

Galileo.ai

@rungalileo.bsky.social

What do LLM evals and comedy have in common? Timing.

Join @erinmikail.bsky.social at the #databricks #DataAISummit as she breaks down what it really takes to test LLMs in unexpected domains—like generating humor.

Come for the eval benchmarks. Stay for the chaos.

#GenAI #LLMevals #AIUX #LLMops

June 11, 2025 at 3:27 PM

Galileo.ai

@rungalileo.bsky.social

Make sure to stop by booth #120 to say hi to @erinmikail.bsky.social and the Galileo team during the #DataAISummit this week!

June 9, 2025 at 9:52 PM

Galileo.ai

@rungalileo.bsky.social

Next week, Galileo is headed to San Francisco for the Databricks Data + AI Summit!

If you’re building with LLMs, testing agents, or just trying to trust what your models are doing in production, come find us at Booth #120

June 6, 2025 at 6:04 PM

Reposted by Galileo.ai

Jim Bennett

@jimbobbennett.dev

On my way to SF. If you’re attending the AI Engineering World's Fair and want to learn why your AI needs reliability and evaluations, come say hi at the @rungalileo.bsky.social booth.

Jim sitting on a plane wearing a white hoodie and orange glasses

June 3, 2025 at 6:24 PM

Galileo.ai

@rungalileo.bsky.social

Siva Surendira, CEO of Lyzr, perfectly captures why enterprises need robust AI evaluation:

"I recommend Galileo as the antivirus equivalent for your AI system - you need these checks & balances. A MacBook is secure by nature, having that additional layer catches things the core system might miss."

June 2, 2025 at 4:23 PM

Galileo.ai

@rungalileo.bsky.social

🚨 Heading to the AI Engineers World’s Fair in SF next week?

I’ll (@JimBobBennett) be there with the Galileo crew—booth, talks, party, and all. I’m giving a talk on “Taming Your AI Agents with Evaluations”, aka how to stop your AI from making up entire book reports (Chicago Sun-Times, we see you 👀).

May 30, 2025 at 10:33 PM

Galileo.ai

@rungalileo.bsky.social

We just dropped a new walkthrough showing you how to build powerful agents by combining MongoDB Atlas with Galileo.

May 29, 2025 at 4:16 PM

Galileo.ai

@rungalileo.bsky.social

The AI Agent Evaluation Blueprint: Part 1

May 27, 2025 at 7:10 PM

Galileo.ai

@rungalileo.bsky.social

In less than three years, your new coworker might not be human. 🤖

@poolsideai co-founders @JasoncWarner and @EisoKant believe AI will soon collaborate with teams inside high-consequence environments such as banking, energy, and healthcare-grade software.

May 23, 2025 at 6:01 PM

Galileo.ai

@rungalileo.bsky.social

Agentic AI isn't just reactive, it's a proactive partner.

On the Chain of Thought podcast with @ConorBronsdon, @Amplitude_HQ's Chief Engineering Officer, @Wade Chambers, explains how systems like Ask Amplitude transform AI from a tool into a team of PhDs embedded in your product.

May 22, 2025 at 6:01 PM

Galileo.ai

@rungalileo.bsky.social

AI isn't a toy anymore. It's in production.

Wade Chambers, Chief Engineering Officer at Amplitude, explains on the Chain of Thought podcast with @conorbronsdon.bsky.social that we've passed the leap-of-faith phase. The proof:

✅ Better decision-making
✅ Faster research
✅ Real agentic solutions

May 21, 2025 at 10:03 PM

Galileo.ai

@rungalileo.bsky.social

We’re live at the Microsoft for Startups booth until 2 PM. Come say hi! 🙌

It’s been amazing connecting with builders, developers, and curious minds. If you’re interested in agents, LLM apps, or just want to see how we’re helping to ship reliable AI apps, stop by for a Galileo demo.

#MSbuild

May 21, 2025 at 6:37 PM

Galileo.ai

@rungalileo.bsky.social

We're mentioned in Jensen Huang's slides at COMPUTEX in Taipei! 🤩

Delighted to be part of NVIDIA's AI Factory Validated Designs!

May 21, 2025 at 6:02 PM

Galileo.ai

@rungalileo.bsky.social

"We started Poolside because we saw AGI differently."

While everyone rushed to scale next-token prediction post-ChatGPT, Jason Warner and Eiso Kant (poolside's CEO and CTO) bet on a different path: reinforcement learning as the deeper scaling axis of intelligence itself.

May 21, 2025 at 1:02 AM

Reposted by Galileo.ai

Jim Bennett

@jimbobbennett.dev

New website time! @rungalileo.bsky.social just dropped a new website and logo!

galileo.ai

Galileo AI: The Generative AI Evaluation Company

Galileo's Evaluation Intelligence Platform empowers AI teams to evaluate, iterate, monitor, and protect generative AI applications at enterprise scale.

May 20, 2025 at 5:14 PM

Reposted by Galileo.ai

Jim Bennett

@jimbobbennett.dev

You at #msbuild? Here’s all the @rungalileo.bsky.social fun you can expect today.

May 20, 2025 at 7:38 PM

Galileo.ai

@rungalileo.bsky.social

Galileo partners with NVIDIA's Enterprise AI Factory, enhancing AI systems with its reliability platform. This collaboration, launched at COMPUTEX, provides tools for guardrails, synthetic data, model evaluation, and experiment tracking.

Explore more: buff.ly/uYDp76M

#NVIDIACOMPUTEX #AgenticAI

“Enterprise-Scale Agentic AI with Galileo + NVIDIA” featuring architecture flow diagrams showing Galileo’s evaluation metrics, agent observability, and NVIDIA’s optimized inference powering multi-agent systems.

May 20, 2025 at 4:01 PM

Galileo.ai

@rungalileo.bsky.social

Say hi to @jimbobbennett.dev, Roie Schwaber-Cohen, and @conorbronsdon.bsky.social at the @msft4startups.bsky.social booth.

Find us today at #MSbuild:
- 1:30p - 2:05p at Hub Theater A
- 2:30p - 6:30p Galileo's AI Reliability + Evals platform on Azure
- 3:30p - 5p at the NVIDIA Inception Partner Showcase

May 20, 2025 at 3:05 PM

Galileo.ai

@rungalileo.bsky.social

Galileo is at #MSbuild this week! See you at the Microsoft for Startups booth.

May 20, 2025 at 2:25 PM

Galileo.ai

@rungalileo.bsky.social

“I’d never ask a customer to send me their source code & just say ‘trust me.’” — @jcw.bsky.social, CEO @poolsideai.bsky.social

AI infra shouldn’t trade privacy for performance.
✅ Trust through transparency
✅ No overreach

We’re here for it.

May 20, 2025 at 1:03 AM

Galileo.ai

@rungalileo.bsky.social

Oh hi! What are you excited about at #MSBuild? We’re excited to see you next week!

May 18, 2025 at 1:02 AM
