Lightnews — Scholar-powered news

MLflow @mlflow.org · 9h

With that metadata—and MLflow’s MCP features released recently—the judge can make tool calls to MLflow to search spans and query different aspects of the trace.

Office Hours: Wed, Oct 22: luma.com/officehours1...

#mlflow #agenticjudges #llm #genai

MLflow Office Hours | October 22 · Zoom · Luma

luma.com

MLflow @mlflow.org · 9h

In this mode, the judge gets the MLflow trace info object: the input to the call, the output, and the root span ID for that trace.
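
A rough sketch of that hand-off, for illustration only. This is not MLflow's API: `Span`, `TraceInfo`, `TraceStore`, `search_spans`, and `children_of` are hypothetical names, and the in-memory store stands in for whatever backend the judge's tool calls would reach. It shows a judge starting from the input, output, and root span ID, then looking up spans on demand.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Span:
    # Hypothetical span record; field names are assumptions, not MLflow's schema.
    span_id: str
    parent_id: Optional[str]
    name: str
    span_type: str  # illustrative labels such as "AGENT", "LLM", "TOOL"
    inputs: Any = None
    outputs: Any = None


@dataclass
class TraceInfo:
    # What the judge is handed up front, per the post:
    # the call's input, its output, and the root span ID.
    request: Any
    response: Any
    root_span_id: str


class TraceStore:
    """In-memory stand-in for the tracing backend the judge would query."""

    def __init__(self, spans: List[Span]):
        self._spans = list(spans)

    def search_spans(self, span_type=None, name_contains=None) -> List[Span]:
        # Apply each filter only if it was supplied.
        hits = self._spans
        if span_type is not None:
            hits = [s for s in hits if s.span_type == span_type]
        if name_contains is not None:
            hits = [s for s in hits if name_contains in s.name]
        return hits

    def children_of(self, span_id: str) -> List[Span]:
        return [s for s in self._spans if s.parent_id == span_id]


spans = [
    Span("s0", None, "answer_question", "AGENT", inputs="What is 2 + 2?", outputs="4"),
    Span("s1", "s0", "plan", "LLM", outputs="use calculator"),
    Span("s2", "s0", "calculator", "TOOL", inputs="2 + 2", outputs="4"),
]
store = TraceStore(spans)
info = TraceInfo(request="What is 2 + 2?", response="4", root_span_id="s0")

# The judge starts at the root span and drills in only where it needs detail.
tool_spans = store.search_spans(span_type="TOOL")
root_children = store.children_of(info.root_span_id)
print([s.name for s in tool_spans])
print([s.name for s in root_children])
```

The point of the design is that the judge receives a small handle rather than the whole trace, and fetches span details only when its reasoning calls for them.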

MLflow @mlflow.org · 9h

Missed last week’s #MLflow Community Meetup? Check out Ben Wilson on agentic judges: “The judge no longer works as an LLM as a judge—it actually works as an agent as a judge.”

🎥 Full video: www.youtube.com/live/bkMabn8...

#opensource #oss #agenticjudges

MLflow @mlflow.org · 2d

Use AI to debug AI! 👏 With MLflow 3.4’s MCP server, Claude can query MLflow traces to compare runs and identify issues. @danliden.com’s post covers setup and examples in minutes.

Learn more: www.danliden.com/posts/202510...

#opensource #mlflow #oss #MCP #claude

Using MLflow's MCP Server for Conversational Trace Analysis

A hands-on exploration of MLflow's new Model Context Protocol server that enables AI assistants to interact with MLflow traces, with setup tips for virtual environments and examples in both Claude ...

www.danliden.com

MLflow @mlflow.org · 6d

⚡ In this lightning talk at MLOps World, Danny Chiao tackled a top agent challenge: ensuring high quality output.

Rather than labeling and analyzing traces by hand, MLflow makes it easy to log, evaluate, and iterate faster—using techniques leading companies rely on to deploy agents in production. ✅

MLflow @mlflow.org · 8d

🚨 Reminder: MLflow Community Meetup is tomorrow, Oct 8 at 4:00 PM PT!

We'll explore trace‑aware, feedback‑aligned judges and versioned eval datasets in MLflow. You don't want to miss it!

🎥 LIVE on LinkedIn, YouTube & X
🔗 RSVP: luma.com/mlflow-1001

#opensource #oss #mlflow

MLflow @mlflow.org · 9d

Building better LLM evals? Ben Wilson highlights how frameworks like #DSPy boost judge prompts—and reliability—as models evolve.

Tips for judge reproducibility/reliability: use reproducible pipelines, re-tune logic as endpoints change, & standardize. ✅

🎥 Watch more: www.youtube.com/live/HTxpmnO...

MLflow @mlflow.org · 12d

Ready to dive into #MLflow? 🚀

Join our text-based LIVE AMA with Danny Chiao & @danliden.com after Danny's MLOps World | GenAI Summit lightning talk!

🗓️ Oct 9 | 1–3pm CT
📍MLflow Slack | #General channel
🔗 RSVP: luma.com/liveama-slac...

#opensource #oss #genai #mlops #llmops

[Live AMA] From production tricks, labeling tips, to integrating MLflow for agent quality—ask anything MLflow! · Luma

Join us for a text-based Live AMA (Ask Me Anything) with Danny Chiao, following his lightning talk at MLOps World. This is your chance to dive deeper into…

luma.com

MLflow @mlflow.org · 14d

🚀 Headed to MLOps World | GenAI Summit 2025 next week? Don’t miss an exciting lightning talk from Danny Chiao, Engineering Lead at Databricks!

🎤 𝗧𝗲𝗰𝗵𝗻𝗶𝗾𝘂𝗲𝘀 𝘁𝗼 𝗯𝘂𝗶𝗹𝗱 𝗵𝗶𝗴𝗵 𝗾𝘂𝗮𝗹𝗶𝘁𝘆 𝗮𝗴𝗲𝗻𝘁𝘀 𝗳𝗮𝘀𝘁𝗲𝗿 𝘄𝗶𝘁𝗵 𝗠𝗟𝗳𝗹𝗼𝘄

🗓️ October 9
📍 Austin, TX
🔗 Learn more: mlopsworld.com#agenda

#MLflow #GenAI #MLOps #LLM

MLflow @mlflow.org · 14d

🚨 RESCHEDULED: Wednesday, October 8

The next MLflow Community Meetup happens NEXT Wednesday, Oct 8 at 4PM PT—and you won’t want to miss it.

RSVP here: luma.com/mlflow-1001

#opensource #oss #mlflow

MLflow @mlflow.org · 14d

Our next MLflow Community Meetup is happening TODAY—Wednesday, October 1 at 4PM PT! 🙌

Don’t miss this chance to connect and learn:
🔹 Smarter Evaluations with Trace-Aware, Feedback-Aligned Judges
🔹 Keeping Eval Datasets Relevant as Your App Evolves

✅ RSVP: luma.com/mlflow-1001

#oss #mlflow #genai

MLflow Community Meetup · Luma

Join us for the next MLflow Community Meetup on October 1 at 4PM PT! Ben Wilson, MLflow Maintainer, will dive deep into: Building Smarter Evals with…

luma.com

MLflow @mlflow.org · 15d

🚀 The fifth “Invoice Extraction with OpenAI + MLflow” session is now available! #MLflow Ambassador Shrinath Suresh dives into designing a custom scorer to evaluate invoice extraction models beyond just labels or LLM-as-a-judge.

🎥 youtu.be/SmuhOmOYXSg?...
📖 medium.com/@shrinath.su...

#opensource

MLflow @mlflow.org · 16d

🚨 Just 2 days away!

The next #MLflow Community Meetup happens this Wednesday, Oct 1 at 4PM PT—and you won’t want to miss it.

We will cover:
🔹 𝗕𝘂𝗶𝗹𝗱𝗶𝗻𝗴 𝗦𝗺𝗮𝗿𝘁𝗲𝗿 𝗘𝘃𝗮𝗹𝘀 𝘄𝗶𝘁𝗵 𝗧𝗿𝗮𝗰𝗲-𝗔𝘄𝗮𝗿𝗲, 𝗙𝗲𝗲𝗱𝗯𝗮𝗰𝗸-𝗔𝗹𝗶𝗴𝗻𝗲𝗱 𝗝𝘂𝗱𝗴𝗲𝘀
🔹 𝗞𝗲𝗲𝗽𝗶𝗻𝗴 𝗘𝘃𝗮𝗹 𝗗𝗮𝘁𝗮𝘀𝗲𝘁𝘀 𝗥𝗲𝗹𝗲𝘃𝗮𝗻𝘁 𝗮𝘀 𝗬𝗼𝘂𝗿 𝗔𝗽𝗽 𝗖𝗵𝗮𝗻𝗴𝗲𝘀

RSVP 👉 luma.com/mlflow-1001

#oss

MLflow @mlflow.org · 22d

Episode 3 of Invoice Extraction is live! 🚀

#MLflow Ambassador Shrinath Suresh explores prompt versioning, comparing, automating workflows, and effective reuse with MLflow’s Prompt Registry.

🎥 Watch: youtube.com/watch?v=fau8...
📖 Read: medium.com/@shrinath.su...

#opensource #oss #genai

#3 Prompt Engineering & MLflow Prompt Management

YouTube video by TheAIGuy

youtube.com

MLflow @mlflow.org · 27d

Businesses run on invoices. Getting structured data from them—fast and accurate—is critical.

🧩 GPT-5 is powerful for extraction
⚙️ MLflow 3 makes it repeatable & traceable

Demo + blog from #MLflow Ambassador Shrinath Suresh! 👇

▶️ youtu.be/E5GSWLhI5uA
📝 medium.com/@shrinath.su...

#opensource #oss

Introduction to invoice extraction using OpenAI GPT5 and MLflow 3.3

YouTube video by TheAIGuy

youtu.be

MLflow @mlflow.org · 28d

🚀 Join the next MLflow Community Meetup on Oct 1 at 4PM PT!

🔹 𝗕𝘂𝗶𝗹𝗱𝗶𝗻𝗴 𝗦𝗺𝗮𝗿𝘁𝗲𝗿 𝗘𝘃𝗮𝗹𝘀 𝘄𝗶𝘁𝗵 𝗧𝗿𝗮𝗰𝗲-𝗔𝘄𝗮𝗿𝗲, 𝗙𝗲𝗲𝗱𝗯𝗮𝗰𝗸-𝗔𝗹𝗶𝗴𝗻𝗲𝗱 𝗝𝘂𝗱𝗴𝗲𝘀
🔹 𝗞𝗲𝗲𝗽𝗶𝗻𝗴 𝗘𝘃𝗮𝗹 𝗗𝗮𝘁𝗮𝘀𝗲𝘁𝘀 𝗥𝗲𝗹𝗲𝘃𝗮𝗻𝘁 𝗮𝘀 𝗬𝗼𝘂𝗿 𝗔𝗽𝗽 𝗖𝗵𝗮𝗻𝗴𝗲𝘀

Bring your questions about dataset management and evaluation workflows! ✅

RSVP 👉 luma.com/mlflow-1001

#opensource #oss

MLflow @mlflow.org · Sep 15

This blog highlights how MLflow’s #GenAI capabilities streamline development of an LLM-based Optical Character Recognition (OCR) tool. These capabilities reduce friction, accelerate workflows, and deliver value to both technical and non-technical contributors.

🚀 Dive in: mlflow.org/blog/mlflow-...

MLflow @mlflow.org · Sep 11

This blog looks at the “Coffee Machine” approach: global teams set up standardized #ML pipelines, & local teams adapt them using their own #data. ☕

#MLflow supports every step, making it possible to track changes, register model variants, & maintain reproducibility.

🔗 medium.com/dscier/brewi...

MLflow @mlflow.org · Sep 9

📣 Happening Tomorrow — MLflow Office Hours!

Join #MLflow maintainers for a live Q&A session! Whether you’re running MLflow in production or experimenting with LLMs & GenAI, this is your chance to bring real challenges and get direct feedback.

🕒 Sept 10 @ 3PM SGT
🎟 RSVP: lu.ma/mlflow-910

#oss

MLflow @mlflow.org · Sep 5

The real unlock → MLflow’s tracing integration. Every tool call + reasoning step gets captured and replayable. When an agent fails, you can see why—not guess. Critical for debugging multi-step chains + production bottlenecks.

🔗 Learn more: mlflow.org/docs/latest/...

#LLMOps #AI #MLflow #oss

Evaluating Agents | MLflow

AI Agents are an emerging pattern of GenAI applications that can use tools, make decisions, and execute multi-step workflows. However, evaluating the performance of those complex agents is challenging...

mlflow.org

MLflow @mlflow.org · Sep 5

MLflow lets you create custom scorers for agent behavior: did it use the right tool, in the right order, with proper reasoning? Datasets can encode patterns + decisions, not just input–output. You’re testing how the agent thinks—not just what it outputs.

#AgentEvaluation #MLflow #opensource
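
As a minimal sketch of the "right tool, in the right order" check, here is a plain-Python scoring function. It is an assumption-laden illustration rather than MLflow's scorer API: `tools_called_in_order` and `score_tool_usage` are hypothetical helpers, and the tool names are invented. It treats the expected tools as an ordered subsequence of the tools the agent actually called, and it does not try to assess the quality of the reasoning.

```python
from typing import Iterable, Sequence


def tools_called_in_order(called: Iterable[str], expected: Sequence[str]) -> bool:
    """Return True if `expected` appears as an ordered subsequence of `called`."""
    # A single shared iterator means each lookup resumes where the last one
    # stopped, so matches must occur in the expected order.
    remaining = iter(called)
    return all(tool in remaining for tool in expected)


def score_tool_usage(called: Sequence[str], expected: Sequence[str]) -> dict:
    """Hypothetical scorer: which expected tools were used, and in what order."""
    missing = [tool for tool in expected if tool not in called]
    return {
        "used_expected_tools": not missing,
        "correct_order": tools_called_in_order(called, expected),
        "missing_tools": missing,
    }


# Tool names an agent called, in call order (invented for this example).
called = ["search_docs", "fetch_page", "summarize"]

in_order = score_tool_usage(called, ["search_docs", "summarize"])
out_of_order = score_tool_usage(called, ["summarize", "search_docs"])
incomplete = score_tool_usage(called, ["search_docs", "send_email"])

print(in_order)
print(out_of_order)
print(incomplete)
```

In a real evaluation the `called` list would come from the tool spans recorded in a trace, and a dataset row would carry the expected sequence alongside the input.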

MLflow @mlflow.org · Sep 5

Evaluating AI agents is tricky—they make multi-step decisions, use tools, and follow reasoning chains. To measure them, you need to assess the workflow, not just the final answer. That’s where MLflow’s new agent evaluation framework steps in.

🔗 Docs: mlflow.org/docs/latest/...

#MLflow #AI #MLOps

MLflow @mlflow.org · Aug 25

In this blog by #MLflow Ambassador Rahul Pandey, learn how the MLflow Prompt Registry helps teams move from scattered notebook files and chat threads to a streamlined prompt management process—one that’s versioned, searchable, and always ready for rollbacks. 🚀

🔗 Dive in: medium.com/dscier/end-t...

End-to-End Prompt Lifecycle Management and Optimization with MLflow

A complete journey from prompt creation to production deployment using semantic evaluation

medium.com

MLflow @mlflow.org · Aug 22

📣 MLflow 3.3.0 is now available!

This release introduces several major features and improvements:
🔹 𝗠𝗼𝗱𝗲𝗹 𝗥𝗲𝗴𝗶𝘀𝘁𝗿𝘆 𝗪𝗲𝗯𝗵𝗼𝗼𝗸𝘀
🔹 𝗔𝗴𝗻𝗼 𝗧𝗿𝗮𝗰𝗶𝗻𝗴 𝗜𝗻𝘁𝗲𝗴𝗿𝗮𝘁𝗶𝗼𝗻
🔹 𝗚𝗲𝗻𝗔𝗜 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻 𝗶𝗻 𝗢𝗦𝗦
🔹 𝗥𝗲𝘃𝗮𝗺𝗽𝗲𝗱 𝗧𝗿𝗮𝗰𝗲 𝗧𝗮𝗯𝗹𝗲 𝗩𝗶𝗲𝘄
🔹 𝗙𝗮𝘀𝘁𝗔𝗣𝗜 + 𝗨𝘃𝗶𝗰𝗼𝗿𝗻 𝗦𝗲𝗿𝘃𝗲𝗿

🔗 Check out the release notes: github.com/mlflow/mlflo...

#oss

MLflow @mlflow.org · Aug 21

📣 MLflow Office Hours — Wednesday, Sept 10

Connect directly with #MLflow maintainers and contributors for live Q&A! Bring your production challenges or your latest #LLM and #GenAI experiments—this session is dedicated to hands-on technical discussion and feedback.

Save your spot ➡️ lu.ma/mlflow-910

MLflow Office Hours | September 10 · Zoom · Luma

Join us for the next MLflow Office Hours on Wednesday, September 10, for an open Q&A session with MLflow maintainers and contributors! 🎊 Whether you're…

lu.ma
