arize-phoenix
@arize-phoenix.bsky.social
Open-Source AI Observability and Evaluation
app.phoenix.arize.com
🚀 New OpenInference integration with @pipecat.bsky.social

We released native OpenInference instrumentation for Pipecat using OTEL, enabling end-to-end observability across real-time agent pipelines.

Get rich semantic traces with zero manual instrumentation in Arize Phoenix.

Demo Below 🎬
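A minimal setup sketch, assuming the usual Phoenix OTEL pattern (not verbatim from the release): register Phoenix's tracer provider and enable auto-instrumentation so installed OpenInference instrumentors, such as the new Pipecat one, are picked up automatically. The project name is illustrative; check the Phoenix docs for exact package and option names.

```python
# Hedged sketch: register Phoenix as the OTEL tracer provider and let it
# auto-discover installed OpenInference instrumentors (e.g. Pipecat's).
from phoenix.otel import register

tracer_provider = register(
    project_name="pipecat-voice-agent",  # illustrative project name
    auto_instrument=True,  # pick up installed OpenInference instrumentors
)
```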
January 26, 2026 at 6:46 PM
CLI Prompt Commands: Pipe Prompts to AI Assistants
January 22, 2026 at 6:56 AM
Create Datasets from Traces with Span Associations

Creating datasets from traces has been available in the Phoenix UI. Now, the Phoenix client libraries enable programmatic dataset creation from production traces, with links between dataset examples and their source spans.

Available in arize-phoenix-client 1.28.0+ (Python) and @arizeai/phoenix-client 2.0.0+ (TypeScript)
January 21, 2026 at 9:15 AM
📣 the Phoenix CLI: Terminal Access for AI Coding Assistants such as @claude_code and @geminicli

@arizeai/phoenix-cli is a command-line interface for retrieving trace data. It provides the same observability data available in the Phoenix UI through shell commands and file exports.
January 17, 2026 at 9:39 AM
"Span" comes from "span of time" - that's why a span can carry events, a simple log of things that happen during that interval. Phoenix now supports displaying span event attributes in the UI.

span.add_event(
    name="model.config",
    attributes={"model": "gpt-4o", "temperature": 0.7},  # illustrative attributes
)
January 12, 2026 at 8:46 PM
We released a new Python Tracing Quickstart for Phoenix!

Agents don’t fail like traditional software. They can misclassify, retrieve the wrong docs, or forget context.

Learn the workflows that turn agents into systems you can understand and assess: arize.com/docs/phoeni...
January 9, 2026 at 5:00 PM
🏢 Phoenix now supports on-prem LDAP authentication!
For enterprises that can't use cloud SSO, Phoenix integrates directly with your internal directory—Active Directory, OpenLDAP, or any LDAP v3 server.
Your data. Your infrastructure. Your identity provider.
December 16, 2025 at 6:39 PM
Phoenix Evals now supports message-based LLM-as-a-judge prompts, an upgrade that aligns evals with how modern models actually expect instructions.

🧵👇
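To make "message-based" concrete, here is a stdlib-only sketch of a judge prompt expressed as a list of chat messages rather than one flat template string. The message dicts follow the generic OpenAI-style chat shape; this is an illustration, not the Phoenix Evals API itself.

```python
# Illustrative message-based LLM-as-a-judge prompt: system message carries the
# grading instructions, user message carries the data under evaluation.
def build_judge_messages(question: str, answer: str) -> list[dict]:
    """Return a chat-format judge prompt for relevance grading."""
    return [
        {
            "role": "system",
            "content": (
                "You are an impartial judge. Grade whether the answer is "
                "relevant to the question. Respond with exactly one word: "
                "'relevant' or 'irrelevant'."
            ),
        },
        {
            "role": "user",
            "content": f"Question: {question}\nAnswer: {answer}",
        },
    ]

messages = build_judge_messages("What is Phoenix?", "An AI observability tool.")
print(messages[0]["role"])  # system
print(len(messages))        # 2
```

Splitting instructions (system) from data (user) is the main practical win: chat models are trained to treat the two differently, which single-string templates can't express.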
December 11, 2025 at 4:37 AM
📚 New end-to-end Phoenix tutorials for Python and TypeScript are live!

Learn the core workflows to:

1. Interpret agent traces
2. Define tasks & build datasets for experiments
3. Construct purposeful LLM evals
4. Iterate based on results to improve reliability

Choose your language & dive in!⬇️
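The four steps above can be sketched in plain Python. In this stdlib-only toy, the task and the judge are stubbed with ordinary functions; in a real setup both would call an LLM and the results would be logged to Phoenix.

```python
# 1. The task under test (stub for an agent/LLM call).
def task(question: str) -> str:
    return f"Echo: {question}"

# 2. A tiny dataset of examples.
dataset = [
    {"input": "What is tracing?", "expected_keyword": "Echo"},
    {"input": "What is an eval?", "expected_keyword": "Echo"},
]

# 3. An evaluator (stub for an LLM-as-a-judge call): 1.0 = pass, 0.0 = fail.
def judge(output: str, expected_keyword: str) -> float:
    return 1.0 if expected_keyword in output else 0.0

# 4. Run the experiment and aggregate scores to guide the next iteration.
scores = [judge(task(ex["input"]), ex["expected_keyword"]) for ex in dataset]
print(sum(scores) / len(scores))  # 1.0
```

The loop shape is the point: once task, dataset, and judge are separate pieces, you can change any one of them and re-run the other two unchanged.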
December 3, 2025 at 5:00 PM
TypeScript Evals Quickstart: arize.com/docs/phoeni...
November 26, 2025 at 5:00 PM
Run evals fast with our TypeScript Evals Quickstart!

The new TypeScript Evals package aims to be a simple & powerful way to evaluate your agents:
✅ Define a task (what the agent does)
✅ Build a dataset
✅ Use an LLM-as-a-Judge evaluator to score outputs
✅ Run evals and see results in Phoenix
Docs 👇
November 26, 2025 at 5:00 PM
Dig into agent traces without a single line of code!

Our new live Phoenix Demos let you explore every step of an agent’s reasoning just by chatting with pre-built agents, with traces appearing instantly as you go.
November 20, 2025 at 3:25 PM
🌀 Since LLMs are probabilistic, their outputs can differ even when the supplied prompts are exactly the same. That makes it hard to tell whether a particular change is warranted: a single execution cannot tell you whether the change improves or degrades your task.
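A stdlib-only illustration of that point: model each prompt variant as a noisy pass/fail task with a made-up pass rate, and compare variants by mean score over many trials instead of a single run.

```python
# Simulate a nondeterministic task with a seeded RNG; the pass rates are
# invented purely for illustration.
import random
import statistics

def run_task(pass_rate: float, rng: random.Random) -> float:
    """One probabilistic execution: returns 1.0 (pass) or 0.0 (fail)."""
    return 1.0 if rng.random() < pass_rate else 0.0

def evaluate(pass_rate: float, trials: int = 200, seed: int = 0) -> float:
    """Mean score over many trials - a stable basis for comparison."""
    rng = random.Random(seed)
    return statistics.mean(run_task(pass_rate, rng) for _ in range(trials))

baseline = evaluate(pass_rate=0.70)   # prompt variant A
candidate = evaluate(pass_rate=0.80)  # prompt variant B

# A single trial of each could easily rank the variants the wrong way;
# averaging over many trials makes the comparison stable.
print(baseline < candidate)  # True
```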
September 26, 2025 at 11:48 PM
Trace Flowise apps with Arize Phoenix 🔍

Flowise is fast, visual, and low-code — but what happens under the hood?

With the new Arize Phoenix integration, you can debug, inspect, and visualize your LLM applications and agent workflows with 1 configuration step - no code required.
April 15, 2025 at 9:03 PM
Use Ragas with Arize AI @arize.bsky.social or Arize Phoenix to improve the evaluation of your LLM applications

Together you can:

✅ Evaluate performance with Ragas metrics
✅ Visualize and understand LLM behavior through traces & experiments in Arize or Phoenix

Dive into our docs & notebooks ⬇️
April 9, 2025 at 12:20 AM
New in the Phoenix client: Prompt Tagging 🏷️

📌Tag prompts in code and see those tags reflected in the UI
📌Tag prompt versions as development, staging, or production — or define your own
📌Add in tag descriptions for more clarity

Manage your prompt lifecycles with confidence🚀
April 4, 2025 at 7:24 PM
Better LLMs start with better data and observability

We’ve integrated @CleanlabAI’s Trustworthy Language Model (TLM) with Phoenix to help teams improve LLM reliability and performance

🔗 Dive into the full implementation in our docs & notebook:
March 20, 2025 at 7:50 PM
Some updates for Projects! Gain more flexibility and control with:

📌 Persistent column selection for consistent views
🔍 Filter data directly from tables with metadata and quick metadata filters
⏳ Set custom time ranges for traces & spans
🌳 Option to filter spans by root spans

Check out the demo👇
March 7, 2025 at 11:39 PM
🧠 Phoenix now supports Claude 3.7 Sonnet & Thinking Budgets!

This makes Prompt Playground ideal for side-by-side reasoning tests: o3 vs. Anthropic vs. R1.

Plus, GPT-4.5 support keeps it up to date with the latest from OpenAI & Anthropic - test them all out in the playground! ⚡️
March 7, 2025 at 5:29 PM