Open-source AI observability platform for LLM tracing, evaluation, and monitoring. Use when debugging LLM applications with detailed traces, running evaluations on datasets, or monitoring production AI systems with real-time insights.
# Phoenix - AI Observability Platform

Open-source AI observability and evaluation platform for LLM applications with tracing, evaluation, datasets, experiments, and real-time monitoring.

## When to use Phoenix

**Use Phoenix when:**

- Debugging LLM application issues with detailed traces
- Running systematic evaluations on datasets
- Monitoring production LLM systems in real time
- Building experiment pipelines for prompt/model comparison
- Self-hosting observability without vendor lock-in

**Key features:**

- **Tracing**: OpenTelemetry-based trace collection for any LLM framework (sketch below)
- **Evaluation**: LLM-as-judge evaluators for quality assessment (sketch below)
- **Datasets**: Versioned test sets for regression testing
- **Experiments**: Compare prompts, models, and configurations (sketch below)
- **Playground**: Interactive prompt testing with multiple models
- **Open-source**: Self-hosted with PostgreSQL or SQLite

**Use alternatives instead:**

- **LangSmith**: Managed platform with LangChain-first integration
- **Weights & Biases**: Deep learning experiment tracking focus
- **Arize Cloud**: Managed Phoenix with enterprise features
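## Usage sketches

Tracing rides on OpenTelemetry: you point a tracer provider at Phoenix and auto-instrument your LLM client. A minimal sketch, assuming the `arize-phoenix` and `openinference-instrumentation-openai` packages are installed and your app calls the OpenAI SDK; the project name `my-llm-app` is illustrative:

```python
import phoenix as px
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

# Launch a local Phoenix instance (SQLite-backed by default) and open the UI.
session = px.launch_app()

# Point an OpenTelemetry tracer provider at Phoenix; the project name
# "my-llm-app" is illustrative.
tracer_provider = register(project_name="my-llm-app")

# Auto-instrument OpenAI SDK calls so each request appears as a trace.
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
```

From here, every OpenAI call in the process is captured and viewable in the Phoenix UI without further code changes.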
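For LLM-as-judge evaluation, `phoenix.evals` ships classification helpers and built-in judge prompt templates. A sketch using the hallucination template, assuming a recent `arize-phoenix` with the evals extra and an `OPENAI_API_KEY` in the environment; the toy dataframe and its `input`/`reference`/`output` columns (the template's variables) are illustrative:

```python
import pandas as pd
from phoenix.evals import (
    HALLUCINATION_PROMPT_RAILS_MAP,
    HALLUCINATION_PROMPT_TEMPLATE,
    OpenAIModel,
    llm_classify,
)

# Toy data; column names mirror the template variables.
df = pd.DataFrame(
    {
        "input": ["What is Phoenix?"],
        "reference": ["Phoenix is an open-source AI observability platform."],
        "output": ["Phoenix is a closed-source billing tool."],
    }
)

# LLM-as-judge: classify each row as factual vs. hallucinated.
results = llm_classify(
    dataframe=df,
    model=OpenAIModel(model="gpt-4o"),
    template=HALLUCINATION_PROMPT_TEMPLATE,
    rails=list(HALLUCINATION_PROMPT_RAILS_MAP.values()),
    provide_explanation=True,
)
print(results[["label", "explanation"]])
```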
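Datasets and experiments tie tracing and evaluation together: upload a versioned test set, run a task over it, and score the outputs with evaluators. A sketch against a locally running Phoenix server; the dataset name, the `my_llm_app` stub, and the `contains_answer` evaluator are all hypothetical stand-ins:

```python
import pandas as pd
import phoenix as px
from phoenix.experiments import run_experiment

df = pd.DataFrame(
    {
        "input": ["What is Phoenix?"],
        "output": ["An open-source AI observability platform."],
    }
)

# Upload a versioned dataset; assumes a Phoenix server is reachable locally.
client = px.Client()
dataset = client.upload_dataset(
    dataset_name="qa-regression-set",  # illustrative name
    dataframe=df,
    input_keys=["input"],
    output_keys=["output"],
)

def my_llm_app(question: str) -> str:
    # Hypothetical stand-in for the application under test.
    return f"Answer about: {question}"

def task(input):
    # Phoenix binds `input` to each example's input dict.
    return my_llm_app(input["input"])

def contains_answer(output, expected):
    # Simple code evaluator; `expected` is the example's output dict.
    return expected["output"].lower() in str(output).lower()

run_experiment(
    dataset,
    task=task,
    evaluators=[contains_answer],
    experiment_name="baseline",
)
```

Re-running the experiment with a different prompt or model under a new `experiment_name` lets you compare runs side by side in the Phoenix UI.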