Key concepts
| Concept | Description |
|---|---|
| Gateway | Edge layer between your app and your LLM provider. Records traffic with less than 10 ms of overhead. |
| Inference | A single LLM call recorded by the Gateway. Includes the request, response, cost, latency, and token counts. |
| Trace | A multi-step execution captured through OpenTelemetry. Useful for agents, tools, framework runs, and custom orchestration. |
| Span | One step inside a trace, such as a model call, tool call, retriever, graph node, or custom application operation. |
| Task | A user-defined objective (like “summarize docs” or “classify tickets”) that groups related inferences so you can track each AI feature independently. |
| Metrics | Aggregated cost, latency, error rates, and token usage across your inferences. Filterable by model, task, or provider. |
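
To make the relationship between inferences, tasks, and metrics concrete, here is a minimal sketch in plain Python. It is an illustration only, not the product's schema or API: the `Inference` class, its field names, the `aggregate` helper, and the model and provider names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Inference:
    """One LLM call. Field names are hypothetical, chosen for illustration."""
    model: str
    provider: str
    task: str            # the user-defined objective this call belongs to
    cost_usd: float
    latency_ms: float
    input_tokens: int
    output_tokens: int
    error: bool = False


def aggregate(inferences, *, model=None, task=None, provider=None):
    """Aggregate cost, latency, error rate, and token usage.

    Each filter is optional; passing None means "do not filter on this field".
    Returns None when no inference matches the filters.
    """
    selected = [
        inf for inf in inferences
        if (model is None or inf.model == model)
        and (task is None or inf.task == task)
        and (provider is None or inf.provider == provider)
    ]
    if not selected:
        return None
    n = len(selected)
    return {
        "count": n,
        "total_cost_usd": sum(i.cost_usd for i in selected),
        "avg_latency_ms": sum(i.latency_ms for i in selected) / n,
        "error_rate": sum(i.error for i in selected) / n,
        "total_tokens": sum(i.input_tokens + i.output_tokens for i in selected),
    }


# Made-up sample data: two calls for one task, one call for another.
calls = [
    Inference("model-a", "provider-x", "summarize docs", 0.5, 800.0, 1200, 300),
    Inference("model-a", "provider-x", "summarize docs", 0.25, 600.0, 900, 100, error=True),
    Inference("model-b", "provider-y", "classify tickets", 0.125, 200.0, 150, 10),
]

# Metrics for a single task, tracked independently of the other task.
summary = aggregate(calls, task="summarize docs")
print(summary)
```

Filtering by `task` here mirrors the idea that a task groups related inferences, so each AI feature gets its own cost, latency, and error figures. The same helper accepts `model` or `provider` to slice the data the other ways the table mentions.
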
Next steps
| Guide | Description |
|---|---|
| Set up tasks | Group your LLM calls by objective. |
| Integrate with your LLM provider | Connect your app and start capturing traffic. |
| Metrics Explorer | See your LLM usage dashboards. |
| Inference Viewer | Browse individual LLM calls. |
| Trace CLI | Inspect trace trees, span timelines, facets, and exports from the terminal. |