# Gateway

> Record and analyze your production LLM traffic

Catalyst Gateway captures LLM requests flowing through your products. It stores the raw request, response, and metadata associated with each invocation of an LLM.

Recorded data gives you metrics and visibility into your LLM token usage, cost, latency, and error rates across all your providers in a single, unified view. The same data also powers downstream model evaluation and training.

Catalyst Gateway supports all major LLM providers and frameworks. For agents,
tools, framework runs, and custom orchestration, Catalyst Tracing captures full
trace trees and individual spans in addition to gateway inferences. View the
[integrations guide](/integrations/overview) for in-depth instructions.

<Frame>
  <iframe style={{ width: "100%", aspectRatio: "16 / 9", border: 0, display: "block" }} src="https://www.youtube.com/embed/S5ddaKQ-REU?list=PLJzp7SN2tfJsRAU9VGSfSo60CyDJzqhLP&rel=0" title="Observe Every LLM Call in One Place | Catalyst" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowFullScreen />
</Frame>

## Key concepts

| Concept       | Description                                                                                                                                           |
| ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Gateway**   | Edge layer between your app and LLM provider. Records traffic with \< 10 ms overhead.                                                                 |
| **Inference** | A single LLM call stored by Gateway. Includes request, response, cost, latency, and token counts.                                                     |
| **Trace**     | A multi-step execution captured through OpenTelemetry. Useful for agents, tools, framework runs, and custom orchestration.                            |
| **Span**      | One step inside a trace, such as a model call, tool call, retriever, graph node, or custom application operation.                                     |
| **Task**      | A user-defined objective (like "summarize docs" or "classify tickets") that groups related inferences so you can track each AI feature independently. |
| **Metrics**   | Aggregated cost, latency, error rates, and token usage across your inferences. Filterable by model, task, or provider.                                |

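To make the relationship between inferences, tasks, and metrics concrete, here is a minimal Python sketch. It is not the Catalyst API or storage schema: the `Inference` fields, the `aggregate` helper, and the sample values are all hypothetical, chosen only to show how per-call records could roll up into metrics filterable by task, model, or provider.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Inference:
    """Hypothetical record for one LLM call; field names are illustrative only."""
    task: str
    provider: str
    model: str
    cost_usd: float
    latency_ms: float
    input_tokens: int
    output_tokens: int
    is_error: bool = False


def aggregate(inferences, group_by="task"):
    """Roll inferences up into per-group cost, latency, error, and token metrics."""
    groups = defaultdict(list)
    for inf in inferences:
        # group_by names an Inference field: "task", "model", or "provider"
        groups[getattr(inf, group_by)].append(inf)

    metrics = {}
    for key, items in groups.items():
        n = len(items)
        metrics[key] = {
            "calls": n,
            "total_cost_usd": sum(i.cost_usd for i in items),
            "avg_latency_ms": sum(i.latency_ms for i in items) / n,
            "error_rate": sum(i.is_error for i in items) / n,
            "total_tokens": sum(i.input_tokens + i.output_tokens for i in items),
        }
    return metrics


# Made-up sample data: two calls for one task, one call for another.
sample = [
    Inference("summarize docs", "provider-x", "model-a", 0.5, 800.0, 1200, 300),
    Inference("summarize docs", "provider-x", "model-a", 0.25, 1200.0, 900, 100, is_error=True),
    Inference("classify tickets", "provider-y", "model-b", 0.125, 400.0, 200, 10),
]

by_task = aggregate(sample, group_by="task")
by_provider = aggregate(sample, group_by="provider")
```

Grouping by `task` rather than only by model or provider is what lets each AI feature be tracked independently, even when several features share the same model.
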
## Next steps

<CardGroup cols={2}>
  <Card title="Set up tasks" icon="bullseye" href="/platform/gateway/tasks">
    Group your LLM calls by objective.
  </Card>

  <Card title="Integrate with your LLM provider" icon="plug" href="/platform/gateway/integrate">
    Connect your app and start capturing traffic.
  </Card>

  <Card title="Metrics Explorer" icon="chart-line" href="/platform/gateway/metrics-explorer">
    See your LLM usage dashboards.
  </Card>

  <Card title="Inference Viewer" icon="list" href="/platform/gateway/inference-viewer">
    Browse individual LLM calls.
  </Card>

  <Card title="Trace CLI" icon="terminal" href="/cli/traces">
    Inspect trace trees, span timelines, facets, and exports from the terminal.
  </Card>
</CardGroup>
