Catalyst’s gateway sits between your application and your LLM provider. This page shows two ways to capture your first LLM call: let the CLI instrument your codebase for you, or wire it up manually.

Choose a setup path

Installing with AI is quickest. Use the manual flow if you want to review each change yourself.
Use the Inference CLI to launch a coding agent such as Claude Code that scans your codebase, updates your LLM clients, and adds the required request metadata.
1. Install the CLI and authenticate

Install the Inference CLI globally and log in. Your browser will open to authenticate.
npm install -g @inference/cli && inf auth login
2. Run instrumentation in your project

Navigate to your project root and run instrumentation.
cd /path/to/your/project && inf instrument
The command guides you through the following workflow:
  • Select a coding agent: Claude Code, OpenCode, or Codex
  • Scan your codebase for LLM clients such as OpenAI, Anthropic, and LangChain
  • Redirect base URLs to the gateway
  • Add routing headers so requests are authenticated, forwarded, and traced
  • Add task IDs so each call site is grouped automatically in the dashboard
  • Review the generated changes before applying them
Run inf instrument --dry-run to preview changes without modifying any files.
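To make the base-URL and header changes concrete, here is a minimal sketch of what an instrumented call site might look like. The gateway URL and header names below (X-Provider, X-Task-Id, and so on) are placeholders, not Catalyst's actual values; inf instrument writes the correct ones for your project.

```python
import urllib.request

# Hypothetical values for illustration only -- `inf instrument` fills in
# the real gateway URL and header names for your account.
GATEWAY_BASE_URL = "https://gateway.example.com/v1"  # was: the provider's API URL
ROUTING_HEADERS = {
    "Authorization": "Bearer <your-gateway-key>",  # authenticates the request
    "X-Provider": "openai",                        # tells the gateway where to forward
    "X-Task-Id": "checkout-summarizer",            # groups this call site in the dashboard
}

def build_request(path: str, body: bytes) -> urllib.request.Request:
    """Build a chat-completion request that targets the gateway instead of
    the provider directly. Only the base URL and headers change; the body
    stays exactly what your LLM client already sends."""
    req = urllib.request.Request(GATEWAY_BASE_URL + path, data=body, method="POST")
    for name, value in ROUTING_HEADERS.items():
        req.add_header(name, value)
    return req

req = build_request("/chat/completions", b'{"model": "gpt-4o", "messages": []}')
```

The key point is that instrumentation never touches your prompts or payloads; it only redirects where the request goes and attaches metadata the gateway uses for auth, forwarding, and tracing.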
3. Run your app

Run your application as you normally would to produce inference requests. Requests from your application now route through the gateway and will appear in the dashboard.
4. View your results

Open the dashboard to see request details, traces, and analytics.
Want the full canonical guide for this workflow? See Install with AI.
That’s it. Every request now flows through Catalyst and gets captured automatically.

What gets captured

Once traffic is flowing, Catalyst records:
  • The full request and response payloads
  • Cost per call and aggregate spend
  • Latency (end-to-end and time to first token)
  • Token counts (input and output)
  • Error rates and status codes
  • Model and provider
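Conceptually, each captured call bundles those fields into one record, which is what the dashboard aggregates. As a hedged sketch with illustrative field names (not Catalyst's actual export schema), here is how per-model spend and error counts fall out of such records:

```python
from collections import defaultdict

# Hypothetical captured records -- field names are illustrative,
# not Catalyst's actual schema.
records = [
    {"model": "gpt-4o", "provider": "openai", "cost_usd": 0.0031,
     "latency_ms": 820, "ttft_ms": 140, "input_tokens": 412,
     "output_tokens": 96, "status": 200},
    {"model": "claude-sonnet", "provider": "anthropic", "cost_usd": 0.0024,
     "latency_ms": 610, "ttft_ms": 95, "input_tokens": 388,
     "output_tokens": 72, "status": 200},
    {"model": "gpt-4o", "provider": "openai", "cost_usd": 0.0029,
     "latency_ms": 905, "ttft_ms": 160, "input_tokens": 455,
     "output_tokens": 88, "status": 429},
]

# Aggregate spend and error counts per model, the way a dashboard would.
spend = defaultdict(float)
errors = defaultdict(int)
for r in records:
    spend[r["model"]] += r["cost_usd"]
    if r["status"] >= 400:
        errors[r["model"]] += 1
```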

Where to find your data

  • Metrics Explorer - dashboards for cost, latency, errors, and usage across all your LLM calls
  • Inference Viewer - browse and filter individual requests and responses

Next steps

Connect more providers

Set up Anthropic, Cerebras, Groq, and other providers.

Organize with tasks

Group LLM calls by feature or objective to track metrics separately.

Build a dataset

Turn captured traffic into datasets for evals and training.

Upload a dataset

Already have data? Upload a JSONL file to start evaluating or training.
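JSONL is simply one JSON object per line. As a minimal sketch of preparing a file for upload (the prompt/completion field names below are assumptions; check the upload page for the schema Catalyst actually expects):

```python
import json

# Illustrative examples only -- the field names are hypothetical,
# not Catalyst's required upload schema.
examples = [
    {"prompt": "Summarize: the cat sat on the mat.",
     "completion": "A cat sat on a mat."},
    {"prompt": "Translate to French: hello",
     "completion": "bonjour"},
]

# Write one JSON object per line.
with open("dataset.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Read it back line by line, as an uploader would.
with open("dataset.jsonl") as f:
    rows = [json.loads(line) for line in f]
```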