Skip to main content
This page is the gateway-focused quickstart. Point your SDK at https://api.inference.net/v1, add a couple of headers, and Catalyst captures every request with cost, latency, and full request/response payloads. If you’d rather see the higher-level Get Started flow, start with Record your first LLM call. The example below uses OpenAI. For other providers (Anthropic, Vertex AI, Gemini, OpenRouter, Cerebras, Groq, LangChain, ElevenLabs), see the Gateway overview.

Choose a setup path

Installing with AI is the quickest. Use the manual flow if you want to wire it up yourself.
Use the Inference CLI to launch a coding agent like Claude Code, OpenCode, or Codex to scan your codebase, update your LLM clients, and add the routing headers.
1

Install the CLI and authenticate

Install the Inference CLI globally and log in. Your browser will open to authenticate.
npm install -g @inference/cli && inf auth login
2

Run gateway instrumentation in your project

From your project root, run instrumentation in gateway mode.
cd /path/to/your/project && inf instrument --mode gateway
The command guides you through the following workflow:
  • Select a coding agent: Claude Code, OpenCode, or Codex.
  • Scan your codebase for LLM clients such as OpenAI, Anthropic, LangChain, etc.
  • Redirect base URLs to the Catalyst Gateway.
  • Add routing headers so requests are authenticated, forwarded, and tagged.
  • Add task IDs so each call site is grouped automatically in the dashboard.
  • Review the generated changes before applying them.
Pick both instead of gateway to also install the tracing SDK in the same pass. Run inf instrument --dry-run to preview changes without modifying any files.
3

Run your app

Run your application how you normally would. Requests now flow through Gateway and appear in the dashboard.
4

View your results

Open the dashboard to see request details and analytics.
Want the full canonical guide for this workflow? See Install with AI.
That’s it. Every request now flows through Gateway and gets captured automatically.

Headers

These headers control routing, authentication, and how the request gets tagged in the dashboard. The only one required for every request is Authorization. Add the others as needed.
HeaderRequiredDescription
AuthorizationYesBearer <your-project-api-key>. Authenticates the request to Catalyst and selects the project scope. For OpenAI-compatible SDKs, set this as the SDK’s apiKey.
x-inference-provider-api-keyWhen proxying a providerYour downstream provider’s API key (OpenAI, Groq, Cerebras, etc.). The gateway forwards it as bearer auth so your code never has to. For Anthropic’s native /v1/messages route, use x-api-key instead.
x-inference-providerOptionalForces routing to a specific provider (openai, anthropic, groq, cerebras, vertex-ai, gemini). When omitted, the gateway infers the provider from the SDK path or base URL. Set this only when you want to override that inference.
x-inference-provider-urlOptionalRoutes to any OpenAI-compatible provider by base URL, even one without a dedicated integration. For third-party OpenAI-compatible URLs, the gateway infers OpenAI automatically. Pair with x-inference-provider only when you want to force a specific provider name.
x-inference-environmentOptionalTags the request with an environment name like production, staging, or development. Filterable in the dashboard.
x-inference-task-idOptionalGroups requests under a logical task such as summarize-docs or chat-support. Useful for filtering, analytics, and building datasets.
x-inference-metadata-*OptionalAttach arbitrary metadata to a request. The x-inference-metadata- prefix is stripped to form the key (e.g., x-inference-metadata-chat-id: abc123 stores chat-id: abc123). Filter inferences and create datasets in the dashboard using these keys.

Provider base URLs

The base URL you point your SDK at determines which provider the gateway forwards to. Most providers don’t need an explicit x-inference-provider header, the gateway figures it out from the URL.
ProviderBase URLNote
OpenAIhttps://api.openai.com/v1Default routing, no provider header needed.
OpenRouterhttps://openrouter.ai/apiNo provider header needed unless you want to force openai.
Anthropichttps://api.anthropic.com/v1No provider header needed for /v1/messages or api.anthropic.com.
Google Geminihttps://generativelanguage.googleapis.comUse /v1beta/models/* native paths, or /v1beta/openai for OpenAI-compatible calls.
Vertex AIhttps://aiplatform.googleapis.com/v1/projects/{project}/locations/global/endpoints/openapiSet x-inference-provider: vertex-ai.
Azure OpenAIhttps://{resource}.openai.azure.com/openai/deployments/{deployment}
Groqhttps://api.groq.com/openai/v1
Together AIhttps://api.together.xyz/v1
Fireworks AIhttps://api.fireworks.ai/inference/v1
Perplexityhttps://api.perplexity.ai
Mistralhttps://api.mistral.ai/v1
DeepSeekhttps://api.deepseek.com/v1
Cerebrashttps://api.cerebras.ai/v1
Inference.nethttps://api.inference.net/v1

What gets captured

Once traffic is flowing, Catalyst records:
  • The full request and response payloads
  • Cost per call and aggregate spend
  • Latency, including time to first token (TTFT) and tokens per second
  • Token counts (input and output)
  • Error rates and status codes
  • Model and provider

Where to find your data

Next steps

Gateway overview

Routing headers, supported providers, and the full set of OpenAI-compatible base URLs.

Connect more providers

Set up Anthropic, Vertex AI, Gemini, OpenRouter, Cerebras, Groq, and more.

Organize with tasks

Group LLM calls by feature or objective to track metrics separately.

Build a dataset

Turn captured traffic into datasets for evals and training.