Gateway Quickstart

This page is the gateway-focused quickstart. Point your SDK at https://api.inference.net/v1, add a couple of headers, and Catalyst captures every request with cost, latency, and full request/response payloads. If you’d rather see the higher-level Get Started flow, start with Record your first LLM call. The example below uses OpenAI. For other providers (Anthropic, Vertex AI, Gemini, OpenRouter, Cerebras, Groq, LangChain, ElevenLabs), see the Gateway overview.

Choose a setup path

Installing with AI is the quickest path. Use the manual flow if you would rather wire it up yourself.

Install with AI
Install manually

Use the Inference CLI to launch a coding agent (Claude Code, OpenCode, or Codex) that scans your codebase, updates your LLM clients, and adds the routing headers.

Install the CLI and authenticate

Install the Inference CLI globally and log in. Your browser will open to authenticate.

npm install -g @inference/cli && inf auth login

Run gateway instrumentation in your project

From your project root, run instrumentation in gateway mode.

cd /path/to/your/project && inf instrument --mode gateway

The command guides you through the following workflow:

Select a coding agent: Claude Code, OpenCode, or Codex.
Scan your codebase for LLM clients such as OpenAI, Anthropic, and LangChain.
Redirect base URLs to the Catalyst Gateway.
Add routing headers so requests are authenticated, forwarded, and tagged.
Add task IDs so each call site is grouped automatically in the dashboard.
Review the generated changes before applying them.

Choose the both mode instead of gateway to also install the tracing SDK in the same pass. To preview changes without modifying any files, run inf instrument --dry-run.

Run your app

Run your application as you normally would. Requests now flow through Gateway and appear in the dashboard.

View your results

Open the dashboard to see request details and analytics.

Want the full canonical guide for this workflow? See Install with AI.

Install manually

Use this path if you want to wire it up yourself. The example below uses OpenAI. For Anthropic, Vertex AI, Gemini, OpenRouter, Cerebras, Groq, LangChain, and ElevenLabs, see the per-provider guides linked from the Gateway overview.

Get your API keys

You need two keys:

Inference Catalyst project API key from your dashboard under API Keys
Provider API key (in this example, OpenAI) from your OpenAI account

Set them as environment variables:

export INFERENCE_API_KEY=<your-project-api-key>
export OPENAI_API_KEY=<your-openai-api-key>
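A missing key tends to surface later as an authentication error that is harder to trace, so it can help to check both variables at startup. The following is a minimal sketch, not part of the Catalyst tooling: requireEnv and loadGatewayKeys are hypothetical helper names, and the environment is passed in as a plain object so the snippet stays self-contained.

```typescript
// Hypothetical helpers (not part of any SDK): fail fast when a key is missing.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  // Treat unset and blank values the same way.
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Returns both keys, or throws on the first one that is missing.
function loadGatewayKeys(env: Env): { inferenceKey: string; providerKey: string } {
  return {
    inferenceKey: requireEnv(env, "INFERENCE_API_KEY"),
    providerKey: requireEnv(env, "OPENAI_API_KEY"),
  };
}
```

In a Node.js app you would call loadGatewayKeys(process.env) once, before constructing any client.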

Update your code

Point your SDK at https://api.inference.net/v1 and use your Catalyst project API key as the SDK’s apiKey. Your provider’s API key goes in the x-inference-provider-api-key header so the gateway can forward it, and the x-inference-provider header names the provider (here, openai). The gateway adds roughly 10 ms of latency and forwards your requests to the provider as-is.

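import OpenAI from "openai";

// Route requests through the Catalyst gateway instead of calling OpenAI directly.
const client = new OpenAI({
  baseURL: "https://api.inference.net/v1",
  // The SDK's apiKey is your Catalyst project key, not your OpenAI key.
  apiKey: process.env.INFERENCE_API_KEY,
  defaultHeaders: {
    // The gateway forwards this key to the provider.
    "x-inference-provider-api-key": process.env.OPENAI_API_KEY,
    // Names the provider for this request.
    "x-inference-provider": "openai",
  },
});

// The request itself is unchanged from a direct OpenAI call.
const response = await client.chat.completions.create({
  model: "gpt-4.1",
  messages: [{ role: "user", content: "Hello, world!" }],
});

console.log(response.choices[0].message.content);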
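If you are not using the OpenAI SDK, it may help to see roughly what the configuration above puts on the wire. The sketch below assumes the gateway accepts the Catalyst key as a Bearer token, which is how the OpenAI SDK transmits apiKey, and that the chat endpoint sits at /chat/completions under the base URL. buildGatewayRequest is a hypothetical helper, and the SDK sends additional headers that are omitted here.

```typescript
// Hypothetical helper: describes the HTTP request without sending it.
interface GatewayRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildGatewayRequest(
  inferenceKey: string,
  openaiKey: string,
  prompt: string,
): GatewayRequest {
  return {
    // Base URL from this guide plus the OpenAI-style chat completions path.
    url: "https://api.inference.net/v1/chat/completions",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Assumption: the Catalyst project key is sent as a Bearer token.
      Authorization: `Bearer ${inferenceKey}`,
      // Routing headers from this guide.
      "x-inference-provider-api-key": openaiKey,
      "x-inference-provider": "openai",
    },
    body: JSON.stringify({
      model: "gpt-4.1",
      messages: [{ role: "user", content: prompt }],
    }),
  };
}
```

To send it, split off the URL and pass the rest to fetch, for example: const { url, ...init } = buildGatewayRequest(inferenceKey, openaiKey, "Hello, world!"); followed by await fetch(url, init).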

Send a request

Run the snippet above. Once the request completes, Catalyst captures it automatically.

View your results

Open the dashboard to inspect the request and metrics.

Need a different provider? See the Gateway overview for per-provider guides, or use any OpenAI-compatible endpoint through the x-inference-provider-url header.
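As an illustration of the x-inference-provider-url option, the sketch below builds the headers for a self-hosted OpenAI-compatible endpoint. Only the two header names come from this guide. The helper name customProviderHeaders, the example URL, and the assumption that the provider key header is still sent alongside the URL header are all illustrative, so check the Gateway overview for the exact combination your provider needs.

```typescript
// Hypothetical helper: headers for an arbitrary OpenAI-compatible endpoint.
function customProviderHeaders(
  providerUrl: string,
  providerKey: string,
): Record<string, string> {
  // Reject obviously malformed values before they reach the gateway.
  if (!/^https?:\/\/\S+$/.test(providerUrl)) {
    throw new Error(`Provider URL must start with http:// or https://: ${providerUrl}`);
  }
  return {
    // Assumption: the provider key is forwarded the same way as in the OpenAI example.
    "x-inference-provider-api-key": providerKey,
    // Header named in this guide for OpenAI-compatible endpoints.
    "x-inference-provider-url": providerUrl,
  };
}

// Example with a made-up endpoint; pass the result as the SDK's defaultHeaders.
const exampleHeaders = customProviderHeaders("https://llm.example.com/v1", "provider-key");
```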

That’s it. Every request now flows through Gateway and gets captured automatically.

What gets captured

Once traffic is flowing, Catalyst records:

The full request and response payloads
Cost per call and aggregate spend
Latency, including time to first token (TTFT) and tokens per second
Token counts (input and output)
Error rates and status codes
Model and provider
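For readers new to the streaming metrics in this list, here is one common way to derive them from timestamps. This is a sketch under stated assumptions: this page does not give Catalyst's exact definitions, so the formula for tokens per second (output tokens divided by the time between the first and last token) and the names StreamTiming and streamMetrics are illustrative only.

```typescript
// Hypothetical timing record for one streamed response (milliseconds).
interface StreamTiming {
  requestSentMs: number;
  firstTokenMs: number;
  lastTokenMs: number;
  outputTokens: number;
}

function streamMetrics(t: StreamTiming): {
  ttftMs: number;
  totalLatencyMs: number;
  tokensPerSecond: number;
} {
  // Time to first token: how long the caller waited before any output arrived.
  const ttftMs = t.firstTokenMs - t.requestSentMs;
  // End-to-end latency for the whole response.
  const totalLatencyMs = t.lastTokenMs - t.requestSentMs;
  // Assumption: throughput is measured over the generation window only.
  const generationSeconds = (t.lastTokenMs - t.firstTokenMs) / 1000;
  const tokensPerSecond =
    generationSeconds > 0 ? t.outputTokens / generationSeconds : 0;
  return { ttftMs, totalLatencyMs, tokensPerSecond };
}

// 100 tokens streamed over 2 seconds after a 250 ms wait.
const exampleMetrics = streamMetrics({
  requestSentMs: 0,
  firstTokenMs: 250,
  lastTokenMs: 2250,
  outputTokens: 100,
});
// exampleMetrics is { ttftMs: 250, totalLatencyMs: 2250, tokensPerSecond: 50 }
```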

Where to find your data

Metrics Explorer for cost, latency, errors, and usage across all your LLM calls
Inference Viewer to browse and filter individual requests and responses

Next steps

Gateway overview

Routing headers, supported providers, and the full set of OpenAI-compatible base URLs.

Connect more providers

Set up Anthropic, Vertex AI, Gemini, OpenRouter, Cerebras, Groq, and more.

Organize with tasks

Group LLM calls by feature or objective to track metrics separately.

Build a dataset

Turn captured traffic into datasets for evals and training.

Integrations

Gateway

Traces
