Install with AI - Inference.net Documentation

The Inference CLI is the fastest way to connect your app to Catalyst. Guided by an AI coding agent (Claude Code, OpenCode, or Codex), it scans your codebase, finds your LLM clients, and then routes them through the gateway, adds trace collection, or does both.

Want your AI coding assistant to query Catalyst resources directly? Configure the MCP server.

Install with AI works with OpenAI, Anthropic, Amazon Bedrock, Gemini, Vertex AI, Groq, Cerebras, OpenRouter, LangChain and more.

What it sets up

When you run `inf instrument`, you’ll be asked which Catalyst product to set up:

Mode	What it does
`gateway`	Routes every LLM call through the Catalyst proxy so cost, latency, and usage telemetry land in Observability.
`tracing`	Adds the Catalyst tracing SDK and instruments your agents, tools, and provider calls with span trees.
`both`	Sets up gateway routing first, then trace collection — in a single agent session.

Pass `--mode gateway`, `--mode tracing`, or `--mode both` to skip the prompt (useful in CI or scripted runs).
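For scripted runs, the invocation can be assembled programmatically. The following is a minimal Python sketch, not part of the CLI: `build_instrument_command` and `run_instrument` are hypothetical helper names, and combining `--mode` with the `--dry-run` flag (described later on this page) in one invocation is an assumption.

```python
import subprocess

# The three modes listed in the table above.
VALID_MODES = ("gateway", "tracing", "both")


def build_instrument_command(mode, dry_run=False):
    """Return the argv list for `inf instrument` with the mode prompt skipped."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    command = ["inf", "instrument", "--mode", mode]
    if dry_run:
        # Assumption: --dry-run can be combined with --mode.
        command.append("--dry-run")
    return command


def run_instrument(mode, project_dir, dry_run=False):
    """Run the CLI inside project_dir; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(
        build_instrument_command(mode, dry_run),
        cwd=project_dir,
        check=True,
    )
```

Validating the mode before invoking the CLI lets a misconfigured pipeline fail early with a clear error instead of falling back to the interactive prompt.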

Install the CLI

Install the Inference CLI globally, then log in.

npm install -g @inference/cli

inf auth login

Running in CI or another headless environment? Use `inf auth set-key` instead of browser login.

Run instrumentation in your project

Navigate to your project root and run instrumentation.

cd /path/to/your/project && inf instrument

The command guides you through the following workflow:

Pick what to instrument: gateway, tracing, or both.
Select a coding agent to use: Claude Code, OpenCode, or Codex.
Scan your codebase for LLM clients such as OpenAI, Anthropic, and LangChain.
For gateway: redirect base URLs to the proxy, add routing and task ID headers.
For tracing: install the Catalyst tracing SDK, initialize it at the app entrypoint, and add spans around agents, tools, and provider calls.
Review the generated changes before applying them.

Run `inf instrument --dry-run` to preview changes without modifying any files.

Run your app

Run your application as you normally would to generate inference requests. Requests routed through the gateway will now appear in the dashboard.

Verify it worked

Open the dashboard to see request details, traces, and analytics. You can also verify from the CLI:

inf inference list

Add `INFERENCE_API_KEY` to your `.env` file so the instrumentation works across environments. Find your key in the dashboard under API Keys.
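A missing key is easier to diagnose at startup than on the first failed request. The sketch below shows one way an application might check for it; `load_inference_api_key` is a hypothetical helper, and it assumes the values in your `.env` file have already been loaded into the process environment by your framework or an env loader.

```python
import os


def load_inference_api_key(env=None):
    """Return INFERENCE_API_KEY from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get("INFERENCE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "INFERENCE_API_KEY is not set; add it to your .env file "
            "or export it in this environment."
        )
    return key
```

Calling this once at the application entrypoint surfaces a misconfigured environment before any provider call is made.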

Supported AI coding agents

Agent	Binary
Claude Code	`claude`
OpenCode	`opencode`
Codex	`codex`
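Before running instrumentation, it can help to confirm that at least one of these binaries is installed. This is a small sketch that assumes the agents are discovered on your `PATH` (the source does not say how the CLI locates them); `find_available_agents` is a hypothetical helper name.

```python
import shutil

# Agent names and binaries from the table above.
AGENT_BINARIES = {
    "Claude Code": "claude",
    "OpenCode": "opencode",
    "Codex": "codex",
}


def find_available_agents(which=shutil.which):
    """Map each agent whose binary is found on PATH to the resolved path."""
    found = {}
    for agent, binary in AGENT_BINARIES.items():
        path = which(binary)
        if path is not None:
            found[agent] = path
    return found
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use, call `find_available_agents()` with no arguments.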

Supported providers

Built-in: OpenAI, Anthropic

OpenAI-compatible via `x-inference-provider-url`: Amazon Bedrock, Google Gemini, Vertex AI, Together AI, Groq, Fireworks AI, Mistral AI, Cerebras, Perplexity, DeepSeek, OpenRouter, Azure OpenAI, and any OpenAI-compatible endpoint.

Native provider APIs: Vertex AI native Gemini and Anthropic-on-Vertex are supported through the manual gateway headers documented in the Vertex AI guide.
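For OpenAI-compatible providers, routing depends on the `x-inference-provider-url` header. The sketch below shows how a client's existing headers might be extended with it. The header name comes from this page; treating its value as the provider's base URL is an assumption, `with_provider_routing` is a hypothetical helper, and the URL in the usage comment is a placeholder.

```python
from urllib.parse import urlparse

PROVIDER_URL_HEADER = "x-inference-provider-url"


def with_provider_routing(headers, provider_url):
    """Return a copy of headers with the provider routing header added."""
    parsed = urlparse(provider_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"provider_url must be an absolute http(s) URL, got {provider_url!r}")
    routed = dict(headers)  # copy so the caller's dict is left unchanged
    # Assumption: the header carries the provider's base URL.
    routed[PROVIDER_URL_HEADER] = provider_url
    return routed


# Usage with a placeholder URL:
# headers = with_provider_routing({"Authorization": "Bearer <key>"}, "https://api.example.com/v1")
```

Running `inf instrument` in gateway mode is described above as adding routing headers for you, so a helper like this is mainly relevant when wiring the gateway up by hand.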
