The Inference CLI is the fastest way to connect your app to Catalyst. It scans your codebase, finds your LLM clients, and either routes them through the gateway, adds trace collection, or both, guided by an AI coding agent (Claude Code, OpenCode, or Codex).
Want your AI coding assistant to query Catalyst resources directly? Configure the MCP server.
Install with AI works with OpenAI, Anthropic, Gemini, Vertex AI, Groq, Cerebras, OpenRouter, LangChain and more.
What it sets up
When you run `inf instrument`, you'll be asked which Catalyst product to set up:
| Mode | What it does |
|---|---|
| `gateway` | Routes every LLM call through the Catalyst proxy so cost, latency, and usage telemetry land in Observability. |
| `tracing` | Adds the Catalyst tracing SDK and instruments your agents, tools, and provider calls with span trees. |
| `both` | Sets up gateway routing first, then trace collection, in a single agent session. |
Pass `--mode gateway`, `--mode tracing`, or `--mode both` to skip the prompt (useful in CI or scripted runs).
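For scripted runs, the non-interactive invocation can be wrapped in a small helper. The following is a minimal sketch, assuming the `inf` binary is on `PATH`; the helper names `build_instrument_command` and `run_instrument` are hypothetical and not part of the CLI. Only the mode prompt is known to be skipped by `--mode`, so other prompts (such as agent selection) may still appear.

```python
import shutil
import subprocess

# The three modes listed in the table above.
VALID_MODES = ("gateway", "tracing", "both")


def build_instrument_command(mode: str) -> list[str]:
    """Return the argv for `inf instrument` with the mode prompt skipped."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return ["inf", "instrument", "--mode", mode]


def run_instrument(mode: str, project_root: str = ".") -> int:
    """Run the command from project_root and return its exit code.

    Hypothetical wrapper: it only shells out to the CLI and does not
    answer any further prompts the CLI may show.
    """
    if shutil.which("inf") is None:
        raise FileNotFoundError("the `inf` binary was not found on PATH")
    completed = subprocess.run(
        build_instrument_command(mode), cwd=project_root, check=False
    )
    return completed.returncode


print(" ".join(build_instrument_command("both")))  # inf instrument --mode both
```

Validating the mode before spawning the process keeps a typo in a CI configuration from falling back to the interactive prompt.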
Run instrumentation in your project
Navigate to your project root and run `inf instrument`. The command guides you through the following workflow:
- Pick what to instrument: gateway, tracing, or both.
- Select a coding agent to use: Claude Code, OpenCode, or Codex.
- Scan your codebase for LLM clients such as OpenAI, Anthropic, and LangChain.
- For gateway: redirect base URLs to the proxy, add routing and task ID headers.
- For tracing: install the Catalyst tracing SDK, initialize it at the app entrypoint, and add spans around agents, tools, and provider calls.
- Review the generated changes before applying them.
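To make the tracing step above more concrete, here is an illustration of what a span tree around an agent, a tool, and a provider call looks like. This is not the Catalyst tracing SDK: the `Tracer` and `Span` classes below are a hypothetical, standard-library-only stand-in, and the span names are invented for the example.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    kind: str  # "agent", "tool", or "provider" in this sketch
    children: list["Span"] = field(default_factory=list)
    duration_s: float = 0.0


class Tracer:
    """Hypothetical in-memory tracer; nesting `with` blocks builds the tree."""

    def __init__(self) -> None:
        self.roots: list[Span] = []
        self._stack: list[Span] = []

    @contextmanager
    def span(self, name: str, kind: str):
        node = Span(name, kind)
        # Attach to the currently open span, or record as a new root.
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.roots.append(node)
        self._stack.append(node)
        start = time.perf_counter()
        try:
            yield node
        finally:
            node.duration_s = time.perf_counter() - start
            self._stack.pop()


def render(span: Span, depth: int = 0) -> list[str]:
    """Flatten a span tree into indented `kind:name` lines."""
    lines = [f"{'  ' * depth}{span.kind}:{span.name}"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


tracer = Tracer()
with tracer.span("support-agent", "agent"):
    with tracer.span("search_docs", "tool"):
        pass  # tool work would happen here
    with tracer.span("chat.completions", "provider"):
        pass  # the LLM provider call would happen here

print("\n".join(render(tracer.roots[0])))
```

The tool and provider spans end up as children of the agent span, which is the shape the instrumentation step aims to produce for each agent run.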
Run your app
Run your application as you normally would to produce inference requests. Requests from your application are now routed through the gateway and will appear in the dashboard.
Verify it worked
Open the dashboard to see request details, traces, and analytics. You can also verify from the CLI:
Supported AI coding agents
| Agent | Binary |
|---|---|
| Claude Code | claude |
| OpenCode | opencode |
| Codex | codex |
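Before running instrumentation, it can help to confirm that at least one of these binaries is installed. The check below is a minimal sketch that looks each binary up on `PATH`; whether the CLI itself discovers agents this way is an assumption, and `find_installed_agents` is a hypothetical helper.

```python
import shutil

# Agent names and binaries from the table above.
AGENT_BINARIES = {
    "Claude Code": "claude",
    "OpenCode": "opencode",
    "Codex": "codex",
}


def find_installed_agents(which=shutil.which) -> dict[str, str]:
    """Map each agent whose binary is on PATH to the resolved path.

    `which` is injectable so the lookup can be tested without
    installing any of the agents.
    """
    found = {}
    for agent, binary in AGENT_BINARIES.items():
        path = which(binary)
        if path is not None:
            found[agent] = path
    return found


installed = find_installed_agents()
if installed:
    for agent, path in installed.items():
        print(f"{agent}: {path}")
else:
    print("No supported coding agent found on PATH")
```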
Supported providers
Built-in: OpenAI, Anthropic
OpenAI-compatible via `x-inference-provider-url`: Google Gemini, Vertex AI, Together AI, Groq, Fireworks AI, Mistral AI, Cerebras, Perplexity, DeepSeek, OpenRouter, Azure OpenAI, and any OpenAI-compatible endpoint.
Native provider APIs: Vertex AI native Gemini and Anthropic-on-Vertex are supported through the manual gateway headers documented in the Vertex AI guide.
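The split between built-in and OpenAI-compatible providers can be sketched as a header-building helper. Only the `x-inference-provider-url` header name comes from this page. The `gateway_headers` function and the task ID header name `x-example-task-id` are hypothetical placeholders, and native Vertex AI routing is out of scope here because it uses the manual headers from the Vertex AI guide.

```python
from typing import Optional

# Providers the page lists as built-in; they need no provider URL header.
BUILT_IN_PROVIDERS = {"openai", "anthropic"}

PROVIDER_URL_HEADER = "x-inference-provider-url"  # named on this page
TASK_ID_HEADER = "x-example-task-id"  # hypothetical name, for illustration only


def gateway_headers(
    provider: str,
    provider_url: Optional[str] = None,
    task_id: Optional[str] = None,
) -> dict[str, str]:
    """Build the extra request headers for one provider (assumed behavior)."""
    headers: dict[str, str] = {}
    if provider.lower() not in BUILT_IN_PROVIDERS:
        # OpenAI-compatible providers are addressed by their upstream base URL.
        if not provider_url:
            raise ValueError(f"{provider!r} needs a provider_url")
        headers[PROVIDER_URL_HEADER] = provider_url
    if task_id:
        headers[TASK_ID_HEADER] = task_id
    return headers


print(gateway_headers("openai"))  # {}
print(
    gateway_headers(
        "groq",
        provider_url="https://api.groq.com/openai/v1",
        task_id="nightly-eval",
    )
)
```

In a real client these headers would be passed as default headers alongside the redirected base URL. The exact values that instrumentation writes should be taken from the generated changes you review, not from this sketch.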