The Inference CLI is the fastest way to connect your app to Catalyst. It scans your codebase, finds your LLM clients, and updates them to route through the gateway with minimal manual work.
Documentation Index
Fetch the complete documentation index at: https://docs.inference.net/llms.txt
Use this file to discover all available pages before exploring further.
Install
Works with OpenAI, Anthropic, Gemini, Vertex AI, Groq, Cerebras, OpenRouter, LangChain, and more.
Run instrumentation in your project
Navigate to your project root and run instrumentation. The command guides you through the following workflow:
- Select a coding agent to use: Claude Code, OpenCode, or Codex.
- Scan your codebase for LLM clients such as OpenAI, Anthropic, and LangChain
- Redirect base URLs to the gateway
- Add routing headers so requests are authenticated, forwarded, and traced
- Add task IDs so each call site is grouped automatically in the dashboard
- Review the generated changes before applying them
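The rewrite the CLI applies to each call site can be pictured as a small config transformation. The sketch below is illustrative only: the gateway URL, the x-task-id header name, and the function itself are assumptions, not the CLI's actual output; only the x-inference-provider-url header appears in these docs.

```python
# Illustrative sketch of the client rewrite described above.
# The gateway URL and the x-task-id header name are placeholders
# (assumptions), not guaranteed to match the real gateway.

def route_through_gateway(client_config: dict, task_id: str) -> dict:
    """Redirect a provider client's base URL to the gateway and attach
    routing headers so requests are authenticated, forwarded, and traced."""
    original_base_url = client_config.get("base_url", "https://api.openai.com/v1")
    routed = dict(client_config)
    routed["base_url"] = "https://gateway.example.com/v1"  # placeholder gateway URL
    headers = dict(routed.get("default_headers", {}))
    # Forward the original endpoint so the gateway knows where to send the call.
    headers["x-inference-provider-url"] = original_base_url
    # Group this call site in the dashboard (header name is hypothetical).
    headers["x-task-id"] = task_id
    routed["default_headers"] = headers
    return routed

before = {"base_url": "https://api.openai.com/v1", "api_key": "sk-..."}
after = route_through_gateway(before, task_id="summarize-articles")
```

The original base URL is preserved in a header rather than discarded, which is what lets the gateway forward the request to the provider you were already using.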
Run your app
Run your application as you normally would to generate inference requests. Requests from your application are now routed through the gateway and will appear in the dashboard.
Verify it worked
Open the dashboard to see request details, traces, and analytics. You can also verify from the CLI:
Your app keeps using the provider SDKs it already has; the command only updates how your existing clients are configured.
Supported AI coding agents
| Agent | Binary |
|---|---|
| Claude Code | claude |
| OpenCode | opencode |
| Codex | codex |
Supported providers
Built-in: OpenAI, Anthropic. OpenAI-compatible via the x-inference-provider-url header: Google Gemini, Vertex AI, Together AI, Groq, Fireworks AI, Mistral AI, Cerebras, Perplexity, DeepSeek, OpenRouter, Azure OpenAI, and any OpenAI-compatible endpoint.
Native provider APIs: Vertex AI native Gemini and Anthropic-on-Vertex are supported through the manual gateway headers documented in the Vertex AI guide.
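For the OpenAI-compatible path, a client points its base URL at the gateway and names the upstream endpoint in the x-inference-provider-url header. In this sketch, only that header name comes from the docs; the URLs and the Authorization credential are placeholders.

```python
# Hypothetical header set for routing an OpenAI-compatible provider
# (Groq here, as an example) through the gateway. Only the
# x-inference-provider-url header name is from the docs; the URLs
# and credential are placeholders.
upstream = "https://api.groq.com/openai/v1"  # any OpenAI-compatible endpoint

headers = {
    "Authorization": "Bearer <gateway-api-key>",  # placeholder credential
    "x-inference-provider-url": upstream,         # tells the gateway where to forward
}
```

The same header shape would apply to any of the OpenAI-compatible providers listed above; only the upstream URL changes.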