Catalyst has three ways to connect with your stack. Tracing integrations use a lightweight SDK to collect the full shape of an LLM operation directly from your code, including agent runs, tool calls, framework steps, and spans you add yourself. Gateway integrations proxy your existing provider calls through Catalyst with a one-line base URL change, no SDK swap required. MCP connects compatible AI coding assistants to Catalyst resources with your project API key. Browse the documented integrations below. If you do not see a gateway provider yet, Catalyst can still route many OpenAI-compatible endpoints through x-inference-provider-url.
Jump to: Tracing integrations · Gateway integrations · Routing headers · OpenAI-compatible providers
Traces Integrations
Traces integrations use the @inference/tracing (TypeScript) or inference-catalyst-tracing (Python) SDK to collect OpenInference-shaped spans directly from LLM SDKs, agent frameworks, and your own orchestration code. A single setup() call instruments the providers or frameworks you enable. Spans are exported over OTLP and grouped in Catalyst by service, trace, and task.
Use Traces when you need:
- Full agent run trees, not just individual requests
- Tool calls, tool results, and multi-step framework spans
- Visibility into work that never touches the Catalyst gateway (local models, custom routing, non-HTTP orchestration)
Traces overview
Learn what gets captured and how to get started with Catalyst Tracing.
Browse Tracing Integrations
Gateway Integrations
Gateway integrations route requests through the Catalyst gateway with a one-line base URL change. You keep your existing provider API keys. Your Catalyst project API key authenticates requests to the gateway, and a small set of headers controls routing, environments, and task grouping. Because requests flow through the gateway, Catalyst can measure performance metrics that are invisible to application code: time to first token (TTFT), tokens per second, and end-to-end latency across providers. These are captured automatically without any changes to your request logic.
Gateway overview
Routing headers, supported providers, and the full Gateway setup reference.
Browse Gateway Integrations
Routing Headers
| Header | Required | Description |
|---|---|---|
| Authorization | Yes | Bearer <your-project-api-key> authenticates the request to the gateway and links it to your project. For OpenAI-compatible SDKs, set this as the SDK’s apiKey. |
| x-inference-provider-api-key | Yes | Your provider API key or token, such as OpenAI, Groq, Gemini, or a Google Cloud Vertex credential. The gateway forwards it downstream. For Anthropic’s native SDK, use x-api-key instead. |
| x-inference-provider | No | Forces routing to a specific provider, such as openai, anthropic, gemini, vertex-ai, or cerebras. Usually inferred from the SDK, path, or x-inference-provider-url; set it only to override that inference. |
| x-inference-environment | No | Tags requests with an environment, such as production or staging. |
| x-inference-task-id | No | Groups requests under a logical task for filtering and analytics. |
| x-inference-provider-url | No | Routes to any OpenAI-compatible provider by specifying its base URL. For Vertex native APIs, set this to the global or regional aiplatform.googleapis.com base URL. |
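As a rough illustration of how the headers in the table combine, the sketch below assembles them into a plain dictionary that could be passed to an HTTP client or to an SDK's default-headers option. The helper name build_gateway_headers, the placeholder keys, and the example task ID are hypothetical and not part of any Catalyst SDK; only the header names come from the table.

```python
from typing import Optional


def build_gateway_headers(
    project_api_key: str,
    provider_api_key: str,
    *,
    provider: Optional[str] = None,
    environment: Optional[str] = None,
    task_id: Optional[str] = None,
    provider_url: Optional[str] = None,
) -> dict:
    """Hypothetical helper: assemble the routing headers listed in the table."""
    # The two required headers: the project key authenticates to the gateway,
    # and the provider key is forwarded downstream.
    headers = {
        "Authorization": f"Bearer {project_api_key}",
        "x-inference-provider-api-key": provider_api_key,
    }
    # Optional headers are included only when a value is supplied.
    optional = {
        "x-inference-provider": provider,
        "x-inference-environment": environment,
        "x-inference-task-id": task_id,
        "x-inference-provider-url": provider_url,
    }
    headers.update({name: value for name, value in optional.items() if value is not None})
    return headers


# Placeholder credentials and an illustrative task ID (assumptions, not real values).
headers = build_gateway_headers(
    "CATALYST_PROJECT_KEY",
    "PROVIDER_KEY",
    environment="staging",
    task_id="summarize-tickets",
)
```

Omitting x-inference-provider here reflects the table's note that the provider is usually inferred. For Anthropic's native SDK, the provider key would go in x-api-key rather than x-inference-provider-api-key, so this sketch would need adjusting for that case.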
Supported OpenAI-compatible Provider URLs
Any OpenAI-compatible provider can be used via the x-inference-provider-url header, even when it does not have a dedicated guide in the catalog yet.
| Provider | Base URL |
|---|---|
| OpenAI | https://api.openai.com/v1 |
| OpenRouter | https://openrouter.ai/api |
| Anthropic | https://api.anthropic.com/v1 |
| Google Gemini | https://generativelanguage.googleapis.com for native Gemini paths; https://generativelanguage.googleapis.com/v1beta/openai for OpenAI-compatible calls |
| Vertex AI | https://aiplatform.googleapis.com/v1/projects/{project}/locations/global/endpoints/openapi |
| Azure OpenAI | https://{resource}.openai.azure.com/openai/deployments/{deployment} |
| Groq | https://api.groq.com/openai/v1 |
| Together AI | https://api.together.xyz/v1 |
| Fireworks AI | https://api.fireworks.ai/inference/v1 |
| Perplexity | https://api.perplexity.ai |
| Mistral | https://api.mistral.ai/v1 |
| DeepSeek | https://api.deepseek.com/v1 |
| Cerebras | https://api.cerebras.ai/v1 |
| Inference.net | https://api.inference.net/v1 |
For a regional Vertex AI endpoint, use https://{location}-aiplatform.googleapis.com/v1/projects/{project}/locations/{location}/endpoints/openapi instead of the global host.
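The global and regional Vertex AI URL templates differ only in the host prefix and the location path segment. A minimal sketch of filling them in, assuming a hypothetical helper named vertex_openai_base_url and illustrative project and location values:

```python
def vertex_openai_base_url(project: str, location: str = "global") -> str:
    """Hypothetical helper: fill in the Vertex AI OpenAI-compatible URL template."""
    # The global endpoint uses the bare host; regional endpoints prefix the location.
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return f"https://{host}/v1/projects/{project}/locations/{location}/endpoints/openapi"


# "my-project" and "us-central1" are illustrative values, not defaults.
global_url = vertex_openai_base_url("my-project")
regional_url = vertex_openai_base_url("my-project", "us-central1")
```

Either result could then be supplied as the x-inference-provider-url value.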