
The Inference MCP server lets compatible AI coding assistants query and operate Catalyst resources from your project. Use it to inspect projects, models, datasets, rubrics, evals, training jobs, deployments, inferences, traces, and spans without leaving your MCP client.

MCP authentication uses project API keys only. Add your key as a Bearer token, and the MCP server forwards it to Catalyst APIs, where permissions are enforced. Use a read-only key for browsing resources; use a key with write access when you want MCP tools to create datasets, run evals, launch training jobs, or change deployments.

Configure your client

1. Create or choose a project API key. Open API Keys in the dashboard and choose a key scoped to the project you want your MCP client to access.
2. Add the MCP server to your tool. Use https://mcp.inference.net/mcp as the server URL and pass your key in the Authorization header. Examples for common clients are below.
3. Reload and test. Restart or reload your MCP client, then call list_projects to confirm that the key is being forwarded correctly.
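If your client exposes raw MCP tool calls, the list_projects check corresponds to a JSON-RPC tools/call request shaped like the one below. This is shown for illustration only; your MCP client constructs and sends it for you:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "list_projects",
    "arguments": {}
  }
}
```

A successful result confirms that your Bearer token reached the Catalyst APIs.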

Claude Code

Set INFERENCE_API_KEY in your shell, then add the hosted MCP server:
export INFERENCE_API_KEY="<your-project-api-key>"

claude mcp add --transport http inference https://mcp.inference.net/mcp \
  --header "Authorization: Bearer $INFERENCE_API_KEY"

Cursor

Add the server to .cursor/mcp.json for one project, or ~/.cursor/mcp.json for all projects.
{
  "mcpServers": {
    "inference": {
      "url": "https://mcp.inference.net/mcp",
      "headers": {
        "Authorization": "Bearer ${env:INFERENCE_API_KEY}"
      }
    }
  }
}

VS Code

Create or edit .vscode/mcp.json. VS Code will prompt for the key the first time it starts the server and store it securely.
{
  "inputs": [
    {
      "type": "promptString",
      "id": "inference-api-key",
      "description": "Inference API key",
      "password": true
    }
  ],
  "servers": {
    "inference": {
      "type": "http",
      "url": "https://mcp.inference.net/mcp",
      "headers": {
        "Authorization": "Bearer ${input:inference-api-key}"
      }
    }
  }
}

Windsurf

Edit ~/.codeium/windsurf/mcp_config.json and add the server:
{
  "mcpServers": {
    "inference": {
      "serverUrl": "https://mcp.inference.net/mcp",
      "headers": {
        "Authorization": "Bearer ${env:INFERENCE_API_KEY}"
      }
    }
  }
}

Codex

Set INFERENCE_API_KEY, then add the server with the Codex CLI:
export INFERENCE_API_KEY="<your-project-api-key>"

codex mcp add inference \
  --url https://mcp.inference.net/mcp \
  --bearer-token-env-var INFERENCE_API_KEY

OpenCode

Add the server under mcp in your OpenCode config:
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "inference": {
      "type": "remote",
      "url": "https://mcp.inference.net/mcp",
      "enabled": true,
      "headers": {
        "Authorization": "Bearer <your-project-api-key>"
      }
    }
  }
}

Gemini CLI

Add the server to ~/.gemini/settings.json or .gemini/settings.json:
{
  "mcpServers": {
    "inference": {
      "httpUrl": "https://mcp.inference.net/mcp",
      "headers": {
        "Authorization": "Bearer <your-project-api-key>"
      }
    }
  }
}

Other clients

Use Streamable HTTP with this URL and header. The client must support custom headers; OAuth-only or SSE-only clients are not supported.
URL: https://mcp.inference.net/mcp
Authorization: Bearer <your-project-api-key>
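For an unlisted client, or when debugging the handshake by hand, the sketch below shows the headers a Streamable HTTP client sends with its first initialize message. It uses only the Python standard library; the key, client name, and protocol revision string are placeholder example values, and the snippet builds the request without sending it:

```python
import json
import urllib.request

API_KEY = "sk-inference-example"  # placeholder; substitute your real project key

# JSON-RPC initialize message; the protocol revision shown is one example value.
body = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",
        "capabilities": {},
        "clientInfo": {"name": "smoke-test", "version": "0.0.1"},
    },
}).encode()

req = urllib.request.Request(
    "https://mcp.inference.net/mcp",
    data=body,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
        # Streamable HTTP servers may answer with plain JSON or an SSE stream.
        "Accept": "application/json, text/event-stream",
    },
    method="POST",
)

print(req.get_header("Authorization"))  # Bearer sk-inference-example
```

Sending the request with urllib.request.urlopen(req) (with a real key) should return an initialize result rather than a 401.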

Troubleshooting

401: The Authorization header is missing or the API key is invalid. Fix: check that the header is Authorization: Bearer <your-project-api-key> and that the key starts with sk-inference-.
403: The key is valid but does not have the permission needed for the tool. Fix: use a key with the required read or write scope for that action.
Keep project API keys out of source control. Prefer your MCP client’s secret storage or environment-variable support when available.
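A quick local check can rule out the most common 401 cause before you edit any client config. The helper below is a hypothetical snippet, not part of any Inference SDK; it only applies the sk-inference- prefix rule noted above:

```python
import os

KEY_PREFIX = "sk-inference-"  # documented prefix for project API keys

def key_problems(key: str) -> list[str]:
    """Return a list of likely problems with a project API key string."""
    problems = []
    if not key:
        problems.append("key is empty (is INFERENCE_API_KEY exported?)")
        return problems
    if key != key.strip():
        problems.append("key has leading or trailing whitespace")
    if not key.strip().startswith(KEY_PREFIX):
        problems.append(f"key does not start with '{KEY_PREFIX}'")
    return problems

# Check the key your MCP client would receive from the environment.
print(key_problems(os.environ.get("INFERENCE_API_KEY", "")))
```

An empty list means the key at least has the expected shape; a 401 after that usually points to a revoked or mistyped key.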