Route your Cerebras requests through the Inference Catalyst gateway to get cost tracking, latency monitoring, and analytics. Cerebras has a dedicated provider routing ID, so you can keep using the OpenAI SDK and set the x-inference-provider header to cerebras.
Prefer automatic setup? Run inf instrument to instrument your codebase in seconds.

Setup

1. Get your API keys

You need two keys: an Inference project API key (INFERENCE_API_KEY) and a Cerebras API key (CEREBRAS_API_KEY).
2. Set environment variables

export INFERENCE_API_KEY=<your-project-api-key>
export CEREBRAS_API_KEY=<your-cerebras-api-key>
3. Update your code

Point the SDK at the gateway: your project API key goes in apiKey, and your Cerebras key goes in the x-inference-provider-api-key header.
import OpenAI from "openai";

const client = new OpenAI({
  // Send requests through the Inference Catalyst gateway instead of Cerebras directly
  baseURL: "https://api.inference.net/v1",
  apiKey: process.env.INFERENCE_API_KEY,
  defaultHeaders: {
    // Forwarded to Cerebras for the upstream call
    "x-inference-provider-api-key": process.env.CEREBRAS_API_KEY,
    // Route this client's traffic to Cerebras
    "x-inference-provider": "cerebras",
    // Tags requests with your deployment environment for analytics
    "x-inference-environment": process.env.NODE_ENV,
  },
});

const response = await client.chat.completions.create({
  model: "llama3.1-8b",
  messages: [{ role: "user", content: "Hello" }],
}, {
  // Per-request header for task-level tracking in the dashboard
  headers: { "x-inference-task-id": "default" },
});
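If you configure several clients this way, the header boilerplate can be factored into a small helper. A minimal sketch, assuming nothing beyond the headers shown above (the helper name and its shape are my own, not part of the SDK or the gateway API):

```typescript
// Hypothetical helper -- not part of the OpenAI SDK or the gateway.
// Builds the defaultHeaders object a Cerebras-routed client expects.
function cerebrasGatewayHeaders(
  cerebrasKey: string,
  environment?: string
): Record<string, string> {
  return {
    "x-inference-provider-api-key": cerebrasKey,
    "x-inference-provider": "cerebras",
    // Only tag the environment when one is known
    ...(environment ? { "x-inference-environment": environment } : {}),
  };
}

const headers = cerebrasGatewayHeaders("csk-demo", "production");
console.log(headers["x-inference-provider"]); // "cerebras"
```

You would then pass the result as defaultHeaders when constructing the client, keeping key handling in one place.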