Route your Cerebras requests through the Inference Catalyst gateway to get cost tracking, latency monitoring, and analytics. Cerebras has a dedicated provider routing ID, so you can keep using the OpenAI SDK and select Cerebras by sending the header x-inference-provider: cerebras.
Prefer automatic setup? Run inf instrument to instrument your codebase in seconds.

Setup

1. Get your API keys

You need two keys: an Inference project API key and a Cerebras API key.

2. Set environment variables

export INFERENCE_API_KEY=<your-project-api-key>
export CEREBRAS_API_KEY=<your-cerebras-api-key>
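
A missing variable tends to surface later as an opaque authentication error, so it can help to check both keys at startup. The following is a minimal sketch under that assumption; `requireEnv` and `loadGatewayKeys` are hypothetical helper names, not part of the OpenAI SDK or the gateway.

```typescript
// Hypothetical helpers (not part of any SDK): read required variables from an
// environment-like object and fail early with a clear message.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Collect both keys that the gateway setup needs.
function loadGatewayKeys(env: Env): { inferenceKey: string; cerebrasKey: string } {
  return {
    inferenceKey: requireEnv(env, "INFERENCE_API_KEY"),
    cerebrasKey: requireEnv(env, "CEREBRAS_API_KEY"),
  };
}
```

In a Node.js process this would typically be called as `loadGatewayKeys(process.env)` before constructing the client.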

3. Update your code

Point the SDK at the gateway. Your project API key goes in apiKey, and your Cerebras key goes in x-inference-provider-api-key.
import OpenAI from "openai";

const client = new OpenAI({
  // Send requests to the Inference gateway.
  baseURL: "https://api.inference.net/v1",
  // Project API key.
  apiKey: process.env.INFERENCE_API_KEY,
  defaultHeaders: {
    // Cerebras API key.
    "x-inference-provider-api-key": process.env.CEREBRAS_API_KEY,
    // Provider routing ID for Cerebras.
    "x-inference-provider": "cerebras",
    "x-inference-environment": process.env.NODE_ENV,
  },
});

const response = await client.chat.completions.create({
  model: "llama3.1-8b",
  messages: [{ role: "user", content: "Hello" }],
}, {
  headers: { "x-inference-task-id": "default" },
});
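
If several clients or services share this configuration, the headers can be assembled in one place. The sketch below is illustrative only: the header names are taken from the snippet above, while `buildGatewayHeaders`, `taskHeaders`, and the option names are hypothetical. It also assumes the environment header is optional and can be omitted when no value is set.

```typescript
// Hypothetical helper: assemble the client-level headers used in the setup
// snippet. Header names come from the guide; all other names are illustrative.
interface GatewayHeaderOptions {
  providerApiKey: string;
  provider: string;
  environment?: string;
}

function buildGatewayHeaders(opts: GatewayHeaderOptions): Record<string, string> {
  const headers: Record<string, string> = {
    "x-inference-provider-api-key": opts.providerApiKey,
    "x-inference-provider": opts.provider,
  };
  // Assumption: only send the environment header when a value is available.
  if (opts.environment) {
    headers["x-inference-environment"] = opts.environment;
  }
  return headers;
}

// Per-request headers, such as the task ID passed in the request options.
function taskHeaders(taskId: string): Record<string, string> {
  return { "x-inference-task-id": taskId };
}
```

The result of `buildGatewayHeaders(...)` could be passed as `defaultHeaders` when constructing the client, and `taskHeaders("default")` as `headers` in the second argument to `client.chat.completions.create`.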