Deployments expose an OpenAI-compatible chat completions endpoint. Point any OpenAI SDK client at the Inference base URL and set the model to your deployment’s model path.
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.inference.net/v1",
  apiKey: process.env.INFERENCE_API_KEY,
});

const response = await client.chat.completions.create({
  model: "acme-corp/my-model",
  messages: [{ role: "user", content: "Hello, world!" }],
});
```
Structured outputs, function calling, and other chat completions features work the same way. You can also tag requests with the `x-inference-task-id` header to group calls by objective; see Tasks for details.

Where to find your model path

The model path is shown on your deployment’s detail page in the dashboard. It’s your team slug followed by the name you chose when creating the deployment (e.g. acme-corp/my-model).