Quickstart

Get an API key

Visit inference.net and create an account.
On the dashboard, open the API Keys tab in the left sidebar. Create a new API key or use the default key.
Copy the API key to your clipboard by clicking the copy icon to the right of the key.
In your terminal, set the INFERENCE_API_KEY environment variable to the API key you copied.

export INFERENCE_API_KEY=<your-api-key>
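Before sending requests, it can help to confirm that the variable is actually visible to programs started from your shell. The following is a minimal sketch, assuming only the INFERENCE_API_KEY name from the step above; the helper name load_api_key is hypothetical and not part of any Inference.net tooling.

```python
import os


def load_api_key(env=None):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("INFERENCE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "INFERENCE_API_KEY is not set; run "
            "`export INFERENCE_API_KEY=<your-api-key>` first."
        )
    return key


try:
    key = load_api_key()
    # Report only the length so the secret itself is never echoed.
    print(f"API key loaded ({len(key)} characters).")
except RuntimeError as err:
    print(err)
```

If the error message appears, re-run the export command in the same terminal session you use to run your code.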

Test Request

Send a simple test request to the Inference.net API with curl. The -N flag disables curl's output buffering so that streamed output appears as it arrives.

curl -N https://api.inference.net/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $INFERENCE_API_KEY" \
  -d '{
    "model": "meta-llama/llama-3.1-8b-instruct/fp-8",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "What is the meaning of life?"
      }
    ],
    "stream": true
  }'

The request output should stream into your terminal.
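If you want to turn that raw stream into plain text, the sketch below shows one way to do it. It assumes the endpoint emits OpenAI-style server-sent events (lines beginning with "data: ", each carrying a JSON chunk, followed by a final "data: [DONE]"); this page does not spell out the wire format, so treat that as an assumption. The function name iter_stream_text and the sample lines are hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hypothetical sample lines shaped like the assumed stream format.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

The same generator could be fed the lines of a live response, for example by piping the curl output into a script that reads from standard input.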

OpenAI SDK

Inference.net is compatible with the OpenAI Chat API. You can use the official OpenAI SDK to interact with the Inference.net API. We support both streaming and non-streaming requests, as well as the following parameters:

max_tokens
temperature
top_p
frequency_penalty
presence_penalty

If you need parameters that are not listed here, please contact us and we’ll add them. Note: Make sure you export the INFERENCE_API_KEY environment variable before running the code below.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inference.net/v1",
    api_key=os.environ.get("INFERENCE_API_KEY"),
)

response = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct/fp-8",
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    stream=True,
)

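# Print each content fragment as it arrives; skip chunks that carry no choices or no text.
for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end='', flush=True)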
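
The streaming example above can be adapted to a non-streaming request that uses the supported parameters. This is a sketch rather than an official recipe: the helper build_request and the SUPPORTED_PARAMS set are hypothetical names, and the sampling values are illustrative only. The SDK is imported inside main() so the helper can be used without it installed.

```python
import os

# Parameters this page lists as supported.
SUPPORTED_PARAMS = {
    "max_tokens",
    "temperature",
    "top_p",
    "frequency_penalty",
    "presence_penalty",
}


def build_request(prompt, **params):
    """Build keyword arguments for client.chat.completions.create."""
    unknown = set(params) - SUPPORTED_PARAMS
    if unknown:
        raise ValueError(f"Unsupported parameters: {sorted(unknown)}")
    return {
        "model": "meta-llama/llama-3.1-8b-instruct/fp-8",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        **params,
    }


def main():
    from openai import OpenAI  # official OpenAI SDK, as used above

    client = OpenAI(
        base_url="https://api.inference.net/v1",
        api_key=os.environ.get("INFERENCE_API_KEY"),
    )
    # Illustrative values, not recommendations.
    request = build_request(
        "What is the meaning of life?", max_tokens=256, temperature=0.7
    )
    response = client.chat.completions.create(**request)
    # A non-streaming response carries the whole message in one object.
    print(response.choices[0].message.content)
```

Call main() after exporting INFERENCE_API_KEY. Rejecting unknown parameter names locally is a design choice for this sketch; it surfaces typos before a request is sent.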

Next Steps

Batch Processing

Process multiple asynchronous requests in a single API call and retrieve the results.

View Models

Explore the models available on Inference.net.
