Use this quickstart if you already have an LLM app and want the fastest path to your first Inference.net aha moment: a real request appearing in the dashboard with provider, model, latency, tokens, and cost.

What you’ll have when you finish

  • one real request routed through Inference.net
  • provider, model, cost, and latency visible in the dashboard
  • enough metadata to start building datasets, evals, and training workflows

Before you start

  • an existing app that already calls OpenAI, Anthropic, or another OpenAI-compatible provider
  • your upstream provider API key
  • an Inference.net project and observability project key

Step 1: choose the setup path

CLI path

Use inf install if you want the fastest self-serve setup.

Manual path

Keep full control over the SDK config and proxy headers.

Step 2: route one request through Inference.net

CLI path

  1. Install the CLI by following /cli/install.
  2. Authenticate with inf auth login.
  3. Run inf install inside your application.
  4. Send one normal request from your app.

Manual path

For manual integration, the only essential changes are:
  • point your client at https://api.inference.net/v1
  • keep your upstream provider auth exactly as it is today
  • add x-inference-provider (for example, openai, as in the request below)
  • add x-inference-observability-api-key
export OPENAI_API_KEY=<your-openai-api-key>
export INFERENCE_OBSERVABILITY_API_KEY=<your-observability-project-key>

curl https://api.inference.net/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-inference-provider: openai" \
  -H "x-inference-observability-api-key: $INFERENCE_OBSERVABILITY_API_KEY" \
  -H "x-inference-environment: production" \
  -H "x-inference-task-id: default" \
  -d '{
    "model": "gpt-5",
    "messages": [
      {"role": "user", "content": "Say hello in five words or fewer."}
    ]
  }'
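If you would rather make the manual-path request from code, the curl call above can be mirrored with Python's standard library. This is a minimal sketch, not an official SDK snippet: the URL, model, and header values are copied from the curl example, the API keys are read from the same environment variables, and uncommenting the last two lines actually sends the request.

```python
import json
import os
import urllib.request

# Same headers as the curl example: upstream auth stays as-is,
# and the x-inference-* headers route the request through Observe.
headers = {
    "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
    "Content-Type": "application/json",
    "x-inference-provider": "openai",
    "x-inference-observability-api-key": os.environ.get(
        "INFERENCE_OBSERVABILITY_API_KEY", ""
    ),
    "x-inference-environment": "production",
    "x-inference-task-id": "default",
}

# Standard OpenAI-compatible chat completion payload.
payload = {
    "model": "gpt-5",
    "messages": [
        {"role": "user", "content": "Say hello in five words or fewer."}
    ],
}

req = urllib.request.Request(
    "https://api.inference.net/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers=headers,
    method="POST",
)

# Uncomment to send the request and print the reply:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

If you already use an OpenAI-compatible client library, the equivalent change is pointing its base URL at https://api.inference.net/v1 and attaching the same default headers.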

Step 3: verify the first observed request

Open the dashboard and confirm that the request shows:
  • upstream provider
  • model name
  • environment and task
  • duration
  • total, input, and output tokens
  • cost
If those fields are missing, the request is not being routed through Observe correctly.

Step 4: add just enough metadata to make the traffic useful

Start with these headers:
  • x-inference-environment
  • x-inference-task-id
  • x-inference-metadata-*
That is enough to make the traffic filterable for later dataset creation. See /reference/headers-and-metadata for the full reference.
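As a sketch of how those headers fit together: the environment and task values below come from the quickstart's curl example, while x-inference-metadata-user-id and x-inference-metadata-feature are hypothetical keys illustrating the x-inference-metadata-* pattern, not required names.

```python
# x-inference-environment and x-inference-task-id match the curl example;
# the two metadata keys are hypothetical examples of x-inference-metadata-*.
observability_headers = {
    "x-inference-environment": "production",
    "x-inference-task-id": "default",
    "x-inference-metadata-user-id": "user_1234",
    "x-inference-metadata-feature": "onboarding-chat",
}

# Merge these into the headers your app already sends with each request.
request_headers = {"Content-Type": "application/json", **observability_headers}
```

Pick metadata keys that match how you will want to slice the traffic later (by user, feature, tenant, and so on), since they become the filters you use when building datasets.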

What to do next

Create Datasets from Observed Traffic

Turn live traffic into the eval and training datasets that power the rest of the platform.

Meet with Us

Talk to our team if you want help migrating a larger production workflow.