Inference.net is a full LLM lifecycle platform for teams that already use LLM providers and want more confidence in production. You can route existing traffic through Observe, turn real requests into datasets, run evals, train better task-specific models, and deploy them behind stable production endpoints. The docs are optimized for one primary journey:
  1. get the first observed request
  2. turn that traffic into datasets
  3. build an eval baseline
  4. train or deploy with confidence

Start here

Observe Quickstart

Route an existing OpenAI-compatible app through Inference.net and verify your first observed request.
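A minimal sketch of what "routing through Observe" looks like in an OpenAI SDK app: only the client's base URL and key change, and the rest of the app stays the same. The gateway URL and model name below are placeholders, not the real values; copy the actual ones from the Observe Quickstart and your dashboard.

```python
# Sketch: point an existing OpenAI SDK client at the Observe gateway.
# The base_url and model are placeholders; use the values from the
# Observe Quickstart and your Inference.net dashboard.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-observe-gateway.inference.net/v1",  # placeholder gateway URL
    api_key="YOUR_INFERENCE_API_KEY",
)

# The same chat.completions call the app already makes now flows through
# Observe, so the request and response are captured as observed traffic.
response = client.chat.completions.create(
    model="your-existing-model",  # placeholder; keep whatever model the app uses today
    messages=[{"role": "user", "content": "Hello from the Observe quickstart check"}],
)
print(response.choices[0].message.content)
```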

API Quickstart

Make your first direct API call when you want to start from the hosted API instead of Observe.
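If you skip Observe and start from the hosted API, the first call is a single OpenAI-compatible chat completions request. A sketch with assumed values: the base URL and model ID below are placeholders, so confirm both in the API Quickstart and the model catalog before running it.

```python
# Sketch: first direct call to the hosted OpenAI-compatible API.
# API_BASE and the model ID are assumptions; copy the real values
# from the API Quickstart and the model catalog.
import os
import requests

API_BASE = "https://api.inference.net/v1"  # assumed base URL; confirm in the API Quickstart
API_KEY = os.environ["INFERENCE_API_KEY"]

resp = requests.post(
    f"{API_BASE}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "a-model-id-from-the-catalog",  # placeholder model ID
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```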

Search models

Browse the model catalog before you pick an API or deployment path.

Meet with Us

Talk to our team if you want help designing your eval, training, or deployment workflow.

How the platform fits together

Observe

Capture real traffic first.

Datasets

Save the slices you want to measure and improve.

Evaluate

Compare models on real product tasks.

Train & Deploy

Improve and ship the model only after you trust the evidence.

Who should start where

| If this sounds like you… | Start here | What comes next |
| --- | --- | --- |
| "We already use OpenAI or Anthropic and want visibility first." | /start-here/observe-quickstart | Then create datasets from observed traffic |
| "We want to prototype directly against the API first." | /quickstart | Then choose realtime, background, or batch |
| "We need a release gate before changing models." | /guides/build-a-real-traffic-eval-baseline | Then use the same eval to decide whether to train or deploy |