Route LLM traffic through Inference.net without giving up the SDKs and providers you already use. You keep your application logic and your upstream provider credentials, and you gain request-level visibility into cost, latency, tokens, errors, environments, tasks, and custom metadata.

What traffic capture gives you

  • Request tracing for every inference that passes through the proxy
  • Cost and latency analytics broken down by provider, task, environment, and model
  • Datasets created from filtered production traffic
  • A direct path into evals and training using the exact requests your product already sees

Two ways to get started

Automatic instrumentation

Run inf install from the CLI to scan your codebase and add instrumentation automatically.

Manual instrumentation

Point your SDK at Inference.net and add the routing and metadata headers yourself.
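As a rough sketch of the manual path, the request below targets an OpenAI-compatible chat completions endpoint through a proxy base URL and attaches metadata headers. The base URL, API key placeholder, and header names (X-Environment, X-Task) are illustrative assumptions, not the documented values; substitute the ones from your Inference.net dashboard.

```python
import json
from urllib import request

# Assumed proxy endpoint — replace with the base URL from your dashboard.
BASE_URL = "https://api.inference.net/v1"

headers = {
    "Authorization": "Bearer YOUR_INFERENCE_KEY",  # placeholder key
    "Content-Type": "application/json",
    # Metadata headers for per-request analytics. Names are hypothetical;
    # use whatever routing/metadata headers the docs specify.
    "X-Environment": "production",
    "X-Task": "support-summarizer",
}

payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Summarize this support ticket."}],
}

req = request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers=headers,
    method="POST",
)

# Uncomment to actually send the request:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```

If you already use an OpenAI-compatible SDK, the same idea usually reduces to overriding the client's base URL and default headers rather than building requests by hand.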

How it fits into the lifecycle

Capturing traffic is usually the entry point for platform workflows.
  1. Capture real requests.
  2. Save representative slices as datasets.
  3. Run evals against the datasets.
  4. Fine-tune or distill a better model.
  5. Deploy the result and keep observing production traffic.

What you will work with in the product

Once traffic is flowing, most teams spend time in these surfaces:
  • Inferences to inspect individual requests and responses
  • Tasks to group related traffic by workflow or feature
  • Datasets to save the examples you want to evaluate or train on
  • Evals to compare models and judge outputs
  • Training Jobs to turn good data and good rubrics into a better model

Best fit

Traffic capture is a strong fit when:
  • you already use OpenAI, Anthropic, or another OpenAI-compatible provider
  • you want analytics and trace inspection without rewriting your app
  • you want to build evals or training datasets from real production traffic
  • you need cleaner visibility into cost, latency, and failures by environment or task

Next steps

Datasets

Save live traffic as datasets or import historical logs.

Evals

Run repeatable quality checks against real datasets.

Talk to an engineer

Meet with our team if you want help planning your eval, training, or migration workflow.