Catalyst is a platform for understanding, evaluating, and improving the AI systems you ship. Create an account to get started with Catalyst.

The Catalyst platform

Catalyst is built around two products. Use them on their own or together.

Tracing

Capture full traces of every LLM call, tool call, and agent step. Then let Halo, our open-source agent-loop optimizer, read your traces and tell you what to fix in prompts, tools, and the harness itself.

Gateway

Route LLM traffic through Catalyst Gateway to capture every request, watch usage and cost, build datasets, run evals, and fine-tune and deploy task-specific models on dedicated GPUs.

Tracing is for agent improvement. Gateway is for inference observability and training task-specific models from your captured traffic. Most teams start with one and add the other later. They also compose: when your agent calls LLMs through Gateway, trace spans line up with the same requests Gateway is recording.

Quick Start

Tracing

Capture your first trace

Install the tracing SDK and capture LLM calls, tool calls, and agent steps in minutes.

Analyze your traces

Inspect trace trees in the dashboard and run Halo to find what to improve.

Gateway

Record your first LLM call

Route traffic through Catalyst Gateway to capture LLM calls and view metrics.

Run your first eval

Define quality, measure it, and compare models side by side.

Train and deploy a model

The full loop: data, training, and a production endpoint.

Use the Inference API

Call open-source and custom models running on Inference.net.

Catalyst Platform

Tracing

Collect OpenInference-shaped traces from LLM SDKs, agent frameworks, and custom application code.

Observe

Monitor your LLM usage with metrics and visibility into your production traffic.

Datasets

Create and manage datasets for evaluation and training.

Evaluate

Measure and compare quality across model candidates.

Train

Train a custom model on your production data to improve performance and lower latency and cost.

Deploy

Deploy a model to a dedicated GPU to use in production.