> ## Documentation Index > Fetch the complete documentation index at: https://docs.inference.net/llms.txt > Use this file to discover all available pages before exploring further. # Catalyst Workflow > How Tracing and Gateway fit together across the Catalyst platform. Catalyst is built around two products. * **Tracing** is for agent improvement. The tracing SDKs capture the full execution of your agents and AI apps: LLM calls, tool calls, framework steps, and any custom spans you wrap. Halo, our open-source agent-loop optimizer, reads those traces and writes up what to fix in prompts, tools, and the harness itself. * **Gateway** is for inference observability and training task-specific models. Gateway sits between your app and your LLM provider, recording every request. From that recorded traffic, you can build datasets, run evals, fine-tune smaller models, and deploy them on dedicated GPUs. Tracing and Gateway stand alone. Many teams start with one and add the other later. They also compose: when your agent calls LLMs through Gateway, the trace spans and the gateway records line up against the same requests. The platform also provides access to open-source and Inference.net-trained models (like [Schematron](/workhorse-models/schematron) for structured data extraction) through an OpenAI-compatible API. *** ## Tracing Capture full traces of your agents and AI apps. The Catalyst tracing SDKs collect LLM calls, tool calls, framework steps, agent runs, and any custom spans you wrap. Then Halo (our open-source [agent-loop optimizer](https://github.com/context-labs/halo)) reads your traces and surfaces concrete things to improve in your prompts, tools, and agent harness. **What you'll do:** * [Capture your first trace](/get-started/capture-first-trace) by installing the tracing SDK and pointing it at Catalyst * [Analyze your traces](/get-started/analyze-traces) in the Traces and Agents dashboards, then run Halo to find what to fix * Instrument [providers and frameworks](/integrations/traces/overview) like OpenAI, Anthropic, LangChain, LangGraph, Vercel AI SDK, OpenAI Agents, LiveKit, PI AI, Pydantic AI, and more * Add stable [agent identity](/integrations/traces/agent-identity) so the Agents dashboard groups runs correctly **Outcome:** Deep visibility into how your agents behave end to end, plus an automated reviewer that points you to the highest-impact fixes. Install the SDK and capture your first trace. *** ## Gateway Record and analyze your production LLM traffic. Catalyst Gateway sits between your app and your LLM provider, capturing every request, response, cost, and latency metric with less than 10ms of overhead. Keep using any provider or model — Gateway is transparent. **What you'll do:** * [Integrate with your LLM provider](/platform/gateway/integrate) to start capturing traffic * [Define tasks](/platform/gateway/tasks) to group LLM calls by objective (e.g. "summarize docs", "classify tickets") * [Explore metrics](/platform/gateway/metrics-explorer) for cost, latency, errors, and token usage across all your calls * [Browse individual inferences](/platform/gateway/inference-viewer) to inspect raw requests and responses **Outcome:** Full visibility into how your AI features perform in production — broken down by model, task, and provider. Set up Gateway and start capturing LLM traffic. *** ## Datasets Curate collections of LLM inputs and outputs for evaluation and training. Datasets can come from your live production traffic captured through Gateway, or from files you upload directly. **What you'll do:** * [Build datasets from traffic](/platform/datasets/build-from-traffic) by filtering captured inferences and saving them * [Upload your own data](/platform/datasets/upload-a-dataset) as JSONL files when you have curated examples or are migrating from another platform * Understand [dataset formats](/platform/datasets/formats) and the schema your data needs to follow **Outcome:** Clean, representative datasets scoped to specific tasks — ready to power evals and training. Build or upload your first dataset. *** ## Eval Measure model quality with rubrics scored by LLM judges. Define what "good" looks like for your use case, then score model outputs systematically across candidates. Evals tell you which model is better and by how much — so you can make decisions with data instead of intuition. **What you'll do:** * [Write a rubric](/platform/eval/write-a-rubric) that describes your quality criteria in plain English — from a template, AI generation, or scratch * [Run a model comparison](/platform/eval/run-a-comparison) to score multiple models side by side on your dataset * Understand how [LLM-as-a-judge](/platform/eval/llm-as-a-judge) scoring works under the hood * [Read the results](/platform/eval/read-the-results) to interpret scores and decide which model wins **Outcome:** A repeatable, data-driven way to measure model quality before and after every change — and a validated rubric that can guide training. Define quality, measure it, and compare models. *** ## Train Fine-tune a task-specific model on your production data. The result is a model that's smaller, faster, and cheaper to run than the general-purpose model it replaces — while being more accurate for your workload. You don't need to be an ML engineer to use it. **What you'll do:** * [Choose a recipe](/platform/train/choose-a-recipe) — a pre-configured training setup with a vetted base model and optimized parameters * [Launch a training run](/platform/train/launch-a-run) with your training dataset, eval dataset, and rubric * [Monitor mid-training evals](/platform/train/mid-training-evals) to track quality scores as the model learns **Outcome:** A trained, task-specific model that's been validated against your rubric — ready to deploy. Fine-tune a model on your data. *** ## Deploy Ship your trained model to a dedicated GPU with an OpenAI-compatible API. The API uses the same base URL and API key as the rest of the Inference platform — switching from an off-the-shelf model to your custom model is a one-line code change. **What you'll do:** * [Deploy a trained model](/platform/deploy/deploy-a-model) to a dedicated GPU in a few clicks * [Call your deployment](/platform/deploy/call-your-deployment) using the same OpenAI-compatible SDK you already use * [Manage and monitor](/platform/deploy/manage-and-monitor) your deployment lifecycle, scaling, and performance **Outcome:** A production endpoint serving your custom model — and the beginning of the next improvement loop. Deploy, observe, eval, retrain. Ship your model to a dedicated GPU. *** ## Pick your starting point Install the tracing SDK and capture LLM calls, tool calls, and agent steps. Inspect trace trees and run Halo to find what to improve. Route traffic through the Catalyst gateway to capture LLM calls and view metrics. Define quality, measure it, and compare models side by side. The full loop: data, training, and a production endpoint. Access open-source and Inference.net models directly.