Install the @inference/tracing (TypeScript) or inference-catalyst-tracing (Python) package, point it at Catalyst, and capture your first span. If you’d rather see the higher-level flow, start with Capture your first trace.
The example below uses OpenAI because it’s the smallest end-to-end trace. The same setup pattern applies to Anthropic, LangChain, LangGraph, LangSmith, OpenAI Agents, LiveKit Agents, ElevenLabs Agents, PI AI, Cursor SDK, Claude Agent SDK, Pydantic AI, the Vercel AI SDK, and manual spans.
Choose a setup path
Installing with AI is the quickest. Use the manual flow if you want to wire it up yourself.
- Install with AI
- Install manually
Use the Inference CLI to launch a coding agent like Claude Code, OpenCode, or Codex to install the tracing SDK, configure export, and wire up your LLM clients.
Install the CLI and authenticate
Install the Inference CLI globally and log in. Your browser will open to authenticate.
Run tracing instrumentation in your project
From your project root, run instrumentation in tracing mode.
The command guides you through the following workflow:
- Select a coding agent: Claude Code, OpenCode, or Codex.
- Scan your codebase for LLM clients and agent frameworks.
- Install @inference/tracing or inference-catalyst-tracing plus the right per-integration extras.
- Wire setup() into your app entrypoint so spans start before clients are constructed.
- Add stable service and agent identity so traces group cleanly in the dashboard.
- Review the generated changes before applying them.
Run your app
Run your application how you normally would. Traces stream to Catalyst as your code executes.
View your trace
Open the dashboard and filter by your service name to see the captured trace tree.
Want the full canonical guide for this workflow? See Install with AI.
Group calls under an agent
The example above is a one-shot LLM call. Once your app runs multiple LLM calls as part of a logical unit (an agent run, a conversation turn, a workflow), wrap that unit in agentSpan so the LLM spans nest under an AGENT row carrying agent.id, agent.name, and session.id. The Agents dashboard groups on those attributes.
Each wrapped LLM call then appears under the agentSpan row instead of as an orphan. Real agents run many LLM calls per session; the outer span is what makes them findable as one thing.
Wrap your own code
For non-LLM steps inside an agent loop (a tool call, a retrieval, a custom router, an evaluator, a CLI subprocess), wrap them with manualSpan. Combined with the agentSpan above, you get a full trace tree: an outer AGENT row, inner SDK rows, and inner manual rows, all parented correctly.
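To make the resulting tree shape concrete, here is a toy model of span parenting. It does not use the SDK: the agentSpan and manualSpan functions below are local stand-ins, and their (name, fn) signature is an assumption, since this guide does not show the real one. The model is also synchronous only, whereas a real tracer has to propagate context across async calls.

```typescript
// Toy model only. These stand-ins are NOT the SDK's agentSpan/manualSpan;
// the (name, fn) signature is assumed for illustration.
interface SpanNode {
  kind: "AGENT" | "LLM" | "MANUAL";
  name: string;
  children: SpanNode[];
}

const roots: SpanNode[] = []; // spans opened with no parent
const stack: SpanNode[] = []; // currently open spans, innermost last

// Opens a span, attaches it to the innermost open span, runs fn, then closes it.
function withSpan<T>(kind: SpanNode["kind"], name: string, fn: () => T): T {
  const node: SpanNode = { kind, name, children: [] };
  const parent = stack[stack.length - 1];
  (parent ? parent.children : roots).push(node);
  stack.push(node);
  try {
    return fn();
  } finally {
    stack.pop();
  }
}

function agentSpan<T>(name: string, fn: () => T): T {
  return withSpan("AGENT", name, fn);
}

function manualSpan<T>(name: string, fn: () => T): T {
  return withSpan("MANUAL", name, fn);
}

// Stands in for an auto-instrumented SDK call that emits an LLM span.
function fakeLlmCall(prompt: string): string {
  return withSpan("LLM", "chat-completion", () => `echo: ${prompt}`);
}

// One agent run: a retrieval step, then an LLM call, both inside the agent span.
agentSpan("support-agent", () => {
  const docs = manualSpan("retrieve-docs", () => ["doc-1"]);
  fakeLlmCall(`answer using ${docs.length} document(s)`);
});

// Renders a span and its descendants as indented lines.
function render(node: SpanNode, depth = 0): string[] {
  const lines = [`${"  ".repeat(depth)}${node.kind} ${node.name}`];
  for (const child of node.children) {
    lines.push(...render(child, depth + 1));
  }
  return lines;
}

const treeLines: string[] = [];
for (const root of roots) {
  treeLines.push(...render(root));
}
// treeLines.join("\n") is:
// AGENT support-agent
//   MANUAL retrieve-docs
//   LLM chat-completion
```

Without the outer agentSpan call, both inner spans would land in roots as separate top-level entries, which is the orphan case the outer span avoids.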
Flushing and process lifecycle
Spans are batched and exported in the background, so a process that exits or freezes before the batch flushes drops them. How you flush depends on the process shape:
- Short-lived script: call await tracing.shutdown() before exit. It force-flushes, then tears the provider down. The examples above do this.
- Long-lived service (HTTP server, Slack bot, queue worker): call setup() once per process before the first SDK client is constructed, memoize the result so any handler can await it, and call shutdown() only on SIGTERM. Never per request, since that forces a synchronous flush and adds latency.
- Serverless or edge (Lambda, Cloudflare Workers): memoize setup() the same way, but flush per invocation with tracing.provider.forceFlush() instead of shutdown(), since the provider must survive for the next warm invocation.
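For the short-lived script case, one way to guarantee the flush is to put shutdown() in a finally block so it runs even when the job throws. The sketch below is self-contained: setup() is a local stand-in for the SDK function, and the Tracing shape is assumed from the calls named above. runScript is a hypothetical helper name.

```typescript
// Assumed shape of the value setup() resolves to; only shutdown() is used here.
interface Tracing {
  shutdown(): Promise<void>;
}

// Hypothetical stand-in for setup() from @inference/tracing so the sketch runs
// on its own. The events array only exists to make the ordering observable.
const events: string[] = [];
async function setup(): Promise<Tracing> {
  events.push("setup");
  return {
    async shutdown(): Promise<void> {
      events.push("shutdown");
    },
  };
}

// Runs a job with tracing enabled and always flushes before returning,
// whether the job resolves or rejects.
async function runScript(job: () => Promise<void>): Promise<void> {
  const tracing = await setup();
  try {
    await job();
  } finally {
    await tracing.shutdown();
  }
}

// In a real script: runScript(async () => { /* call your LLM client here */ });
```

The finally block matters because a script that fails partway through is often the one whose trace you most want to see.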
Long-lived service
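A minimal sketch of the long-lived pattern, under stated assumptions: setup() here is a local stand-in for the SDK function, the Tracing shape is inferred from the calls named in this guide, and getTracing, handleRequest, and installShutdownHook are hypothetical helper names.

```typescript
// Assumed shape of the value setup() resolves to; only shutdown() is used here.
interface Tracing {
  shutdown(): Promise<void>;
}

// Hypothetical stand-in for setup() from @inference/tracing so the sketch runs
// on its own. The counters only exist to make the behavior observable.
let setupCalls = 0;
let shutdownCalls = 0;
async function setup(): Promise<Tracing> {
  setupCalls += 1;
  return {
    async shutdown(): Promise<void> {
      shutdownCalls += 1;
    },
  };
}

// Memoize the promise rather than the resolved value, so concurrent first
// requests share a single setup() call.
let tracingPromise: Promise<Tracing> | undefined;
function getTracing(): Promise<Tracing> {
  if (tracingPromise === undefined) {
    tracingPromise = setup();
  }
  return tracingPromise;
}

// A request handler awaits the shared instance and never calls shutdown().
async function handleRequest(): Promise<string> {
  await getTracing();
  return "ok";
}

// Minimal view of Node's `process`, so the hook can be exercised without it.
interface SignalSource {
  once(signal: "SIGTERM", handler: () => void): unknown;
}

// Flush and tear down once, when the process is asked to stop.
// Registering a SIGTERM listener removes Node's default exit, so a real
// service would also close its server and exit after shutdown() resolves.
function installShutdownHook(source: SignalSource): void {
  source.once("SIGTERM", () => {
    void getTracing().then((tracing) => tracing.shutdown());
  });
}

// In a real entrypoint: installShutdownHook(process);
```

Calling getTracing() at the top of the entrypoint, before any SDK client is constructed, keeps the ordering this guide asks for; later calls from handlers reuse the same promise.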
Serverless and edge runtimes
On Lambda, Cloudflare Workers, or any runtime that freezes the process between invocations, the background batch processor may never run, so spans are dropped. Memoize setup() the same way as a long-lived service (the provider is reused
across warm invocations), but flush at the end of each invocation with
tracing.provider.forceFlush() rather than calling shutdown(). Reserve
shutdown() for real process teardown, since it tears down the provider the
next warm invocation needs.
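The per-invocation flush can be sketched as a handler wrapper. This is an illustration under assumptions, not the SDK's own helper: setup() is a local stand-in, the Tracing shape is inferred from tracing.provider.forceFlush() and shutdown() as named above, and withTracingFlush is a hypothetical name.

```typescript
// Assumed shape of the value setup() resolves to, based on the calls named
// in this guide.
interface Tracing {
  shutdown(): Promise<void>;
  provider: { forceFlush(): Promise<void> };
}

// Hypothetical stand-in for setup() from @inference/tracing so the sketch runs
// on its own. The counters only exist to make the behavior observable.
let setupCalls = 0;
let flushCalls = 0;
async function setup(): Promise<Tracing> {
  setupCalls += 1;
  return {
    async shutdown(): Promise<void> {},
    provider: {
      async forceFlush(): Promise<void> {
        flushCalls += 1;
      },
    },
  };
}

// Memoized across warm invocations of the same process.
let tracingPromise: Promise<Tracing> | undefined;
function getTracing(): Promise<Tracing> {
  if (tracingPromise === undefined) {
    tracingPromise = setup();
  }
  return tracingPromise;
}

// Wraps a handler so spans are flushed before the runtime can freeze the
// process. shutdown() is deliberately not called: the provider must survive
// for the next warm invocation.
function withTracingFlush<E, R>(
  handler: (event: E) => Promise<R>
): (event: E) => Promise<R> {
  return async (event: E): Promise<R> => {
    const tracing = await getTracing();
    try {
      return await handler(event);
    } finally {
      await tracing.provider.forceFlush();
    }
  };
}

// Example handler; the event shape is made up for illustration.
const handler = withTracingFlush(
  async (event: { user: string }) => `hello ${event.user}`
);
```

Two invocations in the same warm process call setup() once and forceFlush() twice, which is the behavior described above.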
Selective instrumentation
setup() auto-instruments every supported SDK it detects. If you want
explicit control — for example, to instrument OpenAI but skip LangChain — set
autoInstrument: false and call the targeted helper yourself.
Verify
Open the Catalyst dashboard and navigate to the Agents or Traces tab. The trace should include an OpenAI LLM span with input messages, output messages, model name, invocation parameters, finish reason, and token counts. Any custom spans you added show up as parent or sibling nodes in the trace tree. If you don’t see anything, see Troubleshooting.
Next Steps
Analyze your traces
Walk trace trees in the dashboard and run Halo to find what to improve.
OpenAI tracing
Add tool calls, structured outputs, and Responses API examples.
Manual spans
Wrap custom agents, CLI calls, and unsupported SDKs.
Agent identity
Add stable agent IDs so the Agents dashboard groups runs correctly.