A production-shaped agent with custom tool execution, end to end. Memoized setup, parent agent span, per-tool TOOL spans, domain attributes, and graceful shutdown.
This example walks through a realistic agent loop end to end: a long-lived
server that handles incoming messages, runs an LLM-driven agent that calls
several custom tools, and exports a clean OpenInference trace tree to
Catalyst. Every piece in this guide maps to a real pattern used by agents in
production today.

By the end you will have:

- A boot-time setup() that runs once per process.
- A request handler that wraps the whole agent run in an AGENT span with a
  stable agent.id and a per-conversation session.id.
- Custom tool execution wrapped in TOOL spans with tool.name and
  tool_call.id.
- Auto-emitted LLM child spans from the patched Anthropic SDK, nested under
  the agent span by OTel context propagation.
- Domain-specific attributes (tenant, channel, viewer role) on the agent span
  for filtering in the dashboard.
- A graceful shutdown that flushes batched spans on SIGTERM.

Tracing should initialize once per process, not per request. For a
long-lived server, that means a memoized setup() call that any code path
can await.
TypeScript
```typescript
// tracing.ts
import Anthropic from "@anthropic-ai/sdk";
import { setup, type CatalystTracing } from "@inference/tracing";

let tracingPromise: Promise<CatalystTracing> | null = null;

export function initTracing(): Promise<CatalystTracing> {
  if (!tracingPromise) {
    tracingPromise = setup({
      serviceName: process.env.SERVICE_NAME ?? "customer-support",
      serviceVersion: process.env.SERVICE_VERSION,
      endpoint: process.env.CATALYST_OTLP_ENDPOINT,
      token: process.env.CATALYST_OTLP_TOKEN,
      modules: { anthropic: Anthropic },
    });
  }
  return tracingPromise;
}

export async function shutdownTracing(): Promise<void> {
  if (!tracingPromise) return;
  const tracing = await tracingPromise;
  await tracing.shutdown();
}
```
TypeScript (server entrypoint)
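```typescript
// server.ts
import { initTracing, shutdownTracing } from "./tracing.ts";

await initTracing(); // patches Anthropic before any client is constructed

// Load the server module only after tracing is initialized. A static import
// would be hoisted and evaluated before initTracing() runs. "./http.ts" is a
// placeholder for whichever module defines startServer.
const { startServer } = await import("./http.ts");
const server = startServer();

for (const signal of ["SIGTERM", "SIGINT"] as const) {
  process.on(signal, async () => {
    await shutdownTracing(); // flush batched spans before exiting
    server.close(() => process.exit(0));
  });
}
```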
Two things to notice:

- initTracing() runs before the first Anthropic client is constructed.
  The per-SDK patchers work by mutating the SDK’s prototype, so setup()
  has to win the race.
- shutdown() runs on a process signal (SIGTERM or SIGINT), not per request.
  Spans are batched and exported in the background; calling shutdown() per
  request would force synchronous flushes and add latency.
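
The once-per-process behavior can be checked in isolation. The sketch below is a minimal stand-in, not the library's code: fakeSetup, initOnce, and TracingHandle are hypothetical names introduced only for this illustration. It shows why the promise itself is cached rather than the resolved value, so callers that arrive while setup is still in flight share one pending promise.

```typescript
// Stand-in for the real tracing handle; only the shape needed for the demo.
interface TracingHandle {
  shutdown(): Promise<void>;
}

// Counts invocations so the memoization is observable.
let setupCalls = 0;

// Hypothetical setup function standing in for the real setup() call.
async function fakeSetup(): Promise<TracingHandle> {
  setupCalls += 1;
  return { shutdown: async () => {} };
}

let handlePromise: Promise<TracingHandle> | null = null;

// Cache the promise, not the resolved handle: a second caller that arrives
// before setup finishes gets the same pending promise instead of starting
// a second setup.
function initOnce(): Promise<TracingHandle> {
  if (!handlePromise) {
    handlePromise = fakeSetup();
  }
  return handlePromise;
}

// Two callers in the same tick share one promise and one setup call.
const first = initOnce();
const second = initOnce();
```

Here first and second are the same promise object and setupCalls stays at 1, which is the property the request handlers rely on when they each await initTracing().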
Each incoming message becomes one trace, rooted at one AGENT span. The
agent span carries the identifiers Catalyst uses for grouping in the Agents
dashboard.
The four app.* attributes are outside the OpenInference vocabulary. They
go on the raw OTel span and become filter facets in the dashboard. Keep one
naming convention throughout (a stable prefix for your app, dot-separated
keys) so that you can filter on them, for example with
inf trace list --metadata "app.channel=slack".
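
One way to keep the prefix consistent is to build the attribute map in a single helper. The following is a sketch under stated assumptions: domainAttributes and toSnakeCase are hypothetical helpers, the snake_case key style is an assumption, and the tenant and viewer-role values are made up. Only the app.channel=slack key comes from this guide.

```typescript
type AttributeValue = string | number | boolean;

// Convert camelCase field names to snake_case (viewerRole -> viewer_role).
// The snake_case convention is an assumption for this sketch.
function toSnakeCase(key: string): string {
  return key.replace(/[A-Z]/g, (ch) => `_${ch.toLowerCase()}`);
}

// Prefix every domain field with one stable namespace and drop undefined
// values, so every caller produces keys of the same shape.
function domainAttributes(
  prefix: string,
  fields: Record<string, AttributeValue | undefined>,
): Record<string, AttributeValue> {
  const attrs: Record<string, AttributeValue> = {};
  for (const [key, value] of Object.entries(fields)) {
    if (value === undefined) continue;
    attrs[`${prefix}.${toSnakeCase(key)}`] = value;
  }
  return attrs;
}

// Illustrative values only.
const attrs = domainAttributes("app", {
  tenant: "acme",
  channel: "slack",
  viewerRole: "admin",
});
```

The resulting map has the keys app.tenant, app.channel, and app.viewer_role, and can be set on the agent span in one call from the handler.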
When the LLM emits a tool_use block, your code runs the actual tool
function. Wrap that execution in a TOOL span so the trace tree shows what
the tool received, what it returned, and how long it took.
manualSpan writes openinference.span.kind=TOOL, tool.name,
tool_call.id, input.value, and input.mime_type from the options. The
callback only needs to set the output. Span end, status, and exception
recording are all handled: if the tool throws, the exception is recorded
on the span, the span ends with ERROR, and the original exception is
re-thrown so the agent loop can see it.

Because executeTool runs inside the active context established by
agentSpan upstream, the TOOL span automatically parents under the agent
span. No span IDs need to be threaded through.
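
The contract described above can be illustrated without the library. The sketch below is an in-memory stand-in, not the manualSpan implementation: withToolSpan, RecordedSpan, and lookupOrder are hypothetical names, the JSON serialization of inputs and outputs is an assumption, and setting an OK status on success is also an assumption. It mirrors only the behavior this section states: attributes written from the options, output set by the callback, and errors recorded and re-thrown.

```typescript
// In-memory record of what a TOOL span would carry (illustration only).
interface RecordedSpan {
  attributes: Record<string, string>;
  status: "UNSET" | "OK" | "ERROR";
  exception?: string;
  ended: boolean;
}

const recorded: RecordedSpan[] = [];

async function withToolSpan<T>(
  opts: { toolName: string; toolCallId: string; input: unknown },
  run: (setOutput: (value: unknown) => void) => Promise<T>,
): Promise<T> {
  // Attributes derived from the options, before the tool body runs.
  const span: RecordedSpan = {
    attributes: {
      "openinference.span.kind": "TOOL",
      "tool.name": opts.toolName,
      "tool_call.id": opts.toolCallId,
      "input.value": JSON.stringify(opts.input),
      "input.mime_type": "application/json",
    },
    status: "UNSET",
    ended: false,
  };
  recorded.push(span);
  try {
    // The callback's only tracing duty is to report its output.
    const result = await run((value) => {
      span.attributes["output.value"] = JSON.stringify(value);
      span.attributes["output.mime_type"] = "application/json";
    });
    span.status = "OK"; // assumption: success is marked OK
    return result;
  } catch (err) {
    span.status = "ERROR";
    span.exception = err instanceof Error ? err.message : String(err);
    throw err; // re-throw so the agent loop still sees the failure
  } finally {
    span.ended = true; // the span ends on both paths
  }
}

// Hypothetical tool body wrapped in the stand-in span helper.
async function lookupOrder(input: { orderId: string }, toolCallId: string) {
  return withToolSpan(
    { toolName: "lookup_order", toolCallId, input },
    async (setOutput) => {
      const order = { orderId: input.orderId, status: "shipped" };
      setOutput(order);
      return order;
    },
  );
}
```

Keeping the span bookkeeping in one wrapper means each tool function contains a single tracing-related line, the setOutput call.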
If your tool needs behavior manualSpan does not provide — for instance,
recording a span event mid-callback while keeping the span alive past the
callback return — drop down to tracing.tracer.startActiveSpan and manage
status / span.end() yourself. See
Manual spans → Alternative TypeScript patterns.
The agent loop alternates between calling the LLM and executing tool calls
the LLM requests. Both sides are now instrumented.
TypeScript
```typescript
// agent.ts
import Anthropic from "@anthropic-ai/sdk";
import { executeTool, type ToolName } from "./tools.ts";
import type { IncomingMessage } from "./handler.ts";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const TOOLS: Anthropic.Tool[] = [
  {
    name: "lookup_order",
    description: "Look up an order by ID.",
    input_schema: {
      type: "object",
      properties: { orderId: { type: "string" } },
      required: ["orderId"],
    },
  },
  {
    name: "issue_refund",
    description: "Issue a refund.",
    input_schema: {
      type: "object",
      properties: {
        orderId: { type: "string" },
        amount: { type: "number" },
      },
      required: ["orderId", "amount"],
    },
  },
];

export async function runAgent(msg: IncomingMessage): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: msg.text },
  ];

  for (let turn = 0; turn < 8; turn++) {
    // The patched Anthropic SDK emits an LLM span automatically, parented
    // under the active agent span via OTel context propagation.
    const response = await client.messages.create({
      model: "claude-haiku-4-5",
      max_tokens: 1024,
      tools: TOOLS,
      messages,
    });

    if (response.stop_reason === "end_turn") {
      return textOf(response.content);
    }
    if (response.stop_reason !== "tool_use") {
      return textOf(response.content);
    }

    messages.push({ role: "assistant", content: response.content });

    const toolResults: Anthropic.ToolResultBlockParam[] = [];
    for (const block of response.content) {
      if (block.type !== "tool_use") continue;
      const result = await executeTool(
        block.name as ToolName,
        block.input as Record<string, unknown>,
        block.id,
      );
      toolResults.push({
        type: "tool_result",
        tool_use_id: block.id,
        content: JSON.stringify(result),
      });
    }
    messages.push({ role: "user", content: toolResults });
  }

  return "Max turns reached.";
}

function textOf(content: Anthropic.ContentBlock[]): string {
  return content
    .filter((b): b is Anthropic.TextBlock => b.type === "text")
    .map((b) => b.text)
    .join("");
}
```
Three observations:

- No tracing imports in the inner loop. The agent code looks the same as
  it would without tracing. The instrumentation is at the boundaries
  (setup(), agentSpan(), executeTool()).
- The patched Anthropic SDK does the LLM-span work. We pass
  modules: { anthropic: Anthropic } to setup(), and from then on every
  client.messages.create() call emits an LLM span with input messages,
  output content blocks, model, finish reason, and token usage.
- Tool spans are caller-side. They wrap the real function execution, not
  the message round-trip. The model-side view of the tool call is captured
  on the parent LLM span automatically; the caller-side view is the
  TOOL span we author.
Send a request through the server, then check the resulting trace:
```bash
# Find the most recent trace from this service
inf trace list --service customer-support --limit 1

# Open its span tree
inf trace get <trace-id> --view tree

# Inspect a TOOL span's input and output
inf span list --trace-id <trace-id> --kind TOOL
inf span get <trace-id> <span-id> --view io

# Filter on a domain attribute
inf trace list --metadata "app.channel=slack" --range 1h
```
The trace tree should match the diagram at the top of this page.
If agent.id itself depends on the request (for example, a multi-tenant
service that runs different agent personas per customer), compute it in the
handler:
Stable IDs matter more than descriptive ones. Prefer support-acme-prod
over support-acme-2024-v2: the Agents dashboard uses the ID to group runs
across deploys, so an ID that embeds a year or version splits one agent's
history every time that part changes.
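
A per-request agent.id can be derived from inputs that do not change between deploys. This is a minimal sketch, assuming the ID is composed of a persona, a tenant, and an environment; agentIdFor and slug are hypothetical helpers, and the three-part layout is inferred only from the support-acme-prod example above.

```typescript
// Normalize a free-form name into a lowercase, dash-separated slug.
function slug(value: string): string {
  return value
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
}

// Build the ID only from inputs that stay fixed across deploys:
// no dates, build numbers, or version suffixes.
function agentIdFor(persona: string, tenant: string, environment: string): string {
  return [persona, tenant, environment].map(slug).join("-");
}

const agentId = agentIdFor("support", "Acme", "prod"); // "support-acme-prod"
```

Because the function is deterministic, two deploys serving the same tenant and persona report the same agent.id and their runs stay grouped.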
If your tool launches a background job that itself does LLM work, capture
the active identity and pass it into the job so the background span can be
filtered together with its originating conversation:
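
The hand-off can be sketched as plain data passing. Everything in this block is an assumption made for illustration: TraceIdentity, RefundAuditJob, the in-memory queue, and both function names are hypothetical, and a real service would use its own job system. The point is only that the identity travels in the job payload, because the background worker runs outside the original request context.

```typescript
// The identifiers that tie a span back to its originating conversation.
interface TraceIdentity {
  agentId: string;
  sessionId: string;
}

// Hypothetical job payload: the work item plus the captured identity.
interface RefundAuditJob {
  orderId: string;
  identity: TraceIdentity;
}

// Stand-in for a real job queue.
const queue: RefundAuditJob[] = [];

// Called from inside the tool, while the request's identity is still known.
function enqueueRefundAudit(orderId: string, identity: TraceIdentity): void {
  // Copy the identity so later changes to the request object cannot leak in.
  queue.push({ orderId, identity: { ...identity } });
}

// Called later by the worker: the attributes to put on the background span.
function attributesForJob(job: RefundAuditJob): Record<string, string> {
  return {
    "agent.id": job.identity.agentId,
    "session.id": job.identity.sessionId,
  };
}
```

The worker would set these attributes on whatever span wraps the background LLM work, so filtering by session.id returns both the original run and the job it spawned.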
When the agent streams output back to the user, set the span output once at
the end, after the stream completes. The patched SDK already handles
streaming LLM calls correctly; the outer agent span just needs the final
text:
TypeScript
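```typescript
await agentSpan(tracing.tracer, options, async (span) => {
  span.setInput(msg.text);

  let final = "";
  for await (const chunk of streamAgent(msg)) {
    final += chunk;
    // Forward each chunk to the caller. `onChunk` is a placeholder for
    // however your transport delivers chunks (for example an SSE write);
    // `yield` cannot be used inside this async callback.
    onChunk(chunk);
  }

  // Set the output once, after the stream has completed.
  span.setOutput(final);
});
```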