- A boot-time setup() that runs once per process.
- A request handler that wraps the whole agent run in an AGENT span with a stable agent.id and a per-conversation session.id.
- Custom tool execution wrapped in TOOL spans with tool.name and tool_call.id.
- Auto-emitted LLM child spans from the patched Anthropic SDK, nested under the agent span by OTel context propagation.
- Domain-specific attributes (tenant, channel, viewer role) on the agent span for filtering in the dashboard.
- A graceful shutdown that flushes batched spans on SIGTERM.
This example hand-rolls the agent loop with the raw Anthropic SDK, so you
author the TOOL spans yourself (Step 3). The patched SDK captures the LLM
calls automatically, but your tools run in your own code, so nothing emits a
tool span unless you do.

If you use a framework that runs the tools for you (OpenAI Agents, LangGraph),
the framework emits the TOOL spans and you skip Step 3. There you would only
add manual spans for steps the framework never sees, like a retrieval inside a
tool. See OpenAI Agents for that path.

Step 1 — Bootstrap Tracing Once
Tracing should initialize once per process, not per request. For a long-lived
server, that means a memoized setup() call that any code path can await.
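One way to get once-per-process behavior is to cache the promise rather than the result, so that concurrent callers share a single initialization. The sketch below is illustrative only: the name initTracing follows the notes in this step, but its body is an assumption, and sdkSetup is a hypothetical stand-in for the tracing SDK's real setup call.

```typescript
// Hypothetical stand-in for the tracing SDK's real setup call.
// Replace the body with your actual initialization.
let setupCalls = 0;
async function sdkSetup(): Promise<void> {
  setupCalls += 1;
  // Simulate asynchronous initialization work.
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
}

// Cache the promise, not the result: callers that arrive while
// initialization is still in flight await the same promise.
let pending: Promise<void> | undefined;

function initTracing(): Promise<void> {
  if (pending === undefined) {
    pending = sdkSetup().catch((err) => {
      // Clear the cache on failure so a later call can retry.
      pending = undefined;
      throw err;
    });
  }
  return pending;
}
```

Any code path can then await initTracing() without knowing whether it is the first caller.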
- initTracing() runs before the first Anthropic client is constructed. The per-SDK patchers work by mutating the SDK's prototype, so setup() has to win the race.
- shutdown() runs on SIGTERM, not per request. Spans are batched and exported in the background; calling shutdown() per request would force synchronous flushes and add latency.
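The shutdown rule can be sketched as a signal hook that flushes once and then exits. This is a minimal illustration, not the SDK's own wiring: installShutdownHook and SignalSource are hypothetical names, and flush stands in for the tracing shutdown() call.

```typescript
// Minimal shape of a signal source. In a real server, Node's `process`
// object can be passed here.
interface SignalSource {
  once(event: "SIGTERM", listener: () => void): unknown;
}

// Registers a one-shot SIGTERM handler that flushes buffered spans and
// then calls `exit`. `flush` stands in for the tracing shutdown() call.
function installShutdownHook(
  source: SignalSource,
  flush: () => Promise<void>,
  exit: (code: number) => void,
): void {
  source.once("SIGTERM", () => {
    flush()
      .then(() => exit(0))
      .catch(() => exit(1)); // exit non-zero if the flush failed
  });
}
```

Passing the signal source and the exit function as parameters keeps the hook testable without sending a real signal.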
Step 2 — Define The Request Boundary
Each incoming message becomes one trace, rooted at one AGENT span. The
agent span carries the identifiers Catalyst uses for grouping in the Agents
dashboard.
app.* attributes are outside the OpenInference vocabulary. They
go on the raw OTel span and become filter facets in the dashboard. Use the
same naming convention (a stable prefix for your app, dot-separated keys) so
you can find them easily with inf trace list --metadata "app.channel=slack".
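The prefix convention is easier to keep consistent when one helper builds the attribute map. A minimal sketch, assuming flat string values: appAttributes and the RequestContext fields are hypothetical names, and of the keys shown only app.tenant_id and app.channel appear in this guide (app.viewer_role is an illustrative guess).

```typescript
interface RequestContext {
  tenantId: string;
  channel: string;
  viewerRole: string;
}

// One stable prefix for every domain attribute the app emits.
const APP_PREFIX = "app";

// Builds the domain attributes for the agent span in one place so the
// key names cannot drift between handlers.
function appAttributes(ctx: RequestContext): Record<string, string> {
  return {
    [`${APP_PREFIX}.tenant_id`]: ctx.tenantId,
    [`${APP_PREFIX}.channel`]: ctx.channel,
    [`${APP_PREFIX}.viewer_role`]: ctx.viewerRole, // hypothetical key
  };
}
```

The resulting map can be copied onto the raw OTel span, for example with the OTel Span setAttributes method.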
Step 3 — Author Tool Spans Around Each Tool Call
When the LLM emits a tool_use block, your code runs the actual tool
function. Wrap that execution in a TOOL span so the trace tree shows what
the tool received, what it returned, and how long it took.
manualSpan writes openinference.span.kind=TOOL, tool.name,
tool_call.id, input.value, and input.mime_type from the options. The
callback only needs to set the output. Span end, status, and exception
recording are all handled — if the tool throws, the exception is recorded
on the span, the span ends with ERROR, and the original exception
re-throws so the agent loop can see it.
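The contract described above can be pictured with a small stand-in wrapper. This is not the library's manualSpan implementation, only a sketch of the same behavior over an in-memory record; withToolSpan, ToolSpanRecord, and the finished array are hypothetical.

```typescript
interface ToolSpanRecord {
  name: string;
  toolCallId: string;
  input: string;
  output?: string;
  status: "OK" | "ERROR";
  exception?: string;
  ended: boolean;
}

// In-memory stand-in for an exporter, so the behavior can be inspected.
const finished: ToolSpanRecord[] = [];

async function withToolSpan(
  opts: { name: string; toolCallId: string; input: unknown },
  fn: (span: { setOutput(value: unknown): void }) => Promise<void>,
): Promise<void> {
  const record: ToolSpanRecord = {
    name: opts.name,
    toolCallId: opts.toolCallId,
    input: JSON.stringify(opts.input),
    status: "OK",
    ended: false,
  };
  try {
    // The callback only has to set the output.
    await fn({
      setOutput: (value) => {
        record.output = JSON.stringify(value);
      },
    });
  } catch (err) {
    record.status = "ERROR";
    record.exception = err instanceof Error ? err.message : String(err);
    throw err; // re-throw so the agent loop still sees the failure
  } finally {
    record.ended = true; // the span ends on both paths
    finished.push(record);
  }
}
```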
Because executeTool runs inside the active context established by
agentSpan upstream, the TOOL span automatically parents under the agent
span. No span IDs need to be threaded through.
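Implicit parenting works because the active span travels with the async call chain instead of being passed as an argument. In Node, OTel context propagation is typically backed by AsyncLocalStorage, and the sketch below uses it directly to show the idea. startSpan, SpanNode, and the span names are hypothetical; real spans come from the tracing SDK.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface SpanNode {
  name: string;
  parent: string | undefined;
}

const activeSpan = new AsyncLocalStorage<SpanNode>();
const recorded: SpanNode[] = [];

// Starts a span whose parent is whatever span is active right now,
// then runs `fn` with the new span as the active one.
async function startSpan<T>(name: string, fn: () => Promise<T>): Promise<T> {
  const span: SpanNode = { name, parent: activeSpan.getStore()?.name };
  recorded.push(span);
  return activeSpan.run(span, fn);
}

// The tool code never receives a span ID from its caller.
async function executeTool(): Promise<string> {
  return startSpan("tool:lookup_order", async () => {
    await new Promise<void>((resolve) => setTimeout(resolve, 5));
    return "ok";
  });
}

async function handleRequest(): Promise<string> {
  return startSpan("agent:support", () => executeTool());
}
```

After handleRequest() resolves, the tool span's recorded parent is the agent span, even though no identifier was passed between the two functions.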
Step 4 — Wire The Agent Loop
The agent loop alternates between calling the LLM and executing tool calls the LLM requests. Both sides are now instrumented.
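A hand-rolled loop of this kind might look like the sketch below. It is not the guide's original listing: the types are a trimmed subset of the Anthropic Messages API shapes, callModel stands in for client.messages.create(), executeTool stands in for the span-wrapped tool runner from Step 3, and runAgentLoop and maxTurns are hypothetical names.

```typescript
// Trimmed subset of the Anthropic Messages API shapes used by the loop.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

type ToolResultBlock = { type: "tool_result"; tool_use_id: string; content: string };

interface Message {
  role: "user" | "assistant";
  content: string | Array<ContentBlock | ToolResultBlock>;
}

interface ModelReply {
  stop_reason: string | null;
  content: ContentBlock[];
}

async function runAgentLoop(
  userText: string,
  callModel: (messages: Message[]) => Promise<ModelReply>,
  executeTool: (name: string, toolCallId: string, input: unknown) => Promise<string>,
  maxTurns = 8,
): Promise<string> {
  const messages: Message[] = [{ role: "user", content: userText }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await callModel(messages);
    messages.push({ role: "assistant", content: reply.content });
    if (reply.stop_reason !== "tool_use") {
      // Final answer: concatenate the text blocks.
      return reply.content
        .map((block) => (block.type === "text" ? block.text : ""))
        .join("");
    }
    // Run every requested tool and send the results back in one user turn.
    const results: ToolResultBlock[] = [];
    for (const block of reply.content) {
      if (block.type === "tool_use") {
        const output = await executeTool(block.name, block.id, block.input);
        results.push({ type: "tool_result", tool_use_id: block.id, content: output });
      }
    }
    messages.push({ role: "user", content: results });
  }
  throw new Error(`agent loop exceeded ${maxTurns} turns`);
}
```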
- No tracing imports in the inner loop. The agent code looks the same as it would without tracing. The instrumentation is at the boundaries (setup(), agentSpan(), executeTool()).
- The patched Anthropic SDK does the LLM-span work. We pass modules: { anthropic: Anthropic } to setup(), and from then on every client.messages.create() call emits an LLM span with input messages, output content blocks, model, finish reason, and token usage.
- Tool spans are caller-side. They wrap the real function execution, not the message round-trip. The model-side view of the tool call is captured on the parent LLM span automatically; the caller-side view is the TOOL span we author.
Step 5 — Verify In The Dashboard And CLI
Send a request through the server, then check the resulting trace in the dashboard or from the CLI, for example with inf trace get <id> --view tree.
Common Variations
Multi-Tenant Service With Per-Request Identity
If agent.id itself depends on the request (for example, a multi-tenant
service that runs different agent personas per customer), compute it in the
handler.
Keep the ID stable across releases: prefer support-acme-prod
over support-acme-2024-v2, because the Agents dashboard uses the ID to group runs
across deploys.
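A per-request agent.id can be derived from values that do not change between deploys. The helper below is a hypothetical sketch: agentIdFor and its persona-tenant-environment layout are inferred from the example ID support-acme-prod, not taken from the SDK.

```typescript
// Builds a stable agent.id from values that do not change between deploys.
// Deliberately takes no version or date, so runs keep grouping together.
function agentIdFor(persona: string, tenant: string, environment: string): string {
  // Lowercase, collapse runs of other characters to "-", trim stray dashes.
  const slug = (value: string): string =>
    value
      .trim()
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-+|-+$/g, "");
  return [persona, tenant, environment].map(slug).join("-");
}
```

For example, agentIdFor("support", "Acme", "prod") yields support-acme-prod.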
Background Jobs Triggered From The Agent
If your tool launches a background job that itself does LLM work, capture the active identity and pass it into the job so the background span can be filtered together with its originating conversation.
The background job then sets the same agent.id and session.id on its own
agent span so the two pieces of work share dashboard grouping.
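The hand-off can be sketched as plain data carried in the job payload. Everything here is hypothetical (AgentIdentity, SummaryJob, the in-memory queue); only the agent.id and session.id attribute names come from this guide.

```typescript
interface AgentIdentity {
  agentId: string;
  sessionId: string;
}

interface SummaryJob {
  identity: AgentIdentity;
  documentId: string;
}

// In-memory stand-in for a real job queue.
const queue: SummaryJob[] = [];

// Called from inside the tool, while the originating identity is known.
function enqueueSummary(identity: AgentIdentity, documentId: string): void {
  // Copy the identity so later mutation by the caller cannot leak into the job.
  queue.push({ identity: { ...identity }, documentId });
}

// Runs later, outside the request. Returns the attributes the job's
// own agent span should carry.
function jobSpanAttributes(job: SummaryJob): Record<string, string> {
  return {
    "agent.id": job.identity.agentId,
    "session.id": job.identity.sessionId,
  };
}
```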
Streaming Responses
When the agent streams output back to the user, set the span output once at the end, after the stream completes. The patched SDK already handles streaming LLM calls correctly; the outer agent span just needs the final text.
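The pattern is to forward chunks as they arrive while accumulating them, then record the joined text a single time. A minimal sketch, assuming the stream yields plain text chunks; streamWithFinalOutput and the OutputSink interface are hypothetical, and the real span handle comes from the tracing SDK.

```typescript
// Hypothetical minimal span handle.
interface OutputSink {
  setOutput(value: string): void;
}

// Forwards each chunk to the client as it arrives and records the
// concatenated text on the span exactly once, after the stream ends.
async function streamWithFinalOutput(
  chunks: AsyncIterable<string>,
  send: (chunk: string) => void,
  span: OutputSink,
): Promise<string> {
  const parts: string[] = [];
  for await (const chunk of chunks) {
    parts.push(chunk);
    send(chunk);
  }
  const finalText = parts.join("");
  span.setOutput(finalText);
  return finalText;
}
```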
Custom Span Events
For mid-callback events that are not span attributes (a rate-limit retry, a fallback to a smaller model, a cache miss), use span.raw.addEvent.
Events appear under the --view events flag of inf span get and on the
span detail page.
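As an illustration, a retry helper can record one event per failed attempt. This sketch assumes span.raw exposes addEvent(name, attributes) as the OTel Span interface does; withRetryEvents, EventSpan, and the event name rate_limit.retry are hypothetical.

```typescript
// Structural subset of the OTel Span interface used here.
interface EventSpan {
  addEvent(name: string, attributes?: Record<string, string | number | boolean>): unknown;
}

// Retries `fn` up to `maxAttempts` times, recording one span event per
// failed attempt. Re-throws the last error if every attempt fails.
async function withRetryEvents<T>(
  span: EventSpan,
  fn: (attempt: number) => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      span.addEvent("rate_limit.retry", {
        attempt,
        reason: err instanceof Error ? err.message : String(err),
      });
    }
  }
  throw lastError;
}
```

Inside a span callback this would be called as withRetryEvents(span.raw, ...), so the events land on the span that is currently being recorded.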
What To Test
| Behavior | How to verify |
|---|---|
| setup() runs before the first SDK call | Search server logs for the Catalyst tracing init message; confirm it precedes any Anthropic request log. |
| LLM spans parent under the agent span | inf trace get <id> --view tree shows a single AGENT root with LLM and TOOL children. |
| Tool span has tool.name and tool_call.id | inf span get <id> --view attributes |
| Errors mark the span ERROR | Force a tool to throw; confirm the span status is ERROR and the trace status is ERROR. |
| Spans flush on SIGTERM | Send SIGTERM to the server right after a request; the trace should still appear in Catalyst. |
| Domain attributes are filterable | inf trace list --metadata "app.tenant_id=acme" returns the expected traces. |
Next Steps
Manual spans
The full surface for AGENT, TOOL, CHAIN, and RETRIEVER spans.
Attributes reference
All Attr.* constants and SpanKindValues with the attributes each kind expects.
Handle API reference
Every method on the span handle and how it coerces values.
Troubleshooting
Debug missing spans, missing attributes, and shutdown behavior.