These examples are copy-paste ready. Each one shows what gets captured and links to the integration page that covers the surface in depth. For setup and configuration, start with the Traces Quickstart. For an end-to-end view of a real production agent, see the Production Agent Example.

OpenAI Chat Completion

Initialize tracing before constructing the OpenAI client. The SDK patches Chat Completions and emits an LLM span with input messages, output messages, model name, invocation parameters, finish reason, and token counts.
import { setup } from "@inference/tracing";
import OpenAI from "openai";

const tracing = await setup({
  serviceName: "checkout-agent",
  modules: { openai: OpenAI },
});

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "You answer in one short sentence." },
    { role: "user", content: "Summarize order ABC-123." },
  ],
  max_tokens: 80,
});

console.log(response.choices[0]?.message.content);
await tracing.shutdown();
See OpenAI traces for tool calls, structured outputs, and the Responses API.

OpenAI Tool Round Trip

Tool calls are captured on the model span. The first turn records the assistant tool call and arguments; the second turn records the tool result in the input message list.
import { setup } from "@inference/tracing";
import OpenAI from "openai";

const tracing = await setup({ modules: { openai: OpenAI } });
const client = new OpenAI();

const tools = [
  {
    type: "function" as const,
    function: {
      name: "get_weather",
      description: "Look up the current weather in a city.",
      parameters: {
        type: "object",
        properties: { city: { type: "string" } },
        required: ["city"],
      },
    },
  },
];
const messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[] = [
  { role: "user", content: "What's the weather in San Francisco?" },
];

const first = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages,
  tools,
});
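const assistantMessage = first.choices[0]?.message;
if (!assistantMessage) throw new Error("No completion choice returned.");
// Echo the assistant turn back as-is; an empty tool_calls array is rejected by the API.
messages.push(assistantMessage);

for (const toolCall of assistantMessage.tool_calls ?? []) {
  // Newer SDK versions also model non-function tool calls; skip those here.
  if (toolCall.type !== "function") continue;
  const args = JSON.parse(toolCall.function.arguments) as { city: string };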
  messages.push({
    role: "tool",
    tool_call_id: toolCall.id,
    content: JSON.stringify({
      city: args.city,
      tempF: 62,
      condition: "sunny",
    }),
  });
}

const final = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages,
  tools,
});
console.log(final.choices[0]?.message.content);
await tracing.shutdown();
This captures the model-side view of tool calling: what the LLM asked for and the result you passed back. For a caller-side view that wraps the actual function execution in its own TOOL span, see Manual spans.
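The example hard-codes the tool result. As a minimal sketch of the step between the two model calls, the helper below maps each tool call to a local handler and builds the matching tool messages. The `handlers` registry and `runToolCalls` are hypothetical names introduced for illustration, and the structural types stand in for the SDK's types so the sketch runs on its own.

```typescript
// Minimal structural types so the sketch runs without the OpenAI SDK installed.
type FunctionToolCall = {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
};
type ToolMessage = { role: "tool"; tool_call_id: string; content: string };
type ToolHandler = (args: Record<string, unknown>) => unknown;

// Hypothetical registry: tool name -> local implementation.
const handlers: Record<string, ToolHandler> = {
  get_weather: (args) => ({
    city: String(args["city"]),
    tempF: 62,
    condition: "sunny",
  }),
};

// Turn each assistant tool call into the matching `tool` message.
// Unknown tools and malformed arguments become an error payload instead of
// throwing, so every tool_call_id still gets a reply on the next turn.
function runToolCalls(toolCalls: FunctionToolCall[]): ToolMessage[] {
  return toolCalls.map((call): ToolMessage => {
    let result: unknown;
    try {
      const handler = handlers[call.function.name];
      if (!handler) throw new Error(`unknown tool: ${call.function.name}`);
      result = handler(JSON.parse(call.function.arguments) as Record<string, unknown>);
    } catch (err) {
      result = { error: err instanceof Error ? err.message : String(err) };
    }
    return { role: "tool", tool_call_id: call.id, content: JSON.stringify(result) };
  });
}
```

Pushing the returned messages onto `messages` before the second request gives the model span one tool result per tool call in its input message list.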

OpenAI Structured Output

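For structured-output requests, the JSON schema is recorded in llm.invocation_parameters and the model's JSON response in output.value. The snippet reuses the client and tracing setup from the Chat Completion example.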
const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    {
      role: "user",
      content: "Extract the city, temperature, and unit from: 72F in Berlin.",
    },
  ],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "weather_report",
      strict: true,
      schema: {
        type: "object",
        additionalProperties: false,
        properties: {
          city: { type: "string" },
          temperature: { type: "number" },
          unit: { type: "string", enum: ["F", "C"] },
        },
        required: ["city", "temperature", "unit"],
      },
    },
  },
});

console.log(response.choices[0]?.message.content);
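The response content arrives as a JSON string, and it can be null (for example on a refusal) or cut short when the token limit is reached. A minimal sketch of a defensive parse follows; `parseWeatherReport` and the `WeatherReport` type are hypothetical names that mirror the weather_report schema in the request.

```typescript
type WeatherReport = { city: string; temperature: number; unit: "F" | "C" };

// Parse the JSON string from `message.content` and check it against the
// weather_report shape. Returns null for empty content, invalid JSON,
// or a shape mismatch.
function parseWeatherReport(content: string | null | undefined): WeatherReport | null {
  if (!content) return null;
  let data: unknown;
  try {
    data = JSON.parse(content);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { city, temperature, unit } = data as {
    city?: unknown;
    temperature?: unknown;
    unit?: unknown;
  };
  if (typeof city !== "string" || typeof temperature !== "number") return null;
  if (unit === "F" || unit === "C") return { city, temperature, unit };
  return null;
}
```

Calling `parseWeatherReport(response.choices[0]?.message.content)` keeps the happy path typed while treating refusals and truncated output as a null result.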

OpenAI Responses API

The Responses API is traced separately from Chat Completions. Function-call items are normalized into the same OpenInference tool-call attributes used by Chat Completions, so the dashboard renders them the same way. The snippet reuses the client and tracing setup from the Chat Completion example.
const response = await client.responses.create({
  model: "gpt-4o-mini",
  input: "In one sentence, what is OpenTelemetry?",
});

console.log(response.output_text);
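To illustrate what normalizing function-call items means, the sketch below flattens a Chat Completions tool call and a Responses function_call item into the same attribute keys. This is not the SDK's implementation: the helper names are hypothetical, the key layout follows the OpenInference tool-call convention as an assumption, and the message index is fixed at 0 for brevity.

```typescript
// Shapes reduced to the fields that matter for this illustration.
type ChatToolCall = { id: string; function: { name: string; arguments: string } };
type ResponsesFunctionCall = {
  type: "function_call";
  call_id: string;
  name: string;
  arguments: string;
};
type Attrs = Record<string, string>;

// Assumed OpenInference-style flattened keys for one tool call on the
// first output message.
function toolCallAttrs(index: number, id: string, name: string, args: string): Attrs {
  const prefix = `llm.output_messages.0.message.tool_calls.${index}.tool_call`;
  return {
    [`${prefix}.id`]: id,
    [`${prefix}.function.name`]: name,
    [`${prefix}.function.arguments`]: args,
  };
}

// Chat Completions nests name and arguments under `function`.
const fromChatToolCall = (call: ChatToolCall, index: number): Attrs =>
  toolCallAttrs(index, call.id, call.function.name, call.function.arguments);

// The Responses API carries them at the top level of the output item.
const fromResponsesItem = (item: ResponsesFunctionCall, index: number): Attrs =>
  toolCallAttrs(index, item.call_id, item.name, item.arguments);
```

Both converters return identical attributes for equivalent calls, which is why the two APIs can be rendered the same way downstream.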

Anthropic Messages

Anthropic Messages calls emit LLM spans with user and assistant content blocks, model name, invocation parameters, finish reason, and usage.
import Anthropic from "@anthropic-ai/sdk";
import { setup } from "@inference/tracing";

const tracing = await setup({ modules: { anthropic: Anthropic } });
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const message = await client.messages.create({
  model: "claude-haiku-4-5",
  max_tokens: 128,
  messages: [{ role: "user", content: "Respond with just the word hello." }],
});

console.log(message.content);
await tracing.shutdown();

Anthropic Prompt Caching

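When Anthropic returns prompt-cache usage fields, Catalyst maps them to OpenInference token detail attributes so they show up alongside the regular token counts on the LLM span. Anthropic only caches prompts above a minimum length, which is why the system prompt below is padded. The snippet reuses the client and tracing setup from the Anthropic Messages example.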
const longSystem =
  "You are a careful, terse assistant. Answer in one sentence.\n\n" +
  "Reference document:\n" +
  "Lorem ipsum dolor sit amet, consectetur adipiscing elit. ".repeat(300);

const params: Anthropic.MessageCreateParamsNonStreaming = {
  model: "claude-haiku-4-5",
  max_tokens: 64,
  system: [
    {
      type: "text",
      text: longSystem,
      cache_control: { type: "ephemeral" },
    },
  ],
  messages: [{ role: "user", content: "Is the document about lorem ipsum?" }],
};

const first = await client.messages.create(params);
const second = await client.messages.create(params);

console.log(first.usage.cache_creation_input_tokens ?? 0);
console.log(second.usage.cache_read_input_tokens ?? 0);
The cache attributes show up on the LLM span as llm.token_count.prompt_details.cache_write and llm.token_count.prompt_details.cache_read. See the Attributes reference for the full set of token-detail keys.
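As a rough sketch of that mapping, the function below converts the two cache fields on Anthropic's usage object into the attribute keys named above. `cacheTokenAttributes` is a hypothetical helper, not Catalyst's code, and skipping absent fields (rather than emitting zero) is an assumption.

```typescript
// The two optional cache fields on Anthropic's `usage` object.
type CacheUsage = {
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
};

// Map cache usage to span attributes; fields that are null or missing
// produce no attribute at all.
function cacheTokenAttributes(usage: CacheUsage): Record<string, number> {
  const attrs: Record<string, number> = {};
  if (usage.cache_creation_input_tokens != null) {
    attrs["llm.token_count.prompt_details.cache_write"] =
      usage.cache_creation_input_tokens;
  }
  if (usage.cache_read_input_tokens != null) {
    attrs["llm.token_count.prompt_details.cache_read"] =
      usage.cache_read_input_tokens;
  }
  return attrs;
}
```

In the example above, the first call would be expected to populate cache_write and the second cache_read, provided the cached prefix is long enough and has not expired between calls.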

Manual Parent Around Automatic Children

Use an outer agent span around orchestration code when you want nested LLM, tool, or framework spans grouped under one product-level operation. Spans created inside the callback auto-parent under the agent span via OTel context propagation.
import { agentSpan, setup } from "@inference/tracing";
import OpenAI from "openai";

const tracing = await setup({ modules: { openai: OpenAI } });
const client = new OpenAI();

await agentSpan(
  {
    agentId: "refund-review-agent",
    agentName: "Refund Review Agent",
    spanName: "refund-review.run",
    sessionId: "conversation-ticket-123",
    system: "openai",
  },
  async (span) => {
    const ticket = { id: "ticket_123", orderId: "ABC-123" };
    span.setInput(ticket);
    const response = await client.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [
        { role: "user", content: `Review refund for ${ticket.orderId}` },
      ],
    });
    span.setOutput({ decision: response.choices[0]?.message.content });
  },
);

await tracing.shutdown();
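For intuition about auto-parenting, here is a toy model of context propagation built on Node's AsyncLocalStorage, the primitive behind OpenTelemetry JS's AsyncLocalStorage-based context manager. `withSpan`, `ToySpan`, and `recorded` are hypothetical names; real spans carry IDs, timing, and attributes that this sketch leaves out.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

type ToySpan = { name: string; parent: string | undefined };

// Holds the span that is "active" for the current async call chain.
const activeSpan = new AsyncLocalStorage<ToySpan>();
const recorded: ToySpan[] = [];

// Hypothetical helper: create a span whose parent is whatever span is
// currently active, then make it the active span while `fn` runs
// (including anything awaited inside `fn`).
function withSpan<T>(name: string, fn: () => T): T {
  const span: ToySpan = { name, parent: activeSpan.getStore()?.name };
  recorded.push(span);
  return activeSpan.run(span, fn);
}

// The inner call never receives the outer span as an argument; it finds it
// through the ambient context, which is how instrumented LLM calls nest.
async function demo(): Promise<void> {
  await withSpan("refund-review.run", async () => {
    await Promise.resolve(); // the active span survives the await
    withSpan("ChatCompletion", () => undefined);
  });
}
```

Running `demo()` records the inner span with `refund-review.run` as its parent, even though the two calls share no explicit reference.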

OpenAI Agents With Outer Span

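Register both the OpenAI client and the OpenAI Agents SDK in modules so agent runs and the underlying model calls are instrumented together. Use agentSpan() in TypeScript or agent_span() in Python for an explicit outer span; nested OpenAI calls are captured automatically and parent under it.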
import { agentSpan, setup } from "@inference/tracing";
import * as agents from "@openai/agents";
import { Agent, run, tool } from "@openai/agents";
import OpenAI from "openai";
import { z } from "zod";

const tracing = await setup({
  modules: { openai: OpenAI, openaiAgents: agents },
});

const lookupOrder = tool({
  name: "lookup_order",
  description: "Look up an order by ID.",
  parameters: z.object({ orderId: z.string() }),
  execute: async ({ orderId }) =>
    JSON.stringify({ orderId, status: "shipped" }),
});

const supportAgent = new Agent({
  name: "SupportAgent",
  instructions: "Use tools to help customers with orders.",
  tools: [lookupOrder],
  model: "gpt-4o-mini",
});

const userMessage = "Where is order ABC-123?";
await agentSpan(
  {
    agentId: "support-agent-prod",
    agentName: "Support Agent",
    spanName: "support-agent.run",
    sessionId: "conversation-order-abc-123",
    system: "openai",
  },
  async (span) => {
    span.setInput(userMessage);
    const result = await run(supportAgent, userMessage, { maxTurns: 4 });
    span.setOutput(String(result.finalOutput ?? ""));
  },
);

await tracing.shutdown();

OpenAI Agents Handoff

Handoffs create a useful trace tree when wrapped in an outer agent span: the triage agent, specialist agent, model calls, and tools are all grouped under one customer request.
import { agentSpan, setup } from "@inference/tracing";
import * as agents from "@openai/agents";
import { Agent, handoff, run, tool } from "@openai/agents";
import OpenAI from "openai";
import { z } from "zod";

const tracing = await setup({
  modules: { openai: OpenAI, openaiAgents: agents },
});

const issueRefund = tool({
  name: "issue_refund",
  description: "Issue a refund for an order.",
  parameters: z.object({ orderId: z.string(), amount: z.number() }),
  execute: async ({ orderId, amount }) =>
    JSON.stringify({ ok: true, orderId, refundId: "RFD-2201", amount }),
});

const refundsAgent = new Agent({
  name: "RefundsAgent",
  instructions: "Handle refund requests and use issue_refund.",
  tools: [issueRefund],
  model: "gpt-4o-mini",
});
const billingAgent = new Agent({
  name: "BillingAgent",
  instructions: "Answer billing questions. Do not issue refunds.",
  model: "gpt-4o-mini",
});
const triageAgent = new Agent({
  name: "TriageAgent",
  instructions: "Route refund requests to RefundsAgent.",
  handoffs: [handoff(refundsAgent), handoff(billingAgent)],
  model: "gpt-4o-mini",
});

await agentSpan(
  {
    agentId: "triage-agent-prod",
    agentName: "Triage Agent",
    spanName: "triage-agent.run",
    sessionId: "conversation-refund-abc-123",
    system: "openai",
  },
  async (span) => {
    const input = "I need a refund for order ABC-123, total $42.50.";
    span.setInput(input);
    const result = await run(triageAgent, input, { maxTurns: 8 });
    span.setOutput(String(result.finalOutput ?? ""));
  },
);

await tracing.shutdown();

LangChain Agent With Tools

LangChain instrumentation hooks the callback manager. You do not need to wrap each tool or model call manually; chain, LLM, and tool spans are emitted from the framework callbacks. If the same agent is already wrapped with LangSmith @traceable, keep that decorator in place and install the langsmith extra. Catalyst uses the LangSmith OTel span as the active parent, so the LangChain and provider spans stay grouped under the decorated run.
import { setup } from "@inference/tracing";
import { ChatAnthropic } from "@langchain/anthropic";
import * as CallbackManagerModule from "@langchain/core/callbacks/manager";
import { createAgent, tool } from "langchain";
import { z } from "zod";

const tracing = await setup({
  modules: { langchainCallbacksManager: CallbackManagerModule },
});

const lookupOrder = tool(
  ({ orderId }) => JSON.stringify({ orderId, status: "shipped", total: 42.5 }),
  {
    name: "lookup_order",
    description: "Look up an order by ID.",
    schema: z.object({ orderId: z.string() }),
  },
);

const cancelOrder = tool(
  ({ orderId, reason }) => JSON.stringify({ ok: true, orderId, reason }),
  {
    name: "cancel_order",
    description: "Cancel a not-yet-delivered order.",
    schema: z.object({ orderId: z.string(), reason: z.string() }),
  },
);

const agent = createAgent({
  model: new ChatAnthropic({ model: "claude-haiku-4-5", maxTokens: 512 }),
  tools: [lookupOrder, cancelOrder],
  systemPrompt: "Use tools to resolve order issues.",
});

const result = await agent.invoke({
  messages: [{ role: "user", content: "Cancel order ABC-123." }],
});
console.log(result.messages.at(-1)?.content);
await tracing.shutdown();

Pydantic AI Structured Agent (Python)

Pydantic AI ships native OpenTelemetry instrumentation. Catalyst registers its provider and enables Pydantic AI instrumentation during setup().
from inference_catalyst_tracing import setup
from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext

class CityWeather(BaseModel):
    city: str
    temp_c: float = Field(description="Temperature in Celsius.")
    condition: str

class WeatherReport(BaseModel):
    cities: list[CityWeather]
    summary: str

tracing = setup()

agent = Agent(
    "openai:gpt-4o-mini",
    output_type=WeatherReport,
    system_prompt="Use get_weather for every requested city.",
)

@agent.tool
def get_weather(_ctx: RunContext[None], city: str) -> str:
    """Look up current weather for a city."""
    return f'{{"city": "{city}", "temp_c": 12, "condition": "overcast"}}'

result = agent.run_sync("What's the weather in Paris and Tokyo?")
print(result.output.summary)
tracing.shutdown()

Claude Agent SDK

In Python, setup() patches the SDK directly, so call it before importing query. In TypeScript, ESM namespace bindings cannot be safely patched, so wrap query explicitly with wrapClaudeAgentSdkQuery(), as shown here.
import { query } from "@anthropic-ai/claude-agent-sdk";
import { setup, wrapClaudeAgentSdkQuery } from "@inference/tracing";

const tracing = await setup();
const tracedQuery = wrapClaudeAgentSdkQuery(query);

const stream = tracedQuery({
  prompt: "Count files matching *.md under the current directory.",
  options: {
    maxTurns: 4,
    allowedTools: ["Bash"],
    permissionMode: "bypassPermissions",
  },
});

for await (const message of stream) {
  console.log(message);
}

await tracing.shutdown();
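Because an ESM import binding cannot be reassigned, tracing has to come from a new function that wraps the original. The sketch below shows the general shape of such a wrapper for a function that returns an async stream. It is not the real wrapClaudeAgentSdkQuery: `wrapStream`, `fakeQuery`, and the `onEnd` callback are hypothetical, and a real wrapper would open and close a span where this one only counts messages.

```typescript
// Any function that takes arguments and returns an async stream of messages.
type StreamFn<A, M> = (args: A) => AsyncIterable<M>;

// Hypothetical wrapper: returns a new function with the same signature and
// reports how many messages were streamed once iteration ends.
function wrapStream<A, M>(
  fn: StreamFn<A, M>,
  onEnd: (messageCount: number) => void,
): StreamFn<A, M> {
  return (args) => {
    const inner = fn(args);
    return (async function* () {
      let count = 0;
      try {
        for await (const message of inner) {
          count += 1;
          yield message;
        }
      } finally {
        // Runs on normal completion, on error, and when a consumer that has
        // started iterating stops early.
        onEnd(count);
      }
    })();
  };
}

// Stand-in for an SDK export; a real import binding could not be reassigned.
async function* fakeQuery(args: { prompt: string }): AsyncGenerator<string> {
  yield `echo: ${args.prompt}`;
  yield "done";
}

const tracedFakeQuery = wrapStream(fakeQuery, (n) =>
  console.log(`streamed ${n} messages`),
);
```

Callers iterate `tracedFakeQuery({ prompt: "hi" })` with `for await` exactly as they would the original, which is why the wrapped function can be dropped in without touching the rest of the loop.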

CLI Or Subprocess Work

When a tool has no instrumentable SDK, wrap the subprocess call in an agent span and set the input, output, and token usage when available.
import { agentSpan, setup } from "@inference/tracing";
import { Codex } from "@openai/codex-sdk";

const tracing = await setup();
const codex = new Codex({ apiKey: process.env.OPENAI_API_KEY });

await agentSpan(
  {
    agentId: "codex-prod",
    agentName: "Codex",
    system: "openai",
    spanName: "codex.invocation",
    sessionId: "conversation-cli-hello",
  },
  async (span) => {
    const prompt = "Reply with just the word hello.";
    span.setInput(prompt);
    const thread = codex.startThread({
      skipGitRepoCheck: true,
      sandboxMode: "read-only",
    });
    const turn = await thread.run(prompt);
    span.setOutput(turn.finalResponse ?? "");
    if (turn.usage != null) {
      span.recordTokens({
        prompt: turn.usage.input_tokens ?? 0,
        completion: turn.usage.output_tokens ?? 0,
      });
    }
  },
);

await tracing.shutdown();

Related Pages

1. Traces Quickstart: install, configure export, capture your first span.
2. OpenAI traces or Anthropic traces: the provider you use first.
3. Manual spans: tool, chain, and retriever spans inside your agent loop.
4. Production Agent Example: a production-shaped agent end to end.
5. Agent identity: stable IDs for dashboard grouping.
6. Troubleshooting: missing spans, missing attributes, shutdown.