# Snippets

> Copyable tracing patterns for providers, frameworks, agents, tool loops, structured outputs, prompt caching, handoffs, and custom subprocess work.

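Most examples are complete programs you can copy and run. The shorter ones
(structured output, the Responses API, and prompt caching) reuse the `client`
and tracing setup from the preceding example for the same provider. Each
section notes what gets captured, and several link to the integration page that
covers the surface in depth.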

For setup and configuration, start with the
[Traces Quickstart](/integrations/traces/quickstart). For an end-to-end view
of a real production agent, see the
[Production Agent Example](/integrations/traces/production-agent-example).

## OpenAI Chat Completion

Initialize tracing before constructing the OpenAI client. The SDK patches Chat
Completions and emits an `LLM` span with input messages, output messages,
model name, invocation parameters, finish reason, and token counts.

<CodeGroup>
  <Metadata text="integrations/traces/examples-openai-chat-ts" />

  ```typescript TypeScript theme={"system"}
  import { setup } from "@inference/tracing";
  import OpenAI from "openai";

  const tracing = await setup({
    serviceName: "checkout-agent",
    modules: { openai: OpenAI },
  });

  const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

  const response = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: "You answer in one short sentence." },
      { role: "user", content: "Summarize order ABC-123." },
    ],
    max_tokens: 80,
  });

  console.log(response.choices[0]?.message.content);
  await tracing.shutdown();
  ```

  <Metadata text="integrations/traces/examples-openai-chat-python" />

  ```python Python theme={"system"}
  import os

  from inference_catalyst_tracing import setup
  from openai import OpenAI

  tracing = setup(service_name="checkout-agent")
  client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

  response = client.chat.completions.create(
      model="gpt-4o-mini",
      messages=[
          {"role": "system", "content": "You answer in one short sentence."},
          {"role": "user", "content": "Summarize order ABC-123."},
      ],
      max_tokens=80,
  )

  print(response.choices[0].message.content)
  tracing.shutdown()
  ```
</CodeGroup>

See [OpenAI traces](/integrations/traces/openai) for tool calls, structured
outputs, and the Responses API.

## OpenAI Tool Round Trip

Tool calls are captured on the model span. The first turn records the assistant
tool call and arguments; the second turn records the tool result in the input
message list.

<CodeGroup>
  <Metadata text="integrations/traces/examples-openai-tool-ts" />

  ```typescript TypeScript theme={"system"}
  import { setup } from "@inference/tracing";
  import OpenAI from "openai";

  const tracing = await setup({ modules: { openai: OpenAI } });
  const client = new OpenAI();

  const tools = [
    {
      type: "function" as const,
      function: {
        name: "get_weather",
        description: "Look up the current weather in a city.",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    },
  ];
  const messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[] = [
    { role: "user", content: "What's the weather in San Francisco?" },
  ];

  const first = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages,
    tools,
  });
  const toolCalls = first.choices[0]?.message.tool_calls ?? [];
  messages.push({ role: "assistant", content: null, tool_calls: toolCalls });

  for (const toolCall of toolCalls) {
    const args = JSON.parse(toolCall.function.arguments) as { city: string };
    messages.push({
      role: "tool",
      tool_call_id: toolCall.id,
      content: JSON.stringify({
        city: args.city,
        tempF: 62,
        condition: "sunny",
      }),
    });
  }

  const final = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages,
    tools,
  });
  console.log(final.choices[0]?.message.content);
  await tracing.shutdown();
  ```

  <Metadata text="integrations/traces/examples-openai-tool-python" />

  ```python Python theme={"system"}
  import json

  from inference_catalyst_tracing import setup
  from openai import OpenAI
  from openai.types.chat import ChatCompletionMessageParam

  tracing = setup()
  client = OpenAI()

  tools = [
      {
          "type": "function",
          "function": {
              "name": "get_weather",
              "description": "Look up the current weather in a city.",
              "parameters": {
                  "type": "object",
                  "properties": {"city": {"type": "string"}},
                  "required": ["city"],
              },
          },
      },
  ]
  messages: list[ChatCompletionMessageParam] = [
      {"role": "user", "content": "What's the weather in San Francisco?"},
  ]

  first = client.chat.completions.create(
      model="gpt-4o-mini",
      messages=messages,
      tools=tools,
  )
  tool_calls = first.choices[0].message.tool_calls or []
  messages.append(
      {
          "role": "assistant",
          "content": None,
          "tool_calls": [tc.model_dump() for tc in tool_calls],
      },
  )

  for tool_call in tool_calls:
      args = json.loads(tool_call.function.arguments)
      messages.append(
          {
              "role": "tool",
              "tool_call_id": tool_call.id,
              "content": json.dumps(
                  {"city": args["city"], "temp_f": 62, "condition": "sunny"},
              ),
          },
      )

  final = client.chat.completions.create(
      model="gpt-4o-mini",
      messages=messages,
      tools=tools,
  )
  print(final.choices[0].message.content)
  tracing.shutdown()
  ```
</CodeGroup>

This captures the *model-side* view of tool calling: what the LLM asked for
and the result you passed back. For a *caller-side* view that wraps the actual
function execution in its own `TOOL` span, see
[Manual spans](/integrations/traces/manual-spans#tool-chain-and-retriever-spans).
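
The round trip above hard-codes the tool result. In a real loop the tool message comes from running a local function. The following is a minimal sketch of that dispatch step, assuming a hypothetical `TOOL_REGISTRY` and `run_tool_call` helper (neither is part of the OpenAI SDK or Catalyst). It does not create a `TOOL` span on its own.

```python
import json
from typing import Any, Callable


# Hypothetical local implementation; a real tool would call a weather service.
def get_weather(city: str) -> dict[str, Any]:
    return {"city": city, "temp_f": 62, "condition": "sunny"}


# Hypothetical registry mapping tool names to local callables.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_tool_call(call_id: str, name: str, arguments: str) -> dict[str, str]:
    """Execute one tool call and build the `tool` message to send back."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        result: Any = {"error": f"unknown tool: {name}"}
    else:
        try:
            result = handler(**json.loads(arguments))
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed JSON or unexpected argument names: report it to the model
            # instead of crashing the loop.
            result = {"error": str(exc)}
    return {"role": "tool", "tool_call_id": call_id, "content": json.dumps(result)}


tool_message = run_tool_call("call_1", "get_weather", '{"city": "San Francisco"}')
print(tool_message["content"])
```

In the Python example, the loop body would become `messages.append(run_tool_call(tool_call.id, tool_call.function.name, tool_call.function.arguments))`.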

## OpenAI Structured Output

Structured-output requests keep the schema in `llm.invocation_parameters` and
the model response in `output.value`.

<CodeGroup>
  <Metadata text="integrations/traces/examples-openai-structured-ts" />

  ```typescript TypeScript theme={"system"}
  const response = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "user",
        content: "Extract the city, temperature, and unit from: 72F in Berlin.",
      },
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "weather_report",
        strict: true,
        schema: {
          type: "object",
          additionalProperties: false,
          properties: {
            city: { type: "string" },
            temperature: { type: "number" },
            unit: { type: "string", enum: ["F", "C"] },
          },
          required: ["city", "temperature", "unit"],
        },
      },
    },
  });

  console.log(response.choices[0]?.message.content);
  ```

  <Metadata text="integrations/traces/examples-openai-structured-python" />

  ```python Python theme={"system"}
  response = client.chat.completions.create(
      model="gpt-4o-mini",
      messages=[
          {
              "role": "user",
              "content": "Extract the city, temperature, and unit from: 72F in Berlin.",
          },
      ],
      response_format={
          "type": "json_schema",
          "json_schema": {
              "name": "weather_report",
              "strict": True,
              "schema": {
                  "type": "object",
                  "additionalProperties": False,
                  "properties": {
                      "city": {"type": "string"},
                      "temperature": {"type": "number"},
                      "unit": {"type": "string", "enum": ["F", "C"]},
                  },
                  "required": ["city", "temperature", "unit"],
              },
          },
      },
  )

  print(response.choices[0].message.content)
  ```
</CodeGroup>
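
The structured response arrives as a JSON string in `message.content`, so application code still has to parse it. The following is a minimal sketch of that step, assuming a hypothetical `parse_weather_report` helper that checks the keys the schema above marks as required. It is plain Python and independent of tracing.

```python
import json
from typing import Any, Optional

# Mirrors the `required` list and `unit` enum from the schema in the request.
REQUIRED_KEYS = ("city", "temperature", "unit")
ALLOWED_UNITS = ("F", "C")


def parse_weather_report(content: Optional[str]) -> dict[str, Any]:
    """Parse the JSON string from `message.content` and check required keys."""
    if content is None:
        # `content` can be None, for example when the model refuses to answer.
        raise ValueError("model returned no content")
    report = json.loads(content)
    if not isinstance(report, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in report]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if report["unit"] not in ALLOWED_UNITS:
        raise ValueError(f"unexpected unit: {report['unit']!r}")
    return report


report = parse_weather_report('{"city": "Berlin", "temperature": 72, "unit": "F"}')
print(report["city"], report["temperature"], report["unit"])
```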

## OpenAI Responses API

The Responses API is traced separately from Chat Completions. Function-call
items are normalized into the same OpenInference tool-call attributes used by
Chat Completions, so the dashboard renders them the same way.

<CodeGroup>
  <Metadata text="integrations/traces/examples-openai-responses-ts" />

  ```typescript TypeScript theme={"system"}
  const response = await client.responses.create({
    model: "gpt-4o-mini",
    input: "In one sentence, what is OpenTelemetry?",
  });

  console.log(response.output_text);
  ```

  <Metadata text="integrations/traces/examples-openai-responses-python" />

  ```python Python theme={"system"}
  response = client.responses.create(
      model="gpt-4o-mini",
      input="In one sentence, what is OpenTelemetry?",
  )

  print(response.output_text)
  ```
</CodeGroup>
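
To make the normalization concrete, here is a rough sketch of flattening Responses API `function_call` output items into tool-call style attributes. The `flatten_function_calls` helper is hypothetical, and the key layout is an assumption based on the OpenInference flattened naming convention. It is not Catalyst's implementation, so check the Attributes reference for the keys actually emitted.

```python
from typing import Any


def flatten_function_calls(output_items: list[dict[str, Any]]) -> dict[str, str]:
    """Flatten `function_call` items into dotted tool-call attributes."""
    attributes: dict[str, str] = {}
    # Other item types (messages, reasoning, and so on) are skipped here.
    calls = [item for item in output_items if item.get("type") == "function_call"]
    for index, item in enumerate(calls):
        # Assumed key prefix; one entry per tool call on the first output message.
        prefix = f"llm.output_messages.0.message.tool_calls.{index}.tool_call"
        attributes[f"{prefix}.id"] = item["call_id"]
        attributes[f"{prefix}.function.name"] = item["name"]
        attributes[f"{prefix}.function.arguments"] = item["arguments"]
    return attributes


# Illustrative output items shaped like a Responses API `output` list.
items = [
    {"type": "reasoning", "summary": []},
    {
        "type": "function_call",
        "call_id": "call_1",
        "name": "get_weather",
        "arguments": '{"city": "Berlin"}',
    },
]
attrs = flatten_function_calls(items)
for key, value in attrs.items():
    print(key, "=", value)
```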

## Anthropic Messages

Anthropic Messages calls emit `LLM` spans with user and assistant content
blocks, model name, invocation parameters, finish reason, and usage.

<CodeGroup>
  <Metadata text="integrations/traces/examples-anthropic-basic-ts" />

  ```typescript TypeScript theme={"system"}
  import Anthropic from "@anthropic-ai/sdk";
  import { setup } from "@inference/tracing";

  const tracing = await setup({ modules: { anthropic: Anthropic } });
  const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

  const message = await client.messages.create({
    model: "claude-haiku-4-5",
    max_tokens: 128,
    messages: [{ role: "user", content: "Respond with just the word hello." }],
  });

  console.log(message.content);
  await tracing.shutdown();
  ```

  <Metadata text="integrations/traces/examples-anthropic-basic-python" />

  ```python Python theme={"system"}
  import os

  from anthropic import Anthropic
  from inference_catalyst_tracing import setup

  tracing = setup()
  client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

  message = client.messages.create(
      model="claude-haiku-4-5",
      max_tokens=128,
      messages=[{"role": "user", "content": "Respond with just the word hello."}],
  )

  print(message.content)
  tracing.shutdown()
  ```
</CodeGroup>

## Anthropic Prompt Caching

When Anthropic returns prompt-cache usage fields, Catalyst maps them to
OpenInference token detail attributes so they show up alongside the regular
token counts on the LLM span.

<CodeGroup>
  <Metadata text="integrations/traces/examples-anthropic-cache-ts" />

  ```typescript TypeScript theme={"system"}
  const longSystem =
    "You are a careful, terse assistant. Answer in one sentence.\n\n" +
    "Reference document:\n" +
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit. ".repeat(300);

  const params: Anthropic.MessageCreateParamsNonStreaming = {
    model: "claude-haiku-4-5",
    max_tokens: 64,
    system: [
      {
        type: "text",
        text: longSystem,
        cache_control: { type: "ephemeral" },
      },
    ],
    messages: [{ role: "user", content: "Is the document about lorem ipsum?" }],
  };

  const first = await client.messages.create(params);
  const second = await client.messages.create(params);

  console.log(first.usage.cache_creation_input_tokens ?? 0);
  console.log(second.usage.cache_read_input_tokens ?? 0);
  ```

  <Metadata text="integrations/traces/examples-anthropic-cache-python" />

  ```python Python theme={"system"}
  long_system = (
      "You are a careful, terse assistant. Answer in one sentence.\n\n"
      "Reference document:\n"
      + "Lorem ipsum dolor sit amet, consectetur adipiscing elit. " * 300
  )

  params = dict(
      model="claude-haiku-4-5",
      max_tokens=64,
      system=[
          {
              "type": "text",
              "text": long_system,
              "cache_control": {"type": "ephemeral"},
          },
      ],
      messages=[{"role": "user", "content": "Is the document about lorem ipsum?"}],
  )

  first = client.messages.create(**params)
  second = client.messages.create(**params)

  print(first.usage.cache_creation_input_tokens)
  print(second.usage.cache_read_input_tokens)
  ```
</CodeGroup>

The cache attributes show up on the LLM span as
`llm.token_count.prompt_details.cache_write` and
`llm.token_count.prompt_details.cache_read`. See the
[Attributes reference](/integrations/traces/attributes#token-usage) for the
full set of token-detail keys.
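
The mapping can be pictured as a small translation from Anthropic usage fields to the two attribute keys named above. This is an illustrative sketch only: `cache_token_attributes` is a hypothetical helper, the token numbers are made up, and Catalyst's own mapping may differ in detail.

```python
from types import SimpleNamespace
from typing import Any


def cache_token_attributes(usage: Any) -> dict[str, int]:
    """Translate prompt-cache usage fields into span attribute keys."""
    attributes: dict[str, int] = {}
    cache_write = getattr(usage, "cache_creation_input_tokens", None)
    cache_read = getattr(usage, "cache_read_input_tokens", None)
    # Only emit a key when the provider actually returned the field.
    if cache_write is not None:
        attributes["llm.token_count.prompt_details.cache_write"] = cache_write
    if cache_read is not None:
        attributes["llm.token_count.prompt_details.cache_read"] = cache_read
    return attributes


# Stand-ins for `first.usage` and `second.usage`; the numbers are invented.
first_usage = SimpleNamespace(cache_creation_input_tokens=2048, cache_read_input_tokens=0)
second_usage = SimpleNamespace(cache_creation_input_tokens=0, cache_read_input_tokens=2048)

print(cache_token_attributes(first_usage))
print(cache_token_attributes(second_usage))
```

With a working cache, the first call would report tokens under `cache_write` and the second under `cache_read`.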

## Manual Parent Around Automatic Children

Use an outer agent span around orchestration code when you want nested LLM,
tool, or framework spans grouped under one product-level operation. Spans
created inside the callback auto-parent under the agent span via OTel context
propagation.

<CodeGroup>
  <Metadata text="integrations/traces/examples-manual-parent-ts" />

  ```typescript TypeScript theme={"system"}
  import { agentSpan, setup } from "@inference/tracing";
  import OpenAI from "openai";

  const tracing = await setup({ modules: { openai: OpenAI } });
  const client = new OpenAI();

  await agentSpan(
    {
      agentId: "refund-review-agent",
      agentName: "Refund Review Agent",
      spanName: "refund-review.run",
      sessionId: "conversation-ticket-123",
      system: "openai",
    },
    async (span) => {
      const ticket = { id: "ticket_123", orderId: "ABC-123" };
      span.setInput(ticket);
      const response = await client.chat.completions.create({
        model: "gpt-4o-mini",
        messages: [
          { role: "user", content: `Review refund for ${ticket.orderId}` },
        ],
      });
      span.setOutput({ decision: response.choices[0]?.message.content });
    },
  );

  await tracing.shutdown();
  ```

  <Metadata text="integrations/traces/examples-manual-parent-python" />

  ```python Python theme={"system"}
  from inference_catalyst_tracing import agent_span, setup
  from openai import OpenAI

  tracing = setup()
  client = OpenAI()

  with agent_span(
      tracing.tracer,
      agent_id="refund-review-agent",
      agent_name="Refund Review Agent",
      span_name="refund-review.run",
      session_id="conversation-ticket-123",
      system="openai",
  ) as span:
      ticket = {"id": "ticket_123", "order_id": "ABC-123"}
      span.set_input(ticket)
      response = client.chat.completions.create(
          model="gpt-4o-mini",
          messages=[
              {"role": "user", "content": f"Review refund for {ticket['order_id']}"},
          ],
      )
      decision = response.choices[0].message.content
      span.set_output({"decision": decision})

  tracing.shutdown()
  ```
</CodeGroup>

## OpenAI Agents With Outer Span

Pair OpenAI Agents with OpenAI instrumentation. Use `agentSpan()` /
`agent_span()` for an explicit outer span; nested OpenAI calls are captured
automatically and parent under it.

<CodeGroup>
  <Metadata text="integrations/traces/examples-openai-agents-ts" />

  ```typescript TypeScript theme={"system"}
  import { agentSpan, setup } from "@inference/tracing";
  import * as agents from "@openai/agents";
  import { Agent, run, tool } from "@openai/agents";
  import OpenAI from "openai";
  import { z } from "zod";

  const tracing = await setup({
    modules: { openai: OpenAI, openaiAgents: agents },
  });

  const lookupOrder = tool({
    name: "lookup_order",
    description: "Look up an order by ID.",
    parameters: z.object({ orderId: z.string() }),
    execute: async ({ orderId }) =>
      JSON.stringify({ orderId, status: "shipped" }),
  });

  const supportAgent = new Agent({
    name: "SupportAgent",
    instructions: "Use tools to help customers with orders.",
    tools: [lookupOrder],
    model: "gpt-4o-mini",
  });

  const userMessage = "Where is order ABC-123?";
  await agentSpan(
    {
      agentId: "support-agent-prod",
      agentName: "Support Agent",
      spanName: "support-agent.run",
      sessionId: "conversation-order-abc-123",
      system: "openai",
    },
    async (span) => {
      span.setInput(userMessage);
      const result = await run(supportAgent, userMessage, { maxTurns: 4 });
      span.setOutput(String(result.finalOutput ?? ""));
    },
  );

  await tracing.shutdown();
  ```

  <Metadata text="integrations/traces/examples-openai-agents-python" />

  ```python Python theme={"system"}
  import asyncio
  import json

  from agents import Agent, Runner, function_tool
  from inference_catalyst_tracing import agent_span, setup

  tracing = setup()

  @function_tool
  def lookup_order(order_id: str) -> str:
      """Look up an order by ID."""
      return json.dumps({"order_id": order_id, "status": "shipped"})

  async def run_support_agent() -> str:
      agent = Agent(
          name="SupportAgent",
          instructions="Use tools to help customers with orders.",
          tools=[lookup_order],
          model="gpt-4o-mini",
      )
      user_message = "Where is order ABC-123?"
      with agent_span(
          tracing.tracer,
          agent_id="support-agent-prod",
          agent_name="Support Agent",
          span_name="support-agent.run",
          session_id="conversation-order-abc-123",
          system="openai",
      ) as span:
          span.set_input(user_message)
          result = await Runner.run(agent, input=user_message, max_turns=4)
          output = str(result.final_output or "")
          span.set_output(output)
          return output

  print(asyncio.run(run_support_agent()))
  tracing.shutdown()
  ```
</CodeGroup>

## OpenAI Agents Handoff

Handoffs create a useful trace tree when wrapped in an outer agent span: the
triage agent, specialist agent, model calls, and tools are all grouped under
one customer request.

<Metadata text="integrations/traces/examples-openai-agents-handoff-ts" />

```typescript TypeScript theme={"system"}
import { agentSpan, setup } from "@inference/tracing";
import * as agents from "@openai/agents";
import { Agent, handoff, run, tool } from "@openai/agents";
import OpenAI from "openai";
import { z } from "zod";

const tracing = await setup({
  modules: { openai: OpenAI, openaiAgents: agents },
});

const issueRefund = tool({
  name: "issue_refund",
  description: "Issue a refund for an order.",
  parameters: z.object({ orderId: z.string(), amount: z.number() }),
  execute: async ({ orderId, amount }) =>
    JSON.stringify({ ok: true, orderId, refundId: "RFD-2201", amount }),
});

const refundsAgent = new Agent({
  name: "RefundsAgent",
  instructions: "Handle refund requests and use issue_refund.",
  tools: [issueRefund],
  model: "gpt-4o-mini",
});
const billingAgent = new Agent({
  name: "BillingAgent",
  instructions: "Answer billing questions. Do not issue refunds.",
  model: "gpt-4o-mini",
});
const triageAgent = new Agent({
  name: "TriageAgent",
  instructions: "Route refund requests to RefundsAgent.",
  handoffs: [handoff(refundsAgent), handoff(billingAgent)],
  model: "gpt-4o-mini",
});

await agentSpan(
  {
    agentId: "triage-agent-prod",
    agentName: "Triage Agent",
    spanName: "triage-agent.run",
    sessionId: "conversation-refund-abc-123",
    system: "openai",
  },
  async (span) => {
    const input = "I need a refund for order ABC-123, total $42.50.";
    span.setInput(input);
    const result = await run(triageAgent, input, { maxTurns: 8 });
    span.setOutput(String(result.finalOutput ?? ""));
  },
);

await tracing.shutdown();
```

## LangChain Agent With Tools

LangChain instrumentation hooks the callback manager. You do not need to wrap
each tool or model call manually; chain, LLM, and tool spans are emitted from
the framework callbacks.

If the same agent is already wrapped with LangSmith `@traceable`, keep that
decorator in place and install the `langsmith` extra. Catalyst uses the
LangSmith OTel span as the active parent, so the LangChain and provider spans
stay grouped under the decorated run.

<CodeGroup>
  <Metadata text="integrations/traces/examples-langchain-ts" />

  ```typescript TypeScript theme={"system"}
  import { setup } from "@inference/tracing";
  import { ChatAnthropic } from "@langchain/anthropic";
  import * as CallbackManagerModule from "@langchain/core/callbacks/manager";
  import { createAgent, tool } from "langchain";
  import { z } from "zod";

  const tracing = await setup({
    modules: { langchainCallbacksManager: CallbackManagerModule },
  });

  const lookupOrder = tool(
    ({ orderId }) => JSON.stringify({ orderId, status: "shipped", total: 42.5 }),
    {
      name: "lookup_order",
      description: "Look up an order by ID.",
      schema: z.object({ orderId: z.string() }),
    },
  );

  const cancelOrder = tool(
    ({ orderId, reason }) => JSON.stringify({ ok: true, orderId, reason }),
    {
      name: "cancel_order",
      description: "Cancel a not-yet-delivered order.",
      schema: z.object({ orderId: z.string(), reason: z.string() }),
    },
  );

  const agent = createAgent({
    model: new ChatAnthropic({ model: "claude-haiku-4-5", maxTokens: 512 }),
    tools: [lookupOrder, cancelOrder],
    systemPrompt: "Use tools to resolve order issues.",
  });

  const result = await agent.invoke({
    messages: [{ role: "user", content: "Cancel order ABC-123." }],
  });
  console.log(result.messages.at(-1)?.content);
  await tracing.shutdown();
  ```

  <Metadata text="integrations/traces/examples-langchain-python" />

  ```python Python theme={"system"}
  import json

  from inference_catalyst_tracing import setup
  from langchain.agents import create_agent
  from langchain_anthropic import ChatAnthropic
  from langchain_core.tools import tool

  tracing = setup()

  ORDERS = {"ABC-123": {"status": "shipped", "total": 42.5}}

  @tool
  def lookup_order(order_id: str) -> str:
      """Look up an order by ID."""
      return json.dumps({"order_id": order_id, **ORDERS[order_id]})

  @tool
  def cancel_order(order_id: str, reason: str) -> str:
      """Cancel a not-yet-delivered order."""
      return json.dumps({"ok": True, "order_id": order_id, "reason": reason})

  llm = ChatAnthropic(model_name="claude-haiku-4-5", max_tokens_to_sample=512)
  agent = create_agent(
      llm,
      tools=[lookup_order, cancel_order],
      system_prompt="Use tools to resolve order issues.",
  )

  result = agent.invoke(
      {"messages": [{"role": "user", "content": "Cancel order ABC-123."}]},
  )
  print(result["messages"][-1].content)
  tracing.shutdown()
  ```
</CodeGroup>

## Pydantic AI Structured Agent (Python)

Pydantic AI ships native OpenTelemetry instrumentation. Catalyst registers its
provider and enables Pydantic AI instrumentation during `setup()`.

<Metadata text="integrations/traces/examples-pydantic-ai-python" />

```python Python theme={"system"}
from inference_catalyst_tracing import setup
from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext

class CityWeather(BaseModel):
    city: str
    temp_c: float = Field(description="Temperature in Celsius.")
    condition: str

class WeatherReport(BaseModel):
    cities: list[CityWeather]
    summary: str

tracing = setup()

agent = Agent(
    "openai:gpt-4o-mini",
    output_type=WeatherReport,
    system_prompt="Use get_weather for every requested city.",
)

@agent.tool
def get_weather(_ctx: RunContext[None], city: str) -> str:
    """Look up current weather for a city."""
    return f'{{"city": "{city}", "temp_c": 12, "condition": "overcast"}}'

result = agent.run_sync("What's the weather in Paris and Tokyo?")
print(result.output.summary)
tracing.shutdown()
```

## Claude Agent SDK

In Python, `setup()` patches the SDK, so call it before importing `query`.
TypeScript uses an explicit wrapper because ESM namespace bindings cannot be
safely patched.

<CodeGroup>
  <Metadata text="integrations/traces/examples-claude-agent-sdk-ts" />

  ```typescript TypeScript theme={"system"}
  import { query } from "@anthropic-ai/claude-agent-sdk";
  import { setup, wrapClaudeAgentSdkQuery } from "@inference/tracing";

  const tracing = await setup();
  const tracedQuery = wrapClaudeAgentSdkQuery(query);

  const stream = tracedQuery({
    prompt: "Count files matching *.md under the current directory.",
    options: {
      maxTurns: 4,
      allowedTools: ["Bash"],
      permissionMode: "bypassPermissions",
    },
  });

  for await (const message of stream) {
    console.log(message);
  }

  await tracing.shutdown();
  ```

  <Metadata text="integrations/traces/examples-claude-agent-sdk-python" />

  ```python Python theme={"system"}
  import asyncio

  from dotenv import load_dotenv
  from inference_catalyst_tracing import setup

  load_dotenv()
  tracing = setup()

  # Import after setup() so the patched `query` is the one bound here.
  from claude_agent_sdk import ClaudeAgentOptions, query  # noqa: E402

  options = ClaudeAgentOptions(
      max_turns=4,
      allowed_tools=["Bash"],
      permission_mode="bypassPermissions",
  )

  async def main() -> None:
      # `async for` is only valid inside a coroutine.
      async for message in query(
          prompt="Count files matching *.md under the current directory.",
          options=options,
      ):
          print(message)

  asyncio.run(main())
  tracing.shutdown()
  ```
</CodeGroup>
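
The import-order requirement in Python follows from how `from module import name` binds a reference at import time. The following is a minimal sketch, assuming the patch works by replacing an attribute on the SDK module. The `fake_sdk` module and `patch` function are hypothetical stand-ins, not `claude_agent_sdk` or Catalyst code.

```python
import types

# Stand-in for a third-party SDK module (hypothetical).
sdk = types.ModuleType("fake_sdk")
sdk.query = lambda prompt: f"raw:{prompt}"

# Equivalent of `from fake_sdk import query` executed BEFORE patching.
early_query = sdk.query


def patch(module: types.ModuleType) -> None:
    """Replace `module.query` with a wrapper, as an instrumentation hook might."""
    original = module.query

    def traced_query(prompt: str) -> str:
        return f"traced({original(prompt)})"

    module.query = traced_query


patch(sdk)

# Equivalent of `from fake_sdk import query` executed AFTER patching.
late_query = sdk.query

print(early_query("hi"))  # raw:hi
print(late_query("hi"))   # traced(raw:hi)
```

The early binding still points at the unpatched function, which is why a name imported before patching would bypass the wrapper.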

## CLI Or Subprocess Work

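When a tool has no instrumentable SDK, wrap the call in an agent span and set
the input, output, and, when available, token usage. The TypeScript example
drives Codex through its SDK, which reports usage; the Python example shells out
to the `codex` CLI and records only its stdout.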

<CodeGroup>
  <Metadata text="integrations/traces/examples-cli-ts" />

  ```typescript TypeScript theme={"system"}
  import { agentSpan, setup } from "@inference/tracing";
  import { Codex } from "@openai/codex-sdk";

  const tracing = await setup();
  const codex = new Codex({ apiKey: process.env.OPENAI_API_KEY });

  await agentSpan(
    {
      agentId: "codex-prod",
      agentName: "Codex",
      system: "openai",
      spanName: "codex.invocation",
      sessionId: "conversation-cli-hello",
    },
    async (span) => {
      const prompt = "Reply with just the word hello.";
      span.setInput(prompt);
      const thread = codex.startThread({
        skipGitRepoCheck: true,
        sandboxMode: "read-only",
      });
      const turn = await thread.run(prompt);
      span.setOutput(turn.finalResponse ?? "");
      if (turn.usage != null) {
        span.recordTokens({
          prompt: turn.usage.input_tokens ?? 0,
          completion: turn.usage.output_tokens ?? 0,
        });
      }
    },
  );

  await tracing.shutdown();
  ```

  <Metadata text="integrations/traces/examples-cli-python" />

  ```python Python theme={"system"}
  import subprocess

  from inference_catalyst_tracing import agent_span, setup

  tracing = setup()
  prompt = "Reply with just the word hello."

  with agent_span(
      tracing.tracer,
      agent_id="codex-prod",
      agent_name="Codex",
      system="openai",
      span_name="codex.invocation",
      session_id="conversation-cli-hello",
  ) as span:
      span.set_input(prompt)
      completed = subprocess.run(
          ["codex", "exec", "--skip-git-repo-check", prompt],
          capture_output=True,
          text=True,
          timeout=120,
          check=True,
      )
      span.set_output(completed.stdout.strip())

  tracing.shutdown()
  ```
</CodeGroup>
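
The Python example passes `check=True` and a timeout, so a non-zero exit or a slow process raises inside the span. If you would rather record the failure as the span output, a small wrapper can turn those exceptions into a result. This is a sketch with a hypothetical `run_cli` helper; it uses `sys.executable` as a stand-in command so it runs without the `codex` binary.

```python
import subprocess
import sys
from typing import Any


def run_cli(argv: list[str], timeout: float = 120.0) -> dict[str, Any]:
    """Run a command and return its output or a description of the failure."""
    try:
        completed = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
    except FileNotFoundError:
        return {"ok": False, "error": f"command not found: {argv[0]}"}
    except subprocess.TimeoutExpired:
        return {"ok": False, "error": f"timed out after {timeout}s"}
    except subprocess.CalledProcessError as exc:
        return {
            "ok": False,
            "returncode": exc.returncode,
            "error": (exc.stderr or "").strip(),
        }
    return {"ok": True, "output": completed.stdout.strip()}


ok_result = run_cli([sys.executable, "-c", "print('hello')"])
bad_result = run_cli([sys.executable, "-c", "import sys; sys.exit(3)"])
print(ok_result)
print(bad_result["returncode"])
```

Inside the agent span you could then pass the returned dictionary to `span.set_output(...)` for both the success and failure cases.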

## Recommended Reading Order

1. [Traces Quickstart](/integrations/traces/quickstart) — install, configure
   export, capture your first span.
2. [OpenAI traces](/integrations/traces/openai) or
   [Anthropic traces](/integrations/traces/anthropic) — the provider you use
   first.
3. [Manual spans](/integrations/traces/manual-spans) — tool, chain, and
   retriever spans inside your agent loop.
4. [Production Agent Example](/integrations/traces/production-agent-example) — a
   production-shaped agent end to end.
5. [Agent identity](/integrations/traces/agent-identity) — stable IDs for
   dashboard grouping.
6. [Troubleshooting](/integrations/traces/troubleshooting) — missing spans,
   missing attributes, shutdown.