# Anthropic Traces

> Trace Anthropic Messages API calls, tool use, and prompt caching.

Catalyst instruments Anthropic Messages API calls in TypeScript and Python. Each
span records content blocks, tool-use blocks, the model name, invocation
parameters, finish reason, token usage, and, when Anthropic returns them,
prompt-cache token details.

## Install

<CodeGroup>
  <Metadata text="integrations/traces/anthropic-install-typescript" />

  ```bash TypeScript theme={"system"}
  bun add @inference/tracing @anthropic-ai/sdk
  ```

  <Metadata text="integrations/traces/anthropic-install-python" />

  ```bash Python theme={"system"}
  pip install 'inference-catalyst-tracing[anthropic]'
  ```
</CodeGroup>

## Basic Messages Call

<CodeGroup>
  <Metadata text="integrations/traces/anthropic-basic" />

  ```typescript TypeScript theme={"system"}
  import Anthropic from "@anthropic-ai/sdk";
  import { setup } from "@inference/tracing";

  const tracing = await setup({ modules: { anthropic: Anthropic } });
  const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

  const message = await client.messages.create({
    model: "claude-haiku-4-5",
    max_tokens: 128,
    messages: [{ role: "user", content: "Respond with just the word hello." }],
  });

  console.log(message.content);
  await tracing.shutdown();
  ```

  <Metadata text="integrations/traces/anthropic-basic" />

  ```python Python theme={"system"}
  import os

  from anthropic import Anthropic
  from inference_catalyst_tracing import setup

  tracing = setup()
  client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

  message = client.messages.create(
      model="claude-haiku-4-5",
      max_tokens=128,
      messages=[{"role": "user", "content": "Respond with just the word hello."}],
  )

  print(message.content)
  tracing.shutdown()
  ```
</CodeGroup>

## Anthropic Inside An Agent

<CodeGroup>
  <Metadata text="integrations/traces/anthropic-agent-identity" />

  ```typescript TypeScript theme={"system"}
  import { agentSpan } from "@inference/tracing";

  await agentSpan(
    {
      agentId: "research-agent",
      agentName: "Research Agent",
      spanName: "research-agent.run",
      sessionId: "conversation-research-notes",
      role: "research",
      system: "anthropic",
    },
    async (span) => {
      const input = "Summarize the latest customer note.";
      span.setInput(input);
      const message = await client.messages.create({
        model: "claude-haiku-4-5",
        max_tokens: 128,
        messages: [{ role: "user", content: input }],
      });
      span.setOutput(message.content);
    },
  );
  ```

  <Metadata text="integrations/traces/anthropic-agent-identity" />

  ```python Python theme={"system"}
  from inference_catalyst_tracing import agent_span

  with agent_span(
      tracing.tracer,
      agent_id="research-agent",
      agent_name="Research Agent",
      span_name="research-agent.run",
      session_id="conversation-research-notes",
      agent_role="research",
      system="anthropic",
  ) as span:
      user_input = "Summarize the latest customer note."
      span.set_input(user_input)
      message = client.messages.create(
          model="claude-haiku-4-5",
          max_tokens=128,
          messages=[{"role": "user", "content": user_input}],
      )
      span.set_output([block.model_dump() for block in message.content])
  ```
</CodeGroup>

## Tool Use Round Trip

Anthropic tool use is a two-turn pattern: the assistant returns a `tool_use`
block, then your app returns a `tool_result` block whose `tool_use_id` matches
it. Catalyst records both sides of that relationship on the auto-emitted `LLM`
span: what the model asked for and the result you passed back.

<Tip>
  The capture below is the **model-side** view: it shows the request/response
  the LLM saw. To also capture the **caller-side** view (the actual function
  that ran, its input, output, and duration), wrap the tool function in a
  `TOOL` span using
  [Manual spans](/integrations/traces/manual-spans#tool-chain-and-retriever-spans).
  For a full agent loop, see the
  [Production Agent Example](/integrations/traces/production-agent-example).
</Tip>

<Metadata text="integrations/traces/anthropic-tool-use" />

```typescript TypeScript theme={"system"}
// Reuses `client` and the `Anthropic` import from the basic example above.
const tools: Anthropic.Tool[] = [
  {
    name: "lookup_order",
    description: "Look up an order by ID.",
    input_schema: {
      type: "object",
      properties: { orderId: { type: "string" } },
      required: ["orderId"],
    },
  },
];

const messages: Anthropic.MessageParam[] = [
  { role: "user", content: "Check order ABC-123." },
];

const first = await client.messages.create({
  model: "claude-haiku-4-5",
  max_tokens: 256,
  tools,
  messages,
});

const toolUse = first.content.find(
  (block): block is Anthropic.ToolUseBlock => block.type === "tool_use",
);

if (toolUse != null) {
  messages.push({ role: "assistant", content: first.content });
  const args = toolUse.input as { orderId: string };
  messages.push({
    role: "user",
    content: [
      {
        type: "tool_result",
        tool_use_id: toolUse.id,
        content: JSON.stringify({ orderId: args.orderId, status: "shipped" }),
      },
    ],
  });

  const final = await client.messages.create({
    model: "claude-haiku-4-5",
    max_tokens: 256,
    tools,
    messages,
  });
  console.log(final.content);
}
```
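
The same round trip can be sketched in Python. This is a minimal illustration, not Catalyst's or Anthropic's API: `build_tool_result_turns`, `run_tool`, and `lookup_order` are hypothetical names, and the assistant's content blocks are assumed to be plain dicts (for example, built from `block.model_dump()`). The helper returns the assistant turn and the matching `tool_result` turn to append before the follow-up call.

```python
import json
from typing import Any, Callable


def build_tool_result_turns(
    content_blocks: list[dict[str, Any]],
    run_tool: Callable[[str, dict[str, Any]], Any],
) -> list[dict[str, Any]]:
    """Return the assistant turn plus a user turn answering every tool_use block.

    Hypothetical helper: `content_blocks` are the assistant's content blocks as
    plain dicts; `run_tool(name, input)` is your own dispatcher.
    """
    tool_uses = [b for b in content_blocks if b.get("type") == "tool_use"]
    if not tool_uses:
        return []  # Nothing to answer; the model replied with text only.

    results = [
        {
            "type": "tool_result",
            # Must match the id of the tool_use block being answered.
            "tool_use_id": block["id"],
            "content": json.dumps(run_tool(block["name"], block["input"])),
        }
        for block in tool_uses
    ]
    return [
        {"role": "assistant", "content": content_blocks},
        {"role": "user", "content": results},
    ]


def lookup_order(name: str, args: dict[str, Any]) -> dict[str, Any]:
    # Stubbed tool body, mirroring the TypeScript example above.
    return {"orderId": args["orderId"], "status": "shipped"}


# Illustrative blocks only; a real response supplies its own ids and inputs.
blocks = [
    {"type": "text", "text": "Let me check that order."},
    {
        "type": "tool_use",
        "id": "toolu_example",
        "name": "lookup_order",
        "input": {"orderId": "ABC-123"},
    },
]
turns = build_tool_result_turns(blocks, lookup_order)
print(turns[1]["content"][0]["tool_use_id"])
```

In a real loop you would extend your `messages` list with the returned turns and call `client.messages.create` again with the same `tools`, as the TypeScript example does.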

## Prompt Caching

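When Anthropic returns cache creation and cache read token counts
(`cache_creation_input_tokens` and `cache_read_input_tokens` in `usage`),
Catalyst maps them into OpenInference token detail attributes. In the example
below, the first call is expected to write the cache and the second, identical
call to read from it.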

<Metadata text="integrations/traces/anthropic-prompt-caching" />

```typescript TypeScript theme={"system"}
const longSystem =
  "You are a careful, terse assistant. Answer in one sentence.\n\n" +
  "Reference document:\n" +
  "Lorem ipsum dolor sit amet, consectetur adipiscing elit. ".repeat(300);

const params: Anthropic.MessageCreateParamsNonStreaming = {
  model: "claude-haiku-4-5",
  max_tokens: 64,
  system: [
    {
      type: "text",
      text: longSystem,
      cache_control: { type: "ephemeral" },
    },
  ],
  messages: [{ role: "user", content: "Is the document about lorem ipsum?" }],
};

const first = await client.messages.create(params);
const second = await client.messages.create(params);

console.log(first.usage.cache_creation_input_tokens ?? 0);
console.log(second.usage.cache_read_input_tokens ?? 0);
```
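
To compare the two calls side by side, the usage counts can be folded into one summary. The sketch below is a hypothetical helper (`cache_token_summary` is not part of Catalyst or the Anthropic SDK) and assumes the usage object is available as a plain mapping, for example from `message.usage.model_dump()` in Python. It relies on Anthropic reporting cached tokens separately from `input_tokens`.

```python
from typing import Any, Mapping


def cache_token_summary(usage: Mapping[str, Any]) -> dict[str, int]:
    """Summarize Anthropic usage counts; missing or null cache fields count as 0."""
    uncached = int(usage.get("input_tokens") or 0)
    cache_write = int(usage.get("cache_creation_input_tokens") or 0)
    cache_read = int(usage.get("cache_read_input_tokens") or 0)
    return {
        "uncached_input": uncached,
        "cache_write": cache_write,
        "cache_read": cache_read,
        # Cached tokens are reported separately from input_tokens,
        # so the full prompt size is the sum of all three.
        "total_input": uncached + cache_write + cache_read,
    }


# Illustrative numbers only, not real API output.
first_usage = {
    "input_tokens": 12,
    "cache_creation_input_tokens": 4200,
    "cache_read_input_tokens": 0,
}
second_usage = {
    "input_tokens": 12,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 4200,
}

first_summary = cache_token_summary(first_usage)
second_summary = cache_token_summary(second_usage)
print(first_summary["cache_write"], second_summary["cache_read"])
```

A nonzero `cache_read` on the second call is a quick sign that the cached prefix was reused; if both calls show only `cache_write`, the prefix may have changed between requests or the cache entry may have expired.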
