Catalyst instruments Anthropic Messages API calls in TypeScript and Python. Each
model span records content blocks, tool-use blocks, the model name, invocation
parameters, finish reason, usage, and prompt-cache token details when Anthropic
returns them.
When Anthropic calls are part of an agent loop, wrap the product-level operation
with agentSpan() / agent_span() and pass a stable agent.id. The Anthropic
model spans stay nested under that AGENT span for dashboard grouping.
Install
bun add @inference/tracing @anthropic-ai/sdk
Basic Messages Call
import Anthropic from "@anthropic-ai/sdk";
import { setup } from "@inference/tracing";
const tracing = await setup({ modules: { anthropic: Anthropic } });
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const message = await client.messages.create({
  model: "claude-haiku-4-5",
  max_tokens: 128,
  messages: [{ role: "user", content: "Respond with just the word hello." }],
});
console.log(message.content);
await tracing.shutdown();
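Because message.content is an array of content blocks rather than a plain string, a small helper can make the logged output easier to read. The following is a minimal sketch, not part of Catalyst or the Anthropic SDK: ContentBlockLike and textFromContent are hypothetical names, and the local interface is a simplified stand-in for the SDK's richer block types.

```typescript
// Hypothetical, simplified shape of a content block. The SDK's own types
// carry more fields; only `type` and `text` matter for this sketch.
interface ContentBlockLike {
  type: string;
  text?: string;
}

// Join the text of every "text" block, skipping tool_use and other blocks.
function textFromContent(content: ContentBlockLike[]): string {
  return content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("");
}

const sample: ContentBlockLike[] = [
  { type: "text", text: "hello" },
  { type: "tool_use" },
];
console.log(textFromContent(sample)); // prints "hello"
```

With the basic call above, textFromContent(message.content) would return only the assistant's text.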
Anthropic Inside An Agent
import { agentSpan } from "@inference/tracing";
// Reuses `tracing` and `client` from the basic example above.
await agentSpan(
  tracing.tracer,
  {
    agentId: "research-agent",
    name: "ResearchAgent",
    role: "research",
    system: "anthropic",
  },
  async (span) => {
    const input = "Summarize the latest customer note.";
    span.setInput(input);
    // This Messages call is traced as a model span nested under the AGENT span.
    const message = await client.messages.create({
      model: "claude-haiku-4-5",
      max_tokens: 128,
      messages: [{ role: "user", content: input }],
    });
    span.setOutput(message.content);
  },
);
Tool Use
Anthropic tool use is a two-turn pattern: the assistant returns a tool_use
block, then your app returns a matching tool_result block. Catalyst records
both sides of that relationship.
const tools: Anthropic.Tool[] = [
  {
    name: "lookup_order",
    description: "Look up an order by ID.",
    input_schema: {
      type: "object",
      properties: { orderId: { type: "string" } },
      required: ["orderId"],
    },
  },
];
const messages: Anthropic.MessageParam[] = [
  { role: "user", content: "Check order ABC-123." },
];
// Turn 1: the assistant may respond with a tool_use block.
const first = await client.messages.create({
  model: "claude-haiku-4-5",
  max_tokens: 256,
  tools,
  messages,
});
const toolUse = first.content.find(
  (block): block is Anthropic.ToolUseBlock => block.type === "tool_use",
);
if (toolUse != null) {
  // Echo the assistant turn back, then answer it with a matching tool_result.
  messages.push({ role: "assistant", content: first.content });
  const args = toolUse.input as { orderId: string };
  messages.push({
    role: "user",
    content: [
      {
        type: "tool_result",
        tool_use_id: toolUse.id,
        content: JSON.stringify({ orderId: args.orderId, status: "shipped" }),
      },
    ],
  });
  // Turn 2: the assistant answers using the tool result.
  const final = await client.messages.create({
    model: "claude-haiku-4-5",
    max_tokens: 256,
    tools,
    messages,
  });
  console.log(final.content);
}
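The example above handles only the first tool_use block. An assistant turn can contain several, and each one needs a tool_result with the same tool_use_id in the next user message. The sketch below shows one way to build those results. buildToolResults, BlockLike, ToolResultLike, and ToolHandler are hypothetical names introduced for illustration, the local interfaces are simplified stand-ins for the SDK types, and handlers are assumed to be synchronous.

```typescript
// Hypothetical, simplified block shapes (the SDK types are richer).
interface BlockLike {
  type: string;
  id?: string;
  name?: string;
  input?: unknown;
}

interface ToolResultLike {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

// Assumption: tools are plain synchronous functions keyed by tool name.
type ToolHandler = (input: unknown) => unknown;

// Produce one tool_result per tool_use block, preserving order.
function buildToolResults(
  content: BlockLike[],
  handlers: Record<string, ToolHandler>,
): ToolResultLike[] {
  const results: ToolResultLike[] = [];
  for (const block of content) {
    if (block.type !== "tool_use" || block.id === undefined || block.name === undefined) {
      continue; // text and other blocks need no result
    }
    const handler = handlers[block.name];
    if (handler === undefined) {
      results.push({
        type: "tool_result",
        tool_use_id: block.id,
        content: `Unknown tool: ${block.name}`,
        is_error: true,
      });
      continue;
    }
    try {
      const value = handler(block.input);
      results.push({
        type: "tool_result",
        tool_use_id: block.id,
        content: JSON.stringify(value) ?? "null",
      });
    } catch (error) {
      // Report the failure to the model instead of dropping the result.
      results.push({
        type: "tool_result",
        tool_use_id: block.id,
        content: String(error),
        is_error: true,
      });
    }
  }
  return results;
}

const exampleResults = buildToolResults(
  [
    { type: "text" },
    { type: "tool_use", id: "toolu_1", name: "lookup_order", input: { orderId: "ABC-123" } },
  ],
  { lookup_order: (input) => ({ ...(input as { orderId: string }), status: "shipped" }) },
);
console.log(exampleResults);
```

In the flow above, the returned array would be pushed as the content of the user message in place of the single hand-written tool_result.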
Prompt Caching
When Anthropic returns cache creation and cache read token counts, Catalyst maps
them into OpenInference token detail attributes.
const longSystem =
  "You are a careful, terse assistant. Answer in one sentence.\n\n" +
  "Reference document:\n" +
  "Lorem ipsum dolor sit amet, consectetur adipiscing elit. ".repeat(300);
const params: Anthropic.MessageCreateParamsNonStreaming = {
  model: "claude-haiku-4-5",
  max_tokens: 64,
  system: [
    {
      type: "text",
      text: longSystem,
      // Mark the long system block as cacheable.
      cache_control: { type: "ephemeral" },
    },
  ],
  messages: [{ role: "user", content: "Is the document about lorem ipsum?" }],
};
// The first call can write the cache; the identical second call can read it.
const first = await client.messages.create(params);
const second = await client.messages.create(params);
console.log(first.usage.cache_creation_input_tokens ?? 0);
console.log(second.usage.cache_read_input_tokens ?? 0);
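To make the mapping described in this section concrete, the sketch below converts an Anthropic usage object into OpenInference-style token attributes. It illustrates the idea and is not Catalyst's actual implementation: usageToTokenAttributes and UsageLike are hypothetical names, the attribute keys follow the OpenInference semantic conventions as I understand them, and summing cached tokens into the prompt count is an assumption (Anthropic reports input_tokens separately from cache reads and writes).

```typescript
// Hypothetical, simplified view of the usage object on a Messages response.
interface UsageLike {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
}

function usageToTokenAttributes(usage: UsageLike): Record<string, number> {
  const cacheWrite = usage.cache_creation_input_tokens ?? 0;
  const cacheRead = usage.cache_read_input_tokens ?? 0;
  // Assumption: the prompt total includes cached tokens, since Anthropic's
  // input_tokens does not count cache reads or writes.
  const prompt = usage.input_tokens + cacheWrite + cacheRead;
  const attributes: Record<string, number> = {
    "llm.token_count.prompt": prompt,
    "llm.token_count.completion": usage.output_tokens,
    "llm.token_count.total": prompt + usage.output_tokens,
  };
  // Emit the detail attributes only when Anthropic returned the counts.
  if (usage.cache_creation_input_tokens != null) {
    attributes["llm.token_count.prompt_details.cache_write"] = cacheWrite;
  }
  if (usage.cache_read_input_tokens != null) {
    attributes["llm.token_count.prompt_details.cache_read"] = cacheRead;
  }
  return attributes;
}

console.log(
  usageToTokenAttributes({
    input_tokens: 12,
    output_tokens: 8,
    cache_creation_input_tokens: 0,
    cache_read_input_tokens: 4200,
  }),
);
```

Comparing the attributes for the first and second calls above should show the cached tokens moving from the cache_write detail to the cache_read detail once the cache is warm.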