
The Vercel AI SDK emits native OpenTelemetry spans when experimental_telemetry is enabled. Catalyst provides the tracer provider and a small helper that wires those AI SDK spans into your Catalyst trace export. Use this guide when your app calls the ai package directly through generateText, streamText, structured outputs, tools, or ToolLoopAgent. The same setup works with AI SDK providers such as @ai-sdk/openai, @ai-sdk/anthropic, and @ai-sdk/openai-compatible.

What Is Captured

  • ai.generateText operation spans and ai.generateText.doGenerate model-step spans
  • ai.streamText operation spans and ai.streamText.doStream model-step spans
  • ai.toolCall spans for client-side tool execution
  • ToolLoopAgent.generate() and ToolLoopAgent.stream() activity through the same native AI SDK spans
  • Prompt text or prompt messages, response text, structured output metadata, and streamed text
  • Tool call names, IDs, arguments, and tool results
  • Token usage including input, output, total, cached input, and reasoning tokens when the provider returns them
  • operation.name values that include your functionId
  • Custom metadata passed through experimental_telemetry

Install

TypeScript
bun add @inference/tracing ai
Install the provider package you use with the AI SDK:
TypeScript
bun add @ai-sdk/openai-compatible
Other common provider packages include @ai-sdk/openai, @ai-sdk/anthropic, and @ai-sdk/google.

Configure Export

Set your Catalyst OTLP endpoint and token in the runtime environment. Short-lived scripts should also set a stable service name so traces are easy to find.
export CATALYST_OTLP_ENDPOINT="https://telemetry.inference.cool"
export CATALYST_OTLP_TOKEN="..."
export CATALYST_SERVICE_NAME="ai-sdk-worker"
If you route model calls through an OpenAI-compatible gateway, configure the AI SDK provider separately:
export INFERENCE_BASE_URL="https://api.inference.net/v1"
export INFERENCE_API_KEY="..."
export INFERENCE_MODEL="meta-llama/Llama-3.1-8B-Instruct"

Initialize Tracing

Initialize Catalyst tracing before the first AI SDK call. Import the AI SDK namespace and pass it to setup() so auto-detection and integration status can see the installed module.
TypeScript
import * as ai from "ai";
import { setup } from "@inference/tracing";
import { createAISdkTelemetrySettings } from "@inference/tracing/ai-sdk";

const tracing = await setup({
  serviceName: process.env.CATALYST_SERVICE_NAME ?? "ai-sdk-worker",
  modules: { aiSdk: ai },
});

const telemetry = (functionId: string) =>
  createAISdkTelemetrySettings(tracing.tracer, {
    functionId,
    metadata: {
      route: "support-summary",
      environment: process.env.NODE_ENV ?? "development",
    },
  });
Pass experimental_telemetry: telemetry("...") on every AI SDK call you want to trace. The AI SDK does not apply telemetry settings globally.
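
If you prefer to see what the helper produces, experimental_telemetry accepts a plain settings object. A minimal hand-wired sketch, assuming createAISdkTelemetrySettings maps onto the AI SDK's standard TelemetrySettings fields:
TypeScript
const manualTelemetry = {
  // Enables span emission for this specific call.
  isEnabled: true,
  // Routes the AI SDK spans through the Catalyst tracer from setup().
  tracer: tracing.tracer,
  // Appears in operation.name for filtering.
  functionId: "support-summary-generate",
  // Arbitrary key-value pairs attached to the emitted spans.
  metadata: { route: "support-summary" },
};
The helper is preferred because it keeps these fields consistent across calls; the sketch only illustrates what is being wired together.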

Provider Setup

This example uses an OpenAI-compatible provider, which works with Catalyst Gateway and other OpenAI-compatible endpoints.
TypeScript
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";

const provider = createOpenAICompatible({
  name: "inference",
  baseURL: process.env.INFERENCE_BASE_URL ?? "https://api.inference.net/v1",
  apiKey: process.env.INFERENCE_API_KEY!,
  includeUsage: true,
  supportsStructuredOutputs: true,
  headers: {
    "x-inference-environment": "development",
    "x-inference-metadata-app": "ai-sdk-worker",
  },
});

const model = provider(
  process.env.INFERENCE_MODEL ?? "meta-llama/Llama-3.1-8B-Instruct",
);
includeUsage: true is useful because usage metadata is what populates token columns in Catalyst. Some providers only return token counts for non-streaming calls or only after a stream finishes.
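
To confirm that a provider actually returns usage metadata, you can log the usage object on a result locally. A quick check, reusing the model configured above:
TypeScript
import { generateText } from "ai";

const probe = await generateText({ model, prompt: "ping" });

// If the counts here are undefined, the provider is not returning usage
// metadata and the token columns in Catalyst will stay empty.
console.log(probe.usage);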

Basic Generation

TypeScript
import { generateText } from "ai";

const result = await generateText({
  model,
  system: "You answer in one concise sentence.",
  prompt: "Summarize why trace trees are useful.",
  experimental_telemetry: telemetry("support-summary-generate"),
});

console.log(result.text);
Expected spans:
  • ai.generateText
  • ai.generateText.doGenerate
Expected promoted fields include llm_model_name, input_tokens, output_tokens, total_tokens, input, and output when the provider returns the corresponding AI SDK attributes.

Streaming

streamText() produces a streaming operation span and a model-step span. Consume the stream before process shutdown so the AI SDK can finish the span and record the reconstructed response text.
TypeScript
import { streamText } from "ai";

const result = streamText({
  model,
  prompt: "Stream a six-word sentence about observability.",
  experimental_telemetry: telemetry("support-summary-stream"),
});

let text = "";
for await (const chunk of result.textStream) {
  text += chunk;
}

console.log(text);
Expected spans:
  • ai.streamText
  • ai.streamText.doStream
For short-lived scripts, call await tracing.shutdown() after the stream is fully consumed.
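
If you also want the token counts in your own code, streamText exposes usage as a promise that resolves only after the stream completes, which matches the provider behavior noted in Provider Setup. A short sketch continuing the example above:
TypeScript
// result comes from the streamText call above; the usage promise
// resolves once the underlying stream has finished.
const usage = await result.usage;
console.log(usage);

await tracing.shutdown();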

Tool Calling

Tool calls create both model spans and client-side ai.toolCall spans. The model-step span records the tool call requested by the model. The tool span records your local execute() call and its result.
TypeScript
import { generateText, stepCountIs, tool } from "ai";
import { z } from "zod";

const result = await generateText({
  model,
  prompt: "Use the weather tool for Paris, then summarize the result.",
  stopWhen: stepCountIs(2),
  tools: {
    weather: tool({
      description: "Get the current weather for a city.",
      inputSchema: z.object({ city: z.string() }),
      execute: async ({ city }) => ({
        city,
        temperatureC: 21,
        condition: "clear",
      }),
    }),
  },
  toolChoice: { type: "tool", toolName: "weather" },
  experimental_telemetry: telemetry("weather-tool-generate"),
});

console.log(result.text);
Expected spans:
  • ai.generateText
  • one or more ai.generateText.doGenerate model-step spans
  • ai.toolCall spans for executed tools
Useful raw attributes:
Attribute               Meaning
ai.response.toolCalls   Tool calls requested by the model on a model-step span
ai.toolCall.name        Tool name on the client-side tool span
ai.toolCall.id          Tool call ID that links model request and tool execution
ai.toolCall.args        JSON arguments passed to execute()
ai.toolCall.result      JSON result returned by execute()

Structured Output

Structured outputs are traced through the same generateText operation shape. The response text or parsed output is preserved in AI SDK attributes when the provider returns it.
TypeScript
import { generateText, Output } from "ai";
import { z } from "zod";

const result = await generateText({
  model,
  prompt: "Extract city, temperatureC, and condition from: Paris is clear and 21C.",
  output: Output.object({
    name: "weather_report",
    schema: z.object({
      city: z.string(),
      temperatureC: z.number(),
      condition: z.string(),
    }),
  }),
  experimental_telemetry: telemetry("weather-structured-output"),
});

console.log(result.output);
Use supportsStructuredOutputs: true on OpenAI-compatible providers when the downstream model endpoint supports native structured outputs.

Agents

The AI SDK’s ToolLoopAgent accepts experimental_telemetry in the agent constructor. Agent calls then emit the same native AI SDK spans as core functions:
  • agent.generate() emits ai.generateText and ai.generateText.doGenerate
  • agent.stream() emits ai.streamText and ai.streamText.doStream
  • agent tool execution emits ai.toolCall
There is no separate ai.agent span today. Infer the agent loop from the parent/child relationships, repeated model-step spans, tool-call spans, and the functionId you choose.
TypeScript
import { ToolLoopAgent, stepCountIs, tool } from "ai";
import { z } from "zod";

const weatherAgent = new ToolLoopAgent({
  model,
  instructions:
    "Use available tools to answer weather questions, then give a concise final answer.",
  stopWhen: stepCountIs(2),
  tools: {
    weather: tool({
      description: "Get the current weather for a city.",
      inputSchema: z.object({ city: z.string() }),
      execute: async ({ city }) => ({
        city,
        temperatureC: 21,
        condition: "clear",
      }),
    }),
  },
  toolChoice: { type: "tool", toolName: "weather" },
  experimental_telemetry: telemetry("weather-agent-generate"),
});

const answer = await weatherAgent.generate({
  prompt: "Use the weather tool for Paris, then answer in one sentence.",
});

console.log(answer.text);
console.log(answer.steps.length);

Streaming Agent

TypeScript
import { ToolLoopAgent } from "ai";

const streamingAgent = new ToolLoopAgent({
  model,
  instructions: "Stream concise answers.",
  experimental_telemetry: telemetry("weather-agent-stream"),
});

const stream = await streamingAgent.stream({
  prompt: "Stream a short weather summary.",
});

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}

Next.js Route Handler

For request/response applications, initialize tracing in a module that is loaded before route handlers call the AI SDK. Keep shutdown() for process lifecycle hooks or short-lived jobs; do not call it after every web request.
TypeScript
// app/api/chat/route.ts
import * as ai from "ai";
import { streamText } from "ai";
import { setup } from "@inference/tracing";
import { createAISdkTelemetrySettings } from "@inference/tracing/ai-sdk";
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";

const tracing = await setup({
  serviceName: "next-ai-sdk-app",
  modules: { aiSdk: ai },
});

const provider = createOpenAICompatible({
  name: "inference",
  baseURL: process.env.INFERENCE_BASE_URL ?? "https://api.inference.net/v1",
  apiKey: process.env.INFERENCE_API_KEY!,
  includeUsage: true,
});

// The route handler below reads this shared model instance.
const model = provider(
  process.env.INFERENCE_MODEL ?? "meta-llama/Llama-3.1-8B-Instruct",
);

const telemetry = (functionId: string) =>
  createAISdkTelemetrySettings(tracing.tracer, {
    functionId,
    metadata: { route: "/api/chat" },
  });

export async function POST(request: Request) {
  const { prompt } = await request.json();

  const result = streamText({
    model,
    prompt,
    experimental_telemetry: telemetry("chat-route-stream"),
  });

  return result.toUIMessageStreamResponse();
}

Verify Traces

Filter by the service name you configured:
inf trace list --range 1h --service ai-sdk-worker --limit 10
Look for ai.generateText, ai.streamText, and ai.toolCall spans. If you set distinct functionId values, you can also search for the corresponding operation.name attributes in the trace detail view.

Attribute Reference

Catalyst promotes stable AI SDK attributes into canonical columns and preserves all raw attributes for inspection.
Catalyst field      AI SDK attribute
llm_model_name      ai.model.id
input_tokens        ai.usage.inputTokens or ai.usage.promptTokens
output_tokens       ai.usage.outputTokens or ai.usage.completionTokens
total_tokens        ai.usage.totalTokens or ai.usage.tokens
cache_read_tokens   ai.usage.cachedInputTokens
reasoning_tokens    ai.usage.reasoningTokens
input_messages      ai.prompt.messages
input               ai.prompt
output              ai.response.text
Observation kinds are inferred from ai.operationId:
ai.operationId shape                          Observation kind
ai.generateText, ai.generateText.doGenerate   LLM
ai.streamText, ai.streamText.doStream         LLM
ai.generateObject, ai.streamObject            LLM
ai.toolCall                                   TOOL
ai.embed, ai.embedMany                        EMBEDDING

Common Gotchas

  • Pass experimental_telemetry on every AI SDK call or agent you want traced.
  • Use a stable functionId; it appears in operation.name and makes filtering easier.
  • Set includeUsage: true on OpenAI-compatible providers when available.
  • Fully consume streams before process exit.
  • Call await tracing.shutdown() in scripts, CLIs, tests, and job workers that exit after a run.
  • Do not call shutdown() after each request in a long-running server.
  • If tool calls appear on model spans but no ai.toolCall span appears, confirm the tool has an execute() function and is executed client-side.
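
For the script-style examples on this page, one way to guarantee the final spans are flushed is to wrap the work in try/finally. A minimal sketch using the setup from this guide:
TypeScript
try {
  const result = await generateText({
    model,
    prompt: "Summarize why trace trees are useful.",
    experimental_telemetry: telemetry("support-summary-generate"),
  });
  console.log(result.text);
} finally {
  // Flushes buffered spans before the process exits.
  await tracing.shutdown();
}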