Signals

Once traces are flowing, signals turn them into structured labels. A signal is a plain-language classifier that you define once and that Catalyst runs continuously against one of your agents. Describe what you want to detect (“is this NSFW?”, “did the user get frustrated?”, “was the task completed?”) and every matching trace is labeled automatically by an LLM judge, with the results queryable right alongside your other agent metrics.

Signals are how you evaluate the things default metrics miss. For a user-facing agent they tell you how people are actually interacting with it (frustration, sentiment, jailbreak attempts). For a non-user-facing agent they tell you how the agent itself is behaving (did it refuse, did it complete the task, did it stay on policy).

Want the full walkthrough with a worked example instead of the reference? See the guide Measure your agent’s quality with Signals.

Before you start

Signals run on captured traffic, so you need traces flowing first. This page assumes you’ve already captured your first trace. You also need:

An agent with tracing installed. Signals are always scoped to a single agent, so there is nothing to label until that agent is emitting traces.
A stable, consistent agentId. Signals are per agent, so your traces need a consistent agent identity to group by. Set the agentId once and keep it the same across runs. If you instrumented with the CLI this is usually already set. See Set agent identity.
A sessionId, if you want session-scoped signals. Session signals classify a whole conversation, which requires that your traces carry a sessionId (your conversation or chat ID) so Catalyst can assemble the conversation. See Choose a scope.

Where to find signals

You can get to signals two ways in the dashboard:

The Signals tab, which lists every signal and is the main table view across your agents.
The Agents tab, where you pick an agent and open its Signals view to see and create signals scoped to just that agent.

How signals work

A signal evaluates traffic for one of your agents. When you activate it, Catalyst samples incoming traffic for that agent, sends each sampled target’s input and output to an LLM judge with the prompt you wrote, and writes the label back. You can then filter, chart, and break down your traces by that label.

You define the signal

Pick a scope and classifier type, write a prompt describing what to look for, and set a sample rate.

Catalyst samples matching traffic

For an active signal, a deterministic share of the agent’s traffic (the sample rate) is selected for classification. Sampling is deterministic, so the same target always resolves the same way.
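One common way to get this kind of deterministic sampling is to hash a stable identifier into a bucket and compare the bucket with the sample rate. The sketch below illustrates that idea only. The hashing scheme, the `is_sampled` name, and the ID formats are assumptions for illustration, not a description of how Catalyst implements sampling.

```python
import hashlib


def is_sampled(signal_id: str, target_id: str, sample_rate: float) -> bool:
    """Decide whether a target falls inside the sampled share.

    Hypothetical scheme: hash (signal_id, target_id) to a 64-bit integer and
    keep the target if that integer lands below sample_rate * 2**64.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    digest = hashlib.sha256(f"{signal_id}:{target_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    # The decision depends only on the IDs and the rate, so it never changes
    # between runs for the same target.
    return bucket < sample_rate * 2**64


# Roughly a quarter of these made-up trace IDs are selected at a 25% rate.
trace_ids = [f"trace-{i}" for i in range(10_000)]
selected = [t for t in trace_ids if is_sampled("nsfw", t, 0.25)]
```

With a scheme like this, asking twice about the same target gives the same answer, and raising the rate only adds targets to the sample without dropping ones that were already selected.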

A judge labels each target

The judge reads the input and output of each sampled span, trace, or session and returns a label that conforms to your classifier type.

Labels land on your traces

Results are stored against each labeled target and surfaced in the dashboard, where you can filter by label value and watch the label distribution over time.

Choose a scope

A signal’s scope is the unit it labels. You pick it when you create the signal, and it’s fixed for the life of the signal (everything else is editable). Scope determines how much context the judge sees on each call.

Span

A single model call. The narrowest scope. Useful for call-level checks, but often too granular if you care about the interaction as a whole.

Trace

One turn, or request. The judge sees the whole turn rather than a single call. A good fit for request-level outcomes.

Session

The full conversation. Usually the most useful scope for understanding a user, since the judge sees the entire back-and-forth. Requires a sessionId on your traces so the conversation can be assembled.

Session scope only works if your traces carry a sessionId. If you haven’t set one, add it before creating a session-scoped signal. See Set agent identity.
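To see why the sessionId matters, here is a rough sketch of how traces could be grouped into conversations. The `TraceRecord` fields and the `assemble_sessions` helper are hypothetical names for illustration; the real trace schema and grouping logic may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TraceRecord:
    # All field names are assumptions made for this example.
    trace_id: str
    agent_id: str
    session_id: Optional[str]
    started_at: float  # seconds since epoch


def assemble_sessions(
    traces: List[TraceRecord],
) -> Dict[Tuple[str, str], List[TraceRecord]]:
    """Group traces into conversations keyed by (agent_id, session_id)."""
    sessions: Dict[Tuple[str, str], List[TraceRecord]] = defaultdict(list)
    for trace in traces:
        if trace.session_id is None:
            # Without a session ID there is no conversation to attach to.
            continue
        sessions[(trace.agent_id, trace.session_id)].append(trace)
    for turns in sessions.values():
        turns.sort(key=lambda t: t.started_at)  # oldest turn first
    return dict(sessions)
```

In this sketch a trace without a session ID is simply left out, which mirrors the requirement above: a session-scoped signal has nothing to classify until the ID is set.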

Two classifier types

When you create a signal you choose how it labels each target.

Binary (yes / no)

A true/false classifier. Use it for “is this X or not” questions: NSFW content, jailbreak attempts, refusals, or any flag you want to filter on. No labels to configure, just a prompt.

String (enumerated labels)

A classifier that returns one of a fixed set of labels you define. Use it when there are more than two outcomes: sentiment (positive / neutral / negative), task outcome (completed / partial / failed / abandoned), and so on. Define between 2 and 10 labels.
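The two classifier types can be pictured as a small data structure with a validation step. This is a sketch under assumptions: the `Classifier` class, its field names, the "yes"/"no" string values, and the uniqueness check are illustrative choices, not the product's schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Classifier:
    kind: str                     # "binary" or "string"
    labels: Tuple[str, ...] = ()  # only used by string classifiers

    def __post_init__(self) -> None:
        if self.kind == "binary":
            if self.labels:
                raise ValueError("binary classifiers take no labels")
        elif self.kind == "string":
            if not 2 <= len(self.labels) <= 10:
                raise ValueError("string classifiers need between 2 and 10 labels")
            if len(set(self.labels)) != len(self.labels):
                # Assumption: duplicate labels would be ambiguous, so reject them.
                raise ValueError("labels must be unique")
        else:
            raise ValueError(f"unknown classifier kind: {self.kind!r}")

    def allowed_values(self) -> Tuple[str, ...]:
        """Values a judge response must match to conform to this classifier."""
        return ("yes", "no") if self.kind == "binary" else self.labels

    def accepts(self, value: str) -> bool:
        return value in self.allowed_values()


sentiment = Classifier("string", ("positive", "neutral", "negative"))
nsfw = Classifier("binary")
```

Checking judge output against `allowed_values()` is one way to make sure a label conforms to the classifier type before it is stored.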

Create a signal

Create a signal from the Signals tab, or from a specific agent’s Signals view under the Agents tab. Either way the signal is scoped to one agent.

Open the signal editor for your agent

From the Signals tab or an agent’s Signals view, pick the agent whose traffic you want to label and create a new signal.

Name the signal

Give it a short, descriptive name like NSFW or User frustration. The name is how the label shows up everywhere else in the dashboard.

Choose a scope

Pick Span, Trace, or Session (see Choose a scope). This is the one setting you can’t change later, so pick the unit you actually want to reason about. Session is usually the most useful for understanding a user.

Choose a classifier type

Pick Binary (yes / no) or String (enumerated labels). For a string classifier, add the 2 to 10 labels the judge is allowed to return.

Write the prompt

Describe what the signal should classify. This is the instruction the judge follows on every target, so be specific about what counts as each outcome. For a binary signal, describe what makes a target a “yes.” For a string signal, describe when to pick each label.

Set the sample rate

The sample rate is the share of matching traffic that gets classified. Lower rates cost fewer credits. At 100% the signal runs on every target, which is often unnecessary at high volume; at 25% or 50% it runs on only that share of incoming traffic. Start lower on high-volume agents and raise the rate once you trust the labels. Common presets are 10%, 25%, 50%, and 100%.

Save as a draft or activate

Save draft keeps the signal unpublished so you can keep tuning it. Activate publishes it and starts live classification on new traffic.

Don’t want to start from scratch? Use Start from a template to prefill the classifier type, prompt, and labels for a common signal, then edit from there. See Templates below.

Test before you activate

Before you commit a signal to live traffic, run it against recent traffic to preview how it labels it. A test run classifies a small sample synchronously and shows you the label distribution and per-target results. It’s a preview only: it doesn’t change your signal or store any labels.

Open the tester

From the signal, choose Test it.

Pick a sample size and time range

Choose how many recent targets to classify (1 to 100) and the window to pull them from.

Read the preview

You’ll see the label distribution across the sample plus a per-target breakdown, including which targets got flagged. If the labels don’t match your intent, adjust the prompt or labels and test again.

Test results are not persisted. They exist only to help you tune the prompt before activating.
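Conceptually, a test run boils down to classifying a bounded sample and tallying the labels. The sketch below shows that shape. `preview` and `keyword_judge` are hypothetical names, and the keyword check is only a stand-in for the LLM judge so the example runs on its own.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def preview(
    targets: Sequence[str],
    classify: Callable[[str], str],
    sample_size: int,
) -> Tuple[Counter, List[Tuple[str, str]]]:
    """Classify up to sample_size targets and return (distribution, per-target results).

    Nothing is written anywhere: the results exist only in the return value.
    """
    if not 1 <= sample_size <= 100:
        raise ValueError("sample_size must be between 1 and 100")
    sample = list(targets[:sample_size])  # assumes targets are ordered most recent first
    results = [(target, classify(target)) for target in sample]
    distribution = Counter(label for _, label in results)
    return distribution, results


def keyword_judge(text: str) -> str:
    """Stand-in for the LLM judge: flags any message that mentions a refund."""
    return "yes" if "refund" in text.lower() else "no"


messages = ["I want a refund now", "Thanks, that worked", "REFUND please"]
distribution, results = preview(messages, keyword_judge, sample_size=3)
```

If the distribution looks off, you would adjust the prompt (here, the stand-in judge) and run the preview again.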

Activate and run live

Activating a signal starts live classification: as new traffic arrives for the agent, the configured share of it gets sampled and labeled automatically. You don’t have to do anything else, and labels accumulate as traffic flows. A signal is always in one of three states:

State	What it means
Draft	Saved but not running. Nothing is being classified.
Active	Live. New matching traffic is sampled and labeled.
Disabled	Paused. Live classification has stopped, but past labels are kept.

You can disable an active signal at any time to stop classification without losing the labels you’ve already collected, and re-enable it later.
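The three states and the moves between them can be read as a small state machine. The sketch below models only the transitions described above; the action names (`activate`, `disable`, `enable`) and the `apply_action` helper are assumptions for illustration.

```python
from enum import Enum


class SignalState(Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    DISABLED = "disabled"


# (action, current state) -> next state. Only transitions named in the text.
TRANSITIONS = {
    ("activate", SignalState.DRAFT): SignalState.ACTIVE,
    ("disable", SignalState.ACTIVE): SignalState.DISABLED,
    ("enable", SignalState.DISABLED): SignalState.ACTIVE,
}


def apply_action(state: SignalState, action: str) -> SignalState:
    """Return the next state, or raise if the move is not modeled here."""
    try:
        return TRANSITIONS[(action, state)]
    except KeyError:
        raise ValueError(f"cannot {action} a signal in state {state.value}") from None


def is_classifying(state: SignalState) -> bool:
    """Only an active signal samples and labels new traffic."""
    return state is SignalState.ACTIVE


state = apply_action(SignalState.DRAFT, "activate")  # now ACTIVE
state = apply_action(state, "disable")               # now DISABLED, past labels kept
state = apply_action(state, "enable")                # ACTIVE again
```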

Backfill historical data

Live classification only labels traffic that arrives after you activate. To label traffic you already captured, start a manual run (backfill) at any time. Unlike a test, a manual run saves its labels: they’re stored against your traces and tied to the run, exactly like live labels.

Open the manual run dialog

From the signal, choose Manual run / Backfill.

Pick a time range and sample rate

Choose the historical window to apply the signal to, and the share of matching traffic in that window to classify.

Start the run

The run executes in the background, classifying past traffic across the window. Results land in the same place as live labels as the run progresses.

Backfill a representative window first to sanity-check the labels at scale before running it over a long history. A manual run classifies real traffic and counts toward usage.

Read the results

There are a few places to read what a signal found:

The Signals tab is the main table across your agents, with each signal’s current state and headline numbers at a glance.
The agent’s Overview under the Agents tab has per-signal graphs: trends and volume over time, plus a range of metrics for each signal so you can see how a label is moving.
The signal detail view has a table of every classified target (the spans, traces, or sessions the signal labeled). Click into any row to open the underlying trace, span, or session and read the actual conversation that produced the label.

Labeled targets render their label as a colored chip. For a binary signal, “yes” and “no” get distinct colors; for a string signal, each label gets its own color. From there you can:

Filter by label value to pull up just the targets a signal flagged (for example, every session labeled “yes” by a jailbreak signal).
Watch the distribution over time to see how a label trends across hours or days.
Jump straight to the underlying trace, span, or session to see the full context and conversation behind any labeled target.
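If you export labels and want to reproduce these views yourself, filtering and time bucketing are straightforward. The sketch below assumes each label is a `(timestamp, target_id, value)` tuple, which is a made-up shape for illustration rather than an actual export format.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone
from typing import Dict, List, Tuple

LabelRow = Tuple[datetime, str, str]  # (timestamp, target_id, value), hypothetical


def targets_with_label(rows: List[LabelRow], value: str) -> List[str]:
    """Return the IDs of every target that received the given label value."""
    return [target_id for _, target_id, v in rows if v == value]


def distribution_by_hour(rows: List[LabelRow]) -> Dict[datetime, Counter]:
    """Count label values per hour so you can watch the trend over time."""
    buckets: Dict[datetime, Counter] = defaultdict(Counter)
    for timestamp, _, value in rows:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour][value] += 1
    return dict(buckets)


rows = [
    (datetime(2024, 1, 1, 9, 5, tzinfo=timezone.utc), "session-a", "yes"),
    (datetime(2024, 1, 1, 9, 40, tzinfo=timezone.utc), "session-b", "no"),
    (datetime(2024, 1, 1, 10, 15, tzinfo=timezone.utc), "session-c", "yes"),
]
flagged = targets_with_label(rows, "yes")
hourly = distribution_by_hour(rows)
```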

Alert on a signal

Once a signal is running, set up alerts so you hear about changes without watching the dashboard. Alerts are configured per signal, and you can get notified through the Slack integration or by email. An alert fires when a metric crosses a condition over a window. You choose:

The metric. What to watch, depending on the classifier type:
- Label volume (any signal): how many labels came in.
- True rate (binary signals): the share of labels that are “yes.”
- Value count (string signals): how many labels landed on one specific value.
- Value share (string signals): that value’s share of all labels.
The comparison. Either a percentage change versus the prior equal-length window (for example, “this value’s count is up 10% in the last 24 hours”) or an absolute threshold (for example, “true rate is below 80%”).
The window the comparison runs over, from 5 minutes up to 48 hours.
A minimum label count so quiet periods don’t trip the alert on statistically meaningless deltas.
A cooldown so a flapping condition doesn’t bombard you. After a firing, the next one is suppressed until the cooldown elapses.
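The four metrics and the two comparison styles reduce to simple arithmetic over the labels in a window. The sketch below spells that out. The function names and the choice to represent binary labels as the strings "yes" and "no" are assumptions for illustration.

```python
from typing import Optional, Sequence


def label_volume(labels: Sequence[str]) -> int:
    """How many labels came in during the window."""
    return len(labels)


def true_rate(labels: Sequence[str]) -> Optional[float]:
    """Share of binary labels that are "yes"; None when the window is empty."""
    if not labels:
        return None
    return sum(1 for v in labels if v == "yes") / len(labels)


def value_count(labels: Sequence[str], value: str) -> int:
    """How many labels landed on one specific value."""
    return sum(1 for v in labels if v == value)


def value_share(labels: Sequence[str], value: str) -> Optional[float]:
    """That value's share of all labels; None when the window is empty."""
    if not labels:
        return None
    return value_count(labels, value) / len(labels)


def percent_change(current: float, prior: float) -> Optional[float]:
    """Change versus the prior equal-length window, in percent."""
    if prior == 0:
        return None  # undefined when the prior window had nothing to compare to
    return (current - prior) * 100 / prior


binary_window = ["yes", "no", "yes", "yes"]
below_threshold = true_rate(binary_window) < 0.80  # absolute threshold check
count_change = percent_change(current=110, prior=100)  # percentage change check
```

Returning `None` for an empty or zero prior window is one way to keep quiet periods from producing a misleading number, which is the same concern the minimum label count addresses.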

You can backtest an alert against recent history to see when it would have fired before you turn it on, and pause or re-enable any alert at any time.

Start with a wide window, a sensible minimum label count, and a cooldown. Tighten the threshold once you’ve seen how the metric actually moves in the backtest.
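To build intuition for how the minimum label count and cooldown interact, here is a small backtest sketch for an absolute threshold on true rate. The `backtest` function and its input shape (one precomputed data point per evaluation window) are hypothetical simplifications, not the product's evaluation logic.

```python
from typing import List, Tuple

# (evaluation time in seconds, true rate in the window, labels in the window)
WindowPoint = Tuple[int, float, int]


def backtest(
    points: List[WindowPoint],
    threshold: float,
    min_label_count: int,
    cooldown_seconds: int,
) -> List[int]:
    """Return the times at which a "true rate below threshold" alert would fire."""
    fired: List[int] = []
    last_fired = None
    for timestamp, rate, label_count in points:
        if label_count < min_label_count:
            continue  # too few labels for the rate to mean anything
        if rate >= threshold:
            continue  # condition not met
        if last_fired is not None and timestamp - last_fired < cooldown_seconds:
            continue  # still cooling down after the previous firing
        fired.append(timestamp)
        last_fired = timestamp
    return fired


history = [
    (0, 0.90, 50),      # healthy
    (3600, 0.70, 5),    # low rate, but only 5 labels
    (7200, 0.70, 40),   # fires
    (10800, 0.60, 40),  # suppressed by the cooldown
    (21600, 0.50, 40),  # cooldown elapsed, fires again
]
firings = backtest(history, threshold=0.80, min_label_count=20, cooldown_seconds=4 * 3600)
```

In this made-up history the alert fires at 7200 and 21600 seconds. Widening the cooldown or raising the minimum label count removes firings, which is the effect you would look for when tuning against a backtest.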

Versioning and editing

Signals are versioned. Editing a signal (changing its prompt, labels, classifier type, or sample rate) creates a new version rather than mutating the old one, and the dashboard shows each version after every edit. The current version is the one powering live classification, and labels record which version produced them, so you can change a signal’s definition without losing the history of what earlier versions decided.

Scope is the exception: it’s fixed once a signal is created. Everything else is editable, but to label a different unit you create a new signal.

When you no longer need a signal, archive it. Archiving stops it and removes it from your active list while preserving the labels it produced.
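The versioning model described here can be pictured as an append-only list of immutable versions. The sketch below is illustrative only: the `Signal`, `SignalVersion`, and `Label` classes and their fields are assumptions, not the product's data model.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class SignalVersion:
    number: int
    prompt: str
    classifier_type: str
    labels: Tuple[str, ...]
    sample_rate: float


@dataclass(frozen=True)
class Label:
    target_id: str
    value: str
    version: int  # which signal version produced this label


@dataclass
class Signal:
    name: str
    scope: str  # fixed at creation
    versions: List[SignalVersion]

    @property
    def current(self) -> SignalVersion:
        """The version that powers live classification."""
        return self.versions[-1]

    def edit(self, **changes) -> SignalVersion:
        """Append a new version instead of mutating an existing one."""
        if "scope" in changes or "number" in changes:
            raise ValueError("scope and version number cannot be edited")
        new_version = replace(self.current, number=self.current.number + 1, **changes)
        self.versions.append(new_version)
        return new_version


signal = Signal(
    name="User frustration",
    scope="session",
    versions=[SignalVersion(1, "Flag sessions where the user is frustrated.", "binary", (), 0.25)],
)
old_label = Label("session-a", "yes", version=signal.current.number)
v2 = signal.edit(sample_rate=0.5)
```

Because earlier versions are never modified, `old_label` still points at version 1 and its original definition after the edit.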

Templates

To get started quickly, create a signal from a template and edit from there. Built-in templates include:

Template	Type	What it flags
NSFW	Binary	Spans whose content is sexually explicit, graphic, or otherwise not safe for work.
Jailbreak attempt	Binary	Spans where the user tries to bypass the model’s safety guardrails or system instructions.
Laziness / refusal	Binary	Spans where the assistant refuses, stalls, or gives a low-effort non-answer instead of completing the task.
User frustration	Binary	Spans where the user expresses frustration, annoyance, or dissatisfaction with the assistant.
Sentiment	String	The overall sentiment the user expresses: positive, neutral, or negative.
Task outcome	String	Whether the task the user asked for was completed: completed, partial, failed, or abandoned.

Templates are just a starting point. You can write any prompt you want and build a signal from scratch, with either classifier type.

Manage signals from the CLI and MCP

The dashboard isn’t the only way in. Everything above is also available through the Inference CLI and the MCP server, so you can create, run, and read signals from your terminal or straight from an AI coding assistant.

CLI. The inf signals command group lists, creates, edits, activates, disables, and archives signals, tests them, kicks off manual runs, and reads labels and distributions.
MCP. The Inference MCP server exposes the same operations as tools (creating signals, activating and disabling them, running backfills, configuring alerts, and querying labels and distributions), so an assistant can set up and inspect signals on your behalf.

Feed signals into Halo

Signals pair naturally with Halo, our agent-loop optimizer. The targets a signal flags are exactly the traffic worth digging into:

Improve a behavior. Point Halo at the spans, traces, or sessions a signal flagged and ask how to fix what they have in common (the refusals, the frustrated sessions, the failed tasks).
Decide what to measure. Talk to Halo about your traces to surface which signals would be most valuable to add in the first place.

Next steps

Analyze your traces

Run Halo on your traces to find systemic failure modes and concrete fixes.

Set agent identity

Add stable agent IDs so signals attach to the right agent and group cleanly.

Capture more of your stack

Add tracing to more providers, frameworks, and agent runtimes so signals have more to label.

Wrap custom work

Add spans around retrieval, routing, and subprocesses so signals can classify them too.
