Once traces are flowing, signals turn them into structured labels. A signal is a plain-language classifier that you define once and that Catalyst runs continuously against your agent’s spans. Describe what you want to detect (“is this NSFW?”, “did the user get frustrated?”, “was the task completed?”) and every sampled span is labeled automatically by an LLM judge, with the results queryable right alongside your other agent metrics.

This guide assumes you’ve already captured your first trace. Signals only run on spans, so you need tracing installed and traces flowing before there’s anything to label.

How signals work

A signal evaluates spans for one of your agents. When you activate it, Catalyst samples incoming spans for that agent, sends each sampled span’s input and output to an LLM judge with the prompt you wrote, and writes the label back to the span. You can then filter, chart, and break down your traces by that label.
1. You define the signal

Pick a classifier type, write a prompt describing what to look for, and set a sample rate.

2. Catalyst samples matching spans

For an active signal, a fixed share of the agent’s spans (the sample rate) is selected for classification. Sampling is deterministic, so the same span always resolves the same way.

3. A judge labels each span

The judge reads the span’s input and output and returns a label that conforms to your classifier type.

4. Labels land on your traces

Results are stored against each span and surfaced in the dashboard, where you can filter by label value and watch the label distribution over time.
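
To make the deterministic sampling step concrete, here is a minimal Python sketch. It assumes the decision is derived from a hash of the span and signal IDs; that is one common way to get stable sampling, not a description of how Catalyst implements it. The function name and ID formats are hypothetical.

```python
import hashlib


def is_sampled(span_id: str, signal_id: str, sample_rate: float) -> bool:
    """Return True if this span falls inside the sampled share.

    Assumption: the decision comes from a hash of the signal and span IDs.
    This illustrates the idea only; it is not Catalyst's actual algorithm.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    digest = hashlib.sha256(f"{signal_id}:{span_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    # Spans whose bucket falls below the threshold are selected.
    threshold = int(sample_rate * 2**64)
    return bucket < threshold


span_ids = [f"span-{i}" for i in range(10_000)]
sampled = [s for s in span_ids if is_sampled(s, "nsfw", 0.25)]
print(len(sampled) / len(span_ids))  # roughly 0.25
```

Because the result depends only on the IDs and the rate, asking again about the same span gives the same answer. In this sketch, a span selected at a lower rate also stays selected when the rate is raised.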

Two classifier types

When you create a signal you choose how it labels spans.

Binary (yes / no)

A true/false classifier. Use it for “is this X or not” questions: NSFW content, jailbreak attempts, refusals, or any flag you want to filter on. No labels to configure — just a prompt.

String (enumerated labels)

A classifier that returns one of a fixed set of labels you define. Use it when there are more than two outcomes: sentiment (positive / neutral / negative), task outcome (completed / partial / failed / abandoned), and so on. Define between 2 and 10 labels.
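
The difference between the two types comes down to which answers count as valid. The sketch below checks a judge’s raw answer against a classifier type. The helper name and the "binary" / "string" type strings are illustrative assumptions, not part of a documented Catalyst API.

```python
from typing import Optional, Sequence, Union

Label = Union[bool, str]


def validate_label(
    raw: str,
    classifier_type: str,
    labels: Optional[Sequence[str]] = None,
) -> Label:
    """Check a judge's raw answer against the classifier type.

    Hypothetical helper: names and type strings are for illustration only.
    """
    answer = raw.strip().lower()
    if classifier_type == "binary":
        # A binary signal only accepts yes or no.
        if answer in ("yes", "no"):
            return answer == "yes"
        raise ValueError(f"binary signal expects yes/no, got {raw!r}")
    if classifier_type == "string":
        allowed = {label.lower(): label for label in (labels or [])}
        if not 2 <= len(allowed) <= 10:
            raise ValueError("string signal needs between 2 and 10 labels")
        # A string signal only accepts one of its configured labels.
        if answer in allowed:
            return allowed[answer]
        raise ValueError(f"{raw!r} is not one of {sorted(allowed.values())}")
    raise ValueError(f"unknown classifier type: {classifier_type!r}")


print(validate_label("Yes", "binary"))  # True
print(validate_label("neutral", "string", ["positive", "neutral", "negative"]))  # neutral
```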

Create a signal

Signals live in the Agents workspace, scoped to a specific agent.
1. Open the Signals view for your agent

From the Agents tab, pick the agent whose spans you want to label, then open its Signals view and create a new signal.

2. Name the signal

Give it a short, descriptive name like NSFW or User frustration. The name is how the label shows up everywhere else in the dashboard.

3. Choose a classifier type

Pick Binary (yes / no) or String (enumerated labels). For a string classifier, add the 2–10 labels the judge is allowed to return.

4. Write the prompt

Describe what the signal should classify. This is the instruction the judge follows on every span, so be specific about what counts as each outcome. For a binary signal, describe what makes a span a “yes.” For a string signal, describe when to pick each label.

5. Set the sample rate

The sample rate is the share of matching spans that get classified; lower rates cost less. Start lower on high-volume agents and raise it once you trust the labels. Common presets are 10%, 25%, 50%, and 100%.

6. Save as a draft or activate

Save draft keeps the signal unpublished so you can keep tuning it. Activate publishes it and starts live classification on new spans.
Don’t want to start from scratch? Use Start from a template to prefill the classifier type, prompt, and labels for a common signal, then edit from there. See Templates below.
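
The fields you fill in while creating a signal can be pictured as a small record with a few consistency rules. The sketch below is a hypothetical in-memory model, not a Catalyst SDK type. The label count and the binary/string split come from this guide, while the non-empty checks and the lower bound on the sample rate are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SignalDraft:
    """Hypothetical in-memory form of the fields in the create flow."""

    name: str
    classifier_type: str  # "binary" or "string" (illustrative values)
    prompt: str
    sample_rate: float  # 0.10 means 10% of matching spans
    labels: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means ready to save."""
        issues = []
        if not self.name.strip():
            issues.append("name is empty")
        if not self.prompt.strip():
            issues.append("prompt is empty")
        # Assumption: a rate of 0% is treated as a mistake in this sketch.
        if not 0.0 < self.sample_rate <= 1.0:
            issues.append("sample rate must be above 0% and at most 100%")
        if self.classifier_type == "binary":
            if self.labels:
                issues.append("binary signals take no labels")
        elif self.classifier_type == "string":
            if not 2 <= len(self.labels) <= 10:
                issues.append("string signals need 2 to 10 labels")
        else:
            issues.append("classifier type must be 'binary' or 'string'")
        return issues


frustration = SignalDraft(
    name="User frustration",
    classifier_type="binary",
    prompt=(
        "Answer yes if the user expresses frustration, annoyance, or "
        "dissatisfaction with the assistant. Answer no otherwise, including "
        "when the user is neutral or simply asks a follow-up question."
    ),
    sample_rate=0.10,
)
print(frustration.problems())  # []
```

The example prompt spells out both what counts as a “yes” and what does not, which is the kind of specificity step 4 asks for.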

Test before you activate

Before you commit a signal to live traffic, run it against recent spans to preview how it labels them. A test run classifies a small sample of spans synchronously and shows you the label distribution and per-span results. It is a preview only: it doesn’t change your signal or store any labels.
1. Open the tester

From the signal, choose Test it.

2. Pick a sample size and time range

Choose how many recent spans to classify (1–100) and the window to pull them from.

3. Read the preview

You’ll see the label distribution across the sample plus a per-span breakdown. If the labels don’t match your intent, adjust the prompt or labels and test again.
Test results are not persisted. They exist only to help you tune the prompt before activating.
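
The label distribution in a preview is each label’s share of the sampled spans. If you want to reason about a preview outside the dashboard, a sketch like the following reproduces that calculation. The sample data is made up, and the function is not part of any Catalyst API.

```python
from collections import Counter
from typing import Dict, List


def label_distribution(labels: List[str]) -> Dict[str, float]:
    """Return each label's share of the sample, most frequent first."""
    if not labels:
        return {}
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.most_common()}


# Made-up preview of 10 spans for a "Task outcome" signal.
preview = ["completed"] * 6 + ["partial"] * 2 + ["failed"] * 2
print(label_distribution(preview))
# {'completed': 0.6, 'partial': 0.2, 'failed': 0.2}
```

In this made-up sample the “abandoned” label never appears. With only 10 spans that may be chance, but on a larger sample a label that never shows up is a hint to revisit how the prompt describes it.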

Activate and run live

Activating a signal starts live classification: as new spans arrive for the agent, the configured share of them gets sampled and labeled automatically. You don’t have to do anything else; labels accumulate as traffic flows.

A signal is always in one of three states:
State | What it means
Draft | Saved but not running. No spans are being classified.
Active | Live. New matching spans are sampled and labeled.
Disabled | Paused. Live classification has stopped, but past labels are kept.
You can disable an active signal at any time to stop classification without losing the labels you’ve already collected, and re-enable it later.
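
The three states and the moves between them can be sketched as a small state machine. This models only the transitions this guide describes (activate, disable, re-enable); whether Catalyst allows others is not covered here, so the sketch rejects them. All names are hypothetical.

```python
from enum import Enum


class SignalState(Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    DISABLED = "disabled"


# Transitions described in this guide. Any others are not documented here,
# so this sketch treats them as invalid.
ALLOWED = {
    (SignalState.DRAFT, SignalState.ACTIVE),     # activate
    (SignalState.ACTIVE, SignalState.DISABLED),  # disable
    (SignalState.DISABLED, SignalState.ACTIVE),  # re-enable
}


def transition(current: SignalState, target: SignalState) -> SignalState:
    """Return the new state, or raise if the move is not in ALLOWED."""
    if (current, target) not in ALLOWED:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def classifies_new_spans(state: SignalState) -> bool:
    """Only an active signal samples and labels incoming spans."""
    return state is SignalState.ACTIVE


state = transition(SignalState.DRAFT, SignalState.ACTIVE)
state = transition(state, SignalState.DISABLED)
print(state.value, classifies_new_spans(state))  # disabled False
```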

Backfill historical data

Live classification only labels spans that arrive after you activate. To label spans you already captured, start a manual run (backfill). Unlike a test, a manual run saves its labels: they’re stored against your spans, just as live labels are, and tied to the run.
1. Open the manual run dialog

From the signal, choose Manual run / Backfill.

2. Pick a time range and sample rate

Choose the historical window to apply the signal to, and the share of matching spans in that window to classify.

3. Start the run

The run executes in the background, classifying past spans across the window. Results land in the same place as live labels as the run progresses.
Backfill a representative window first to sanity-check the labels at scale before running the signal over a long history. A manual run classifies real spans and counts toward usage.
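
Because a manual run counts toward usage, it helps to estimate its size before you start. The arithmetic is the number of matching spans in the window multiplied by the sample rate. The sketch below uses made-up volumes and a hypothetical helper; it says nothing about how usage is actually billed.

```python
def estimated_classifications(matching_spans: int, sample_rate: float) -> int:
    """Approximate number of spans a run will send to the judge."""
    if matching_spans < 0:
        raise ValueError("matching_spans cannot be negative")
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    return round(matching_spans * sample_rate)


# Made-up volume: 40,000 matching spans per day, over 1 day and over 90 days.
spans_per_day = 40_000
for days in (1, 90):
    for rate in (0.10, 0.25, 0.50, 1.00):
        total = estimated_classifications(spans_per_day * days, rate)
        print(f"{days:>2} days at {rate:.0%}: about {total:,} classifications")
```

With these numbers, a one-day window at 10% is about 4,000 classifications, while 90 days at 100% is 3,600,000. That gap is why a short representative window is a sensible first run.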

Read the results

Labeled spans show up in the dashboard with their label rendered as a colored chip. For a binary signal, “yes” and “no” get distinct colors; for a string signal, each label gets its own color. From there you can:
  • Filter by label value to pull up just the spans a signal flagged (for example, every span labeled “yes” by a jailbreak signal).
  • Watch the distribution over time to see how a label trends across hours or days.
  • Jump straight to the trace behind any labeled span to see the full context.
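
The same three operations can be sketched over a list of labeled spans, for example if you work with them outside the dashboard. The record shape below is a hypothetical one chosen for illustration; it is not a documented Catalyst export format.

```python
from collections import Counter, defaultdict
from datetime import datetime
from typing import Dict, List, NamedTuple


class LabeledSpan(NamedTuple):
    """Hypothetical shape of a labeled span."""

    span_id: str
    trace_id: str
    timestamp: datetime
    label: str


def filter_by_label(spans: List[LabeledSpan], value: str) -> List[LabeledSpan]:
    """Keep only the spans whose label equals the given value."""
    return [span for span in spans if span.label == value]


def hourly_distribution(spans: List[LabeledSpan]) -> Dict[datetime, Counter]:
    """Count labels per hour, with hours in chronological order."""
    buckets: Dict[datetime, Counter] = defaultdict(Counter)
    for span in spans:
        hour = span.timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour][span.label] += 1
    return dict(sorted(buckets.items()))


# Made-up labels from a jailbreak signal.
spans = [
    LabeledSpan("s1", "t1", datetime(2024, 5, 1, 9, 5), "no"),
    LabeledSpan("s2", "t2", datetime(2024, 5, 1, 9, 40), "yes"),
    LabeledSpan("s3", "t3", datetime(2024, 5, 1, 10, 15), "yes"),
]
flagged = filter_by_label(spans, "yes")
# The trace IDs are what you would follow to see the full context.
print([span.trace_id for span in flagged])  # ['t2', 't3']
```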

Versioning and editing

Signals are versioned. Editing a signal (changing its prompt, labels, classifier type, or sample rate) creates a new version rather than mutating the old one. The current version is the one powering live classification, and labels record which version produced them, so you can change a signal’s definition without losing the history of what earlier versions decided.

When you no longer need a signal, archive it. Archiving stops it and removes it from your active list while preserving the labels it produced.
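
One way to picture versioning is as an append-only history: an edit adds a new version, and each label remembers the version that produced it. The sketch below is a hypothetical model of that idea, with made-up class and field names, not Catalyst’s storage schema.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class SignalVersion:
    number: int
    prompt: str
    sample_rate: float


@dataclass
class Signal:
    name: str
    versions: List[SignalVersion] = field(default_factory=list)

    @property
    def current(self) -> SignalVersion:
        """The latest version, which powers live classification."""
        return self.versions[-1]

    def edit(self, **changes) -> SignalVersion:
        """Append a new version; earlier versions are never mutated."""
        new_version = replace(
            self.current, number=self.current.number + 1, **changes
        )
        self.versions.append(new_version)
        return new_version


@dataclass(frozen=True)
class StoredLabel:
    span_id: str
    value: str
    version: int  # which signal version produced this label


signal = Signal("NSFW", [SignalVersion(1, "Flag sexually explicit content.", 0.10)])
old_label = StoredLabel("s1", "yes", signal.current.number)
signal.edit(prompt="Flag sexually explicit or graphic content.")
print(signal.current.number, old_label.version)  # 2 1
```

The label created before the edit still points at version 1, so you can tell which definition produced it.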

Templates

To get started quickly, create a signal from a template and edit from there. Built-in templates include:
Template | Type | What it flags
NSFW | Binary | Spans whose content is sexually explicit, graphic, or otherwise not safe for work.
Jailbreak attempt | Binary | Spans where the user tries to bypass the model’s safety guardrails or system instructions.
Laziness / refusal | Binary | Spans where the assistant refuses, stalls, or gives a low-effort non-answer instead of completing the task.
User frustration | Binary | Spans where the user expresses frustration, annoyance, or dissatisfaction with the assistant.
Sentiment | String | The overall sentiment the user expresses: positive, neutral, or negative.
Task outcome | String | Whether the task the user asked for was completed: completed, partial, failed, or abandoned.

Next steps

  • Analyze your traces: Run Halo on your traces to find systemic failure modes and concrete fixes.
  • Set agent identity: Add stable agent IDs so signals attach to the right agent and group cleanly.
  • Capture more of your stack: Add tracing to more providers, frameworks, and agent runtimes so signals have more to label.
  • Wrap custom work: Add spans around retrieval, routing, and subprocesses so signals can classify them too.