How signals work
A signal evaluates spans for one of your agents. When you activate it, Catalyst samples incoming spans for that agent, sends each sampled span’s input and output to an LLM judge with the prompt you wrote, and writes the label back to the span. You can then filter, chart, and break down your traces by that label.
You define the signal
Pick a classifier type, write a prompt describing what to look for, and set a sample rate.
Catalyst samples matching spans
For an active signal, a fixed share of the agent’s spans (the sample rate) is selected for classification. Sampling is deterministic, so a given span always resolves the same way: it is either always selected or always skipped.
A judge labels each span
The judge reads the span’s input and output and returns a label that conforms to your classifier type.
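The page does not say how Catalyst makes sampling deterministic. One common way to get that property is to hash each span ID into a bucket and compare the bucket with the sample rate. The sketch below assumes that approach; `is_sampled`, the ID strings, and the choice of SHA-256 are illustrative and not part of Catalyst’s API.

```python
import hashlib


def is_sampled(signal_id: str, span_id: str, sample_rate: float) -> bool:
    """Return True if this span falls inside the signal's sample.

    Hypothetical helper: hashes (signal_id, span_id) to an integer and
    compares it with the sample rate. The same inputs always produce the
    same answer, which is the deterministic behaviour described above.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    digest = hashlib.sha256(f"{signal_id}:{span_id}".encode("utf-8")).digest()
    # First 8 bytes as an unsigned integer, uniform over [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < sample_rate * 2**64


span_ids = [f"span-{i}" for i in range(10_000)]
picked = [s for s in span_ids if is_sampled("nsfw", s, 0.25)]
print(f"sampled {len(picked)} of {len(span_ids)} spans")
```

With this scheme, raising the rate only adds spans to the sample: every span picked at 25% is still picked at 50%.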
Two classifier types
When you create a signal you choose how it labels spans.
Binary (yes / no)
A true/false classifier. Use it for “is this X or not” questions: NSFW content, jailbreak attempts, refusals, or any flag you want to filter on. No labels to configure — just a prompt.
String (enumerated labels)
A classifier that returns one of a fixed set of labels you define. Use it when there are more than two outcomes: sentiment (positive / neutral / negative), task outcome (completed / partial / failed / abandoned), and so on. Define between 2 and 10 labels.
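The two classifier types can be modelled as a small data structure that checks whether a judge’s answer conforms to the type. This is a sketch only: the `Classifier` class, its field names, and the uniqueness check on labels are assumptions, while the 2-to-10 label limit comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classifier:
    kind: str                     # "binary" or "string"
    labels: tuple[str, ...] = ()  # only used by string classifiers

    def __post_init__(self) -> None:
        if self.kind == "binary":
            if self.labels:
                raise ValueError("binary classifiers take no labels")
        elif self.kind == "string":
            if not 2 <= len(self.labels) <= 10:
                raise ValueError("string classifiers need 2 to 10 labels")
            # Assumption: duplicate labels are rejected.
            if len(set(self.labels)) != len(self.labels):
                raise ValueError("labels must be unique")
        else:
            raise ValueError(f"unknown classifier kind: {self.kind!r}")

    def accepts(self, label: object) -> bool:
        """Check that a judge's answer conforms to this classifier type."""
        if self.kind == "binary":
            return isinstance(label, bool)
        return label in self.labels


jailbreak = Classifier("binary")
sentiment = Classifier("string", ("positive", "neutral", "negative"))
print(jailbreak.accepts(True), sentiment.accepts("neutral"), sentiment.accepts("angry"))
```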
Create a signal
Signals live in the Agents workspace, scoped to a specific agent.
Open the Signals view for your agent
From the Agents tab, pick the agent whose spans you want to label, then open its Signals view and create a new signal.
Name the signal
Give it a short, descriptive name like NSFW or User frustration. The name is how the label shows up everywhere else in the dashboard.
Choose a classifier type
Pick Binary (yes / no) or String (enumerated labels). For a string classifier, add the 2–10 labels the judge is allowed to return.
Write the prompt
Describe what the signal should classify. This is the instruction the judge follows on every span, so be specific about what counts as each outcome. For a binary signal, describe what makes a span a “yes.” For a string signal, describe when to pick each label.
Set the sample rate
The sample rate is the share of matching spans that get classified — lower rates cost less. Start lower on high-volume agents and raise it once you trust the labels. Common presets are 10%, 25%, 50%, and 100%.
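To see why lower rates cost less, it helps to estimate how many spans reach the judge at each preset. The arithmetic below is a rough sketch: the daily span volume is an illustrative number, and `expected_judge_calls` is a hypothetical helper, not a Catalyst function.

```python
def expected_judge_calls(spans_per_day: int, sample_rate: float) -> int:
    """Expected number of spans sent to the judge per day."""
    return round(spans_per_day * sample_rate)


SPANS_PER_DAY = 200_000  # illustrative traffic volume, not a real figure

for rate in (0.10, 0.25, 0.50, 1.00):
    calls = expected_judge_calls(SPANS_PER_DAY, rate)
    print(f"{rate:>4.0%} -> {calls:,} judge calls/day")
```

Judge calls scale linearly with the rate, so moving from 10% to 100% multiplies classification volume by ten.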
Test before you activate
Before you commit a signal to live traffic, run it against recent spans to preview how it labels them. A test run classifies a small sample of spans synchronously and shows you the label distribution and per-span results. It is a preview only: it doesn’t change your signal and stores no labels.
Pick a sample size and time range
Choose how many recent spans to classify (1–100) and the window to pull them from.
Test results are not persisted. They exist only to help you tune the prompt before activating.
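The label distribution a test run shows is simply the share of each label among the sampled spans. A minimal sketch of that calculation, using made-up per-span results for a sentiment signal (the data and the `label_distribution` helper are hypothetical):

```python
from collections import Counter


def label_distribution(labels: list[str]) -> dict[str, float]:
    """Share of each label among the test-run results, most common first."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.most_common()}


# Hypothetical per-span results from a 10-span test of a sentiment signal.
test_labels = [
    "negative", "neutral", "negative", "positive", "neutral",
    "negative", "neutral", "negative", "positive", "negative",
]

for label, share in label_distribution(test_labels).items():
    print(f"{label:<9} {share:.0%}")
```

A distribution that looks skewed here, such as almost everything landing on one label, is usually a cue to tighten the prompt before activating.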
Activate and run live
Activating a signal starts live classification: as new spans arrive for the agent, the configured share of them gets sampled and labeled automatically. You don’t have to do anything else: labels accumulate as traffic flows. A signal is always in one of three states:

| State | What it means |
|---|---|
| Draft | Saved but not running. No spans are being classified. |
| Active | Live. New matching spans are sampled and labeled. |
| Disabled | Paused. Live classification has stopped, but past labels are kept. |
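
The three states can be read as a small state machine. The states come from the table above, but the allowed transitions in this sketch are an assumption: the page does not spell out which moves are permitted, for example whether a disabled signal can be reactivated.

```python
from enum import Enum


class SignalState(Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    DISABLED = "disabled"


# Assumed transitions; the table names the states but not the allowed moves.
ALLOWED = {
    SignalState.DRAFT: {SignalState.ACTIVE},
    SignalState.ACTIVE: {SignalState.DISABLED},
    SignalState.DISABLED: {SignalState.ACTIVE},
}


def transition(current: SignalState, target: SignalState) -> SignalState:
    """Return the new state, or raise if the move is not in ALLOWED."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def classifies_live(state: SignalState) -> bool:
    """Only an active signal samples and labels new spans."""
    return state is SignalState.ACTIVE


state = transition(SignalState.DRAFT, SignalState.ACTIVE)
print(state.value, classifies_live(state))
```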
Backfill historical data
Live classification only labels spans that arrive after you activate. To label spans you already captured, start a manual run (backfill). Unlike a test, a manual run saves its labels: they’re stored against your spans and tied to the run, exactly like live labels.
Pick a time range and sample rate
Choose the historical window to apply the signal to, and the share of matching spans in that window to classify.
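A backfill boils down to two filters, a time window and a sample rate, with each selected span tied to the run. The sketch below assumes a half-open window and hash-based sampling. All names (`plan_backfill`, `in_sample`, the run ID format) are hypothetical, and the actual classification step is left out.

```python
import hashlib
from datetime import datetime, timezone


def in_sample(span_id: str, sample_rate: float) -> bool:
    """Assumed hash-based sampling: same span ID, same decision."""
    digest = hashlib.sha256(span_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") < sample_rate * 2**64


def plan_backfill(spans, start, end, sample_rate, run_id):
    """Pick spans inside [start, end) and tag each pick with the run ID."""
    return [
        {"span_id": span_id, "run_id": run_id}
        for span_id, started_at in spans
        if start <= started_at < end and in_sample(span_id, sample_rate)
    ]


# Hypothetical historical spans: (span_id, start time).
spans = [
    ("span-a", datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)),
    ("span-b", datetime(2024, 5, 2, 14, 30, tzinfo=timezone.utc)),
    ("span-c", datetime(2024, 5, 9, 8, 15, tzinfo=timezone.utc)),
]
window = (
    datetime(2024, 5, 1, tzinfo=timezone.utc),
    datetime(2024, 5, 8, tzinfo=timezone.utc),
)

plan = plan_backfill(spans, *window, sample_rate=1.0, run_id="run-001")
print([p["span_id"] for p in plan])
```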
Read the results
Labeled spans show up in the dashboard with their label rendered as a colored chip. For a binary signal, “yes” and “no” get distinct colors; for a string signal, each label gets its own color. From there you can:
- Filter by label value to pull up just the spans a signal flagged (for example, every span labeled “yes” by a jailbreak signal).
- Watch the distribution over time to see how a label trends across hours or days.
- Jump straight to the trace behind any labeled span to see the full context.
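The first two operations in the list above amount to a filter and a group-by. As a rough illustration outside the dashboard, here is the same logic over a handful of made-up labeled spans; the tuple layout and function names are assumptions, not an export format Catalyst defines.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical labeled spans: (span_id, timestamp, label).
labeled = [
    ("s1", datetime(2024, 5, 1, 9, 5), "yes"),
    ("s2", datetime(2024, 5, 1, 9, 40), "no"),
    ("s3", datetime(2024, 5, 1, 10, 15), "yes"),
    ("s4", datetime(2024, 5, 1, 10, 50), "yes"),
]


def filter_by_label(rows, label):
    """Keep only the spans carrying the given label."""
    return [row for row in rows if row[2] == label]


def hourly_distribution(rows):
    """Count labels per hour so a trend over time becomes visible."""
    buckets = defaultdict(Counter)
    for _, ts, label in rows:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour][label] += 1
    return dict(buckets)


print([span_id for span_id, _, _ in filter_by_label(labeled, "yes")])
for hour, counts in sorted(hourly_distribution(labeled).items()):
    print(hour.isoformat(), dict(counts))
```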
Versioning and editing
Signals are versioned. Editing a signal (changing its prompt, labels, classifier type, or sample rate) creates a new version rather than mutating the old one. The current version is the one powering live classification, and labels record which version produced them, so you can change a signal’s definition without losing the history of what earlier versions decided. When you no longer need a signal, archive it. Archiving stops it and removes it from your active list while preserving the labels it produced.
Templates
To get started quickly, create a signal from a template and edit from there. Built-in templates include:

| Template | Type | What it flags |
|---|---|---|
| NSFW | Binary | Spans whose content is sexually explicit, graphic, or otherwise not safe for work. |
| Jailbreak attempt | Binary | Spans where the user tries to bypass the model’s safety guardrails or system instructions. |
| Laziness / refusal | Binary | Spans where the assistant refuses, stalls, or gives a low-effort non-answer instead of completing the task. |
| User frustration | Binary | Spans where the user expresses frustration, annoyance, or dissatisfaction with the assistant. |
| Sentiment | String | The overall sentiment the user expresses — positive, neutral, or negative. |
| Task outcome | String | Whether the task the user asked for was completed — completed, partial, failed, or abandoned. |
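
Starting from a template and then editing it ties this section to the versioning behaviour described earlier: each edit appends a new version and leaves the old ones intact. The sketch below models that with immutable records. `SignalVersion`, `edit`, the prompt text, and the starting sample rate are all illustrative assumptions; only the template’s name, type, and labels come from the table above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SignalVersion:
    version: int
    name: str
    kind: str
    prompt: str
    sample_rate: float
    labels: tuple[str, ...] = ()


# Prompt and sample rate are placeholders, not the built-in template's values.
SENTIMENT_TEMPLATE = SignalVersion(
    version=1,
    name="Sentiment",
    kind="string",
    prompt="Classify the overall sentiment the user expresses.",
    sample_rate=0.10,
    labels=("positive", "neutral", "negative"),
)


def edit(history: list[SignalVersion], **changes) -> list[SignalVersion]:
    """Append a new version instead of mutating the current one."""
    current = history[-1]
    return history + [replace(current, version=current.version + 1, **changes)]


history = [SENTIMENT_TEMPLATE]
history = edit(history, sample_rate=0.25)
history = edit(history, prompt="Classify the sentiment of the user's last message.")

for v in history:
    print(v.version, v.sample_rate, v.prompt)
```

Because earlier versions are never modified, a label that records its version number can always be traced back to the exact prompt and settings that produced it.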
Next steps
Analyze your traces
Run Halo on your traces to find systemic failure modes and concrete fixes.
Set agent identity
Add stable agent IDs so signals attach to the right agent and group cleanly.
Capture more of your stack
Add tracing to more providers, frameworks, and agent runtimes so signals have more to label.
Wrap custom work
Add spans around retrieval, routing, and subprocesses so signals can classify them too.