Before you start
Signals run on captured traffic, so you need traces flowing first. This guide assumes you’ve already captured your first trace. You also need:
- An agent with traces installed. Signals are always scoped to a single agent, so there is nothing to label until that agent is emitting traces.
- A stable, consistent agentId. Signals are per agent, so your traces need a consistent agent identity to group by. Set the agentId once and keep it the same across runs. If you instrumented with the CLI this is usually already set. See Set agent identity.
- A sessionId, if you want session-scoped signals. Session signals classify a whole conversation, which requires that your traces carry a sessionId (your conversation or chat ID) so Catalyst can assemble the conversation. See Choose a scope.
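To make the identity requirements concrete, here is a minimal sketch of how traces that share an agentId and sessionId could be grouped into conversations. It is an illustration only, not Catalyst's implementation: the record shape, the traceId field, and the group_sessions helper are assumptions, and only agentId and sessionId come from this guide.

```python
from collections import defaultdict

# Hypothetical trace records. Only agentId and sessionId are named in this
# guide; traceId and the dict layout are illustrative assumptions.
traces = [
    {"traceId": "t1", "agentId": "support-bot", "sessionId": "chat-42"},
    {"traceId": "t2", "agentId": "support-bot", "sessionId": "chat-42"},
    {"traceId": "t3", "agentId": "support-bot", "sessionId": "chat-77"},
    {"traceId": "t4", "agentId": "support-bot", "sessionId": None},
]


def group_sessions(records):
    """Group trace IDs by (agentId, sessionId), skipping traces with no sessionId."""
    sessions = defaultdict(list)
    for rec in records:
        if rec.get("sessionId"):
            sessions[(rec["agentId"], rec["sessionId"])].append(rec["traceId"])
    return dict(sessions)


sessions = group_sessions(traces)
for key, trace_ids in sessions.items():
    print(key, trace_ids)
```

In this sketch the trace without a sessionId cannot be placed in any conversation, which is why a session-scoped signal would have nothing to label for it.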
Where to find signals
You can get to signals two ways in the dashboard:
- The Signals tab, which lists every signal and is the main table view across your agents.
- The Agents tab, where you pick an agent and open its Signals view to see and create signals scoped to just that agent.
How signals work
A signal evaluates traffic for one of your agents. When you activate it, Catalyst samples incoming traffic for that agent, sends each sampled target’s input and output to an LLM judge with the prompt you wrote, and writes the label back. You can then filter, chart, and break down your traces by that label.
You define the signal
Pick a scope and classifier type, write a prompt describing what to look for, and set a sample rate.
Catalyst samples matching traffic
For an active signal, a deterministic share of the agent’s traffic (the sample rate) is selected for classification. Sampling is deterministic, so the same target always resolves the same way.
A judge labels each target
The judge reads the input and output of each sampled span, trace, or session and returns a label that conforms to your classifier type.
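The sampling step above says that the same target always resolves the same way. One common way to get that property is to hash a stable identifier and compare it against the sample rate. The sketch below shows that technique. It is an assumption about how deterministic sampling can be done, not a description of Catalyst's internals, and is_sampled, signal_id, and target_id are hypothetical names.

```python
import hashlib


def is_sampled(signal_id: str, target_id: str, sample_rate: float) -> bool:
    """Deterministically decide whether a target falls inside the sampled share.

    sample_rate is a fraction between 0.0 and 1.0 (0.25 means 25%).
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    digest = hashlib.sha256(f"{signal_id}:{target_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer bucket in [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    # The target is sampled when its bucket falls below the rate's threshold.
    return bucket < int(sample_rate * 2**64)


# The same inputs always give the same decision.
print(is_sampled("sig-1", "trace-abc", 0.25))
print(is_sampled("sig-1", "trace-abc", 0.25))
```

A side effect of this particular threshold design is that any target sampled at a lower rate is also sampled at every higher rate, so raising the rate only adds targets.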
Choose a scope
A signal’s scope is the unit it labels. You pick it when you create the signal, and it’s fixed for the life of the signal (everything else is editable). Scope determines how much context the judge sees on each call.
Span
A single model call. The narrowest scope. Useful for narrow, call-level checks, but often too granular if you care about the interaction as a whole.
Trace
One turn, or request. The judge sees the whole turn rather than a single call. A good fit for request-level outcomes.
Session
The full conversation. Usually the most useful scope for understanding a user, since the judge sees the entire back-and-forth. Requires a sessionId on your traces so the conversation can be assembled.
Session scope only works if your traces carry a sessionId. If you haven’t set one, add it before creating a session-scoped signal. See Set agent identity.
Two classifier types
When you create a signal you choose how it labels each target.
Binary (yes / no)
A true/false classifier. Use it for “is this X or not” questions: NSFW content, jailbreak attempts, refusals, or any flag you want to filter on. No labels to configure, just a prompt.
String (enumerated labels)
A classifier that returns one of a fixed set of labels you define. Use it when there are more than two outcomes: sentiment (positive / neutral / negative), task outcome (completed / partial / failed / abandoned), and so on. Define between 2 and 10 labels.
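The two classifier types can be pictured as a small data structure with a validation rule. The sketch below is hypothetical: the Classifier class, its field names, and the uniqueness check are assumptions made for illustration. Only the binary/string split and the 2 to 10 label limit come from this guide.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Classifier:
    kind: str  # "binary" or "string"
    labels: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Check the definition against the rules described in this guide."""
        if self.kind == "binary":
            if self.labels:
                raise ValueError("a binary classifier has no labels to configure")
        elif self.kind == "string":
            if not 2 <= len(self.labels) <= 10:
                raise ValueError("a string classifier needs between 2 and 10 labels")
            # Assumption: duplicate labels would be ambiguous, so reject them.
            if len(set(self.labels)) != len(self.labels):
                raise ValueError("labels must be unique")
        else:
            raise ValueError(f"unknown classifier kind: {self.kind!r}")

    def accepts(self, output: Union[bool, str]) -> bool:
        """Return True if a judge output conforms to this classifier type."""
        if self.kind == "binary":
            return isinstance(output, bool)
        return output in self.labels


sentiment = Classifier("string", ["positive", "neutral", "negative"])
sentiment.validate()
print(sentiment.accepts("neutral"))  # a configured label
print(sentiment.accepts("angry"))    # not one of the configured labels
```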
Create a signal
Create a signal from the Signals tab, or from a specific agent’s Signals view under the Agents tab. Either way the signal is scoped to one agent.
Open the signal editor for your agent
From the Signals tab or an agent’s Signals view, pick the agent whose traffic you want to label and create a new signal.
Name the signal
Give it a short, descriptive name like NSFW or User frustration. The name is how the label shows up everywhere else in the dashboard.
Choose a scope
Pick Span, Trace, or Session (see Choose a scope). This is the one setting you can’t change later, so pick the unit you actually want to reason about. Session is usually the most useful for understanding a user.
Choose a classifier type
Pick Binary (yes / no) or String (enumerated labels). For a string classifier, add the 2 to 10 labels the judge is allowed to return.
Write the prompt
Describe what the signal should classify. This is the instruction the judge follows on every target, so be specific about what counts as each outcome. For a binary signal, describe what makes a target a “yes.” For a string signal, describe when to pick each label.
Set the sample rate
The sample rate is the share of matching traffic that gets classified, and lower rates spend fewer credits. At 100% the signal runs on every target, which is often unnecessary at high volume. At a lower rate like 25% or 50%, it runs only on that share of incoming traffic. Start lower on high-volume agents and raise the rate once you trust the labels. Common presets are 10%, 25%, 50%, and 100%.
Test before you activate
Before you commit a signal to live traffic, run it against recent traffic to preview how it labels it. A test run classifies a small sample synchronously and shows you the label distribution and per-target results. It’s a preview only: it doesn’t change your signal or store any labels.
Pick a sample size and time range
Choose how many recent targets to classify (1 to 100) and the window to pull them from.
Test results are not persisted. They exist only to help you tune the prompt before activating.
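A label distribution is just the share of targets that landed on each label. The sketch below computes one from a handful of per-target results. The result shape (targetId and label keys) and the label_distribution helper are assumptions for illustration, not the format Catalyst returns.

```python
from collections import Counter


def label_distribution(results):
    """Return each label's share of the classified targets."""
    counts = Counter(r["label"] for r in results)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Hypothetical per-target results from a test run of a task-outcome signal.
test_results = [
    {"targetId": "s1", "label": "completed"},
    {"targetId": "s2", "label": "completed"},
    {"targetId": "s3", "label": "failed"},
    {"targetId": "s4", "label": "partial"},
]

distribution = label_distribution(test_results)
print(distribution)
```

If one label dominates in a way you did not expect, that is usually a cue to tighten the prompt before activating.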
Activate and run live
Activating a signal starts live classification: as new traffic arrives for the agent, the configured share of it gets sampled and labeled automatically. You don’t have to do anything else, and labels accumulate as traffic flows. A signal is always in one of three states:
| State | What it means |
|---|---|
| Draft | Saved but not running. Nothing is being classified. |
| Active | Live. New matching traffic is sampled and labeled. |
| Disabled | Paused. Live classification has stopped, but past labels are kept. |
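The three states can be modeled as a small state machine. In the sketch below the state names come from the table above, but the allowed transitions are an assumption (draft to active, active to disabled, disabled back to active), since this guide does not spell them out. SignalState, ALLOWED, transition, and classifies_live are hypothetical names.

```python
from enum import Enum


class SignalState(Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    DISABLED = "disabled"


# Assumed transitions; the guide names the states but not the full rules.
ALLOWED = {
    SignalState.DRAFT: {SignalState.ACTIVE},
    SignalState.ACTIVE: {SignalState.DISABLED},
    SignalState.DISABLED: {SignalState.ACTIVE},
}


def transition(current: SignalState, target: SignalState) -> SignalState:
    """Move to the target state, or raise if the move is not allowed here."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def classifies_live(state: SignalState) -> bool:
    """Only an active signal samples and labels new traffic."""
    return state is SignalState.ACTIVE


state = transition(SignalState.DRAFT, SignalState.ACTIVE)
print(state, classifies_live(state))
```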
Backfill historical data
Live classification only labels traffic that arrives after you activate. To label traffic you already captured, run a manual run (backfill) at any time. Unlike a test, a manual run saves its labels: they’re stored against your traces and tied to the run, exactly like live labels.
Pick a time range and sample rate
Choose the historical window to apply the signal to, and the share of matching traffic in that window to classify.
Read the results
There are a few places to read what a signal found:
- The Signals tab is the main table across your agents, with each signal’s current state and headline numbers at a glance.
- The agent’s Overview under the Agents tab has per-signal graphs: trends and volume over time, plus a range of metrics for each signal so you can see how a label is moving.
- The signal detail view has a table of every classified target (the spans, traces, or sessions the signal labeled). Click into any row to open the underlying trace, span, or session and read the actual conversation that produced the label.
- Filter by label value to pull up just the targets a signal flagged (for example, every session labeled “yes” by a jailbreak signal).
- Watch the distribution over time to see how a label trends across hours or days.
- Jump straight to the underlying trace, span, or session to see the full context and conversation behind any labeled target.
Alert on a signal
Once a signal is running, set up alerts so you hear about changes without watching the dashboard. Alerts are configured per signal, and you can get notified through the Slack integration or by email. An alert fires when a metric crosses a condition over a window. You choose:
- The metric. What to watch, depending on the classifier type:
  - Label volume (any signal): how many labels came in.
  - True rate (binary signals): the share of labels that are “yes.”
  - Value count (string signals): how many labels landed on one specific value.
  - Value share (string signals): that value’s share of all labels.
- The comparison. Either a percentage change versus the prior equal-length window (for example, “this value’s count is up 10% in the last 24 hours”) or an absolute threshold (for example, “true rate is below 80%”).
- The window the comparison runs over, from 5 minutes up to 48 hours.
- A minimum label count so quiet periods don’t trip the alert on statistically meaningless deltas.
- A cooldown so a flapping condition doesn’t bombard you. After a firing, the next one is suppressed until the cooldown elapses.
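The alert settings above combine into a single decision: is there enough data, is the alert out of cooldown, and did the metric move far enough? The sketch below shows that logic for the percentage-change comparison. It is an illustration under stated assumptions, not Catalyst's alerting code: AlertRule, should_fire, and their parameters are hypothetical, and treating a zero prior value as "do not fire" is a choice made here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    pct_change_threshold: float  # 10.0 means "up 10% or more vs the prior window"
    min_label_count: int         # ignore windows with fewer labels than this
    cooldown_seconds: float      # suppress repeat firings for this long


def should_fire(
    rule: AlertRule,
    current_value: float,
    prior_value: float,
    label_count: int,
    now: float,
    last_fired_at: Optional[float],
) -> bool:
    """Decide whether a percentage-change alert fires for the current window."""
    # Quiet periods: too few labels for the delta to be meaningful.
    if label_count < rule.min_label_count:
        return False
    # Cooldown: a recent firing suppresses the next one.
    if last_fired_at is not None and now - last_fired_at < rule.cooldown_seconds:
        return False
    # Assumption: with nothing in the prior window, percentage change is undefined.
    if prior_value == 0:
        return False
    pct_change = (current_value - prior_value) / prior_value * 100.0
    return pct_change >= rule.pct_change_threshold


rule = AlertRule(pct_change_threshold=10.0, min_label_count=20, cooldown_seconds=3600)
# Value count rose from 50 to 60 (a 20% increase) with 100 labels in the window.
print(should_fire(rule, 60, 50, label_count=100, now=10_000, last_fired_at=None))
```

An absolute-threshold comparison would replace the last two lines of should_fire with a direct check of current_value against the configured limit.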
Versioning and editing
Signals are versioned. Editing a signal (changing its prompt, labels, classifier type, or sample rate) creates a new version rather than mutating the old one, and the dashboard shows each version after every edit. The current version is the one powering live classification, and labels record which version produced them, so you can change a signal’s definition without losing the history of what earlier versions decided. Scope is the exception: it’s fixed once a signal is created. Everything else is editable, but to label a different unit you create a new signal. When you no longer need a signal, archive it. Archiving stops it and removes it from your active list while preserving the labels it produced.
Templates
To get started quickly, create a signal from a template and edit from there. Built-in templates include:
| Template | Type | What it flags |
|---|---|---|
| NSFW | Binary | Spans whose content is sexually explicit, graphic, or otherwise not safe for work. |
| Jailbreak attempt | Binary | Spans where the user tries to bypass the model’s safety guardrails or system instructions. |
| Laziness / refusal | Binary | Spans where the assistant refuses, stalls, or gives a low-effort non-answer instead of completing the task. |
| User frustration | Binary | Spans where the user expresses frustration, annoyance, or dissatisfaction with the assistant. |
| Sentiment | String | The overall sentiment the user expresses: positive, neutral, or negative. |
| Task outcome | String | Whether the task the user asked for was completed: completed, partial, failed, or abandoned. |
Manage signals from the CLI and MCP
The dashboard isn’t the only way in. Everything above is also available through the Inference CLI and the MCP server, so you can create, run, and read signals from your terminal or straight from an AI coding assistant.
- CLI. The inf signals command group lists, creates, edits, activates, disables, and archives signals, tests them, kicks off manual runs, and reads labels and distributions.
- MCP. The Inference MCP server exposes the same operations as tools (creating signals, activating and disabling them, running backfills, configuring alerts, and querying labels and distributions), so an assistant can set up and inspect signals on your behalf.
Feed signals into Halo
Signals pair naturally with Halo, our agent-loop optimizer. The targets a signal flags are exactly the traffic worth digging into:
- Improve a behavior. Point Halo at the spans, traces, or sessions a signal flagged and ask how to fix what they have in common (the refusals, the frustrated sessions, the failed tasks).
- Decide what to measure. Talk to Halo about your traces to surface which signals would be most valuable to add in the first place.
Next steps
Analyze your traces
Run Halo on your traces to find systemic failure modes and concrete fixes.
Set agent identity
Add stable agent IDs so signals attach to the right agent and group cleanly.
Capture more of your stack
Add tracing to more providers, frameworks, and agent runtimes so signals have more to label.
Wrap custom work
Add spans around retrieval, routing, and subprocesses so signals can classify them too.