Use inf dataset to upload JSONL inference data and manage datasets created from captured traffic, existing uploads, or JSONL files on disk. Materialized datasets feed into inf eval run for evals and into training jobs.
Alias: inf datasets

inf dataset upload

Import a JSONL file into the active project as an upload entry. An upload is the raw material you can then materialize into an eval or training dataset. The CLI validates the file locally, uploads it in parts, waits for processing to finish, and prints the detected format plus the processed line count.
inf dataset upload <file>

Arguments

| Argument | Required | Description |
| --- | --- | --- |
| file | Yes | Path to the JSONL file to upload |

Options

| Flag | Required | Description | Default |
| --- | --- | --- | --- |
| -n, --name <name> | No | Upload name shown in Catalyst | Filename without extension |
| --no-wait | No | Return after the transfer finishes instead of polling processing | Off |

Uploaded data appears in Datasets → Uploads in the dashboard. Once processing completes, create an eval or training dataset from that upload — either with inf dataset create --upload-id below or in the dashboard.
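
As a rough illustration of the --name default described above, the sketch below derives an upload name from a file path. The helper name default_upload_name is hypothetical, and stripping only the last extension is an assumption; the CLI may treat names with several dots differently.

```shell
# Hypothetical helper: mirrors the documented default for --name
# ("filename without extension"). Stripping only the final extension is an
# assumption about how the CLI handles names that contain several dots.
default_upload_name() {
  file_name=$(basename -- "$1")
  printf '%s\n' "${file_name%.*}"
}

default_upload_name ./data/support-summaries.jsonl
```

Passing --name explicitly avoids relying on this derivation when several files share the same base name.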

Examples

# Use the filename as the upload name and wait for processing
inf dataset upload ./data/support-summaries.jsonl

# Set a custom upload name
inf dataset upload ./data/support-summaries.jsonl --name support-summaries-v2

# Return after the transfer completes, without waiting for processing
inf dataset upload ./data/support-summaries.jsonl --no-wait

inf dataset create

Materialize an eval or training dataset from captured traffic, an existing upload, or a JSONL file on disk. The file-backed path uploads, waits for processing, and materializes in one command.
inf dataset create -n <name> -t <type> [source-flags…]

Options

| Flag | Required | Description | Default |
| --- | --- | --- | --- |
| -n, --name <name> | Yes | Dataset name | |
| -t, --type <type> | Yes | eval or training | |
| -f, --file <path> | No | JSONL file on disk; uploads, waits for processing, then materializes from that upload | |
| --upload-id <id> | No | Materialize from an existing upload | |
| --task <taskId> | No | Filter captured traffic by task ID | |
| --model <modelId> | No | Filter captured traffic by model ID | |
| --since <date> | No | Start of the time window for traffic filters (ISO 8601 or YYYY-MM-DD HH:MM:SS) | 30 days ago |
| --until <date> | No | End of the time window for traffic filters | 1 minute from now |
| --limit <n> | No | Cap on the number of inferences included | |
| --status <status> | No | Status filter: success (default), 2xx, or a specific code like 200; datasets reject non-success traffic unless you override | success |
| --description <text> | No | Free-text dataset description | |

--file and --upload-id are mutually exclusive; --file creates a new upload automatically. Date values accept ISO 8601 (2026-04-01T00:00:00Z) or ClickHouse format (2026-04-01 00:00:00).

Dataset materialization runs asynchronously. The command prints the dataset ID and points to inf dataset get <id> for checking progress.
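
For scripted runs, the time window can be computed instead of typed by hand. The sketch below builds ISO 8601 values for --since and --until covering the last seven days. The helper iso_days_ago is hypothetical, the dataset name and task ID are placeholders borrowed from the examples, and the date arithmetic assumes GNU date.

```shell
# Print a UTC timestamp N days in the past in ISO 8601 form.
# Relies on GNU date's -d option; BSD/macOS date needs `-v-${1}d` instead.
iso_days_ago() {
  date -u -d "$1 days ago" +%Y-%m-%dT%H:%M:%SZ
}

since=$(iso_days_ago 7)
until_ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)

# Print the command rather than running it, so the flags can be reviewed first.
# The dataset name and task ID are placeholders, not values from a real project.
printf 'inf dataset create -n support-eval -t eval --task support-tickets --since %s --until %s\n' \
  "$since" "$until_ts"
```

Printing the command first is a deliberate choice here: materialization starts as soon as the command runs, so reviewing the window beforehand is cheap insurance.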

Examples

# One-command: upload a JSONL file and materialize an eval dataset
inf dataset create -n demo-eval -t eval --file ./samples.jsonl

# Materialize from an existing upload
inf dataset create -n training-v1 -t training --upload-id up_abc123

# Filter captured traffic by task + time window
inf dataset create -n support-eval -t eval \
  --task support-tickets \
  --since 2026-04-01 \
  --until 2026-04-14

# Cap to 1,000 rows (only successful traffic is included by default)
inf dataset create -n small-eval -t eval --task support-tickets --limit 1000

inf dataset list

Display datasets in the active project.
inf dataset list
Alias: inf dataset ls

Options

| Flag | Required | Description | Default |
| --- | --- | --- | --- |
| -l, --limit <n> | No | Maximum number of results | 20 |

The table shows the dataset ID (8-char prefix), name, type, inference count, export status, and creation date. Use --json to get full UUIDs for scripting.

Examples

# Default table view
inf dataset list

# More results
inf dataset list --limit 100

# Pipe full UUIDs into another command
inf dataset list --json | jq -r '.[].id'
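
Building on the --json example, a script may need the full UUID of one dataset rather than all of them. The sketch below filters the JSON by name with jq. The helper name dataset_id_by_name is hypothetical, and the presence of a name field on each object is an assumption; only the id field is confirmed by the example above.

```shell
# Hypothetical helper: reads the JSON array from `inf dataset list --json`
# on stdin and prints the id of every entry whose name matches exactly.
# Assumes each object carries a "name" field alongside "id".
dataset_id_by_name() {
  jq -r --arg name "$1" '.[] | select(.name == $name) | .id'
}

# Intended use (requires the inf CLI and jq):
#   inf dataset list --json --limit 100 | dataset_id_by_name support-eval
```

If the dataset might sit beyond the default 20 results, raising --limit as shown in the comment keeps the lookup from missing it.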

inf dataset get

View detailed information about a specific dataset — ID, name, type, inference count, export status, source project, and creation date.
inf dataset get <id>

Arguments

| Argument | Required | Description |
| --- | --- | --- |
| id | Yes | Dataset ID, UUID prefix (4+ chars), or exact name |
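
To make the three accepted identifier forms concrete, here is a minimal sketch of how such a lookup could work over a list of datasets. It is not the CLI's implementation: the helper resolve_dataset, the tab-separated input, the precedence (exact ID, then prefix, then name), and the handling of ambiguous prefixes are all assumptions.

```shell
# Hypothetical resolver: reads "id<TAB>name" rows on stdin and prints the ID
# matching the query. The order (exact ID, then 4+ char prefix, then exact
# name) follows the wording of the docs; the CLI's real precedence and its
# handling of ambiguous prefixes are assumptions.
resolve_dataset() {
  awk -F '\t' -v q="$1" '
    $1 == q { exact = $1 }
    length(q) >= 4 && index($1, q) == 1 { prefix[++n] = $1 }
    $2 == q { byname = $1 }
    END {
      if (exact != "") print exact
      else if (n == 1) print prefix[1]
      else if (n > 1) exit 2            # ambiguous prefix
      else if (byname != "") print byname
      else exit 1                       # no match
    }'
}

printf 'a1b2c3d4-0000\tsupport-eval\nffee0011-0000\ttraining-v1\n' | resolve_dataset a1b2
```

In scripts, passing the full UUID avoids depending on whichever disambiguation rules the CLI actually applies.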

inf dataset download

Download a dataset as a JSONL file. If the server-side export isn’t ready yet, the CLI requests it and polls until it’s ready before downloading.
inf dataset download [id]

Arguments

| Argument | Required | Description |
| --- | --- | --- |
| id | No | Dataset ID, UUID prefix (4+ chars), or exact name. If omitted in an interactive terminal, the CLI prompts you to choose. |

Options

| Flag | Required | Description | Default |
| --- | --- | --- | --- |
| -o, --output <path> | No | Output file path | <dataset-name>.jsonl for Hugging Face, <dataset-name>.source-backed.jsonl for source-backed |
| -f, --format <format> | No | Download format: huggingface or source-backed | Prompted in a TTY; otherwise huggingface |

The CLI resolves dataset IDs by exact ID, UUID prefix, or exact name.
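
A script that post-processes the download needs to know where the file lands when --output is omitted. The sketch below reproduces the documented default filenames for each format. The helper default_output_path is hypothetical, and it assumes the dataset name is used verbatim; the CLI may sanitize names that contain characters unsuitable for filenames.

```shell
# Hypothetical helper: reproduces the documented default for --output.
# Usage: default_output_path <dataset-name> <format>
# Assumes the dataset name is used as-is, with no sanitizing.
default_output_path() {
  case "$2" in
    huggingface)   printf '%s.jsonl\n' "$1" ;;
    source-backed) printf '%s.source-backed.jsonl\n' "$1" ;;
    *) printf 'unknown format: %s\n' "$2" >&2; return 1 ;;
  esac
}

default_output_path customer-support-eval source-backed
```

In non-interactive jobs, setting both --format and --output explicitly removes any dependence on these defaults.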

Examples

# Download to the default filename
inf dataset download ds_abc123

# Download to a specific file
inf dataset download ds_abc123 --output ./data/my-dataset.jsonl

# Download source-backed JSONL (request/response objects)
inf dataset download customer-support-eval --format source-backed

# Pick interactively when no id is provided
inf dataset download