Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.inference.net/llms.txt

Use this file to discover all available pages before exploring further.

If you already have curated data from annotation pipelines, synthetic generation, or another platform, you can upload it as JSONL instead of building a dataset from captured traffic. Uploads and datasets are separate objects in Catalyst:
  • An upload is the imported JSONL file plus its validation and processing status.
  • A dataset is the stable collection you use for evals, training, and download.
You upload the file first, then create an eval or training dataset from that uploaded data once processing finishes.

How to upload

  1. Go to Datasets in the dashboard
  2. Click Upload Data
  3. Select your .jsonl file
  4. Give the upload a name and start the import
The upload appears in Datasets > Uploads, where you can track processing and review any validation errors.
The upload command does not ask whether the data is for evals or training. You choose eval vs training when you create a dataset from the completed upload.

After upload

  1. Wait for the upload to finish processing in Datasets > Uploads.
  2. Open the dataset creation flow and select the upload as your source.
  3. Choose whether the resulting dataset is eval or training.
Successful uploads become a reusable source in the same dataset creation flow you use for traffic-backed datasets.

Supported formats

Two JSONL formats are supported. See Dataset Formats for full schemas, required fields, and validation rules.
FormatStructureBest for
Source-backed{ request, response } per lineRound-tripping data captured from providers
Hugging Face{ messages } per lineStandard training/eval format, easy to create
The system auto-detects the format from the first valid line. Every row in the file must use the same format.

Validation behavior

  • Invalid rows are reported with line numbers in the upload status details.
  • Uploads can complete with some failed rows if at least one row imports successfully.
  • Mixed-format files are treated as a fatal error and fail the upload.
  • Source-backed rows must include a usable model value in the request.

Upload limits

LimitValue
Maximum file size10 GB
Maximum line count1,000,000

Next steps

Build from traffic instead

Pull datasets directly from your captured production traffic.

CLI Command Reference

Upload from the terminal with inf dataset upload.

Dataset formats reference

Full schema details and validation rules.