Datasets - Inference.net Documentation

Datasets are collections of LLM inputs and outputs used for evaluation and fine-tuning. They can come from two places: your live production traffic captured through Gateway, or files you upload directly. Everything downstream depends on good data. Evals need representative examples to measure model quality. Training needs diverse, high-quality samples to teach a model your task. Datasets are where both start.

Types of datasets

Type	Purpose	How it evolves
Eval dataset	Measures model quality against a rubric	Stays stable, a fixed set of challenging examples that act as your benchmark
Training dataset	Data the model learns from during fine-tuning	Changes often as you iterate on data quality and coverage

The zero-overlap rule

Catalyst automatically enforces zero-overlap between training and eval datasets. If a training dataset overlaps with an eval dataset, the overlapping data will be excluded from the training dataset when a new training run is created.

Key concepts

Concept	Description
Build from traffic	Filter your captured production inferences and save them as a dataset. The best datasets come from real usage.
Upload	Bring your own JSONL files when you have curated data or are migrating from another platform.
Dataset format	The schema your data needs to follow. See Dataset Formats for supported fields and validation rules.
Task tags	Use task tags when building from traffic to filter by objective. This gives you clean, focused samples instead of mixed traffic.

Tips for good datasets

Diverse training data leads to models that generalize well. If your training data isn’t heterogeneous, the trained model won’t handle edge cases.
Stable eval data gives you a consistent benchmark. Don’t change your eval dataset frequently, it’s the measuring stick.
Start with production traffic when possible. Real user inputs reflect the actual distribution of requests your model will see, and they’re harder to fake than synthetic data.
Use task tags to filter by objective before saving a dataset. A dataset scoped to a single task is almost always more useful than one built from mixed traffic.

Next steps

Build from traffic

Turn filtered production traffic into a dataset.

Upload a dataset

Bring your own JSONL files.

Set up your first eval

Use your dataset to compare models.

Train a custom model

Use your dataset to fine-tune a task-specific model.

Inference Viewer

Build a Dataset from Traffic

⌘I

​Types of datasets

​The zero-overlap rule

​Key concepts

​Tips for good datasets

​Next steps