Observe supports two main data paths:
  • Live traces captured from proxied traffic
  • Historical uploads imported from JSONL files
Both paths end in the same place: datasets you can use for evals, training, and analysis.

Live traces vs historical uploads

| Source | Best for |
| --- | --- |
| Live traces | Building datasets from the exact traffic your product sees today |
| Historical uploads | Backfills, migrations from another provider, or importing older request logs |

Creating datasets from live traffic

Use the Inferences page in the dashboard to filter the requests you care about, then save the filtered view as a dataset. Typical flow:
  1. Filter requests by environment, task, model, provider, or metadata.
  2. Inspect the slice until it represents the workflow you care about.
  3. Save it as either a training dataset or an eval dataset.
  4. Reuse that dataset in evals or training jobs.

Uploading historical logs

Historical uploads use JSONL files. Each line must be a single JSON object with these fields:
| Field | Required | Description |
| --- | --- | --- |
| request | Yes | Raw provider request body |
| response | No | Raw provider response body, or null if you only have requests |
The system detects the provider format from the request and response shape.

Source-backed JSONL example

This OpenAI request/response pair matches the helper factories used in inference/apps/llm-ops-consumer/tests/unit/inference-upload.vitest.test.ts.
{"request":{"model":"gpt-4","messages":[{"role":"user","content":"Hello"}],"temperature":0.7,"max_tokens":100},"response":{"id":"chatcmpl-123","object":"chat.completion","created":1700000000,"model":"gpt-4","choices":[{"index":0,"message":{"role":"assistant","content":"Hi there!"},"finish_reason":"stop"}]}}

Upload limits

| Limit | Value |
| --- | --- |
| Maximum file size | 10 GB |
| Maximum line count | 1,000,000 |
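A quick pre-flight check against these limits can save a failed upload. A sketch in Python (the function is ours, and whether "10 GB" means 10^9-byte gigabytes or GiB is an assumption; adjust the constant to be safe):

```python
import os

MAX_FILE_BYTES = 10 * 1000**3  # 10 GB; assumed decimal GB, not GiB
MAX_LINE_COUNT = 1_000_000


def check_upload_limits(path: str) -> list[str]:
    """Flag a JSONL file that exceeds the documented upload limits."""
    problems = []
    size = os.path.getsize(path)
    if size > MAX_FILE_BYTES:
        problems.append(f"file is {size} bytes; limit is {MAX_FILE_BYTES}")
    with open(path, "rb") as f:
        line_count = sum(1 for _ in f)
    if line_count > MAX_LINE_COUNT:
        problems.append(f"file has {line_count} lines; limit is {MAX_LINE_COUNT}")
    return problems
```

Files over either limit should be split into multiple uploads.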

Upload workflow

  1. Open the Datasets page in your project.
  2. Click Upload Data.
  3. Name the upload.
  4. Select a .jsonl file.
  5. Wait for processing to complete.
  6. Inspect the uploaded records and create datasets from the imported traffic.

Processing states

| Status | Meaning |
| --- | --- |
| awaiting_upload | The upload record exists, but the file has not been received yet |
| pending | The file was received and queued |
| processing | The lines are being parsed and validated |
| completed | The upload finished and the records are available |
| failed | A fatal error stopped processing |
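If you script around uploads, only `completed` and `failed` are terminal; the other three mean the upload is still in flight. A small sketch of that distinction (the status strings come from the table above; the helper itself is ours, and no polling client is shown):

```python
# Terminal vs in-flight statuses, useful when waiting for an upload to settle.
TERMINAL_STATUSES = {"completed", "failed"}
IN_FLIGHT_STATUSES = {"awaiting_upload", "pending", "processing"}


def is_settled(status: str) -> bool:
    """True once an upload has reached a terminal state and polling can stop."""
    return status in TERMINAL_STATUSES
```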

When to use each dataset type

| Dataset type | Use it for |
| --- | --- |
| Eval dataset | Repeatable model comparisons and rubric-based scoring |
| Training dataset | Fine-tuning or distillation runs |
Most teams keep both: an eval dataset to measure progress and a training dataset to improve the model.
  • Start with /evaluate/overview once you have a representative eval dataset.
  • Move on to /train/overview after your eval rubric is stable.
  • Keep routing production traffic through Observe so the datasets stay realistic over time.