What you’ll have when you finish
- one deployment for the trained model
- one public model identifier
- one successful smoke test against the deployment endpoint
Before you start
- complete a training run with the E2E Fine-tuning guide
- confirm the trained model still beats or matches the baseline on the eval you trust
Step 1: review the completed training job
Before you create a deployment, inspect the training job detail page for:- final status
- external job ID
- base model
- current or final loss
- checkpoint evals and average scores
- final model reference / weights
Step 2: create the deployment
In the deployment create flow, you choose:- deployment name
- model
- speed target
- instance count
teamSlug/name-randomId unless you override it.
Step 3: copy the public model identifier
Once the deployment exists, copy the deployment’s public model identifier from the overview page. You will use that as themodel value in a normal API request.
Step 4: send a smoke test
Use the deployment’s public model identifier with the standard API shape and verify that:- the request completes successfully
- the output is correct enough for the workflow
- the request shows up in the deployment inferences view
Step 5: watch the deployed traffic
After rollout, inspect:- deployment overview
- instances
- recent deployment inferences
- Observe analytics for the surrounding workflow
Verify it worked
You should now have:- one live deployment
- one public model identifier
- one successful deployment request visible in the deployment inferences tab
What to do next
Capture Traffic
Keep routing real traffic through Inference.net so the next eval and training cycle stays grounded in production behavior.
Deployment
Learn more about deployment configuration, scaling, and operations.