What you’ll have when you finish
- one deployment for the trained model
- one public model identifier
- one successful smoke test against the deployment endpoint
Before you start
- complete a training run with /guides/turn-eval-failures-into-a-training-run
- confirm the trained model still beats or matches the baseline on the eval you trust
Step 1: review the completed training job
Before you create a deployment, inspect the training job detail page for:
- final status
- external job ID
- base model
- current or final loss
- checkpoint evals and average scores
- final model reference / weights
Step 2: create the deployment
In the deployment create flow, you choose:
- deployment name
- model
- speed target
- instance count
The deployment's public model identifier defaults to teamSlug/name-randomId unless you override it.
Step 3: copy the public model identifier
Once the deployment exists, copy its public model identifier from the overview page. You will use that as the `model` value in a normal API request.
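As a minimal sketch of what "use it as the `model` value" looks like, the snippet below builds an OpenAI-style chat-completions request against the deployment. The endpoint URL, identifier, and API-key environment variable are all placeholders, not real values from your account; substitute your own.

```python
import json
import os
import urllib.request

# Placeholders -- replace with your deployment's public model identifier
# (copied from the overview page) and your real endpoint and API key.
MODEL_ID = "team-slug/model-name-abc123"                 # hypothetical identifier
API_URL = "https://api.example.com/v1/chat/completions"  # assumed OpenAI-compatible endpoint

def build_request(prompt: str) -> urllib.request.Request:
    """Build a chat-completions request that routes to the deployment
    by passing its public identifier as the `model` value."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('API_KEY', '')}",
        },
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(build_request("hello"))` (or your preferred HTTP client) is the same request you will reuse as the smoke test in the next step.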
Step 4: send a smoke test
Use the exact deployment-specific API example on /deploy/call-a-deployed-model and verify that:
- the request completes successfully
- the output is correct enough for the workflow
- the request shows up in the deployment inferences view
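The first two checks above can be made to fail loudly instead of passing silently. This sketch assumes an OpenAI-style response shape (`choices[0].message.content`); adjust the field names if your endpoint returns something different.

```python
# A minimal smoke-test check, assuming an OpenAI-style chat completion
# response shape; field names are an assumption, not confirmed by the docs.
def smoke_check(response: dict) -> str:
    """Assert the request completed and produced output, and return the
    text so an empty or malformed reply fails instead of passing."""
    choices = response.get("choices")
    assert choices, f"no choices in response: {response}"
    text = choices[0].get("message", {}).get("content", "")
    assert text.strip(), "deployment returned an empty completion"
    return text

# Example with a canned response:
sample = {"choices": [{"message": {"role": "assistant", "content": "ok"}}]}
print(smoke_check(sample))  # -> ok
```

Whether the output is "correct enough for the workflow" is a judgment call; this only catches the mechanical failures. The last check, confirming the request appears in the deployment inferences view, still has to be done in the dashboard.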
Step 5: watch the deployed traffic
After rollout, inspect:
- deployment overview
- instances
- recent deployment inferences
- Observe analytics for the surrounding workflow
Verify it worked
You should now have:
- one live deployment
- one public model identifier
- one successful deployment request visible in the deployment inferences tab
What to do next
Observe Overview
Keep routing real traffic through Inference.net so the next eval and training cycle stays grounded in production behavior.
Choose Realtime, Background, Group, or Batch
Decide whether the deployment should serve interactive traffic or a non-interactive workflow.