## What Deploy gives you
- dedicated deployment creation and configuration
- deployment-specific API examples and model identifiers
- instance and settings surfaces for operating production capacity
- a clean destination for models that come out of Train
## When to choose Deploy

Choose Deploy when:

- the workload is important enough to justify dedicated capacity
- you need stronger control over the serving path than the shared serverless API provides
- you want to promote a trained model into production
- you want a stable endpoint and deployment-specific operating surfaces
## Serverless vs dedicated
| Path | Best for |
|---|---|
| Serverless API | Fast experimentation, variable traffic, broad access to hosted models |
| Dedicated deployment | Stable production throughput, custom deployment settings, trained model promotion |
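The table's guidance can be sketched as a simple decision helper. This is illustrative only; the function name and inputs are assumptions for the sketch, not a platform API:

```python
def choose_execution_path(stable_production_traffic: bool,
                          needs_custom_settings: bool,
                          promoting_trained_model: bool) -> str:
    """Mirror the table above: dedicated capacity for stable production
    throughput, custom deployment settings, or trained-model promotion;
    the shared serverless API for everything else.
    (Hypothetical helper for illustration.)"""
    if stable_production_traffic or needs_custom_settings or promoting_trained_model:
        return "dedicated"
    return "serverless"

# Spiky experimentation traffic stays on the shared serverless API.
print(choose_execution_path(False, False, False))  # serverless
# A promoted fine-tune with steady load justifies its own deployment.
print(choose_execution_path(True, False, True))    # dedicated
```

In practice the decision is rarely a pure boolean, but the ordering holds: any one of the three dedicated-capacity signals is usually enough to justify leaving serverless.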
## Recommended workflow

1. Start in Observe and capture representative traffic.
2. Use Evaluate to define success.
3. Use Train if you need a task-specific model improvement.
4. Move into Deploy once the model and workload deserve their own production runtime.
5. Keep observing the deployment in production.
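Once a deployment exists, the request shape stays the same as the serverless API; only the model identifier changes to the deployment-specific one. A minimal sketch, assuming a chat-completions-style request body; both identifiers below are hypothetical placeholders, not real names:

```python
import json

# Hypothetical identifiers for illustration; substitute your account's
# serverless model name or your deployment's public model identifier.
SERVERLESS_MODEL = "accounts/example/models/base-model"
DEPLOYED_MODEL = "accounts/example/deployments/my-deployment"

def build_chat_request(model_id: str, prompt: str) -> dict:
    """Build the request body. The shape is identical on both paths;
    swapping the model identifier is the only change when a workload
    moves from serverless to a dedicated deployment."""
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
    }

body = build_chat_request(DEPLOYED_MODEL, "Hello")
print(json.dumps(body, indent=2))
```

Because the shape is stable, promoting a workload from serverless to dedicated is a one-line change in client code rather than a rewrite.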
## Next steps

- **Choose the execution path**: Decide whether the workload should stay on the shared API, use async execution, or move into a dedicated deployment.
- **Create a deployment**: Choose the model, speed target, and instance count for a new deployment.
- **Call a deployed model**: Use the deployment’s public model identifier with the standard API shape.
- **Operate deployments**: Inspect instances, recent inferences, and lifecycle settings after rollout.
- **Train Overview**: Prepare a model that is worth promoting.
- **Observe Overview**: Keep production traffic visible after rollout.
- **Inference Modes**: Understand when to stay serverless and when to move into dedicated capacity.
- **Talk to an engineer**: Meet with us if you want help planning deployment topology, scaling, or rollout strategy.