Deployment is where you move from experimentation to a stable production serving path. Use it when a workload deserves dedicated runtime capacity, deployment-specific controls, or a promoted trained model.

What deployment gives you

  • dedicated deployment creation and configuration
  • deployment-specific API examples and model identifiers
  • instance and settings surfaces for operating production capacity
  • a clean destination for models that come out of fine-tuning
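The deployment-specific model identifier is the main thing client code needs once a deployment exists. As a hedged sketch only (the endpoint URL, header names, and payload shape below are assumptions for illustration, not confirmed Inference.net API details; check your deployment's own API examples for the real values), a request against a dedicated deployment might be assembled like this:

```python
# Sketch: assembling a chat-completion request for a dedicated deployment.
# The base URL, headers, and body shape are ASSUMPTIONS, not documented values.

def build_deployment_request(deployment_model_id: str, prompt: str, api_key: str) -> dict:
    """Collect the HTTP pieces for a call to a dedicated deployment."""
    return {
        "url": "https://api.inference.net/v1/chat/completions",  # hypothetical endpoint
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "json": {
            "model": deployment_model_id,  # the deployment-specific identifier
            "messages": [{"role": "user", "content": prompt}],
        },
    }

req = build_deployment_request("my-org/my-deployed-model", "Hello", "sk-example")
```

The point of the sketch is that the serving path is selected entirely by the identifier in the request body, so the rest of your client code stays unchanged when you promote a model.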

When to choose Deploy

Choose Deploy when:
  • the workload is important enough to justify dedicated capacity
  • you need stronger control over the serving path than the shared serverless API provides
  • you want to promote a trained model into production
  • you want a stable endpoint and deployment-specific operating surfaces

Serverless vs dedicated

  • Serverless API: fast experimentation, variable traffic, broad access to hosted models
  • Dedicated deployment: stable production throughput, custom deployment settings, trained model promotion
A typical path into deployment:
  1. Start by capturing traffic from your application.
  2. Use evals to define success.
  3. Use fine-tuning if you need a task-specific model improvement.
  4. Move into deployment once the model and workload deserve their own production runtime.
  5. Keep observing the deployment in production.
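The five steps above can be sketched as a promotion pipeline. Every function here is a hypothetical stub standing in for the corresponding platform feature (none of these are real Inference.net client calls); the sketch only makes the ordering, and the eval gate in front of deployment, concrete:

```python
# Hypothetical stubs illustrating the lifecycle ordering above.

def capture_traffic() -> list:
    return [{"prompt": "example request"}]   # step 1: observe real usage

def run_evals(samples: list) -> float:
    return 0.92                              # step 2: define success as a score

def fine_tune(samples: list) -> str:
    return "trained-model-v1"                # step 3: task-specific improvement

def deploy(model_id: str) -> str:
    return f"deployment/{model_id}"          # step 4: dedicated production runtime

def promote_if_ready(threshold: float = 0.9):
    samples = capture_traffic()
    if run_evals(samples) < threshold:       # gate: only promote models that pass evals
        return None
    endpoint = deploy(fine_tune(samples))
    return endpoint                          # step 5: keep observing this in production
```

The design point is that deployment sits behind an explicit quality gate: traffic and evals come first, and only a model that clears the threshold earns its own runtime.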

Next steps

Deploy a Trained Model

Promote a completed training run into a production serving path.

Deploy an HF Model

Bring a Hugging Face model onto Inference.net.

Fine-tuning

Prepare a model that is worth promoting.

Capture Traffic

Keep production traffic visible after rollout.

Talk to an engineer

Meet with us if you want help planning deployment topology, scaling, or rollout strategy.