After a deployment is live, the operational work shifts from creation to inspection, validation, and controlled change management.

The deployment detail surfaces

The product organizes deployment operations into four tabs:
Tab         Use it for
Overview    Deployment metadata, public model identifier, and API usage
Instances   Runtime capacity and instance-level status
Inferences  Recent requests served by the deployment
Settings    Administrative actions, including deletion

What to monitor

Instances

Watch for instance status changes such as:
  • Running
  • Initializing
  • ShuttingDown
  • Terminated
  • failure states
This view is where you confirm that the deployment actually has healthy serving capacity.
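
As a rough illustration of that capacity check, the sketch below summarizes a list of instance statuses. It assumes the statuses have already been collected as plain strings (for example, copied from the Instances tab); the function name, the `min_running` threshold, and the rule that any unlisted status counts as a failure state are illustrative assumptions, not product behavior.

```python
from collections import Counter

# Lifecycle status names taken from the list above.
# Assumption: any other status string is treated as a failure state.
KNOWN_STATUSES = {"Running", "Initializing", "ShuttingDown", "Terminated"}


def summarize_instances(statuses, min_running=1):
    """Summarize instance statuses and decide whether capacity looks healthy."""
    counts = Counter(statuses)
    running = counts.get("Running", 0)
    failed = sum(n for status, n in counts.items() if status not in KNOWN_STATUSES)
    return {
        "running": running,
        "initializing": counts.get("Initializing", 0),
        "failed": failed,
        # Healthy here means: enough running instances and no failure states.
        "healthy": running >= min_running and failed == 0,
    }


if __name__ == "__main__":
    print(summarize_instances(["Running", "Running", "Initializing"]))
```

Raising `min_running` is one way to express that a deployment needs more than a single serving instance before it is considered ready.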

Inferences

The deployment inferences view is the fastest way to validate:
  • which model identifier is serving traffic
  • whether requests are succeeding
  • token counts
  • request duration
  • total cost
Use it for spot checks after rollout or after any meaningful deployment change.
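
A spot check like this can be reduced to a few aggregate numbers. The sketch below assumes the recent inferences have been exported as a list of dictionaries; the field names (`model`, `succeeded`, `total_tokens`, `duration_ms`, `cost`) are hypothetical and would need to be mapped to whatever the export actually contains.

```python
from statistics import mean


def spot_check(records, expected_model):
    """Aggregate recent inference records into a small post-rollout report.

    Each record is a dict with hypothetical keys:
    model, succeeded, total_tokens, duration_ms, cost.
    """
    if not records:
        raise ValueError("no inference records to check")
    succeeded = sum(1 for r in records if r["succeeded"])
    return {
        # Any model identifier other than the expected one is worth a closer look.
        "unexpected_models": sorted({r["model"] for r in records} - {expected_model}),
        "success_rate": succeeded / len(records),
        "total_tokens": sum(r["total_tokens"] for r in records),
        "mean_duration_ms": mean(r["duration_ms"] for r in records),
        "total_cost": sum(r["cost"] for r in records),
    }


if __name__ == "__main__":
    sample = [
        {"model": "my-model", "succeeded": True, "total_tokens": 120,
         "duration_ms": 800, "cost": 0.25},
        {"model": "my-model", "succeeded": False, "total_tokens": 40,
         "duration_ms": 1200, "cost": 0.5},
    ]
    print(spot_check(sample, expected_model="my-model"))
```

Comparing the report from before and after a deployment change makes regressions in success rate, latency, or cost easier to see than scanning individual rows.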

Settings

Settings is where destructive lifecycle actions live. Treat deletion as permanent and use it only when you are sure you no longer need the deployment.

Best practices

  • keep a baseline smoke test for every deployment
  • inspect recent inferences immediately after rollout
  • continue routing production traffic through Observe so deployment behavior remains visible in the larger platform analytics
  • avoid making deployment decisions without a current eval baseline
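
A baseline smoke test can be as small as a handful of fixed prompts with a pass/fail check each. The sketch below is one possible shape, not a prescribed one: `send` is a caller-supplied function that sends a prompt to the deployment and returns the reply text, and the `cases` structure and `max_seconds` limit are illustrative assumptions.

```python
import time


def run_smoke_test(send, cases, max_seconds=10.0):
    """Run each (prompt, check) case through `send` and return a list of failures.

    `send` is supplied by the caller and wraps however the deployment is called.
    `check` is a function that takes the reply and returns True if it is acceptable.
    """
    failures = []
    for prompt, check in cases:
        started = time.monotonic()
        try:
            reply = send(prompt)
        except Exception as exc:  # any request error counts as a failed case
            failures.append((prompt, f"request failed: {exc}"))
            continue
        elapsed = time.monotonic() - started
        if elapsed > max_seconds:
            failures.append((prompt, f"too slow: {elapsed:.2f}s"))
        elif not check(reply):
            failures.append((prompt, "unexpected reply"))
    return failures


if __name__ == "__main__":
    # Stand-in for a real client call, used only to demonstrate the flow.
    def fake_send(prompt):
        return prompt.upper()

    print(run_smoke_test(fake_send, [("ping", lambda reply: reply == "PING")]))
```

An empty result means every case passed; keeping the same cases across rollouts gives a stable baseline to compare against.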

Next steps

Observe Overview

Keep production traffic visible after rollout.

Evaluate Overview

Use evals to decide whether a deployment change actually improved the product.