A recipe is a pre-configured training setup. It includes a vetted base model, optimized training parameters, and compute configuration. You don’t need ML expertise to pick one.

What a recipe includes

  • Base model — selected by the Inference team for quality on the task type
  • Optimized training parameters — learning rate, epochs, and other hyperparameters
  • Compute configuration — minimum 8 GPUs per training run

How to choose

Pick based on task difficulty and capability needs, not specific model names. Start small and scale up only if your eval results show the smaller recipe isn’t cutting it.
| Tier | Best for |
| --- | --- |
| Tiny | High-throughput tasks where speed matters most — simple classification, extraction, tagging |
| Small | Fast inference with more capability — structured output, entity extraction, routing |
| Medium | Good balance of speed and intelligence — summarization, Q&A, agentic tasks that need to be fast |
| Large | Complex reasoning, multi-step tasks, difficult agentic use cases |
Some recipes offer specific capabilities (like multimodal support) that are only available with certain base models. Choose those when your task requires them.
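The "start small, scale up" advice can be sketched as a simple escalation loop. This is a hypothetical illustration, not a platform API: `train` and `evaluate` stand in for whatever your SDK and eval suite provide, and only the tier names come from this page.

```python
# Hypothetical sketch of the "start small, scale up" workflow.
# `train` and `evaluate` are stand-ins for your platform SDK and
# your own eval suite; only the tier names come from the docs.

TIERS = ["Tiny", "Small", "Medium", "Large"]

def pick_recipe(train, evaluate, target_score):
    """Train with each tier in order, returning the first tier
    whose eval score meets the target (largest tier as fallback)."""
    score = 0.0
    for tier in TIERS:
        model = train(tier)      # run the pre-configured recipe
        score = evaluate(model)  # score on your held-out evals
        if score >= target_score:
            return tier, score
    return TIERS[-1], score      # no tier hit the target
```

The point of the loop is the stopping rule: you pay for a larger recipe only when your own eval results show the smaller one falls short.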

Available recipes

These are the pre-built recipes currently available on the platform. Each one has been configured and tested by the Inference team.
| Recipe | Base Model | Parameters | Description |
| --- | --- | --- | --- |
| Tiny | Qwen 3.5 0.8B | 0.8B | Tiny and incredibly fast. Best for high-volume, low-complexity tasks. |
| Small | Qwen 3.5 4B | 4B | Small and incredibly fast. A good default when latency is the priority. |
| Medium | Qwen 3.5 9B | 9B | Balances speed and intelligence. Works well as a fast agentic model. |
| Large | Qwen 3.5 27B | 27B | Large and intelligent. Great for difficult agentic use cases. |
All recipes use 8x H100 GPUs and include optimized training parameters. You don’t need to configure any of this — just pick the tier that fits your task.
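For reference, the catalog above maps cleanly onto a small lookup structure. This is a hypothetical sketch of how you might model it in your own tooling; the `Recipe` dataclass and field names are ours, while the values mirror the table.

```python
# Hypothetical in-code view of the recipe catalog. The dataclass
# and field names are illustrative; the values mirror the docs.
from dataclasses import dataclass

@dataclass(frozen=True)
class Recipe:
    base_model: str
    parameters: str
    gpus: str = "8x H100"  # every recipe runs on the same compute

RECIPES = {
    "Tiny":   Recipe("Qwen 3.5 0.8B", "0.8B"),
    "Small":  Recipe("Qwen 3.5 4B", "4B"),
    "Medium": Recipe("Qwen 3.5 9B", "9B"),
    "Large":  Recipe("Qwen 3.5 27B", "27B"),
}
```

Because base model, hyperparameters, and compute are all baked into the recipe, the only input you supply is the tier key.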