## What a recipe includes
- Base model — selected by the Inference team for quality on the task type
- Optimized training parameters — learning rate, epochs, and other hyperparameters
- Compute configuration — minimum 8 GPUs per training run
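The pieces above can be pictured as one bundled configuration. This is a hypothetical sketch only: the key names and hyperparameter values are illustrative, not the platform's actual schema.

```python
# Hypothetical sketch of what a recipe bundles together.
# Field names and values are illustrative, not the real platform schema.
recipe = {
    "name": "medium",
    "base_model": "Qwen 3.5 9B",    # selected by the Inference team
    "hyperparameters": {
        "learning_rate": 2e-5,      # illustrative value
        "epochs": 3,                # illustrative value
    },
    "compute": {"min_gpus": 8},     # minimum 8 GPUs per training run
}
```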
## How to choose

Pick based on task difficulty and capability needs, not specific model names. Start small, and scale up only if your eval results show the smaller recipe isn't cutting it.

| Tier | Best for |
|---|---|
| Tiny | High-throughput tasks where speed matters most — simple classification, extraction, tagging |
| Small | Fast inference with more capability — structured output, entity extraction, routing |
| Medium | Good balance of speed and intelligence — summarization, Q&A, agentic tasks that need to be fast |
| Large | Complex reasoning, multi-step tasks, difficult agentic use cases |
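The "start small, scale up" approach can be sketched as a simple escalation loop. The `train` and `evaluate` callables here are placeholders standing in for your platform's actual fine-tuning and eval calls; the scores are stubbed for illustration.

```python
# Hypothetical sketch of eval-driven recipe escalation.
# `train` and `evaluate` are placeholders for real platform calls.

RECIPES = ["tiny", "small", "medium", "large"]  # ordered smallest to largest

def pick_recipe(train, evaluate, target_score):
    """Fine-tune each tier in order; stop at the first that meets target."""
    for recipe in RECIPES:
        model = train(recipe)     # placeholder: launch a training run
        score = evaluate(model)   # placeholder: run your eval suite
        if score >= target_score:
            return recipe, score
    return RECIPES[-1], score     # fall back to the largest tier

# Stubbed eval scores standing in for real results:
fake_scores = {"tiny": 0.62, "small": 0.71, "medium": 0.88, "large": 0.93}
recipe, score = pick_recipe(
    train=lambda r: r,            # stub: the "model" is just the recipe name
    evaluate=lambda m: fake_scores[m],
    target_score=0.85,
)
print(recipe, score)  # → medium 0.88
```

The loop encodes the guidance above: you pay for a larger recipe only when your own eval evidence shows the smaller one falls short.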
## Available recipes

These are the pre-built recipes currently available on the platform. Each one has been configured and tested by the Inference team.

| Recipe | Base Model | Parameters | Description |
|---|---|---|---|
| Tiny | Qwen 3.5 0.8B | 0.8B | Tiny and incredibly fast. Best for high-volume, low-complexity tasks. |
| Small | Qwen 3.5 4B | 4B | Small and incredibly fast. A good default when latency is the priority. |
| Medium | Qwen 3.5 9B | 9B | Balances speed and intelligence. Works well as a fast agentic model. |
| Large | Qwen 3.5 27B | 27B | Large and intelligent. Great for difficult agentic use cases. |