## What a recipe includes
- Base model — selected by the Inference team for quality on the task type
- Optimized training parameters — learning rate, epochs, and other hyperparameters
- Compute configuration — minimum 8 GPUs per training run
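The pieces above can be pictured as one bundled configuration. This is a hypothetical sketch only: the key names and hyperparameter values are illustrative, not the platform's actual schema.

```python
# Hypothetical sketch of what a recipe bundles together.
# Field names and values are illustrative, not the real platform schema.
recipe = {
    "name": "medium",
    "base_model": "Qwen 3.5 9B",    # selected by the Inference team
    "hyperparameters": {
        "learning_rate": 2e-5,      # illustrative value
        "epochs": 3,                # illustrative value
    },
    "compute": {"min_gpus": 8},     # minimum 8 GPUs per training run
}
```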
## How to choose

Pick based on task difficulty and capability needs, not specific model names. Start small, and scale up only if your eval results show the smaller recipe isn't cutting it.

| Tier | Best for |
|---|---|
| Tiny | High-throughput tasks where speed matters most — simple classification, extraction, tagging |
| Small | Fast inference with more capability — structured output, entity extraction, routing |
| Medium | Good balance of speed and intelligence — summarization, Q&A, agentic tasks that need to be fast |
| Large | Complex reasoning, multi-step tasks, difficult agentic use cases |
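The "start small, scale up" approach can be sketched as a simple escalation loop. The `train` and `evaluate` callables here are placeholders standing in for your platform's actual fine-tuning and eval calls; the scores are stubbed for illustration.

```python
# Hypothetical sketch of eval-driven recipe escalation.
# `train` and `evaluate` are placeholders for real platform calls.

RECIPES = ["tiny", "small", "medium", "large"]  # ordered smallest to largest

def pick_recipe(train, evaluate, target_score):
    """Fine-tune each tier in order; stop at the first that meets target."""
    for recipe in RECIPES:
        model = train(recipe)     # placeholder: launch a training run
        score = evaluate(model)   # placeholder: run your eval suite
        if score >= target_score:
            return recipe, score
    return RECIPES[-1], score     # fall back to the largest tier

# Stubbed eval scores standing in for real results:
fake_scores = {"tiny": 0.62, "small": 0.71, "medium": 0.88, "large": 0.93}
recipe, score = pick_recipe(
    train=lambda r: r,            # stub: the "model" is just the recipe name
    evaluate=lambda m: fake_scores[m],
    target_score=0.85,
)
print(recipe, score)  # → medium 0.88
```

The loop encodes the guidance above: you pay for a larger recipe only when your own eval evidence shows the smaller one falls short.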
## Available recipes

These are the pre-built recipes currently available on the platform. Each one has been configured and tested by the Inference team.

| Recipe | Base Model | Parameters | Description |
|---|---|---|---|
| Tiny | Qwen 3.5 0.8B | 0.8B | Tiny and incredibly fast. Best for high-volume, low-complexity tasks. |
| Small | Qwen 3.5 4B | 4B | Small and incredibly fast. A good default when latency is the priority. |
| Medium | Qwen 3.5 9B | 9B | Balances speed and intelligence. Works well as a fast agentic model. |
| Large | Qwen 3.5 27B | 27B | Large and intelligent. Great for difficult agentic use cases. |