- Serverless LLM Inference - Access top open-source language models such as Llama-3.1-8B through the API and pay only for the tokens you use. See the list of available models for current options, and the usage sketch after this list.
- LoRA Inference (Early access) - Upload LoRA adapters and access them via streaming or batch endpoints.
- Image Generation (Early access) - Generate images with models like FLUX[DEV] and Stable Diffusion. Pay per generation.
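A minimal sketch of calling a serverless model with the OpenAI Python SDK. The base URL, model ID, and environment variable name are assumptions for illustration, not values taken from this page; check the Quickstart and model list for the exact ones.

```python
import os
from openai import OpenAI

# Assumed OpenAI-compatible endpoint and a hypothetical env var for your API key.
client = OpenAI(
    base_url="https://api.inference.net/v1",
    api_key=os.environ["INFERENCE_API_KEY"],
)

# Illustrative model ID; see the list of available models for the exact identifier.
response = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Summarize what serverless inference means."}],
)

print(response.choices[0].message.content)
```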
Getting Started
Get up and running with the Inference.net APIs.
- Quickstart - Get up and running with Inference.net using the OpenAI SDK.
- Batch Processing - Process multiple asynchronous requests in a single API call and retrieve the results (see the sketch after this list).
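A hedged sketch of batch processing, assuming an OpenAI-compatible batch workflow (a JSONL file of requests uploaded and referenced by a batch job). The endpoint URL, model ID, and environment variable are assumptions; consult the Batch Processing guide for the exact steps.

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inference.net/v1",  # assumed endpoint
    api_key=os.environ["INFERENCE_API_KEY"],  # hypothetical env var
)

# Each line of the JSONL file is one request with its own custom_id.
with open("requests.jsonl", "w") as f:
    f.write(
        '{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", '
        '"body": {"model": "meta-llama/llama-3.1-8b-instruct", '
        '"messages": [{"role": "user", "content": "Hello"}]}}\n'
    )

# Upload the request file, then create a batch job that references it.
batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# Poll the job status; results are available as a file once it completes.
status = client.batches.retrieve(batch.id)
print(status.status)  # e.g. "in_progress", then "completed"
```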