Introduction
Get started with Inference Cloud
Note: Inference Cloud is currently in beta. Some features may change.
Welcome to the Inference Cloud docs! Inference Cloud makes it easy to access leading open source AI models with only a few lines of code. Our mission is to build the best AI-native platform for developers creating their own AI applications.
We currently offer:
- Serverless LLM Inference - Use the API to access top open source language models like Llama-3.1-8B (see the sketch after this list). Pay only for the tokens you use. View the list of available models here.
- LoRA Inference (Early access) - Upload LoRA adapters and access them via streaming or batch endpoints.
- Image Generation (Early access) - Generate images with models like FLUX[DEV] and Stable Diffusion. Pay per generation.
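For a concrete picture of what serverless LLM inference looks like, here is a minimal sketch of a chat completion call using the OpenAI Python SDK. The base URL and the exact model identifier below are placeholders assumed for illustration, not Inference Cloud's actual values; the Quickstart covers the real ones.

```python
# Minimal sketch: serverless LLM inference via the OpenAI Python SDK.
# The base_url and model identifier are placeholder assumptions -- see the
# Quickstart for the actual endpoint and model names.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-inference-cloud.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # assumed model identifier
    messages=[{"role": "user", "content": "Summarize LoRA in one sentence."}],
)
print(response.choices[0].message.content)
```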
Getting Started
Get up and running with Inference Cloud APIs.
Quickstart
Get up and running with Inference Cloud using the OpenAI SDK.
Batch Processing
Submit multiple requests in a single API call and retrieve the results asynchronously.
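As a rough illustration, here is a minimal sketch of a batch workflow, assuming Inference Cloud mirrors the OpenAI-style Batch API (a file upload plus a batches endpoint). The endpoint paths, JSONL request format, and model identifier are assumptions borrowed from that SDK; consult the Batch Processing guide for the actual interface.

```python
# Minimal sketch of batch processing, assuming an OpenAI-compatible Batch API.
# Paths, JSONL format, and model identifier are assumptions, not confirmed values.
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-inference-cloud.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

# Each line of the JSONL file is one independent request.
requests = [
    {
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "meta-llama/Llama-3.1-8B-Instruct",  # assumed identifier
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    for i, prompt in enumerate(["Hello!", "What is a LoRA adapter?"])
]
with open("batch_input.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")

# Upload the file and start the batch job; once it completes, results are
# fetched from the output file referenced by the batch object.
batch_file = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)
```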
Resources
Learn more about Inference Cloud APIs and how to use them.