Note: Inference Cloud is currently in beta. Some features may change.

Welcome to the Inference Cloud docs! Inference Cloud makes it easy to access leading open-source AI models with only a few lines of code. Our mission is to build the best AI-native platform for developers creating their own AI applications.

We currently offer:

  • Serverless LLM Inference - Use the API to access top open-source language models such as Llama-3.1-8B. Pay only for the tokens you use. View the list of available models here.
  • LoRA Inference (Early access) - Upload LoRA adapters and access them via streaming or batch endpoints.
  • Image Generation (Early access) - Generate images with models like FLUX[DEV] and Stable Diffusion. Pay per generation.
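As a sketch of what a serverless LLM inference call might look like, the snippet below builds a chat-completions request with only the Python standard library. The base URL (`https://api.inference.cloud/v1`), the `INFERENCE_CLOUD_API_KEY` environment variable, and the request schema are assumptions for illustration; check the API reference for the actual endpoint, authentication header, and payload shape.

```python
import json
import os
import urllib.request

# Hypothetical base URL and API-key variable -- replace with the values
# from your Inference Cloud dashboard.
BASE_URL = "https://api.inference.cloud/v1"
API_KEY = os.environ.get("INFERENCE_CLOUD_API_KEY", "your-api-key")

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("Llama-3.1-8B", "Say hello in one sentence.")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` would return the model's JSON response; billing under the serverless tier is per token used.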

Getting Started

Get up and running with Inference Cloud APIs.

Resources

Learn more about Inference Cloud APIs and how to use them.