Note: Inference Cloud is currently in beta. Some features may change.

Does Inference Cloud support streaming?

Yes, Inference Cloud supports streaming for all language models. Simply set the stream=true parameter in your request.
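As a rough sketch, a streaming request typically sets stream to true in the JSON payload and then consumes server-sent-event (SSE) chunks. The field names, model identifier, and SSE format below are illustrative assumptions, not the documented Inference Cloud API; the snippet parses simulated chunks rather than calling a live endpoint.

```python
import json

# Hypothetical request payload: field names and the model identifier
# are assumptions for illustration, not the documented API.
payload = {
    "model": "llama-3.1-8b",  # assumed model identifier
    "prompt": "Hello",
    "stream": True,           # enables token-by-token streaming
}

def parse_sse_lines(lines):
    """Parse server-sent-event lines of the form 'data: {...}' into dicts,
    stopping markers like 'data: [DONE]' excluded."""
    events = []
    for line in lines:
        if line.startswith("data: ") and line != "data: [DONE]":
            events.append(json.loads(line[len("data: "):]))
    return events

# Simulated response chunks, since streaming APIs commonly deliver SSE.
sample = [
    'data: {"text": "Hel"}',
    'data: {"text": "lo"}',
    "data: [DONE]",
]
tokens = [event["text"] for event in parse_sse_lines(sample)]
print("".join(tokens))  # → Hello
```

In a real client, the sample list would be replaced by the lines of the HTTP response body, read incrementally as they arrive.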

Does Inference Cloud support batching?

Yes, Inference Cloud supports batching for all models. Please contact us for details.

Does Inference Cloud support custom models?

Not currently. We are working on supporting LoRAs for Llama 3.1 models. If you have a specific use case or would like early access, please contact us.

Does Inference Cloud use my data?

No. Inference Cloud does not use your data for training or any other purpose, and all data is deleted after 30 days.