Direct API defaults
Current baseline limits for the direct API are:
- Language models: 500 requests per minute
- Image models: 100 requests per minute
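To stay under these baselines from the client side, one option is to pace requests before they are sent. The following is a minimal sketch, not the API's own mechanism: the names `BASELINE_RPM` and `SlidingWindowLimiter` are hypothetical, and it assumes a sliding 60-second window, which may differ from how the service actually counts requests.

```python
import time
from collections import deque

# Baseline limits quoted above, in requests per minute.
BASELINE_RPM = {"language": 500, "image": 100}


class SlidingWindowLimiter:
    """Client-side guard allowing at most `limit` calls per `window` seconds.

    Assumption: the server counts requests over a sliding window. The source
    does not say how limits are measured, so treat this as an approximation.
    """

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.sent = deque()  # timestamps of calls still inside the window

    def wait_time(self):
        """Return seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return self.window - (now - self.sent[0])

    def try_acquire(self):
        """Record a call and return True if under the limit, else return False."""
        if self.wait_time() > 0.0:
            return False
        self.sent.append(self.clock())
        return True
```

A caller would check `try_acquire()` before each request and, when it returns `False`, sleep for `wait_time()` seconds before trying again. If that sleep happens often, the workload is probably better served by one of the alternatives described in the next section.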
When limits become the wrong tool
If you are hitting rate limits regularly, the answer is often to change the execution mode rather than just ask for a larger number. Consider:
- /api/background-jobs for delayed single-request workflows
- /guides/choose-realtime-background-group-or-batch when you need help choosing between background jobs, group jobs, and batch
- /api/batch for large offline workloads
- /deploy/overview if the workload deserves dedicated capacity