Group API is available for both
/v1/slow/group/chat/completions and /v1/slow/group/completions endpoints.Overview
The Group API provides a streamlined way to submit multiple asynchronous inference requests as a single unit. Unlike the Batch API which requires JSONL file uploads, the Group API accepts requests directly in the request body, making it ideal for:- Small to medium batches: Process up to 50 requests at once
- Related tasks: Group related inference requests together
- Webhook notifications: Get notified when all requests in a group complete
- Simpler integration: No file uploads or JSONL formatting required
- Faster implementation: Direct JSON API calls without file management
Group API vs Batch API
| Feature | Group API | Batch API |
|---|---|---|
| Maximum requests | 50 | 1,000,000 |
| Input format | JSON array in request body | JSONL file upload |
| File management | Not required | Required |
| Use case | Small batches, quick implementation | Large-scale processing |
| Webhook support | Yes | Yes |
| Completion time | 1-72 hours | 1-72 hours |
Getting Started
1. Submit a Group Request
Submit multiple requests together by sending them as an array in the request body:JSON
2. Retrieve Group Results
Once your group is processed, retrieve all generation results using the group ID:JSON
Using Webhooks
Attach a webhook to receive notifications when your group completes processing. Include thewebhook_id when submitting the group:
- Group ID
- Completion status
- Summary of successful and failed requests
- Custom IDs for each request (if provided)
Text Completions Support
The Group API also supports text completions:Limits and Constraints
- Maximum requests per group: 50
- Request format: Direct JSON (no JSONL files required)
- Supported endpoints:
/v1/slow/group/chat/completions/v1/slow/group/completions
- Completion time: 24-72 hours
- Request expiration: Groups expire after 72 hours if not completed
Best Practices
- Group related requests: Use groups for requests that logically belong together (e.g., analyzing multiple documents from the same source).
- Use webhooks for notifications: Instead of polling, configure webhooks to be notified when your group completes.
- Handle individual failures: Some requests in a group may fail while others succeed. Check each generation’s status.
- Stay under limits: Keep groups to 50 requests or less. For larger batches, use the Batch API.
-
Include metadata: Add custom IDs or metadata to your requests for easier tracking:
JSON
Error Handling
The API validates your request structure immediately. Common errors include:JSON
webhook_id(correct)webhook_url(incorrect — this is for the Batch API)
When to Use Group API
Choose the Group API when you need:- Quick implementation without file management
- To process 50 or fewer related requests
- Webhook notifications for a set of requests
- Simple JSON-based integration