# Webhooks: Quick Reference

> Quick reference for webhook support in asynchronous inference

## Dashboard Management

Webhooks are managed through the inference.net dashboard:

1. Navigate to **Webhooks** in the sidebar
2. Create, test, archive, or restore webhooks through the UI
3. Copy your webhook identifier for use in generation requests

## Payload Structures

### generation.completed

```json JSON theme={"system"}
{
  "event": "generation.completed",
  "timestamp": "ISO 8601 timestamp",
  "webhook_id": "webhook identifier",
  "generation_id": "generation ID",
  "data": {
    "state": "Success|Failed",
    "stateMessage": "Human readable status",
    "request": { /* Original request with metadata */ },
    "response": { /* OpenAI format response */ },
    "finishedAt": "ISO 8601 timestamp"
  }
}
```

### async-embedding.completed

```json JSON theme={"system"}
{
  "event": "async-embedding.completed",
  "timestamp": "ISO 8601 timestamp",
  "webhook_id": "webhook identifier",
  "generation_id": "generation ID",
  "data": {
    "state": "Success|Failed",
    "stateMessage": "Human readable status",
    "request": { /* Original embeddings request with metadata */ },
    "response": { /* OpenAI format embeddings response */ },
    "finishedAt": "ISO 8601 timestamp"
  }
}
```

### slow-group.completed

```json JSON theme={"system"}
{
  "event": "slow-group.completed",
  "timestamp": "ISO 8601 timestamp",
  "group_id": "group ID",
  "data": {
    "group_size": 2,
    "status": "completed",
    "generations": [
      {
        "generationId": "generation ID",
        "state": "Success|Failed",
        "stateMessage": "Human readable status",
        "request": { /* Original request */ },
        "response": { /* OpenAI format response */ },
        "finishedAt": "ISO 8601 timestamp or null"
      }
    ]
  }
}
```
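The three payloads differ in where the identifiers live: single-result events carry a top-level `generation_id`, while group events nest per-generation results under `data.generations` and use the camelCase key `generationId`. The sketch below normalizes all three into `(generation ID, state)` pairs. The helper name `summarize_outcomes` and the sample values are illustrative assumptions, not part of the API.

```python
from typing import Any


def summarize_outcomes(payload: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (generation ID, state) pairs for any of the three documented events."""
    event = payload["event"]
    data = payload["data"]
    if event == "slow-group.completed":
        # Group payloads nest results and use the camelCase key "generationId".
        return [(gen["generationId"], gen["state"]) for gen in data["generations"]]
    if event in ("generation.completed", "async-embedding.completed"):
        # Single-result payloads carry snake_case "generation_id" at the top level.
        return [(payload["generation_id"], data["state"])]
    raise ValueError(f"Unexpected event type: {event!r}")


# Hypothetical sample payload shaped like the slow-group.completed structure above.
example = {
    "event": "slow-group.completed",
    "timestamp": "2025-01-01T00:00:00Z",
    "group_id": "GRP_XYZ123",
    "data": {
        "group_size": 2,
        "status": "completed",
        "generations": [
            {"generationId": "gen-a", "state": "Success"},
            {"generationId": "gen-b", "state": "Failed"},
        ],
    },
}

print(summarize_outcomes(example))
# [('gen-a', 'Success'), ('gen-b', 'Failed')]
```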

## Headers

| Header                      | Description                   | Example                                                                        |
| --------------------------- | ----------------------------- | ------------------------------------------------------------------------------ |
| `X-Inference-Event`         | Event type                    | `generation.completed`, `async-embedding.completed`, or `slow-group.completed` |
| `X-Inference-Webhook-ID`    | Webhook identifier            | `AhALzdz8S`                                                                    |
| `X-Inference-Generation-ID` | Generation ID (if applicable) | `XBKcs7F1s2oJ_AHiLMbF4`                                                        |
| `X-Inference-Group-ID`      | Group ID (for group events)   | `GRP_XYZ123`                                                                   |
| `User-Agent`                | inference.net webhook agent   | `inference.net-Webhook/1.0`                                                    |
| `Content-Type`              | Always `application/json`     | `application/json`                                                             |
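As a rough illustration of how these headers might be used, the sketch below checks that the event type and ID in the headers agree with the JSON body. The function name `headers_match_body` is hypothetical, and the sketch assumes the group and generation ID headers mirror the body's `group_id` and `generation_id` fields. This is a consistency check only: headers can be set by any sender, so it does not authenticate the request.

```python
from typing import Any, Mapping

KNOWN_EVENTS = {
    "generation.completed",
    "async-embedding.completed",
    "slow-group.completed",
}


def headers_match_body(headers: Mapping[str, str], body: Mapping[str, Any]) -> bool:
    """Return True when the X-Inference-* headers agree with the JSON body."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}

    event = normalized.get("x-inference-event")
    if event not in KNOWN_EVENTS or event != body.get("event"):
        return False

    if event == "slow-group.completed":
        header_id = normalized.get("x-inference-group-id")
        body_id = body.get("group_id")
    else:
        header_id = normalized.get("x-inference-generation-id")
        body_id = body.get("generation_id")

    # A missing header never counts as a match.
    return header_id is not None and header_id == body_id
```

In a handler, a `False` result would typically be logged and the payload skipped rather than processed.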

## Using Webhooks in Generations

Include the webhook identifier as `metadata.webhook_id` in your generation request:

### Chat Completions

<CodeGroup>
  ```typescript TypeScript theme={"system"}
  import OpenAI from "openai";

  const client = new OpenAI({
    baseURL: "https://api.inference.net/v1/slow",
    apiKey: process.env.INFERENCE_API_KEY,
  });

  const response = await client.chat.completions.create({
    model: "google/gemma-3-27b-instruct/bf-16",
    messages: [{ role: "user", content: "Hello!" }],
    // @ts-expect-error metadata is not in the OpenAI SDK types
    metadata: { webhook_id: "YOUR_WEBHOOK_IDENTIFIER" },
  });
  ```

  ```python Python theme={"system"}
  import os
  from openai import OpenAI

  client = OpenAI(
      base_url="https://api.inference.net/v1/slow",
      api_key=os.environ["INFERENCE_API_KEY"],
  )

  response = client.chat.completions.create(
      model="google/gemma-3-27b-instruct/bf-16",
      messages=[{"role": "user", "content": "Hello!"}],
      extra_body={"metadata": {"webhook_id": "YOUR_WEBHOOK_IDENTIFIER"}},
  )
  ```

  ```bash cURL theme={"system"}
  curl https://api.inference.net/v1/slow/chat/completions \
    -H "Authorization: Bearer $INFERENCE_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "google/gemma-3-27b-instruct/bf-16",
      "messages": [{"role": "user", "content": "Hello!"}],
      "metadata": {"webhook_id": "YOUR_WEBHOOK_IDENTIFIER"}
    }'
  ```
</CodeGroup>

### Embeddings

<CodeGroup>
  ```typescript TypeScript theme={"system"}
  const embeddingResponse = await fetch(
    "https://api.inference.net/v1/async/embeddings",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.INFERENCE_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "qwen/qwen3-embedding-4b",
        input: ["Text to embed", "Another text to embed"],
        metadata: { webhook_id: "YOUR_WEBHOOK_IDENTIFIER" },
      }),
    },
  );
  ```

  ```python Python theme={"system"}
  import os
  import requests

  response = requests.post(
      "https://api.inference.net/v1/async/embeddings",
      headers={
          "Authorization": f"Bearer {os.environ['INFERENCE_API_KEY']}",
          "Content-Type": "application/json",
      },
      json={
          "model": "qwen/qwen3-embedding-4b",
          "input": ["Text to embed", "Another text to embed"],
          "metadata": {"webhook_id": "YOUR_WEBHOOK_IDENTIFIER"},
      },
  )
  ```

  ```bash cURL theme={"system"}
  curl https://api.inference.net/v1/async/embeddings \
    -H "Authorization: Bearer $INFERENCE_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "qwen/qwen3-embedding-4b",
      "input": ["Text to embed", "Another text to embed"],
      "metadata": {"webhook_id": "YOUR_WEBHOOK_IDENTIFIER"}
    }'
  ```
</CodeGroup>

## Minimal Webhook Handler Examples

<CodeGroup>
  ```typescript TypeScript theme={"system"}
  import express from "express";

  const app = express();

  // express.json() defaults to a 100kb body limit; raise it to cover the 10MB maximum payload
  app.post("/webhook", express.json({ limit: "10mb" }), (req, res) => {
    // Acknowledge immediately, then process asynchronously
    res.status(200).json({ received: true });

    if (req.body.event === "generation.completed") {
      setImmediate(() => {
        console.log("Generation completed:", req.body.generation_id);
        // Your processing logic here
      });
    } else if (req.body.event === "async-embedding.completed") {
      setImmediate(() => {
        console.log("Embedding completed:", req.body.generation_id);
        console.log("Number of embeddings:", req.body.data.response.data.length);
      });
    } else if (req.body.event === "slow-group.completed") {
      setImmediate(() => {
        console.log("Group completed:", req.body.group_id);
        console.log("Group size:", req.body.data.group_size);
        req.body.data.generations.forEach((gen: any) => {
          console.log(`Generation ${gen.generationId}: ${gen.state}`);
        });
      });
    }
  });

  // Port 3000 is an arbitrary choice for local testing
  app.listen(3000);
  ```

  ```python Python theme={"system"}
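  from fastapi import BackgroundTasks, FastAPI

  app = FastAPI()

  @app.post("/webhook")
  async def handle_webhook(payload: dict, background_tasks: BackgroundTasks):
      # Queue the work; FastAPI runs background tasks after the response is sent
      background_tasks.add_task(process_webhook, payload)
      return {"received": True}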

  def process_webhook(payload):
      if payload["event"] == "generation.completed":
          print(f"Processing generation {payload['generation_id']}")
          # Your processing logic here
      elif payload["event"] == "async-embedding.completed":
          print(f"Processing embedding {payload['generation_id']}")
          print(f"Number of embeddings: {len(payload['data']['response']['data'])}")
      elif payload["event"] == "slow-group.completed":
          print(f"Processing group {payload['group_id']}")
          print(f"Group size: {payload['data']['group_size']}")
          for gen in payload["data"]["generations"]:
              print(f"Generation {gen['generationId']}: {gen['state']}")
  ```
</CodeGroup>

## Timing & Limits

| Metric           | Value            | Notes                               |
| ---------------- | ---------------- | ----------------------------------- |
| Response timeout | 30 seconds       | Must respond within this time       |
| Retry attempts   | 3                | With exponential backoff            |
| Max payload size | 10 MB            | Typical: 5-50 KB                    |
| Delivery time    | Under 60 seconds | From completion to webhook delivery |

## Response Codes

| Code    | Meaning            | Retry? |
| ------- | ------------------ | ------ |
| 200-299 | Success            | No     |
| 400-499 | Client error       | No     |
| 500-599 | Server error       | Yes    |
| Timeout | No response in 30s | Yes    |
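The table above can be read as a small decision function, which is handy when deciding what status your own endpoint should return: a 5xx asks for redelivery, while a 4xx does not. The sketch below encodes only the documented rows. The name `webhook_will_be_retried` is illustrative, and status codes outside the table (1xx, 3xx) raise an error because their handling is not documented here.

```python
from typing import Optional


def webhook_will_be_retried(status_code: Optional[int]) -> bool:
    """Map an endpoint response to the documented retry behavior.

    Pass None to represent a timeout (no response within 30 seconds).
    """
    if status_code is None:
        return True  # Timeout: retried
    if 200 <= status_code <= 299:
        return False  # Success: not retried
    if 400 <= status_code <= 499:
        return False  # Client error: not retried
    if 500 <= status_code <= 599:
        return True  # Server error: retried
    raise ValueError(f"Status {status_code} is not covered by the response code table")


print(webhook_will_be_retried(503))   # True
print(webhook_will_be_retried(404))   # False
print(webhook_will_be_retried(None))  # True
```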

## Best Practices Checklist

* [ ] Respond with 200 OK immediately
* [ ] Process webhook data asynchronously
* [ ] Implement idempotency with `generation_id` or `group_id`
* [ ] Validate webhook source via headers
* [ ] Handle errors gracefully
* [ ] Monitor webhook processing
* [ ] Use an HTTPS endpoint
* [ ] Set up proper error logging
* [ ] Test the webhook with the dashboard's test feature
* [ ] Implement timeout handling
* [ ] Handle both individual and group completions
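Because failed deliveries are retried, the same event can arrive more than once. A minimal sketch of the idempotency item, assuming an in-memory set is acceptable for illustration (a real deployment would more likely use a database or cache shared across workers). The class and function names are hypothetical.

```python
from typing import Any


def delivery_key(payload: dict[str, Any]) -> tuple[str, str]:
    """Build a deduplication key from the event type and its documented ID field."""
    event = payload["event"]
    if event == "slow-group.completed":
        return (event, payload["group_id"])
    return (event, payload["generation_id"])


class SeenDeliveries:
    """Tracks which deliveries have already been processed (in memory only)."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, str]] = set()

    def first_time(self, payload: dict[str, Any]) -> bool:
        """Return True the first time a delivery is seen, False on repeats."""
        key = delivery_key(payload)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


seen = SeenDeliveries()
delivery = {"event": "generation.completed", "generation_id": "XBKcs7F1s2oJ_AHiLMbF4"}
print(seen.first_time(delivery))  # True
print(seen.first_time(delivery))  # False
```

Including the event type in the key keeps a group ID and a generation ID from colliding if they ever share a value.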

## Common Issues & Solutions

| Issue                  | Solution                                                                                             |
| ---------------------- | ---------------------------------------------------------------------------------------------------- |
| Not receiving webhooks | Check that the webhook is not disabled in the dashboard, test connectivity, and verify the HTTPS URL |
| Duplicate webhooks     | Implement idempotency and make sure your endpoint returns 200 OK                                     |
| Webhooks timing out    | Respond immediately and process asynchronously                                                       |
| Invalid payload        | Validate against the documented schema                                                               |
| Test webhook fails     | Check that the endpoint is publicly accessible and returns 200 OK                                    |
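For the "Invalid payload" row, one possible approach is to check incoming payloads against the field names shown under Payload Structures. The sketch below treats every documented field as required, which may be stricter than the API actually is (for example, a failed generation might omit `response`), so treat the result as a diagnostic rather than a hard rejection rule. All names here are illustrative.

```python
from typing import Any

_SINGLE_TOP = {"event", "timestamp", "webhook_id", "generation_id", "data"}
_SINGLE_DATA = {"state", "stateMessage", "request", "response", "finishedAt"}

# Field names copied from the documented payload structures.
DOCUMENTED_FIELDS = {
    "generation.completed": (_SINGLE_TOP, _SINGLE_DATA),
    "async-embedding.completed": (_SINGLE_TOP, _SINGLE_DATA),
    "slow-group.completed": (
        {"event", "timestamp", "group_id", "data"},
        {"group_size", "status", "generations"},
    ),
}


def missing_fields(payload: dict[str, Any]) -> list[str]:
    """Return a sorted list of documented fields absent from the payload."""
    event = payload.get("event")
    if event not in DOCUMENTED_FIELDS:
        return ["event"]  # Unknown or missing event type

    top_level, data_fields = DOCUMENTED_FIELDS[event]
    missing = [name for name in top_level if name not in payload]

    data = payload.get("data")
    if isinstance(data, dict):
        missing += [f"data.{name}" for name in data_fields if name not in data]
    elif "data" in payload:
        missing.append("data")  # Present, but not a JSON object

    return sorted(missing)
```

An empty list means the payload has every documented field for its event type.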

## Support Resources

* [Full Documentation](https://docs.inference.net)
* [API Reference](https://docs.inference.net/api)
* [Support](mailto:support@inference.net)
* Discord Community