Webhooks: Quick Reference - Inference.net Documentation

Dashboard Management

Webhooks are managed through the inference.net dashboard:

Navigate to Webhooks in the sidebar
Create, test, archive, or restore webhooks through the UI
Copy your webhook identifier for use in generation requests

Payload Structures

generation.completed

{
  "event": "generation.completed",
  "timestamp": "ISO 8601 timestamp",
  "webhook_id": "webhook identifier",
  "generation_id": "generation ID",
  "data": {
    "state": "Success|Failed",
    "stateMessage": "Human readable status",
    "request": { /* Original request with metadata */ },
    "response": { /* OpenAI format response */ },
    "finishedAt": "ISO 8601 timestamp"
  }
}

async-embedding.completed

{
  "event": "async-embedding.completed",
  "timestamp": "ISO 8601 timestamp",
  "webhook_id": "webhook identifier",
  "generation_id": "generation ID",
  "data": {
    "state": "Success|Failed",
    "stateMessage": "Human readable status",
    "request": { /* Original embeddings request with metadata */ },
    "response": { /* OpenAI format embeddings response */ },
    "finishedAt": "ISO 8601 timestamp"
  }
}

slow-group.completed

{
  "event": "slow-group.completed",
  "timestamp": "ISO 8601 timestamp",
  "group_id": "group ID",
  "data": {
    "group_size": 2,
    "status": "completed",
    "generations": [
      {
        "generationId": "generation ID",
        "state": "Success|Failed",
        "stateMessage": "Human readable status",
        "request": { /* Original request */ },
        "response": { /* OpenAI format response */ },
        "finishedAt": "ISO 8601 timestamp or null"
      }
    ]
  }
}
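
For typed handlers, the three payloads above can be modeled as a discriminated union on the `event` field. The following is a minimal sketch, not an official SDK type: the type names and the `describeEvent` helper are hypothetical, and `request` and `response` are left as `unknown` because their exact shape depends on the original request.

```typescript
// Hypothetical type names; field names mirror the payloads documented above.
type GenerationState = "Success" | "Failed";

interface CompletedData {
  state: GenerationState;
  stateMessage: string;
  request: unknown; // original request with metadata
  response: unknown; // OpenAI-format response
  finishedAt: string; // ISO 8601 timestamp
}

interface GenerationCompletedEvent {
  event: "generation.completed";
  timestamp: string;
  webhook_id: string;
  generation_id: string;
  data: CompletedData;
}

interface AsyncEmbeddingCompletedEvent {
  event: "async-embedding.completed";
  timestamp: string;
  webhook_id: string;
  generation_id: string;
  data: CompletedData;
}

interface GroupGeneration {
  generationId: string;
  state: GenerationState;
  stateMessage: string;
  request: unknown;
  response: unknown;
  finishedAt: string | null;
}

interface SlowGroupCompletedEvent {
  event: "slow-group.completed";
  timestamp: string;
  group_id: string;
  data: {
    group_size: number;
    status: string;
    generations: GroupGeneration[];
  };
}

type WebhookEvent =
  | GenerationCompletedEvent
  | AsyncEmbeddingCompletedEvent
  | SlowGroupCompletedEvent;

// Summarize an event; the switch narrows `evt` to the matching payload type.
function describeEvent(evt: WebhookEvent): string {
  switch (evt.event) {
    case "generation.completed":
    case "async-embedding.completed":
      return `${evt.event}: ${evt.generation_id} (${evt.data.state})`;
    case "slow-group.completed": {
      const failed = evt.data.generations.filter((g) => g.state === "Failed").length;
      return `${evt.event}: ${evt.group_id} (${failed}/${evt.data.group_size} failed)`;
    }
  }
}
```

Switching on `event` lets the compiler check that `generation_id` is only read on single-generation events and `group_id` only on group events.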

Headers

Header	Description	Example
`X-Inference-Event`	Event type	`generation.completed`, `async-embedding.completed`, or `slow-group.completed`
`X-Inference-Webhook-ID`	Webhook identifier	`AhALzdz8S`
`X-Inference-Generation-ID`	Generation ID (if applicable)	`XBKcs7F1s2oJ_AHiLMbF4`
`X-Inference-Group-ID`	Group ID (for group events)	`GRP_XYZ123`
`User-Agent`	inference.net webhook agent	`inference.net-Webhook/1.0`
`Content-Type`	Always `application/json`	`application/json`
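
The headers above can be used to route or log a delivery before the body is parsed. The following sketch collects them into one object; `readInferenceHeaders` and its types are hypothetical names, and the input shape is assumed to match what Node-style servers expose (header values as a string, a string array, or undefined).

```typescript
// Hypothetical helper: collect the X-Inference-* headers into one object.
type HeaderBag = Record<string, string | string[] | undefined>;

interface InferenceHeaders {
  event?: string;
  webhookId?: string;
  generationId?: string;
  groupId?: string;
}

function readInferenceHeaders(headers: HeaderBag): InferenceHeaders {
  // HTTP header names are case-insensitive, so normalize keys before lookup.
  const lower = new Map<string, string>();
  for (const [name, value] of Object.entries(headers)) {
    const first = Array.isArray(value) ? value[0] : value;
    if (first !== undefined) {
      lower.set(name.toLowerCase(), first);
    }
  }
  return {
    event: lower.get("x-inference-event"),
    webhookId: lower.get("x-inference-webhook-id"),
    generationId: lower.get("x-inference-generation-id"),
    groupId: lower.get("x-inference-group-id"),
  };
}
```

Fields for headers that are absent from a delivery, such as the group ID on a single-generation event, come back as `undefined`.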

Using Webhooks in Generations

Include the webhook identifier in your generation request metadata:

Chat Completions

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.inference.net/v1/slow",
  apiKey: process.env.INFERENCE_API_KEY,
});

const response = await client.chat.completions.create({
  model: "google/gemma-3-27b-instruct/bf-16",
  messages: [{ role: "user", content: "Hello!" }],
  // @ts-expect-error metadata is not in the OpenAI SDK types
  metadata: { webhook_id: "YOUR_WEBHOOK_IDENTIFIER" },
});

Embeddings

const embeddingResponse = await fetch(
  "https://api.inference.net/v1/async/embeddings",
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.INFERENCE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "qwen/qwen3-embedding-4b",
      input: ["Text to embed", "Another text to embed"],
      metadata: { webhook_id: "YOUR_WEBHOOK_IDENTIFIER" },
    }),
  },
);

Minimal Webhook Handler Examples

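import express from "express";

const app = express();

// express.json() rejects bodies over 100kb by default; raise the limit to
// match the documented 10 MB maximum payload size.
app.post("/webhook", express.json({ limit: "10mb" }), (req, res) => {
  // Acknowledge first so the delivery is not retried, then process asynchronously.
  res.status(200).json({ received: true });

  if (req.body.event === "generation.completed") {
    setImmediate(() => {
      console.log("Generation completed:", req.body.generation_id);
      // Your processing logic here
    });
  } else if (req.body.event === "async-embedding.completed") {
    setImmediate(() => {
      console.log("Embedding completed:", req.body.generation_id);
      console.log("Number of embeddings:", req.body.data.response.data.length);
    });
  } else if (req.body.event === "slow-group.completed") {
    setImmediate(() => {
      console.log("Group completed:", req.body.group_id);
      console.log("Group size:", req.body.data.group_size);
      req.body.data.generations.forEach((gen: { generationId: string; state: string }) => {
        console.log(`Generation ${gen.generationId}: ${gen.state}`);
      });
    });
  }
});

// Port 3000 is an example value; use whatever port your deployment exposes.
app.listen(3000);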

Timing & Limits

Metric	Value	Notes
Response timeout	30 seconds	Must respond within this time
Retry attempts	3	With exponential backoff
Max payload size	10 MB	Typical: 5-50 KB
Delivery time	Under 60 seconds	From completion to webhook delivery

Response Codes

Code	Meaning	Retry?
200-299	Success	No
400-499	Client error	No
500-599	Server error	Yes
Timeout	No response within 30 seconds	Yes
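
When testing a handler locally, it can help to encode the table above as a small predicate. This is a sketch of the documented rules only: `willBeRetried` is a hypothetical name, and returning `false` for codes the table does not list (1xx, 3xx) is an assumption, not documented behavior.

```typescript
// "timeout" stands for no response within the 30-second window.
type DeliveryOutcome = number | "timeout";

// Returns true when, per the table above, the delivery would be retried.
function willBeRetried(outcome: DeliveryOutcome): boolean {
  if (outcome === "timeout") return true;
  // Only 500-599 is documented as retried; everything else is treated as final here.
  return outcome >= 500 && outcome <= 599;
}
```

In practice this means a handler should return a 5xx status only for failures it wants redelivered, since a 4xx response is not retried.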

Best Practices Checklist

Common Issues & Solutions

Issue	Solution
Not receiving webhooks	Check that the webhook is not disabled in the dashboard, test connectivity, and verify that the URL uses HTTPS
Duplicate webhooks	Make the handler idempotent and ensure it returns 200 OK
Webhooks timing out	Respond immediately and process asynchronously
Invalid payload	Validate against the documented schema
Test webhook fails	Check that the endpoint is publicly accessible and returns 200 OK
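
The idempotency advice for duplicate webhooks can be sketched as a de-duplication check run before any processing. This is a minimal in-memory illustration with hypothetical names (`DeliveryIds`, `deliveryKey`, `markIfNew`). It assumes that the event type combined with `generation_id` or `group_id` identifies a delivery, which the documentation does not state explicitly, and a multi-instance deployment would need a shared store instead of a process-local `Set`.

```typescript
// Only the identifying fields of a payload are needed for de-duplication.
interface DeliveryIds {
  event: string;
  generation_id?: string;
  group_id?: string;
}

// Process-local record of deliveries already handled (assumption: single instance).
const seenDeliveries = new Set<string>();

// Build a key from the event type plus whichever ID the payload carries.
function deliveryKey(payload: DeliveryIds): string {
  return `${payload.event}:${payload.generation_id ?? payload.group_id ?? "unknown"}`;
}

// Returns true the first time a delivery is seen and false for repeats.
function markIfNew(payload: DeliveryIds): boolean {
  const key = deliveryKey(payload);
  if (seenDeliveries.has(key)) return false;
  seenDeliveries.add(key);
  return true;
}
```

A handler would still respond with 200 OK to a repeat delivery and simply skip the processing step when `markIfNew` returns `false`.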

Support Resources
