Dashboard Management

Webhooks are managed through the inference.net dashboard:
  1. Navigate to Webhooks in the sidebar
  2. Create, test, archive, or restore webhooks through the UI
  3. Copy your webhook identifier for use in generation requests

Payload Structures

generation.completed

{
  "event": "generation.completed",
  "timestamp": "ISO 8601 timestamp",
  "webhook_id": "webhook identifier",
  "generation_id": "generation ID",
  "data": {
    "state": "Success|Failed",
    "stateMessage": "Human readable status",
    "request": { /* Original request with metadata */ },
    "response": { /* OpenAI format response */ },
    "finishedAt": "ISO 8601 timestamp"
  }
}

async-embedding.completed

{
  "event": "async-embedding.completed",
  "timestamp": "ISO 8601 timestamp",
  "webhook_id": "webhook identifier",
  "generation_id": "generation ID",
  "data": {
    "state": "Success|Failed",
    "stateMessage": "Human readable status",
    "request": { /* Original embeddings request with metadata */ },
    "response": { /* OpenAI format embeddings response */ },
    "finishedAt": "ISO 8601 timestamp"
  }
}

slow-group.completed

{
  "event": "slow-group.completed",
  "timestamp": "ISO 8601 timestamp",
  "group_id": "group ID",
  "data": {
    "group_size": 2,
    "status": "completed",
    "generations": [
      {
        "generationId": "generation ID",
        "state": "Success|Failed",
        "stateMessage": "Human readable status",
        "request": { /* Original request */ },
        "response": { /* OpenAI format response */ },
        "finishedAt": "ISO 8601 timestamp or null"
      }
    ]
  }
}
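
The three payloads above share an `event` field that can serve as a discriminator. The following is a minimal sketch of how a receiver might type and shallowly check an incoming body. The type names and the `parseWebhookEvent` helper are hypothetical, and the field types are inferred from the examples above rather than taken from a published schema.

```typescript
// Shapes inferred from the example payloads; treat them as assumptions.
type GenerationState = "Success" | "Failed";

interface SingleCompletionEvent {
  event: "generation.completed" | "async-embedding.completed";
  timestamp: string;
  webhook_id: string;
  generation_id: string;
  data: {
    state: GenerationState;
    stateMessage: string;
    request: unknown;
    response: unknown;
    finishedAt: string;
  };
}

interface GroupCompletionEvent {
  event: "slow-group.completed";
  timestamp: string;
  group_id: string;
  data: {
    group_size: number;
    status: string;
    generations: Array<{
      generationId: string;
      state: GenerationState;
      stateMessage: string;
      request: unknown;
      response: unknown;
      finishedAt: string | null;
    }>;
  };
}

type WebhookEvent = SingleCompletionEvent | GroupCompletionEvent;

// Hypothetical helper: a shallow structural check, not full schema validation.
function parseWebhookEvent(body: unknown): WebhookEvent | null {
  if (typeof body !== "object" || body === null) return null;
  const b = body as Record<string, unknown>;
  if (typeof b.data !== "object" || b.data === null) return null;

  switch (b.event) {
    case "generation.completed":
    case "async-embedding.completed":
      // Single-item events carry a top-level generation_id.
      return typeof b.generation_id === "string"
        ? (body as SingleCompletionEvent)
        : null;
    case "slow-group.completed": {
      // Group events carry a group_id and a list of generations.
      const data = b.data as Record<string, unknown>;
      return typeof b.group_id === "string" && Array.isArray(data.generations)
        ? (body as GroupCompletionEvent)
        : null;
    }
    default:
      return null;
  }
}
```

After a successful parse, checking `event` narrows the union, so `group_id` is only accessible on group events and `generation_id` only on single-item events.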

Headers

| Header | Description | Example |
| --- | --- | --- |
| X-Inference-Event | Event type | generation.completed, async-embedding.completed, or slow-group.completed |
| X-Inference-Webhook-ID | Webhook identifier | AhALzdz8S |
| X-Inference-Generation-ID | Generation ID (if applicable) | XBKcs7F1s2oJ_AHiLMbF4 |
| X-Inference-Group-ID | Group ID (for group events) | GRP_XYZ123 |
| User-Agent | inference.net webhook agent | inference.net-Webhook/1.0 |
| Content-Type | Always application/json | application/json |
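
As a rough illustration of reading the headers listed above, the sketch below pulls them out of a Node-style header map, where header names arrive lowercased. The `readInferenceHeaders` helper and its return shape are hypothetical. It only checks that the event header is present, so it is a consistency check and not a form of authentication.

```typescript
// Node's http module exposes request headers with lowercased names.
type HeaderMap = Record<string, string | string[] | undefined>;

interface InferenceHeaders {
  event: string;
  webhookId: string | undefined;
  generationId: string | undefined;
  groupId: string | undefined;
}

// A header may appear more than once; keep the first value.
function first(value: string | string[] | undefined): string | undefined {
  return Array.isArray(value) ? value[0] : value;
}

// Hypothetical helper: returns null when the event header is missing.
function readInferenceHeaders(headers: HeaderMap): InferenceHeaders | null {
  const event = first(headers["x-inference-event"]);
  if (!event) return null;
  return {
    event,
    webhookId: first(headers["x-inference-webhook-id"]),
    generationId: first(headers["x-inference-generation-id"]),
    groupId: first(headers["x-inference-group-id"]),
  };
}
```

A handler could compare the returned `event` with the `event` field in the body and reject the request when they disagree.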

Using Webhooks in Generations

Include the webhook identifier in your generation request metadata:

Chat Completions

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.inference.net/v1/slow",
  apiKey: process.env.INFERENCE_API_KEY,
});

const response = await client.chat.completions.create({
  model: "google/gemma-3-27b-instruct/bf-16",
  messages: [{ role: "user", content: "Hello!" }],
  // @ts-expect-error metadata is not in the OpenAI SDK types
  metadata: { webhook_id: "YOUR_WEBHOOK_IDENTIFIER" },
});

Embeddings

const embeddingResponse = await fetch(
  "https://api.inference.net/v1/async/embeddings",
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.INFERENCE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "qwen/qwen3-embedding-4b",
      input: ["Text to embed", "Another text to embed"],
      metadata: { webhook_id: "YOUR_WEBHOOK_IDENTIFIER" },
    }),
  },
);

Minimal Webhook Handler Examples

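import express from "express";

const app = express();

// express.json() defaults to a 100kb body limit; raise it to match the
// documented 10MB maximum payload size.
app.post("/webhook", express.json({ limit: "10mb" }), (req, res) => {
  // Acknowledge right away so the delivery stays within the response timeout.
  res.status(200).json({ received: true });

  // Defer the actual work so it does not delay the response.
  if (req.body.event === "generation.completed") {
    setImmediate(() => {
      console.log("Generation completed:", req.body.generation_id);
      // Your processing logic here
    });
  } else if (req.body.event === "async-embedding.completed") {
    setImmediate(() => {
      console.log("Embedding completed:", req.body.generation_id);
      console.log("Number of embeddings:", req.body.data.response.data.length);
    });
  } else if (req.body.event === "slow-group.completed") {
    setImmediate(() => {
      console.log("Group completed:", req.body.group_id);
      console.log("Group size:", req.body.data.group_size);
      req.body.data.generations.forEach(
        (gen: { generationId: string; state: string }) => {
          console.log(`Generation ${gen.generationId}: ${gen.state}`);
        },
      );
    });
  }
});

// The port is an arbitrary choice for local testing.
app.listen(3000);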

Timing & Limits

| Metric | Value | Notes |
| --- | --- | --- |
| Response timeout | 30 seconds | Must respond within this time |
| Retry attempts | 3 | With exponential backoff |
| Max payload size | 10 MB | Typical: 5-50 KB |
| Delivery time | Under 60 seconds | From completion to webhook |

Response Codes

| Code | Meaning | Retry? |
| --- | --- | --- |
| 200-299 | Success | No |
| 400-499 | Client error | No |
| 500-599 | Server error | Yes |
| Timeout | No response in 30s | Yes |
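
The retry rules in the response-code table above can be restated as a small predicate, which may help when deciding what status your endpoint should return. This is only a sketch of the documented behavior as seen from the receiving side. The `willBeRetried` name is hypothetical, and treating codes outside the listed ranges (such as 3xx) as not retried is an assumption, since the table does not cover them.

```typescript
// A delivery attempt either yields an HTTP status code or times out.
type DeliveryOutcome = number | "timeout";

// Hypothetical helper mirroring the table: only 5xx responses and
// timeouts lead to a retry.
function willBeRetried(outcome: DeliveryOutcome): boolean {
  if (outcome === "timeout") return true;
  return outcome >= 500 && outcome <= 599;
}
```

In practice this means a 4xx response tells the sender to give up on that delivery, while a 5xx response or a slow handler results in another attempt.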

Best Practices Checklist

  • Respond with 200 OK immediately
  • Process webhook data asynchronously
  • Implement idempotency with generation_id or group_id
  • Validate webhook source via headers
  • Handle errors gracefully
  • Monitor webhook processing
  • Use an HTTPS endpoint
  • Set up proper error logging
  • Test the webhook with the dashboard's test feature
  • Implement timeout handling
  • Handle both individual and group completions
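
Because failed deliveries are retried, the same event can arrive more than once. One way to approach the idempotency item above is to build a key from the event type plus `generation_id` or `group_id` and skip keys that were already processed. The sketch below uses an in-memory `Set` purely for illustration. The helper names are hypothetical, and a real deployment would likely need a shared store with expiry.

```typescript
// Only the fields needed to build a deduplication key.
interface DeliverySummary {
  event: string;
  generation_id?: string;
  group_id?: string;
}

// Illustration only: this set is lost on restart and not shared across processes.
const seenDeliveries = new Set<string>();

// Hypothetical helper: combine event type and ID so that a generation
// and a group with the same ID do not collide.
function idempotencyKey(body: DeliverySummary): string | null {
  const id = body.generation_id ?? body.group_id;
  return id ? `${body.event}:${id}` : null;
}

// Returns true the first time a key is seen and false for repeats.
function isFirstDelivery(body: DeliverySummary): boolean {
  const key = idempotencyKey(body);
  if (key === null) return true; // No ID to deduplicate on, so process it.
  if (seenDeliveries.has(key)) return false;
  seenDeliveries.add(key);
  return true;
}
```

A handler would still respond with 200 to a duplicate and simply skip the processing step when `isFirstDelivery` returns false.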

Common Issues & Solutions

| Issue | Solution |
| --- | --- |
| Not receiving webhooks | Check that the webhook is not disabled in the dashboard, test connectivity, and verify the HTTPS URL |
| Duplicate webhooks | Implement idempotency and make sure the endpoint returns 200 OK |
| Webhooks timing out | Respond immediately and process asynchronously |
| Invalid payload | Validate against the documented schema |
| Test webhook fails | Check that the endpoint is publicly accessible and returns 200 OK |

Support Resources