Structured Outputs
Ensure responses adhere to a JSON schema.
When using Structured Outputs, always include instructions in the system prompt to respond in JSON format. For example: “You are a helpful assistant. Respond in JSON format.”
Introduction
JSON is one of the most widely used formats in the world for applications to exchange data.
Structured Outputs is a feature that ensures the model will always generate responses that adhere to your supplied JSON Schema, so you don’t need to worry about the model omitting a required key, or hallucinating an invalid enum value.
Some benefits of Structured Outputs include:
- Reliable type-safety: No need to validate or retry incorrectly formatted responses
- Simpler prompting: No need for strongly worded prompts to achieve consistent formatting
Getting Started
You’ll need an Inference.net account and API key to use Structured Outputs. See our Quick Start Guide for instructions on how to create an account and get an API key.
Install the OpenAI SDK for your language of choice.
To connect to Inference.net using the OpenAI SDK, you will need to set the base URL to https://api.inference.net/v1.
In this example, we are reading the API key from the environment variable INFERENCE_API_KEY.
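A minimal setup with the OpenAI SDK for JavaScript might look like this (a sketch; adjust to your SDK version and runtime):

```typescript
import OpenAI from "openai";

// Point the OpenAI SDK at Inference.net and read the API key from the environment.
const client = new OpenAI({
  baseURL: "https://api.inference.net/v1",
  apiKey: process.env.INFERENCE_API_KEY,
});
```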
When to use Structured Outputs
Structured Outputs is suitable when you want the model's responses to the user to follow a structured schema that you define.
For example, if you are building a math tutoring application, you might want the assistant to respond to your user using a specific JSON Schema so that you can generate a UI that displays different parts of the model’s output in distinct ways.
Put simply:
- If you are connecting the model to tools, functions, data, etc. in your system, then you should use function calling
- If you want to structure the model's output when it responds to the user, then you should use a structured response_format
Structured Outputs vs JSON mode
Structured Outputs is the evolution of JSON mode. While both ensure valid JSON is produced, only Structured Outputs ensures schema adherence. Both Structured Outputs and JSON mode are supported in the Chat Completions API and Batch API.
We recommend always using Structured Outputs instead of JSON mode when possible.
| | Structured Outputs | JSON Mode |
|---|---|---|
| Outputs valid JSON | Yes | Yes |
| Adheres to schema | Yes (see supported schemas) | No |
| Enabling | response_format: { type: "json_schema", json_schema: {"strict": true, "schema": ... } } | response_format: { type: "json_object" } |
Example
Chain of thought
You can ask the model to output an answer in a structured, step-by-step way, to guide the user through the solution.
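As a sketch, the request below constrains a math tutor's reply to a list of steps plus a final answer. It reuses the client from Getting Started; the model ID and schema are illustrative:

```typescript
const completion = await client.chat.completions.create({
  model: "meta-llama/llama-3.1-8b-instruct", // illustrative model ID; use any model available on Inference.net
  messages: [
    {
      role: "system",
      content:
        "You are a helpful math tutor. Guide the user through the solution step by step. Respond in JSON format.",
    },
    { role: "user", content: "How can I solve 8x + 7 = -23?" },
  ],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "math_reasoning",
      strict: true,
      schema: {
        type: "object",
        properties: {
          steps: {
            type: "array",
            items: {
              type: "object",
              properties: {
                explanation: { type: "string" },
                output: { type: "string" },
              },
              required: ["explanation", "output"],
              additionalProperties: false,
            },
          },
          final_answer: { type: "string" },
        },
        required: ["steps", "final_answer"],
        additionalProperties: false,
      },
    },
  },
});

console.log(completion.choices[0].message.content);
```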
Example response
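An illustrative response conforming to the schema above (not actual model output) might look like:

```json
{
  "steps": [
    { "explanation": "Subtract 7 from both sides to isolate the term with x.", "output": "8x = -30" },
    { "explanation": "Divide both sides by 8.", "output": "x = -30 / 8" },
    { "explanation": "Simplify the fraction.", "output": "x = -15/4" }
  ],
  "final_answer": "x = -15/4"
}
```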
Defining Structured Outputs Schemas with the OpenAI SDK
In addition to supporting JSON Schema in the REST API, the OpenAI SDK for JavaScript makes it easy to define object schemas using Zod. Below, you can see how to extract information from unstructured text that conforms to a schema defined in code. There is also a Python SDK helper, but it is currently in beta and does not support complex schemas.
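A sketch of this pattern, extracting contact details from free-form text with zodResponseFormat from the SDK's zod helper (the schema, prompt, and model ID are illustrative):

```typescript
import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const client = new OpenAI({
  baseURL: "https://api.inference.net/v1",
  apiKey: process.env.INFERENCE_API_KEY,
});

// Describe the structure we want to extract from unstructured text.
const ContactInfo = z.object({
  name: z.string(),
  email: z.string(),
  phone: z.string(),
});

const completion = await client.beta.chat.completions.parse({
  model: "meta-llama/llama-3.1-8b-instruct", // illustrative model ID
  messages: [
    {
      role: "system",
      content: "Extract the contact information from the user's message. Respond in JSON format.",
    },
    { role: "user", content: "You can reach me at jane.doe@example.com or 555-0123. -- Jane Doe" },
  ],
  response_format: zodResponseFormat(ContactInfo, "contact_info"),
});

// parsed is typed according to the Zod schema.
console.log(completion.choices[0].message.parsed);
```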
Step By Step Example - Parsing The Model’s Output
You can also use the OpenAI SDK helper to parse the model’s output into an object of your desired format.
The following examples use OpenAI’s built-in zod helper for more complex schema specification.
The Python SDK helper is currently in beta and does not support complex schemas, so there are no Python snippets for the following examples.
Step 1: Define your object
First, you must define an object or data structure that represents the JSON Schema the model should be constrained to follow.
For example, you can define an object like this:
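For instance, a sketch using Zod (the CalendarEvent schema is illustrative):

```typescript
import { z } from "zod";

// The data structure the model's response should conform to.
const CalendarEvent = z.object({
  name: z.string(),
  date: z.string(),
  participants: z.array(z.string()),
});
```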
Step 2: Supply your object in the API call
You can use the parse method to automatically parse the JSON response into the object you defined.
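A sketch of the call, reusing the client and CalendarEvent defined above (the model ID is illustrative):

```typescript
import { zodResponseFormat } from "openai/helpers/zod";

const completion = await client.beta.chat.completions.parse({
  model: "meta-llama/llama-3.1-8b-instruct", // illustrative model ID
  messages: [
    { role: "system", content: "Extract the event information. Respond in JSON format." },
    { role: "user", content: "Alice and Bob are going to a science fair on Friday." },
  ],
  // The helper converts the Zod schema into a JSON Schema response_format.
  response_format: zodResponseFormat(CalendarEvent, "calendar_event"),
});

// parsed is typed as a CalendarEvent.
const event = completion.choices[0].message.parsed;
console.log(event);
```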
Under the hood, the SDK takes care of supplying the JSON schema corresponding to your data structure, and then parsing the response as an object.
Handling edge cases
In some cases, the model might not generate a valid response that matches the provided JSON schema.
This can happen if, for example, you reach the max tokens limit and the response is incomplete.
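A minimal sketch of detecting a truncated response, reusing the client, CalendarEvent, and zodResponseFormat import from the steps above (model ID illustrative, max_tokens deliberately low):

```typescript
const completion = await client.chat.completions.create({
  model: "meta-llama/llama-3.1-8b-instruct", // illustrative model ID
  messages: [
    { role: "system", content: "Extract the event information. Respond in JSON format." },
    { role: "user", content: "Alice and Bob are going to a science fair on Friday." },
  ],
  response_format: zodResponseFormat(CalendarEvent, "calendar_event"),
  max_tokens: 20, // deliberately low, to illustrate truncation
});

const choice = completion.choices[0];

// If generation stopped because it ran out of tokens, the JSON may be incomplete.
if (choice.finish_reason === "length") {
  throw new Error("Incomplete response: max_tokens was reached before the JSON was finished.");
}

try {
  const event = JSON.parse(choice.message.content ?? "");
  console.log(event);
} catch (err) {
  // Defensive parsing in case the output is still not valid JSON.
  console.error("Failed to parse model output as JSON:", err);
}
```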
Streaming
You can use streaming to process model responses as they are being generated, and parse them as structured data.
That way, you don’t have to wait for the entire response to complete before handling it. This is particularly useful if you would like to display JSON fields one by one, or handle function call arguments as soon as they are available.
Here is how you can stream a model response with the stream helper:
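A sketch using the SDK's streaming helper, reusing the client and CalendarEvent from above (the model ID is illustrative; the parsed field on the final message assumes the SDK's structured-output streaming support):

```typescript
const stream = client.beta.chat.completions.stream({
  model: "meta-llama/llama-3.1-8b-instruct", // illustrative model ID
  messages: [
    { role: "system", content: "Extract the event information. Respond in JSON format." },
    { role: "user", content: "Alice and Bob are going to a science fair on Friday." },
  ],
  response_format: zodResponseFormat(CalendarEvent, "calendar_event"),
});

// Handle content as it arrives instead of waiting for the full response.
stream.on("content", (delta, snapshot) => {
  process.stdout.write(delta);
});

// Once the stream ends, the final completion contains the full (parsed) message.
const completion = await stream.finalChatCompletion();
console.log(completion.choices[0].message.parsed);
```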
Supported schemas
Structured Outputs supports a subset of the JSON Schema language.
Supported types
The following types are supported for Structured Outputs:
- String
- Number
- Boolean
- Integer
- Object
- Array
- Enum
- anyOf
Required Fields And Additional Properties
To use Structured Outputs, all properties on all objects must be specified as required.
Also, additionalProperties must be set to false.
In the following example, note how both location and unit are listed as required properties.
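A sketch of such a schema, shown as the json_schema variant of response_format (the property names and descriptions are illustrative):

```typescript
const weatherResponseFormat = {
  type: "json_schema",
  json_schema: {
    name: "get_weather",
    strict: true,
    schema: {
      type: "object",
      properties: {
        location: { type: "string", description: "City and country, e.g. Paris, France" },
        unit: { type: "string", enum: ["celsius", "fahrenheit"] },
      },
      // Every property must be listed as required...
      required: ["location", "unit"],
      // ...and additional properties must be disallowed.
      additionalProperties: false,
    },
  },
} as const;
```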
Schema Limitations Depend on the Model
Limitations on the number of properties, enum values, and total string size may vary depending on the model you are using.
Key ordering
When using Structured Outputs, outputs will be produced in the same order as the keys in your schema.
JSON mode
When using JSON mode, always instruct the model to produce JSON in the system prompt. For example: “You are a helpful assistant. Respond in JSON format.”
JSON mode is a more basic version of the Structured Outputs feature. While JSON mode ensures that model output is valid JSON, Structured Outputs reliably matches the model’s output to the schema you specify. We recommend you use Structured Outputs if it is supported for your use case.
When JSON mode is turned on, the model's output is guaranteed to be valid JSON, except in some edge cases that you should detect and handle appropriately.
To turn on JSON mode with the Chat Completions or Assistants API, you can set the response_format to { "type": "json_object" }.
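A minimal sketch of a JSON mode request, reusing the client from Getting Started (the model ID is illustrative):

```typescript
const completion = await client.chat.completions.create({
  model: "meta-llama/llama-3.1-8b-instruct", // illustrative model ID
  messages: [
    // JSON mode requires an explicit instruction to produce JSON.
    { role: "system", content: "You are a helpful assistant. Respond in JSON format." },
    { role: "user", content: "List three primary colors." },
  ],
  response_format: { type: "json_object" },
});

console.log(JSON.parse(completion.choices[0].message.content ?? "{}"));
```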
Important notes:
- When using JSON mode, you must always instruct the model to produce JSON via some message in the conversation, for example via your system message. If you don’t include an explicit instruction to generate JSON, the model may generate non-JSON or an unending stream of whitespace.
- JSON mode will not guarantee the output matches any specific schema, only that it is valid and parses without errors. You should use Structured Outputs to ensure it matches your schema, or if that is not possible, you should use a validation library and potentially retries to ensure that the output matches your desired schema.
- Your application must detect and handle the edge cases that can result in the model output not being a complete JSON object (see below)
- Some models will wrap the JSON response in a Markdown code block (triple backticks). This should be detected and handled appropriately, as in the sketch below.
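A minimal sketch of such handling (the helper is hypothetical and not part of any SDK):

```typescript
// Hypothetical helper: strip a surrounding Markdown code fence, then attempt to parse.
function parseJsonOutput(raw: string): unknown {
  const unfenced = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/\s*```$/, "");
  try {
    return JSON.parse(unfenced);
  } catch {
    // Incomplete or invalid JSON: retry the request or surface an error to the caller.
    throw new Error("Model output was not a complete JSON object.");
  }
}
```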