Ensure responses adhere to a JSON schema.
When using Structured Outputs, always include instructions in the system prompt to respond in JSON format. For example: “You are a helpful assistant. Respond in JSON format.”
JSON is one of the most widely used formats in the world for applications to exchange data.
Structured Outputs is a feature that ensures the model will always generate responses that adhere to your supplied JSON Schema, so you don’t need to worry about the model omitting a required key, or hallucinating an invalid enum value.
Some benefits of Structured Outputs include:
You’ll need an Inference.net account and API key to use Structured Outputs. See our Quick Start Guide for instructions on how to create an account and get an API key.
Install the OpenAI SDK for your language of choice.
To connect to Inference.net using the OpenAI SDK, you will need to set the base URL to `https://api.inference.net/v1`.
In this example, we are reading the API key from the environment variable `INFERENCE_API_KEY`.
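A minimal sketch of that setup with the OpenAI Node SDK (it assumes `INFERENCE_API_KEY` is already set in your environment):

```typescript
import OpenAI from "openai";

// Point the SDK at Inference.net instead of the default OpenAI endpoint.
const client = new OpenAI({
  baseURL: "https://api.inference.net/v1",
  apiKey: process.env.INFERENCE_API_KEY,
});
```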
Structured Outputs are suitable when you want to specify a schema for the model to follow when it responds to the user.
For example, if you are building a math tutoring application, you might want the assistant to respond to your user using a specific JSON Schema so that you can generate a UI that displays different parts of the model’s output in distinct ways.
Put simply: you supply a JSON Schema through the `response_format` parameter, and the model's response will conform to it.
Structured Outputs is the evolution of JSON mode. While both ensure valid JSON is produced, only Structured Outputs ensures schema adherence. Both Structured Outputs and JSON mode are supported in the Chat Completions API and Batch API.
We recommend always using Structured Outputs instead of JSON mode when possible.
|  | Structured Outputs | JSON Mode |
| --- | --- | --- |
| Outputs valid JSON | Yes | Yes |
| Adheres to schema | Yes (see supported schemas) | No |
| Enabling | `response_format: { type: "json_schema", json_schema: {"strict": true, "schema": ...} }` | `response_format: { type: "json_object" }` |
You can ask the model to output an answer in a structured, step-by-step way, to guide the user through the solution.
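For illustration, a request for step-by-step math reasoning could look like the sketch below. The `math_reasoning` schema and the model identifier are assumptions made up for this example; substitute a model available on Inference.net.

```typescript
const completion = await client.chat.completions.create({
  model: "your-model-id", // placeholder: use a model available on Inference.net
  messages: [
    { role: "system", content: "You are a helpful math tutor. Respond in JSON format." },
    { role: "user", content: "How can I solve 8x + 7 = -23?" },
  ],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "math_reasoning", // hypothetical schema name for this example
      strict: true,
      schema: {
        type: "object",
        properties: {
          steps: {
            type: "array",
            items: {
              type: "object",
              properties: {
                explanation: { type: "string" },
                output: { type: "string" },
              },
              required: ["explanation", "output"],
              additionalProperties: false,
            },
          },
          final_answer: { type: "string" },
        },
        required: ["steps", "final_answer"],
        additionalProperties: false,
      },
    },
  },
});

// The content is a JSON string guaranteed to match the schema above.
console.log(completion.choices[0].message.content);
```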
In addition to supporting JSON Schema in the REST API, the OpenAI SDK for JavaScript makes it easy to define object schemas using Zod. Below, you can see how to extract information from unstructured text that conforms to a schema defined in code. There is also a Python SDK helper, but it is currently in beta and does not support complex schemas.
You can also use the OpenAI SDK helper to parse the model’s output into an object of your desired format.
The following examples use OpenAI’s built-in Zod helper for more complex schema specification.
The Python SDK helper is currently in beta and does not support complex schemas, so there are no Python snippets for the following examples.
First you must define an object or data structure to represent the JSON Schema that the model should be constrained to follow.
For example, you can define an object like this:
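A sketch of such a definition using Zod (the `CalendarEvent` shape is purely illustrative):

```typescript
import { z } from "zod";

// An example data structure the model's response must conform to.
const CalendarEvent = z.object({
  name: z.string(),
  date: z.string(),
  participants: z.array(z.string()),
});
```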
You can use the `parse` method to automatically parse the JSON response into the object you defined.
Under the hood, the SDK takes care of supplying the JSON schema corresponding to your data structure, and then parsing the response as an object.
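Continuing the sketch above, the `parse` helper and `zodResponseFormat` from the OpenAI Node SDK might be used like this (the model identifier and prompts are placeholders for this example):

```typescript
import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const client = new OpenAI({
  baseURL: "https://api.inference.net/v1",
  apiKey: process.env.INFERENCE_API_KEY,
});

const CalendarEvent = z.object({
  name: z.string(),
  date: z.string(),
  participants: z.array(z.string()),
});

const completion = await client.beta.chat.completions.parse({
  model: "your-model-id", // placeholder: use a model available on Inference.net
  messages: [
    { role: "system", content: "Extract the event information. Respond in JSON format." },
    { role: "user", content: "Alice and Bob are going to a science fair on Friday." },
  ],
  response_format: zodResponseFormat(CalendarEvent, "event"),
});

// `parsed` is a plain object already validated against CalendarEvent.
const event = completion.choices[0].message.parsed;
console.log(event);
```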
In some cases, the model might not generate a valid response that matches the provided JSON schema.
This can happen if, for example, you reach the max tokens limit and the response is incomplete.
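One way to detect this, sketched below, is to check the choice’s `finish_reason` before trusting the parsed output (the `completion` object is assumed to come from a call like the one above):

```typescript
const choice = completion.choices[0];

// If the model stopped because it hit the token limit, the JSON is likely truncated.
if (choice.finish_reason === "length") {
  throw new Error("Response was cut off by max_tokens; the JSON may be incomplete.");
}
```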
You can use streaming to process model responses as they are being generated, and parse them as structured data.
That way, you don’t have to wait for the entire response to complete before handling it. This is particularly useful if you would like to display JSON fields one by one, or handle function call arguments as soon as they are available.
Here is how you can stream a model response with the `stream` helper:
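A minimal sketch, reusing the `CalendarEvent` Zod schema defined earlier (the model identifier is again a placeholder):

```typescript
import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const client = new OpenAI({
  baseURL: "https://api.inference.net/v1",
  apiKey: process.env.INFERENCE_API_KEY,
});

const CalendarEvent = z.object({
  name: z.string(),
  date: z.string(),
  participants: z.array(z.string()),
});

const stream = client.beta.chat.completions
  .stream({
    model: "your-model-id", // placeholder: use a model available on Inference.net
    messages: [
      { role: "system", content: "Extract the event information. Respond in JSON format." },
      { role: "user", content: "Alice and Bob are going to a science fair on Friday." },
    ],
    response_format: zodResponseFormat(CalendarEvent, "event"),
  })
  .on("content.delta", ({ snapshot }) => {
    // `snapshot` is the JSON accumulated so far; handle it as it grows.
    console.log(snapshot);
  });

const completion = await stream.finalChatCompletion();
console.log(completion.choices[0].message.parsed);
```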
Structured Outputs supports a subset of the JSON Schema language.
The following types are supported for Structured Outputs:
To use Structured Outputs, all properties on all objects must be specified as `required`.
Also, `additionalProperties` must be set to `false`.
In the following example, note how both `location` and `unit` are listed as required properties.
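A sketch of what that schema could look like (the weather lookup is just an illustrative assumption):

```typescript
// JSON Schema supplied via response_format.json_schema
const weatherSchema = {
  name: "weather",
  strict: true,
  schema: {
    type: "object",
    properties: {
      location: {
        type: "string",
        description: "The city and state, e.g. San Francisco, CA",
      },
      unit: { type: "string", enum: ["celsius", "fahrenheit"] },
    },
    required: ["location", "unit"],
    additionalProperties: false,
  },
} as const;
```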
Limitations on the number of properties, enum values, and total string size may vary depending on the model you are using.
When using Structured Outputs, outputs will be produced in the same order as the ordering of keys in the schema.
When using JSON mode, always instruct the model to produce JSON in the system prompt. For example: “You are a helpful assistant. Respond in JSON format.”
JSON mode is a more basic version of the Structured Outputs feature. While JSON mode ensures that model output is valid JSON, Structured Outputs reliably matches the model’s output to the schema you specify. We recommend you use Structured Outputs if it is supported for your use case.
When JSON mode is turned on, the model’s output is ensured to be valid JSON, except for in some edge cases that you should detect and handle appropriately.
To turn on JSON mode with the Chat Completions or Assistants API, you can set the `response_format` to `{ "type": "json_object" }`.
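For example (a sketch reusing the client configured earlier; the model identifier is a placeholder):

```typescript
const completion = await client.chat.completions.create({
  model: "your-model-id", // placeholder: use a model available on Inference.net
  messages: [
    { role: "system", content: "You are a helpful assistant. Respond in JSON format." },
    { role: "user", content: "List three primary colors." },
  ],
  // JSON mode: guarantees valid JSON, but not adherence to any particular schema.
  response_format: { type: "json_object" },
});

console.log(completion.choices[0].message.content);
```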
Important notes: