openai/gpt-5-structured

GPT-5 with support for structured outputs, web search and custom tools

Replicate GPT-5 Structured API: Complete Guide to JSON Schemas

This guide is based on testing against the live API. It explains how to use json_schema and simple_schema with the Replicate GPT-5 Structured model, covers what works and what doesn’t, and includes copy-paste curl examples.

Prerequisites

export REPLICATE_API_TOKEN="your_token_here"

Quick Decision Guide

  • Use json_schema for anything real: proper data types (numbers, booleans, arrays), nesting, validation, constraints
  • Use simple_schema only for quick, flat, string-only prototypes

Critical Wrapper Format (Required)

You must wrap your schema like this or the API will reject it:

{
  "input": {
    "model": "gpt-5-nano",
    "json_schema": {
      "format": {
        "type": "json_schema",
        "name": "your_schema_name",
        "schema": {
          // Your actual JSON Schema goes here
        }
      }
    }
  }
}
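
The wrapper can also be assembled in code so that the nesting is never forgotten. This is a minimal Python sketch, not part of the model's API: the helper name `build_input` is hypothetical, and only the key layout (`input`, `model`, `prompt`, `json_schema`, `format`, `type`, `name`, `schema`) is taken from this guide.

```python
import json


def build_input(name, schema, prompt, model="gpt-5-nano"):
    """Wrap a bare JSON Schema in the envelope this guide describes.

    Hypothetical helper: only the key layout comes from the guide.
    """
    return {
        "input": {
            "model": model,
            "prompt": prompt,
            "json_schema": {
                "format": {
                    "type": "json_schema",
                    "name": name,
                    "schema": schema,
                }
            },
        }
    }


demo_schema = {
    "type": "object",
    "properties": {"title": {"type": "string"}},
    "required": ["title"],
    "additionalProperties": False,
}

# Serialized body, suitable for curl's -d option.
body = json.dumps(build_input("demo", demo_schema, "Create a title"), indent=2)
```

Printing `body` gives a request body with the same shape as the template above, with the schema in place of the placeholder comment.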

Three Non-Negotiable Rules (Tested)

  1. Every object must include: "additionalProperties": false
  2. Every object must include: a required array listing ALL of its properties
  3. No optional fields: because every property must appear in required, represent a missing value with an empty string or a nullable type such as ["string", "null"]
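
These rules are mechanical, so they can be applied to an existing schema automatically. The sketch below is one possible approach, with hypothetical helper names; it assumes the schema only nests through `properties` and `items`.

```python
import copy


def make_strict(schema):
    """Return a copy of `schema` in which every object follows the rules above.

    Hypothetical helper: sets additionalProperties to False and lists every
    property in `required`, recursing through `properties` and `items`.
    """
    strict = copy.deepcopy(schema)
    _apply(strict)
    return strict


def _apply(node):
    if not isinstance(node, dict):
        return
    if "properties" in node or node.get("type") == "object":
        props = node.setdefault("properties", {})
        node["additionalProperties"] = False
        node["required"] = list(props)  # overwrite: ALL properties are required
        for child in props.values():
            _apply(child)
    if isinstance(node.get("items"), dict):
        _apply(node["items"])
```

Note that this makes previously optional fields required without changing their types. If a field can legitimately be absent, its type still has to be widened by hand, for example to ["string", "null"].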

What Works (Confirmed by Testing)

Proper data types: number, integer, boolean, string, array, object
Nested objects: Single or deep nesting (tested up to 8 levels)
Arrays of objects: Lists of structured items with their own validation
String validation: pattern, minLength, maxLength
Numeric constraints: minimum, maximum
Array constraints: minItems, maxItems
Null values: Union types like ["string", "null"]
Enums: Restrict values to specific options

What Doesn’t Work (Confirmed)

uniqueItems: Not permitted (array uniqueness)
oneOf/anyOf/allOf/$ref: Not supported
Optional fields: All properties must be in required
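
A schema can be scanned for the rejected keywords before it is sent. This is a rough Python sketch with a hypothetical function name; the keyword set is copied from the list above, and the walk assumes nesting happens only through `properties` and `items`.

```python
UNSUPPORTED = {"uniqueItems", "oneOf", "anyOf", "allOf", "$ref"}


def find_unsupported(schema, path="$"):
    """Return (path, keyword) pairs for keywords this guide lists as rejected."""
    hits = []
    if not isinstance(schema, dict):
        return hits
    for key, value in schema.items():
        if key in UNSUPPORTED:
            hits.append((path, key))
        elif key == "properties" and isinstance(value, dict):
            # Property NAMES are free-form, so only their sub-schemas are checked.
            for name, sub in value.items():
                hits.extend(find_unsupported(sub, f"{path}.properties.{name}"))
        elif key == "items":
            hits.extend(find_unsupported(value, f"{path}.items"))
    return hits
```

An empty result means none of the listed keywords were found. It does not guarantee that the API will accept the schema.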

Important Behaviors (Observed)

⚠️ Schema descriptions override prompts: The model follows schema descriptions even if the prompt contradicts them
⚠️ Impossible constraints: Model outputs something anyway (graceful degradation)


1. Flat Object with Proper Types

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d '{
    "input": {
      "model": "gpt-5-nano",
      "prompt": "Create a product review",
      "json_schema": {
        "format": {
          "type": "json_schema",
          "name": "product_review",
          "schema": {
            "type": "object",
            "properties": {
              "title": {"type": "string"},
              "rating": {"type": "number"},
              "is_recommended": {"type": "boolean"},
              "tags": {"type": "array", "items": {"type": "string"}}
            },
            "required": ["title", "rating", "is_recommended", "tags"],
            "additionalProperties": false
          }
        }
      }
    }
  }' \
  https://api.replicate.com/v1/models/openai/gpt-5-structured/predictions

Example output (generated values will vary between runs):

{
  "title": "Great wireless headphones",
  "rating": 4.5,
  "is_recommended": true,
  "tags": ["audio", "wireless", "comfort"]
}
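
The same request can be issued from Python using only the standard library. The sketch below mirrors the URL and headers of the curl command; the function names are hypothetical, and the layout of the response body is not covered by this guide, so inspect it before relying on any particular field.

```python
import json
import urllib.request

URL = "https://api.replicate.com/v1/models/openai/gpt-5-structured/predictions"


def build_request(payload, token):
    """Build (but do not send) the POST request that the curl command issues."""
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": "wait",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A typical call would be `send(build_request(payload, os.environ["REPLICATE_API_TOKEN"]))`, where `payload` is the same JSON object passed to curl with -d.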

2. Nested Objects (One Level)

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d '{
    "input": {
      "model": "gpt-5-nano",
      "prompt": "Find a news headline with publication details",
      "enable_web_search": true,
      "json_schema": {
        "format": {
          "type": "json_schema",
          "name": "news_article",
          "schema": {
            "type": "object",
            "properties": {
              "article": {
                "type": "object",
                "properties": {
                  "title": {"type": "string"},
                  "source": {"type": "string"}
                },
                "required": ["title", "source"],
                "additionalProperties": false
              },
              "publication": {
                "type": "object",
                "properties": {
                  "date": {"type": "string"},
                  "author": {"type": "string"}
                },
                "required": ["date", "author"],
                "additionalProperties": false
              }
            },
            "required": ["article", "publication"],
            "additionalProperties": false
          }
        }
      }
    }
  }' \
  https://api.replicate.com/v1/models/openai/gpt-5-structured/predictions

Example output (generated values will vary between runs):

{
  "article": {
    "title": "Tech Company Announces New AI Model",
    "source": "Reuters"
  },
  "publication": {
    "date": "2024-09-21",
    "author": "John Smith"
  }
}

3. Arrays of Objects

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d '{
    "input": {
      "model": "gpt-5-nano",
      "prompt": "Create a list of 3 team members",
      "json_schema": {
        "format": {
          "type": "json_schema",
          "name": "team_members",
          "schema": {
            "type": "object",
            "properties": {
              "team": {
                "type": "array",
                "items": {
                  "type": "object",
                  "properties": {
                    "name": {"type": "string"},
                    "role": {"type": "string"},
                    "skills": {
                      "type": "array",
                      "items": {"type": "string"}
                    }
                  },
                  "required": ["name", "role", "skills"],
                  "additionalProperties": false
                }
              }
            },
            "required": ["team"],
            "additionalProperties": false
          }
        }
      }
    }
  }' \
  https://api.replicate.com/v1/models/openai/gpt-5-structured/predictions

Example output (generated values will vary between runs):

{
  "team": [
    {
      "name": "Alice Carter",
      "role": "Data Analyst",
      "skills": ["Python", "SQL", "Statistics"]
    },
    {
      "name": "Bob Liu",
      "role": "Backend Engineer", 
      "skills": ["Java", "Docker", "AWS"]
    },
    {
      "name": "Carol Zhang",
      "role": "UX Designer",
      "skills": ["Figma", "User Research", "Prototyping"]
    }
  ]
}

4. Deep Nesting (Multiple Levels)

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d '{
    "input": {
      "model": "gpt-5-nano",
      "prompt": "Create detailed user profile",
      "json_schema": {
        "format": {
          "type": "json_schema",
          "name": "user_profile",
          "schema": {
            "type": "object",
            "properties": {
              "user": {
                "type": "object",
                "properties": {
                  "personal": {
                    "type": "object",
                    "properties": {
                      "name": {"type": "string"},
                      "age": {"type": "number"},
                      "address": {
                        "type": "object",
                        "properties": {
                          "street": {"type": "string"},
                          "city": {"type": "string"},
                          "country": {"type": "string"}
                        },
                        "required": ["street", "city", "country"],
                        "additionalProperties": false
                      }
                    },
                    "required": ["name", "age", "address"],
                    "additionalProperties": false
                  },
                  "preferences": {
                    "type": "object",
                    "properties": {
                      "theme": {"type": "string"},
                      "notifications": {"type": "boolean"}
                    },
                    "required": ["theme", "notifications"],
                    "additionalProperties": false
                  }
                },
                "required": ["personal", "preferences"],
                "additionalProperties": false
              }
            },
            "required": ["user"],
            "additionalProperties": false
          }
        }
      }
    }
  }' \
  https://api.replicate.com/v1/models/openai/gpt-5-structured/predictions
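
To see how deep a schema nests compared with the 8 levels reported as tested, the object levels can be counted. This is a small sketch with a hypothetical function name. It counts the root object as level 1, which is an assumption; the guide does not say how its 8 levels were counted.

```python
def object_depth(schema):
    """Count how many object levels are nested inside each other in `schema`."""
    if not isinstance(schema, dict):
        return 0
    children = list(schema.get("properties", {}).values())
    if isinstance(schema.get("items"), dict):
        children.append(schema["items"])
    deepest = max((object_depth(child) for child in children), default=0)
    is_object = schema.get("type") == "object" or "properties" in schema
    return deepest + (1 if is_object else 0)
```

Applied to the user_profile schema above, this counts four levels: the root, user, personal, and address.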

5. String Patterns and Validation

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d '{
    "input": {
      "model": "gpt-5-nano",
      "prompt": "Create user contact info",
      "json_schema": {
        "format": {
          "type": "json_schema",
          "name": "contact_validation",
          "schema": {
            "type": "object",
            "properties": {
              "email": {
                "type": "string",
                "pattern": "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$",
                "description": "Valid email address"
              },
              "username": {
                "type": "string",
                "minLength": 3,
                "maxLength": 20,
                "pattern": "^[a-zA-Z0-9_]+$"
              },
              "status": {
                "type": "string",
                "enum": ["active", "inactive", "pending"]
              }
            },
            "required": ["email", "username", "status"],
            "additionalProperties": false
          }
        }
      }
    }
  }' \
  https://api.replicate.com/v1/models/openai/gpt-5-structured/predictions
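
Because the model can still return output when a constraint cannot be met, it can be worth re-checking the returned object on the client. This is a minimal Python sketch that applies the same rules as the schema above; the function name is hypothetical. The patterns are written as Python raw strings, so the literal dot in the email pattern is escaped once.

```python
import re

EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
USERNAME = re.compile(r"^[a-zA-Z0-9_]+$")
STATUSES = {"active", "inactive", "pending"}


def contact_problems(contact):
    """Return a list of violations; an empty list means every check passed."""
    problems = []
    if not EMAIL.fullmatch(contact["email"]):
        problems.append("email does not match pattern")
    username = contact["username"]
    if not (3 <= len(username) <= 20 and USERNAME.fullmatch(username)):
        problems.append("username violates length or pattern")
    if contact["status"] not in STATUSES:
        problems.append("status is not an allowed value")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that went wrong in a single response.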

6. Null Handling

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d '{
    "input": {
      "model": "gpt-5-nano",
      "prompt": "Create incomplete user profile where some info is missing",
      "json_schema": {
        "format": {
          "type": "json_schema",
          "name": "nullable_test",
          "schema": {
            "type": "object",
            "properties": {
              "name": {"type": "string"},
              "middle_name": {"type": ["string", "null"]},
              "age": {"type": ["number", "null"]},
              "bio": {
                "type": "string",
                "description": "Bio or empty string if not provided"
              }
            },
            "required": ["name", "middle_name", "age", "bio"],
            "additionalProperties": false
          }
        }
      }
    }
  }' \
  https://api.replicate.com/v1/models/openai/gpt-5-structured/predictions

Example output (generated values will vary between runs):

{
  "name": "John Doe",
  "middle_name": null,
  "age": null,
  "bio": ""
}
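
On the consuming side, null and the empty string both mean "not provided" under this scheme. The sketch below shows one way to collapse them; the function name is hypothetical.

```python
def present_fields(profile):
    """Keep only fields that carry a value, dropping null and "" placeholders."""
    return {
        key: value
        for key, value in profile.items()
        if value is not None and value != ""
    }
```

Values such as 0 and False are kept, since they are real data rather than placeholders.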

7. Array Constraints

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d '{
    "input": {
      "model": "gpt-5-nano",
      "prompt": "Create skills profile with 3-5 skills",
      "json_schema": {
        "format": {
          "type": "json_schema",
          "name": "skills_profile",
          "schema": {
            "type": "object",
            "properties": {
              "skills": {
                "type": "array",
                "items": {"type": "string"},
                "minItems": 3,
                "maxItems": 5
              },
              "scores": {
                "type": "array",
                "items": {
                  "type": "number",
                  "minimum": 1,
                  "maximum": 10
                },
                "minItems": 3,
                "maxItems": 5
              }
            },
            "required": ["skills", "scores"],
            "additionalProperties": false
          }
        }
      }
    }
  }' \
  https://api.replicate.com/v1/models/openai/gpt-5-structured/predictions
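
Array length and item ranges can also be re-checked after the response arrives. This is a short sketch with a hypothetical function name. Its arguments mirror minItems, maxItems, minimum, and maximum from the schema above.

```python
def array_problems(values, min_items, max_items, minimum=None, maximum=None):
    """Return a list of violations of the length and per-item range limits."""
    problems = []
    if not min_items <= len(values) <= max_items:
        problems.append(f"expected {min_items}-{max_items} items, got {len(values)}")
    for index, value in enumerate(values):
        if minimum is not None and value < minimum:
            problems.append(f"item {index} is below {minimum}")
        if maximum is not None and value > maximum:
            problems.append(f"item {index} is above {maximum}")
    return problems
```

For the schema above, `array_problems(result["skills"], 3, 5)` and `array_problems(result["scores"], 3, 5, minimum=1, maximum=10)` would cover both fields, assuming `result` holds the parsed output object.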

Simple Schema: For Quick Prototypes Only

How Simple Schema Works (Tested)

  • Syntax: Array of strings describing fields
  • Output: Everything returns as strings (even numbers/booleans/arrays)
  • Limitation: No real nesting (creates flat keys)

Simple Schema Example

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d '{
    "input": {
      "model": "gpt-5-nano",
      "prompt": "Create test data",
      "simple_schema": [
        "name: string",
        "age: number",
        "active: boolean",
        "tags: list:string"
      ]
    }
  }' \
  https://api.replicate.com/v1/models/openai/gpt-5-structured/predictions

Output (all strings):

{
  "name": "John Doe",
  "age": "25",
  "active": "true",
  "tags": "programming, design, photography"
}
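
Since every value comes back as a string, typed values have to be recovered on the client. The sketch below uses the same field entries that were sent in simple_schema to do this. All function names are hypothetical, and it assumes lists arrive as comma-separated text with no commas inside individual items.

```python
def parse_spec(spec):
    """Split an entry such as "tags: list:number" into (name, type)."""
    name, _, kind = spec.partition(":")
    return name.strip(), (kind.strip() or "string")


def coerce(value, kind):
    """Convert one string value to the Python type implied by `kind`."""
    if kind == "number":
        return float(value)
    if kind == "boolean":
        return value.strip().lower() == "true"
    if kind.startswith("list"):
        _, _, item_kind = kind.partition(":")
        items = [item.strip() for item in value.split(",") if item.strip()]
        return [coerce(item, item_kind or "string") for item in items]
    return value


def coerce_output(output, specs):
    """Apply `coerce` to every field of `output` using the simple_schema entries."""
    kinds = dict(parse_spec(spec) for spec in specs)
    return {key: coerce(value, kinds.get(key, "string")) for key, value in output.items()}
```

Numbers are converted with float for simplicity. If integers matter downstream, convert them explicitly after this step.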

Simple Schema Field Types

# Field type examples
"title"                    # Defaults to string
"title: string"            # Explicit string
"price: number"            # Returns as "12.34" string
"is_active: boolean"       # Returns as "true" string
"tags: list"               # Returns as comma-separated string
"tags: list:string"        # Same as above
"scores: list:number"      # Returns as comma-separated string

Common Errors and Fixes

Error: Unknown parameter: 'text.properties'

Fix: You forgot the format wrapper around your schema

Error: 'additionalProperties' is required to be supplied and to be false

Fix: Add "additionalProperties": false to every object

Error: 'required' is required and must include every property

Fix: Add ALL property names to the required array

Error: 'oneOf'/'anyOf'/'allOf'/'$ref' not permitted

Fix: Remove these advanced JSON Schema features - they’re not supported

Error: 'uniqueItems' is not permitted

Fix: Remove uniqueItems - handle uniqueness client-side
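
Client-side uniqueness can be a one-liner. This is a minimal sketch with a hypothetical function name that keeps the first occurrence of each item and preserves order.

```python
def dedupe(items):
    """Remove repeated items, keeping first-seen order (items must be hashable)."""
    return list(dict.fromkeys(items))
```

This works for strings and numbers. Arrays of objects would need a key function instead, because dictionaries are not hashable.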


Key Insights from Testing

Schema Descriptions Override Prompts

If your schema description says:

"name": {
  "type": "string",
  "description": "Name should be Alice"
}

The model will return “Alice” even if your prompt says “Create a user named Bob”.

Impossible Constraints Are Ignored

The model handles contradictory constraints gracefully:

"age": {
  "type": "number",
  "minimum": 100,
  "maximum": 18  // Impossible constraint
}

The model will still return a number even though no value can satisfy both bounds; the contradictory constraint is ignored rather than reported as an error.
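
Because such contradictions are not reported, it can help to catch them before the request is sent. This is a rough sketch with hypothetical names that checks each lower bound against its matching upper bound. It assumes nesting happens only through `properties` and `items`.

```python
BOUND_PAIRS = [
    ("minimum", "maximum"),
    ("minLength", "maxLength"),
    ("minItems", "maxItems"),
]


def contradictory_bounds(schema, path="$"):
    """Return descriptions of places where a lower bound exceeds its upper bound."""
    found = []
    if not isinstance(schema, dict):
        return found
    for low, high in BOUND_PAIRS:
        if low in schema and high in schema and schema[low] > schema[high]:
            found.append(f"{path}: {low} {schema[low]} > {high} {schema[high]}")
    for name, child in schema.get("properties", {}).items():
        found.extend(contradictory_bounds(child, f"{path}.{name}"))
    if isinstance(schema.get("items"), dict):
        found.extend(contradictory_bounds(schema["items"], f"{path}[]"))
    return found
```

An empty list means no inverted bounds were found. Other kinds of unsatisfiable schemas, such as a pattern that conflicts with maxLength, are not detected.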


Final Recommendation

Use json_schema for production applications. It provides:

  • Proper data types (numbers, booleans, arrays)
  • Deep nesting (tested to 8 levels)
  • Robust validation and constraints
  • Predictable, type-safe output

Use simple_schema only for:

  • Quick prototyping
  • Flat data structures
  • Cases where all values can be strings

The comprehensive testing shows that json_schema is significantly more powerful and reliable for any real-world use case.