Slimmer API responses for model metadata
To improve performance when using tools like Replicate’s MCP server, we’ve updated our public API to return smaller response objects for model metadata.
This change removes about ⚡️ 5KB ⚡️ from every serialized model object.
For API operations that return multiple models, like models.search and collections.get, this change shaves over 1MB off the response size, dramatically improving response times for LLMs that consume this data.
What’s changed?
Every Replicate model has its own OpenAPI schema that defines all of its inputs and outputs. This metadata is incredibly useful, as it tells you exactly what you can do with the model, and it’s documented in a machine-readable and industry-standard JSON Schema format.
Model input and output schemas are great, but these OpenAPI schemas also include some metadata that is not useful or relevant. Specifically, the openapi_schema.paths key contained metadata that is only useful inside Cog’s internal, generated FastAPI client. To reduce the size of the responses, we’ve replaced the contents of this key in the version object with an empty object. The resulting OpenAPI schema is still valid, but much smaller.
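To make the effect concrete, here is a minimal Python sketch of the same idea applied to a single version object. The sample data, the `slim_version` and `serialized_size` helpers, and the exact nesting under `components.schemas` are illustrative assumptions, not Replicate's server-side implementation; real schemas are far larger than this trimmed example.

```python
import json

# Hypothetical, heavily trimmed version object (assumed shape for illustration).
version = {
    "id": "example-version-id",
    "openapi_schema": {
        "openapi": "3.0.2",
        "info": {"title": "Cog", "version": "0.1.0"},
        "paths": {
            "/predictions": {
                "post": {
                    "summary": "Predict",
                    "operationId": "predict_predictions_post",
                    "responses": {"200": {"description": "Successful Response"}},
                }
            }
        },
        "components": {
            "schemas": {
                "Input": {
                    "type": "object",
                    "properties": {
                        "prompt": {"type": "string", "title": "Prompt"}
                    },
                },
                "Output": {"type": "string", "title": "Output"},
            }
        },
    },
}


def slim_version(version: dict) -> dict:
    """Return a copy of the version with openapi_schema.paths emptied."""
    schema = dict(version.get("openapi_schema") or {})
    schema["paths"] = {}  # keep the key, drop its contents
    return {**version, "openapi_schema": schema}


def serialized_size(obj: dict) -> int:
    """Size in bytes of the compact JSON encoding of obj."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


slim = slim_version(version)
saved = serialized_size(version) - serialized_size(slim)
print(f"bytes saved: {saved}")

# The input schema is untouched, so it can still be read as before.
inputs = slim["openapi_schema"]["components"]["schemas"]["Input"]["properties"]
print(sorted(inputs))
```

The input and output definitions live outside `paths` in this sketch, so code that only reads them should not notice the change.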
Which API operations are affected?
This change affects all API operations that return model version metadata, including:
models.get - Get a model
models.list - List public models
models.search - Search public models
models.versions.get - Get a model version
models.versions.list - List model versions
collections.get - Get a collection of models
collections.list - List collections of models
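If your client code reads model metadata from any of these operations, a defensive pattern is to look only at the parts of the schema you need and never rely on openapi_schema.paths being populated. The sketch below is one way to do that in Python. The response shape (a `models` list whose items carry a `latest_version` with an `openapi_schema`) and the `input_names` helper are assumptions made for illustration, and the sample data is invented.

```python
def input_names(model: dict) -> list:
    """List a model's input names without touching openapi_schema.paths.

    Tolerates models with no latest version or no schema.
    """
    version = model.get("latest_version") or {}
    schema = version.get("openapi_schema") or {}
    input_schema = schema.get("components", {}).get("schemas", {}).get("Input", {})
    return sorted(input_schema.get("properties", {}))


# Hypothetical collection-style response (assumed shape, invented values).
collection = {
    "slug": "example-collection",
    "models": [
        {
            "owner": "acme",
            "name": "hello",
            "latest_version": {
                "openapi_schema": {
                    "paths": {},  # emptied, as described above
                    "components": {
                        "schemas": {
                            "Input": {
                                "type": "object",
                                "properties": {"text": {"type": "string"}},
                            }
                        }
                    },
                }
            },
        },
        # A model that has no published version yet.
        {"owner": "acme", "name": "unreleased", "latest_version": None},
    ],
}

summary = {
    f'{m["owner"]}/{m["name"]}': input_names(m) for m in collection["models"]
}
print(summary)
```

Because the helper ignores `paths` entirely, it behaves the same whether that key is populated, empty, or missing.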