Deployments API – Replicate changelog

Deployments give you more control over how your models run. You can scale them up and down based on demand, customize their hardware, and monitor performance and predictions without editing your code.

Managing deployments was previously only possible on the web, but you can now also create, read, and update deployments using Replicate’s HTTP API.

Here’s an example API request that updates the minimum and maximum number of instances for an existing deployment:

curl -s \
  -X PATCH \
  -H "Authorization: Token $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"min_instances": 3, "max_instances": 10}' \
  https://api.replicate.com/v1/deployments/acme/my-app-image-generator
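If you update scaling from a script, it can help to wrap the request above in a small function that checks the values before sending them. The following is a minimal sketch, not part of Replicate's tooling: the function names `build_scaling_payload` and `update_deployment_scaling` are hypothetical, and the validation rules (non-negative integers, min no greater than max) are assumptions about what a sensible request looks like rather than documented API constraints.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; neither function ships with Replicate.

# Print the JSON body for a scaling update.
# Assumed rules: both counts are non-negative integers and min <= max.
build_scaling_payload() {
  local min="$1" max="$2" n
  for n in "$min" "$max"; do
    case "$n" in
      ''|*[!0-9]*)
        echo "instance counts must be non-negative integers" >&2
        return 1
        ;;
    esac
  done
  if [ "$min" -gt "$max" ]; then
    echo "min_instances must not exceed max_instances" >&2
    return 1
  fi
  printf '{"min_instances": %s, "max_instances": %s}' "$min" "$max"
}

# Send the same PATCH request as the example above.
# Arguments: owner, deployment name, min instances, max instances.
# Requires REPLICATE_API_TOKEN to be set in the environment.
update_deployment_scaling() {
  local owner="$1" name="$2" body
  body="$(build_scaling_payload "$3" "$4")" || return 1
  curl -s \
    -X PATCH \
    -H "Authorization: Token $REPLICATE_API_TOKEN" \
    -H "Content-Type: application/json" \
    -d "$body" \
    "https://api.replicate.com/v1/deployments/$owner/$name"
}

# Example call (commented out so that sourcing this file sends nothing):
# update_deployment_scaling acme my-app-image-generator 3 10
```

Keeping payload construction separate from the network call means the validation can be checked without a token or a live deployment.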

Check out the API docs for deployments:

Create a deployment
List deployments
Get a deployment
Update a deployment

If you’re new to deployments, check out the getting started guide.