Deployments API


Deployments give you more control over how your models run. You can scale them up and down based on demand, customize their hardware, and monitor performance and predictions without editing your code.

Managing deployments was previously only possible on the web, but you can now also create, read, and update deployments using Replicate’s HTTP API.

Here’s an example API request that updates the number of min and max instances for an existing deployment:

curl -s \
  -X PATCH \
  -H "Authorization: Token $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"min_instances": 3, "max_instances": 10}' \

Check out the API docs for deployments:

If you’re new to deployments, check out the getting started guide.