Table of contents
Replicate makes it easy to run machine learning models. You can run the best open-source models with just one line of code, or deploy your own custom models. But sometimes you need more control. That’s where deployments come in.
Deployments give you more control over how your models run. With deployments you can:
Deployments work with both open-source models and your own custom models.
Deployments auto-scale according to demand. If you send a lot of traffic, they scale up to handle it, and when things are quiet they scale back down, so you only pay for what you need. You can also limit the maximum number of instances the deployment can use to limit your maximum spend, or set a minimum to keep some instances warm and ready for predictions.