We're cutting our prices in half

Posted by @bfirsh

Here’s what’s changing:

  • We’re cutting the per-second price of public models in half. This is going to be applied to all your usage this month, on all public models, from SDXL to Llama 2, and requires no action on your part. Wahey! 🎉
  • Soon, we’ll be cutting the per-second price of private models in half, but we’ll also start charging for setup and idle time. By default, this change only applies to new users. For existing users it’s opt-in, so you’re not going to pay more.

Here are the prices:

| Hardware | Before | After |
| --- | --- | --- |
| CPU | $0.000200 per second | $0.000100 per second ($0.36 per hour) |
| Nvidia T4 | $0.000550 per second | $0.000225 per second ($0.81 per hour) |
| Nvidia A40 | $0.001300 per second | $0.000575 per second ($2.07 per hour) |
| Nvidia A100 (40GB) | $0.002300 per second | $0.001150 per second ($4.14 per hour) |
| Nvidia A100 (80GB) | $0.003200 per second | $0.001400 per second ($5.04 per hour) |

What’s happening to private models?

When you run a model, it runs on a GPU instance. It takes a bit of time to start up the model, then your prediction runs, then we keep the instance idle for a bit of time after the prediction finishes so that subsequent requests are fast.

Currently, we charge you only for the time the model spends running a prediction. Soon, we’re going to start charging private models for setup time and idle time, at half the per-second price. This will only apply to new users, or to existing users who opt in.

If you’re running a large volume of requests on private models, this will be significantly cheaper, because you’ll be making efficient use of the underlying instance. If you’re running a small number of requests, then this will be more expensive.

Private models still scale to zero when you aren’t using them, but we’ll bill for that bit of extra compute time before they scale to zero. We’re also going to let you control how long that idle time is.
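
To make that concrete, here’s a rough back-of-the-envelope sketch in Python. The setup, idle, and prediction durations are made-up numbers for illustration, not how our billing is actually implemented; it just shows why heavy usage gets cheaper and light usage gets more expensive under the new model, using the A100 (40GB) rates from the table above.

```python
# Illustrative cost comparison for a private model on an A100 (40GB).
# The durations below are assumptions for the sake of the example,
# not real billing parameters.

OLD_RATE = 0.002300   # $/second, old billing: prediction time only
NEW_RATE = 0.001150   # $/second, new billing: setup + prediction + idle time

SETUP_SECONDS = 60        # assumed one-off boot time for a cold start
IDLE_SECONDS = 60         # assumed idle window before scaling to zero
PREDICTION_SECONDS = 10   # assumed duration of a single prediction


def old_cost(num_predictions: int) -> float:
    """Old billing: only prediction time is charged, at the old rate."""
    return num_predictions * PREDICTION_SECONDS * OLD_RATE


def new_cost(num_predictions: int) -> float:
    """New billing (sketch): setup, prediction, and idle time are all
    charged at the new, halved rate. Assumes one cold start and one
    idle window for a batch of back-to-back requests."""
    billed_seconds = SETUP_SECONDS + num_predictions * PREDICTION_SECONDS + IDLE_SECONDS
    return billed_seconds * NEW_RATE


for n in (1, 1000):
    print(f"{n:>5} predictions: old ${old_cost(n):.2f} vs new ${new_cost(n):.2f}")
```

With these made-up numbers, a single prediction costs more under the new scheme, because you’re paying for a whole cold start and idle window. A thousand back-to-back predictions cost roughly half as much, because that fixed overhead is amortized across the batch.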

This change only applies to new users. For existing users it will be opt-in, and nothing will change unless you want it to.

Next steps

If you’re just using public models, you can stop reading right now. We’re rolling out the new prices over the course of the month. Enjoy your lower bill. 🍹

If you’re an existing user of private models, you’re not going to pay more. We want this to be unambiguously good news for you. If the new prices will save you money, you can switch over. If not, you can keep your current prices. Stay tuned for an email.

If you have any questions, drop us an email: team@replicate.com