Set deadlines for predictions
You can now set a deadline to automatically cancel a prediction if it doesn’t complete within a specified duration. This is useful when you’re building real-time or interactive experiences, like a virtual try-on for an online clothing store, where shoppers have usually moved on if an image takes more than 15 seconds to generate.
How it works
Set a deadline by including a Cancel-After header when creating a prediction. See our docs for details on the header format.
Here’s an example that sets a 1 minute deadline:
curl -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Cancel-After: 1m" \
  -H "Prefer: wait" \
  -H "Content-Type: application/json" \
  -d $'{
    "input": {
      "prompt": "The sun rises slowly between tall buildings. [Ground-level follow shot] Bicycle tires roll over a dew-covered street at dawn. The cyclist passes through dappled light under a bridge as the entire city gradually wakes up."
    }
  }' \
  https://api.replicate.com/v1/models/bytedance/seedance-1-pro/predictions
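For readers calling the API from code rather than the shell, the same request can be sketched in Python with only the standard library. The helper names build_headers and create_prediction are illustrative and not part of any Replicate client; the header values and URL simply mirror the cURL example above, and the accepted Cancel-After formats should be checked against the docs.

```python
import json
import os
import urllib.request

# Endpoint copied from the cURL example above.
API_URL = "https://api.replicate.com/v1/models/bytedance/seedance-1-pro/predictions"


def build_headers(token, cancel_after, wait_seconds=None):
    """Build request headers that include a deadline (hypothetical helper).

    cancel_after is passed through unchanged, e.g. "1m" as in the cURL example.
    wait_seconds=None sends a bare "Prefer: wait"; a number sends "wait=<n>".
    """
    prefer = "wait" if wait_seconds is None else f"wait={wait_seconds}"
    return {
        "Authorization": f"Bearer {token}",
        "Cancel-After": cancel_after,
        "Prefer": prefer,
        "Content-Type": "application/json",
    }


def create_prediction(prompt, cancel_after="1m", wait_seconds=None):
    """POST a prediction with a deadline and return the decoded JSON body."""
    body = json.dumps({"input": {"prompt": prompt}}).encode("utf-8")
    headers = build_headers(
        os.environ["REPLICATE_API_TOKEN"], cancel_after, wait_seconds
    )
    request = urllib.request.Request(
        API_URL, data=body, headers=headers, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling create_prediction("Bicycle tires roll over a dew-covered street at dawn.") would send the request with a 1 minute deadline; the status field of the returned object can then be inspected.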
What happens when a deadline is reached
Replicate sets the prediction’s status to aborted if the deadline is reached before it starts running, and to canceled if the deadline is reached while it’s running.
For public models, you’re only charged for predictions with a canceled status, not for aborted ones.
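The two outcomes above can be handled in client code along these lines. This is a minimal sketch: the function name explain_deadline_status is hypothetical, the billing flag applies to public models only (as stated above), and any other status is treated as "not ended by a deadline".

```python
# What each deadline-related status means, per the description above.
DEADLINE_STATUSES = {
    "aborted": {
        "phase": "before the prediction started running",
        "charged_on_public_model": False,
    },
    "canceled": {
        "phase": "while the prediction was running",
        "charged_on_public_model": True,
    },
}


def explain_deadline_status(status):
    """Return a short description for a deadline outcome, or None otherwise.

    Hypothetical helper: it only looks at the status string and does not
    call the API.
    """
    info = DEADLINE_STATUSES.get(status)
    if info is None:
        return None  # some other status; the deadline did not end this prediction
    billing = "charged" if info["charged_on_public_model"] else "not charged"
    return f"deadline reached {info['phase']} ({billing} on public models)"
```

A caller might log this message and show the shopper a fallback image instead of waiting any longer.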
Deadline vs sync mode wait duration
Prediction deadlines and sync mode serve different purposes. Use a prediction deadline (the Cancel-After header) to control when the prediction itself should be canceled. Use sync mode (the Prefer: wait header) to control how long the HTTP request stays open waiting for results.
You can also use both together. In the previous cURL example, Prefer: wait defaults to 1 minute and we’ve explicitly set Cancel-After to 1 minute. This means that the HTTP request will stay open for up to 1 minute to wait for results, after which the prediction will be canceled if it has not completed.
Alternatively, setting Cancel-After: 1m and Prefer: wait=10 means that the request returns after at most 10 seconds. If the prediction is still running at that point, you’ll get an incomplete prediction object, and the prediction will continue to run until it completes or is canceled at the 1-minute deadline.
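To make the interaction between the two timers concrete, here is a small simulation of the behavior described above. It is a sketch under stated assumptions, not Replicate's implementation: it assumes both clocks start when the prediction is created, that the prediction would otherwise succeed, and that exact ties favor completion. The function name simulate and its parameters are hypothetical.

```python
def simulate(cancel_after, wait, queue_time, run_time):
    """Model one prediction with a deadline and a sync-mode wait (all seconds).

    cancel_after: deadline from the Cancel-After header
    wait:         how long the HTTP request stays open (Prefer: wait=<n>)
    queue_time:   time spent waiting before the prediction starts running
    run_time:     time the model needs once it is running
    """
    start = queue_time
    finish = queue_time + run_time

    # Decide how the prediction itself ends.
    if cancel_after <= start:
        end, final_status = cancel_after, "aborted"    # never started running
    elif cancel_after < finish:
        end, final_status = cancel_after, "canceled"   # stopped mid-run
    else:
        end, final_status = finish, "succeeded"        # assumed to succeed

    # Decide what the HTTP response contains.
    if end <= wait:
        return {"response_at": end, "response_complete": True,
                "final_status": final_status}
    return {"response_at": wait, "response_complete": False,
            "final_status": final_status}
```

For example, simulate(cancel_after=60, wait=10, queue_time=0, run_time=90) reports a response after 10 seconds with an incomplete prediction that is later canceled at the deadline, while simulate(cancel_after=60, wait=60, queue_time=0, run_time=90) reports a response at 60 seconds carrying the canceled prediction.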
Read more in the docs: