Webhooks
Learn how to receive and manage webhooks from Replicate.
🍿 Watch: Learn more about how to use webhooks → YouTube (14 minutes).
What are webhooks?
Webhooks provide real-time updates about your prediction. Specify an endpoint when you create a prediction, and Replicate will send HTTP POST requests to that URL when the prediction is created, updated, and finished.
Here are some example scenarios where webhooks are useful:
- Persisting prediction data and files. Input and output (including any files) are automatically deleted after an hour for any predictions created through the API. Webhooks give you a way to receive all the metadata for a completed prediction, so you can store it in a database or save the output files to persistent storage before they’re gone.
- Sending notifications when long-running predictions finish. Some predictions like training jobs can take several minutes to run. You can use a webhook handler to send a notification like an email or a Slack message when a prediction completes.
- Creating model pipelines. You can use webhooks to capture the output of one long-running prediction and pipe it into another model as input.
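As a rough illustration of the pipeline scenario, the sketch below turns a finished prediction into the request for a second model. The helper name buildNextStageRequest, the placeholder version string, and the `image` input name are all assumptions made for this example; adapt them to the models you actually chain.

```javascript
// Hypothetical helper: given the prediction object from a webhook, return the
// options for the next prediction in the pipeline, or null if there is nothing to chain yet.
function buildNextStageRequest(prediction) {
  // Only chain when the first prediction succeeded and produced output.
  if (prediction.status !== "succeeded" || prediction.output == null) {
    return null;
  }
  return {
    version: "SECOND_MODEL_VERSION", // placeholder, not a real version ID
    input: { image: prediction.output }, // assumes the second model accepts an `image` input
    webhook: "https://example.com/replicate-webhook",
    webhook_events_filter: ["completed"],
  };
}

// Possible use inside a webhook handler:
//   const request = buildNextStageRequest(req.body);
//   if (request) await replicate.predictions.create(request);
```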
Note: Webhooks are handy, but they’re not strictly necessary to use Replicate, and there are other ways to receive updates. You can also poll the predictions API or use server-sent events (SSEs) to check the status of a prediction over time.
Setting webhooks
To use webhooks, specify a webhook URL in the request body when creating a prediction.
Here’s an example using the replicate JavaScript client:
await replicate.predictions.create({
  version: "d55b9f2d...",
  input: { prompt: "call me later maybe" },
  webhook: "https://example.com/replicate-webhook",
  webhook_events_filter: ["completed"], // optional
});
In addition to predictions, you can also receive webhooks when fine-tuning models with the training API:
await replicate.trainings.create({
  version: "d55b9f2d...",
  destination: "my-username/my-model",
  input: { training_data: "..." },
  webhook: "https://example.com/replicate-webhook",
});
Receiving webhooks
Replicate will send an HTTP POST request to the URL you specified whenever the prediction is created, has new logs or new output, or is completed.
The request body is a prediction object in JSON format. This object has the same structure as the object returned by the get a prediction API. Here’s an example of an unfinished prediction:
{
  "id": "ufawqhfynnddngldkgtslldrkq",
  "version": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa",
  "created_at": "2022-04-26T22:13:06.224088Z",
  "started_at": null,
  "completed_at": null,
  "status": "starting",
  "input": {
    "text": "Alice"
  },
  "output": null,
  "error": null,
  "logs": null,
  "metrics": {}
}
The prediction’s status property will have one of the following values:
- starting: the prediction is starting up. If this status lasts longer than a few seconds, then it’s typically because a new worker is being started to run the prediction.
- processing: the model is currently running.
- succeeded: the prediction completed successfully.
- failed: the prediction encountered an error during processing.
- canceled: the prediction was canceled by the user.
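A handler often needs to tell intermediate updates apart from the final one. The following is a minimal sketch of that check, treating succeeded, failed, and canceled as the final statuses; the names TERMINAL_STATUSES and isFinished are made up for this example and are not part of the replicate client.

```javascript
// Statuses after which a prediction will not change again.
const TERMINAL_STATUSES = new Set(["succeeded", "failed", "canceled"]);

// Hypothetical helper: true when the prediction has reached a final state.
function isFinished(prediction) {
  return TERMINAL_STATUSES.has(prediction.status);
}

// Example: skip intermediate webhooks and act only on the final one.
function describe(prediction) {
  return isFinished(prediction)
    ? `prediction ${prediction.id} finished with status ${prediction.status}`
    : `prediction ${prediction.id} is still ${prediction.status}`;
}
```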
Here’s an example of a Next.js webhook handler:
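// pages/api/replicate-webhook.js
// saveToMyDatabase and sendSlackNotification are placeholders for your own code.
export default async function handler(req, res) {
  const prediction = req.body; // the prediction object sent by Replicate
  console.log("🪝 incoming webhook!", prediction.id);
  await saveToMyDatabase(prediction);
  await sendSlackNotification(prediction);
  res.status(200).end(); // any 2xx response acknowledges the webhook
}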
By default, Replicate sends requests to your webhook URL whenever there are new logs, new outputs, or the prediction has finished. You can change which events trigger a webhook using the webhook_events_filter property.
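Here is a small sketch of building the webhook options with an event filter. The event names listed are assumed from Replicate's API reference (start, output, logs, completed), so check them there before relying on them. The helper webhookOptions is hypothetical and not part of the replicate client.

```javascript
// Event names assumed from Replicate's API reference; verify before relying on them.
const KNOWN_WEBHOOK_EVENTS = ["start", "output", "logs", "completed"];

// Hypothetical helper: returns the webhook fields for a create call and
// rejects misspelled event names early.
function webhookOptions(url, events) {
  const unknown = events.filter((event) => !KNOWN_WEBHOOK_EVENTS.includes(event));
  if (unknown.length > 0) {
    throw new Error(`Unknown webhook events: ${unknown.join(", ")}`);
  }
  return { webhook: url, webhook_events_filter: events };
}

// Example: only hear about the start and the end of a prediction.
const options = webhookOptions("https://example.com/replicate-webhook", ["start", "completed"]);
// Possible use:
//   await replicate.predictions.create({ version, input, ...options });
```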
Your endpoint should respond with a 2xx status code within a few seconds; otherwise, the webhook might be retried.
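One way to stay within a few seconds is to acknowledge the request first and do the slow work afterwards. The sketch below assumes a Node-style response object with statusCode and end(); createWebhookHandler and processPrediction are hypothetical names. On serverless platforms, work started after the response may not be allowed to finish, so a queue may be a better fit there.

```javascript
// Hypothetical factory: `processPrediction` stands in for your own slow work
// (database writes, file downloads, notifications).
function createWebhookHandler(processPrediction) {
  return function handler(req, res) {
    const prediction = req.body;

    // Acknowledge immediately so the response is not delayed by slow work.
    res.statusCode = 200;
    res.end();

    // Run the slow work after responding. Failures can only be logged here,
    // because the response has already been sent.
    Promise.resolve()
      .then(() => processPrediction(prediction))
      .catch((err) => console.error("webhook processing failed", prediction.id, err));
  };
}
```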
Retries
When Replicate sends the final webhook for a prediction (where the status is succeeded, failed, or canceled), we check the response status we get. If we can’t make the request at all, or if we get a 4xx or 5xx response, we’ll retry the webhook. We retry several times with exponential backoff. The final retry is sent about 1 minute after the prediction completed.
We do not retry any webhooks for intermediate states.
Testing your webhook code
When writing the code for your new webhook handler, it’s useful to be able to receive real webhooks in your development environment so you can verify your code is handling them as expected.
ngrok is a free reverse proxy tool that can create a secure tunnel to your local machine so you can receive webhooks. If you have Node.js installed, run ngrok directly from the command line using the npx command that’s included with Node.js.
npx ngrok http 3000
The command above will generate output that looks like this:
Session Status    online
Session Expires   1 hour, 59 minutes
Version           2.3.41
Region            United States (us)
Web Interface     http://127.0.0.1:4040
Forwarding        http://3e48-20-171-41-18.ngrok.io -> http://localhost:3000
Forwarding        https://3e48-20-171-41-18.ngrok.io -> http://localhost:3000
The HTTPS URL in the output (https://3e48-20-171-41-18.ngrok.io in the example above) is a temporary URL pointing to your local machine. Copy that URL and use it as the base of your webhook URL.
Here’s an example using the replicate JavaScript client:
await replicate.predictions.create({
  version: "d55b9f2d...",
  input: { prompt: "call me later maybe" },
  webhook: "https://3e48-20-171-41-18.ngrok.io/replicate-webhook",
});
Your webhook handler should now receive webhooks from Replicate. Once you’ve deployed your app, change the value of the webhook URL to your production webhook handler endpoint when creating predictions.
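To avoid editing code when switching between the ngrok tunnel and production, one option is to keep only the base URL in configuration. This is a sketch under that assumption: the environment variable name WEBHOOK_BASE_URL and the helper webhookUrl are invented for this example.

```javascript
// Hypothetical helper: joins a base URL (ngrok tunnel or production host)
// with the path of the webhook handler.
function webhookUrl(baseUrl, path = "/replicate-webhook") {
  return new URL(path, baseUrl).toString();
}

// In development, WEBHOOK_BASE_URL could hold the temporary ngrok HTTPS URL;
// in production, your deployed app's URL. The variable name is an assumption.
const baseUrl = process.env.WEBHOOK_BASE_URL || "https://example.com";
const webhook = webhookUrl(baseUrl);
// Possible use:
//   await replicate.predictions.create({ version, input, webhook });
```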
For a real-world example of webhook handling in Next.js, check out Scribble Diffusion’s codebase.
Tips
- Add query params to your webhook URL to pass along extra metadata, like an internal ID you’re using for a prediction. For example: https://example.com/replicate-webhook?customId=123
- Make webhook handlers idempotent. Identical webhooks can be sent more than once, so you’ll need to handle potentially duplicate information.
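Both tips can be sketched in a few lines. The helper names addMetadata and firstDelivery are hypothetical, and the in-memory Set is only for illustration: it is lost on restart and not shared between processes, so a real app would more likely rely on a unique key in its database. The duplicate check keys on the prediction ID and status, which suits handlers that act only on final webhooks.

```javascript
// Tip 1: attach your own metadata to the webhook URL as query params.
function addMetadata(webhookUrl, metadata) {
  const url = new URL(webhookUrl);
  for (const [key, value] of Object.entries(metadata)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Tip 2: remember which deliveries were already handled.
const handled = new Set(); // illustration only; use persistent storage in practice

// Returns true the first time a given prediction ID and status pair is seen,
// and false for any repeat delivery of the same pair.
function firstDelivery(prediction) {
  const key = `${prediction.id}:${prediction.status}`;
  if (handled.has(key)) {
    return false;
  }
  handled.add(key);
  return true;
}

// Possible use inside a handler:
//   const customId = new URL(req.url, "https://example.com").searchParams.get("customId");
//   if (firstDelivery(req.body)) { /* save results for customId */ }
```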
Further reading
- See the Node.js client webhook docs.
- See the Python client webhook docs.
- See predictions.create and trainings.create API docs.
- See Scribble Diffusion’s codebase for a reference implementation in JavaScript.
- Read our streaming guide to learn how to consume server-sent events (SSEs) from language models.