Learn how to receive and manage webhooks from Replicate.

🍿 Watch: Learn more about how to use webhooks → YouTube (14 minutes).

What are webhooks?

Webhooks provide real-time updates about your prediction. Specify an endpoint when you create a prediction, and Replicate will send HTTP POST requests to that URL when the prediction is created, updated, and finished.

Here are some example scenarios where webhooks are useful:

  • Persisting prediction data and files. Input and output (including any files) are automatically deleted after an hour for any predictions created through the API. Webhooks give you a way to receive all the metadata for a completed prediction, so you can store it in a database or save the output files to persistent storage before they’re gone.
  • Sending notifications when long-running predictions finish. Some predictions like training jobs can take several minutes to run. You can use a webhook handler to send a notification like an email or a Slack message when a prediction completes.
  • Creating model pipelines. You can use webhooks to capture the output of one long-running prediction and pipe it into another model as input.

Note: Webhooks are handy, but they’re not strictly necessary to use Replicate, and there are other ways to receive updates. You can also poll the predictions API or use server-sent events (SSEs) to check the status of a prediction over time.
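For comparison, the polling alternative could be sketched roughly as follows. This is an illustration, not Replicate's recommended implementation: `pollPrediction`, `getPrediction`, `intervalMs`, and `maxAttempts` are hypothetical names, and the fetch function is injected so the sketch stays independent of any particular client. It treats succeeded, failed, and canceled as final statuses.

```javascript
// Poll until the prediction reaches a final status.
// `getPrediction` is a hypothetical async function (id) => prediction object;
// with the replicate JavaScript client it might be
// (id) => replicate.predictions.get(id).
async function pollPrediction(
  getPrediction,
  id,
  { intervalMs = 1000, maxAttempts = 60 } = {}
) {
  const finalStatuses = new Set(["succeeded", "failed", "canceled"]);

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const prediction = await getPrediction(id);
    if (finalStatuses.has(prediction.status)) {
      return prediction;
    }
    // Wait before asking again so the API is not hammered.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }

  throw new Error(`Prediction ${id} did not finish after ${maxAttempts} attempts`);
}
```

Polling keeps everything in one process but costs a request per check, whereas a webhook only fires when something has changed.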

Setting webhooks

To use webhooks, specify a webhook URL in the request body when creating a prediction.

Here’s an example using the replicate JavaScript client:

await replicate.predictions.create({
  version: "d55b9f2d...",
  input: { prompt: "call me later maybe" },
  webhook: "https://example.com/replicate-webhook",
  webhook_events_filter: ["completed"], // optional
});

In addition to predictions, you can also receive webhooks when fine-tuning models with the training API:

await replicate.trainings.create({
  version: "d55b9f2d...",
  destination: "my-username/my-model",
  input: { training_data: "..." },
  webhook: "https://example.com/replicate-webhook",
});

Receiving webhooks

Replicate will send an HTTP POST request to the URL you specified whenever the prediction is created, has new logs, new output, or is completed.

The request body is a prediction object in JSON format. This object has the same structure as the object returned by the “get a prediction” API endpoint. Here’s an example of an unfinished prediction:

{
  "id": "ufawqhfynnddngldkgtslldrkq",
  "version": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa",
  "created_at": "2022-04-26T22:13:06.224088Z",
  "started_at": null,
  "completed_at": null,
  "status": "starting",
  "input": {
    "text": "Alice"
  },
  "output": null,
  "error": null,
  "logs": null,
  "metrics": {}
}

The prediction’s status property will have one of the following values:

  • starting: the prediction is starting up. If this status lasts longer than a few seconds, then it’s typically because a new worker is being started to run the prediction.
  • processing: the model is currently running.
  • succeeded: the prediction completed successfully.
  • failed: the prediction encountered an error during processing.
  • canceled: the prediction was canceled by the user.
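One way a handler might branch on these values is sketched below. The function name `nextAction` and the action labels it returns are hypothetical and only illustrate the idea. The status strings themselves come from the list above.

```javascript
// Decide what a webhook handler should do for a given prediction status.
// The returned labels ("wait", "save-output", ...) are illustrative only.
function nextAction(prediction) {
  switch (prediction.status) {
    case "starting":
    case "processing":
      // Not finished yet: more webhooks may follow.
      return "wait";
    case "succeeded":
      // Output is available and can be stored.
      return "save-output";
    case "failed":
      // prediction.error describes what went wrong.
      return "report-error";
    case "canceled":
      return "ignore";
    default:
      // Fail loudly on a status this sketch does not know about.
      throw new Error(`Unknown prediction status: ${prediction.status}`);
  }
}
```

Throwing on an unrecognized status is a design choice for this sketch. A production handler might prefer to log and ignore it.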

Here’s an example of a Next.js webhook handler:

// pages/api/replicate-webhook.js
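export default async function handler(req, res) {
  console.log("🪝 incoming webhook!", req.body.id);
  const prediction = req.body;
  // saveToMyDatabase and sendSlackNotification are placeholders for your own code.
  await saveToMyDatabase(prediction);
  await sendSlackNotification(prediction);
  // Respond with a 2xx status so Replicate doesn't retry the webhook.
  res.status(200).end();
}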

By default, Replicate sends requests to your webhook URL whenever there are new logs, new outputs, or the prediction has finished. You can change which events trigger a webhook using the webhook_events_filter property.

Your endpoint should respond with a 2xx status code within a few seconds, otherwise the webhook might be retried.


When Replicate sends the final webhook for a prediction (where the status is succeeded, failed, or canceled), we check the response status we get. If we can’t make the request at all, or if we get a 4xx or 5xx response, we’ll retry the webhook. We retry several times with exponential backoff. The final retry is sent about 1 minute after the prediction completed.

We do not retry any webhooks for intermediate states.
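Because a slow response can trigger a retry, one possible pattern is to acknowledge the webhook first and do the slow work afterwards. The sketch below assumes a Node.js-style `res` object with `statusCode` and `end()`. `makeAckFirstHandler` and `processPrediction` are hypothetical names, not part of any Replicate or Next.js API.

```javascript
// Wrap slow processing so the HTTP response is sent before the work starts.
// `processPrediction` is a hypothetical function supplied by the caller.
function makeAckFirstHandler(processPrediction) {
  return function handler(req, res) {
    const prediction = req.body;

    // Acknowledge immediately with a 2xx status.
    res.statusCode = 200;
    res.end();

    // Run the slow work on a later tick of the event loop.
    setImmediate(() => {
      Promise.resolve()
        .then(() => processPrediction(prediction))
        .catch((err) => {
          console.error("webhook processing failed", prediction.id, err);
        });
    });
  };
}
```

This trades reliability for speed: once the 2xx response is sent, a failure in `processPrediction` will not cause a retry. On serverless platforms, work scheduled after the response may also be cut short, so handing the prediction to a queue is often the safer choice.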

Testing your webhook code

When writing the code for your new webhook handler, it’s useful to be able to receive real webhooks in your development environment so you can verify your code is handling them as expected.

ngrok is a free reverse proxy tool that can create a secure tunnel to your local machine so you can receive webhooks. If you have Node.js installed, run ngrok directly from the command line using the npx command that’s included with Node.js.

npx ngrok http 3000

The command above will generate output that looks like this:

Session Status                online
Session Expires               1 hour, 59 minutes
Version                       2.3.41
Region                        United States (us)
Web Interface       
Forwarding                    http://3e48-20-171-41-18.ngrok.io -> http://localhost:3000
Forwarding                    https://3e48-20-171-41-18.ngrok.io -> http://localhost:3000

The HTTPS URL in the output (https://3e48-20-171-41-18.ngrok.io in the example above) is a temporary URL pointing to your local machine. Copy that URL and use it as the base of your webhook URL.

Here’s an example using the replicate JavaScript client:

await replicate.predictions.create({
  version: "d55b9f2d...",
  input: { prompt: "call me later maybe" },
  webhook: "https://3e48-20-171-41-18.ngrok.io/replicate-webhook",
});

Your webhook handler should now receive webhooks from Replicate. Once you’ve deployed your app, change the value of the webhook URL to your production webhook handler endpoint when creating predictions.

For a real-world example of webhook handling in Next.js, check out Scribble Diffusion’s codebase.


  • Add query params to your webhook URL to pass along extra metadata, like an internal ID you’re using for a prediction. For example https://example.com/replicate-webhook?customId=123
  • Make webhook handlers idempotent. Identical webhooks can be sent more than once, so you’ll need to handle potentially duplicate information.
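Both tips can be sketched in a few lines. The helper names `buildWebhookUrl` and `handleOnce` are hypothetical. The in-memory `Set` is only for illustration, since a real deployment would need a persistent store such as a database with a unique key. The deduplication key assumes the handler only receives completed events, because intermediate webhooks can legitimately share the same id and status.

```javascript
// Tip 1: attach an internal ID to the webhook URL as a query parameter.
function buildWebhookUrl(baseUrl, customId) {
  const url = new URL(baseUrl);
  url.searchParams.set("customId", String(customId));
  return url.toString();
}

// Tip 2: remember which deliveries were already handled.
// In-memory only: this is lost on restart and not shared between processes.
const handledDeliveries = new Set();

// Runs `work(prediction)` the first time a given id and status pair is seen.
// Returns true if the work ran, false if the delivery was a duplicate.
function handleOnce(prediction, work) {
  const key = `${prediction.id}:${prediction.status}`;
  if (handledDeliveries.has(key)) {
    return false;
  }
  handledDeliveries.add(key);
  work(prediction);
  return true;
}
```

Inside the handler, the extra metadata can then be read back from the request URL's query string, for example from `req.query.customId` in a Next.js API route.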

Further reading