jd7h / propainter

Object removal, video completion and video outpainting

  • Public
  • 1.6K runs

Run time and cost

This model costs approximately $0.056 to run on Replicate, or 17 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
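The per-dollar figure follows from the per-run cost by simple division. As a rough sketch for budgeting a batch of runs (the helper names are illustrative, and the $0.056 figure is only the approximate value quoted above, since actual cost varies with inputs):

```python
import math

# Approximate per-run cost quoted above; the real cost depends on the inputs.
COST_PER_RUN_USD = 0.056


def runs_per_dollar(cost_per_run: float) -> int:
    """Whole number of runs that fit into one dollar at the given per-run cost."""
    return math.floor(1.0 / cost_per_run)


def estimated_cost(n_runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Rough total cost in USD for n_runs, rounded to cents."""
    return round(n_runs * cost_per_run, 2)


print(runs_per_dollar(COST_PER_RUN_USD))  # 17
print(estimated_cost(100))                # 5.6
```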

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 58 seconds, but prediction time varies significantly with the inputs.

Readme

ProPainter

ProPainter is a model for:

  • Object removal: removing object(s) from a video
  • Video completion: completing a masked video
  • Video outpainting: expanding the view of a video

The model improves flow-based propagation and spatiotemporal Transformers, two mainstream mechanisms in video inpainting. ProPainter uses dual-domain propagation that combines the advantages of image and feature warping, exploiting global correspondences reliably. It also uses a mask-guided sparse video Transformer, which achieves high efficiency by discarding unnecessary and redundant tokens.
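To illustrate the idea of discarding unnecessary tokens, here is a deliberately simplified sketch, not ProPainter's actual implementation. It assumes a binary mask split into square windows and keeps only the windows that contain at least one masked pixel. The function name `masked_windows` and the window layout are hypothetical.

```python
def masked_windows(mask, window):
    """Return (row, col) indices of windows containing at least one masked pixel.

    mask:   list of equal-length rows holding 0 (known) or 1 (to be inpainted)
    window: side length of the square, non-overlapping windows
    """
    height, width = len(mask), len(mask[0])
    kept = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            # Collect the pixels of this window, clipping at the borders.
            block = [
                mask[r][c]
                for r in range(top, min(top + window, height))
                for c in range(left, min(left + window, width))
            ]
            if any(block):
                kept.append((top // window, left // window))
    return kept


# A 4x4 mask whose bottom-right 2x2 corner needs inpainting.
mask = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kept = masked_windows(mask, 2)
print(kept)  # [(1, 1)]
```

In this toy case only one of the four windows would be passed on for attention, which is the intuition behind the efficiency gain described above.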

Video inpainting typically requires a significant amount of GPU memory. The model has the following options to reduce memory usage:

  • Reduce the number of local neighbors by decreasing neighbor_length (default 10).
  • Reduce the number of global references by increasing ref_stride (default 10).
  • Lower resize_ratio (default 1.0) to downscale the video before processing.
  • Set a smaller video size by specifying width and height.
  • Set fp16 to true to use fp16 (half precision) during inference.
  • Reduce the number of frames per sub-video with subvideo_length (default 80), which effectively decouples GPU memory cost from video length.
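The options above can be collected into one set of inputs. The following is a minimal sketch: the option names come from the list above, but the specific values are illustrative assumptions rather than recommended settings, and the helpers `low_memory_options` and `split_into_subvideos` are hypothetical. The second helper only illustrates why a fixed subvideo_length bounds the number of frames processed at once; the model's real chunking may handle boundaries differently.

```python
def low_memory_options(width=None, height=None):
    """Build a dict of memory-saving options (values are illustrative only)."""
    options = {
        "neighbor_length": 5,   # fewer local neighbors (default 10)
        "ref_stride": 20,       # sparser global references (default 10)
        "resize_ratio": 0.5,    # process at half resolution (default 1.0)
        "fp16": True,           # half precision during inference
        "subvideo_length": 40,  # fewer frames per sub-video (default 80)
    }
    # Optionally request an explicit, smaller output size.
    if width is not None and height is not None:
        options["width"] = width
        options["height"] = height
    return options


def split_into_subvideos(n_frames, subvideo_length=80):
    """Split frame indices into consecutive (start, end) ranges, end exclusive."""
    return [
        (start, min(start + subvideo_length, n_frames))
        for start in range(0, n_frames, subvideo_length)
    ]


print(low_memory_options(width=640, height=360))
print(split_into_subvideos(200, 80))  # [(0, 80), (80, 160), (160, 200)]
```

However long the video is, no range holds more than subvideo_length frames, so peak memory depends on the chunk size rather than on the total frame count. The resulting dict would be merged with the video and mask inputs, whose names are not specified here.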