Cog pipeline for XMem and ProPainter
This is a generative AI pipeline that combines two models:
- XMem, a model for video object segmentation
- ProPainter, a model for video inpainting
This pipeline makes video inpainting easy. XMem turns a source video and an annotated first frame into a video mask. ProPainter then takes the source video and that video mask and inpaints everything under the mask.
How to use it
Here’s how you can use this pipeline to do video inpainting on a source video, for example kitten_short.mp4.
1. Extract the first frame of your video
XMem needs an annotated first video frame to create a video mask for ProPainter.
To make this annotated frame, you can start by extracting the frames from your source video with ffmpeg:
mkdir frames
ffmpeg -i kitten_short.mp4 frames/%04d.jpg
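The command above writes every frame of the video to disk, although only frames/0001.jpg is needed for the mask. As a rough sketch, the same step can be limited to the first frame from Python. The helper names first_frame_command and extract_first_frame are made up for this example, and it assumes ffmpeg is installed and on your PATH.

```python
import shutil
import subprocess
from pathlib import Path


def first_frame_command(video, out_path="frames/0001.jpg"):
    """Build an ffmpeg argv that writes only the first video frame to out_path."""
    return [
        "ffmpeg",
        "-y",               # overwrite an existing output file
        "-i", str(video),   # source video
        "-frames:v", "1",   # stop after one video frame
        str(out_path),
    ]


def extract_first_frame(video, out_path="frames/0001.jpg"):
    """Run the command above (hypothetical helper; needs ffmpeg on PATH)."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # Same effect as `mkdir frames`, but safe to repeat.
    Path(out_path).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(first_frame_command(video, out_path), check=True)
    return Path(out_path)
```

Calling extract_first_frame("kitten_short.mp4") should leave a single image at frames/0001.jpg, the same file name the full extraction gives to the first frame.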
2. Create a mask of the first frame
You can then use an image segmentation model, such as Segment Anything, to turn the first frame, frames/0001.jpg, into a mask. Save the result as first_frame_mask.png.
3. Feed the source video and the mask into the pipeline
We can now feed our video kitten_short.mp4 and our mask first_frame_mask.png into the model pipeline on this page! Just upload your source video under ‘video’, and the first-frame mask under ‘mask’.
XMem will generate a video mask from the inputs. ProPainter will take XMem’s output, and use it for video inpainting.
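If you would rather run the pipeline from a local checkout than through the web form, something like the following might work. This is only a sketch under several assumptions: Cog and Docker are installed, the script runs from the directory that holds the cog files, and the predictor's input names match the ‘video’ and ‘mask’ fields on this page. The helper names cog_predict_command and run_pipeline are made up for this example.

```python
import subprocess
from pathlib import Path


def cog_predict_command(video, mask):
    """Build a `cog predict` argv; the `@` prefix marks a value as a file path."""
    return [
        "cog", "predict",
        "-i", f"video=@{video}",  # assumed input name, taken from the 'video' field
        "-i", f"mask=@{mask}",    # assumed input name, taken from the 'mask' field
    ]


def run_pipeline(video="kitten_short.mp4", mask="first_frame_mask.png"):
    """Check that both inputs exist, then run the prediction locally."""
    for path in (video, mask):
        if not Path(path).is_file():
            raise FileNotFoundError(path)
    subprocess.run(cog_predict_command(video, mask), check=True)
```

Checking for both files first means a missing input is reported right away, before Cog starts building or loading the model container.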
Licenses
The Cog files are released under the MIT license. For the licenses of the underlying models, please see their respective repositories on GitHub.