zsxkib / mimic-motion

MimicMotion: High-quality human motion video generation with pose-guided control

  • Public
  • 2.3K runs
  • A100 (80GB)
  • GitHub
  • Paper
  • License

Input

file (required)

Reference video file containing the motion to be mimicked

appearance_image
file (required)

Reference image file for the appearance of the generated video

integer
(minimum: 64, maximum: 1024)

Height of the output video in pixels. Width is automatically calculated.

Default: 576

integer
(minimum: 2)

Number of frames to generate in each processing chunk

Default: 16

integer
(minimum: 0)

Number of overlapping frames between chunks for smoother transitions

Default: 6

integer
(minimum: 1, maximum: 100)

Number of denoising steps in the diffusion process. More steps can improve quality but increase processing time.

Default: 25

number
(minimum: 0, maximum: 1)

Strength of noise augmentation. Higher values add more variation but may reduce coherence with the reference.

Default: 0

number
(minimum: 0.1, maximum: 10)

Strength of guidance towards the reference. Higher values adhere more closely to the reference but may reduce creativity.

Default: 2

integer
(minimum: 1)

Interval for sampling frames from the reference video. Higher values skip more frames.

Default: 2

integer
(minimum: 1, maximum: 60)

Frames per second of the output video. Affects playback speed.

Default: 15

integer

Random seed. Leave blank to randomize the seed.

string

Choose the checkpoint version to use

Default: "v1-1"

Run time and cost

This model costs approximately $0.98 to run on Replicate, or about 1 run per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 12 minutes. The predict time for this model varies significantly based on the inputs.
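
As a rough sanity check, assuming Replicate's published A100 (80GB) rate of about $0.0014 per second at the time of writing: 700 s × $0.0014/s ≈ $0.98, consistent with the ~12-minute typical prediction time.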

Readme

✨MimicMotion: High-Quality Human Motion Video Generation🎥

About

Implementation of MimicMotion, a model for generating high-quality human motion videos with confidence-aware pose guidance.

Examples

Here are some examples of MimicMotion’s outputs:


Highlights: rich details, good temporal smoothness, and long video length.

Limitations

  • The model performs best with clear, well-lit input videos and images.
  • Very complex or rapid motions may be challenging for the model to reproduce accurately.
  • Higher resolutions provide more detail but require more processing time and resources.

MimicMotion is a 🔥 model developed by Tencent AI Lab. It excels at generating high-quality human motion videos with rich details and good temporal smoothness.

MimicMotion on Replicate can be used for research and non-commercial work. For commercial use, please contact the original authors.

Core Model

An overview of the framework of MimicMotion.

MimicMotion's architecture combines a UNet-based spatio-temporal diffusion model, a VAE, a CLIP vision encoder, and a custom PoseNet. It features confidence-aware pose guidance and progressive latent fusion for improved video generation.
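
This page does not include the fusion code, so below is a minimal sketch of the boundary-blending idea behind progressive latent fusion, under the assumption that overlapping latent frames from consecutive chunks are linearly cross-faded. The chunk and overlap sizes mirror the inputs above; fuse_chunks is a hypothetical helper, not the authors' API, and the real model performs fusion in latent space during denoising.

import torch

# Hedged sketch (not the authors' implementation): cross-fade the overlapping
# latent frames of consecutive chunks so chunk boundaries stay smooth.
def fuse_chunks(chunks: list[torch.Tensor], overlap: int) -> torch.Tensor:
    """Blend consecutive latent chunks of shape (frames, C, H, W)."""
    fused = chunks[0]
    # Linear blend weights over the overlap region: 0 -> 1 across `overlap` frames.
    w = torch.linspace(0, 1, overlap).view(-1, 1, 1, 1)
    for nxt in chunks[1:]:
        tail, head = fused[-overlap:], nxt[:overlap]
        blended = (1 - w) * tail + w * head  # fade the old chunk out, the new one in
        fused = torch.cat([fused[:-overlap], blended, nxt[overlap:]], dim=0)
    return fused

# Example: three 16-frame chunks with a 6-frame overlap (the defaults above)
chunks = [torch.randn(16, 4, 72, 128) for _ in range(3)]
video_latents = fuse_chunks(chunks, overlap=6)
print(video_latents.shape)  # torch.Size([36, 4, 72, 128]): 16 + 2 * (16 - 6)

With the defaults above (16-frame chunks, 6-frame overlap), each additional chunk contributes 10 new frames of video.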

For more technical details, check out the Research paper.

Safety

⚠️ Users should be aware of potential ethical implications:

  • Ensure you have the right to use reference videos and images, especially those featuring identifiable individuals.
  • Be responsible and transparent about generated content to avoid potential misuse for misinformation.
  • Be cautious about using copyrighted material as reference inputs without permission.
  • Avoid using the model to create videos that could be mistaken for real footage of individuals without their consent.

For more about ethical AI use, visit Tencent’s AI Ethics Principles.

Support

All credit goes to the Tencent AI Lab team. Give me a follow on Twitter if you like my work: @zsakib_

Citation

@article{mimicmotion2024,
  title={MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance},
  author={Yuang Zhang and Jiaxi Gu and Li-Wen Wang and Han Wang and Junqi Cheng and Yuefeng Zhu and Fangyuan Zou},
  journal={arXiv preprint arXiv:2406.19680},
  year={2024}
}

Changelog

  • Bug fix for NoSuchFile error