arthur-qiu / longercrafter

Tuning-Free Longer Video Diffusion via Noise Rescheduling

  • Public
  • 14.9K runs
  • GitHub
  • Paper
  • License

Run time and cost

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 8 minutes, but predict time varies significantly with the inputs.

Readme

FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling

The 576x1024 model supports up to 64 frames; the 256x256 model supports up to 512 frames.

Input: "A chihuahua in astronaut suit floating in space, cinematic lighting, glow effect";
Resolution: 1024 x 576; Frames: 64.

Input: "Campfire at night in a snowy forest with starry sky in the background";
Resolution: 1024 x 576; Frames: 64.
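The examples above can be reproduced through the Replicate Python client. A minimal sketch follows; the input key names (`width`, `height`, `frames`) are assumptions based on the examples above, so check the model's input schema on its Replicate page before use.

```python
def build_input(prompt: str, width: int = 1024, height: int = 576, frames: int = 64) -> dict:
    # Assemble the prediction payload. Key names here are assumptions
    # inferred from the examples above -- verify against the model's schema.
    return {"prompt": prompt, "width": width, "height": height, "frames": frames}

# To run the model (requires `pip install replicate` and a REPLICATE_API_TOKEN):
#   import replicate
#   video_url = replicate.run(
#       "arthur-qiu/longercrafter",  # pin a specific version hash in practice
#       input=build_input("Campfire at night in a snowy forest with starry sky in the background"),
#   )
```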

🧲 Support For Other Models

FreeNoise should work with other similar frameworks. An easy way to test compatibility is to shuffle the initial noise (with eta set to 0, so sampling is deterministic) and check whether a new but similar video is generated. If you have any questions about applying FreeNoise to other frameworks, feel free to contact Haonan Qiu.
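The shuffle test above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a hypothetical `[C, T, H, W]` latent-noise layout and simply permutes the per-frame noise slices along the temporal axis before handing them to a deterministic (eta = 0) sampler.

```python
import numpy as np

def shuffle_frame_noise(noise: np.ndarray, seed: int = 0) -> np.ndarray:
    """Shuffle per-frame noise slices along the temporal axis.

    `noise` is assumed to have shape [C, T, H, W] (one video's initial
    latent noise). With deterministic sampling (eta = 0), a
    FreeNoise-compatible model should turn the shuffled noise into a
    new but similar, temporally coherent video.
    """
    perm = np.random.default_rng(seed).permutation(noise.shape[1])
    return noise[:, perm]  # reorder frames; each frame's noise is unchanged
```

If the resulting video degrades into flicker instead of a coherent variation, the framework's temporal attention likely needs the FreeNoise window-based fusion before this rescheduling applies.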

👨‍👩‍👧‍👦 Crafter Family

VideoCrafter: Framework for high-quality video generation.

ScaleCrafter: Tuning-free method for high-resolution image/video generation.

TaleCrafter: An interactive story visualization tool that supports multiple characters.

😉 Citation

@misc{qiu2023freenoise,
      title={FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling}, 
      author={Haonan Qiu and Menghan Xia and Yong Zhang and Yingqing He and Xintao Wang and Ying Shan and Ziwei Liu},
      year={2023},
      eprint={2310.15169},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

📢 Disclaimer

We developed this repository for RESEARCH purposes, so it may only be used for personal, research, or other non-commercial purposes.