

Text-to-video generation


This model runs predictions on NVIDIA A100 GPU hardware.

80% of predictions complete within 32 minutes. The predict time for this model varies significantly based on the inputs.


CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers

Setup can take a long time because the container is 63 GB.

Image prompts are supported thanks to a contribution from nev.

Stage 1 outputs a few frames. Stage 2 interpolates them into a longer video and performs dsr resampling.

When running both stages, the stage 1 output renders as soon as it is ready, and the stage 2 output follows once that stage completes.
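
The staged delivery described above can be pictured with a rough Python sketch. Everything in it is an assumption made for illustration: the function names `stage1_keyframes`, `stage2_interpolate`, and `run_stages`, the `both_stages` flag, and the use of plain numbers as placeholder frames are all hypothetical. The linear midpoint step only stands in for the model's learned interpolation, and dsr resampling is not modeled at all.

```python
from typing import Iterator, List, Tuple

Frame = float  # placeholder: a real frame would be an image array


def stage1_keyframes(prompt: str, count: int = 4) -> List[Frame]:
    """Hypothetical stand-in for stage 1: return a few placeholder frames."""
    return [float(i) for i in range(count)]


def stage2_interpolate(frames: List[Frame]) -> List[Frame]:
    """Hypothetical stand-in for stage 2: insert a midpoint between neighbours.

    The real model uses a learned interpolation, not this linear average.
    """
    if not frames:
        return []
    longer: List[Frame] = []
    for left, right in zip(frames, frames[1:]):
        longer.extend([left, (left + right) / 2])
    longer.append(frames[-1])
    return longer


def run_stages(
    prompt: str, both_stages: bool = True
) -> Iterator[Tuple[str, List[Frame]]]:
    """Yield each stage's output as soon as it is available."""
    keyframes = stage1_keyframes(prompt)
    yield "stage1", keyframes  # the short result can be shown right away
    if both_stages:
        yield "stage2", stage2_interpolate(keyframes)  # the longer result follows


if __name__ == "__main__":
    for name, frames in run_stages("a lion is drinking water"):
        print(name, len(frames), "frames")
```

Using a generator here means a caller can display the stage 1 frames while stage 2 is still being computed, which mirrors the order in which the outputs appear on this page.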

Please see the official CogVideo repo for more information: