This model runs predictions on Nvidia A100 GPU hardware.
80% of predictions complete within 32 minutes. The predict time for this model varies significantly based on the inputs.
Image prompts are supported thanks to a contribution from nev.
Stage 1 produces a few frames; stage 2 interpolates a longer video from them and performs dsr resampling.
When running both stages, the stage 1 output renders as soon as it is ready, and the stage 2 output follows once it completes.
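
The two-stage flow described above can be pictured with a small, self-contained sketch. Everything in it is hypothetical: `stage1`, `stage2`, `predict`, and the `both_stages` flag are illustrative names, not the model's actual API, and linear blending merely stands in for the learned interpolation the real model performs. The dsr resampling step is not modeled at all. The sketch only shows the ordering: a short clip is yielded first, and a longer clip follows if both stages run.

```python
from typing import Iterator, List

# Hypothetical stand-in for an image: a flat list of pixel values.
Frame = List[float]


def stage1(prompt: str, num_frames: int = 4) -> List[Frame]:
    """Placeholder for stage 1: return a few frames (dummy data here)."""
    return [[float(i)] * 3 for i in range(num_frames)]


def stage2(frames: List[Frame], factor: int = 2) -> List[Frame]:
    """Placeholder for stage 2: insert blended frames between neighbours.

    Linear blending is an assumption made for illustration; it only shows
    that the frame count grows between stage 1 and stage 2.
    """
    out: List[Frame] = []
    for a, b in zip(frames, frames[1:]):
        for k in range(factor):
            t = k / factor
            out.append([(1 - t) * x + t * y for x, y in zip(a, b)])
    out.append(frames[-1])
    return out


def predict(prompt: str, both_stages: bool = True) -> Iterator[List[Frame]]:
    """Yield the stage 1 output as soon as it exists, then stage 2 if requested."""
    short_clip = stage1(prompt)
    yield short_clip
    if both_stages:
        yield stage2(short_clip)


if __name__ == "__main__":
    for index, clip in enumerate(predict("a cat playing piano"), start=1):
        print(f"output {index}: {len(clip)} frames")
```

A caller that iterates over `predict` can display the first clip immediately and replace it with the longer one when it arrives, which mirrors the behaviour described for running both stages.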
Please see the official CogVideo repo for more information: