lucataco / vseq2vseq

Text-to-video diffusion model with variable-length frame conditioning for infinite-length video

Run time and cost

This model runs on Nvidia A100 (40GB) GPU hardware. Predictions typically complete within 4 minutes. The predict time for this model varies significantly based on the inputs.


Implementation of motexture/vseq2vseq


Increase the --times parameter to create even longer videos.

Additional info

For best results, --num-frames should be 16, 24, or 32. Higher values result in slower motion.
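
The guidance above can be sketched as a small helper that assembles the two flags and warns when the frame count falls outside the recommended values. This is only an illustrative sketch: the function name `build_args`, its defaults, and the assumption that the times value must be at least 1 are hypothetical and do not come from the model's own code. Only the flag names and the recommended values are taken from the text above.

```python
import warnings

# Frame counts the text above recommends for best results.
RECOMMENDED_NUM_FRAMES = (16, 24, 32)


def build_args(num_frames: int = 16, times: int = 1) -> list[str]:
    """Assemble command-line arguments for the two flags named above.

    Hypothetical helper for illustration; not part of the model's code.
    """
    # Assumption: a times value below 1 is not meaningful.
    if times < 1:
        raise ValueError("times must be at least 1")
    if num_frames not in RECOMMENDED_NUM_FRAMES:
        # Warn rather than fail: other values are only described as
        # giving worse results, not as being invalid.
        warnings.warn(
            f"num_frames={num_frames} is outside the recommended values "
            f"{RECOMMENDED_NUM_FRAMES}"
        )
    return ["--num-frames", str(num_frames), "--times", str(times)]
```

For example, `build_args(24, times=4)` returns the argument list for a 24-frame setting with a larger times value, which can then be appended to whatever command launches the model.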