Latent blending
Enables exceptionally smooth video transitions between prompts. The method mixes intermediate latent representations from the diffusion process to create a seamless transition.
Cost: ~$0.02 / transition
Bootup (if required): ~2 minutes
Runtime: ~10s / transition
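The core idea of mixing intermediate latents can be sketched as interpolation between two noise latents. The project's exact mixing schedule is more involved; below is only a minimal illustration using spherical linear interpolation (slerp), a common choice for Gaussian diffusion latents, with NumPy standing in for the real tensor stack:

```python
import numpy as np

def slerp(z_a, z_b, t):
    """Spherical linear interpolation between two latent vectors."""
    a = z_a / np.linalg.norm(z_a)
    b = z_b / np.linalg.norm(z_b)
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        # Vectors are nearly parallel; fall back to linear blending.
        return (1 - t) * z_a + t * z_b
    return (np.sin((1 - t) * omega) * z_a + np.sin(t * omega) * z_b) / np.sin(omega)

# Generate a short sequence of blended latents between two endpoints.
rng = np.random.default_rng(0)
z_a = rng.standard_normal(16)
z_b = rng.standard_normal(16)
frames = [slerp(z_a, z_b, t) for t in np.linspace(0.0, 1.0, 5)]
```

Each blended latent would then be decoded into a video frame; decoding and the per-step mixing inside the denoising loop are omitted here.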
Recommendations
caption
More ambiguity is better, as it gives the model more flexibility to find perceptual similarity. More detailed prompts lead to more motion, so you'll want to make those transitions longer.
transition_time
10 seconds is roughly the cutoff for imperceptible motion: at this length or longer, you won't notice the transition happening. The shorter the time, the more obvious the blending effect (not a bad thing!).
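The two recommendations above can be summarized as a settings dictionary. The parameter names `caption` and `transition_time` come from this document, but the surrounding request shape is an illustrative assumption, not the actual API:

```python
# Illustrative settings only; the dict structure is an assumption.
transition_request = {
    # Deliberately ambiguous captions give the model room to find
    # perceptually similar intermediates.
    "caption": ["a shape dissolving in fog", "light scattering through water"],
    # Seconds; >= 10 reads as seamless, shorter makes the blend visible.
    "transition_time": 10,
}
```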
My comments
I really liked the aesthetic of the blend in this project and decided to use it as a base for future vector-based content generation. Anyone can replicate an image of something that exists, but what new things can we generate with the right model?
Models utilized
- Diffusion: Stability AI - Stability XL Turbo
- Feature extraction (perceptual similarity): AlexNet
TODO
- [ ] Quality - Add 4x upscaling option once the processing pipeline is optimized
- [ ] Quality - Compare seed latents with LPIPS to select the closest seeds (degrades performance)
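The seed-selection idea in the last TODO item can be prototyped independently of the model stack. In the sketch below, a plain L2 distance stands in for LPIPS (AlexNet backbone) so it runs without torch; the selection logic is the part being illustrated, and the helper name `closest_seed` is my own:

```python
import numpy as np

def closest_seed(candidates, reference, distance):
    """Return the index of the candidate closest to `reference` under `distance`.

    In the real pipeline, `distance` would be LPIPS with an AlexNet
    backbone; plain L2 stands in here as a runnable placeholder.
    """
    return int(np.argmin([distance(c, reference) for c in candidates]))

def l2(a, b):
    return float(np.linalg.norm(a - b))

rng = np.random.default_rng(1)
reference = rng.standard_normal(8)
candidates = [reference + rng.standard_normal(8) for _ in range(4)]
candidates[2] = reference.copy()  # exact match, so it must win
best = closest_seed(candidates, reference, l2)  # -> 2
```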