CLIP Guided latent k-diffusion

Run time and cost

Predictions run on Nvidia T4 GPU hardware and typically complete within 11 minutes. Prediction time varies significantly with the inputs.

This demo is based on a simplified diffusion codebase implemented by RiversHaveWings (Katherine Crowson).

Currently it implements CLIP guidance on Jack's finetuned latent diffusion model using OpenCLIP ViT32-LAION2b. It runs slowly because guidance requires gradient calculation through the model's parameters.
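To make the cost of guidance concrete, here is a minimal sketch of a single guidance update in plain Python. It is not this demo's code: the "image encoder" is a hypothetical toy linear map `W` standing in for CLIP, the sample `x` is a short list of numbers standing in for a latent, and all function names are made up for illustration. The idea it shows is the general one behind CLIP guidance: nudge the sample along the gradient of the cosine similarity between its embedding and a target (text) embedding. In a real pipeline that gradient comes from backpropagation through the networks, which is the slow part.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def encode(W, x):
    # Toy stand-in for an image encoder (assumption): embedding = W @ x.
    return [dot(row, x) for row in W]

def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))

def cosine_grad_wrt_x(W, x, t):
    # Gradient of cos(W x, t) with respect to x, derived by hand.
    # With e = W x:  d cos / d e = t / (|e||t|) - cos * e / |e|^2
    # Chain rule through the linear encoder:  d cos / d x = W^T (d cos / d e)
    e = encode(W, x)
    ne, nt = norm(e), norm(t)
    c = dot(e, t) / (ne * nt)
    de = [ti / (ne * nt) - c * ei / (ne * ne) for ei, ti in zip(e, t)]
    return [sum(W[i][j] * de[i] for i in range(len(W))) for j in range(len(x))]

def guided_step(W, x, t, scale):
    # Move the sample a small step toward higher similarity with the target.
    g = cosine_grad_wrt_x(W, x, t)
    return [xi + scale * gi for xi, gi in zip(x, g)]

if __name__ == "__main__":
    W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # hypothetical 2x3 encoder
    x = [1.0, 0.0, 0.5]                      # hypothetical "latent"
    t = [0.0, 1.0]                           # hypothetical target embedding
    for step in range(5):
        print(step, round(cosine(encode(W, x), t), 4))
        x = guided_step(W, x, t, scale=0.1)
```

Here the gradient is cheap because the encoder is a tiny linear map with a closed-form derivative. With a large network in its place, each update needs a forward and a backward pass, so runtime grows with the number of guided steps.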

This is a testbed for future developments; it will change often and may break. You have been warned.