
crowsonkb / clip-guided-diffusion-cfg

Generates images from text descriptions with classifier-free guidance


V-Diffusion

v objective diffusion inference code for PyTorch, by Katherine Crowson (@RiversHaveWings) and Chainbreakers AI (@jd_pressman).

The models are denoising diffusion probabilistic models (https://arxiv.org/abs/2006.11239), which are trained to reverse a gradual noising process, allowing them to generate samples from the learned data distributions starting from random noise. DDIM-style deterministic sampling (https://arxiv.org/abs/2010.02502) is also supported. The models are trained on continuous timesteps and use the 'v' objective from Progressive Distillation for Fast Sampling of Diffusion Models (https://openreview.net/forum?id=TIdIXIpzhoI). Guided diffusion sampling scripts (https://arxiv.org/abs/2105.05233) are included, specifically CLIP guided diffusion.
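
As a minimal sketch of the 'v' parameterization (an illustration, not the repository's actual code; the helper names `v_target` and `ddim_step` are hypothetical): with a noise schedule satisfying alpha^2 + sigma^2 = 1, the training target is v = alpha * eps - sigma * x, and both the clean-image and noise predictions can be recovered from v to take a deterministic DDIM step.

```python
import torch

def v_target(x: torch.Tensor, eps: torch.Tensor,
             alpha: torch.Tensor, sigma: torch.Tensor) -> torch.Tensor:
    """'v' objective target: v = alpha * eps - sigma * x, where x is the
    clean image, eps is the noise, and alpha^2 + sigma^2 = 1."""
    return alpha * eps - sigma * x

def ddim_step(x_t, v, alpha, sigma, alpha_prev, sigma_prev):
    """One deterministic DDIM step using the v parameterization."""
    # Recover the model's clean-image and noise predictions from v.
    pred_x0 = alpha * x_t - sigma * v
    pred_eps = sigma * x_t + alpha * v
    # Re-noise the predicted clean image at the previous timestep's level.
    return alpha_prev * pred_x0 + sigma_prev * pred_eps
```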

This model in particular is conditioned on CLIP text embeddings and sampled with classifier-free guidance (https://openreview.net/pdf?id=qw8AKxfYbI), similar to GLIDE (https://arxiv.org/abs/2112.10741).
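
A minimal sketch of classifier-free guidance at sampling time (assuming a model that takes a CLIP text embedding as conditioning; the call signature and the use of a zero embedding for the unconditional pass are assumptions, not the repository's actual interface):

```python
import torch

@torch.no_grad()
def cfg_model_fn(model, x_t, t, clip_embed, guidance_scale=3.0):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the CLIP-text-conditioned one."""
    # Batch the conditional and unconditional passes into one forward call.
    x_in = torch.cat([x_t, x_t])
    t_in = torch.cat([t, t])
    # A zero embedding as the "null" conditioning is an assumption; the
    # actual unconditional input may be trained differently.
    embed_in = torch.cat([torch.zeros_like(clip_embed), clip_embed])
    v_uncond, v_cond = model(x_in, t_in, embed_in).chunk(2)
    return v_uncond + guidance_scale * (v_cond - v_uncond)
```

A guidance scale of 1 reproduces the conditional model; larger values trade sample diversity for closer adherence to the text prompt.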

Thank you to stability.ai for providing the compute used to train these models!