daanelson / stable-diffusion-long-prompts

img2img Stable Diffusion, but with longer prompts


Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input.

This is a custom Stable Diffusion pipeline modified to accept longer prompts of up to 231 tokens, compared with the 77-token limit of ordinary Stable Diffusion. Source code for this implementation can be found here. Implementation by SkyTNT.

Prompt weighting is also supported:

- Emphasize part of your prompt with parentheses: a baby deer with (big eyes)
- De-emphasize part of your prompt with square brackets: a [baby] deer with big eyes
- Precisely weight part of your prompt with an explicit value: a baby deer with (big eyes:1.3)

Prompt weighting equivalents:

- a baby deer with == (a baby deer with:1.0)
- (big eyes) == (big eyes:1.1)
- ((big eyes)) == (big eyes:1.21)
- [big eyes] == (big eyes:0.91)
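The equivalents above suggest that each pair of parentheses multiplies the weight by 1.1 and each pair of square brackets divides it by 1.1 (1 / 1.1 ≈ 0.91). The sketch below illustrates that arithmetic only. It is not the pipeline's actual parser: the function name `parse_weights`, the regular expression, and the handling of unbalanced brackets (they are simply ignored) are assumptions made for this example.

```python
import re
from typing import List, Tuple

PAREN_STEP = 1.1        # assumed multiplier per "(...)" level, inferred from the table above
BRACKET_STEP = 1 / 1.1  # assumed multiplier per "[...]" level (about 0.91)

# Tokens: "(", "[", ":<number>)", ")", "]", a run of plain text, or a stray ":".
TOKEN_RE = re.compile(r"\(|\[|:([+-]?\d*\.?\d+)\s*\)|\)|\]|[^()\[\]:]+|:")


def parse_weights(prompt: str) -> List[Tuple[str, float]]:
    """Split a prompt into (text, weight) segments (hypothetical helper)."""
    segments = []        # mutable [text, weight] pairs
    paren_starts = []    # index into segments where each open "(" began
    bracket_starts = []  # index into segments where each open "[" began

    def scale(start: int, factor: float) -> None:
        # Multiply the weight of every segment created since the opening bracket.
        for segment in segments[start:]:
            segment[1] *= factor

    for match in TOKEN_RE.finditer(prompt):
        token, explicit = match.group(0), match.group(1)
        if token == "(":
            paren_starts.append(len(segments))
        elif token == "[":
            bracket_starts.append(len(segments))
        elif explicit is not None and paren_starts:
            # "(text:1.3)" form: use the explicit weight instead of 1.1.
            scale(paren_starts.pop(), float(explicit))
        elif token == ")" and paren_starts:
            scale(paren_starts.pop(), PAREN_STEP)
        elif token == "]" and bracket_starts:
            scale(bracket_starts.pop(), BRACKET_STEP)
        else:
            # Plain text (or an unmatched closer) keeps the default weight.
            segments.append([token, 1.0])

    # Round for display, matching the two-decimal values in the table above.
    return [(text, round(weight, 2)) for text, weight in segments]


print(parse_weights("a baby deer with (big eyes:1.3)"))
print(parse_weights("((big eyes))"))
print(parse_weights("[big eyes]"))
```

Under these assumptions, `((big eyes))` resolves to a weight of 1.21 (1.1 × 1.1) and `[big eyes]` to 0.91, which matches the listed equivalents.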

For an in-depth Stable Diffusion model card, see the official Replicate implementation of Stable Diffusion.

```
@InProceedings{Rombach_2022_CVPR,
    author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj{\"o}rn},
    title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {10684-10695}
}
```