Readme
Upscale images with Stable Diffusion, optionally including a prompt to subtly alter the input image.
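As a rough illustration of the optional prompt, the sketch below shows one way such an upscaler might be called through the Hugging Face diffusers library. This README does not name a checkpoint or an API, so the model ID, the `build_call_kwargs` and `upscale` helper names, and the choice to pass an empty string when no prompt is given are all assumptions made for the example.

```python
from typing import Any, Dict, Optional

# Assumption: this README does not name a checkpoint; this ID is illustrative.
ASSUMED_MODEL_ID = "stabilityai/stable-diffusion-x4-upscaler"


def build_call_kwargs(image: Any, prompt: Optional[str] = None) -> Dict[str, Any]:
    """Assemble pipeline arguments; a missing prompt becomes an empty string."""
    return {"image": image, "prompt": prompt or ""}


def upscale(image: Any, prompt: Optional[str] = None,
            model_id: str = ASSUMED_MODEL_ID, device: str = "cuda") -> Any:
    """Hypothetical helper: upscale a PIL image, optionally guided by a prompt."""
    # Heavy imports stay local so build_call_kwargs works without them installed.
    import torch
    from diffusers import StableDiffusionUpscalePipeline

    pipe = StableDiffusionUpscalePipeline.from_pretrained(
        model_id, torch_dtype=torch.float16
    )
    pipe = pipe.to(device)
    # The pipeline returns an object whose .images list holds PIL images.
    return pipe(**build_call_kwargs(image, prompt)).images[0]
```

Calling `upscale(low_res_image)` would upscale without text guidance, while `upscale(low_res_image, "a white cat")` would let the prompt nudge the result. Both calls assume a CUDA device and a low-resolution PIL image.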
Model description
A latent diffusion upscaler for the Stable Diffusion autoencoder.
- Developed by: Robin Rombach, Patrick Esser
- Model type: Diffusion-based text-to-image generation model
- Language(s): English
- License: CreativeML Open RAIL++-M License
- Model Description: Generates and modifies images based on text prompts. It is a Latent Diffusion Model that uses a fixed, pretrained text encoder (OpenCLIP-ViT/H).
- Resources for more information: GitHub Repository.
- Cite as:
@InProceedings{Rombach_2022_CVPR,
author = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
title = {High-Resolution Image Synthesis With Latent Diffusion Models},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
pages = {10684-10695}
}