arielreplicate / stable_diffusion2_upscaling

Image super-resolution with stable-diffusion V2

  • Public
  • 7.1K runs
  • GitHub
  • License

Run time and cost

This model runs on Nvidia A100 (40GB) GPU hardware. Predictions typically complete within 11 seconds. The predict time for this model varies significantly based on the inputs.
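As a sketch, a prediction like this can be invoked through the Replicate Python client. The input field name (`input_image`) and the per-second A100 rate are assumptions for illustration, not taken from this page; check the model's API schema and current pricing on replicate.com before relying on them.

```python
# Sketch of calling this model via the Replicate Python client.
# The input field name ("input_image") is an assumption; verify it
# against the model's published API schema.

def estimate_cost(runtime_s: float, usd_per_second: float) -> float:
    """Rough prediction cost under per-second GPU billing."""
    return runtime_s * usd_per_second

if __name__ == "__main__":
    import replicate  # requires REPLICATE_API_TOKEN in the environment

    output = replicate.run(
        "arielreplicate/stable_diffusion2_upscaling",
        input={"input_image": open("low_res.png", "rb")},  # assumed field name
    )
    print(output)

    # Typical run: ~11 s; the rate below is illustrative only.
    print(f"~${estimate_cost(11, 0.0023):.4f} per prediction")
```

The network call is guarded behind `__main__` so the cost helper can be reused without an API token.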

Readme

Image super-resolution with Stable Diffusion 2.0

Stable Diffusion is a latent text-to-image diffusion model. This model was fine-tuned to perform image upscaling to high resolutions.
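For reference, the same kind of fine-tuned upscaler is typically driven outside Replicate via the Hugging Face `diffusers` library with the `stabilityai/stable-diffusion-x4-upscaler` checkpoint; treat that checkpoint choice as an assumption about this deployment rather than a confirmed detail:

```python
# Sketch: 4x super-resolution with a Stable Diffusion upscaler pipeline.
# The heavy model download and GPU inference are guarded behind __main__.

def upscaled_size(width: int, height: int, factor: int = 4) -> tuple:
    """The x4 upscaler multiplies each spatial dimension by `factor`."""
    return (width * factor, height * factor)

if __name__ == "__main__":
    import torch
    from PIL import Image
    from diffusers import StableDiffusionUpscalePipeline

    pipe = StableDiffusionUpscalePipeline.from_pretrained(
        "stabilityai/stable-diffusion-x4-upscaler",  # assumed checkpoint
        torch_dtype=torch.float16,
    ).to("cuda")

    low_res = Image.open("low_res.png").convert("RGB")
    result = pipe(prompt="a sharp, detailed photo", image=low_res).images[0]
    assert result.size == upscaled_size(*low_res.size)
    result.save("high_res.png")
```

Note that the upscaler is itself text-conditioned, so the prompt can be used to steer how fine detail is hallucinated.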


The original Stable Diffusion model was created in collaboration between CompVis and RunwayML and builds upon the work:

[teaser image]

High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach*, Andreas Blattmann*, Dominik Lorenz, Patrick Esser, Björn Ommer
CVPR ’22 Oral | GitHub | arXiv | Project page

and many others.


General Disclaimer

Stable Diffusion models are general text-to-image diffusion models and therefore mirror the biases and (mis-)conceptions present in their training data. Although efforts were made to reduce the inclusion of explicit pornographic material, we do not recommend using the provided weights for services or products without additional safety mechanisms and considerations. The weights are research artifacts and should be treated as such. Details on the training procedure and data, as well as the intended use of the model, can be found in the corresponding model card. The weights are available via the StabilityAI organization at Hugging Face under the CreativeML Open RAIL++-M License.

License

The code in this repository is released under the MIT License.

BibTeX

@misc{rombach2021highresolution,
      title={High-Resolution Image Synthesis with Latent Diffusion Models}, 
      author={Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer},
      year={2021},
      eprint={2112.10752},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}