Stable Diffusion x2 latent upscaler

Run time and cost

Predictions run on Nvidia A100 GPU hardware. Predictions typically complete within 12 seconds.

weights from: https://huggingface.co/stabilityai/sd-x2-latent-upscaler

This model card focuses on the latent diffusion-based upscaler developed by Katherine Crowson
in collaboration with Stability AI.
This model was trained on a high-resolution subset of the LAION-2B dataset.
It is a diffusion model that operates in the same latent space as the Stable Diffusion model; the upscaled latent is then decoded into a full-resolution image.
To use it with Stable Diffusion, you can take the generated latent from Stable Diffusion and pass it into the upscaler before decoding with your standard VAE.
Or you can take any image, encode it into the latent space, use the upscaler, and decode it.
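
The shape bookkeeping for either route can be sketched as follows. This is a minimal illustration, not part of the model's code: it assumes the Stable Diffusion 1.x/2.x VAE, which shrinks each image side by a factor of 8 and uses 4 latent channels, and the function names are hypothetical.

```python
# Assumptions (not stated on this page): the Stable Diffusion 1.x/2.x VAE
# downsamples each image side by 8 and produces 4-channel latents.
VAE_SCALE = 8
LATENT_CHANNELS = 4
UPSCALE = 2  # the "x2" in the model name


def latent_shape(width, height):
    """Shape (channels, height, width) of the latent for an image of the given size."""
    if width % VAE_SCALE or height % VAE_SCALE:
        raise ValueError("image sides must be multiples of %d" % VAE_SCALE)
    return (LATENT_CHANNELS, height // VAE_SCALE, width // VAE_SCALE)


def upscaled_latent_shape(shape):
    """Shape of the latent after the x2 upscaler: spatial sides double, channels stay."""
    channels, height, width = shape
    return (channels, height * UPSCALE, width * UPSCALE)


def decoded_size(shape):
    """(width, height) in pixels of the image the VAE decodes from a latent."""
    _, height, width = shape
    return (width * VAE_SCALE, height * VAE_SCALE)


low = latent_shape(512, 512)        # latent of a 512x512 generation
high = upscaled_latent_shape(low)   # latent after the upscaler
print(low, high, decoded_size(high))
```

Under these assumptions a 512x512 generation is upscaled entirely in latent space and only decoded once, at 1024x1024.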

This upscaling model is designed specifically for Stable Diffusion: it upscales Stable Diffusion's denoised latent image embeddings directly.
This allows for very fast text-to-image + upscaling pipelines, as all intermediate states can be kept on the GPU.
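
A minimal sketch of such a pipeline using the diffusers library, under stated assumptions: a CUDA GPU is available, and `CompVis/stable-diffusion-v1-4` is used as the base model (it is only an example choice; the page does not name one). The function name `text_to_upscaled_image` is hypothetical, and the step count and guidance scale are illustrative values rather than tuned settings.

```python
# Assumption: this base checkpoint is only an example; the page does not name one.
BASE_MODEL_ID = "CompVis/stable-diffusion-v1-4"
UPSCALER_MODEL_ID = "stabilityai/sd-x2-latent-upscaler"


def text_to_upscaled_image(prompt, seed=33, device="cuda"):
    """Generate a latent with Stable Diffusion, upscale it x2, and decode once.

    Hypothetical helper: requires the torch and diffusers packages and a GPU.
    """
    # Imported lazily so that defining this function does not need a GPU stack.
    import torch
    from diffusers import (
        StableDiffusionLatentUpscalePipeline,
        StableDiffusionPipeline,
    )

    base = StableDiffusionPipeline.from_pretrained(
        BASE_MODEL_ID, torch_dtype=torch.float16
    ).to(device)
    upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
        UPSCALER_MODEL_ID, torch_dtype=torch.float16
    ).to(device)

    generator = torch.Generator().manual_seed(seed)

    # output_type="latent" skips the VAE decode, so the result stays a latent
    # tensor on the GPU instead of becoming a PIL image.
    low_res_latents = base(
        prompt, generator=generator, output_type="latent"
    ).images

    # The upscaler takes the latent directly and decodes the upscaled result.
    result = upscaler(
        prompt=prompt,
        image=low_res_latents,
        num_inference_steps=20,  # illustrative value
        guidance_scale=0,        # illustrative value
        generator=generator,
    )
    return result.images[0]
```

Calling `text_to_upscaled_image("a photo of an astronaut").save("astronaut_2x.png")` would write the upscaled image to disk; the low-resolution image is never decoded or copied back to the CPU along the way.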