lucataco / stable-diffusion-x4-upscaler

Stable Diffusion x4 upscaler model

  • Public
  • 6.4K runs



Run time and cost

This model runs on NVIDIA A100 (40GB) GPU hardware. Predictions typically complete within 106 seconds, but prediction time varies significantly with the inputs.


Implementation of stabilityai/stable-diffusion-x4-upscaler


This model card focuses on the model associated with the Stable Diffusion Upscaler. The model was trained for 1.25M steps on a 10M subset of LAION containing images larger than 2048x2048. It was trained on crops of size 512x512 and is a text-guided latent upscaling diffusion model. In addition to the text prompt, it takes a noise_level input parameter, which controls how much noise is added to the low-resolution input according to a predefined diffusion schedule.
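To make the role of noise_level more concrete, here is a minimal sketch of forward-diffusion noising applied to a low-resolution input. It uses the standard relation x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. The linear beta schedule, its start and end values, the 1000-step length, and the helper names `make_alpha_bars` and `add_noise` are all assumptions made for illustration; they are not taken from this model's actual configuration.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Return cumulative products of (1 - beta_t) for a linear beta schedule.

    The schedule shape and its constants are illustrative assumptions,
    not the values used by the upscaler itself.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, noise_level, alpha_bars, rng=None):
    """Noise a flat list of pixel values (scaled to [-1, 1]) at step noise_level.

    Higher noise_level means less of the original signal survives.
    """
    if not 0 <= noise_level < len(alpha_bars):
        raise ValueError("noise_level must index into the schedule")
    if rng is None:
        rng = random.Random()
    alpha_bar = alpha_bars[noise_level]
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]


# Hypothetical 4-pixel "image" noised lightly and heavily.
alpha_bars = make_alpha_bars()
low_res = [0.5, -0.25, 0.0, 1.0]
lightly_noised = add_noise(low_res, 20, alpha_bars, random.Random(0))
heavily_noised = add_noise(low_res, 350, alpha_bars, random.Random(0))
```

In this sketch, a small noise_level leaves the low-resolution input nearly intact, while a larger value replaces more of it with Gaussian noise, which gives the model more freedom to synthesize detail during upscaling.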

Model Details

    @InProceedings{Rombach_2022_CVPR,
        author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
        title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
        booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
        month     = {June},
        year      = {2022},
        pages     = {10684-10695}
    }