zsyoaoa / invsr

Arbitrary-steps Image Super-resolution via Diffusion Inversion

  • Public
  • 1.8K runs
  • T4

Input

in_path (file, required)

Input low-quality image

integer

Number of sampling steps.

Default: 1

integer

Chopping resolution

Default: 128

integer

Random seed. Leave blank to randomize the seed.

Default: 12345
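
As a rough sketch of how these inputs might be supplied programmatically, the snippet below builds the input payload and hands it to the Replicate Python client. Only `in_path` is named on this page, so the integer inputs are left at their defaults rather than guessing their field names. The helper names `build_input` and `run_invsr` are hypothetical, and the model reference may need a `:<version>` suffix that this page does not give.

```python
def build_input(image_url: str) -> dict:
    """Build the input payload; only the required `in_path` field is set."""
    if not image_url:
        raise ValueError("image_url must be a non-empty URL or path")
    return {"in_path": image_url}


def run_invsr(image_url: str, model_ref: str = "zsyoaoa/invsr"):
    """Run the model through the Replicate client (needs REPLICATE_API_TOKEN).

    Assumption: `model_ref` may need a ":<version>" suffix for this model;
    the version identifier is not shown on this page.
    """
    # Imported lazily so build_input stays usable without the package installed.
    import replicate

    return replicate.run(model_ref, input=build_input(image_url))
```

Calling `run_invsr("https://example.com/low.jpg")` would submit one prediction; the URL here is only a placeholder.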

Output


{
  "input": "https://replicate.delivery/pbxt/M8qhJrY5aD7tG40HumHd3gIIR3LXjMKThkOCNB1oSfGrimcu/32.jpg",
  "output": "https://replicate.delivery/czjl/BqklqAF5Wu5XOxH2WQJ8lV5HzkMruQ9V74VacSfYecMUdv6TA/out.png"
}

Run time and cost

This model costs approximately $0.089 to run on Replicate, or 11 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
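
The "11 runs per $1" figure follows directly from the per-run price. A minimal sketch of that arithmetic, assuming the quoted $0.089 holds for every run (actual cost varies with inputs); the function names are illustrative only:

```python
import math

COST_PER_RUN_USD = 0.089  # approximate figure quoted above


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit into $1 at the given per-run cost."""
    return math.floor(1.0 / cost_per_run)


def estimated_cost(num_runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Estimated total cost in USD, rounded to cents."""
    return round(num_runs * cost_per_run, 2)


print(runs_per_dollar())    # 1 / 0.089 is about 11.2, so 11 whole runs
print(estimated_cost(100))  # roughly $8.90 for 100 runs
```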

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 7 minutes. The predict time for this model varies significantly based on the inputs.

Readme

InvSR Model Card

This model card focuses on the models associated with the InvSR project, which is available here.

Model Details

  • Developed by: Zongsheng Yue
  • Model type: Arbitrary-steps Image Super-resolution via Diffusion Inversion
  • Model Description: This is the model used in the paper "Arbitrary-steps Image Super-resolution via Diffusion Inversion" (arXiv:2412.09013).
  • Resources for more information: GitHub Repository.
  • Cite as:

    @article{yue2024invSR,
      author  = {Zongsheng Yue and Kang Liao and Chen Change Loy},
      title   = {Arbitrary-steps Image Super-resolution via Diffusion Inversion},
      journal = {arXiv preprint arXiv:2412.09013},
      year    = {2024},
    }

Limitations

  • InvSR processes high-resolution images in tiles, which substantially increases inference time.
  • Because it is a generative model, InvSR does not always preserve full fidelity to the input image.
  • InvSR may fail to generate accurate details in complex real-world scenes.
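
To give a feel for why tiling drives up inference time, here is a minimal sketch that counts the tiles needed to cover an image. It assumes square, non-overlapping tiles whose side equals the chopping resolution (default 128); the actual InvSR implementation may use overlapping tiles or a different stride, and `count_tiles` is a hypothetical helper, not part of the project.

```python
import math


def count_tiles(height: int, width: int, chop: int = 128) -> int:
    """Number of non-overlapping chop x chop tiles needed to cover an image.

    Assumption: tiles do not overlap; partial tiles at the border count as
    full tiles.
    """
    if min(height, width, chop) <= 0:
        raise ValueError("height, width and chop must be positive")
    return math.ceil(height / chop) * math.ceil(width / chop)


# Doubling each side of the image quadruples the number of tiles to process.
for side in (128, 256, 512):
    print(side, count_tiles(side, side))
```

Under these assumptions the tile count, and with it the number of model passes, grows with the area of the input image.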

Training

Training Data

The model was fine-tuned on the following data:

  • LSDIR, plus 20K samples from the FFHQ dataset.

Training Procedure

InvSR performs image super-resolution by applying a diffusion inversion technique to SD-Turbo. The detailed training pipeline can be found in the GitHub repository.

We currently provide the following checkpoints:

Evaluation Results

See the paper (arXiv:2412.09013) for details.