Run time and cost

This model costs approximately $0.13 to run on Replicate, or 7 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 97 seconds. The predict time for this model varies significantly based on the inputs.

Readme

Stable Diffusion Infinite Zoom

Run it on Replicate:

This repo is based on Stable Diffusion by the CompVis group and Stable Inpainting by Runway.

The idea is based on this tweet by Matt Henderson.

Model description

Given a prompt, I run txt2img.py with sd-v1-4.ckpt. Then I paste a downscaled version of the image into its center and inpaint around the center with inpaint.py, using the sd-v1-5-inpainting.ckpt checkpoint. I repeat the inpainting step twice.
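The geometry of the paste-and-inpaint step can be sketched without any image library. This is only an illustration, not the code in inpaint.py: the names center_box and outer_mask, the shrink factor of 0.5, the 512x512 frame size, and the convention that 255 marks pixels to repaint are all assumptions made for the example.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def center_box(width: int, height: int, shrink: float) -> Box:
    """Box occupied by a copy of the frame, downscaled by `shrink`, when centered."""
    if not 0.0 < shrink < 1.0:
        raise ValueError("shrink must be between 0 and 1 (exclusive)")
    small_w = round(width * shrink)
    small_h = round(height * shrink)
    left = (width - small_w) // 2
    top = (height - small_h) // 2
    return (left, top, left + small_w, top + small_h)


def outer_mask(width: int, height: int, keep: Box) -> List[List[int]]:
    """Row-major mask: 0 inside `keep` (pasted image), 255 where new content is painted."""
    left, top, right, bottom = keep
    return [
        [0 if (left <= x < right and top <= y < bottom) else 255 for x in range(width)]
        for y in range(height)
    ]


# Hypothetical values: a 512x512 frame with the previous image shrunk to half size.
box = center_box(512, 512, shrink=0.5)
mask = outer_mask(512, 512, box)
print(box)  # (128, 128, 384, 384)
```

With an image library, the box would be the paste target for the downscaled image and the mask would tell the inpainting model which border region to fill.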

Then I zoom in by upscaling the image and cropping it to the original size, while pasting the “center” image back into its corresponding area.
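The zoom step can be sketched as a series of centered crop boxes, one per video frame. This is a rough illustration rather than the logic in inf_zoom.py: the helper names, the frame count, the maximum zoom of 2.0, and the log-spaced zoom levels are assumptions. Cropping a centered region and then resizing it to the frame size is used here as an equivalent of upscaling first and cropping afterwards (up to rounding).

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def zoom_crop_box(width: int, height: int, zoom: float) -> Box:
    """Centered region that fills the whole frame once it is upscaled by `zoom`."""
    if zoom < 1.0:
        raise ValueError("zoom must be >= 1")
    crop_w = round(width / zoom)
    crop_h = round(height / zoom)
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


def zoom_levels(frames: int, max_zoom: float) -> List[float]:
    """Zoom factors from 1 to max_zoom, evenly spaced on a log scale."""
    if frames < 2:
        raise ValueError("need at least two frames")
    return [max_zoom ** (i / (frames - 1)) for i in range(frames)]


# Hypothetical values: 5 frames zooming a 512x512 image from 1x to 2x.
levels = zoom_levels(frames=5, max_zoom=2.0)
boxes = [zoom_crop_box(512, 512, z) for z in levels]
for z, b in zip(levels, boxes):
    print(f"zoom {z:.3f} -> crop {b}")
```

Spacing the zoom factors on a log scale keeps the apparent zoom speed constant between frames. At the final level the crop box matches the area where a half-size "center" image would sit, which is why the pasted image lines up with the zoomed frame.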

How to run

Download text-2-image and inpainting weights

from huggingface_hub import hf_hub_download

hf_token = "<HuggingFace token>"  # replace with your own access token

# Text-to-image weights
hf_hub_download(repo_id="runwayml/stable-diffusion-v1-5", filename="v1-5-pruned-emaonly.ckpt", cache_dir=".", use_auth_token=hf_token)
# Inpainting weights
hf_hub_download(repo_id="runwayml/stable-diffusion-inpainting", filename="sd-v1-5-inpainting.ckpt", cache_dir=".", use_auth_token=hf_token)

Create video

python3 scripts/inf_zoom.py <your prompt>
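If the script is launched from Python instead of a shell, the command can be built as an argument list so a multi-word prompt stays in one piece. This sketch assumes the script takes the prompt as a single positional argument, which the command above suggests but does not confirm; build_zoom_command and the sample prompt are hypothetical.

```python
import shlex
from typing import List


def build_zoom_command(prompt: str, script: str = "scripts/inf_zoom.py") -> List[str]:
    """Argument list that passes the whole prompt as one argument (an assumption)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return ["python3", script, prompt]


command = build_zoom_command("a lush forest, highly detailed")

# shlex.join shows how the same call would be quoted in a POSIX shell.
print(shlex.join(command))

# To actually launch it from the repository root:
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with spaces, commas, or quotes inside the prompt.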

Credits

@misc{rombach2021highresolution,
      title={High-Resolution Image Synthesis with Latent Diffusion Models}, 
      author={Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer},
      year={2021},
      eprint={2112.10752},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}