andreasjansson / stable-diffusion-animation

Animate Stable Diffusion by interpolating between two prompts

  • Public
  • 118.6K runs
  • A100 (80GB)
  • GitHub
  • License

Input

  • string (required) — Prompt to start the animation with
  • string (required) — Prompt to end the animation with. You can include multiple prompts by separating them with | (the pipe character)
  • integer — Width of output image. Default: 512
  • integer — Height of output image. Default: 512
  • integer (minimum: 1, maximum: 500) — Number of denoising steps. Default: 50
  • number — Prompt strength. Lower values generate more coherent GIFs; higher values respect the prompts more but can be jumpy. Default: 0.8
  • integer (minimum: 2, maximum: 50) — Number of frames to animate. Default: 10
  • integer (minimum: 1, maximum: 50) — Number of steps to interpolate between animation frames. Default: 5
  • number (minimum: 1, maximum: 20) — Scale for classifier-free guidance. Default: 7.5
  • integer (minimum: 1, maximum: 50) — Frames per second in the output GIF. Default: 20
  • boolean — Whether to reverse the animation and go back to the beginning before looping. Default: false
  • boolean — Whether to use FILM for between-frame interpolation (film-net.github.io). Default: true
  • boolean — Whether to display intermediate outputs during generation. Default: false
  • integer — Random seed. Leave blank to randomize the seed
  • string — Output file format. Default: "gif"
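A minimal sketch of assembling an input payload for this model. The field names below are assumptions — the listing above preserves each input's type, range, and default but not its name — so treat them as illustrative only; check the model's API schema on Replicate for the real names.

```python
# Hypothetical input payload; field names are assumptions, while the
# values, ranges, and defaults come from the input listing above.
inputs = {
    "prompt_start": "a photo of a cat",   # assumed name
    "prompt_end": "a photo of a dog",     # assumed name
    "width": 512,                  # default 512
    "height": 512,                 # default 512
    "num_inference_steps": 50,     # 1..500, default 50
    "prompt_strength": 0.8,        # default 0.8
    "num_animation_frames": 10,    # 2..50, default 10
    "num_interpolation_steps": 5,  # 1..50, default 5
    "guidance_scale": 7.5,         # 1..20, default 7.5
    "output_format": "gif",        # default "gif"
}

# With the Replicate Python client installed and REPLICATE_API_TOKEN set,
# this would start a prediction:
#   import replicate
#   output = replicate.run(
#       "andreasjansson/stable-diffusion-animation", input=inputs)
```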

Output


This example was created by a different version, andreasjansson/stable-diffusion-animation:a0cd8005.

Run time and cost

This model costs approximately $0.13 to run on Replicate, or 7 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 90 seconds. The predict time for this model varies significantly based on the inputs.

Readme

Stable Diffusion Animation

Animate Stable Diffusion by interpolating between two prompts

Code: https://github.com/andreasjansson/cog-stable-diffusion/tree/animate

How does it work?

Starting from random noise, we use Stable Diffusion to denoise for n steps towards the mid-point between the start prompt and the end prompt, where n = num_inference_steps * (1 - prompt_strength). The higher the prompt strength, the fewer the steps towards the mid-point.
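The shared-step count from the formula above can be computed directly. Whether the implementation rounds or truncates the product is not stated here, so rounding is an assumption:

```python
def steps_to_midpoint(num_inference_steps: int, prompt_strength: float) -> int:
    # n = num_inference_steps * (1 - prompt_strength).
    # Higher prompt strength -> fewer shared steps toward the mid-point.
    # Rounding (vs. truncation) is an assumption.
    return round(num_inference_steps * (1.0 - prompt_strength))
```

With the defaults (50 steps, prompt strength 0.8), 10 of the 50 denoising steps are shared before the frames diverge.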

We then denoise from that intermediate noisy output towards num_animation_frames interpolation points between the start and end prompts. Because every frame starts from the same intermediate output, the model generates samples that are similar to each other, resulting in a smoother animation.
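A sketch of how the interpolation points between the two prompts could look, using linear interpolation over plain lists (the real model interpolates text-encoder embedding tensors, and may use a different interpolation scheme):

```python
def lerp_embedding(start, end, t):
    # Linear blend of two prompt embeddings, shown as plain lists here;
    # the real model works on text-encoder tensors.
    return [(1.0 - t) * a + t * b for a, b in zip(start, end)]

def interpolation_points(start, end, num_animation_frames):
    # One embedding per animation frame, from the start prompt (t=0)
    # to the end prompt (t=1), inclusive.
    ts = [i / (num_animation_frames - 1) for i in range(num_animation_frames)]
    return [lerp_embedding(start, end, t) for t in ts]
```

Each of these points becomes the conditioning for one animation frame, all denoised from the same shared intermediate latent.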

Finally, the generated samples are interpolated with Google’s FILM (Frame Interpolation for Large Scene Motion) for extra smoothness.
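FILM synthesizes in-between frames for adjacent pairs, and a common way to apply it is recursively: each pass inserts a midpoint frame between every neighbouring pair, so the frame count grows from m to (m - 1) * 2**passes + 1. Whether this model's interpolation-steps input maps one-to-one onto recursion passes is an assumption; the sketch below just illustrates the growth, with a numeric average standing in for the FILM network:

```python
def recursive_midpoint_interpolation(frames, midpoint, passes):
    # Each pass inserts midpoint(a, b) between every adjacent pair,
    # growing len(frames) from m to (m - 1) * 2**passes + 1.
    # `midpoint` stands in for the FILM network here.
    for _ in range(passes):
        out = [frames[0]]
        for a, b in zip(frames, frames[1:]):
            out.append(midpoint(a, b))
            out.append(b)
        frames = out
    return frames
```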