voodoohop / stable-diffusion-dance

  • Public
  • 11 runs
  • L40S

Input

string

Default: "A painting of a moth\nA painting of a killer dragonfly by paul klee, intricate detail\nTwo fishes talking to eachother in deep sea, art by hieronymus bosch"

string

Style suffix to add to the prompt. This can be used to add the same style to each prompt.

Default: "by paul klee, intricate details"
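The default prompt value above holds several prompts separated by newlines, and the style suffix is described as being added to each one. A minimal sketch of that combination follows. The function name `build_prompts` and the single-space separator are assumptions made for illustration; the page does not say how the model actually joins the two strings.

```python
def build_prompts(prompts: str, style_suffix: str = "") -> list[str]:
    """Split a newline-separated prompt string and append a suffix to each line.

    Hypothetical helper: the separator (a single space) is an assumption,
    not something documented for this model.
    """
    # Drop empty lines so a trailing newline does not yield an empty prompt.
    lines = [line.strip() for line in prompts.split("\n") if line.strip()]
    if not style_suffix:
        return lines
    return [f"{line} {style_suffix}" for line in lines]


if __name__ == "__main__":
    for prompt in build_prompts(
        "A painting of a moth\nA painting of a killer dragonfly",
        "by paul klee, intricate details",
    ):
        print(prompt)
```

A prompt that already names a style, like the second default prompt, would end up with the style stated twice under this scheme, so it may be worth leaving the suffix empty when each prompt carries its own style.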

file

Input audio file.

number

Determines influence of your prompt on generation.

Default: 15

integer

Each seed generates a different image.

Default: 13

integer

Number of diffusion steps. More steps can produce better results but take longer to generate. Maximum 30 (using K-Euler-Diffusion).

Default: 20

number

Audio smoothing factor.

Default: 0.8

number

Larger values mean audio will lead to bigger changes in the image.

Default: 0.3

string

Type of loudness to use for audio. Options are 'rms' or 'peak'.

Default: "peak"
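The three audio parameters above (smoothing factor, scale, and loudness type) suggest a per-frame loudness curve that is smoothed and then scaled before it influences the image. The sketch below shows one common way to compute such a curve. It is an illustration under stated assumptions, not the model's implementation: the function names are hypothetical, the smoothing is assumed to be an exponential moving average, and the scale is assumed to be a plain multiplier.

```python
import math


def frame_loudness(samples, sample_rate, fps, kind="peak"):
    """Return one loudness value per video frame (hypothetical helper)."""
    hop = max(1, int(sample_rate / fps))  # audio samples covered by one frame
    values = []
    for start in range(0, len(samples) - hop + 1, hop):
        window = samples[start:start + hop]
        if kind == "rms":
            # Root mean square: average energy in the window.
            values.append(math.sqrt(sum(s * s for s in window) / len(window)))
        elif kind == "peak":
            # Peak: largest absolute sample in the window.
            values.append(max(abs(s) for s in window))
        else:
            raise ValueError("kind must be 'rms' or 'peak'")
    return values


def smooth(values, factor=0.8):
    """Exponential moving average; higher factor means slower changes (assumed)."""
    out = []
    prev = values[0] if values else 0.0
    for v in values:
        prev = factor * prev + (1.0 - factor) * v
        out.append(prev)
    return out


def scale(values, audio_scale=0.3):
    """Multiply the curve so larger values cause bigger changes (assumed)."""
    return [audio_scale * v for v in values]
```

With "peak", short transients such as drum hits dominate the curve; with "rms", sustained energy matters more, which usually gives a steadier signal.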

number

Frames per second for the generated video.

Default: 16

integer

Width of the generated image. The model was really only trained on 512x512 images. Other sizes tend to create less coherent images.

Default: 384

integer

Height of the generated image. The model was really only trained on 512x512 images. Other sizes tend to create less coherent images.

Default: 512

integer

Number of images to generate at once. Larger batch sizes generate images faster but use more GPU memory, so they may fail at higher resolutions.

Default: 24
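To get a feel for how frames per second and batch size interact, the following sketch estimates the number of frames and batches for a given audio length. It assumes one generated frame per video frame across the whole audio duration, which this page does not confirm; frame interpolation, if enabled, may change how many frames are actually diffused. The name `plan_batches` is hypothetical.

```python
import math


def plan_batches(audio_seconds: float, fps: float, batch_size: int) -> tuple[int, int]:
    """Estimate (frame_count, batch_count) under the one-frame-per-step assumption."""
    frames = math.ceil(audio_seconds * fps)
    batches = math.ceil(frames / batch_size)
    return frames, batches


if __name__ == "__main__":
    # 10 seconds of audio at the default 16 fps with the default batch size of 24.
    print(plan_batches(10.0, 16, 24))
```

Under these assumptions, ten seconds of audio at the default settings needs 160 frames, generated in 7 batches.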

boolean

Whether to interpolate between frames using FFmpeg.

Default: true

Output

Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

This model doesn't have a readme.