zsxkib/memo:b9950fa2

Input schema

The fields you can use to run this model with an API. If you don’t provide a value for a field, its default value is used.

| Field | Type | Default value | Min | Max | Description |
|---|---|---|---|---|---|
| image | string | | | | Input image (e.g. PNG/JPG). |
| audio | string | | | | Input audio (e.g. WAV/MP3). |
| resolution | integer | 512 | 64 | 2048 | Resolution for generation (square). Default: 512 |
| fps | integer | 30 | 1 | 60 | Frames per second of output video. Default: 30 |
| num_generated_frames_per_clip | integer | 16 | 1 | 128 | Frames per video clip chunk. Default: 16 |
| inference_steps | integer | 20 | 1 | 200 | Diffusion inference steps. Default: 20 |
| cfg_scale | number | 3.5 | 1 | 20 | Classifier-free guidance scale. Default: 3.5 |
| max_audio_seconds | integer | 8 | 1 | 60 | Max audio duration (in seconds). Default: 8 |
| seed | integer | 0 | | | Set a random seed (None or 0 for random) |
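
As a rough illustration of how the fields above fit together, the following sketch assembles an input payload and checks it against the documented ranges before it is sent anywhere. The field names, defaults, and limits are copied from the schema; the `build_input` helper, the `DEFAULTS` and `RANGES` tables, and the example URLs are hypothetical and not part of the model's API.

```python
# Hypothetical helper: not part of the model's API.
# Defaults and (min, max) limits are copied from the input schema.
DEFAULTS = {
    "resolution": 512,
    "fps": 30,
    "num_generated_frames_per_clip": 16,
    "inference_steps": 20,
    "cfg_scale": 3.5,
    "max_audio_seconds": 8,
    "seed": 0,
}

RANGES = {
    "resolution": (64, 2048),
    "fps": (1, 60),
    "num_generated_frames_per_clip": (1, 128),
    "inference_steps": (1, 200),
    "cfg_scale": (1, 20),
    "max_audio_seconds": (1, 60),
}


def build_input(image, audio, **overrides):
    """Merge overrides onto the documented defaults and range-check them."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    payload = {"image": image, "audio": audio, **DEFAULTS, **overrides}

    for name, (low, high) in RANGES.items():
        value = payload[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return payload


if __name__ == "__main__":
    # Placeholder URLs, used only for illustration.
    example = build_input(
        "https://example.com/face.png",
        "https://example.com/speech.wav",
        fps=24,
        inference_steps=30,
    )
    print(example)
```

Checking values locally is optional, since omitted fields fall back to their defaults anyway, but it can surface an out-of-range value before a request is made. The resulting dictionary could then be passed as the input to whichever API client you use.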

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{'format': 'uri', 'title': 'Output', 'type': 'string'}
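
The schema says only that the response is a single string in URI format. A minimal sketch of a sanity check on such a value follows; the `is_uri_string` helper and the sample URL are hypothetical, and the check is a loose approximation of "looks like a URI", not a full validator.

```python
from urllib.parse import urlparse


def is_uri_string(value):
    """Loosely check that value is a string with a URI scheme and a target."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    # A usable URI needs a scheme plus either a host or a path.
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


if __name__ == "__main__":
    # Placeholder value standing in for a real response.
    output = "https://example.com/output.mp4"
    print(is_uri_string(output))
```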