zsxkib/step-video-t2v:4acfc436

Input

string
Input text prompt describing the video content

Default: "A robot dancing in Times Square"

integer
(minimum: 24, maximum: 204)

Number of frames in output video (24-204)

Default: 24

integer
(minimum: 10, maximum: 15)

Number of denoising steps (10-15 for Turbo model)

Default: 12

number
(minimum: 3, maximum: 7)

Guidance scale for text conditioning (3.0-7.0)

Default: 5

number
(minimum: 15, maximum: 20)

Temporal shift for motion consistency (15.0-20.0)

Default: 17

integer
(minimum: 256, maximum: 1088)

Vertical resolution (256-1088, multiple of 16)

Default: 544

integer
(minimum: 256, maximum: 1984)

Horizontal resolution (256-1984, multiple of 16)

Default: 992
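
The two resolution inputs above must be multiples of 16 within their stated ranges. A minimal sketch of one way to coerce an arbitrary size into a valid value before submitting, assuming rounding down is acceptable; the helper name `snap_resolution` is hypothetical and not part of the model's interface:

```python
def snap_resolution(value: int, lo: int, hi: int, step: int = 16) -> int:
    """Round value down to a multiple of step, then clamp it to [lo, hi].

    Hypothetical helper: lo and hi are assumed to be multiples of step,
    which holds for the ranges listed on this page (256-1088, 256-1984).
    """
    snapped = (value // step) * step
    return max(lo, min(hi, snapped))


# Vertical range is 256-1088, horizontal range is 256-1984.
height = snap_resolution(545, 256, 1088)
width = snap_resolution(1000, 256, 1984)
```

Rounding down rather than to the nearest multiple keeps the result from exceeding the requested size; rounding to nearest would be an equally valid choice.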

number
(minimum: 0.5, maximum: 1.5)

Positive prompt enhancement strength (0.5-1.5)

Default: 1

number
(minimum: 0.5, maximum: 1.5)

Negative prompt suppression strength (0.5-1.5)

Default: 1
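
The ranges and defaults listed above can be checked locally before a prediction is submitted. The sketch below is illustrative only: this page shows each input's type, range, and default but not its name, so every field name here (`num_frames`, `steps`, `guidance_scale`, and so on) is a hypothetical placeholder and may not match the model's actual schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StepVideoInputs:
    # All field names are hypothetical; only the ranges and defaults
    # are taken from the page.
    prompt: str = "A robot dancing in Times Square"
    num_frames: int = 24
    steps: int = 12
    guidance_scale: float = 5.0
    time_shift: float = 17.0
    height: int = 544
    width: int = 992
    pos_strength: float = 1.0
    neg_strength: float = 1.0

    def problems(self) -> List[str]:
        """Return a list of human-readable range violations (empty if valid)."""
        ranges = {
            "num_frames": (24, 204),
            "steps": (10, 15),
            "guidance_scale": (3, 7),
            "time_shift": (15, 20),
            "height": (256, 1088),
            "width": (256, 1984),
            "pos_strength": (0.5, 1.5),
            "neg_strength": (0.5, 1.5),
        }
        found = []
        for name, (lo, hi) in ranges.items():
            value = getattr(self, name)
            if not lo <= value <= hi:
                found.append(f"{name}={value} is outside [{lo}, {hi}]")
        # Height and width must also be multiples of 16.
        for name in ("height", "width"):
            value = getattr(self, name)
            if value % 16 != 0:
                found.append(f"{name}={value} is not a multiple of 16")
        return found
```

With the defaults, `StepVideoInputs().problems()` returns an empty list; something like `StepVideoInputs(steps=30).problems()` reports the step count as out of range.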

Two further inputs, including fps, are collapsed on this page and not shown here.
