zsxkib/humo:121a2140 | Run with an API on Replicate

You're looking at a specific version of this model.

These are the fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

Field	Type	Default value	Description
prompt	string	A person walking confidently down a busy street	Text description of the video. Be detailed about the person, actions, and scene.
reference_image	string		Reference image to control the person's appearance (optional)
audio	string		Audio file for lip-sync and movement synchronization (optional)
width	integer	1280 (min: 640, max: 1344)	Video width in pixels (rounded to the nearest multiple of 8)
height	integer	720 (min: 384, max: 768)	Video height in pixels (rounded to the nearest multiple of 8)
num_frames	integer	49 (min: 1, max: 97)	Number of frames (25 fps, so 25 frames = 1 second)
num_inference_steps	integer	20 (min: 5, max: 100)	Denoising steps. More steps = higher quality but slower
guidance_scale	number	4 (min: 1, max: 20)	Text guidance strength. Higher = follows prompt more closely. Lower values (3-5) often produce more natural lighting.
audio_guidance_scale	number	5.5 (min: 1, max: 20)	Audio guidance strength (used when audio is provided). Higher = better sync
seed	integer	(max: 2147483647)	Random seed for reproducible generation
negative_prompt	string	blurry, low quality, distorted, bad anatomy	What to avoid in the video
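
As a rough illustration of how the fields above fit together, here is a minimal sketch that checks values against the listed ranges before building an input dictionary. The helper names (LIMITS, build_input, duration_seconds, run_humo) are hypothetical and not part of the model or of Replicate. The run_humo function assumes the Replicate Python client is installed, that REPLICATE_API_TOKEN is set, and that you pass the full version ID yourself, since this page shows only the short form.

```python
# Ranges copied from the field table above as (min, max); None means not stated.
LIMITS = {
    "width": (640, 1344),
    "height": (384, 768),
    "num_frames": (1, 97),
    "num_inference_steps": (5, 100),
    "guidance_scale": (1, 20),
    "audio_guidance_scale": (1, 20),
    "seed": (None, 2147483647),
}

FPS = 25  # stated in the num_frames description


def build_input(prompt, **fields):
    """Return an input dict, raising ValueError for out-of-range values."""
    for name, value in fields.items():
        if name in LIMITS:
            lo, hi = LIMITS[name]
            if (lo is not None and value < lo) or (hi is not None and value > hi):
                raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")
    # Fields left out here fall back to the model's defaults.
    return {"prompt": prompt, **fields}


def duration_seconds(num_frames):
    """Clip length implied by the frame count at 25 fps."""
    return num_frames / FPS


def run_humo(version, inputs):
    """Hypothetical wrapper around the Replicate Python client."""
    import replicate  # imported lazily so the helpers above work without it

    return replicate.run(f"zsxkib/humo:{version}", input=inputs)
```

For example, build_input("A chef plating a dish", num_frames=97) passes the range check, and duration_seconds(97) gives the length in seconds of the longest clip the table allows.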

This is the shape of the response you’ll get when you run this model with an API.

Schema

{"format": "uri", "title": "Output", "type": "string"}
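
Since the schema describes the output as a single URI string, one way to consume it is to download the file it points to. The sketch below uses only the Python standard library. The helper names (local_name, download) and the "output.mp4" fallback filename are assumptions for illustration; the schema does not state the file type. Some client versions may return an object rather than a plain string, so the code converts the value with str() first, assuming that yields the URL.

```python
import shutil
import urllib.request
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_name(uri, default="output.mp4"):
    """Pick a local filename from the URI path, falling back to a default."""
    name = PurePosixPath(urlparse(uri).path).name
    return name or default


def download(uri, dest=None):
    """Stream the file at `uri` to disk and return the path written."""
    uri = str(uri)  # tolerate client objects that stringify to their URL
    if urlparse(uri).scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URI, got {uri!r}")
    dest = dest or local_name(uri)
    with urllib.request.urlopen(uri) as response, open(dest, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest
```

Calling download(output) on the value returned by the API would write the file to the current directory under the name taken from the URI path.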