deforum/deforum_stable_diffusion

Animating prompts with Stable Diffusion

Public
265.5K runs

Input

integer (minimum: 100, maximum: 1000)

Number of frames for the animation.

Default: 100

string

Prompts for the animation. Provide 'frame number : prompt at this frame' pairs, and separate different prompts with '|'. Make sure no frame number exceeds max_frames.

Default: "0: a beautiful portrait of a woman by Artgerm, trending on Artstation"

string

Angle parameter for the motion, given as a 'frame: (value)' keyframe schedule.

Default: "0:(0)"

string

Zoom parameter for the motion, in the same keyframe format.

Default: "0: (1.04)"

string

translation_x parameter for the motion, in the same keyframe format.

Default: "0: (0)"

string

translation_y parameter for the motion, in the same keyframe format.

Default: "0: (0)"

string

Color coherence mode (an enumeration).

Default: "Match Frame 0 LAB"

string

Diffusion sampler (an enumeration).

Default: "plms"

integer (minimum: 10, maximum: 60)

Choose fps for the video.

Default: 15

integer

Random seed. Leave blank to randomize the seed.
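
As a rough illustration, the inputs above could be passed through the Replicate Python client as sketched below. The input key names (max_frames, animation_prompts, angle, zoom, translation_x, translation_y, fps) are inferred from the descriptions above rather than taken from the model's published schema, so check the model's API tab before relying on them.

```python
# Sketch only: requires `pip install replicate` and a REPLICATE_API_TOKEN
# environment variable. Input key names are assumptions inferred from the
# parameter descriptions above; verify them against the model's API schema.
import replicate

output = replicate.run(
    "deforum/deforum_stable_diffusion",  # you may need to pin a specific version
    input={
        "max_frames": 100,
        # 'frame : prompt' pairs separated by '|'; keep frame numbers below max_frames
        "animation_prompts": (
            "0: a beautiful portrait of a woman by Artgerm, trending on Artstation | "
            "50: a beautiful portrait of a woman underwater, trending on Artstation"
        ),
        "angle": "0:(0)",
        "zoom": "0: (1.04)",
        "translation_x": "0: (0)",
        "translation_y": "0: (0)",
        "fps": 15,
    },
)
print(output)  # URL (or file handle) of the generated video
```

The motion parameters use the same keyframe syntax as the prompts, so a schedule such as "0: (1.04) | 60: (1.0)" varies the value over the course of the animation.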

Output

This output was created using a different version of the model, deforum/deforum_stable_diffusion:fa562b4f.

Run time and cost

This model costs approximately $0.68 to run on Replicate, or about 1 run per $1, but this varies depending on your inputs. It is also open source, and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 9 minutes.

Readme

By deforum, a community of AI image synthesis developers, enthusiasts, and artists.

If you have any questions or need help, join us on Discord.

seedance-1-pro

bytedance/seedance-1-pro

Generate videos from text prompts or a single reference image. Output 3-12 second clips at 24 fps in 480p, 720p, or 1080p, with optional vertical, square, or widescreen aspect ratios. Maintain physical realism and temporal consistency across complex motion and multi-agent interactions, and support multi-shot sequences with narrative coherence. Control camera behavior (fixed or moving) and guide image-to-video with start and last frame images for smooth transitions. Interpret diverse styles including photorealism, cyberpunk, illustration, and felt texture with strong prompt adherence.

624.1k runs
Official
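
As a rough sketch, a text-to-video call to this model through the Replicate Python client might look like the following; the input keys (duration, resolution, aspect_ratio, fps, camera_fixed) are assumptions based on the description above, not the published schema.

```python
import replicate

# Hypothetical input keys inferred from the description; check the model's
# API schema before use.
output = replicate.run(
    "bytedance/seedance-1-pro",
    input={
        "prompt": "a lighthouse on a cliff at dusk, waves crashing below",
        "duration": 5,            # 3-12 second clips
        "resolution": "1080p",    # 480p, 720p, or 1080p
        "aspect_ratio": "16:9",   # widescreen; vertical and square also supported
        "fps": 24,
        "camera_fixed": False,    # fixed vs. moving camera
    },
)
print(output)
```
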
kling-v2.1

kwaivgi/kling-v2.1

Animate a single image into a 5s or 10s 24fps video guided by a text prompt. Accepts a required start image and optional negative prompt, and outputs a video. Choose standard (720p) or pro (1080p); pro mode also supports specifying an end image to control the final frame.

2.0m runs
Official
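
A minimal image-to-video sketch, assuming the input keys roughly follow the description above (start_image, mode, duration, negative_prompt); verify against the model's schema before use.

```python
import replicate

# Assumed input keys based on the description; the start image is required.
output = replicate.run(
    "kwaivgi/kling-v2.1",
    input={
        "prompt": "the portrait slowly turns toward the camera and smiles",
        "start_image": "https://example.com/portrait.png",
        "mode": "pro",           # "standard" = 720p, "pro" = 1080p
        "duration": 5,           # 5 or 10 seconds
        "negative_prompt": "blurry, distorted hands",
    },
)
print(output)
```
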
seedance-1-lite

bytedance/seedance-1-lite

Generate short videos from text prompts or a starting image. Produce 3–12s clips at 24 fps in 480p, 720p, or 1080p, with aspect ratios including 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, and 9:21. Guide subjects with 1–4 reference images for consistent characters, clothing, avatars, environments, and multi-character interactions (not available at 1080p or when using first/last frame conditioning). Optionally start from a first-frame image and set a target last frame, lock the camera position, and fix a random seed for reproducibility. Supports styles such as photorealism, cyberpunk, illustration, and felt texture, and handles complex motion and multi-character scenes.

1.2m runs
Official
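
A sketch of subject-guided generation, assuming a reference_images input as suggested by the description; all key names here are assumptions rather than the published schema.

```python
import replicate

# Assumed keys; reference images are described as unavailable at 1080p or
# together with first/last-frame conditioning.
output = replicate.run(
    "bytedance/seedance-1-lite",
    input={
        "prompt": "the same character walks through a neon-lit night market",
        "reference_images": [
            "https://example.com/character-front.png",
            "https://example.com/character-side.png",
        ],
        "duration": 5,
        "resolution": "720p",
        "aspect_ratio": "9:16",
        "seed": 42,  # fixed seed for reproducibility
    },
)
print(output)
```
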
kling-v1.6-standard

kwaivgi/kling-v1.6-standard

Generate 5–10 second 720p 30 fps videos from a text prompt. Animate a start image as the first frame to turn a still into motion, and optionally guide scenes with up to 4 reference images. Choose aspect ratios 16:9, 9:16, or 1:1 (start image preserves its own aspect). Outputs a video.

1.3m runs
Official
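
A sketch of animating a still into a 720p clip, with input keys (start_image, duration, aspect_ratio) assumed from the description rather than the schema.

```python
import replicate

# Assumed input keys based on the description above.
output = replicate.run(
    "kwaivgi/kling-v1.6-standard",
    input={
        "prompt": "gentle snowfall over a quiet mountain village at night",
        "start_image": "https://example.com/village-still.jpg",  # used as the first frame
        "duration": 10,          # 5-10 seconds at 720p, 30 fps
        "aspect_ratio": "16:9",  # ignored when a start image fixes the aspect
    },
)
print(output)
```
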

kling-v2.5-turbo-pro

kwaivgi/kling-v2.5-turbo-pro

Generate videos from text prompts or from a single reference image plus prompt. Produce cinematic motion with complex camera moves, smooth playback, and stable frames, using a text-timing engine that follows multi-step, causal instructions and pacing. Maintain style and color consistency with refined image conditioning that preserves palette, lighting, and mood across frames. Choose 5 or 10 second clips and aspect ratios 1:1, 16:9, or 9:16 for square, widescreen, or vertical outputs, with faster inference suitable for high-speed action.

14.3k runs
Official
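
A sketch of a prompt-plus-reference-image call, with key names (image, duration, aspect_ratio) assumed from the description.

```python
import replicate

# Assumed input keys; the reference image is optional.
output = replicate.run(
    "kwaivgi/kling-v2.5-turbo-pro",
    input={
        "prompt": (
            "a skateboarder drops into a bowl, the camera tracks alongside, "
            "then pulls back to reveal the whole skatepark at sunset"
        ),
        "image": "https://example.com/skatepark.jpg",
        "duration": 10,          # 5 or 10 seconds
        "aspect_ratio": "16:9",  # 1:1, 16:9, or 9:16
    },
)
print(output)
```
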
ray

luma/ray

Generate videos from text prompts and images. Produce realistic or fantastical scenes with physically consistent, temporally coherent motion. Use optional start and end images to define first/last frames, enable seamless looping, and continue or prepend footage with start_video_id and end_video_id (reverse extend). Select common and cinematic aspect ratios (16:9, 9:16, 21:9, 1:1, 4:3, 3:4, 9:21). Optimized for fast, scalable video generation. Also known as Luma Dream Machine (Ray).

54.9k runs
Official
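
A sketch of extending earlier footage, using the start_video_id handle named in the description; the other key names (aspect_ratio, loop) are assumptions.

```python
import replicate

# start_video_id is named in the description as the handle for continuing
# a previous Ray generation; it is optional for plain text-to-video.
output = replicate.run(
    "luma/ray",
    input={
        "prompt": "the camera keeps gliding forward through the glowing forest",
        "aspect_ratio": "21:9",   # cinematic widescreen
        "loop": False,
        "start_video_id": "<id-of-a-previous-generation>",
    },
)
print(output)
```
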

wan-2.2-5b-fast

wan-video/wan-2.2-5b-fast

Generate videos from a text prompt or animate a single image into motion. Accepts a prompt and optional input image; outputs a video. Optimize speed with go_fast and control resolution (480p, 720p), aspect ratio (16:9, 9:16), frames per second (5–30), number of frames (81–121), random seed, sample_shift, and an optional safety checker toggle.

111.8k runs
Official
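
A sketch of a fast text-to-video call; go_fast and sample_shift are named in the description, while the remaining keys (resolution, aspect_ratio, frames_per_second, num_frames, seed) are assumptions about how those options are spelled.

```python
import replicate

# Assumed key spellings for the options listed in the description.
output = replicate.run(
    "wan-video/wan-2.2-5b-fast",
    input={
        "prompt": "a paper boat drifting down a rain-soaked street",
        "go_fast": True,
        "resolution": "720p",        # 480p or 720p
        "aspect_ratio": "16:9",      # or 9:16
        "frames_per_second": 24,     # 5-30
        "num_frames": 81,            # 81-121
        "seed": 1234,
    },
)
print(output)
```
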
ray-flash-2-720p

luma/ray-flash-2-720p

Generate 720p videos from a text prompt. Choose 5s or 9s duration, enable seamless looping, and optionally set the first and last frames via start/end images to control the opening and closing shots. Apply camera motion presets for shot direction and movement, including zoom_in, zoom_out, pan_left, pan_right, tilt_up, tilt_down, orbit_left, orbit_right, dolly_zoom, and aerial_drone. Optimized for fast, iterative text-to-video generation.

35.7k runs
Official
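
A sketch of a looping clip with a camera preset; the presets themselves are listed in the description, but the key used to pass them ("concepts" here) and the other key names are assumptions.

```python
import replicate

# Assumed input keys; "aerial_drone" is one of the camera presets listed above.
output = replicate.run(
    "luma/ray-flash-2-720p",
    input={
        "prompt": "a hot air balloon lifting off over terraced rice fields",
        "duration": 5,                 # 5s or 9s clips
        "loop": True,                  # seamless looping
        "concepts": ["aerial_drone"],  # camera motion preset
    },
)
print(output)
```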