Generate videos

These models can generate and edit videos from text prompts and images. They use advanced AI techniques like diffusion models and latent space interpolation to create high-quality, controllable video content.

Key capabilities:

  • Text-to-video generation - Convert text prompts into video clips and animations. Useful for quickly prototyping video concepts.
  • Image-to-video generation - Animate still images into video.
  • Inpainting for infinite zoom - Use image inpainting to extrapolate video frames and create infinite zoom effects.
  • Stylization - Apply artistic filters like cartoonification to give videos a unique look and feel.

State of the art: google/veo-3

For most people looking to generate custom videos from text prompts, we recommend google/veo-3. If speed and cost matter more than peak fidelity, its google/veo-3-fast variant is built for quicker turnaround.

Open source: wan-video

The Wan series of video models by Wan-AI is an excellent open-source option, competitive with the best proprietary video models. Try adjusting the number of sampling steps to trade off generation speed against detail, as in the sketch below.
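
Here’s a minimal sketch of that trade-off using the Replicate Python client. The sample_steps input name is an assumption, not a confirmed parameter; the exact schema varies between Wan variants, so check the model’s API tab.

    import replicate

    # Fewer sampling steps run faster but capture less fine detail.
    # "sample_steps" is an assumed input name; check the model's
    # API schema for the exact parameter it exposes.
    output = replicate.run(
        "wavespeedai/wan-2.1-t2v-480p",
        input={
            "prompt": "a paper boat drifting down a rainy street",
            "sample_steps": 20,
        },
    )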

Other rankings

Generative video is a rapidly advancing field. Check out the arena and leaderboard at Artificial Analysis to see what's popular today.

Frequently asked questions

Which models are the fastest?

The open-source Wan suite (like wavespeedai/wan-2.1-t2v-480p) is among the faster text-to-video options on Replicate, especially at lower resolutions and shorter durations. Many models also have “fast” variants, like google/veo-3-fast, designed for quicker turnaround.
Note: Faster runs usually mean lower resolution or simpler motion.

Which models give the best balance of cost and quality?

pixverse/pixverse-v4 offers a strong balance for many use cases. It uses a unit-based system at $0.01 per unit — for example, a 5-second, 360p video costs about $0.30. minimax/hailuo-02 is another good middle-ground option, with both standard and pro modes for different quality levels. Your ideal choice depends on how much resolution and runtime you need and how much you want to spend.
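
As a back-of-envelope sketch of how unit pricing adds up (the 30-unit figure is an assumption inferred from the $0.30 example above; real unit counts vary by resolution and duration, so check the model page):

    # Rough cost estimate for a unit-priced model like pixverse/pixverse-v4.
    # The unit count below is an assumption inferred from the example above.
    PRICE_PER_UNIT = 0.01   # USD
    units_for_clip = 30     # e.g., a 5-second, 360p clip (assumed)
    print(f"Estimated cost: ${PRICE_PER_UNIT * units_for_clip:.2f}")  # $0.30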

Which models are best for specific use cases within this collection?

  • For cinematic realism with high resolution and optional audio, try google/veo-3.
  • For fast prototyping (short clips, lower res), Wan models or pixverse/pixverse-v4 work well.
  • If you need both text-to-video and image-to-video, minimax/hailuo-02 supports both modes.
  • If you’re on a budget, stick with 480p or 360p outputs to keep costs low.

What’s the difference between key sub-types or approaches in this collection?

  • Text-to-video (T2V): You write a prompt and get a video.
  • Image-to-video (I2V): You provide a still image (or first frame) and animate it. Not all models support this; see the sketch after this list.
  • Quality / resolution tiers: Some models focus on speed and lower res (e.g., Wan fast), while others aim for higher resolution and richer motion (e.g., minimax/hailuo-02, google/veo-3).
  • Open-source vs proprietary: Open models like Wan are cheaper and often faster. Licensed models like Veo 3 offer higher fidelity but can be more expensive.
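
As a sketch of the I2V flow mentioned above, here is roughly how passing a starting image looks with the Replicate Python client. The first_frame_image and prompt input names are assumptions; every model documents its own schema on its API tab.

    import replicate

    # Image-to-video sketch: animate a still image.
    # The input keys below are assumed names, not confirmed ones;
    # check the model's API tab for its actual inputs.
    with open("first_frame.png", "rb") as image:
        output = replicate.run(
            "minimax/hailuo-02",
            input={
                "prompt": "slow push-in, leaves rustling in the wind",
                "first_frame_image": image,
            },
        )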

Which models are good for short, stylized clips?

For short, stylized clips (5–10 seconds at lower resolution), pixverse/pixverse-v4 and Wan models are great picks. They’re fast and relatively inexpensive, making them ideal for concept work, storyboarding, or rapid iteration.

Which models are good for high-fidelity, polished output?

If you want high-fidelity motion, longer clips, or more realistic physics, google/veo-3 or minimax/hailuo-02 are better options. minimax/hailuo-02 supports 768p in standard mode and 1080p in Pro mode, which makes it a solid choice for more polished results.

What types of outputs do these models produce?

Most text-to-video models generate short video clips (5–10 seconds) at 24 or 30 fps. Supported resolutions range from 360p to 1080p, depending on the model. Some, like google/veo-3, can include audio as part of the output.

How much do runs typically cost?

Costs vary by model and resolution:

  • pixverse/pixverse-v4: about $0.30 for a 5-second, 360p video.
  • Wan models: generally very inexpensive for short, low-res clips.
  • google/veo-3 and minimax/hailuo-02: prices vary and aren’t always listed publicly, so check the model page for up-to-date details.

Generally, you’ll pay more for longer durations and higher resolutions.

How can I self-host or push a model to Replicate?

You can push your own model by packaging it with Cog and deploying it. If you’re working with open-source video models, you can also fine-tune them and publish your version for others to use.
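
A minimal Cog predictor might look like the skeleton below. Only the cog imports and the class shape come from Cog itself; the generation body is a placeholder you would replace with your model’s real code.

    # predict.py -- minimal Cog predictor skeleton for a video model.
    from cog import BasePredictor, Input, Path

    class Predictor(BasePredictor):
        def setup(self):
            # Runs once per container start: load weights/pipelines here.
            self.model = None  # placeholder

        def predict(
            self,
            prompt: str = Input(description="Text prompt for the video"),
            num_frames: int = Input(default=120, description="Frames to generate"),
        ) -> Path:
            # Placeholder: run your model on `prompt`, write the frames
            # to an .mp4, and return its path.
            return Path("/tmp/output.mp4")

Test it locally with cog predict, then deploy it with cog push r8.im/<your-username>/<your-model>.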

Can I use these models for commercial work?

Yes, but always check the model’s license. Most text-to-video models on Replicate are available for commercial use, but some authors include additional restrictions.

How do I use or run these models?

You can use the Replicate playground, or run the models programmatically (see the Python sketch after these steps).

  1. Pick a model from the text-to-video collection.
  2. Add your prompt (and optionally an image for I2V).
  3. Run the model and wait for the video to generate.
  4. Download or embed your output.
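
Programmatically, the same steps look roughly like this with the Replicate Python client. The prompt key is typical but model-specific, and the file handling at the end assumes a recent client version that returns file-like outputs; consult the model’s API tab for its exact schema.

    import replicate

    # Steps 1-3: pick a model, pass a prompt, run it.
    output = replicate.run(
        "google/veo-3",
        input={"prompt": "a drone shot over a foggy coastline at sunrise"},
    )

    # Step 4: save the generated video. Assumes a single file-like
    # output; some models return a list, and older clients return URLs.
    with open("output.mp4", "wb") as f:
        f.write(output.read())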

Any other collection-specific tips or considerations?

  • Start with short durations and lower resolutions to experiment without overspending.
  • If animating a still image, choose a clean, well-framed starting image for better results.
  • Be specific in your prompts — details like camera motion or scene type improve output quality.
  • Not all models handle character consistency or motion equally well; higher-tier models tend to do better here.
  • Compare resolution and duration to match your budget and needs.
  • Check for updates, as text-to-video models evolve quickly and new versions can improve speed and quality.