These models can generate and edit videos from text prompts and images. They use advanced AI techniques like diffusion models and latent space interpolation to create high-quality, controllable video content.
Key capabilities:
Runway Gen-4.5 is the top-rated video generation model, ranked #1 on the Artificial Analysis text-to-video benchmark. It produces videos with realistic physics — objects have weight, liquids flow naturally, and fine details like hair and fabric stay coherent across frames. Great for polished, cinematic clips where visual fidelity matters most.
Google Veo 3.1 and Veo 3.1 Fast are strong alternatives with native audio generation. Veo 3.1 Fast is a good pick when you want high quality with quicker turnaround. Note: Veo can sometimes produce slightly overbaked results.
Kling Video 3.0 generates cinematic videos up to 15 seconds with native audio — including lip-synced dialogue, sound effects, and ambient sound. Its multi-shot mode lets you define up to 6 connected scenes in a single generation, making it ideal for short narratives, product demos, and ads. Use standard mode for 720p or pro for 1080p.
Kling Video 3.0 Omni adds reference-based generation and video editing on top. Upload reference images to keep character appearance consistent across scenes, or feed in a reference video for style and camera movement transfer.
Grok Imagine Video from xAI generates short video clips with synchronized audio in around 30 seconds. It supports text-to-video and image-to-video with multiple aspect ratios (16:9, 9:16, 1:1), making it a natural fit for TikTok, Reels, and Shorts. It's best for rapid iteration and social content where speed matters more than maximum fidelity.
Sora 2 from OpenAI has a distinctive diffusion quality that gives outputs a natural, home-video feel. It's great for realistic scenes where you want footage that looks captured rather than generated. Sora 2 Pro offers higher fidelity for more demanding use cases.
Hailuo 2.3 from MiniMax supports both text-to-video and image-to-video with standard and pro quality tiers. It's a solid middle-ground option when you need good results without paying top-tier prices. Hailuo 2.3 Fast trades some quality for speed.
PixVerse v5.6 is another cost-effective choice with a unit-based pricing system that keeps shorter, lower-resolution videos affordable.
The Wan video models are excellent open-source options, competitive with many proprietary models. Wan 2.5 is the latest generation, and the fast variants (Wan 2.5 T2V Fast, Wan 2.5 I2V Fast) are among the quickest text-to-video options on Replicate. Try adjusting the number of steps to trade off between speed and detail.
Generative video is a rapidly advancing field. Check out the arena and leaderboard at Artificial Analysis to see what's popular today.
Featured models
Generate videos using xAI's Grok Imagine Video model
Updated 1 month ago
321.2K runs
runwayml/gen-4.5: State-of-the-art video motion quality, prompt adherence and visual fidelity
Updated 1 month, 1 week ago
59.7K runs
Kling Video 3.0: Generate cinematic videos up to 15 seconds with multi-shot control, native audio, and improved consistency
Updated 1 month, 2 weeks ago
109K runs
Recommended Models
The open-source Wan suite (like wan-video/wan-2.1-t2v-480p) is among the faster text-to-video options on Replicate, especially at lower resolutions and shorter durations. Many models also have “fast” variants, like google/veo-3-fast, designed for quicker turnaround.
Note: Faster runs usually mean lower resolution or simpler motion.
PixVerse v4 offers a strong balance for many use cases. It uses a unit-based system at $0.01 per unit — for example, a 5-second, 360p video costs about $0.30. Hailuo 02 is another good middle-ground option, with both standard and pro modes for different quality levels. Your ideal choice depends on how much resolution and runtime you need and how much you want to spend.
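The unit arithmetic in the PixVerse example can be checked with a short sketch. It assumes a flat, linear price of $0.01 per unit as quoted above; the function names are made up for illustration, and the number of units a given resolution and duration actually consumes comes from the model's own pricing table, not from this code.

```python
from decimal import Decimal

# Quoted rate from the text above; Decimal avoids float rounding on cents.
PRICE_PER_UNIT_USD = Decimal("0.01")


def implied_units(cost_usd: str) -> int:
    """Number of units that a quoted dollar cost corresponds to."""
    return int(Decimal(cost_usd) / PRICE_PER_UNIT_USD)


def cost_for_units(units: int) -> Decimal:
    """Dollar cost of a run that consumes the given number of units."""
    return units * PRICE_PER_UNIT_USD


if __name__ == "__main__":
    units = implied_units("0.30")  # the 5-second, 360p example
    print(f"{units} units, ${cost_for_units(units)}")
```

Under that assumption, the quoted $0.30 clip corresponds to 30 units.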
For short, stylized clips (5–10 seconds at lower resolution), PixVerse v4 and Wan models are great picks. They’re fast and relatively inexpensive, making them ideal for concept work, storyboarding, or rapid iteration.
If you want high-fidelity motion, longer clips, or more realistic physics, Veo 3 or Hailuo 02 Pro are better options. Hailuo 02 supports 768p in standard mode and 1080p in Pro mode, which makes it a solid choice for more polished results.
Most text-to-video models generate short video clips (5–10 seconds) at 24 or 30 fps. Supported resolutions range from 360p to 1080p, depending on the model. Some, like Veo 3, can include audio as part of the output.
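To get a feel for what those durations and frame rates mean in practice, the frame count of a clip is simply duration times frame rate. A minimal sketch (the helper name `frame_count` is hypothetical, and it assumes a constant frame rate):

```python
def frame_count(duration_s: float, fps: int) -> int:
    """Frames in a constant-frame-rate clip of the given length."""
    return round(duration_s * fps)


if __name__ == "__main__":
    # The typical ranges mentioned above: 5 to 10 seconds at 24 or 30 fps.
    for duration_s in (5, 10):
        for fps in (24, 30):
            frames = frame_count(duration_s, fps)
            print(f"{duration_s}s at {fps} fps = {frames} frames")
```

So a typical clip is somewhere between 120 and 300 generated frames, which is part of why longer or higher-frame-rate outputs take more time.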
Costs vary by model and resolution.
You can publish your own model by packaging it with Cog and pushing it to Replicate. If you’re working with open-source video models, you can also fine-tune them and publish your version for others to use.
Most text-to-video models on Replicate are available for commercial use, but some authors include additional restrictions, so always check the model’s license first.
You can run these models in the Replicate playground or programmatically through the API.
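The programmatic route can be sketched with only the Python standard library. This is a rough illustration, not the official client: it assumes Replicate's HTTP endpoint `POST /v1/models/{owner}/{name}/predictions`, an API token in the `REPLICATE_API_TOKEN` environment variable, and a model that accepts a `prompt` input. The helper name `build_prediction_request` and the example prompt are made up here, and each model's own API page lists the input fields it actually supports.

```python
import json
import os
import urllib.request

API_ROOT = "https://api.replicate.com/v1"


def build_prediction_request(
    model: str, prompt: str, token: str, **extra_input
) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts a prediction.

    `extra_input` is passed through untouched; which keys are valid
    depends on the model, so treat any names you add as assumptions.
    """
    payload = {"input": {"prompt": prompt, **extra_input}}
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}/predictions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    token = os.environ.get("REPLICATE_API_TOKEN")
    if token:
        request = build_prediction_request(
            "google/veo-3-fast",
            "a paper boat drifting down a rainy street",
            token,
        )
        # Video generation is slow, so the response usually describes a
        # prediction that is still running; poll its "get" URL for the result.
        with urllib.request.urlopen(request) as response:
            prediction = json.load(response)
        print(prediction["status"], prediction["urls"]["get"])
    else:
        print("Set REPLICATE_API_TOKEN to send the request.")
```

Separating request construction from sending keeps the sketch easy to inspect without spending credits; in real projects the official Replicate client libraries handle polling and file outputs for you.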
Recommended Models
New and improved version of Veo 3 Fast, with higher-fidelity video, context-aware audio and last frame support
Updated 5 days, 2 hours ago
534.5K runs
New and improved version of Veo 3, with higher-fidelity video, context-aware audio, reference image and last frame support
Updated 5 days, 2 hours ago
421.6K runs
Kling Video 3.0 Omni: Unified multimodal video generation with reference images, video editing, native audio, and multi-shot control
Updated 1 month, 1 week ago
218.2K runs
Modify an existing video through natural-language commands, changing subjects, environments, and visual style while preserving the original motion and timing.
Updated 1 month, 2 weeks ago
7.4K runs
VEED Fabric 1.0 is an image-to-video API that turns any image into a talking video
Updated 1 month, 2 weeks ago
19.2K runs
bytedance/dreamactor-m2.0: Animate any character (humans, cartoons, animals, even non-humans) from a single image and a driving video
Updated 1 month, 2 weeks ago
7.4K runs
Latest video model from PixVerse with astonishing physics
Updated 2 months ago
14.6K runs

openai/sora-2-pro: OpenAI's most advanced synced-audio video generation
Updated 2 months, 1 week ago
104.5K runs

openai/sora-2: OpenAI's flagship video generation with synced audio
Updated 2 months, 1 week ago
280.9K runs
A very fast and cheap PrunaAI optimized version of Wan 2.2 A14B text-to-video
Updated 2 months, 2 weeks ago
235.5K runs
A very fast and cheap PrunaAI optimized version of Wan 2.2 A14B image-to-video
Updated 2 months, 2 weeks ago
9.6M runs
Kling 2.5 Turbo Pro: Unlock pro-level text-to-video and image-to-video creation with smooth motion, cinematic depth, and remarkable prompt adherence.
Updated 2 months, 2 weeks ago
2.3M runs
Kling 2.6 Pro: Top-tier image-to-video with cinematic visuals, fluid motion, and native audio generation
Updated 2 months, 4 weeks ago
479.8K runs
Alibaba Wan 2.5 text to video generation model
Updated 3 months, 3 weeks ago
33.6K runs
Alibaba Wan 2.5 Image to video generation with background audio
Updated 3 months, 3 weeks ago
204K runs

Sound on: Google’s flagship Veo 3 text to video model, with audio
Updated 4 months ago
226K runs

A faster and cheaper version of Google’s Veo 3 video model, with audio
Updated 4 months ago
180.6K runs

State of the art video generation model. Veo 2 can faithfully follow simple and complex instructions, and convincingly simulates real-world physics as well as a wide range of visual styles.
Updated 4 months ago
107.3K runs

Quickly generate smooth 5s or 8s videos at 540p, 720p or 1080p
Updated 4 months ago
43.1K runs

Quickly make 5s or 8s videos at 540p, 720p or 1080p. It has enhanced motion, prompt coherence and handles complex actions well.
Updated 4 months ago
255.1K runs

Create 5s-8s videos with enhanced character movement, visual effects, and exclusive 1080p-8s support. Optimized for anime characters and complex actions
Updated 4 months ago
774.6K runs
Wan 2.5 text-to-video, optimized for speed
Updated 4 months ago
47K runs

Accelerated inference for Wan 2.1 14B text to video with high resolution, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
Updated 4 months ago
36.4K runs

Accelerated inference for Wan 2.1 14B image to video with high resolution, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
Updated 4 months ago
87.7K runs

Accelerated inference for Wan 2.1 14B image to video, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
Updated 4 months ago
444.2K runs

bytedance/seedance-1-pro: A pro version of Seedance that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 1080p resolution
Updated 4 months, 2 weeks ago
1.8M runs

bytedance/seedance-1-lite: A video generation model that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 720p resolution
Updated 4 months, 2 weeks ago
3M runs
bytedance/seedance-1-pro-fast: A faster and cheaper version of Seedance 1 Pro
Updated 4 months, 2 weeks ago
1.2M runs

Create 5s 480p videos from a text prompt
Updated 4 months, 3 weeks ago
10.9K runs

Generate 5s and 10s videos in 720p resolution
Updated 4 months, 3 weeks ago
93.6K runs

Generate 5s and 10s videos in 1080p resolution
Updated 4 months, 3 weeks ago
818.6K runs

A premium version of Kling v2.1 with superb dynamics and prompt adherence. Generate 1080p 5s and 10s videos from text or an image
Updated 4 months, 3 weeks ago
95.5K runs

Generate 5s and 10s videos in 720p resolution at 30fps
Updated 4 months, 3 weeks ago
1.6M runs

Use Kling v2.1 to generate 5s and 10s videos in 720p and 1080p resolution from a starting image (image-to-video)
Updated 4 months, 3 weeks ago
3.8M runs

luma/ray-2-540p: Generate 5s and 9s 540p videos
Updated 4 months, 3 weeks ago
11.5K runs

luma/ray-2-720p: Generate 5s and 9s 720p videos
Updated 4 months, 3 weeks ago
39K runs
Wan 2.5 image-to-video, optimized for speed
Updated 4 months, 3 weeks ago
59K runs

Accelerated inference for Wan 2.1 14B text to video, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
Updated 4 months, 3 weeks ago
187.9K runs

luma/ray-flash-2-720p: Generate 5s and 9s 720p videos, faster and cheaper than Ray 2
Updated 4 months, 3 weeks ago
47.2K runs

Generate 6s videos with prompts or images. (Also known as Hailuo). Use a subject reference to make a video with a character and the S2V-01 model.
Updated 4 months, 3 weeks ago
684.6K runs

Generate videos with specific camera movements
Updated 4 months, 3 weeks ago
75.4K runs
A high-fidelity video generation model optimized for realistic human motion, cinematic VFX, expressive characters, and strong prompt and style adherence across both text-to-video and image-to-video workflows
Updated 4 months, 3 weeks ago
68.1K runs
A lower-latency image-to-video version of Hailuo 2.3 that preserves core motion quality, visual consistency, and stylization performance while enabling faster iteration cycles.
Updated 4 months, 3 weeks ago
81.4K runs

An image-to-video (I2V) model specifically trained for Live2D and general animation use cases
Updated 4 months, 3 weeks ago
180.7K runs
luma/ray-flash-2-540p: Generate 5s and 9s 540p videos, faster and cheaper than Ray 2
Updated 4 months, 3 weeks ago
63.7K runs
Hailuo 2 is a text-to-video and image-to-video model that can make 6s or 10s videos at 768p (standard) or 1080p (pro). It excels at real-world physics.
Updated 4 months, 3 weeks ago
356.6K runs

Image-to-video at 720p and 480p with Wan 2.2 A14B
Updated 7 months, 3 weeks ago
51.5K runs
fofr/not-real: Make a very realistic looking real-world AI video
Updated 8 months, 1 week ago
2.3K runs

Generate 5s 480p videos. Wan is an advanced and powerful visual generation model developed by Tongyi Lab of Alibaba Group
Updated 1 year, 1 month ago
48.4K runs

tencent/hunyuan-video: A state-of-the-art text-to-video generation model capable of creating high-quality videos with realistic motion from text descriptions
Updated 1 year, 2 months ago
117.4K runs

lightricks/ltx-video: LTX-Video is the first DiT-based video generation model capable of generating high-quality videos in real-time. It produces 24 FPS videos at a 768x512 resolution faster than they can be watched.
Updated 1 year, 2 months ago
167.7K runs

zsxkib/hunyuan-video2video: A state-of-the-art text-to-video generation model capable of creating high-quality videos with realistic motion from text descriptions
Updated 1 year, 3 months ago
3K runs

genmoai/mochi-1: Mochi 1 preview is an open video generation model with high-fidelity motion and strong prompt adherence in preliminary evaluation
Updated 1 year, 3 months ago
3.2K runs

zsxkib/pyramid-flow: Text-to-Video + Image-to-Video: Pyramid Flow Autoregressive Video Generation method based on Flow Matching
Updated 1 year, 5 months ago
9.3K runs

cuuupid/cogvideox-5b: Generate high quality videos from a prompt
Updated 1 year, 7 months ago
2.6K runs

meta/sam-2-video: SAM 2: Segment Anything v2 (for videos)
Updated 1 year, 7 months ago
61.9K runs

fofr/tooncrafter: Create videos from illustrated input images
Updated 1 year, 8 months ago
67.5K runs

fofr/video-morpher: Generate a video that morphs between subjects, with an optional style
Updated 1 year, 11 months ago
15.2K runs

cjwbw/videocrafter: VideoCrafter2: Text-to-Video and Image-to-Video Generation and Editing
Updated 2 years, 1 month ago
163.2K runs

ali-vilab/i2vgen-xl: RESEARCH/NON-COMMERCIAL USE ONLY: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
Updated 2 years, 2 months ago
128.4K runs

open-mmlab/pia: Personalized Image Animator
Updated 2 years, 2 months ago
103.5K runs

zsxkib/animatediff-illusions: Monster Labs' Controlnet QR Code Monster v2 For SD-1.5 on top of AnimateDiff Prompt Travel (Motion Module SD 1.5 v2)
Updated 2 years, 4 months ago
10.5K runs

lucataco/hotshot-xl: 😊 Hotshot-XL is an AI text-to-GIF model trained to work alongside Stable Diffusion XL
Updated 2 years, 5 months ago
923.1K runs

zsxkib/animatediff-prompt-travel: 🎨 AnimateDiff Prompt Travel 🧭 Seamlessly Navigate and Animate Between Text-to-Image Prompts for Dynamic Visual Narratives
Updated 2 years, 5 months ago
5.7K runs

zsxkib/animate-diff: 🎨 AnimateDiff (w/ MotionLoRAs for Panning, Zooming, etc): Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
Updated 2 years, 5 months ago
59.4K runs
lucataco/animate-diff: Animate Your Personalized Text-to-Image Diffusion Models
Updated 2 years, 6 months ago
332.5K runs

anotherjesse/zeroscope-v2-xl: Zeroscope V2 XL & 576w
Updated 2 years, 8 months ago
303.4K runs
cjwbw/controlvideo: Training-free Controllable Text-to-Video Generation
Updated 2 years, 10 months ago
2.4K runs
cjwbw/text2video-zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
Updated 2 years, 11 months ago
42.1K runs
cjwbw/damo-text-to-video: Multi-stage text-to-video generation
Updated 3 years ago
157.8K runs
andreasjansson/tile-morph: Create tileable animations with seamless transitions
Updated 3 years, 1 month ago
529.4K runs

arielreplicate/deoldify_video: Add colours to old video footage.
Updated 3 years, 1 month ago
13.9K runs

pollinations/real-basicvsr-video-superresolution: RealBasicVSR: Investigating Tradeoffs in Real-World Video Super-Resolution
Updated 3 years, 1 month ago
9.3K runs

arielreplicate/robust_video_matting: Extract the foreground of a video
Updated 3 years, 3 months ago
94.6K runs

arielreplicate/stable_diffusion_infinite_zoom: Use Runway's Stable Diffusion inpainting model to create an infinite loop video
Updated 3 years, 4 months ago
38.5K runs
andreasjansson/stable-diffusion-animation: Animate Stable Diffusion by interpolating between two prompts
Updated 3 years, 4 months ago
119.6K runs
deforum/deforum_stable_diffusion: Animating prompts with Stable Diffusion
Updated 3 years, 6 months ago
267.1K runs