turian/insanely-fast-whisper-with-video
whisper-large-v3, incredibly fast, with video transcription
8.3M runs
openai/whisper
Convert speech in audio to text
136.2M runs
jaaari/kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
52M runs
andreasjansson/clip-features
Return CLIP features for the clip-vit-large-patch14 model
114.3M runs
bytedance/seedream-4
Unified text-to-image generation and precise single-sentence editing at up to 4K resolution
3.8M runs
openai/sora-2
OpenAI's flagship video generation with synced audio
29K runs
anthropic/claude-4.5-haiku
Claude Haiku 4.5 gives you coding performance similar to Claude Sonnet 4, at one-third the cost and more than twice the speed
984 runs
philz1337x/crystal-upscaler
High-precision image upscaler optimized for portraits and faces. One of the upscale modes powered by Clarity AI. X:https://x.com/philz1337x
22.8K runs
google/veo-3.1
New and improved version of Veo 3, with higher-fidelity video, context-aware audio, reference image and last frame support
13.2K runs
reve/edit
Image editing model from Reve
1.6K runs
google/nano-banana
Google's latest image editing model in Gemini 2.5
22.5M runs
reve/create
Image generation model from Reve
3K runs
character-ai/ovi-i2v
Ovi: generate videos with audio from image and text inputs
3.9K runs
wan-video/wan-2.5-t2v
Alibaba Wan 2.5 text to video generation model
9.6K runs
qwen/qwen-image-edit-plus
The latest iteration of Qwen-Image, with improved multi-image editing, single-image consistency, and native support for ControlNet
1.4M runs
openai/gpt-5
OpenAI's new model excelling at coding, writing, and reasoning.
370.9K runs
Official models are always on, maintained, and have predictable pricing.
Generate synced sounds for any video and return it with its new soundtrack - now enhanced in version 1.5 for improved sound synchronization and realism
Generate synced sounds for any video and return it with its new soundtrack
Turns your audio/video/images into professional-quality animated videos
Unified text-to-image generation and precise single-sentence editing at up to 4K resolution
A video generation model that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 720p resolution
A pro version of Seedance that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 1080p resolution
OpenAI's most advanced synced-audio video generation
OpenAI's flagship video generation with synced audio
Use this ultra version of Imagen 4 when quality matters more than speed and cost
Use this fast version of Imagen 4 when speed and cost are more important than quality
Google's Imagen 4 flagship model
Convert PDF to markdown + JSON quickly with high accuracy
Detect and transcribe text in images with accurate bounding boxes, layout analysis, reading order, and table recognition, in 90 languages
Kling 2.5 Turbo Pro: Unlock pro-level text-to-video and image-to-video creation with smooth motion, cinematic depth, and remarkable prompt adherence.
Image generation model from Reve which handles multiple input reference images
Latest hybrid thinking model from Deepseek
Generate 5s and 10s videos in 1080p resolution at 30fps
Claude Haiku 4.5 gives you coding performance similar to Claude Sonnet 4, at one-third the cost and more than twice the speed
High-precision image upscaler optimized for portraits and faces. One of the upscale modes powered by Clarity AI. X:https://x.com/philz1337x
New and improved version of Veo 3 Fast, with higher-fidelity video, context-aware audio and last frame support
Use AI To Generate Images & Photos with an API
Use AI To Caption Videos with an API
Convert text to speech
Make realistic images of people instantly
Use AI To Generate Videos with an API
Upscaling models that create high-quality images from low-quality images
Use AI To Generate Music with an API
Use AI To Edit Any Image with an API
Models that convert speech to text
Optical character recognition (OCR) and text extraction
Models that remove backgrounds from images and videos
The FLUX family of text-to-image models from Black Forest Labs
Models that improve or restore images by deblurring, colorization, and removing noise
Upscaling models that create high-quality video from low-quality videos
Use AI To Generate Videos from images with an API
Use AI To Lipsync videos with an API
Browse the diverse range of fine-tunes the community has custom-trained on Replicate
Browse the diverse range of qwen-image fine-tunes the community has custom-trained on Replicate
Models that can understand and generate text
Toolbelt-type models for videos and images.
Use AI To Caption Images with an API
Generate videos with Wan, the fastest and highest quality open-source video generation model.
Ask language models about images
Models that generate 3D objects, scenes, radiance fields, textures and multi-views.
Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.
Voice-to-voice cloning and musical prosody
Models that generate embeddings from inputs
Get started with these models without adding a credit card. Whether you're making videos, generating images, or upscaling photos, these are great starting points.
Official models are always on, maintained, and have predictable pricing.
Models that detect or segment objects in images and videos.
djfrequin/dcm700
37 runs
mirelo/video-to-sfx-v1.5
Generate synced sounds for any video and return it with its new soundtrack - now enhanced in version 1.5 for improved sound synchronization and realism
17 runs
mirelo/video-to-sfx-v1
Generate synced sounds for any video and return it with its new soundtrack
2.3K runs
lucataco/deepseek-ocr
Convert documents to markdown, extract raw text, and locate specific content
35 runs
bytedance/omni-human
Turns your audio/video/images into professional-quality animated videos
141.4K runs
espressotechie/qwen-imgedit-4bit
Qwen image edit fast
357 runs
lucataco/qwen3-vl-8b-instruct
A powerful vision-language model in the Qwen series
8 runs
bytedance/seedream-4
Unified text-to-image generation and precise single-sentence editing at up to 4K resolution
3.8M runs
bytedance/seedance-1-lite
A video generation model that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 720p resolution
1.4M runs
bytedance/seedance-1-pro
A pro version of Seedance that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 1080p resolution
793.6K runs
openai/sora-2-pro
OpenAI's most advanced synced-audio video generation
13.1K runs
openai/sora-2
OpenAI's flagship video generation with synced audio
29K runs