adirik/grounding-dino
Detect everything with language!
15.7M runs
andreasjansson/clip-features
Return CLIP features for the clip-vit-large-patch14 model
108.8M runs
openai/whisper
Convert speech in audio to text
128.9M runs
jaaari/kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
46.9M runs
bytedance/seedream-4
Unified text-to-image generation and precise single-sentence editing at up to 4K resolution
1.5M runs
leonardoai/lucid-origin
Artistic and high-quality visuals with improved prompt adherence, diversity, and definition
37.9K runs
anthropic/claude-4.5-sonnet
Claude Sonnet 4.5 is the best coding model to date, with significant improvements across the entire development lifecycle
873 runs
openai/gpt-5-structured
GPT-5 with support for structured outputs, web search and custom tools
78.2K runs
kwaivgi/kling-v2.1
Use Kling v2.1 to generate 5s and 10s videos in 720p and 1080p resolution from a starting image (image-to-video)
2M runs
wan-video/wan-2.5-t2v
Alibaba Wan 2.5 text to video generation model
3.3K runs
pixverse/pixverse-v5
Create 5s-8s videos with enhanced character movement, visual effects, and exclusive 1080p-8s support. Optimized for anime characters and complex actions
54.8K runs
qwen/qwen-image-edit-plus
The latest iteration of Qwen-Image, with improved multi-image editing, single-image consistency, and native support for ControlNet
293.2K runs
google/nano-banana
Google's latest image editing model in Gemini 2.5
13M runs
qwen/qwen-image
An image generation foundation model in the Qwen series that achieves significant advances in complex text rendering.
567.8K runs
minimax/hailuo-02
Hailuo 2 is a text-to-video and image-to-video model that can make 6s or 10s videos at 768p (standard) or 1080p (pro). It excels at real-world physics.
123.8K runs
prunaai/wan-2.2-image
This model generates beautiful cinematic 2 megapixel images in 3-4 seconds and is derived from the Wan 2.2 model through optimisation techniques from the pruna package
440.6K runs
Official models are always on, maintained, and have predictable pricing.
Generate consistent characters from a single reference image. Outputs can be in many styles. You can also use inpainting to add your character to an existing image.
Turbo is the fastest and cheapest Ideogram v3. v3 creates images with stunning realism, creative designs, and consistent styles
Like Ideogram v2, but faster and cheaper
A fast image model with state-of-the-art inpainting, prompt comprehension and text rendering.
Like Ideogram v2 turbo, but now faster and cheaper
Leonardo AI’s first foundational model produces images up to 5 megapixels (fast, quality and ultra modes)
Create 5s 480p videos from a text prompt
Translate videos into over 150 languages
Convert raster images to high-quality SVG format with precision and clean vector paths, perfect for logos, icons, and scalable graphics.
Use AI to generate images and photos with an API
Use AI to caption videos with an API
Convert text to speech
Make realistic images of people instantly
Use AI to generate videos with an API
Upscaling models that create high-quality images from low-quality images
Use AI to generate music with an API
Use AI to edit any image with an API
Models that convert speech to text
Optical character recognition (OCR) and text extraction
Models that remove backgrounds from images and videos
The FLUX family of text-to-image models from Black Forest Labs
Models that improve or restore images by deblurring, colorization, and removing noise
Upscaling models that create high-quality video from low-quality videos
Use AI to generate videos from images with an API
Use AI to lip-sync videos with an API
Browse the diverse range of fine-tunes the community has custom-trained on Replicate
Browse the diverse range of qwen-image fine-tunes the community has custom-trained on Replicate
Models that can understand and generate text
Toolbelt-type models for videos and images.
Use AI to caption images with an API
Generate videos with Wan, the fastest and highest quality open-source video generation model.
Ask language models about images
Models that generate 3D objects, scenes, radiance fields, textures and multi-views.
Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.
Voice-to-voice cloning and musical prosody
Models that generate embeddings from inputs
Get started with these models without adding a credit card. Whether you're making videos, generating images, or upscaling photos, these are great starting points.
Models that detect or segment objects in images and videos.
kwaivgi/kling-v2.1-master
A premium version of Kling v2.1 with superb dynamics and prompt adherence. Generate 1080p 5s and 10s videos from text or an image
52.6K runs
pixverse/pixverse-v4
Quickly generate smooth 5s or 8s videos at 540p, 720p or 1080p
29.4K runs
sansan1981/noirvision
NoirVisions is a text-to-image AI model created by Boss Luxe, designed to generate soft, airbrushed, high-fashion digital portraits centering Black women. Trained exclusively on original, curated imagery, this model captures luxury, melanin-rich beauty, a
892 runs
bytedance/seedream-4
Unified text-to-image generation and precise single-sentence editing at up to 4K resolution
1.5M runs
bytedance/seededit-3.0
Text-guided image editing model that preserves original details while making targeted modifications like lighting changes, object removal, and style conversion
254.5K runs
bytedance/seedream-3
A text-to-image model with support for native high-resolution (2K) image generation
2.4M runs
bytedance/seedance-1-pro
A pro version of Seedance that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 1080p resolution
585.7K runs
bytedance/seedance-1-lite
A video generation model that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 720p resolution
1.1M runs
alicem97/alice2
55 runs
ideogram-ai/ideogram-v3-quality
The highest quality Ideogram v3 model. v3 creates images with stunning realism, creative designs, and consistent styles
1.8M runs
ideogram-ai/ideogram-v3-balanced
Balance speed, quality and cost. Ideogram v3 creates images with stunning realism, creative designs, and consistent styles
247.2K runs
ideogram-ai/ideogram-v2
An excellent image model with state-of-the-art inpainting, prompt comprehension and text rendering
2.4M runs