openai/whisper
Convert speech in audio to text
130.3M runs
jaaari/kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
47.7M runs
prunaai/flux.1-dev
This is the fastest Flux Dev endpoint in the world. Contact us for more at pruna.ai
24M runs
turian/insanely-fast-whisper-with-video
whisper-large-v3, incredibly fast, with video transcription
5.8M runs
ibm-granite/granite-4.0-h-small
Granite-4.0-H-Small is a 32B-parameter long-context instruct model fine-tuned from Granite-4.0-H-Small-Base using a combination of permissively licensed open-source instruction datasets and internally collected synthetic datasets.
874 runs
google/nano-banana
Google's latest image editing model in Gemini 2.5
14.9M runs
leonardoai/lucid-origin
Artistic and high-quality visuals with improved prompt adherence, diversity, and definition
46.2K runs
minimax/hailuo-02
Hailuo 2 is a text-to-video and image-to-video model that can make 6s or 10s videos at 768p (standard) or 1080p (pro). It excels at real-world physics.
129.6K runs
bytedance/seedream-4
Unified text-to-image generation and precise single-sentence editing at up to 4K resolution
1.8M runs
anthropic/claude-4.5-sonnet
Claude Sonnet 4.5 is the best coding model to date, with significant improvements across the entire development lifecycle
2.2K runs
kwaivgi/kling-v2.1
Use Kling v2.1 to generate 5s and 10s videos in 720p and 1080p resolution from a starting image (image-to-video)
2M runs
wan-video/wan-2.5-t2v
Alibaba Wan 2.5 text to video generation model
4.8K runs
pixverse/pixverse-v5
Create 5s-8s videos with enhanced character movement, visual effects, and exclusive 1080p-8s support. Optimized for anime characters and complex actions
81.3K runs
qwen/qwen-image-edit-plus
The latest iteration of Qwen-Image, with improved multi-image editing, single-image consistency, and native support for ControlNet
462.7K runs
qwen/qwen-image
An image generation foundation model in the Qwen series that achieves significant advances in complex text rendering.
609.3K runs
prunaai/wan-2.2-image
This model generates beautiful, cinematic 2-megapixel images in 3-4 seconds. It is derived from the Wan 2.2 model through optimisation techniques from the pruna package
480.8K runs
Official models are always on, maintained, and have predictable pricing.
State-of-the-art object removal that precisely removes unwanted objects from images while maintaining high-quality output. Trained exclusively on licensed data for safe and risk-free commercial use
Bria Expand extends images beyond their borders in high quality, generating new pixels to reach the desired aspect ratio. Trained exclusively on licensed data for safe and risk-free commercial use
Create 5s 480p videos from a text prompt
Leonardo AI’s first foundational model produces images up to 5 megapixels (fast, quality and ultra modes)
A low cost and fast version of Hailuo 02. Generate 6s and 10s videos in 512p
A premium version of Kling v2.1 with superb dynamics and prompt adherence. Generate 1080p 5s and 10s videos from text or an image
Quickly generate smooth 5s or 8s videos at 540p, 720p or 1080p
Use AI to generate images and photos with an API
Use AI to caption videos with an API
Convert text to speech
Make realistic images of people instantly
Use AI to generate videos with an API
Upscaling models that create high-quality images from low-quality images
Use AI to generate music with an API
Use AI to edit any image with an API
Models that convert speech to text
Optical character recognition (OCR) and text extraction
Models that remove backgrounds from images and videos
The FLUX family of text-to-image models from Black Forest Labs
Models that improve or restore images by deblurring, colorization, and removing noise
Upscaling models that create high-quality video from low-quality videos
Use AI to generate videos from images with an API
Use AI to lipsync videos with an API
Browse the diverse range of fine-tunes the community has custom-trained on Replicate
Browse the diverse range of qwen-image fine-tunes the community has custom-trained on Replicate
Models that can understand and generate text
Toolbelt-type models for videos and images.
Use AI to caption images with an API
Generate videos with Wan, the fastest and highest quality open-source video generation model.
Ask language models about images
Models that generate 3D objects, scenes, radiance fields, textures and multi-views.
Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.
Voice-to-voice cloning and musical prosody
Models that generate embeddings from inputs
Get started with these models without adding a credit card. Whether you're making videos, generating images, or upscaling photos, these are great starting points.
Models that detect or segment objects in images and videos.
vufinder/vggt-1b
Feed-forward neural network that directly infers all key 3D attributes of a scene.
4 runs
atonamy/images-to-webp-m
Convert a ZIP of ordered PNG/WebP frames into a WebM or WebP animation using ffmpeg, with quality, frame-rate, and size controls.
26 runs
atonamy/images-to-webm-p
Convert a stack of image frames into sticker-ready VP9 WebM or animated WebP files. Built for Telegram/WhatsApp sticker pipelines with adjustable quality, frame rate, and output size.
27 runs
google/gemini-2.5-flash
Google’s hybrid “thinking” AI model optimized for speed and cost-efficiency
109 runs
kwaivgi/kling-v2.5-turbo-pro
Kling 2.5 Turbo Pro: Unlock pro-level text-to-video and image-to-video creation with smooth motion, cinematic depth, and remarkable prompt adherence.
13.7K runs
google/imagen-4-ultra
Use this ultra version of Imagen 4 when quality matters more than speed and cost
737.8K runs
bria/increase-resolution
Bria Increase Resolution upscales any image using a dedicated upscaling method that preserves the original image content without regeneration.
15.2K runs
bria/remove-background
Bria AI's remove background model
68.2K runs
bria/image-3.2
A commercial-ready text-to-image model trained entirely on licensed data. With only 4B parameters, it provides exceptional aesthetics and text rendering, and is evaluated to be on par with other leading models on the market
6.7K runs
bria/generate-background
Bria Background Generation allows for efficient swapping of backgrounds in images via text prompts or a reference image, delivering realistic and polished results. Trained exclusively on licensed data for safe and risk-free commercial use
15.3K runs
bria/genfill
Bria GenFill enables high-quality object addition or visual transformation. Trained exclusively on licensed data for safe and risk-free commercial use.
1.2K runs