

openai/whisper
Convert speech in audio to text
139.8M runs


andreasjansson/clip-features
Return CLIP features for the clip-vit-large-patch14 model
117M runs


jaaari/kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
54.7M runs


prunaai/flux.1-dev
The fastest Flux Dev endpoint in the world. Contact us for more at pruna.ai
28.9M runs


reve/create
Image generation model from Reve
7.1K runs

lightricks/ltx-2-fast
Ideal for rapid ideation and mobile workflows. Perfect for creators who need instant feedback, real-time previews, or high-throughput content.
6.7K runs


philz1337x/crystal-upscaler
High-precision image upscaler optimized for portraits and faces. One of the upscale modes powered by Clarity AI. X: https://x.com/philz1337x
95.2K runs

google/veo-3.1
New and improved version of Veo 3, with higher-fidelity video, context-aware audio, reference image and last frame support
31.2K runs


bytedance/seedream-4
Unified text-to-image generation and precise single-sentence editing at up to 4K resolution
6.2M runs


openai/sora-2
OpenAI's flagship video generation model with synced audio
45.8K runs

minimax/hailuo-2.3
A high-fidelity video generation model optimized for realistic human motion, cinematic VFX, expressive characters, and strong prompt and style adherence across both text-to-video and image-to-video workflows
2K runs

bytedance/seedance-1-pro-fast
A faster and cheaper version of Seedance 1 Pro
19K runs


anthropic/claude-4.5-haiku
Claude Haiku 4.5 gives you coding performance similar to Claude Sonnet 4 at one-third the cost and more than twice the speed
4.8K runs


google/nano-banana
Google's latest image editing model in Gemini 2.5
28.2M runs


qwen/qwen-image-edit-plus
The latest iteration of Qwen-Image, with improved multi-image editing, single-image consistency, and native support for ControlNet
2.1M runs


openai/gpt-5
OpenAI's new model excelling at coding, writing, and reasoning.
446K runs
Official models are always on, maintained, and have predictable pricing.

Professional edge-guided image generation. Control structure and composition using Canny edge detection

Professional depth-aware image generation. Edit images while preserving spatial relationships.

Image editing model from Reve

Image generation model from Reve

Image generation model from Reve which handles multiple input reference images
A new way to edit, transform and generate video

Runway's Gen-4 Image model with references. Use up to 3 reference images to create the exact image you need. Capture every angle.
Generate 5s and 10s 720p videos fast
Upscale videos by 4x, up to a maximum of 4k

State-of-the-art open-source model trained on licensed data. It transforms intent into structured control for precise, high-quality AI image generation in enterprise and agentic workflows.

Compose a song from a prompt or a composition plan

Sound on: Google’s flagship Veo 3 text-to-video model, with audio

New and improved version of Veo 3 Fast, with higher-fidelity video, context-aware audio and last frame support

Google's Imagen 4 flagship model

Google’s hybrid “thinking” AI model optimized for speed and cost-efficiency

Google's latest image generation model in Gemini 2.5

Google's highest quality text-to-image model, capable of generating images with detail, rich lighting and beauty
Use AI to generate images & photos with an API
Use AI to caption videos with an API
Use AI for text-to-speech or to clone your voice via API
Use AI to generate images from a face with an API
Use AI to generate videos with an API
Use AI to upscale images with super resolution with an API
Use AI to generate music with an API
Use AI to edit any image via API
Use AI to transcribe speech to text via API
Use AI for optical character recognition (OCR) to extract text from images via API
Use AI to remove backgrounds from images and videos with an API
FLUX AI models: advanced image generation & editing via API
Use AI to restore images via API
Use AI to enhance videos via API
Explore Large Language Models (LLMs) for chat, generation & NLP tasks via API
Try AI Models for free: video generation, image generation, upscaling, and photo restoration
Use AI to create 3D content with an API
Use AI to control image generation with an API
Embedding models for AI search and analysis
Use AI to edit your videos with an API
Use AI to generate videos from images with an API
Use AI object detection and segmentation models to distinguish objects in images & videos
Use AI to generate lipsync videos with an API
Official AI models: Always available, stable, and predictably priced
Flux fine-tunes: build and run custom AI image models via API
Kontext fine-tunes: Build custom AI image models with an API
Create songs with voice cloning models via API
AI media utilities: auto-caption, watermark, frame extraction & more via API
Browse the diverse range of qwen-image fine-tunes the community has custom-trained on Replicate.
Chat with images for understanding, captioning & detection via API
WAN family of models: powerful image-to-video & text-to-video models
Use AI to caption images with an API


lucataco/chronoedit
ChronoEdit-14B enables physics-aware image editing and action-conditioned world simulation through temporal reasoning.
30 runs

andreasjansson/cursed-sitcom-generator
The one where anyone can generate a https://cursedsit.com clone with an API
25 runs

andreasjansson/video-stitcher
Fast GPU-powered concatenation of multiple videos, with short audio crossfades
34 runs


lucataco/gpt-oss-safeguard-20b
Classify text content based on safety policies that you provide, and perform a suite of foundational safety tasks
1 run


minimax/speech-2.6-hd
MiniMax Speech 2.6 HD delivers studio-quality multilingual text-to-audio on Replicate with nuanced prosody, subtitle export, and premium voices
497 runs


minimax/speech-2.6-turbo
Low‑latency MiniMax Speech 2.6 Turbo brings multilingual, emotional text-to-speech to Replicate with 300+ voices and real-time friendly pricing
168 runs


eiby777/olmocr-2-7b-1025-fp8
FP8-quantized version of olmOCR-2-7B-1025, produced with llmcompressor.
11 runs


elevenlabs/music
Compose a song from a prompt or a composition plan
169 runs


elizabeth815/linocut
Experimenting with linocut effects
40 runs
avocado/reddit-tiktok-video
3 runs


emersimeon/azlyrics-replicate
45 runs


eiby777/manga_globes
Detect and classify speech bubbles in manga images
9 runs