

beautyyuyanli / multilingual-e5-large
multilingual-e5-large: A multi-language text embedding model
42.4M runs


jaaari / kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
68.4M runs


prunaai / flux-kontext-fast
Ultra-fast FLUX Kontext endpoint
14.6M runs


tencentarc / gfpgan
Practical face restoration algorithm for *old photos* or *AI-generated faces*
103.6M runs

resemble-ai/chatterbox-turbo
The fastest open source TTS model without sacrificing quality.
816 runs

openai/gpt-5.2
The best model for coding and agentic tasks across industries
118.8K runs

bytedance/seedream-4.5
Seedream 4.5: Upgraded Bytedance image model with stronger spatial understanding and world knowledge
183.8K runs

prunaai/z-image-turbo
Z-Image Turbo is a super-fast 6B-parameter text-to-image model developed by Tongyi-MAI.
643.2K runs
lightricks/ltx-2-retake
Take any shot and edit specific sections. Rephrase, change the action, camera angles and more
668 runs

Google's most advanced reasoning Gemini model
49.9K runs

High-quality image generation and editing with support for eight reference images
436.5K runs

Google's state-of-the-art image generation and editing model 🍌🍌
4.6M runs

prunaai/p-image-edit
A sub-1-second, $0.01 multi-image editing model built for production use cases. For image generation, check out p-image here: https://replicate.com/prunaai/p-image
742.6K runs
New and improved version of Veo 3, with higher-fidelity video, context-aware audio, reference image and last frame support
189.7K runs

openai/sora-2
OpenAI's flagship video generation model with synced audio
140.6K runs

philz1337x/crystal-upscaler
High-precision image upscaler optimized for portraits, faces and products. One of the upscale modes powered by Clarity AI. X: https://x.com/philz1337x
257K runs
Official models are always on, maintained, and have predictable pricing.
Alibaba Wan 2.6 image to video generation model
Alibaba Wan 2.6 text to video generation model

Realistic lipsync with refined human emotion capabilities
VEED Fabric 1.0 is an image-to-video API that turns any image into a talking video

Max-quality image generation and editing with support for ten reference images

Quality image generation and editing with support for reference images
Generate complex 3D models from images with Rodin Gen-2

The best model for coding and agentic tasks with configurable reasoning effort.

Fusion – Product/object blending that fixes perspective and lighting so the subject melts into a new background via the Fusion LoRA.

Relight – Soft, curtain-filtered relighting that repaints the scene with golden-hour or moody tones using the Relight LoRA.

Upscale – Detail-loving upscale/restore pass that sharpens textures and color fidelity with the Upscale LoRA.

Next Scene – “Next beat” cinematic edits that keep subject identity while steering to the next camera move via the Next Scene LoRA.

Skin – Natural beauty retouch that enhances pores and tonal variation (no plastic skin) via the Skin LoRA.
Use AI to generate images & photos with an API
Use AI to caption videos with an API
Use AI for text-to-speech or to clone your voice via API
Use AI to generate images from a face with an API
Use AI to generate videos with an API
Use AI to upscale images with super resolution with an API
Use AI to generate music with an API
Use AI to edit any image via API
Use AI to transcribe speech to text via API
Use AI for optical character recognition (OCR) to extract text from images via API
Use AI to remove backgrounds from images and videos with an API
FLUX AI models: advanced image generation & editing via API
Use AI to restore images via API
Use AI to enhance videos via API
Detect NSFW content in images and text
Classify text by sentiment, topic, intent, or safety
Identify speakers from audio and video inputs
Replace faces across images with natural-looking results.
Transform rough sketches into polished visuals
Generate custom emojis from text or images
Create anime-style characters, scenes, and animations
Use AI to generate videos from images with an API
Use AI to generate lipsync videos with an API
Use AI to create 3D content with an API
Chat with images for understanding, captioning & detection via API
Explore Large Language Models (LLMs) for chat, generation & NLP tasks via API
Try AI Models for free: video generation, image generation, upscaling, and photo restoration
Use AI to control image generation with an API
Embedding models for AI search and analysis
Use AI to edit your videos with an API
Use AI object detection and segmentation models to distinguish objects in images & videos
Official AI models: Always available, stable, and predictably priced
Flux fine-tunes: build and run custom AI image models via API
Kontext fine-tunes: Build custom AI image models with an API
Create songs with voice cloning models via API
AI media utilities: auto-caption, watermark, frame extraction & more via API
Browse the diverse range of qwen-image fine-tunes the community has custom-trained on Replicate.
WAN family of models: powerful image-to-video & text-to-video models
Use AI to caption images with an API

nvidia / nemotron-3-nano-30b-a3b
Nemotron-3-Nano-30B-A3B is a large language model (LLM) trained from scratch by NVIDIA
242 runs
wan-video / wan-2.6-i2v
Alibaba Wan 2.6 image to video generation model
237 runs
wan-video / wan-2.6-t2v
Alibaba Wan 2.6 text to video generation model
261 runs

vufinder / depth-anything-v3-metric
Monocular metric depth estimation
4 runs


souterdelilah-bit / delilahsouter
4 runs

resemble-ai / chatterbox-turbo
The fastest open source TTS model without sacrificing quality.
816 runs

nightowlstudio-11 / clipforge
Add Text & Music to Videos
3.5K runs


jasonod888 / moge2
Outputs Depth, Normal & Point Cloud for a Given Input Image
1 run


leadssolution2 / aiavatar
7 runs

vufinder / depth-anything-v3-mono
Monocular relative depth estimation
21 runs


prunaai / qwen-image-fast
Qwen-Image optimized by Pruna AI. Generates high-fidelity 1.5MP images in 1s. Model will be priced at $0.01 in January.
1.8K runs


tattzy25 / famous-flux
A custom Flux 1 Dev LoRA trained on ~50 diverse images blending portrait photography with automotive aesthetics. Use the trigger word FAMOSOFLUXO to activate the style. Perfect for creating unique fusion imagery combining human subjects with car culture e
39 runs