Explore

Fine-tune FLUX fast
Customize FLUX.1 [dev] with the fast FLUX trainer on Replicate
Train the model to recognize and generate new concepts using a small set of example images, for specific styles, characters, or objects. It's fast (under 2 minutes), cheap (under $2), and gives you a warm, runnable model plus LoRA weights to download.
Featured models

bytedance / seededit-3.0
Text-guided image editing model that preserves original details while making targeted modifications like lighting changes, object removal, and style conversion

bytedance / seedance-1-pro
A pro version of Seedance that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 1080p resolution

black-forest-labs / flux-kontext-pro
A state-of-the-art text-based image editing model that delivers high-quality outputs with excellent prompt following and consistent results for transforming images through natural language

replicate / fast-flux-kontext-trainer
Fine-tune FLUX.1 Kontext to follow your own instructions

prunaai / hidream-l1-fast
This is an optimised version of the hidream-l1 model using the pruna ai optimisation toolkit!

google / veo-3-fast
A faster and cheaper version of Google’s Veo 3 video model, with audio

zsxkib / thinksound
Generate contextual audio from video using step-by-step reasoning🎶

minimax / hailuo-02
Hailuo 2 is a text-to-video and image-to-video model that can make 6s or 10s videos at 720p (standard) or 1080p (pro). It excels at real-world physics.

ideogram-ai / ideogram-v3-turbo
Turbo is the fastest and cheapest Ideogram v3. v3 creates images with stunning realism, creative designs, and consistent styles
Official models
Official models are always on, maintained, and have predictable pricing.
I want to…
Generate images
Models that generate images from text prompts
Caption videos
Models that generate text from videos
Generate speech
Convert text to speech
Use a face to make images
Make realistic images of people instantly
Generate videos
Models that create and edit videos
Upscale images
Upscaling models that create high-quality images from low-quality images
Generate music
Models to generate and modify music
Edit images
Tools for editing images.
Transcribe speech
Models that convert speech to text
Extract text from images
Optical character recognition (OCR) and text extraction
Remove backgrounds
Models that remove backgrounds from images and videos
Use the FLUX family of models
The FLUX family of text-to-image models from Black Forest Labs
Restore images
Models that improve or restore images by deblurring, colorization, and removing noise
Enhance videos
Upscaling models that create high-quality video from low-quality videos
Chat with images
Ask language models about images
Edit videos
Tools for editing videos.
Use LLMs
Models that can understand and generate text
Make 3D stuff
Models that generate 3D objects, scenes, radiance fields, textures and multi-views.
Make videos with Wan2.1
Generate videos with Wan2.1, the fastest and highest-quality open-source video generation model.
Caption images
Models that generate text from images
Use handy tools
Toolbelt-type models for videos and images.
Control image generation
Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.
Sing with voices
Voice-to-voice cloning and musical prosody
Get embeddings
Models that generate embeddings from inputs
Try for free
Get started with these models without adding a credit card. Whether you're making videos, generating images, or upscaling photos, these are great starting points.
Use official models
Official models are always on, maintained, and have predictable pricing.
Detect objects
Models that detect or segment objects in images and videos.
Use FLUX fine-tunes
Browse the diverse range of fine-tunes the community has custom-trained on Replicate
Popular models
This is the fastest Flux Dev endpoint in the world. Contact us for more at pruna.ai
Return CLIP features for the clip-vit-large-patch14 model
Practical face restoration algorithm for *old photos* or *AI-generated faces*
SDXL-Lightning by ByteDance: a fast text-to-image model that makes high-quality images in 4 steps
multilingual-e5-large: A multi-language text embedding model
whisper-large-v3, incredibly fast, powered by Hugging Face Transformers! 🤗
Latest models
An AI system that can create realistic images and art from a description in natural language.
A multimodal image generation model that creates high-quality images. You need to bring your own verified OpenAI key to use this model. Your OpenAI account will be charged for usage.
Generate believable consistent character videos
A commercial-ready text-to-image model trained entirely on licensed data. With only 4B parameters, it provides exceptional aesthetics and text rendering, and is evaluated to be on par with other leading models on the market
Text-guided image editing model that preserves original details while making targeted modifications like lighting changes, object removal, and style conversion
A pro version of Seedance that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 1080p resolution
A video generation model that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 720p resolution
minimax/hailuo-02 + topazlabs/video-upscale + zsxkib/thinksound
A premium text-based image editing model that delivers maximum performance and improved typography generation for transforming images through natural language prompts
A state-of-the-art text-based image editing model that delivers high-quality outputs with excellent prompt following and consistent results for transforming images through natural language
Granite-speech-3.3-8b is a compact and efficient speech-language model, specifically designed for automatic speech recognition (ASR) and automatic speech translation (AST).
Dia 1.6B by Nari Labs generates realistic dialogue audio from text, including non-verbal cues and voice cloning
WhisperX that works with multiple chunks: downloading, processing, and merging the results
Granite-vision-3.3-2b is a compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.
Fast endpoint for Flux Kontext, optimized with the pruna framework
This is the fastest Flux Dev endpoint in the world. Contact us for more at pruna.ai
This is an optimised version of the hidream-full model using the pruna ai optimisation toolkit!
This is an optimised version of the hidream-l1-dev model using the pruna ai optimisation toolkit!
This is an optimised version of the hidream-l1 model using the pruna ai optimisation toolkit!
Faster, with a slight quality reduction compared to LTX-Video 13b
Automatically generates expert ThinkSound prompts by analyzing your video with Claude 4, so you no longer have to struggle with complex audio descriptions
See how Hailuo 02 handles various out-of-distribution physics prompts
Easily compare the latest AI video models
A faster and cheaper version of Google’s Veo 3 video model, with audio
Generate contextual audio from video using step-by-step reasoning🎶