Explore

I want to…

Use official models

Official models are always on, maintained, and have predictable pricing.

Restore images

Models that improve or restore images by deblurring, colorizing, and removing noise.

Enhance videos

Models that enhance videos with super-resolution, sound effects, motion capture and other useful production effects.

Detect objects

Models that detect or segment objects in images and videos.

Make 3D stuff

Models that generate 3D objects, scenes, radiance fields, textures and multi-views.

Use FLUX fine-tunes

Browse the diverse range of fine-tunes the community has custom-trained on Replicate.

Control image generation

Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.

Latest models

Updated 122 runs

LTX-Video is the first DiT-based video generation model capable of generating high-quality videos in real time. It produces 24 FPS videos at 768x512 resolution faster than they can be watched.

Updated 50.9K runs

Updated 19.6K runs

Fine-tune HunyuanVideo LoRAs with kohya-ss/musubi-tuner

Updated 81 runs

Updated 2.2K runs

Updated 22 runs

LatentSync: generate high-quality lip sync animations

Updated 9.8K runs

Simple tool to merge a foreground and background image

Updated 250 runs

A state-of-the-art model for background removal - Bria v2.0

Updated 2.1K runs

Convert musubi-tuner LoRA to ComfyUI compatible format

Updated 30 runs

Fine-tune HunyuanVideo via a-r-r-o-w/finetrainers (Work In Progress)

Updated 49 runs

Microsoft's Florence 2 Base

Updated 235 runs

Minimal and Universal Control for Diffusion Transformer - demo for Subject-driven generation

Updated 409 runs

Minimal and Universal Control for Diffusion Transformer - demo for Spatially aligned control

Updated 74 runs

Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization

Updated 914 runs

Updated 162 runs

One Diffusion to Generate Them All

Updated 131 runs

Upscale low resolution images to high resolution images

Updated 2.4K runs

Cog implementation of Diffusers Flux RFInversion Pipeline

Updated 190 runs

Detect deepfake face-swap images

Updated 51 runs

Swap the source face onto the target face

Updated 208 runs

Unofficial community fork and Diffusers formatted weights of tencent/HunyuanVideo

Updated 104 runs

Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Updated 958 runs

A simple model that detects and crops faces found in an image, made for https://outfit.fm

Updated 2.9K runs

Fork / Remix of Apollo 7B by Luis C. (https://replicate.com/lucataco/apollo-7b) to support multi-turn conversations.

Updated 21 runs

QVQ-72B-Preview by Qwen is an experimental research model focusing on enhancing visual reasoning capabilities

Updated 251 runs

Remodels interior

Updated 1.4K runs

Autoregressive Video Generation without Vector Quantization

Updated 28 runs

Flawless Text is a high-precision text-to-image model that generates typo-free, visually accurate images from text descriptions, ideal for seamless, error-free creative workflows.

Updated 1.2K runs

For the paper "Structured 3D Latents for Scalable and Versatile 3D Generation".

Updated 172 runs

Autoregressive Image Generation without Vector Quantization

Updated 13 runs

ModernBERT-large is a modernized bidirectional encoder-only Transformer model (BERT-style) pre-trained on 2 trillion tokens of English and code data

Updated 73 runs

ModernBERT-base is a modernized bidirectional encoder-only Transformer model (BERT-style) pre-trained on 2 trillion tokens of English and code data

Updated 68 runs

Scalable Streaming Speech Synthesis with Large Language Models

Updated 1.2K runs

Updated 37 runs

Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Updated 158 runs

A Llama-3.2-11B pretrained model, fine-tuned for content safety classification

Updated 41 runs

Sound separation with Demucs

Updated 32.9K runs

FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models

Updated 136 runs

Updated 1 run

A Llama-3.1-8B pretrained model, fine-tuned for content safety classification

Updated 29K runs

A 7B parameter Llama 2-based input-output safeguard model

Updated 16 runs

Image inpainting with FLUX

Updated 33 runs

black-forest-labs/flux-dev

A 12 billion parameter rectified flow transformer capable of generating images from text descriptions

Updated 11.4M runs

Latest model in the Qwen family for chatting with video and image models

Updated 13.3K runs

Remove backgrounds from images.

Updated 668K runs

ibm-granite/granite-3.1-8b-instruct

Granite-3.1-8B-Instruct is a lightweight and open-source 8B parameter model designed to excel in instruction following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.

Updated 496.3K runs

ibm-granite/granite-3.1-2b-instruct

Granite-3.1-2B-Instruct is a lightweight and open-source 2B parameter model designed to excel in instruction following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.

Updated 8.3K runs

Fast Hunyuan Video by Hao AI Lab

Updated 367 runs

Demucs is an audio source separator created by Facebook Research.

Updated 353.1K runs