Explore

I want to…

Restore images

Models that improve or restore images by deblurring, colorizing, and removing noise.

Enhance videos

Models that enhance videos with super-resolution, sound effects, motion capture and other useful production effects.

Detect objects

Models that detect or segment objects in images and videos.

Make 3D stuff

Models that generate 3D objects, scenes, radiance fields, textures and multi-views.

Use FLUX fine-tunes

Browse the diverse range of fine-tunes the community has custom-trained on Replicate.

Control image generation

Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.

Latest models

Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Updated 64 runs

VideoLLaMA 3: Frontier Multimodal Foundation Models for Video Understanding

Updated 5 runs

This model generates pose variations of a cartoon character while preserving the character's identity. Use it to augment the training dataset for any AI-generated cartoon character; the augmented dataset can then be used to train a LoRA model.

Updated 1.9K runs

Flex.1 alpha is a pre-trained, 8-billion-parameter base rectified flow transformer that generates images from text descriptions.

Updated 106 runs

Change or replace a video's background with any image

Updated 41 runs

Zonos-v0.1 by Zyphra: voice cloning with support for 5 languages and emotion control

Updated 196 runs

Janus-Pro is a novel autoregressive framework for multimodal understanding

Updated 23 runs

anthropic/claude-3.5-haiku

Anthropic's fastest, most cost-effective model, with a 200K token context window (claude-3-5-haiku-20241022)

Updated 1.7K runs

anthropic/claude-3.5-sonnet

Anthropic's most intelligent language model to date, with a 200K token context window and image understanding (claude-3-5-sonnet-20241022)

Updated 1.5K runs

GPU accelerated replay renderer / video data clipper for comma.ai connect's openpilot route data. SEE README.

Updated 3.6K runs

Transform Images & Text into 3D Models with AI

Updated 29 runs

DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL

Updated 5.3K runs

DeepSeek-VL2-small, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL

Updated 30 runs

minimax/video-01-director

Generate videos with specific camera movements

Updated 733 runs

Zonos-v0.1 beta, a SOTA text-to-speech Transformer model with extraordinary expressive range, built by Zyphra.

Updated 150 runs

Converts a video into a black and white dotted video effect

Updated 139 runs

Run any ComfyUI workflow. Guide: https://github.com/fofr/cog-comfyui

Updated 1.5M runs

Hibiki: High-Fidelity Simultaneous Speech-To-Speech Translation

Updated 7 runs

Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Updated 91 runs

This model is an optimised version of Stable Diffusion by Stability AI, deployed on a T4 instead of an A100, which makes it ~7x cheaper.

Updated 48 runs

This is a version of the FLUX schnell model from Black Forest Labs, optimised with the Pruna tool. We achieve a ~3x speedup over the original model with minimal quality loss.

Updated 102 runs

flux_schnell model img2img inference

Updated 27.8K runs

flux dev

Updated 73.1K runs

Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)

Updated 297 runs

Tiled inference implementation of PLKSR

Updated 41 runs

Have fun by swapping the face in a GIF!

Updated 474 runs

google/imagen-3-fast

A faster and cheaper Imagen 3 model, for when price or speed is more important than final image quality

Updated 12.7K runs

google/imagen-3

Google's highest quality text-to-image model, capable of generating images with detail, rich lighting and beauty

Updated 38.4K runs

https://civitai.com/models/833294

Updated 20.5K runs

black-forest-labs/flux-depth-pro

Professional depth-aware image generation. Edit images while preserving spatial relationships.

Updated 52.6K runs

black-forest-labs/flux-canny-pro

Professional edge-guided image generation. Control structure and composition using Canny edge detection

Updated 83.3K runs

black-forest-labs/flux-fill-pro

Professional inpainting and outpainting model with state-of-the-art performance. Edit or extend images with natural, seamless results.

Updated 201.1K runs

Rembg implementation with mask output

Updated 42 runs

black-forest-labs/flux-1.1-pro

Faster, better FLUX Pro. Text-to-image model with excellent image quality, prompt adherence, and output diversity.

Updated 15M runs

black-forest-labs/flux-1.1-pro-ultra

FLUX1.1 [pro] in ultra and raw modes. Images are up to 4 megapixels. Use raw mode for realism.

Updated 5.7M runs

black-forest-labs/flux-pro

State-of-the-art image generation with top-of-the-line prompt following, visual quality, image detail, and output diversity.

Updated 9.2M runs

Janus-Pro is a novel autoregressive framework for multimodal understanding

Updated 1.8K runs

Generate music with YuE-s1-7B (English, chain of thought model)

Updated 324 runs