Explore

Fine-tune FLUX

Customize FLUX.1 [dev] with Ostris's AI Toolkit on Replicate. Train the model to recognize and generate new concepts using a small set of example images, for specific styles, characters, or objects. (Generated with davisbrown/flux-half-illustration.)

I want to…

Make videos with Wan2.1

Generate videos with Wan2.1, the fastest and highest-quality open-source video generation model.

Upscale images

Upscaling models that create high-quality images from low-quality images.

Restore images

Models that improve or restore images by deblurring, colorization, and removing noise.

Use official models

Official models are always on, maintained, and have predictable pricing.

Enhance videos

Models that enhance videos with super-resolution, sound effects, motion capture and other useful production effects.

Detect objects

Models that detect or segment objects in images and videos.

Make 3D stuff

Models that generate 3D objects, scenes, radiance fields, textures and multi-views.

Use FLUX fine-tunes

Browse the diverse range of fine-tunes the community has custom-trained on Replicate.

Control image generation

Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.

Latest models

Place items in a scene without needing to train on them

Updated 2.5K runs

Cogified implementation of OminiControl

Updated 74 runs

Regression of musical arousal and valence values

Updated 8.8K runs

Step-Audio-TTS-3B represents the industry's first Text-to-Speech (TTS) model trained on a large-scale synthetic dataset utilizing the LLM-Chat paradigm

Updated 1.1K runs

Tiled inference implementation of PLKSR

Updated 69 runs

VideoLLaMA 3: Frontier Multimodal Foundation Models for Video Understanding

Updated 1.9K runs

Flex.1 alpha is a pre-trained base 8-billion-parameter rectified flow transformer capable of generating images from text descriptions

Updated 300 runs

Zonos-v0.1 by Zyphra, voice cloning, 5 languages and emotion control

Updated 1.2K runs

Janus-Pro is a novel autoregressive framework for multimodal understanding

Updated 6.7K runs

anthropic/claude-3.5-haiku

Anthropic's fastest, most cost-effective model, with a 200K token context window (claude-3-5-haiku-20241022)

Updated 906.9K runs

anthropic/claude-3.5-sonnet

Anthropic's most intelligent language model to date, with a 200K token context window and image understanding (claude-3-5-sonnet-20241022)

Updated 475.6K runs

Transform Images & Text into 3D Models with AI

Updated 44 runs

DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL

Updated 52.4K runs

DeepSeek-VL2-small, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL

Updated 681 runs

Zonos-v0.1 beta, a SOTA text-to-speech Transformer model with extraordinary expressive range, built by Zyphra.

Updated 254 runs

Converts a video into a black-and-white dotted effect

Updated 749 runs

Hibiki: High-Fidelity Simultaneous Speech-To-Speech Translation

Updated 12 runs

Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Updated 1.2K runs

Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)

Updated 2.1K runs

Have fun by swapping a face in a GIF!

Updated 42.5K runs

google/imagen-3-fast

A faster and cheaper Imagen 3 model, for when price or speed is more important than final image quality

Updated 108.6K runs

google/imagen-3

Google's highest quality text-to-image model, capable of generating images with detail, rich lighting and beauty

Updated 803.4K runs

https://civitai.com/models/833294

Updated 28.4K runs

Rembg implementation with mask output

Updated 45 runs

Janus-Pro is a novel autoregressive framework for multimodal understanding

Updated 10.7K runs

Generate music with YuE-s1-7B (English, chain of thought model)

Updated 984 runs

Test deployment of OuteTTS 500M

Updated 503 runs

Interior Design with RealVisXL V5.0-Lightning and ControlNet to generate photorealistic, high-resolution interior designs.

Updated 693 runs

Ultimate anime-themed finetuned SDXL model and the latest installment of the Animagine XL series

Updated 668 runs

Interior Design with RealVisXL V5.0 and ControlNet (Depth & Union SDXL ProMax) to generate photorealistic, high-resolution interior designs with enhanced depth and structure.

Updated 911 runs

STAR Video Upscaler: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

Updated 437 runs

Takes audio (mp3) and a "source-of-truth" audio transcript (string) as input and returns precise timestamps.

Updated 822 runs

A demo model for a guide I'm working on...

Updated 8 runs

DeepSeek-R1 distilled on LLaMA3.3 70B and quantized by ollama

Updated 23 runs
