Generate images
These models generate images from text prompts. Many of these models are based on Stable Diffusion and FLUX.1.
Read our guide to learn more about using Stable Diffusion.
- Text-to-image - Convert text prompts into photorealistic images, useful for quickly visualizing concepts
- Control over style - Adjust image properties like lighting and texture via prompts
- In-painting - Expand, edit, or refine images by filling in masked or missing regions
Our Picks
Best overall image generation model: black-forest-labs/flux-pro
The best overall image generation model is black-forest-labs/flux-pro. It offers state-of-the-art prompt following, visual quality, image detail, and output diversity. For more information, read our blog post about FLUX.1.
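As a rough sketch of what running a model like flux-pro looks like, Replicate's HTTP API accepts a prediction request against the model's predictions endpoint (the official client libraries wrap this for you). The snippet below only builds the request with the standard library and prints it; the token is a placeholder, no network call is made, and you should check the API docs for the current payload schema:

```python
import json
import urllib.request

# Endpoint for creating a prediction against an official model
# (per Replicate's HTTP API; verify against the current docs).
API_URL = "https://api.replicate.com/v1/models/black-forest-labs/flux-pro/predictions"

def build_request(prompt: str, token: str) -> urllib.request.Request:
    """Construct (but do not send) a prediction request for a text prompt."""
    payload = json.dumps({"input": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",  # real calls need your REPLICATE_API_TOKEN
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("an astronaut riding a horse, photorealistic", "r8_placeholder_token")
print(req.get_method(), req.full_url)
```

To actually run it, send the request with a valid token, then poll the prediction URL returned in the response until the output image is ready.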
Best ComfyUI model: fofr/any-comfyui-workflow
If you’re a fan of ComfyUI, you can export any of your favorite ComfyUI workflows to JSON and run them on Replicate using the fofr/any-comfyui-workflow model. For more information, check out our detailed guide to using ComfyUI.
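The workflow you pass in is ComfyUI's API-format export (in ComfyUI, enable dev mode options and use "Save (API Format)"). The sketch below shows how that exported JSON might be packaged as model input; the field name `workflow_json` is an assumption based on the model's guide, and the tiny workflow dict is a stand-in for a real export, so check the model page for the current input schema:

```python
import json

# A minimal stand-in for an API-format ComfyUI export
# (real exports contain many more nodes and connections).
workflow = {
    "3": {
        "class_type": "KSampler",
        "inputs": {"seed": 42, "steps": 20},
    }
}

def build_input(workflow: dict) -> dict:
    """Serialize the workflow so it can be passed as model input.
    The 'workflow_json' key is assumed from the model's guide."""
    return {"workflow_json": json.dumps(workflow)}

model_input = build_input(workflow)
print(model_input["workflow_json"])
```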
Best fast image generation model: black-forest-labs/flux-schnell
The best-looking fast image generation model is black-forest-labs/flux-schnell, which can generate high-quality images in roughly one second. The fastest is fofr/latent-consistency-model, which generates an image in about 0.6 seconds.
Best fine-tunes
Make sure to check out our SDXL fine-tunes collection, which includes all publicly available SDXL fine-tunes hosted on Replicate. This collection should help you get a feel for the sorts of things you can do with fine-tuning.
Recommended models
bytedance / sdxl-lightning-4step
SDXL-Lightning by ByteDance: a fast text-to-image model that makes high-quality images in 4 steps
stability-ai / stable-diffusion
A latent text-to-image diffusion model capable of generating photo-realistic images given any text input
stability-ai / sdxl
A text-to-image generative AI model that creates beautiful images
black-forest-labs / flux-schnell
The fastest image generation model tailored for local development and personal use
stability-ai / stable-diffusion-inpainting
Fill in masked parts of images with Stable Diffusion
ai-forever / kandinsky-2.2
Multilingual text2image latent diffusion model
datacte / proteus-v0.2
Proteus v0.2 shows subtle yet significant improvements over Version 0.1. It demonstrates enhanced prompt understanding that surpasses MJ6, while also approaching its stylistic capabilities.
ai-forever / kandinsky-2
Text2img model trained on LAION HighRes and fine-tuned on internal datasets
fofr / sdxl-emoji
An SDXL fine-tune based on Apple Emojis
tstramer / material-diffusion
Stable Diffusion fork for generating tileable outputs using the v1.5 model
playgroundai / playground-v2.5-1024px-aesthetic
Playground v2.5 is the state-of-the-art open-source model in aesthetic quality
black-forest-labs / flux-dev
A 12 billion parameter rectified flow transformer capable of generating images from text descriptions
datacte / proteus-v0.3
ProteusV0.3: The Anime Update
fofr / latent-consistency-model
Super-fast, 0.6s per image. LCM with img2img, large batching, and Canny ControlNet
lucataco / ssd-1b
Segmind Stable Diffusion Model (SSD-1B) is a distilled 50% smaller version of SDXL, offering a 60% speedup while maintaining high-quality text-to-image generation capabilities
batouresearch / sdxl-controlnet-lora
SDXL Canny ControlNet with LoRA support. Last update: now supports img2img.
fofr / realvisxl-v3-multi-controlnet-lora
RealVisXl V3 with multi-controlnet, lora loading, img2img, inpainting
fofr / any-comfyui-workflow
Run any ComfyUI workflow. Guide: https://github.com/fofr/cog-comfyui
fofr / sticker-maker
Make stickers with AI. Generates graphics with transparent backgrounds.
lucataco / realvisxl2-lcm
RealvisXL-v2.0 with LCM LoRA - requires fewer steps (4 to 8 instead of the original 40 to 50)
lucataco / realvisxl-v2.0
Implementation of SDXL RealVisXL_V2.0
fofr / sdxl-multi-controlnet-lora
Multi-controlnet, lora loading, img2img, inpainting
lucataco / dreamshaper-xl-turbo
DreamShaper is a general-purpose SD model that aims to do everything well: photos, art, anime, and manga. It's designed to compete with Midjourney and DALL-E.
lucataco / open-dalle-v1.1
A unique fusion that showcases exceptional prompt adherence and semantic understanding; it seems a step above base SDXL and a step closer to DALL-E 3 in prompt comprehension
adirik / realvisxl-v3.0-turbo
Photorealism with RealVisXL V3.0 Turbo based on SDXL
ai-forever / kandinsky-2-1
Kandinsky 2.1 Diffusion Model
nightmareai / disco-diffusion
Generate images using a variety of techniques - Powered by Discoart
lucataco / pixart-xl-2
PixArt-Alpha 1024px is a transformer-based text-to-image diffusion system trained on text embeddings from T5
adirik / realvisxl-v4.0
Photorealism with RealVisXL V4.0
lucataco / realistic-vision-v5
Realistic Vision v5.0 with VAE