These models generate images from text prompts. Many of these models are based on Stable Diffusion.
Read our guide to learn more about using Stable Diffusion.
- Text-to-image - Convert text prompts to photorealistic images. Useful for quickly visualizing concepts
- Control over style - Adjust image properties like lighting and texture via prompts
- In-painting - Expand, edit, or refine images by filling in missing regions
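For in-painting, most Stable Diffusion-based models take the original image plus a greyscale mask whose white pixels mark the region to regenerate. A minimal sketch of building such a mask with Pillow (the white-means-fill convention is common for Stable Diffusion inpainting, but check each model's input docs; `make_inpaint_mask` is a hypothetical helper, not part of any model's API):

```python
from PIL import Image, ImageDraw

def make_inpaint_mask(size, box):
    """Build an inpainting mask: black (0) = keep the original pixels,
    white (255) = region the model should fill in."""
    mask = Image.new("L", size, 0)                  # start fully black (keep everything)
    ImageDraw.Draw(mask).rectangle(box, fill=255)   # paint the fill region white
    return mask

# Mask out a 256x256 square in the center of a 512x512 image
mask = make_inpaint_mask((512, 512), (128, 128, 384, 384))
```

The mask is passed alongside the source image and a prompt; the model then repaints only the white region, which is how the expand/edit/refine workflows above are implemented.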
Models we recommend
A latent text-to-image diffusion model capable of generating photo-realistic images given any text input
A text-to-image generative AI model that creates beautiful images
Fill in masked parts of images with Stable Diffusion
Multilingual text-to-image latent diffusion model
Text-to-image model trained on LAION HighRes and fine-tuned on internal datasets
An SDXL fine-tune based on Apple Emojis
Stable Diffusion fork for generating tileable outputs using the v1.5 model
Segmind Stable Diffusion Model (SSD-1B) is a distilled 50% smaller version of SDXL, offering a 60% speedup while maintaining high-quality text-to-image generation capabilities
Super-fast, 0.6s per image. LCM with img2img, large batching, and Canny ControlNet
Last update: now supports img2img. SDXL Canny ControlNet with LoRA support
Implementation of SDXL RealVisXL_V2.0
Playground v2 is a diffusion-based text-to-image generative model trained from scratch by the research team at Playground
RealvisXL-v2.0 with LCM LoRA - requires fewer steps (4 to 8 instead of the original 40 to 50)
Multi-ControlNet, LoRA loading, img2img, inpainting
Kandinsky 2.1 Diffusion Model
Proteus v0.2 shows subtle yet significant improvements over Version 0.1. It demonstrates enhanced prompt understanding that surpasses MJ6, while also approaching its stylistic capabilities.
Generate images using a variety of techniques - Powered by Discoart
A unique fusion that showcases exceptional prompt adherence and semantic understanding; it seems to be a step above base SDXL and a step closer to DALL-E 3 in terms of prompt comprehension
RealVisXL V3 with multi-ControlNet, LoRA loading, img2img, and inpainting
DreamShaper is a general-purpose SD model that aims to do everything well: photos, art, anime, and manga. It's designed to match Midjourney and DALL-E.
PixArt-Alpha 1024px is a transformer-based text-to-image diffusion system trained on text embeddings from T5
Photorealism with RealVisXL V3.0 Turbo based on SDXL
Run any ComfyUI workflow. Guide: https://github.com/fofr/cog-comfyui
ThinkDiffusionXL is a go-to model capable of amazing photorealism, and versatile enough to generate high-quality images across a variety of styles and subjects without requiring you to be a prompting genius
SDXL-Lightning, by ByteDance, is a fast text-to-image model that makes high-quality images in 4 steps
Nebul.Redmond: a fine-tuned Stable Diffusion XL model
Realistic Vision v5.0 with VAE
Many models: RealVisXL, Juggernaut, Proteus, DreamShaper, etc.
ProteusV0.3: The Anime Update
SDXL using DeepCache
Editable image generation with MasaCtrl-SDXL
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
Playground v2 is a diffusion-based text-to-image generative model trained from scratch. Try out all 3 models here
Photorealism with RealVisXL V4.0