idefics
Open-access reproduction of large visual language model Flamingo

wuerstchen
Efficient Pretraining of Text-to-Image Models
seamless_communication
SeamlessM4T—Massively Multilingual & Multimodal Machine Translation
i2vgen-xl
Generating high-definition videos based on input images and videos.

unival
Unified Model for Image, Video, Audio and Language Tasks
lorahub
Efficient Cross-Task Generalization via Dynamic LoRA Composition

resshift
Efficient Diffusion Model for Image Super-resolution by Residual Shifting

ledits
Real Image Editing with DDPM Inversion and Semantic Guidance

kandinsky-2-2-controlnet-depth
Kandinsky Image Generation with ControlNet Conditioning

styledrop
Text-to-Image Generation in Any Style
demucs
Demucs Music Source Separation

diffedit-stable-diffusion
Diffusion-based semantic image editing with mask guidance

textdiffuser
Diffusion Models as Text Painters

prompt-free-diffusion
Prompt-free Diffusion
controlvideo
Training-free Controllable Text-to-Video Generation

shap-e
Generating Conditional 3D Implicit Functions

fastcomposer
Tuning-Free Multi-Subject Image Generation with Localized Attention
sadtalker
Stylized Audio-Driven Single Image Talking Face Animation

semantic-segment-anything
Adding semantic labels for segment anything
videocrafter
Text-to-Video Generation and Editing
text2video-zero
Text-to-Image Diffusion Models are Zero-Shot Video Generators

pix2struct
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
dolly
Fine-tuned GPT-J 6B model on the Alpaca dataset

stable-diffusion-2-1-unclip
Stable Diffusion v2-1-unclip Model
damo-text-to-video
Multi-stage text-to-video generation

unidiffuser
One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale

dreamshaper
Dream Shaper stable diffusion

zoedepth
ZoeDepth: Combining relative and metric depth

hasdx
mixed stable diffusion model

supermarionation
Finetuned Stable-diffusion from Gerry Anderson Supermarionation

pastel-mix
high-quality highly detailed anime stylized latent diffusion model

sd-x2-latent-upscaler
Stable Diffusion x2 latent upscaler

real-esrgan
Real-ESRGAN: Real-World Blind Super-Resolution

t2i-adapter
Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models

midas
Robust Monocular Depth Estimation

hard-prompts-made-easy
Gradient-Based Discrete Optimization for Prompt Tuning and Discovery

pix2pix-zero
Zero-shot Image-to-Image Translation

dreambooth-avatar
Dreambooth finetuning of Stable Diffusion (v1.5.1) on Avatar art style by Lambda Labs

gta5_artwork_diffusion
GTA5 Artwork Diffusion via Dreambooth

magifactory-t-shirt-diffusion
Generate t-shirt logos with stable-diffusion
distilgpt2-stable-diffusion-v2
Descriptive stable diffusion prompts generation using GPT2

portraitplus
Portraits with stable-diffusion

anything-v4.0
high-quality, highly detailed anime-style Stable Diffusion models

point-e
Point-E: A System for Generating 3D Point Clouds from Complex Prompts

anything-v3-better-vae
high-quality, highly detailed anime style stable-diffusion with better VAE

future-diffusion
Fine-tuned Stable Diffusion on high quality 3D images with a futuristic Sci-Fi theme

karlo
Text-conditional image generation model based on OpenAI's unCLIP

analog-diffusion
a dreambooth model trained on a diverse set of analog photographs

taiyi-stable-diffusion-1b-chinese-v0.1
Chinese Stable diffusion model

eimis_anime_diffusion
stable-diffusion models for high quality and detailed anime images

anything-v3.0
high-quality, highly detailed anime style stable-diffusion
whisper
with large-v2 checkpoint

stable-diffusion-img2img-v2.1

wavyfusion
dreambooth trained on a very diverse dataset ranging from photographs to paintings

altdiffusion-m9
Multilingual Stable Diffusion

stable-diffusion-v2
sd-v2 with diffusers, test version!

stable-diffusion-v2-inpainting
stable-diffusion-v2-inpainting

rembg
remove image backgrounds

stable-diffusion
stable-diffusion with negative prompts, more schedulers

app_icons_generator
App Icons Generator V1 (DreamBooth Model)

aesthetic-predictor
A linear estimator on top of clip to predict the aesthetic quality of pictures

backgroundmatting
Real-Time High-Resolution Background Matting

sd_pixelart_spritesheet_generator
generate pixel art sprite sheets from four different angles with Stable-diffusion

disco-diffusion-style
Disco Diffusion style on Stable Diffusion via Dreambooth

tron-legacy-diffusion
Tron Legacy Diffusion on Stable Diffusion via Dreambooth

dreambooth-pikachu
Pikachu on Stable Diffusion via Dreambooth

herge-style
herge_style on Stable Diffusion via Dreambooth

van-gogh-diffusion
Van Gogh on Stable Diffusion via Dreambooth

elden-ring-diffusion
fine-tuned Stable Diffusion model trained on the game art from Elden Ring

prompt-to-prompt
Prompt-to-prompt image editing with cross-attention control

stable-diffusion-v1-5
stable-diffusion with v1-5 checkpoint

stable-diffusion-aesthetic-gradients
Stable Diffusion with Aesthetic Gradients

waifu-diffusion
Stable Diffusion on Danbooru images
whisper-downloadable-subtitles
Added downloadable subtitles for openai/whisper

rudalle-sr
Real-ESRGAN super-resolution model from ruDALL-E

stable-diffusion-high-resolution
Detailed, higher-resolution images from Stable Diffusion

clip-vit-large-patch14
openai/clip-vit-large-patch14 with Transformers

sd-textual-inversion-ugly-sonic
stable-diffusion-textual-inversion fine-tuned with ugly sonic

sd-textual-inversion-spyro-dragon
stable-diffusion-textual-inversion fine-tuned with spyro of the dragon STYLE

sd-textual-inversion
Stable Diffusion Textual Inversion

docentr
End-to-End Document Image Enhancement Transformer

style-your-hair
Pose-Invariant Hairstyle Transfer

repaint
Inpainting using Denoising Diffusion Probabilistic Models

night-enhancement
Unsupervised Night Image Enhancement

latent-diffusion-text2img
text-to-image with latent diffusion

openpsg
Panoptic Scene Graph Generation

mindall-e
text-to-image generation

vq-diffusion
VQ-Diffusion for Text-to-Image Synthesis

compositional-vsual-generation-with-composable-diffusion-models-pytorch
Composable Diffusion

micromotion-stylegan
Decoding Micromotion in Low-dimensional Latent Spaces from StyleGAN

clip-gen
Language-Free Training of a Text-to-Image Generator with CLIP

bigcolor
Colorization using a Generative Color Prior for Natural Images
global_tracking_transformers
Global Tracking Transformers

vqfr
Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder

diffae
Image Manipulation with Diffusion Autoencoders

face-align-cog
face alignment using stylegan-encoding

clip-guided-diffusion
Clip-Guided Diffusion Model for Image Generation

clip-guided-diffusion-pokemon
Generates Pokémon sprites from a prompt
multilingual-stable-diffusion
maskgit
Masked Generative Image Transformer
oneformer
One Transformer to Rule Universal Image Segmentation
ddnm
Zero Shot Image Restoration Using Denoising Diffusion Null-Space Model
chatglm-6b
bilingual language model based on General Language Model (GLM) framework
pix2seq
Turning RGB pixels into semantically meaningful sequences