
sd-21-fp16

imagebind
A model that embeds text, audio, and images in one shared space

minigpt-4
A model that generates text in response to an input image and prompt.

flan-t5-large
A language model for tasks like classification, summarization, and more.

flan-t5-base
A small model for language tasks like classification, summarization, and more.

real-esrgan-a100
Real-ESRGAN for image upscaling on an A100

gfpgan-1-4

mixture-of-diffusers
Generate an image by specifying a different text prompt for each region

gfpgan-test
Face restoration and 2x upscaling

swin2sr-speedy

whisper-sandbox
Test model for Whisper improvements

yolox
High-performance, lightweight object detection models

plug_and_play_image_translation
Edit an image using features from diffusion models

stable-diffusion-speed-lab
Stable Diffusion, accelerated

motion_diffusion_model
A diffusion model for generating human motion video from a text prompt

attend-and-excite
Attention-Based Semantic Guidance for Text-to-Image Diffusion Models

stable-diffusion-long-prompts
img2img Stable Diffusion, but with longer prompts

some-upscalers
Some 4x ESRGAN upscalers