CompVis `latent-diffusion text2im` finetuned for inpainting.
Generate an image from text by guiding a denoising diffusion model. Inference is somewhat slow.
Use CLIP to guide a StyleGAN3 trained on pictures of mannequins.
GLIDE (filtered), the predecessor to DALL-E 2, with faster PRK/PLMS sampling.
GLIDE-text2im with humans and experimental style prompts.
Generate speech from text and clone voices from MP3 files. By James Betker, AKA "neonbjb".
Generate 768px images from text using CompVis `retrieval-augmented-diffusion`.
Use Stable Diffusion and aesthetic CLIP embeddings to steer bland outputs toward more aesthetically pleasing results.