Attention-Based Semantic Guidance for Text-to-Image Diffusion Models
A diffusion model for generating human motion video from a text prompt
Some 4x ESRGAN upscalers
High-performance, lightweight object detection models
Stable Diffusion, accelerated
Edit an image using features from diffusion models
Generate an image by specifying a different text prompt for each region
Real-ESRGAN for image upscaling on an A100
A language model for tasks like classification, summarization, and more.
A model which generates text in response to an input image and prompt.
A model for text, audio, and image embeddings in one space
Filling in images quickly with Stable Diffusion and AITemplate
Accelerated transcription of audio using WhisperX
Dataset preprocessing code for Whisper fine-tuning
SDXL, but faster
Image inpainting with FLUX
This model runs on A100 (80GB).