A model for text, audio, and image embeddings in one space
A language model for tasks like classification, summarization, and more.
Image inpainting with FLUX
A model that generates text in response to an input image and prompt.
Generate an image by specifying a different text prompt for each region
A diffusion model for generating human motion video from a text prompt
Edit an image using features from diffusion models
Real-ESRGAN for image upscaling on an A100
Several 4x ESRGAN upscalers
SDXL, but faster
Filling in images quickly with Stable Diffusion and AITemplate
Stable Diffusion, accelerated
Dataset preprocessing code for Whisper fine-tuning
Accelerated transcription of audio using WhisperX
High-performance, lightweight object detection models