daanelson

Dan Nelson

test-model
Testing models! Currently ControlNet

whisper-tune
A fine-tunable version of Whisper

speedy-sdxl-test
SDXL, but faster

training-2

whisper-train-preprocessor
Dataset Preprocessing code for Whisper Fine-Tuning

whisperx
Accelerated transcription of audio using WhisperX

speedy-stable-diffusion-inpainting
Filling in images quickly with Stable Diffusion and AITemplate

whisper-jax-hindi

sd-21-fp16

imagebind
A model for text, audio, and image embeddings in one space

minigpt-4
A model which generates text in response to an input image and prompt.

flan-t5-large
A language model for tasks like classification, summarization, and more.

flan-t5-base
A small model for language tasks like classification, summarization, and more.

real-esrgan-a100
Real-ESRGAN for image upscaling on an A100

gfpgan-1-4

mixture-of-diffusers
Generate an image by specifying a different text prompt for each region

gfpgan-test
Face restoration and 2x upscaling

swin2sr-speedy

whisper-sandbox
Test model for whisper improvements

yolox
High performance and lightweight object detection models

plug_and_play_image_translation
Edit an image using features from diffusion models

stable-diffusion-speed-lab
Stable Diffusion, accelerated

motion_diffusion_model
A diffusion model for generating human motion video from a text prompt

attend-and-excite
Attention-Based Semantic Guidance for Text-to-Image Diffusion Models

stable-diffusion-long-prompts
img2img Stable Diffusion, but with longer prompts

some-upscalers
Some 4x esrgan upscalers

sdxl-tune-test