opencodeinterpreter-ds-6.7b

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

Updated 6 runs

supir-v0f

Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild. This is the SUPIR-v0F model and does NOT use LLaVA-13b.

Updated 299 runs

supir-v0q

Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild. This is the SUPIR-v0Q model and does NOT use LLaVA-13b.

Updated 185 runs

supir

Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild. This version uses LLaVA-13b for captioning.

Updated 1.6K runs

uform-gen2-qwen-500m

Pocket-Sized Multimodal AI For Content Understanding and Generation

Updated 236 runs

canary-1b

NVIDIA automatic speech-to-text recognition (ASR) in 4 languages (English, German, French, Spanish)

Updated 31 runs

lambda-eclipse

λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space

Updated 139 runs

blipdiffusion

Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing

Updated 256 runs

blipdiffusion-controlnet

Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing with ControlNet

Updated 115 runs

cogagent-chat

A Visual Language Model for GUI Agents

Updated 248 runs

rmgb

Background removal model developed by BRIA.AI, trained on a carefully selected dataset and available as an open-source model for non-commercial use.

Updated 210 runs

videocrafter

VideoCrafter2: Text-to-Video and Image-to-Video Generation and Editing

Updated 6.5K runs

depth-anything

Highly practical solution for robust monocular depth estimation by training on a combination of 1.5M labeled images and 62M+ unlabeled images

Updated 2.9K runs

tokenflow

Consistent Diffusion Features for Consistent Video Editing

Updated 1.8K runs

video-retalking

Audio-based Lip Synchronization for Talking Head Video

Updated 11.7K runs

diffmorpher

Diffusion Models for Image Morphing

Updated 566 runs

dreamtalk

RESEARCH/NON-COMMERCIAL USE ONLY: diffusion-based audio-driven expressive talking head generation

Updated 391 runs

openvoice

NON-COMMERCIAL USE ONLY: Versatile Instant Voice Cloning

Updated 958 runs

faster-diffusion

Rethinking the Role of UNet Encoder in Diffusion Models

Updated 126 runs

magicoder

LLMs with open-source code snippets for generating low-bias and high-quality instruction data for code.

Updated 307 runs

segmind-vega

Open-source distilled Stable Diffusion with a 100% speedup

Updated 1.5K runs

segmind-vegart

Fast Segmind-Vega with 2-8 inference steps.

Updated 663 runs

cogvlm

powerful open-source visual language model

Updated 11.5K runs

kandinskyvideo

text-to-video generation model

Updated 810 runs

lavie

High-Quality Video Generation with Cascaded Latent Diffusion Models

Updated 12.4K runs

gorilla

Gorilla: Large Language Model Connected with Massive APIs

Updated 77 runs

distil-whisper

Distilled version of Whisper

Updated 218 runs

cutie

Video Object Segmentation, combined with SAM and ProPainter

Updated 166 runs

audiosep

Separate Anything You Describe

Updated 1.8K runs

scalecrafter

Tuning-free Higher-Resolution Visual Generation with Diffusion Models

Updated 719 runs

show-1

Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

Updated 799 runs

daclip-uir

Controlling Vision-Language Models for Universal Image Restoration

Updated 1.5K runs

instructcv

Instruction tuned text-to-image diffusion models as vision generalists

Updated 301 runs

internlm-xcomposer

Advanced text-image comprehension and composition based on InternLM

Updated 163.6K runs

idefics

Open-access reproduction of large visual language model Flamingo

No versions pushed 855 runs

wuerstchen

Efficient Pretraining of Text-to-Image Models

Updated 3.1K runs

seamless_communication

SeamlessM4T—Massively Multilingual & Multimodal Machine Translation

Updated 20.3K runs

unival

Unified Model for Image, Video, Audio and Language Tasks

Updated 730 runs

lorahub

Efficient Cross-Task Generalization via Dynamic LoRA Composition

Updated 72 runs

resshift

Efficient Diffusion Model for Image Super-resolution by Residual Shifting

Updated 1.4K runs

ledits

Real Image Editing with DDPM Inversion and Semantic Guidance

Updated 831 runs

kandinsky-2-2-controlnet-depth

Kandinsky Image Generation with ControlNet Conditioning

Updated 3.5K runs

styledrop

Text-to-Image Generation in Any Style

No versions pushed 1.2K runs

demucs

Demucs Music Source Separation

Updated 65.5K runs

diffedit-stable-diffusion

Diffusion-based semantic image editing with mask guidance

Updated 372 runs

textdiffuser

Diffusion Models as Text Painters

Updated 1.6K runs

prompt-free-diffusion

Prompt-free Diffusion

Updated 704 runs

controlvideo

Training-free Controllable Text-to-Video Generation

Updated 1.8K runs

shap-e

Generating Conditional 3D Implicit Functions

Updated 11.3K runs

fastcomposer

Tuning-Free Multi-Subject Image Generation with Localized Attention

Updated 33.5K runs

sadtalker

Stylized Audio-Driven Single Image Talking Face Animation

Updated 67.1K runs

semantic-segment-anything

Adding semantic labels for segment anything

Updated 16.1K runs

text2video-zero

Text-to-Image Diffusion Models are Zero-Shot Video Generators

Updated 39.5K runs

pix2struct

Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

Updated 5.8K runs

dolly

GPT-J 6B model fine-tuned on the Alpaca dataset

Updated 959 runs

stable-diffusion-2-1-unclip

Stable Diffusion v2-1-unclip Model

Updated 2.1K runs

damo-text-to-video

Multi-stage text-to-video generation

Updated 112.7K runs

unidiffuser

One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale

Updated 1.1K runs

dreamshaper

Dream Shaper stable diffusion

Updated 1.2M runs

zoedepth

ZoeDepth: Combining relative and metric depth

Updated 3.2M runs

hasdx

mixed stable diffusion model

Updated 29.3K runs

supermarionation

Stable Diffusion fine-tuned on Gerry Anderson's Supermarionation

Updated 1.8K runs

pastel-mix

high-quality highly detailed anime stylized latent diffusion model

Updated 30.4K runs

sd-x2-latent-upscaler

Stable Diffusion x2 latent upscaler

No versions pushed 2K runs

real-esrgan

Real-ESRGAN: Real-World Blind Super-Resolution

Updated 1.3M runs

t2i-adapter

Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models

Updated 3.8K runs

midas

Robust Monocular Depth Estimation

Updated 16.2K runs

hard-prompts-made-easy

Gradient-Based Discrete Optimization for Prompt Tuning and Discovery

Updated 613 runs

pix2pix-zero

Zero-shot Image-to-Image Translation

Updated 5.3K runs

dreambooth-avatar

Dreambooth finetuning of Stable Diffusion (v1.5.1) on Avatar art style by Lambda Labs

Updated 565 runs

gta5_artwork_diffusion

GTA5 Artwork Diffusion via Dreambooth

Updated 4.7K runs

magifactory-t-shirt-diffusion

Generate t-shirt logos with stable-diffusion

Updated 181.3K runs

distilgpt2-stable-diffusion-v2

Descriptive stable diffusion prompts generation using GPT2

Updated 558 runs

portraitplus

Portraits with stable-diffusion

Updated 23.1K runs

anything-v4.0

high-quality, highly detailed anime-style Stable Diffusion models

Updated 2.8M runs

point-e

Point-E: A System for Generating 3D Point Clouds from Complex Prompts

Updated 8.2K runs

anything-v3-better-vae

high-quality, highly detailed anime style stable-diffusion with better VAE

Updated 3.3M runs

future-diffusion

Fine-tuned Stable Diffusion on high quality 3D images with a futuristic Sci-Fi theme

Updated 5.2K runs

karlo

Text-conditional image generation model based on OpenAI's unCLIP

Updated 881 runs

analog-diffusion

a dreambooth model trained on a diverse set of analog photographs

Updated 233.4K runs

taiyi-stable-diffusion-1b-chinese-v0.1

Chinese Stable diffusion model

Updated 930 runs

eimis_anime_diffusion

stable-diffusion models for high quality and detailed anime images

Updated 12.1K runs

anything-v3.0

high-quality, highly detailed anime style stable-diffusion

Updated 352K runs

whisper

with large-v2 checkpoint

Updated 45.7K runs

stable-diffusion-img2img-v2.1

Updated 13.2K runs

wavyfusion

dreambooth trained on a very diverse dataset ranging from photographs to paintings

Updated 3.7K runs

altdiffusion-m9

Multilingual Stable Diffusion

Updated 601 runs

stable-diffusion-v2

sd-v2 with diffusers, test version!

Updated 270.4K runs

stable-diffusion-v2-inpainting

stable-diffusion-v2-inpainting

Updated 32.7K runs

rembg

Remove image backgrounds

Updated 4.6M runs

stable-diffusion

stable-diffusion with negative prompts and more schedulers

Updated 65.3K runs

app_icons_generator

App Icons Generator V1 (DreamBooth Model)

Updated 2K runs

aesthetic-predictor

A linear estimator on top of CLIP to predict the aesthetic quality of pictures

Updated 7.9K runs

backgroundmatting

Real-Time High-Resolution Background Matting

Updated 2.5K runs

sd_pixelart_spritesheet_generator

generate pixel art sprite sheets from four different angles with Stable-diffusion

Updated 4.5K runs

disco-diffusion-style

Disco Diffusion style on Stable Diffusion via Dreambooth

Updated 3.3K runs

tron-legacy-diffusion

Tron Legacy Diffusion on Stable Diffusion via Dreambooth

No versions pushed 1.5K runs

dreambooth-pikachu

Pikachu on Stable Diffusion via Dreambooth

Updated 513 runs

herge-style

herge_style on Stable Diffusion via Dreambooth

Updated 2.1K runs

van-gogh-diffusion

Van Gogh on Stable Diffusion via Dreambooth

Updated 5.4K runs

elden-ring-diffusion

fine-tuned Stable Diffusion model trained on the game art from Elden Ring

Updated 6.8K runs

prompt-to-prompt

Prompt-to-prompt image editing with cross-attention control

Updated 1.6K runs

stable-diffusion-v1-5

stable-diffusion with v1-5 checkpoint

Updated 34.4K runs

stable-diffusion-aesthetic-gradients

Stable Diffusion with Aesthetic Gradients

Updated 351 runs

waifu-diffusion

Stable Diffusion on Danbooru images

Updated 1.1M runs

whisper-downloadable-subtitles

Added downloadable subtitles for openai/whisper

Updated 1.9K runs

rudalle-sr

Real-ESRGAN super-resolution model from ruDALL-E

Updated 432.7K runs

stable-diffusion-high-resolution

Detailed, higher-resolution images from Stable Diffusion

Updated 71.3K runs

clip-vit-large-patch14

openai/clip-vit-large-patch14 with Transformers

Updated 4M runs

sd-textual-inversion-ugly-sonic

stable-diffusion-textual-inversion fine-tuned with ugly sonic

Updated 2K runs

sd-textual-inversion-spyro-dragon

stable-diffusion-textual-inversion fine-tuned on the Spyro the Dragon style

Updated 474 runs

sd-textual-inversion

Stable Diffusion Textual Inversion

Updated 478 runs

docentr

End-to-End Document Image Enhancement Transformer

Updated 1.9K runs

style-your-hair

Pose-Invariant Hairstyle Transfer

Updated 8.1K runs

repaint

Inpainting using Denoising Diffusion Probabilistic Models

Updated 3.4K runs

night-enhancement

Unsupervised Night Image Enhancement

Updated 38.3K runs

latent-diffusion-text2img

text-to-image with latent diffusion

Updated 4K runs

openpsg

Panoptic Scene Graph Generation

Updated 954 runs

mindall-e

text-to-image generation

Updated 1.7K runs

vq-diffusion

VQ-Diffusion for Text-to-Image Synthesis

Updated 20.7K runs

compositional-vsual-generation-with-composable-diffusion-models-pytorch

Composable Diffusion

Updated 844 runs

micromotion-stylegan

Decoding Micromotion in Low-dimensional Latent Spaces from StyleGAN

Updated 7.8K runs

clip-gen

Language-Free Training of a Text-to-Image Generator with CLIP

Updated 939 runs

bigcolor

Colorization using a Generative Color Prior for Natural Images

Updated 366.6K runs

global_tracking_transformers

Global Tracking Transformers

Updated 138 runs

vqfr

Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder

Updated 129.7K runs

diffae

Image Manipulation with Diffusion Autoencoders

Updated 14.9K runs

face-align-cog

face alignment using stylegan-encoding

Updated 3.5K runs

clip-guided-diffusion

Clip-Guided Diffusion Model for Image Generation

Updated 4.5K runs

clip-guided-diffusion-pokemon

Generates Pokémon sprites from a prompt

Updated 4.9K runs

rpg-diffusionmaster

No versions pushed 0 runs

maskgit

Masked Generative Image Transformer

No versions pushed 1 run

pix2seq

Turning RGB pixels into semantically meaningful sequences

No versions pushed 0 runs

multilingual-stable-diffusion

No versions pushed 0 runs

videocrafter2

No versions pushed 0 runs

oneformer

One Transformer to Rule Universal Image Segmentation

No versions pushed 0 runs

minigpt-5

No versions pushed 0 runs

chatglm-6b

bilingual language model based on General Language Model (GLM) framework

No versions pushed 0 runs

ddnm

Zero Shot Image Restoration Using Denoising Diffusion Null-Space Model

No versions pushed 0 runs