
bytedance / sa2va-26b-image
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

bytedance / latentsync
LatentSync: generates high-quality lip-sync animations

bytedance / hyper-flux-8step
Hyper FLUX 8-step by ByteDance

bytedance / sdxl-lightning-4step
SDXL-Lightning by ByteDance: a fast text-to-image model that makes high-quality images in 4 steps

bytedance / sa2va-8b-image
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

bytedance / sa2va-4b-image
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

bytedance / sa2va-26b-video
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

bytedance / sa2va-4b-video
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

bytedance / sa2va-8b-video
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

bytedance / flux-pulid
⚡️FLUX PuLID: FLUX-dev based Pure and Lightning ID Customization via Contrastive Alignment🎭

bytedance / hyper-flux-16step
Hyper FLUX 16-step by ByteDance

bytedance / pulid
📖 PuLID: Pure and Lightning ID Customization via Contrastive Alignment

bytedance / res-adapter
Domain Consistent Resolution Adapter for Diffusion Models: generates consistent images at resolutions outside the models' trained domain

bytedance / piano-transcription
High-resolution piano transcription system: detects piano notes from audio