bytedance/sa2va-4b-image
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos