lucataco/interactiveomni-8b
A unified omni-modal model that accepts image, audio, text, and video inputs simultaneously and directly generates coherent text and speech.