lucataco/interactiveomni-8b

A unified omni-modal model that accepts image, audio, text, and video inputs simultaneously and directly generates coherent text and speech.

Public
49 runs
