lucataco/interactiveomni-8b

A unified omni-modal model that accepts image, audio, text, and video inputs simultaneously and directly generates coherent text and speech.

Public
33 runs
