microsoft/phi-4-multimodal-instruct

Phi-4-multimodal-instruct is a lightweight open multimodal foundation model that leverages the language, vision, and speech research and datasets used for Phi-3.5 and 4.0 models.

Public
3.2K runs