Vision models

Multimodal large language models with vision capabilities such as object detection and optical character recognition (OCR).