A document intelligence vision language model (VLM) that can query and summarize images from the physical or virtual world.