DSE-QWen2-2b-MRL-V1
DSE-QWen2-2b-MRL-V1 is a bi-encoder model designed to encode document screenshots into dense vectors for document retrieval. The Document Screenshot Embedding (DSE) approach captures documents in their original visual format, preserving all information such as text, images, and layout, and so avoids tedious parsing and the information loss that can come with it. DSE aims to provide a generalizable embedding model for retrieval over text, PDF documents, webpages, and slides.
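To make the bi-encoder idea concrete, here is a minimal sketch of the retrieval step only: the query and each document screenshot are assumed to have already been encoded into dense vectors, and documents are ranked by similarity to the query. The toy vectors, the helper names `cosine_similarity` and `rank_documents`, and the choice of cosine similarity are illustrative assumptions, not the checkpoint's actual API or scoring function.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed scoring function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # avoid division by zero for an all-zero vector
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs, top_k=5):
    """Return the top_k (doc_index, score) pairs, highest score first."""
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy stand-ins for embeddings; a real model would produce much longer vectors.
doc_vecs = [
    [1.0, 0.0, 0.0],  # hypothetical screenshot 0
    [0.0, 1.0, 0.0],  # hypothetical screenshot 1
    [0.7, 0.7, 0.0],  # hypothetical screenshot 2
]
query_vec = [0.9, 0.1, 0.0]

top = rank_documents(query_vec, doc_vecs, top_k=2)
print(top)  # screenshot 0 ranks first, then screenshot 2
```

Because the query and the documents are encoded independently, document vectors can be computed once and stored, and only the query needs to be encoded at search time.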
For example, DSE-QWen2-2b-MRL-V1 achieves 85.8 nDCG@5 on the ViDoRe leaderboard.
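For readers unfamiliar with the metric, the sketch below computes nDCG@5 for a single query using the standard definition with linear gain and a log2 rank discount. This is an assumption about the variant used; the leaderboard's exact evaluation code is not reproduced here, and the reported 85.8 is presumably an average over queries on a 0 to 100 scale. The function names are illustrative.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k results (linear gain)."""
    # enumerate() is 0-based, so the result at 1-based rank r is discounted by log2(r + 1).
    return sum(rel / math.log2(idx + 2) for idx, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=5):
    """nDCG@k for one query.

    ranked_relevances holds the relevance label of every judged document,
    in the order the retriever returned them.
    """
    ideal = sorted(ranked_relevances, reverse=True)  # best possible ordering
    idcg = dcg_at_k(ideal, k)
    if idcg == 0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(ranked_relevances, k) / idcg

# One relevant page, retrieved at rank 2 instead of rank 1.
score = ndcg_at_k([0, 1, 0, 0, 0], k=5)
print(round(score, 4))  # 0.6309, i.e. 1 / log2(3)
```

A score of 1.0 means every relevant page was ranked ahead of every irrelevant one within the top 5.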
Note:
Please see here (pytorch.bin version) and here (*.safetensors version) for more details.
Citation
If you find this checkpoint helpful, please consider citing the QWen2, Docmatix, ViDoRe, and DSE work.