light770/qwen3-embedding-0.6b

Compact Powerhouse for Vector Embeddings

Public
70 runs

Run time and cost

This model runs on Nvidia T4 GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Qwen3-Embedding-0.6B is a lightweight yet high-performing text embedding model from Alibaba’s Qwen team, purpose-built for production RAG pipelines and pgvector deployments. Despite its small footprint, it delivers competitive performance on the MTEB benchmark.

  • Flexible Dimensions: Supports Matryoshka Representation Learning (MRL) — generate embeddings from 32 to 1024 dimensions to optimize pgvector storage vs. accuracy trade-offs

  • Long Context: 32K token context window handles long documents without chunking overhead

  • Instruction-Aware: Task-specific instructions boost retrieval accuracy by 1–5% — perfect for domain-specific pgvector search

  • Multilingual: Supports 100+ languages including code, enabling cross-lingual vector search in a single pgvector table
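The MRL and instruction-aware behaviors above can be sketched in a few lines. This is a minimal illustration, not the model's implementation: the 1024-dim vector below is a random stand-in for a real model output, and the `Instruct: ... \nQuery: ...` template follows the prompt format described in the Qwen3-Embedding documentation.

```python
import numpy as np

def truncate_mrl(embedding: np.ndarray, dims: int) -> np.ndarray:
    """Keep the first `dims` components of an MRL embedding and re-normalize.

    MRL-trained models concentrate the most important information in the
    leading dimensions, so a prefix of the vector is still a usable embedding.
    """
    head = embedding[:dims]
    return head / np.linalg.norm(head)

def format_query(task: str, query: str) -> str:
    # Instruction template per the Qwen3-Embedding docs (queries only;
    # documents are embedded without an instruction prefix).
    return f"Instruct: {task}\nQuery: {query}"

# Stand-in for a real 1024-dim model output (random, for illustration only).
rng = np.random.default_rng(0)
full = rng.normal(size=1024)
full /= np.linalg.norm(full)

small = truncate_mrl(full, 256)  # 4x less pgvector storage per row
print(small.shape)                             # (256,)
print(round(float(np.linalg.norm(small)), 6))  # 1.0
print(format_query("Given a web search query, retrieve relevant passages",
                   "what is pgvector?"))
```

Cosine similarity over the truncated, re-normalized prefix approximates similarity over the full vector, which is what makes the storage-vs-accuracy dial practical in pgvector.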

| Specification | Value |
|---|---|
| Parameters | 0.6B (600M) |
| Architecture | Dense Transformer decoder |
| Layers | 28 |
| Context Length | 32,768 tokens |
| Embedding Dimensions | 32–1024 (user-configurable) |
| MRL Support | Yes |
| License | Apache 2.0 |
| Release Date | June 2025 |
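The configurable output dimension translates directly into pgvector storage. As a rough sketch, assuming pgvector's `vector` type stores 4 bytes per float32 component plus an 8-byte header per value (and ignoring row and index overhead):

```python
def pgvector_bytes(dims: int, rows: int) -> int:
    # Approximate on-disk size of a pgvector `vector` column:
    # 4 bytes per float32 component + 8-byte header per value.
    return rows * (4 * dims + 8)

rows = 1_000_000
for dims in (1024, 512, 256, 32):
    gb = pgvector_bytes(dims, rows) / 1e9
    print(f"{dims:4d} dims: {gb:.3f} GB")  # e.g. 1024 dims -> 4.104 GB
```

Dropping from 1024 to 256 dimensions cuts the column from roughly 4.1 GB to about 1.0 GB per million rows, which is the trade-off MRL lets you tune without retraining.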