ifaas-uk/bge-m3

Multilingual BGE-M3 embedding model for dense and sparse hybrid retrieval.

Run time and cost

This model runs on Nvidia T4 GPU hardware.

BGE-M3 Embeddings

BGE-M3 (BAAI/bge-m3) is a multilingual embedding model from BAAI. This Replicate deployment returns both a dense vector and a sparse vector for the same input text.

Use it for:

  • Semantic search
  • Hybrid dense + sparse retrieval
  • RAG document indexing
  • Multilingual search over English, Arabic, and mixed-language text
  • Qdrant named-vector collections with dense and sparse vectors

Inputs

text

A single text string to embed.

texts_json

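Optional JSON-encoded array of strings for batch embedding, passed as a single string value rather than a native list (see the batch example under Example Inputs). If texts_json is provided, it is used instead of text.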

max_length

Maximum number of tokens per text. Default: 8192.

Output

For a single input, the response includes:

  • embedding.dense: 1024-dimensional dense vector
  • embedding.sparse_indices: sparse token ids
  • embedding.sparse_values: sparse token weights
  • embedding.dense_dim: dense vector size
  • embedding.sparse_token_count: number of sparse lexical weights

For batch input, read the embeddings list; each entry has the same fields as embedding.
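Because the response carries both a single embedding object and an embeddings list, client code can normalise the two cases into one list. The following is a minimal sketch, assuming the response has already been parsed into a Python dict with the field names listed above; the helper name get_embeddings is hypothetical and not part of this deployment.

```python
from typing import Any, Dict, List


def get_embeddings(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the per-text embedding entries of a response as a list.

    Prefers the "embeddings" list and falls back to the single
    "embedding" object when the list is missing or empty.
    """
    items = response.get("embeddings")
    if items:
        return list(items)
    single = response.get("embedding")
    return [single] if single is not None else []


if __name__ == "__main__":
    # Shortened stand-in for a real response (a real dense vector has 1024 values).
    response = {
        "embedding": {"dense": [0.0123, -0.0456]},
        "embeddings": [{"dense": [0.0123, -0.0456]}],
    }
    for item in get_embeddings(response):
        print(len(item["dense"]))
```

Handling both shapes in one place keeps downstream indexing code identical for single and batch calls.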

Example Inputs

Single text:

{
  "text": "The buyer shall pay the purchase price on the agreed date."
}

Batch:

{
  "texts_json": "[\"Murabaha payment clause\", \"Shariah governance standard\"]"
}

With max length:

{
  "text": "The contract must comply with the relevant Shariah standards.",
  "max_length": 4096
}
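The batch example passes texts_json as a string that itself contains JSON, which is easy to get wrong when escaping by hand. A small sketch of building the three input shapes above programmatically follows; the helper name build_input is hypothetical, and only the field names text, texts_json, and max_length come from this page.

```python
import json
from typing import Any, Dict, Optional, Sequence, Union


def build_input(
    texts: Union[str, Sequence[str]],
    max_length: Optional[int] = None,
) -> Dict[str, Any]:
    """Build an input payload for a single text or a batch of texts."""
    payload: Dict[str, Any] = {}
    if isinstance(texts, str):
        payload["text"] = texts
    else:
        # texts_json is a JSON-encoded string, not a native list.
        # ensure_ascii=False keeps Arabic and other non-ASCII text readable.
        payload["texts_json"] = json.dumps(list(texts), ensure_ascii=False)
    if max_length is not None:
        payload["max_length"] = max_length
    return payload


if __name__ == "__main__":
    print(build_input("The buyer shall pay the purchase price on the agreed date."))
    print(build_input(["Murabaha payment clause", "Shariah governance standard"]))
    print(build_input("The contract must comply.", max_length=4096))
```

The resulting dict can be passed as the input of whichever client is used to call the model.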

Example Output

{
  "model": "BAAI/bge-m3",
  "embedding_count": 1,
  "embedding": {
    "text_index": 0,
    "text_length": 60,
    "dense": [0.0123, -0.0456],
    "sparse_indices": [101, 2048],
    "sparse_values": [0.42, 0.31],
    "dense_dim": 1024,
    "sparse_token_count": 2
  },
  "embeddings": [
    {
      "text_index": 0,
      "text_length": 60,
      "dense": [0.0123, -0.0456],
      "sparse_indices": [101, 2048],
      "sparse_values": [0.42, 0.31],
      "dense_dim": 1024,
      "sparse_token_count": 2
    }
  ]
}

The example is shortened for readability: the real dense array contains 1024 float values, and the sparse arrays vary in length with the input text.
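Since the sparse arrays differ in length from one text to the next, two sparse vectors are compared by matching on token id rather than by position, while dense vectors are compared position by position. The sketch below illustrates both with plain Python, assuming the field layout shown in the example output; the function names are hypothetical, and a vector database would normally do this scoring for you.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("dense vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sparse_dot(
    indices_a: Sequence[int],
    values_a: Sequence[float],
    indices_b: Sequence[int],
    values_b: Sequence[float],
) -> float:
    """Dot product of two sparse vectors given as parallel id/weight arrays.

    Only token ids present in both vectors contribute to the score.
    """
    if len(indices_a) != len(values_a) or len(indices_b) != len(values_b):
        raise ValueError("indices and values must have the same length")
    weights_a = dict(zip(indices_a, values_a))
    return sum(w * weights_a[i] for i, w in zip(indices_b, values_b) if i in weights_a)


if __name__ == "__main__":
    # Token id 2048 is the only id shared by the two sparse vectors.
    print(sparse_dot([101, 2048], [0.42, 0.31], [2048, 7], [0.5, 0.9]))
    print(cosine_similarity([0.0123, -0.0456], [0.0123, -0.0456]))
```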

Notes

  • Dense vectors are suitable for cosine similarity search.
  • Sparse vectors are suitable for lexical or keyword-aware retrieval.
  • For Qdrant hybrid search, store the dense vector in a dense named vector and the sparse indices/values in a sparse named vector.
  • The model is best used for embedding and retrieval, not text generation.
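
As a rough illustration of the Qdrant note above, one embedding entry can be reshaped into a point that carries both named vectors. This is a sketch under assumptions: the vector names "dense" and "sparse", the helper name to_qdrant_point, and the payload contents are all hypothetical, and the dict layout follows the Qdrant REST upsert format as commonly documented, so check it against the Qdrant version you run. The target collection would need matching dense and sparse named-vector configuration.

```python
from typing import Any, Dict, Optional, Union


def to_qdrant_point(
    point_id: Union[int, str],
    embedding: Dict[str, Any],
    payload: Optional[Dict[str, Any]] = None,
    dense_name: str = "dense",    # assumed name of the dense named vector
    sparse_name: str = "sparse",  # assumed name of the sparse named vector
) -> Dict[str, Any]:
    """Reshape one embedding entry into a point dict with two named vectors."""
    return {
        "id": point_id,
        "vector": {
            dense_name: list(embedding["dense"]),
            sparse_name: {
                "indices": list(embedding["sparse_indices"]),
                "values": list(embedding["sparse_values"]),
            },
        },
        "payload": dict(payload or {}),
    }


if __name__ == "__main__":
    # Shortened stand-in for one entry of the embeddings list.
    entry = {
        "dense": [0.0123, -0.0456],
        "sparse_indices": [101, 2048],
        "sparse_values": [0.42, 0.31],
    }
    print(to_qdrant_point(1, entry, payload={"source": "contract.pdf"}))
```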