BAAI · 📐 Embeddings

BGE-M3

private
Quick reference
BGE-M3 — TLDR
  • 🌐 Multilingual embedding model supporting more than 100 working languages.
  • 🔧 Unifies dense, sparse, and multi-vector (ColBERT-style) retrieval in one model.
  • 📏 Handles inputs up to 8,192 tokens, from sentences to long documents.
  • 🧠 Built on XLM-RoBERTa with self-knowledge distillation training.
  • 🆕 Adds lexical and late-interaction retrieval missing from earlier BGE models.
  • 🏢 Released by Beijing Academy of Artificial Intelligence (BAAI) under MIT license.
  • 📚 Trained for multilingual, cross-lingual, and long-document information retrieval.
  • ⚡ Over 30 million Hugging Face downloads.
💰 Pricing
$0.150
per 1M tokens
📅 On Venice since
Mar 14, 2025
446 days ago
Provider

The Beijing Academy of Artificial Intelligence (BAAI), also known as the Zhiyuan Institute, is a Chinese non-profit research laboratory dedicated to advancing the fundamentals of AI. Founded as a collaborative hub, BAAI brings together leading AI companies,…

2 models on Venice
2 embedding
Since Mar 14, 2025

About this model

BGE-M3 is BAAI's multilingual text embedding model, where M3 denotes its three defining properties: Multi-Linguality, Multi-Functionality, and Multi-Granularity. Built on the XLM-RoBERTa architecture, it provides uniform semantic retrieval across more than 100 working languages and processes inputs ranging from short sentences up to documents of 8,192 tokens. A defining capability is that a single model can simultaneously perform dense retrieval, sparse (lexical) retrieval, and multi-vector retrieval, the last using a ColBERT-style late-interaction head.
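The three retrieval modes differ mainly in how a query and a document are compared. The sketch below is illustrative only: it uses tiny hand-written vectors rather than real BGE-M3 outputs, and the function names are hypothetical. It assumes the scoring forms commonly described for each mode: cosine similarity for dense vectors, a sum of weight products over shared tokens for sparse weights, and ColBERT-style MaxSim averaged over query tokens for multi-vector output.

```python
import math


def dense_score(q, d):
    """Cosine similarity between two single-vector embeddings."""
    dot = sum(a * b for a, b in zip(q, d))
    norm = math.sqrt(sum(a * a for a in q)) * math.sqrt(sum(b * b for b in d))
    return dot / norm


def sparse_score(q_weights, d_weights):
    """Sum of weight products over tokens that appear in both texts."""
    return sum(w * d_weights[tok] for tok, w in q_weights.items() if tok in d_weights)


def multi_vector_score(q_vecs, d_vecs):
    """MaxSim: best-matching document token per query token, then the mean."""
    total = 0.0
    for qv in q_vecs:
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d_vecs)
    return total / len(q_vecs)


# Toy inputs (assumed values, not produced by the model).
print(dense_score([3.0, 4.0], [3.0, 4.0]))
print(sparse_score({"bge": 0.5, "m3": 2.0}, {"m3": 1.5, "model": 1.0}))
print(multi_vector_score([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.5, 0.5]]))
```

In a real deployment the vectors and token weights would come from the model; only the comparison logic is shown here.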

The most concrete improvement over earlier BGE releases is this unification. Prior BGE embedding models (such as the BGE v1.5 line) focused chiefly on dense retrieval. BGE-M3 instead produces BM25-like token weights as a by-product of generating dense embeddings, so lexical retrieval adds no extra encoding cost, and its authors position it as a drop-in replacement for models like DPR and BGE v1.5. Training uses a self-knowledge distillation approach in which the relevance scores from the different retrieval functions are combined and used as a teacher signal.
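The idea of using combined relevance scores as a teacher can be sketched as follows. This is a simplified illustration, not the authors' training code: the function names, the equal weights, and the toy scores are assumptions. It treats the weighted sum of the three scores as the teacher and measures each retrieval function against it with a softmax cross-entropy over a list of candidate passages.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def integrated_scores(dense, sparse, multi, weights=(1.0, 1.0, 1.0)):
    """Teacher signal: weighted sum of the three scores (weights are assumed)."""
    w1, w2, w3 = weights
    return [w1 * d + w2 * s + w3 * m for d, s, m in zip(dense, sparse, multi)]


def distillation_loss(student_scores, teacher_scores):
    """Cross-entropy of the student distribution against the teacher distribution."""
    teacher_p = softmax(teacher_scores)
    student_p = softmax(student_scores)
    return -sum(t * math.log(s) for t, s in zip(teacher_p, student_p))


# Toy scores for two candidate passages (assumed values).
dense, sparse, multi = [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]
teacher = integrated_scores(dense, sparse, multi)
for name, student in (("dense", dense), ("sparse", sparse), ("multi", multi)):
    print(name, distillation_loss(student, teacher))
```

A retrieval function that ranks candidates the way the combined score does incurs a lower loss than one that disagrees with it, which is what pushes the three functions toward each other.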

According to the M3-Embedding technical paper, the model reaches strong results on multilingual, cross-lingual, and long-document retrieval benchmarks, with the authors recommending a hybrid retrieval plus re-ranking pipeline for best accuracy.
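A hybrid retrieval plus re-ranking pipeline can be sketched in two stages. The code below is a minimal illustration under stated assumptions: the fusion weights, the document ids, and all scores are made up, and `rerank_score` stands in for whatever second-stage scorer is used (for example a cross-encoder). In practice the dense and sparse scores sit on different scales, so some normalization before fusion is usually needed.

```python
def hybrid_retrieve(dense, sparse, dense_weight=0.7, sparse_weight=0.3, top_k=2):
    """Fuse per-document dense and sparse scores and return the top_k doc ids."""
    doc_ids = set(dense) | set(sparse)
    fused = {
        doc: dense_weight * dense.get(doc, 0.0) + sparse_weight * sparse.get(doc, 0.0)
        for doc in doc_ids
    }
    # Sort by fused score, highest first; break ties by doc id for determinism.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))[:top_k]


def rerank(candidates, rerank_score):
    """Reorder first-stage candidates by a second-stage relevance score."""
    return sorted(candidates, key=lambda doc: (-rerank_score[doc], doc))


# Toy first-stage scores (assumed values).
dense = {"a": 0.9, "b": 0.5, "c": 0.1}
sparse = {"a": 0.2, "b": 0.8, "c": 0.3}
candidates = hybrid_retrieve(dense, sparse)

# Toy second-stage scores for the surviving candidates (assumed values).
final = rerank(candidates, {"a": 0.4, "b": 0.7})
print(candidates, final)
```

The first stage keeps recall high at low cost, and the more expensive second-stage scorer only has to look at the short candidate list.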

Within BAAI's broader BGE family, the later BGE-EN-ICL takes a different direction as an LLM-based, English-focused embedding model using in-context learning, contrasting with BGE-M3's multilingual, multi-functional design.

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies; verify critical details against the sources listed on this page.

Research & Papers

5 reference papers linked from the HuggingFace model card.

arXiv 2402.03216 · Feb 2024

M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation (2024)

Jianlv Chen, Shitao Xiao, Peitian Zhang et al.

In this paper, we introduce a new embedding model called M3-Embedding, which is distinguished for its versatility in Multi-Linguality, Multi-Functionality, and Multi-Granularity. It provides a uniform support for the semantic retrieval of more than 100…

arXiv 2004.04906 · Apr 2020

Dense Passage Retrieval for Open-Domain Question Answering (2020)

Vladimir Karpukhin, Barlas Oğuz, Sewon Min et al.

Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense…

arXiv 2106.14807 · Jun 2021

A Few Brief Notes on DeepImpact, COIL, and a Conceptual Framework for Information Retrieval Techniques (2021)

Jimmy Lin, Xueguang Ma

Recent developments in representational learning for information retrieval can be organized in a conceptual framework that establishes two pairs of contrasts: sparse vs. dense representations and unsupervised vs. learned representations. Sparse learned representations can…

arXiv 2107.05720 · Jul 2021

SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking (2021)

Thibault Formal, Benjamin Piwowarski, Stéphane Clinchant

In neural Information Retrieval, ongoing research is directed towards improving the first retriever in ranking pipelines. Learning dense embeddings to conduct retrieval using efficient approximate nearest neighbors methods has proven to work well. Meanwhile, there has been a…

arXiv 2004.12832 · Apr 2020

ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT (2020)

Omar Khattab, Matei Zaharia

Recent progress in Natural Language Understanding (NLU) is driving fast-paced advances in Information Retrieval (IR), largely owed to fine-tuning deep language models (LMs) for document ranking. While remarkably effective, the ranking models based on these LMs increase…

Data sources: Venice API · HuggingFace · Wikipedia · arXiv — enrichment updated 1d ago