OpenAI · 📐 Embeddings

Text Embedding 3 Large

Quick reference
Text Embedding 3 Large — TLDR
  • 🧠 OpenAI's most capable embedding model for English and multilingual tasks.
  • 📏 Produces up to 3072-dimensional vectors for measuring text relatedness.
  • 🔧 Supports a dimensions parameter to shorten embeddings, trading accuracy for size.
  • 🌐 Improves multilingual retrieval over the prior ada-002 generation.
  • 🎯 Built for search, clustering, recommendations, and retrieval-augmented generation.
  • ⚡ Even shortened to 256 dimensions, it can outperform the full 1536-dimension ada-002 on MTEB.
  • 💬 Pairs with the smaller, cheaper Text Embedding 3 Small for cost-sensitive workloads.
💰 Pricing
$0.163 per 1M tokens
📅 On Venice since
Apr 17, 2026
Provider

OpenAI is an American artificial intelligence research organization headquartered in San Francisco, structured as both a for-profit public benefit corporation and a nonprofit foundation. The lab developed the GPT family of large language models, the DALL-E…

24 models on Venice
13 text · 4 video · 2 image · 2 embedding · 2 inpaint · 1 asr
Since Jan 15, 2025

About this model

Text Embedding 3 Large is OpenAI's flagship embedding model, designed to convert text into numerical vectors that capture semantic relatedness. These embeddings power search, clustering, recommendations, anomaly detection, classification, and retrieval-augmented generation pipelines. OpenAI describes it as its most capable embedding model for both English and non-English tasks. It sits alongside its lighter counterpart, Text Embedding 3 Small, which trades some quality for lower cost and latency.
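As a rough illustration of how such vectors are typically compared, the sketch below ranks documents by cosine similarity to a query. The function names and the three-dimensional toy vectors are made up for illustration. Real vectors from this model have up to 3072 dimensions and would come from an embeddings API call, which is not shown here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scores = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for real embeddings (hypothetical values).
docs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.0],
}
query = [1.0, 0.0, 0.0]
ranking = rank_by_similarity(query, docs)
```

Search, clustering, and recommendation pipelines generally build on this kind of pairwise comparison, usually through a vector index instead of a linear scan.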

Compared with the previous-generation text-embedding-ada-002 model it replaces, the version 3 family delivers measurably stronger results. OpenAI reports that on the MIRACL multilingual retrieval benchmark, the smaller version 3 model lifts the average score from 31.4% to 44.0% versus ada-002, while the English-focused MTEB average rises from 61.0% to 62.3%. For Text Embedding 3 Large, OpenAI reports larger gains still: 54.9% on MIRACL and 64.6% on MTEB.

A notable improvement is native dimension flexibility. Developers can pass a dimensions parameter to shorten embeddings without the vectors losing their concept-representing properties. OpenAI notes that, on MTEB, a Text Embedding 3 Large embedding shortened to 256 dimensions can still outperform an unshortened ada-002 embedding of 1536 dimensions. A team whose vector store accepts at most 1024 dimensions can therefore request 1024-dimensional vectors instead of the full 3072, trading some accuracy for a smaller storage footprint.
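The usual route is to pass the dimensions parameter on the request itself. When a vector has already been generated at full size, a client-side shortening along the following lines is one possible alternative: keep the leading components, then rescale to unit length so that cosine and dot-product comparisons still behave. This is a minimal sketch, not the provider's implementation. The helper names are hypothetical, and the storage figure assumes vectors are stored as 4-byte float32 values.

```python
import math

def shorten_embedding(vec, dims):
    """Keep the first `dims` components and rescale the result to unit length."""
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # Nothing to rescale; return the truncated zeros unchanged.
        return list(head)
    return [x / norm for x in head]

def float32_bytes(dims):
    """Raw size of one vector, assuming 4-byte float32 components."""
    return dims * 4

# Toy 4-dimensional unit vector standing in for a 3072-dimension embedding.
full = [0.5, 0.5, 0.5, 0.5]
short = shorten_embedding(full, 2)
```

Under the float32 assumption, a 3072-dimension vector takes 12,288 bytes and a 1024-dimension vector takes 4,096 bytes, before any index overhead.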

This combination of higher baseline quality, multilingual coverage, and adjustable vector sizes makes Text Embedding 3 Large a versatile default for production retrieval systems, with the small sibling available when cost matters more than maximum accuracy.

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies, so verify critical details against the sources listed below.

Data sources: Venice API · HuggingFace · Wikipedia