Alibaba·📐 Embeddings

Qwen3 Embedding 0.6B

private

Quick reference

Qwen3 Embedding 0.6B — TLDR

🧠 Smallest model in Alibaba's Qwen3 Embedding series, ~0.6B parameters.
📏 Handles context up to 32K tokens, embeddings up to 1024 dimensions.
🌐 Multilingual support spanning over 100 languages and code.
🔧 Built on dense Qwen3 foundation models for retrieval, clustering, classification.
🎯 Instruction-aware with flexible Matryoshka-style dimension representation.
🔒 Released under permissive Apache 2.0 license.
⚡ Lightweight footprint suited to efficient, low-latency deployments.
📚 Companion reranking models available across the same family.

💰 Pricing

$0.013

per 1M tokens
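
As a rough illustration of what the per-token rate above means in practice, the sketch below estimates the cost of embedding a corpus. The function name and the example corpus size are hypothetical. The only figure taken from this page is the $0.013 per 1M tokens rate, and actual billing may differ.

```python
# Rate listed on this page, in USD per one million tokens.
PRICE_PER_MILLION_TOKENS_USD = 0.013


def estimate_embedding_cost(
    total_tokens: int,
    price_per_million: float = PRICE_PER_MILLION_TOKENS_USD,
) -> float:
    """Return the estimated cost in USD for embedding `total_tokens` tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical corpus: 200,000 documents averaging 500 tokens each.
    tokens = 200_000 * 500
    print(f"{tokens:,} tokens -> ${estimate_embedding_cost(tokens):.2f}")
```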

📅 On Venice since

Apr 17, 2026

Provider

Alibaba

Alibaba Group is a Chinese multinational technology company founded in 1999 and headquartered in Hangzhou, Zhejiang. Originally built around e-commerce and cloud computing, Alibaba has become one of the most prolific contributors to open-weight AI research,…

51 models on Venice

20 video · 18 text · 5 image · 4 inpaint · 2 embedding · 2 tts

Since Jan 11, 2025

About this model

Qwen3 Embedding 0.6B is the entry-level member of Alibaba's Qwen3 Embedding series, a family purpose-built for text embedding and ranking tasks. Built on the dense Qwen3 foundation models, it inherits their multilingual, long-context, and reasoning capabilities while keeping a compact ~0.6B-parameter footprint. It accepts inputs of up to 32K tokens and produces embeddings of up to 1024 dimensions. Matryoshka-style representation allows smaller output dimensions when storage or speed matters, and customizable instruction prompts let developers tune embeddings for a specific retrieval task.
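
Matryoshka-style embeddings are commonly shortened by keeping the leading components of the full vector and re-normalizing. The sketch below shows that idea under the assumption that prefixes of the model's vectors remain meaningful. The toy 8-dimensional vector stands in for a real 1024-dimensional embedding, and `truncate_embedding` is a hypothetical helper, not part of any Qwen or Venice API.

```python
import math


def truncate_embedding(vector: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and L2-normalize the result."""
    if not 0 < dim <= len(vector):
        raise ValueError("dim must be between 1 and len(vector)")
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero prefix: nothing to normalize
    return [x / norm for x in head]


if __name__ == "__main__":
    # Toy stand-in for a full-size embedding (illustrative values only).
    full = [0.5, -0.5, 0.5, 0.5, 0.1, -0.2, 0.3, 0.0]
    short = truncate_embedding(full, 4)
    print(len(short), short)
```

Vectors truncated this way should be compared only with other vectors truncated to the same dimension.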

The series ships in three sizes (0.6B, 4B, and 8B), with 0.6B the smallest. Per the Qwen3 Embedding technical report, the larger 4B and 8B variants post the strongest MTEB, CMTEB, and code-retrieval scores, while the 0.6B model trades some accuracy for speed and lower memory use. That makes 0.6B the efficiency-oriented option for semantic search, recommendation, and document classification where latency and cost matter most.

For developers, the model integrates with sentence-transformers and is distributed in formats including GGUF for varied hardware. It also pairs with companion Qwen3 Reranker models, letting teams combine dense embedding retrieval with reranking in a single pipeline.
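
The two-stage pattern described above, dense retrieval followed by reranking, can be sketched without any model dependency. In this minimal example the vectors and the reranker scores are hard-coded toy values; in a real pipeline the vectors would come from the embedding model and `rerank_score` would call a reranker. All function names here are hypothetical.

```python
import math
from typing import Callable, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, doc_vecs, top_k: int) -> list[int]:
    """Stage 1: indices of the top_k documents by cosine similarity."""
    order = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return order[:top_k]


def retrieve_then_rerank(
    query_vec,
    doc_vecs,
    rerank_score: Callable[[int], float],
    top_k: int,
    final_k: int,
) -> list[int]:
    """Stage 2: reorder the stage-1 candidates by a reranker score."""
    candidates = retrieve(query_vec, doc_vecs, top_k)
    return sorted(candidates, key=rerank_score, reverse=True)[:final_k]


if __name__ == "__main__":
    query = [1.0, 0.0]
    docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
    # Hypothetical reranker scores keyed by document index.
    toy_scores = {0: 0.2, 1: 0.9, 2: 0.99, 3: 0.5}
    print(retrieve_then_rerank(query, docs, toy_scores.__getitem__, 3, 2))
```

The reranker only sees the candidates that survive the first stage, so document 2 is never considered here despite its high toy score; `top_k` controls that recall trade-off.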

Multilingual coverage spans over 100 languages. Alibaba advises writing instructions in English even for non-English content, since most of the instructions used in training were in English. Apache 2.0 licensing and broad ecosystem support make the model accessible for self-hosted retrieval-augmented generation and vector-database workloads.
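
To make the instruction advice concrete, the sketch below prepends an English task description to a non-English query. The `Instruct:` / `Query:` template is an assumption modeled on the usage example in the Hugging Face model card, so the exact format should be confirmed there. `build_instructed_query` is a hypothetical helper name.

```python
def build_instructed_query(task_description: str, query: str) -> str:
    """Prefix a query with an English task instruction.

    The template here is an assumption; check the model card for the
    exact string the model expects.
    """
    return f"Instruct: {task_description}\nQuery:{query}"


if __name__ == "__main__":
    # English instruction, German query.
    task = "Given a web search query, retrieve relevant passages that answer the query"
    print(build_instructed_query(task, "Was ist die Hauptstadt von China?"))
```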

Sources

Qwen3 Embedding: Advancing Text ... (arxiv.org)

Qwen/Qwen3-Embedding-0.6B · Hugging Face (huggingface.co)

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Research & Papers

Primary reference paper for this model family, sourced from the HuggingFace model card.

arXiv 2506.05176 · Jun 2025

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models (2025)

Yanzhao Zhang, Mingxin Li, Dingkun Long et al.

In this work, we introduce the Qwen3 Embedding series, a significant advancement over its predecessor, the GTE-Qwen series, in text embedding and reranking capabilities, built upon the Qwen3 foundation models. Leveraging the Qwen3 LLMs' robust capabilities in multilingual text…

Data sources: Venice API · HuggingFace · Wikipedia · arXiv