Alibaba·💬 Text Generation

Qwen 3 235B A22B Instruct 2507

Function CallingWeb Searchfp8private

🧠 Try in Intelligence →

Try on Venice.ai ↗

Quick reference

Qwen 3 235B A22B Instruct 2507 — TLDR

🧠 Mixture-of-experts: 235B total parameters, 22B active per token
🆕 Updated "non-thinking" refresh of the original Qwen3-235B-A22B
📏 Natively supports 256K context, extendable toward 1M tokens
🎯 Gains in instruction following, math, science, coding, tool use
🌐 Multilingual coverage across many languages and dialects
🔧 Function calling and web search; Qwen-Agent tooling support
🔒 Apache 2.0 licensed open weights
💬 Instruct-only mode; does not emit reasoning traces

💰 Pricing

$0.150 / $0.750

per 1M · input / output

📏 Context

128K tokens

📅 On Venice since

Apr 29, 2025

445 days ago

Provider

Alibaba

Alibaba Group is a Chinese multinational technology company founded in 1999 and headquartered in Hangzhou, Zhejiang. Originally built around e-commerce and cloud computing, Alibaba has become one of the most prolific contributors to open-weight AI research,…

Read full profile →

51 models on Venice

20 video · 18 text · 5 image · 4 inpaint · 2 embedding · 2 tts

Since Jan 11, 2025

Wikipedia ↗Official site ↗

See 50 other models from Alibaba →

About this model

Qwen 3 235B A22B Instruct 2507 is a Mixture-of-Experts large language model from Alibaba's Qwen team, with 235 billion total parameters but only about 22 billion activated per forward pass. It is the "2507" refresh of the original Qwen3-235B-A22B non-thinking mode, released as part of the Qwen3 series. Distributed under the Apache 2.0 license, it targets long-document research, technical work, and high-precision tasks, and is served here in FP8 quantization.

Compared with its same-family predecessor, the original Qwen3-235B-A22B, Qwen reports significant improvements in general capabilities — instruction following, logical reasoning, text comprehension, mathematics, science, coding and tool usage — plus substantial gains in multilingual long-tail knowledge and better alignment on subjective, open-ended tasks. The update also adds enhanced 256K-token long-context understanding, with model-card instructions for extending toward one million tokens. Unlike the original dual-mode design, this Instruct variant operates in non-thinking mode only and does not generate reasoning-trace blocks.

For workloads needing explicit step-by-step reasoning, Qwen released a parallel Qwen 3 235B A22B Thinking 2507 sibling that uses extended reasoning chains. Other related Qwen text models in the catalog include the efficiency-focused Qwen 3 Next 80B. Note that Venice exposes a 128K context window for this deployment, below the model's full native 256K capacity.

🤗View model card on HuggingFace ↗View source on GitHub ↗

Sources

Qwen/Qwen3-235B-A22B-Instruct-2507 · Hugging Facehuggingface.co ↗

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Research & Papers

Primary reference paper for this model family, sourced from the HuggingFace model card.

arXiv2505.09388May 2025

Qwen3 Technical Report(2025)

An Yang, Anfeng Li, Baosong Yang et al.

In this work, we present Qwen3, the latest version of the Qwen model family. Qwen3 comprises a series of large language models (LLMs) designed to advance performance, efficiency, and multilingual capabilities. The Qwen3 series includes models of both dense and Mixture-of-Expert…

Data sources: Venice API · HuggingFace · Wikipedia · arXiv — enrichment updated 4d ago