Alibaba·💬 Text Generation

Qwen 3 235B A22B Thinking 2507

Reasoning · Function Calling · Web Search · fp8 · private

Quick reference

Qwen 3 235B A22B Thinking 2507 — TLDR

🧠 Reasoning-only model tuned for deep, multi-step problem solving
🏢 Built by Alibaba's Qwen team, Apache 2.0 licensed
🔧 Mixture-of-Experts: 235B total, 22B activated per token
📏 Model card cites native 262,144-token (256K) context
🆕 Split from dual-mode predecessor into dedicated thinking variant
⚡ Served here in FP8 for efficient deployment
🎯 Increased "thinking length" for highly complex tasks
🔧 Strong tool-calling and agentic use via Qwen-Agent
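
To make the Mixture-of-Experts figures concrete, here is a minimal Python sketch of the arithmetic. The 235B and 22B figures come from the list above; the 128-expert and 8-active figures come from the About section of this page. The variable names are illustrative only.

```python
from fractions import Fraction

# Figures quoted on this page (parameter counts in billions).
TOTAL_PARAMS_B = 235
ACTIVE_PARAMS_B = 22
TOTAL_EXPERTS = 128
ACTIVE_EXPERTS = 8

# Share of parameters that take part in computing any single token.
active_param_share = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Share of experts selected per token in a MoE layer.
active_expert_share = Fraction(ACTIVE_EXPERTS, TOTAL_EXPERTS)

print(f"Active parameters per token: {active_param_share:.1%}")
print(f"Active experts per token: {active_expert_share} of the pool")
```

The parameter share is larger than the expert share because components outside the expert pool, such as attention, are used for every token.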

💰 Pricing

$0.450 / $3.50

per 1M · input / output
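
As a rough guide to what these rates mean per request, here is a minimal Python sketch of a cost estimate. It assumes that reasoning tokens are billed as output tokens, which is common for thinking models but is not stated on this page; `estimate_cost_usd` is a hypothetical helper, not part of any Venice API.

```python
from decimal import Decimal

# Listed rates in USD per 1M tokens, taken from the pricing block above.
INPUT_USD_PER_M = Decimal("0.450")
OUTPUT_USD_PER_M = Decimal("3.50")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD.

    Assumption: output_tokens includes the reasoning trace as well as
    the final answer.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_M
        + Decimal(output_tokens) * OUTPUT_USD_PER_M
    )
    return total / TOKENS_PER_M


# Example: a 10,000-token prompt with 4,000 tokens of reasoning plus answer.
print(estimate_cost_usd(10_000, 4_000))
```

Because output is priced at roughly eight times the input rate, long reasoning traces are likely to dominate the bill for this model.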

📏 Context

128K tokens

📅 On Venice since

Apr 29, 2025

445 days ago

Provider

Alibaba

Alibaba Group is a Chinese multinational technology company founded in 1999 and headquartered in Hangzhou, Zhejiang. Originally built around e-commerce and cloud computing, Alibaba has become one of the most prolific contributors to open-weight AI research,…

51 models on Venice

20 video · 18 text · 5 image · 4 inpaint · 2 embedding · 2 tts

Since Jan 11, 2025

About this model

Qwen 3 235B A22B Thinking 2507 is a reasoning-specialized large language model from Alibaba's Qwen team, released under the Apache 2.0 license. Architecturally it is a Mixture-of-Experts transformer with 235 billion total parameters, of which roughly 22 billion are activated per token. It has 94 layers with grouped-query attention, and its expert layers route each token to 8 of 128 experts. It always operates in "thinking" mode, emitting a reasoning trace before its final answer, and is positioned for in-depth research, technical work, and long, complex documents.
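
Since the model always emits a reasoning trace before its answer, client code often needs to separate the two. The following is a minimal Python sketch, assuming the raw completion text marks the end of the reasoning with a closing `</think>` tag and may omit the opening `<think>` tag. That tag convention is an assumption here and should be checked against the model card; some APIs may instead return the reasoning in a separate field, in which case no parsing is needed. `split_reasoning` is a hypothetical helper.

```python
def split_reasoning(text: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split raw completion text into (reasoning, answer).

    Assumption: everything before the last close_tag is reasoning and
    everything after it is the final answer. If the tag is absent, the
    whole text is treated as the answer.
    """
    head, sep, tail = text.rpartition(close_tag)
    if not sep:
        return "", text.strip()
    # Drop an opening tag if the model or serving layer included one.
    reasoning = head.strip().removeprefix("<think>").strip()
    return reasoning, tail.strip()


raw = "First, factor the number.\nThen check each factor.\n</think>\n\nThe answer is 42."
reasoning, answer = split_reasoning(raw)
print(answer)
```

Splitting on the last closing tag rather than the first keeps the answer intact even if the reasoning itself happens to mention the tag.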

The most direct family comparison is to the original Qwen3-235B-A22B, which combined thinking and non-thinking behavior in a single switchable model. With the 2507 refresh, Qwen split that design into two dedicated checkpoints: a non-thinking Qwen 3 235B A22B Instruct 2507 and this thinking-only release. Qwen describes the 2507 line as a significant enhancement over the previous version, including extended 256K long-context understanding. It also notes that this thinking variant has an increased thinking length and recommends it for the hardest reasoning tasks.

Per the model card, it natively handles a 262,144-token context, and an optional configuration extends inputs toward one million tokens with sparse attention. The catalog exposes a 128K-token window in an FP8 quantization for more efficient serving.
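
The gap between the native context and the window exposed here matters when budgeting long prompts, because the reasoning trace and the answer must fit in whatever the prompt leaves free. The following is a minimal Python sketch of that check. It assumes "128K" means 131,072 tokens (it could also mean 128,000) and that prompt and output share a single window; `max_output_budget` is a hypothetical helper.

```python
# Native context length cited from the model card above.
NATIVE_CONTEXT_TOKENS = 262_144
# Assumption: the catalog's "128K" window is 128 * 1024 tokens.
CATALOG_CONTEXT_TOKENS = 128 * 1024


def max_output_budget(prompt_tokens: int, window: int = CATALOG_CONTEXT_TOKENS) -> int:
    """Return how many tokens remain for reasoning plus answer.

    Assumption: the prompt and the generated tokens share one window.
    Returns 0 when the prompt alone fills or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(window - prompt_tokens, 0)


# A 100,000-token document leaves limited room for a long reasoning trace.
print(max_output_budget(100_000))
# The same check against the native window, for self-hosted deployments.
print(max_output_budget(200_000, NATIVE_CONTEXT_TOKENS))
```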

Within the broader Qwen lineup, it sits alongside vision-language siblings such as Qwen3 VL 235B and the efficiency-focused Qwen 3 Next 80B. It retains strong tool-calling and agentic integration through the Qwen-Agent framework.

Sources

Qwen3: Think Deeper, Act Faster | Qwen (qwenlm.github.io)

Qwen/Qwen3-235B-A22B-Thinking-2507 · Hugging Face (huggingface.co)

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Research & Papers

Primary reference paper for this model family, sourced from the HuggingFace model card.

arXiv 2505.09388 · May 2025

Qwen3 Technical Report (2025)

An Yang, Anfeng Li, Baosong Yang et al.

In this work, we present Qwen3, the latest version of the Qwen model family. Qwen3 comprises a series of large language models (LLMs) designed to advance performance, efficiency, and multilingual capabilities. The Qwen3 series includes models of both dense and Mixture-of-Expert…

Data sources: Venice API · HuggingFace · Wikipedia · arXiv — enrichment updated 4d ago