Alibaba · 💬 Text Generation

Qwen 3 235B A22B Thinking 2507

Reasoning · Function Calling · Web Search · fp8 · private
Quick reference
Qwen 3 235B A22B Thinking 2507 — TLDR
  • 🧠 Mixture-of-Experts reasoning model: 235B total, 22B active per pass.
  • 🆕 Dedicated thinking-only variant separate from the original hybrid Qwen3 design.
  • 📏 Native 262K-token context for long-document work.
  • 🏢 Built by Alibaba's Qwen team, released 2025 under Apache 2.0.
  • 🔧 Function calling and agentic tool use via Qwen-Agent.
  • ⚡ FP8 quantization for more efficient deployment.
  • 🎯 Tuned for math, science, coding, and long-document research.
  • 📚 Increased thinking length recommended for highly complex tasks.
💰 Pricing
$0.450 input / $3.50 output
per 1M tokens
📏 Context
128K tokens
📅 On Venice since
Apr 29, 2025
400 days ago
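
As a rough illustration of the listed rates, the sketch below estimates the cost of a single request. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes that reasoning tokens are billed at the output rate, which this page does not state.

```python
# Listed rates from this page, in USD per 1M tokens.
INPUT_USD_PER_M = 0.45
OUTPUT_USD_PER_M = 3.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Assumption: thinking (reasoning) tokens count as output tokens.
    """
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # 20K prompt tokens plus 8K generated tokens (reasoning + answer).
    print(f"${estimate_cost_usd(20_000, 8_000):.4f}")  # prints $0.0370
```

Because a thinking model tends to generate long traces, the output rate is likely to dominate the bill for short prompts.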
Provider

Alibaba Group is a Chinese multinational technology company founded in 1999 and headquartered in Hangzhou, Zhejiang. Originally built around e-commerce and cloud computing, Alibaba has become one of the most prolific contributors to open-weight AI research,…

46 models on Venice
17 text · 16 video · 5 image · 4 inpaint · 2 embedding · 2 tts
Since Jan 11, 2025

About this model

Qwen 3 235B A22B Thinking 2507 is Alibaba's reasoning-focused refresh of its flagship Mixture-of-Experts model. Per its model card, it activates roughly 22 billion of 235 billion total parameters per forward pass, organized across 94 layers with 128 experts (8 active per token), and natively handles up to 262,144 tokens of context. The catalog lists a 128K window and FP8 quantization for this deployment.
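
The figures above can be put side by side with a little arithmetic. This is only a sketch: the helper `fits_in_window` is hypothetical, and treating the catalog's "128K" as exactly 128,000 tokens is an assumption (it could equally mean 131,072).

```python
# Figures quoted from the model card and the catalog listing.
TOTAL_PARAMS_B = 235       # total parameters, in billions
ACTIVE_PARAMS_B = 22       # parameters active per forward pass, in billions
NUM_EXPERTS = 128
ACTIVE_EXPERTS = 8         # experts routed per token
NATIVE_CONTEXT = 262_144   # equals 2**18
CATALOG_CONTEXT = 128_000  # assumption: exact value behind "128K"

# Share of parameters used per forward pass (roughly 9.4%).
active_param_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Share of experts used per token (6.25%).
active_expert_fraction = ACTIVE_EXPERTS / NUM_EXPERTS


def fits_in_window(prompt_tokens: int, max_new_tokens: int, window: int) -> bool:
    """Check that the prompt plus the generation budget fits in the window."""
    return prompt_tokens + max_new_tokens <= window


if __name__ == "__main__":
    print(f"active parameters: {active_param_fraction:.1%}")
    print(f"active experts:    {active_expert_fraction:.2%}")
    print(fits_in_window(120_000, 20_000, CATALOG_CONTEXT))  # False
    print(fits_in_window(120_000, 20_000, NATIVE_CONTEXT))   # True
```

One plausible reason the parameter share exceeds the expert share is that non-expert components, such as attention and embeddings, run for every token.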

The key change from the previous release in the same family is structural. The original Qwen3-235B-A22B could switch between thinking and non-thinking modes within a single model. The July 2025 "2507" update split that hybrid design into two specialized siblings: the non-thinking Qwen 3 235B A22B Instruct 2507 and this Thinking variant, which always emits reasoning traces. Alibaba describes the Thinking model as having increased thinking length and enhanced long-context understanding, and recommends it for highly complex reasoning.
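
Because this variant always emits a reasoning trace, client code usually needs to separate the trace from the final answer. The sketch below assumes the trace ends with a closing `</think>` tag and that an opening `<think>` tag may or may not be present. A hosting API may instead strip the trace or return it in a separate field, so treat the tag format and the helper name `split_reasoning` as assumptions.

```python
def split_reasoning(text: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split raw model output into (reasoning, answer).

    Assumption: the reasoning trace ends at the last occurrence of close_tag.
    If the tag is absent, the whole text is treated as the answer.
    """
    head, sep, tail = text.rpartition(close_tag)
    if not sep:
        return "", text.strip()
    # Drop an opening tag if the serving stack included one.
    reasoning = head.strip().removeprefix("<think>").strip()
    return reasoning, tail.strip()


if __name__ == "__main__":
    raw = "First add 2 and 2.\nThat gives 4.</think>\n\nThe answer is 4."
    reasoning, answer = split_reasoning(raw)
    print(answer)  # prints: The answer is 4.
```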

According to the Qwen team's model card, this release improves on the previous generation in reasoning, instruction-following, and agentic capabilities. The model supports tool calling through Qwen-Agent and targets technical work: multi-step mathematics, scientific analysis, algorithmic coding, and detailed document processing. Its open weights are released under the permissive Apache 2.0 license, which makes the model suitable for self-hosted research and enterprise pipelines.
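
To make the tool-calling flow concrete, here is a minimal local sketch of the client side: a tool described in the widely used OpenAI-style function schema, plus a dispatcher that runs the call the model requests. It does not use Qwen-Agent's own API. The tool `word_count`, the helper `dispatch_tool_call`, and the assumption that a given deployment accepts this schema shape are all illustrative.

```python
import json


def word_count(text: str) -> int:
    """Hypothetical example tool: count whitespace-separated words."""
    return len(text.split())


# Registry mapping tool names to local Python callables.
TOOLS = {"word_count": word_count}

# Tool description in the common OpenAI-style "function" format (assumed).
TOOL_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "word_count",
            "description": "Count whitespace-separated words in a text.",
            "parameters": {
                "type": "object",
                "properties": {"text": {"type": "string"}},
                "required": ["text"],
            },
        },
    }
]


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run a requested tool and return a JSON string to send back to the model."""
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    args = json.loads(arguments_json)
    return json.dumps({"result": TOOLS[name](**args)})


if __name__ == "__main__":
    # Simulated tool call, shaped like what a model might emit.
    print(dispatch_tool_call("word_count", '{"text": "open weights matter"}'))
```

In a real agent loop, the returned JSON string would be appended to the conversation as the tool result before the model is asked to continue.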

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies; verify critical details against the data sources listed below.

Research & Papers

Primary reference paper for this model family, sourced from the HuggingFace model card.

Data sources: Venice API · HuggingFace · Wikipedia · arXiv — enrichment updated 1d ago