
Qwen 3.5 35B A3B

Reasoning · Vision · Code · Function Calling · Web Search · private


Quick reference

Qwen 3.5 35B A3B — TLDR

🧠 35B-parameter mixture-of-experts activating only ~3B per token
📏 Native 256K-token context window
👁️ Multimodal model handling both text and vision inputs
🔧 Built for reasoning, coding, agents, function calling, web search
🏢 From Alibaba's Qwen team, released under Apache 2.0
🆕 Provider reports it surpasses the larger Qwen3-235B-A22B
⚡ Sparse activation targets efficient, lower-cost inference

💰 Pricing

$0.313 / $1.25

per 1M · input / output
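
As a rough illustration of how the listed per-million rates translate into a per-request cost, here is a minimal Python sketch. The function name `estimate_cost_usd` and the example token counts are hypothetical; only the two prices come from the listing, and actual billing may differ (for example through rounding or cached-token discounts, which are not modeled here).

```python
# Prices copied from the listing above (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.313
OUTPUT_PRICE_PER_M = 1.25


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 200,000 prompt tokens and 4,000 completion tokens.
    print(f"${estimate_cost_usd(200_000, 4_000):.4f}")  # prints "$0.0676"
```

Because output tokens cost about four times as much as input tokens at these rates, long completions can outweigh a large prompt in the total.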

📏 Context

256K tokens

📅 On Venice since

Feb 25, 2026

144 days ago

Provider

Alibaba

Alibaba Group is a Chinese multinational technology company founded in 1999 and headquartered in Hangzhou, Zhejiang. Originally built around e-commerce and cloud computing, Alibaba has become one of the most prolific contributors to open-weight AI research,…


51 models on Venice

20 video · 18 text · 5 image · 4 inpaint · 2 embedding · 2 tts

Since Jan 11, 2025


About this model

Qwen 3.5 35B A3B is a sparse mixture-of-experts model from Alibaba's Qwen team, released in February 2026 under the Apache 2.0 license. It carries roughly 35 billion total parameters but activates only about 3 billion per token, a design intended to deliver large-model behavior at a fraction of the compute cost. The model is multimodal, accepting text and vision inputs, and supports a native 256K-token context window alongside reasoning, code-optimized generation, function calling, and web search.

Within the Qwen family, this release sits between smaller and larger 3.5-generation siblings such as Qwen 3.5 9B and the much larger Qwen 3.5 397B. According to the provider, the 35B A3B model surpasses the earlier, larger Qwen 3 235B A22B Instruct 2507 (itself a mixture-of-experts model, with about 22B active parameters) while being roughly 6.7 times smaller in total parameters. This generational efficiency gain is the company's own claim.
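
The ratios quoted in the description above can be checked with simple arithmetic. The sketch below uses the rounded parameter counts from the text (35B total, about 3B active, 235B for the older model), so the results are approximate; the variable names are illustrative only.

```python
# Approximate figures from the description above, in billions of parameters.
TOTAL_PARAMS_B = 35
ACTIVE_PARAMS_B = 3
OLDER_MODEL_TOTAL_B = 235

# Share of the model's weights used for any single token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# How many times larger the older model is in total parameters.
size_ratio = OLDER_MODEL_TOTAL_B / TOTAL_PARAMS_B

print(f"active share per token: {active_fraction:.1%}")  # 8.6%
print(f"total-parameter ratio:  {size_ratio:.1f}x")      # 6.7x
```

Note that the 6.7x figure compares total parameters only; comparing active parameters (about 22B versus about 3B) would give a different ratio.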

In practice, the model targets reasoning, coding, and general-knowledge tasks where its small active-parameter footprint keeps latency and serving costs low. Later Qwen releases such as Qwen 3.6 27B continue this compact mixture-of-experts direction. Because primary benchmark documentation specific to this checkpoint is limited, the description here is confined to verifiable architectural and licensing facts.


Sources

Qwen3.6-35B-A3B: Agentic Coding Power, Now Open to All (qwen.ai)

Qwen/Qwen3.5-35B-A3B · Hugging Face (huggingface.co)

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 4d ago