Alibaba · 💬 Text Generation

Qwen 3.6 35B A3B FP8 🔒 Private

Reasoning · Code · Function Calling · Web Search · E2EE · fp8 · private

Quick reference

Qwen 3.6 35B A3B FP8 — TLDR

🆕 First open-weight model in Alibaba's Qwen3.6 series
🧠 Mixture-of-experts: 35B total, ~3B active per token
🔧 Gated-delta-networks MoE, 256 experts, 8 routed plus 1 shared
💬 Switchable thinking and non-thinking inference modes
🎯 Tuned for agentic coding and repository-level reasoning
🔒 Served inside a Trusted Execution Environment with hardware attestation
📏 32K context window in this deployment; FP8 quantization
⚡ Compact active footprint enables fast, lower-cost inference

💰 Pricing

$0.182 / $1.18

per 1M · input / output
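
To make the per-million rates concrete, here is a minimal sketch of a per-request cost estimate. The two prices come from this page; the function name `estimate_cost_usd` and the example token counts are illustrative assumptions, and actual billing may round or meter differently.

```python
# Per-million-token prices as listed on this page (USD).
INPUT_PRICE_PER_M = 0.182
OUTPUT_PRICE_PER_M = 1.18


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed rates (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 20,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(20_000, 2_000):.4f}")
```

At these rates, output tokens cost roughly 6.5 times as much as input tokens, so long thinking-mode responses are likely to dominate the bill.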

📏 Context

32K tokens

📅 On Venice since

May 20, 2026

Provider

Alibaba

Alibaba Group is a Chinese multinational technology company founded in 1999 and headquartered in Hangzhou, Zhejiang. Originally built around e-commerce and cloud computing, Alibaba has become one of the most prolific contributors to open-weight AI research,…

51 models on Venice

20 video · 18 text · 5 image · 4 inpaint · 2 embedding · 2 tts

Since Jan 11, 2025

About this model

Qwen 3.6 35B A3B FP8 is Alibaba's first open-weight release in the Qwen3.6 line: a sparse mixture-of-experts model with 35 billion total parameters, of which only about 3 billion are active per token. Architecturally it shares the gated-delta-networks MoE design of the 3.5 generation, routing each token to 8 of 256 experts plus one shared expert, and it supports switchable thinking and non-thinking modes. This Venice deployment runs the official FP8 checkpoint inside a Trusted Execution Environment and exposes hardware attestation, so the enclave's identity and configuration can be independently verified.
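
The routing pattern described above (8 of 256 routed experts plus one shared expert per token) can be sketched in a few lines. This is a toy illustration, not Alibaba's implementation: the expert counts come from this page, while the function names, the softmax over only the selected logits, and the scalar stand-in "experts" are all assumptions made for the example.

```python
import math
import random

NUM_EXPERTS = 256  # routed expert pool, per this page
TOP_K = 8          # routed experts selected per token, per this page


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k experts by logit and softmax-normalize their weights.

    Normalizing over only the selected experts is an assumption of this sketch.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:top_k]
    peak = max(router_logits[i] for i in top)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_output(x, router_logits, routed_experts, shared_expert):
    """Weighted sum of the selected routed experts plus the always-on shared expert."""
    out = shared_expert(x)
    for idx, weight in route_token(router_logits):
        out += weight * routed_experts[idx](x)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy scalar experts: expert i scales its input by (i + 1) / NUM_EXPERTS.
    experts = [lambda x, s=(i + 1) / NUM_EXPERTS: s * x for i in range(NUM_EXPERTS)]
    shared = lambda x: 0.5 * x
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    print(route_token(logits))
    print(moe_output(1.0, logits, experts, shared))
```

Only 9 expert functions run for each token here, regardless of how many experts exist, which is the mechanism behind the gap between 35B total and roughly 3B active parameters.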

Compared with its direct predecessor, Qwen 3.5 35B A3B, Alibaba reports that Qwen3.6 improves agentic coding and reasoning and adds a thinking-preservation capability for steadier long agent runs. Alibaba also states that the model rivals its dense Qwen 3.6 27B sibling on several coding benchmarks despite activating far fewer parameters per token. These claims are vendor self-reported and have not yet been independently reproduced.

Within the Qwen family on this catalog, it sits alongside the earlier Qwen3 30B A3B enclave model and an uncensored 3.6 variant. It is licensed under Apache 2.0, and its listed capabilities span reasoning, code, function calling, and web search.

Sources

Qwen3.6-35B-A3B: Agentic Coding Power, Now Open to All (qwen.ai)

Qwen/Qwen3.6-35B-A3B (and FP8) has landed - DGX Spark / GB10 - NVIDIA Developer Forums (forums.developer.nvidia.com)

Qwen/Qwen3.5-35B-A3B · Hugging Face (huggingface.co)

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 2d ago