AlibabaAlibaba·💬 Text Generation

Qwen 3.6 35B A3B FP8

ReasoningCodeFunction CallingWeb SearchE2EEfp8private
🧠 Try in Intelligence →Try on Venice.ai ↗
Quick reference
Qwen 3.6 35B A3B FP8 — TLDR
  • 🆕 Alibaba's Qwen3.6-generation mixture-of-experts text model, offered in FP8.
  • 🧠 35B total parameters with roughly 3B active per token.
  • 🔒 Runs inside a Trusted Execution Environment with hardware attestation.
  • 📏 Deployed here with a 32K-token context window.
  • 🔧 Supports function calling for tool-driven agent workflows.
  • 🌐 Built-in web search capability for retrieval tasks.
  • 🎯 Capabilities include reasoning and code-optimized generation.
  • 📚 Apache-2.0 licensed; official FP8 checkpoint published on Hugging Face.
💰 Pricing
$0.182 / $1.18
per 1M · input / output
📏 Context
32K tokens
📅 On Venice since
May 20, 2026
14 days ago
Provider

Alibaba Group is a Chinese multinational technology company founded in 1999 and headquartered in Hangzhou, Zhejiang. Originally built around e-commerce and cloud computing, Alibaba has become one of the most prolific contributors to open-weight AI research,…

Read full profile →
46 models on Venice
17 text · 16 video · 5 image · 4 inpaint · 2 embedding · 2 tts
Since Jan 11, 2025

About this model

Qwen 3.6 35B A3B FP8 is Alibaba's compact mixture-of-experts language model from the Qwen3.6 generation, offered here in an FP8 build that runs inside a Trusted Execution Environment so that enclave identity and configuration can be independently verified through hardware attestation. It carries 35B total parameters with roughly 3B activated per token, an architecture that keeps inference relatively light despite the larger total parameter count. The model is published under the Apache-2.0 license, with an official FP8 checkpoint available on Hugging Face.

Within this catalog's secure-enclave Qwen line, it follows the larger Qwen3.5 122B A10B and the earlier Qwen3 30B A3B. Compared with those predecessors, this entry pairs a 35B sparse MoE design with the same ~3B active-parameter routing as the 30B variant while representing the newer 3.6 generation.

The model lists reasoning, code-optimized generation, function calling, and web search among its capabilities, making it suited to tool-driven agent workflows. Here it is deployed with a 32K-token context window. No independently reproduced benchmark figures were available at publication time, so generational performance is described only in factual, feature-level terms.

A separate Qwen3.6 35B A3B Uncensored variant is also available for users seeking fewer content restrictions.

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 6d ago