Z.ai·💬 Text Generation·VS Pick

GLM 5.2

ReasoningCodeFunction CallingWeb Searchfp8private

🧠 Try in Intelligence →

Try on Venice.ai ↗

Quick reference

GLM 5.2 — TLDR

🎯 Z.ai's newest flagship GLM, current incumbent of the line
📏 Massive 1M-token context window
🧠 Near-Opus reasoning with fast inference
🔧 Function calling and web search supported
📜 Open-weight under permissive MIT license
🌍 Strong multilingual and instruction-following ability

💰 Pricing

$1.40 / $4.40

per 1M · input / output

📏 Context

1M tokens

📅 On Venice since

Jun 16, 2026

46 days ago

Provider

Z.ai

Z.ai, formally Knowledge Atlas Technology Joint Stock Co., Ltd., is a Chinese technology company specializing in artificial intelligence. Previously known internationally as Zhipu AI, the company rebranded to Z.ai in 2025. Its core focus is the GLM family of…

Read full profile →

12 models on Venice

11 text · 1 image

Since Apr 1, 2024

Wikipedia ↗Official site ↗

See 11 other models from Z.ai →

About this model

GLM 5.2 is Z.ai's latest large language model and the current head of the open-source GLM line, released in June 2026. It pushes the series' reasoning, instruction-following, and multilingual strengths further, pairing them with a one-million-token context window and fp8 quantization for fast inference across long documents and extended analysis. Z.ai — the Chinese AI company formerly known as Zhipu AI — has shipped the GLM family under the permissive MIT license since mid-2025, and GLM 5.2 continues that fully open-weight tradition.

Within the lineup, GLM 5.2 supersedes the earlier GLM 5.1 (April 2026) and GLM 5 (February 2026), and stands above the prior-generation GLM 4.7. It builds on the same core line rather than the lighter Turbo and Flash variants, positioning it as the reasoning-focused workhorse of the family.

With reasoning, tool calling, and web search built in, GLM 5.2 is best suited to agentic workflows, long-context retrieval and analysis, and multilingual tasks where capable reasoning at a lower cost than frontier closed models matters.

🤗View model card on HuggingFace ↗View source on GitHub ↗

This About section is AI-generated from public sources (Claude Opus 5), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Research & Papers

2 reference papers linked from the HuggingFace model card.

arXiv2602.15763Feb 2026

GLM-5: from Vibe Coding to Agentic Engineering(2026)

GLM-5-Team, :, Aohan Zeng et al.

We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference…

arXiv2603.12201Mar 2026

IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse(2026)

Yushi Bai, Qian Dong, Ting Jiang et al.

Long-context agentic workflows have emerged as a defining use case for large language models, making attention efficiency critical for both inference speed and serving cost. Sparse attention addresses this challenge effectively, and DeepSeek Sparse Attention (DSA) is a…

Data sources: Venice API · HuggingFace · Wikipedia · arXiv — enrichment updated 3d ago