Z.ai·💬 Text Generation

GLM 5.2 · 🔒 Private

Reasoning · Code · Function Calling · Web Search · E2EE · private

Quick reference

GLM 5.2 — TLDR

- 🔒 Runs in a Trusted Execution Environment with hardware attestation evidence.
- 📏 One-million-token context window for project-level engineering work.
- 🔧 Coding-first flagship tuned for long-horizon agentic software tasks.
- 🧠 Enhanced reasoning with project-level engineering context.
- 🌐 Web search and tool calling supported.
- 📚 MIT-licensed model from Z.ai.
- 🏢 Z.ai's flagship for long-horizon tasks.

💰 Pricing

$1.75 / $5.75

per 1M · input / output
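
As a rough illustration of how the listed rates translate into a per-request cost, the sketch below multiplies token counts by the per-million prices shown above. It assumes flat per-token billing with no caching discounts, minimums, or tiered rates, none of which this page confirms; the function and constant names are hypothetical and not part of any Venice or Z.ai API.

```python
from decimal import Decimal

# Rates copied from the pricing line above (USD per 1M tokens).
INPUT_USD_PER_MILLION = Decimal("1.75")
OUTPUT_USD_PER_MILLION = Decimal("5.75")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD.

    Assumption: billing is strictly linear in token counts. Real invoices
    may differ if the provider applies discounts or rounding.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # A 200,000-token prompt with a 10,000-token reply.
    print(estimate_cost_usd(200_000, 10_000))
```

Decimal is used instead of float so that small per-request amounts add up without binary rounding drift when many requests are summed.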

📏 Context

524K tokens

📅 On Venice since

Jun 16, 2026

46 days ago

Provider

Z.ai

Z.ai, formally Knowledge Atlas Technology Joint Stock Co., Ltd., is a Chinese technology company specializing in artificial intelligence. Previously known internationally as Zhipu AI, the company rebranded to Z.ai in 2025. Its core focus is the GLM family of…

12 models on Venice

11 text · 1 image

Since Apr 1, 2024

About this model

GLM 5.2 is Z.ai's confidential-compute flagship for long-horizon tasks, packaging the standard GLM 5.2 release inside a Trusted Execution Environment that produces hardware attestation evidence for independent verification. Z.ai positions it as a model built for project-level engineering: planning, executing, and refactoring across an entire codebase with enhanced reasoning. This description cites a one-million-token context window for whole-repository operations in a single session; note that the Context field on this page lists 524K tokens.

Within this TEE-hosted family, the prior entry was GLM 4.7, and the immediate lineage predecessor is GLM 5.1. GLM 5.2 advances that line with its enlarged context window and a stated focus on long-horizon, multi-file agentic work, alongside built-in web search and tool calling.

On the underlying capability, Z.ai's own documentation reports that GLM-5 reached open-model scores of 77.8 on SWE-bench Verified and 56.2 on Terminal Bench 2.0, and that it showed substantial gains over GLM-4.7 across frontend, backend, and long-horizon execution tasks. These figures are vendor-reported and describe the GLM-5 generation rather than 5.2 specifically.

The model is offered under an MIT license. Treat early generational performance comparisons with caution until a 5.2-specific technical report is available.

Sources

GLM-5 - Overview - Z.AI DEVELOPER DOCUMENT · docs.z.ai

This About section is AI-generated from public sources (Claude Opus 5), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Research & Papers

2 reference papers linked from the HuggingFace model card.

arXiv 2602.15763 · Feb 2026

GLM-5: from Vibe Coding to Agentic Engineering (2026)

GLM-5-Team, Aohan Zeng et al.

We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference…

arXiv 2603.12201 · Mar 2026

IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse (2026)

Yushi Bai, Qian Dong, Ting Jiang et al.

Long-context agentic workflows have emerged as a defining use case for large language models, making attention efficiency critical for both inference speed and serving cost. Sparse attention addresses this challenge effectively, and DeepSeek Sparse Attention (DSA) is a…

Data sources: Venice API · HuggingFace · Wikipedia · arXiv — enrichment updated 3d ago