About this model
GLM 5.2 is Z.ai's confidential-compute flagship for long-horizon tasks, packaging the standard GLM 5.2 release inside a Trusted Execution Environment that produces hardware attestation evidence for independent verification. Z.ai positions it as a model built for project-level engineering: planning, executing, and refactoring across an entire codebase with enhanced reasoning. The catalog lists a one-million-token context window, supporting whole-repository operations in a single session.
Within this TEE-hosted family, the previous entry was GLM 4.7; in the broader GLM lineage, the immediate predecessor is GLM 5.1. GLM 5.2 advances that line with a larger context window and a stated focus on long-horizon, multi-file agentic work, alongside built-in web search and tool calling.
On the underlying capability, Z.ai's own documentation reports that GLM-5 reached open-model scores of 77.8 on SWE-bench Verified and 56.2 on Terminal Bench 2.0, and that it showed substantial gains over GLM-4.7 across frontend, backend, and long-horizon execution tasks. These figures are vendor-reported and describe the GLM-5 generation rather than 5.2 specifically.
The model is offered under an MIT license. Treat early generational performance comparisons with caution until a 5.2-specific technical report is available.
This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies, so verify critical details against the listed data sources.
Research & Papers
2 reference papers linked from the HuggingFace model card.
GLM-5: from Vibe Coding to Agentic Engineering (2026)
GLM-5-Team, Aohan Zeng et al.
We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference…
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse (2026)
Yushi Bai, Qian Dong, Ting Jiang et al.
Long-context agentic workflows have emerged as a defining use case for large language models, making attention efficiency critical for both inference speed and serving cost. Sparse attention addresses this challenge effectively, and DeepSeek Sparse Attention (DSA) is a…
Data sources: Venice API · HuggingFace · Wikipedia · arXiv