About this model
GLM 5.2 is Z.ai's next-generation large language model and the newest entry in the company's flagship GLM line, released in June 2026. Built by the Chinese AI firm formerly known as Zhipu AI, it advances the series with stronger reasoning, sharper instruction following, and broad multilingual support, and it is released under the permissive MIT License that Z.ai has applied to GLM releases since mid-2025.
Within the lineup, GLM 5.2 sits at the top of the core GLM family, succeeding GLM 5.1 (April 2026) and the earlier GLM 5 and GLM 4.7 generations. It pairs a 200K-token context window with fp8 quantization for efficient, fast inference, and ships with reasoning, function-calling, and web-search capabilities built in.
That combination makes GLM 5.2 well suited to demanding work: long-document analysis, multi-step reasoning tasks, tool-augmented agents, and multilingual applications where an open, MIT-licensed model is preferred. Users wanting the most capable current GLM should reach for this release, while older siblings remain available for those balancing cost or speed.
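For long-document work, it can help to check whether an input is likely to fit in the 200K-token window before sending it. The following is a minimal sketch, not an official tool: the function names, the 8,000-token output reserve, and the 4-characters-per-token ratio are all illustrative assumptions. Real token counts depend on the model's tokenizer and vary considerably by language, so treat the result as a rough estimate only.

```python
import math

# Context size as stated in the description above.
CONTEXT_WINDOW_TOKENS = 200_000


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. chars_per_token is an assumed heuristic,
    not a property of the GLM tokenizer."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return math.ceil(len(text) / chars_per_token)


def input_budget(reserved_output_tokens: int = 8_000) -> int:
    """Tokens left for input after reserving room for the reply
    (the reserve size is an assumption)."""
    budget = CONTEXT_WINDOW_TOKENS - reserved_output_tokens
    if budget <= 0:
        raise ValueError("reserved_output_tokens leaves no input budget")
    return budget


def fits_in_context(text: str, reserved_output_tokens: int = 8_000,
                    chars_per_token: float = 4.0) -> bool:
    """True if the estimated token count fits within the input budget."""
    return estimate_tokens(text, chars_per_token) <= input_budget(reserved_output_tokens)


def split_to_fit(text: str, reserved_output_tokens: int = 8_000,
                 chars_per_token: float = 4.0) -> list[str]:
    """Split text into consecutive chunks that each fit the estimated budget.
    Splits on raw character offsets; an empty string yields an empty list."""
    max_chars = int(input_budget(reserved_output_tokens) * chars_per_token)
    if max_chars < 1:
        raise ValueError("budget too small for the given chars_per_token")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

A caller might run `fits_in_context(document)` first and fall back to `split_to_fit(document)` only when it returns `False`. Splitting on character offsets is the simplest choice here; a production pipeline would more likely split on paragraph or section boundaries and count tokens with the actual tokenizer.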
This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.
Research & Papers
Two reference papers are linked from the Hugging Face model card.
GLM-5: from Vibe Coding to Agentic Engineering (2026)
GLM-5-Team: Aohan Zeng et al.
We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference…
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse (2026)
Yushi Bai, Qian Dong, Ting Jiang et al.
Long-context agentic workflows have emerged as a defining use case for large language models, making attention efficiency critical for both inference speed and serving cost. Sparse attention addresses this challenge effectively, and DeepSeek Sparse Attention (DSA) is a…
Data sources: Venice API · HuggingFace · Wikipedia · arXiv