About this model
GLM 5 is the fifth-generation large language model from Z.ai (formerly Zhipu AI), released on February 11, 2026, and distributed under the MIT license. It is a Mixture-of-Experts model with roughly 744 billion total parameters, of which about 40 billion are active per token, designed for advanced reasoning, code generation, function calling, and long-horizon agentic workflows. The model card and primary docs describe a context window of roughly 200K tokens, and Z.ai publishes the weights in both BF16 and FP8 formats.
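To give a feel for what the two published weight formats imply, here is a rough sketch of weight-storage arithmetic using the parameter counts quoted above. It assumes 2 bytes per parameter for BF16 and 1 byte for FP8 (the nominal widths of those formats) and ignores the KV cache, activations, and any tensors a real checkpoint may keep at higher precision, so actual file sizes will differ. The function name `weight_gigabytes` is illustrative only.

```python
# Back-of-the-envelope weight-storage estimate. Assumption: every parameter is
# stored at the nominal width of the format; real checkpoints may mix precisions.
TOTAL_PARAMS = 744e9   # total parameters, as quoted in the text
ACTIVE_PARAMS = 40e9   # active parameters, as quoted in the text

BYTES_PER_PARAM = {"BF16": 2, "FP8": 1}  # nominal bytes per parameter


def weight_gigabytes(num_params: float, fmt: str) -> float:
    """Approximate storage in GB (10^9 bytes) for num_params weights in fmt."""
    return num_params * BYTES_PER_PARAM[fmt] / 1e9


for fmt in BYTES_PER_PARAM:
    total_gb = weight_gigabytes(TOTAL_PARAMS, fmt)
    active_gb = weight_gigabytes(ACTIVE_PARAMS, fmt)
    print(f"{fmt}: ~{total_gb:,.0f} GB for all weights, ~{active_gb:,.0f} GB active")
```

Under these assumptions the full weights come to roughly 1.49 TB in BF16 and 0.74 TB in FP8. The FP8 format therefore halves the storage needed to hold every expert.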
Compared with predecessors in the same family, such as GLM 4.7 and the earlier GLM 4.6, GLM 5 represents a substantial scale-up. Z.ai's model card notes that the architecture grew from GLM-4.5's 355B parameters (32B active) to 744B (40B active), while pre-training data expanded from 23T to 28.5T tokens. A key architectural change is the adoption of DeepSeek Sparse Attention (DSA), a sparse attention mechanism that dynamically allocates attention to reduce training and inference cost while preserving long-context fidelity.
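The scale-up figures quoted above are easier to compare as ratios. The sketch below uses only the numbers from the model card as cited in the text. The dictionary layout and the helper names `active_fraction` and `growth` are illustrative assumptions.

```python
# Figures as quoted from the model card in the text; nothing here is measured.
models = {
    "GLM-4.5": {"total_b": 355.0, "active_b": 32.0, "tokens_t": 23.0},
    "GLM-5": {"total_b": 744.0, "active_b": 40.0, "tokens_t": 28.5},
}


def active_fraction(spec: dict) -> float:
    """Share of the total parameters that are active."""
    return spec["active_b"] / spec["total_b"]


def growth(key: str) -> float:
    """Ratio of the GLM-5 figure to the GLM-4.5 figure for one field."""
    return models["GLM-5"][key] / models["GLM-4.5"][key]


for name, spec in models.items():
    print(f"{name}: {active_fraction(spec):.1%} of parameters active")

for key in ("total_b", "active_b", "tokens_t"):
    print(f"{key}: grew by a factor of {growth(key):.2f}")
```

On these numbers, total parameters grow by about 2.1x while active parameters grow by only 1.25x, so the active share falls from about 9.0% to about 5.4%. Pre-training data grows by about 1.24x. In other words, most of the added capacity sits in experts that are not used for any given token.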
On the training side, GLM 5 uses a new asynchronous reinforcement-learning framework called "slime" that decouples generation from training to improve post-training efficiency. According to the GLM-5 technical report, the model "significantly outperforms GLM-4.7 across frontend, backend, and long-horizon tasks." The report also describes mid-training that progressively extends the context from 4K to 200K tokens.
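The idea of decoupling generation from training can be illustrated with a plain producer-consumer pattern. The following is a toy sketch only. It does not use or reproduce slime's actual API, and every name in it (`generate_rollouts`, `train`, `rollout_queue`, the fake reward) is hypothetical. It shows one thread producing rollouts while another consumes them, so that neither has to wait for the other to finish a full batch.

```python
import queue
import threading

# Toy illustration of decoupled generation and training.
# Assumption: this is NOT slime's API; all names are made up for the sketch.
NUM_ROLLOUTS = 8
SENTINEL = None  # marks the end of the rollout stream
rollout_queue = queue.Queue(maxsize=4)  # bounded buffer between the two sides


def generate_rollouts() -> None:
    """Producer: stands in for the inference side that samples trajectories."""
    for step in range(NUM_ROLLOUTS):
        rollout = {"id": step, "reward": float(step % 3)}  # fake trajectory
        rollout_queue.put(rollout)  # blocks only when the buffer is full
    rollout_queue.put(SENTINEL)


def train(consumed: list) -> None:
    """Consumer: stands in for the trainer that updates on arriving rollouts."""
    while True:
        rollout = rollout_queue.get()  # blocks only when the buffer is empty
        if rollout is SENTINEL:
            break
        consumed.append(rollout["id"])  # a real trainer would take a gradient step


consumed_ids = []
producer = threading.Thread(target=generate_rollouts)
consumer = threading.Thread(target=train, args=(consumed_ids,))
producer.start()
consumer.start()
producer.join()
consumer.join()
print(f"trained on {len(consumed_ids)} rollouts")
```

The bounded queue is the design point here: generation can run ahead of training by a few rollouts, but not without limit. A real asynchronous RL system also has to handle stale policy weights and distributed workers, which this sketch leaves out.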
GLM 5 anchors a broad family that also includes the faster GLM 5 Turbo variant and the later GLM 5.1 update, positioning it as Z.ai's open-weight foundation for agentic engineering.
This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies; verify critical details against the listed data sources.
Research & Papers
Primary reference paper for this model family, sourced from the HuggingFace model card.
Data sources: Venice API · HuggingFace · Wikipedia · arXiv