Z.ai·💬 Text Generation

GLM 5.1🔒Private

ReasoningWeb SearchE2EEfp8private

🧠 Try in Intelligence →

Try on Venice.ai ↗

Quick reference

GLM 5.1 — TLDR

🔒 Runs in a Trusted Execution Environment with hardware attestation.
🧠 Post-training upgrade to GLM-5 for agentic engineering work.
📏 200,000-token context window, FP8 precision.
🏢 Large Mixture-of-Experts design per NVIDIA's model card.
⚡ Z.ai documents up to 8 hours of autonomous task execution.
🔧 Reasoning, web search, and tool-calling agent capabilities.
📚 Open weights under the permissive MIT license.
🌐 From Z.ai (formerly Zhipu AI), released April 2026.

💰 Pricing

$1.10 / $4.15

per 1M · input / output

📏 Context

200K tokens

📅 On Venice since

Apr 24, 2026

86 days ago

Provider

Z.ai

Z.ai, formally Knowledge Atlas Technology Joint Stock Co., Ltd., is a Chinese technology company specializing in artificial intelligence. Previously known internationally as Zhipu AI, the company rebranded to Z.ai in 2025. Its core focus is the GLM family of…

Read full profile →

12 models on Venice

11 text · 1 image

Since Apr 1, 2024

Wikipedia ↗Official site ↗

See 11 other models from Z.ai →

About this model

GLM 5.1 is the privacy-hardened deployment of Z.ai's long-horizon flagship, served inside a Trusted Execution Environment where hardware attestation lets users independently verify enclave identity and configuration, so even the host cannot read prompts. It runs in FP8 precision under the MIT license, with a 200,000-token context window, and is released by Z.ai (formerly Zhipu AI) in April 2026.

It builds directly on Z.ai's GLM-5 foundation as a post-training upgrade rather than a new architecture. NVIDIA's model card describes GLM-5.1 as a large Mixture-of-Experts model and states it has significantly stronger coding capabilities along with improved results across coding, math, and agentic tasks versus its same-family predecessor GLM 5.

Z.ai positions the model for sustained autonomous work. Its documentation describes planning, execution, testing, and fixing loops running up to eight hours, and reports a 3.6× geometric-mean speedup on KernelBench Level 3 optimization, compared with 1.49× for torch.compile in max-autotune mode. These figures are vendor-reported.

Within the broader lineage GLM 5.1 sits between GLM 4.7 and the newer, longer-context GLM 5.2, reflecting Z.ai's steady iteration on its open-weight GLM family.

🤗View model card on HuggingFace ↗View source on GitHub ↗

Sources

GLM-5.1 - Overview - Z.AI DEVELOPER DOCUMENTdocs.z.ai ↗

z-ai / glm5.1docs.api.nvidia.com ↗

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Research & Papers

Primary reference paper for this model family, sourced from the HuggingFace model card.

arXiv2602.15763Feb 2026

GLM-5: from Vibe Coding to Agentic Engineering(2026)

GLM-5-Team, :, Aohan Zeng et al.

We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference…

Data sources: Venice API · HuggingFace · Wikipedia · arXiv — enrichment updated 13h ago