Z.aiZ.ai·💬 Text Generation

GLM 5 Turbo

ReasoningCodeFunction CallingWeb Searchanonymized
🧠 Try in Intelligence →Try on Venice.ai ↗
Quick reference
GLM 5 Turbo — TLDR
  • ⚡ Latency-optimized GLM-5 variant tuned for fast agent inference.
  • 🔧 Built for tool use and function-calling workflows.
  • 🎯 Targets production coding and agent-driven environments.
  • 📏 Context window of roughly 200K tokens.
  • 🧠 Capabilities include reasoning, code optimization, and web search.
  • 🏢 From Z.ai (formerly Zhipu AI), released March 2026.
  • 🔒 Released under an MIT license per catalog metadata.
💰 Pricing
$1.20 / $4.00
per 1M · input / output
📏 Context
200K tokens
📅 On Venice since
Mar 15, 2026
80 days ago
Provider

Z.ai, formally Knowledge Atlas Technology Joint Stock Co., Ltd., is a Chinese technology company specializing in artificial intelligence. Previously known internationally as Zhipu AI, the company rebranded to Z.ai in 2025. Its core focus is the GLM family of…

Read full profile →
11 models on Venice
10 text · 1 image
Since Apr 1, 2024

About this model

GLM 5 Turbo is a latency-optimized member of Z.ai's GLM-5 generation, released in March 2026 and aimed at agent-driven environments and production coding workflows. Per its catalog description, it is tuned for fast inference and strong performance in agentic settings, with listed capabilities spanning reasoning, code optimization, function-calling and web search. It carries a context window of roughly 200K tokens, suiting long documents, codebases and multi-step sessions.

Rather than a straight successor, GLM 5 Turbo is an execution-focused variant of the flagship GLM 5: where the base model targets frontier-level reasoning and coding, the Turbo version emphasizes speed and tool-calling for agent workflows. Both belong to the same GLM-5 release wave from Z.ai.

Within the same Turbo family, the later GLM 5V Turbo extends this design to native multimodal input — image, video and text. Z.ai's developer documentation describes GLM 5V Turbo as a vision-language model with tool-calling and reasoning support built on the Turbo line. Compared with Z.ai's earlier GLM 4.7 generation, GLM 5 Turbo reflects the provider's broader shift toward agent-first, low-latency serving. The catalog metadata lists an MIT license for this entry.

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Research & Papers

Primary reference paper for this model family, sourced from the HuggingFace model card.

Data sources: Venice API · HuggingFace · Wikipedia · arXiv — enrichment updated 1d ago