Google Gemma 4 31B Instruct
About this model
Gemma 4 31B Instruct is the dense, maximum-quality member of Google DeepMind's open Gemma 4 family, built for consumer GPUs and workstations rather than edge devices. It handles text and image inputs, processes video as sequences of frames, and generates text. It has a 256K-token context window, supports over 140 languages, and is released under the Apache 2.0 license.
Compared with its same-family predecessor, Google Gemma 3 27B Instruct, Gemma 4 adds two documented capabilities: configurable thinking modes that emit internal reasoning before the final answer, and native function calling for agentic workflows.
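To make the function-calling idea concrete, here is a minimal sketch of how an application might route a model-emitted tool call to local code. The JSON shape (`name`, `arguments`), the `get_weather` tool, and the `dispatch_tool_call` helper are all assumptions for illustration; they are not Gemma 4's documented call format, which should be taken from the model card.

```python
import json

# Hypothetical tool: the name and signature are illustrative only.
def get_weather(city: str) -> str:
    return f"Weather lookup for {city} is not implemented in this sketch."

# Registry mapping tool names the model may emit to Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(raw: str) -> str:
    """Parse an assumed JSON tool call and run the matching function."""
    call = json.loads(raw)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    # Arguments are passed as keyword arguments; a missing key means no arguments.
    return TOOLS[name](**call.get("arguments", {}))
```

In an agentic loop, the string returned by the tool would typically be sent back to the model as a new turn so it can compose the final answer.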
Architecturally, Gemma 4 uses a hybrid attention mechanism interleaving local sliding-window and full global attention, with unified Keys and Values in global layers and Proportional RoPE to aid long-context performance. It sits alongside the latency-focused Google Gemma 4 26B A4B Instruct, a Mixture-of-Experts sibling that activates only a subset of its parameters per token for faster inference, whereas the 31B Dense model keeps all parameters active for quality.
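The difference between local sliding-window and full global attention can be sketched with boolean masks. This is an illustration of the general technique, not Gemma 4's implementation: the window size, the layer count, and the `global_every` interleaving ratio below are hypothetical values chosen for the example.

```python
def causal_mask(seq_len):
    """Global attention: token i may attend to every token j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def sliding_window_mask(seq_len, window):
    """Local attention: token i attends only to the most recent
    `window` tokens, itself included."""
    return [[(j <= i) and (i - j < window) for j in range(seq_len)]
            for i in range(seq_len)]

def layer_pattern(num_layers, global_every):
    """Hypothetical interleaving: every `global_every`-th layer is
    global and the rest are local."""
    return ["global" if (k + 1) % global_every == 0 else "local"
            for k in range(num_layers)]

def attended_pairs(mask):
    """Count the (query, key) pairs a mask allows."""
    return sum(sum(row) for row in mask)
```

For a sequence of length n, the global mask allows n(n+1)/2 pairs while the sliding-window mask allows at most n times the window size, which is why mixing in local layers reduces attention cost on long contexts.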
Both pre-trained and instruction-tuned variants are released as open weights, and the model cards note that Gemma 4 underwent safety evaluations and sensitive-data filtering during training.
This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.
Data sources: Venice API · HuggingFace · Wikipedia