Moonshot · 💬 Text Generation

Kimi K2.5

Reasoning · Vision · Code · Function Calling · Web Search · private
Quick reference
Kimi K2.5 — TLDR
  • 🧠 Trillion-parameter Mixture-of-Experts, 32B active per token
  • 📏 256K-token context for long documents and codebases
  • 👁️ Native multimodality via MoonViT vision encoder
  • 🆕 Adds subagent "Agent Swarm" for parallel task execution
  • 🔧 Instant and thinking modes; tool calls, web search
  • 🌐 Open-weight, built on Kimi-K2-Base with ~15T mixed visual and text tokens
  • ⚡ Native INT4 quantization for efficient deployment
  • 🏢 Released January 2026 by Moonshot AI; OpenAI/Anthropic-compatible API
💰 Pricing
$0.560 / $3.50
per 1M · input / output
📏 Context
256K tokens
📅 On Venice since
Jan 27, 2026
127 days ago
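
As a rough illustration of how the listed price and context figures combine, the sketch below estimates the cost of a single request and checks whether it fits the window. The function names are hypothetical, "256K" is read as 256,000 tokens, and the assumption that input and output tokens share the window may not match how the provider actually counts.

```python
# Figures taken from the listing above.
INPUT_USD_PER_M = 0.56     # USD per 1M input tokens
OUTPUT_USD_PER_M = 3.50    # USD per 1M output tokens
CONTEXT_TOKENS = 256_000   # assumption: "256K" read as 256,000 tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-million prices."""
    return (
        input_tokens / 1_000_000 * INPUT_USD_PER_M
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_M
    )


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Assumes input and output tokens share one context window."""
    return input_tokens + output_tokens <= CONTEXT_TOKENS


# Example: a long-document prompt with a short answer.
cost = estimate_cost(200_000, 4_000)
print(f"${cost:.4f}", fits_context(200_000, 4_000))
```

Output tokens cost roughly six times as much as input tokens at these rates, so long generations dominate the bill faster than long prompts do.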
Provider

Moonshot is an AI research lab known for developing the Kimi family of large language models. The organization has gained recognition for building capable reasoning-oriented models, with the Kimi line representing its flagship series of text generation…

2 models on Venice
2 text
Since Jan 27, 2026

About this model

Kimi K2.5 is Moonshot AI's open-weight, native multimodal agentic model, released in January 2026. It uses a Mixture-of-Experts transformer with one trillion total parameters, of which only about 32 billion are active per token: each token is routed to 8 of 384 experts. The architecture pairs this with Multi-head Latent Attention (MLA) and SwiGLU activations. The model supports a 256K-token context window and employs native INT4 weight-only quantization for efficient inference, per Moonshot's model card and NVIDIA's NIM documentation.
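
To make the "8 of 384 experts" routing concrete, here is a minimal sketch of generic top-k expert selection for one token. It is not Moonshot's router: the softmax-over-selected-logits gating, the random logits, and the function name `route_token` are all assumptions for illustration, and the real model's gating function may differ.

```python
import math
import random

NUM_EXPERTS = 384  # experts per MoE layer, per the description above
TOP_K = 8          # experts selected per token, per the description above


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    ranked = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )
    top = ranked[:top_k]
    # Softmax over the selected logits only, shifted by the max for stability.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Random logits stand in for the output of a learned router.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routing = route_token(logits)

print(len(routing), "experts selected")
print(f"fraction of experts used per token: {TOP_K / NUM_EXPERTS:.2%}")
```

Only the selected experts run for a given token, which is why the active parameter count (about 32 billion) is a small fraction of the one trillion total.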

Compared with its text-only predecessor Kimi K2, the most significant change is native multimodality: Moonshot built K2.5 through continual pretraining on roughly 15 trillion mixed visual and text tokens atop Kimi-K2-Base, adding a 400M-parameter MoonViT vision encoder so it understands images alongside text. Where Kimi K2 was a strong agentic text model, K2.5 fuses vision and language during pretraining rather than after the fact.

A second generational addition is an "Agent Swarm" mechanism, which spawns parallel subagents to handle research, fact-checking, and web-development subtasks concurrently. The model also offers both instant and thinking modes, tool calling, and web search.
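
The general idea of fanning subtasks out to concurrent workers can be sketched with `asyncio`. This is a generic illustration only, not Moonshot's Agent Swarm implementation: `run_subagent` and `run_swarm` are hypothetical names, and the subagent body is a stub that sleeps instead of calling a model.

```python
import asyncio


async def run_subagent(role: str, task: str) -> dict:
    """Hypothetical stand-in for one subagent; a real one would call a model."""
    await asyncio.sleep(0.01)  # simulates waiting on I/O
    return {"role": role, "task": task, "result": f"{role} finished: {task}"}


async def run_swarm(subtasks):
    """Start every subagent at once and wait until all have finished."""
    return await asyncio.gather(
        *(run_subagent(role, task) for role, task in subtasks)
    )


# Subtask roles mirror the examples named in the paragraph above.
subtasks = [
    ("research", "collect background on the topic"),
    ("fact-checking", "verify the collected claims"),
    ("web-development", "build a page summarizing the findings"),
]
results = asyncio.run(run_swarm(subtasks))
for item in results:
    print(item["result"])
```

`asyncio.gather` returns results in the order the subtasks were submitted, so an orchestrator can merge them deterministically even though the subagents run concurrently.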

On Humanity's Last Exam, as reported by Moonshot itself, K2.5 scores 31.5 (text) and 21.3 (image) without tools, rising to 51.8 (text) and 39.8 (image) with tools. It is available via Moonshot's API and was later succeeded by Kimi K2.6.

This About section is AI-generated (Claude Opus 4.8) from public sources, with no human editing. It may contain inaccuracies; verify critical details against the data sources listed below.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 1d ago