🧪Inception Labs·💬 Text Generation

Mercury 2

ReasoningFunction CallingWeb Searchanonymized
🧠 Try in Intelligence →Try on Venice.ai ↗
Quick reference
Mercury 2 — TLDR
  • - 🆕 Diffusion-based reasoning LLM from Inception Labs.
  • - ⚡ Provider describes throughput exceeding 1,000 tokens per second.
  • - 🧠 Built for reasoning, not just fast generation.
  • - 📏 128K-token context window.
  • - 🔧 Native function calling and structured output support.
  • - 🌐 Adds integrated web search to its capability set.
  • - 🎯 Generates and refines tokens in parallel rather than one at a time.
  • - 🏢 Aimed at latency-sensitive agent and pipeline workloads.
💰 Pricing
$0.313 / $0.938
per 1M · input / output
📏 Context
128K tokens
📅 On Venice since
Feb 20, 2026
103 days ago
Provider

Inception Labs is an AI research organization focused on building large language models. The lab operates under the Mercury product line, with successive generations released as standalone frontier models rather than a sprawling family of variants. Its work…

Read full profile →
1 model on Venice
1 text
Added Feb 20, 2026

About this model

Mercury 2 is a diffusion-based large language model released in 2026 by Inception Labs, an AI startup focused on bringing the diffusion paradigm to language modeling. Unlike conventional autoregressive transformers that generate text strictly one token at a time, diffusion language models produce and iteratively refine spans of output in parallel — a coarse-to-fine approach adapted from the denoising process used in image generation.

According to its catalog description, Mercury 2 is positioned as a reasoning-capable model rather than a purely speed-optimized one, combining parallel diffusion decoding with reasoning, tool use, and structured output. Inception describes throughput exceeding 1,000 tokens per second, which the company frames as the model's headline advantage for workloads where generation latency matters.

The model ships with a 128,000-token context window and supports function calling and integrated web search, making it suited to agentic loops, retrieval pipelines, and extraction tasks where many sequential model calls compound latency.

Within Inception's own lineage, Mercury 2 extends the company's earlier diffusion language-model efforts, which began with code-generation systems, into a more general reasoning model with agentic tooling. Independent, third-party benchmark replication was not available among the sources reviewed here, so specific quality figures are omitted; the description above reflects the model's documented architecture and capabilities rather than performance rankings.

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 1d ago