MiniMax·💬 Text Generation·VS Pick

MiniMax M3 Preview

ReasoningVisionCodeFunction CallingWeb Searchfp8private

🧠 Try in Intelligence →

Try on Venice.ai ↗

Quick reference

MiniMax M3 Preview — TLDR

🧠 Frontier 1.4-trillion-parameter MiniMax model for coding, agents, reasoning.
📏 512K-token context window in this preview, served at fp8.
🆕 Built on new MiniMax Sparse Attention (MSA) for long context.
⚡ MSA enables efficient native ultra-long-context pretraining.
🔧 Function calling, tool use, and structured agentic task execution.
👁️ Sibling M3 is natively multimodal (text, image, video input).
🌐 Web search and long-horizon agentic workflows supported.
🎯 Targets autonomous coding and multi-step agentic reasoning.

💰 Pricing

$0.300 / $1.20

per 1M · input / output

📏 Context

524K tokens

📅 On Venice since

Jun 12, 2026

50 days ago

Provider

MiniMax

MiniMax is an AI company building generative models across multiple modalities, with a focus that spans both language understanding and audio creation. Their rapid release cadence in early 2026—delivering several new models within just a few months—reflects…

Read full profile →

10 models on Venice

3 text · 3 video · 3 music · 1 tts

Since Feb 12, 2026

Wikipedia ↗

See 9 other models from MiniMax →

About this model

MiniMax M3 Preview is a preview build of MiniMax's flagship M-series language model, described in this catalog as a 1.4-trillion-parameter frontier model for coding, agentic workflows, and complex reasoning, served at fp8 with a 512K-token context window. It is positioned alongside the full MiniMax M3 release, which MiniMax presents as a model combining frontier coding, ultra-long context, and native multimodal input in a single architecture.

The central change from earlier M-series models is MSA (MiniMax Sparse Attention), which replaces the quadratic cost of full attention to enable native ultra-long-context pretraining, according to MiniMax. The production M3 supports up to 1M tokens with a guaranteed 512K minimum; this preview exposes the 512K tier.

Compared with prior family members such as MiniMax M2.7 and MiniMax M2.5, which remain available for existing workflows, MiniMax frames coding and agentic capability as M3's key areas of improvement, with autonomous task decomposition, tool invocation, and multi-step reasoning.

As a function-calling and web-search-capable model, M3 Preview is aimed at long-horizon agentic and computer-use tasks, including code generation and tool-driven workflows. Being a preview, weights, the technical report, and full availability above 512K tokens were still being rolled out around launch, with the model exposed here at the 512K context tier.

🤗View model card on HuggingFace ↗View source on GitHub ↗

Sources

MiniMax M3: Frontier Coding, 1M Context, Native Multimodality — All in One Model - MiniMax Research | MiniMaxminimax.io ↗

Model Invocation - MiniMax API Docsplatform.minimax.io ↗

This About section is AI-generated from public sources (Claude Opus 5), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Research & Papers

Primary reference paper for this model family, sourced from the HuggingFace model card.

arXiv2606.13392Jun 2026

MiniMax Sparse Attention(2026)

Xunhao Lai, Weiqi Xu, Yufeng Yang et al.

Ultra-long-context capability is becoming indispensable for frontier LLMs: agentic workflows, repository-scale code reasoning, and persistent memory all require the model to jointly attend over hundreds of thousands to millions of tokens, yet the quadratic cost of softmax…

Data sources: Venice API · HuggingFace · Wikipedia · arXiv — enrichment updated 16h ago