About this model
Nemotron Cascade 2 30B A3B is NVIDIA's reasoning-focused open-weight model, structured as a 30B Mixture-of-Experts network that activates only about 3B parameters per token for efficient inference. It is post-trained from the NVIDIA Nemotron 3 Nano 30B base (Nemotron-Nano-V3), inheriting that architecture while layering on a new reasoning recipe. The model follows a ChatML template and can switch between extended chain-of-thought "thinking" mode and a direct instruct mode by prepending an empty reasoning block.
The headline change over its base predecessor is the post-training method NVIDIA calls Cascade RL, expanded to a broader spectrum of reasoning and agentic domains, plus multi-domain on-policy distillation from the strongest intermediate teacher models. NVIDIA reports that Cascade 2 improves on the Nemotron-Nano-V3 base across nearly every benchmark, and that it reaches gold-medal-level results on the 2025 IMO, IOI, and ICPC World Finals. These olympiad figures are vendor-reported.
On context and modality, this catalog build exposes a 256K-token window, matching NVIDIA's default vLLM configuration, though the model card states support up to 1M tokens. The model is text-only and does not handle image input.
Independently, Artificial Analysis places Cascade 2 at 28 on its Intelligence Index. The collection ships under the NVIDIA Open Model License, which permits commercial use, with checkpoints and training data released.
This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.
Research & Papers
Primary reference paper for this model family, sourced from the HuggingFace model card.
Data sources: Venice API · HuggingFace · Wikipedia · arXiv — enrichment updated 1d ago