
GPT-5.4 Mini

Reasoning · Vision · Function Calling · Web Search · anonymized


Quick reference

GPT-5.4 Mini — TLDR

- 🆕 Compact GPT-5.4-class model for high-throughput, latency-sensitive workloads.
- 📏 400,000-token context window with up to 128K output tokens.
- 👁️ Accepts text and image inputs.
- 🔧 Supports function calling, tool use, file search, and computer use.
- 🌐 Built-in web search for grounded responses.
- ⚡ Optimized for speed in coding assistants and parallel subagents.
- 🎯 Recommended for classification, extraction, ranking, and coding subtasks.

💰 Pricing

$0.938 input / $5.63 output

per 1M tokens

📏 Context

400K tokens

📅 On Venice since

Mar 27, 2026

114 days ago
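As a rough illustration of how the listed per-token pricing translates into a per-request cost, the sketch below multiplies token counts by the rates shown above. The function name and the example token counts are hypothetical; only the two rates come from this page, and actual billing may differ (for example through caching or tiered pricing, which this sketch does not model).

```python
# Rates copied from the pricing card on this page (USD per 1M tokens).
INPUT_USD_PER_MTOK = 0.938
OUTPUT_USD_PER_MTOK = 5.63


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts.

    Hypothetical helper: a straight linear calculation that ignores any
    discounts, caching, or minimum charges a provider might apply.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request: 200,000 input tokens and 10,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 10_000):.4f}")
```

Under these assumptions, a request with 200,000 input tokens and 10,000 output tokens comes to roughly $0.24, with the output tokens contributing about a quarter of the total despite being a twentieth of the volume.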

Provider

OpenAI

OpenAI is an American artificial intelligence research organization headquartered in San Francisco, structured as both a for-profit public benefit corporation and a nonprofit foundation. The lab developed the GPT family of large language models, the DALL-E…


30 models on Venice

19 text · 4 video · 2 image · 2 embedding · 2 inpaint · 1 asr

Since Jan 15, 2025


About this model

GPT-5.4 Mini, released in 2026 by OpenAI, is a smaller, faster sibling within the GPT-5.4 generation, bringing many of the strengths of GPT-5.4 to a model designed for high-volume, cost-sensitive deployments. It supports text and image inputs, tool use, function calling, web search, file search, computer use, and skills, alongside a 400,000-token context window. OpenAI positions it for workloads where latency directly shapes the product experience, such as responsive coding assistants and computer-using systems that interpret screenshots.

Within the catalog's mini lineage, it follows GPT-4o Mini. According to OpenAI, GPT-5.4 Mini and its companion, GPT-5.4 Nano, are the company's most capable small models yet, and OpenAI now recommends starting with GPT-5.4 Mini for most new low-latency, high-volume workloads in place of the earlier GPT-5 Mini.

In practice, OpenAI describes a delegation pattern in Codex where a larger model such as GPT-5.4 handles planning, coordination, and final judgment, while GPT-5.4 Mini subagents tackle narrower subtasks in parallel: searching a codebase, reviewing a large file, or processing supporting documents.
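The delegation pattern described above can be sketched as a fan-out and fan-in over concurrent tasks. This is a minimal illustration only, not OpenAI's or Codex's implementation: `call_model` is a hypothetical stand-in that returns a canned string instead of calling any real API, and the model identifiers are placeholder strings assumed for the example.

```python
import asyncio

# Placeholder identifiers assumed for illustration; check your provider's
# model list for the real names.
PLANNER_MODEL = "gpt-5.4"
SUBAGENT_MODEL = "gpt-5.4-mini"


async def call_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a real model call.

    It performs no network request; it just echoes which model was asked
    what, so the control flow can be run and inspected offline.
    """
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"[{model}] {prompt}"


async def delegate(task: str, subtasks: list[str]) -> str:
    # Fan out: each narrow subtask goes to a small-model subagent, all
    # running concurrently. gather() preserves the input order.
    findings = await asyncio.gather(
        *(call_model(SUBAGENT_MODEL, subtask) for subtask in subtasks)
    )
    # Fan in: the larger model receives every finding and makes the
    # final judgment.
    summary = "\n".join(findings)
    return await call_model(PLANNER_MODEL, f"{task}\n{summary}")


if __name__ == "__main__":
    result = asyncio.run(
        delegate("Fix the failing test", ["search the codebase", "review the large file"])
    )
    print(result)
```

In a real system, `call_model` would be replaced with an actual client call, and the planner would typically also be the one that decides which subtasks to create.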

It sits alongside other GPT-5.4-tier models, including GPT-5.4 Pro, as well as the later GPT-5.5 and GPT-5.5 Pro releases.

Sources

GPT-5.4 mini Model | OpenAI API (developers.openai.com)

Introducing GPT-5.4 mini and nano | OpenAI (openai.com)

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 4d ago