OpenAIOpenAI·💬 Text Generation

GPT-5.4 Mini

ReasoningVisionFunction CallingWeb Searchanonymized
🧠 Try in Intelligence →Try on Venice.ai ↗
Quick reference
GPT-5.4 Mini — TLDR
  • - 🆕 Compact GPT-5.4-class model for high-throughput, latency-sensitive workloads.
  • - 📏 400,000-token context window with up to 128K output tokens.
  • - 👁️ Accepts text and image inputs.
  • - 🔧 Supports function calling, tool use, file search, and computer use.
  • - 🌐 Built-in web search for grounded responses.
  • - ⚡ Optimized for speed in coding assistants and parallel subagents.
  • - 🎯 Recommended for classification, extraction, ranking, and coding subtasks.
💰 Pricing
$0.938 / $5.63
per 1M · input / output
📏 Context
400K tokens
📅 On Venice since
Mar 27, 2026
69 days ago
Provider

OpenAI is an American artificial intelligence research organization headquartered in San Francisco, structured as both a for-profit public benefit corporation and a nonprofit foundation. The lab developed the GPT family of large language models, the DALL-E…

Read full profile →
24 models on Venice
13 text · 4 video · 2 image · 2 embedding · 2 inpaint · 1 asr
Since Jan 15, 2025

About this model

GPT-5.4 Mini, released in 2026 by OpenAI, is a smaller, faster sibling within the GPT-5.4 generation, bringing many of the strengths of GPT-5.4 to a model designed for high-volume, cost-sensitive deployments. It supports text and image inputs, tool use, function calling, web search, file search, computer use, and skills, alongside a 400,000-token context window. OpenAI positions it for workloads where latency directly shapes the product experience, such as responsive coding assistants and computer-using systems that interpret screenshots.

Within the catalog's mini lineage, it follows GPT-4o Mini. According to OpenAI, GPT-5.4 Mini and its companion nano are the company's most capable small models yet, and OpenAI now recommends starting with GPT-5.4 mini for most new low-latency, high-volume workloads in place of the earlier GPT-5 mini.

In practice, OpenAI describes a delegation pattern in Codex where a larger model like GPT-5.4 handles planning, coordination, and final judgment, while GPT-5.4 Mini subagents tackle narrower subtasks in parallel—searching a codebase, reviewing a large file, or processing supporting documents.

It sits alongside other GPT-5.4 tier models, including GPT-5.4 Pro, and the later GPT-5.5 and GPT-5.5 Pro releases. OpenAI recommends it as a default starting point for new low-latency, high-volume agent and chat workloads.

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 1d ago