OpenAI·💬 Text Generation

GPT-4o Mini

VisionFunction CallingWeb Searchanonymized

🧠 Try in Intelligence →

Try on Venice.ai ↗

Quick reference

GPT-4o Mini — TLDR

🧠 Brings much of GPT-4o capability to cost-efficient small-model workloads
📏 128K-token context window, up to 16K output tokens
👁️ Accepts text and image inputs, produces text outputs
🔧 Strong function calling and structured outputs support
🌐 Built-in web search capability in this catalog deployment
📚 Knowledge cutoff October 2023; shares GPT-4o tokenizer
⚡ Optimized for high-volume, low-latency chained and real-time tasks
🆕 Surpasses GPT-3.5 Turbo on academic and multimodal benchmarks

💰 Pricing

$0.188 / $0.750

per 1M · input / output

📏 Context

128K tokens

📅 On Venice since

Feb 28, 2026

141 days ago

Provider

OpenAI

OpenAI is an American artificial intelligence research organization headquartered in San Francisco, structured as both a for-profit public benefit corporation and a nonprofit foundation. The lab developed the GPT family of large language models, the DALL-E…

Read full profile →

30 models on Venice

19 text · 4 video · 2 image · 2 embedding · 2 inpaint · 1 asr

Since Jan 15, 2025

Wikipedia ↗Official site ↗

See 29 other models from OpenAI →

About this model

GPT-4o Mini is OpenAI's compact, cost-efficient member of the GPT-4 "omni" family, designed to bring much of GPT-4o's capability to high-volume, latency-sensitive workloads. OpenAI positions it as a fast, affordable small model for focused tasks, accepting both text and image inputs and producing text outputs including Structured Outputs. It carries a 128K-token context window, supports up to 16,384 output tokens per request, and has knowledge up to October 2023.

Compared with its own predecessor in the small-model line, GPT-3.5 Turbo, OpenAI reports that GPT-4o Mini surpasses it on academic benchmarks across both textual intelligence and multimodal reasoning, adds vision support, and delivers improved long-context and function-calling performance. It also shares the improved tokenizer used by GPT-4o, making non-English text handling more efficient, and model outputs from larger models can be distilled into it for similar results at lower cost.

Within this catalog's family lineage, GPT-4o Mini has since been succeeded by GPT-5.4 Mini, which OpenAI describes as one of its most capable small models with a 400K context window and broader tool support including web search, file search, and computer use. According to OpenAI, GPT-5.4 mini consistently outperforms earlier small models at similar latencies, reflecting the family's generational progress.

For developers, GPT-4o Mini remains suited to chaining or parallelizing multiple model calls, processing large context volumes, and powering real-time chatbots where cost and speed matter.

Sources

GPT-4o mini Model | OpenAI APIdevelopers.openai.com ↗

GPT-4o mini: advancing cost-efficient intelligence | OpenAIopenai.com ↗

o4-mini Model | OpenAI APIplatform.openai.com ↗

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 4d ago