GoogleGoogle·💬 Text Generation

Gemini 3 Flash Preview

ReasoningVisionFunction CallingWeb SearchAudioanonymized
🧠 Try in Intelligence →Try on Venice.ai ↗
Quick reference
Gemini 3 Flash Preview — TLDR
  • - ⚡ High-speed thinking model for agentic workflows, chat, and coding.
  • - 🧠 Configurable reasoning via thinking levels to balance depth and latency.
  • - 📏 Google documents a 1M-token input window, up to 64k output.
  • - 👁️ Multimodal inputs including text, images, audio, and video.
  • - 🔧 Supports function calling, structured output, and context caching.
  • - 🌐 Web search grounding and tool use for agent loops.
  • - 🏢 Built by Google, released December 2025 in preview.
  • - 🎯 Catalog describes it as a high-speed, high-value member of Gemini 3.
💰 Pricing
$0.700 / $3.75
per 1M · input / output
📏 Context
256K tokens
📅 On Venice since
Dec 19, 2025
166 days ago
Provider

Google is an American multinational technology corporation and one of the world's most valuable brands. A subsidiary of parent company Alphabet Inc., Google operates across search, cloud computing, consumer electronics, and artificial intelligence. Its…

Read full profile →
25 models on Venice
10 text · 8 video · 2 image · 2 inpaint · 1 music · 1 embedding · 1 tts
Since Oct 15, 2024

About this model

Gemini 3 Flash Preview is Google's high-throughput, cost-efficient member of the Gemini 3 family, designed for agentic workflows, multi-turn chat, and coding assistance. Venice's catalog describes it as a high-speed, high-value thinking model that targets near-Pro reasoning with substantially lower latency, making it suited to interactive development and long-running agent loops. It sits below the Pro tier represented by Gemini 3.1 Pro Preview within the broader Gemini 3 lineup.

A central feature is configurable reasoning depth, exposed through a thinking-level control that lets developers dial internal reasoning to balance quality against latency and cost. Google documents that the lowest thinking setting approximates the latency and cost profile of a minimal thinking budget on the prior Flash generation, while stricter thought-signature handling improves reliability in multi-turn function calling.

The model natively handles interleaved text, images, audio, and video, and supports function calling, structured output, web search grounding, and context caching. Google documents a roughly one-million-token input context window and up to 64k tokens of output for Gemini 3 models, though the served context on this catalog is listed at 256k.

Within Venice's catalog this preview was later succeeded by Gemini 3.5 Flash. As a preview release, the model identifier and behavior may change before general availability.

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 1d ago