GoogleGoogle·💬 Text Generation·New

Gemini 3.5 Flash

ReasoningVisionFunction CallingWeb SearchAudioanonymized
🧠 Try in Intelligence →Try on Venice.ai ↗
Quick reference
Gemini 3.5 Flash — TLDR
  • 🆕 Google's latest Flash-tier reasoning model, released May 2026.
  • 📏 One-million-token context window; up to 65k output tokens.
  • 🧠 Built on the Gemini 3 Flash reasoning foundation with thinking levels.
  • 👁️ Natively multimodal: text, images, audio, video, and PDF input.
  • 🔧 Tools include function calling, search grounding, code execution.
  • 🎯 Tuned for agentic workflows, coding, and long-horizon tasks.
  • ⚡ Configurable thinking levels trade quality against latency and cost.
  • 🏢 Generally available across Gemini app, API, and Enterprise platform.
💰 Pricing
$1.55 / $9.45
per 1M · input / output
📏 Context
1M tokens
📅 On Venice since
May 22, 2026
12 days ago
Provider

Google is an American multinational technology corporation and one of the world's most valuable brands. A subsidiary of parent company Alphabet Inc., Google operates across search, cloud computing, consumer electronics, and artificial intelligence. Its…

Read full profile →
25 models on Venice
10 text · 8 video · 2 image · 2 inpaint · 1 music · 1 embedding · 1 tts
Since Oct 15, 2024

About this model

Gemini 3.5 Flash is Google's high-speed "thinking" model, positioned as a cost-efficient route that approaches Pro-tier capability while preserving the low latency of the Flash series. It is built directly on the Gemini 3 Flash reasoning foundation, accepting text, images, audio, video, and PDFs as input with a one-million-token context window and up to 65k output tokens. Built-in tooling covers function calling, structured outputs, search and URL grounding, and code execution.

Compared with its same-family predecessor Gemini 3 Flash Preview, 3.5 Flash ships as a generally available model across the Gemini app, API, and Enterprise platform, inheriting the Gemini 3 family's reasoning and multimodal capabilities. It uses configurable thinking levels so developers can trade reasoning depth against latency and cost for a given workload.

For maximum abstract-reasoning depth or the heaviest long-context retrieval, Google's Pro tier such as Gemini 3.1 Pro Preview may remain preferable. Gemini 3.5 Flash, however, is designed as a fast reasoning core for agentic pipelines, multi-turn chat, coding assistance, and document-heavy workloads where responsiveness and value matter.

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 5d ago