Google · 🎬 Video Generation · New

Gemini Omni Flash R2V

Quick reference
Gemini Omni Flash R2V — TLDR
  • 🆕 Reference-to-video variant in Google's new Gemini Omni Flash family
  • 👁️ Multimodal video generation exposed through the Gemini API
  • 🧠 Combines physics understanding with Gemini's world knowledge for storytelling
  • 💬 Supports conversational, stateful editing that carries context across turns
  • 🔧 Accepts text, image, audio and video inputs in one workflow
  • ⚡ Positioned as a fast default for video generation
  • 🔒 Generated media carries SynthID watermarking
  • 🏢 Aimed at collapsing multi-tool production into plain-language instructions
💰 Pricing
$0.570 – $1.43
per generation
📅 On Venice since
Jun 30, 2026
Provider

Google is an American multinational technology corporation and one of the world's most valuable brands. A subsidiary of parent company Alphabet Inc., Google operates across search, cloud computing, consumer electronics, and artificial intelligence. Its…

30 models on Venice
11 video · 10 text · 3 image · 3 inpaint · 1 music · 1 embedding · 1 tts
Since Oct 15, 2024

About this model

Gemini Omni Flash R2V is the reference-to-video configuration of Google's Gemini Omni Flash line, a fast, multimodal model for video generation and conversational editing exposed through the Gemini API. The reference-to-video mode conditions generation on supplied reference material, complementing the family's other entry points such as Gemini Omni Flash text-to-video and Gemini Omni Flash image-to-video.

The most notable shift from Google's earlier video systems is statefulness. Where clip generators typically restart from a blank prompt, Omni Flash keeps video context across a conversation, so each turn builds on the prior result and applies incremental changes (adjusting lighting or swapping backgrounds, for example) without re-describing the whole scene. It also accepts text, image, audio and video as combinable inputs rather than relying on a single text prompt.
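The difference between a stateless clip generator and a stateful editing conversation can be illustrated with a small sketch. This is a conceptual toy in plain Python, not the Gemini API: `EditSession`, `turn`, `stateless_generate`, and the scene attributes are all hypothetical names invented for illustration, and no video is generated.

```python
from dataclasses import dataclass, field


@dataclass
class EditSession:
    """Hypothetical stand-in for a stateful editing conversation (not a real API)."""

    scene: dict = field(default_factory=dict)    # accumulated scene description
    history: list = field(default_factory=list)  # what each turn asked for

    def turn(self, **changes):
        """Apply only the attributes named in this turn; the rest carries over."""
        self.scene = {**self.scene, **changes}
        self.history.append(dict(changes))
        return dict(self.scene)


def stateless_generate(**prompt):
    """A generator that restarts from a blank prompt: nothing carries over."""
    return dict(prompt)


# Illustrative scene values only; they are assumptions, not from the source.
session = EditSession()
session.turn(subject="a lighthouse", background="stormy sea", lighting="dusk")

# Second turn names only the change; subject and background are retained.
after_edit = session.turn(lighting="golden hour")

# The stateless call sees only what this one prompt contains.
stateless = stateless_generate(lighting="golden hour")

print(after_edit)
print(stateless)
```

In the stateful case the second turn only has to name the lighting change, while the stateless call would need the full scene restated to produce an equivalent result.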

The earlier Veo 3.1 Full Quality and Veo 3 Full Quality generations remain available for video work. Compared with them, Omni Flash emphasizes conversational, multi-turn editing and draws on Gemini's world knowledge (historical, scientific, cultural and physical context) to move from photorealism toward narrative. Google notes that generated media includes SynthID watermarking.

Because primary documentation for this specific reference-to-video variant is limited, capability and benchmark specifics beyond the family-level description above are not detailed here.

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies, so verify critical details against the sources listed below.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 16h ago