Google·🎬 Video Generation·New

Gemini Omni Flash R2V



Quick reference

Gemini Omni Flash R2V — TLDR

🆕 Reference-to-video variant in Google's new Gemini Omni Flash family
👁️ Multimodal video generation exposed through the Gemini API
🧠 Combines physics understanding with Gemini's world knowledge for storytelling
💬 Supports conversational, stateful editing that carries context across turns
🔧 Accepts text, image, audio and video inputs in one workflow
⚡ Positioned as a fast default for video generation
🔒 Generated media carries SynthID watermarking
🏢 Aimed at collapsing multi-tool production into plain-language instructions

💰 Pricing

$0.570 – $1.43

per generation

📅 On Venice since

Jun 30, 2026


Provider

Google

Google is an American multinational technology corporation and one of the world's most valuable brands. A subsidiary of parent company Alphabet Inc., Google operates across search, cloud computing, consumer electronics, and artificial intelligence. Its…


30 models on Venice

11 video · 10 text · 3 image · 3 inpaint · 1 music · 1 embedding · 1 tts

Since Oct 15, 2024



About this model

Gemini Omni Flash R2V is the reference-to-video configuration of Google's Gemini Omni Flash line, a fast, multimodal model for video generation and conversational editing exposed through the Gemini API. The reference-to-video mode conditions generation on supplied reference material, complementing the family's other entry points such as Gemini Omni Flash text-to-video and Gemini Omni Flash image-to-video.

The most notable shift versus Google's earlier video systems is statefulness. Where clip generators typically restart from a blank prompt, Omni Flash keeps video context across a conversation, so each turn builds on the prior result and applies incremental changes, such as adjusting lighting or swapping backgrounds, without re-describing the whole scene. It also treats text, image, audio and video as combinable inputs rather than relying on a single text prompt.
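To make the idea of turn-by-turn editing concrete, here is a minimal local sketch in Python. It does not call the Gemini API and is not Google's implementation; the `EditSession` class, its `turn` method, and the scene attribute names are all hypothetical and exist only for illustration. It shows how a stateful session could carry earlier scene details forward, so that each instruction names only what changes.

```python
from dataclasses import dataclass, field


@dataclass
class EditSession:
    """Hypothetical local model of a stateful editing conversation.

    Assumption: this is an illustration only. It makes no network calls
    and does not reflect the real Gemini API surface.
    """
    scene: dict = field(default_factory=dict)    # accumulated scene attributes
    history: list = field(default_factory=list)  # one instruction per turn

    def turn(self, instruction: str, **changes) -> dict:
        # Apply only the attributes named in this turn; everything else
        # carries over unchanged from earlier turns.
        self.scene.update(changes)
        self.history.append(instruction)
        return dict(self.scene)


session = EditSession()

# Turn 1: the full scene is described once.
session.turn(
    "A chef plating pasta in a small kitchen",
    subject="chef plating pasta",
    background="small kitchen",
    lighting="daylight",
)

# Turns 2 and 3: incremental edits that name only what changes.
session.turn("Make the lighting warmer", lighting="warm evening")
final = session.turn(
    "Swap the background for a rooftop terrace",
    background="rooftop terrace",
)

print(final)
```

After the third turn, the subject from turn 1 is still present even though it was never restated. A stateless clip generator, by contrast, would need the whole scene described again on every request.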

The earlier Veo 3.1 Full Quality and Veo 3 Full Quality generations remain available for video work. Compared with them, Omni Flash emphasizes conversational, multi-turn editing and draws on Gemini's world knowledge (historical, scientific, cultural and physical context) to move from photorealism toward narrative. Google notes that generated media includes SynthID watermarking.

Because primary documentation for this specific reference-to-video variant is limited, capability and benchmark specifics beyond the family-level description above are not detailed here.

Sources

Gemini Omni Flash Guide: Prompts, Safety Risks, SynthID and PixVerse Workflow | PixVerse (pixverse.ai)

Generate content with the Gemini API | Gemini Enterprise Agent Platform | Google Cloud Documentation (docs.cloud.google.com)

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Data sources: Venice API · HuggingFace · Wikipedia