
Gemini Omni Video Model emerges in early Gemini app tests: remix videos, edit in chat, and generate impressive samples ahead of Google I/O 2026

Google's unreleased Gemini Omni video model was spotted in early Gemini app tests on May 12, 2026, allowing users to remix videos, edit directly in chat, and generate impressive samples from simple prompts. Early feedback praises math coherence, voice quality, and editing features, with one sample showing suited men dining oceanside with shifting camera angles. Tied to high usage limits, the model hints at a major upgrade ahead of Google I/O on May 19-20.

9 min read · Yash Thakker
Google, Gemini, Video Generation, AI Video, Google I/O, Multimodal AI


On May 12, 2026, just days before Google I/O 2026 (scheduled for May 19-20), early evidence of Google's unreleased Gemini Omni video model surfaced in the Gemini mobile app.

According to reports and leaked samples, Gemini Omni allows users to remix videos, edit directly in chat, and generate impressive video samples from simple text prompts—with early feedback praising prompt adherence, motion quality, voice coherence, and editing capabilities like object swaps.

The leak hints at a major video generation upgrade that potentially unifies video creation with Gemini's reasoning capabilities, positioning Google to compete directly with Runway Gen-3, Pika, Kling, and other frontier video models.

This article breaks down what early testers are seeing, how Gemini Omni compares to existing models, what the leak tells us about Google I/O announcements, and what developers and creators should watch for.

[Image: Gemini Omni video model, early samples and capabilities]

What early testers are seeing in Gemini Omni

According to reports from @testingcatalog, @chatgpt21, and @chetaslua, the unreleased Gemini Omni model appeared in the Gemini mobile app with the following description:

"Meet our new video model. Remix your videos, edit directly in chat, try a template, and more."

Sample generation: suited men dining oceanside

One of the shared samples shows:

  • Suited men dining at a table next to the ocean
  • Shifting camera angles (multiple perspectives within a single generation)
  • Clinking glasses (fine-grained motion and audio coherence)
  • Oceanside environment with realistic lighting and atmosphere

@chatgpt21 commented:

"Google is COOKING 🧑🏻‍🍳. Early Sample Of The New Omni Video Model: (I'd put this slightly above Runway Gen-3)"

Editing capabilities: object swaps in anime clips

@testingcatalog noted:

"I won't lie, this is one of the best video models I have seen, maybe not the best, but a really strong performance. I was particularly impressed by the prompt adherence (except for the one shot with the missing centerpiece), the model's ability to swap objects in anime clips, and the overall motion quality."

This suggests Gemini Omni can:

  1. Remix existing videos — not just generate from scratch
  2. Edit in chat — conversational video editing workflows
  3. Swap objects — targeted edits without full regeneration
  4. Maintain coherence across edits

How Gemini Omni compares to other video models

Based on early feedback, here's how Gemini Omni is being positioned relative to other frontier video generation models:

| Model | Strengths (per early reports) | Weaknesses (per early reports) |
| --- | --- | --- |
| Gemini Omni | Strong prompt adherence, smooth motion, editing in chat, object swaps, math coherence (complex scenes), voice quality | Minor motion glitches, some missing elements (e.g., "missing centerpiece" in one shot) |
| Runway Gen-3 | High visual quality, cinematic feel | Less conversational editing, no chat interface |
| Pika 2.0 | Fast generation, good for short clips | Less prompt adherence for complex scenes |
| Kling (Kuaishou) | Strong motion dynamics, longer videos | Less accessible (China-focused rollout) |
| OpenAI Sora | Impressive samples, strong physics | Not publicly available |
| Luma Dream Machine | Fast, accessible, good quality | Less control over editing |

Net: Early testers are placing Gemini Omni in the top tier of the video models they have tried, and at least one (@chatgpt21) rates it slightly above Runway Gen-3. Because Omni itself is unreleased, that ranking rests on a handful of leaked samples rather than side-by-side benchmarks.

What "Omni" likely means: multimodal unification

The "Omni" branding is significant—it suggests Google is positioning this as a unified multimodal model that handles text, image, video, and voice in an integrated way, similar to:

  • OpenAI GPT-4o ("o" for omni) — unified text, image, and voice
  • Google Gemini 1.5 Pro — long-context multimodal reasoning
  • Anthropic Claude 4.7 with vision — multimodal but not video generation

What "Omni" likely means for Gemini:

  1. Video generation is not a separate model — it's integrated into the core Gemini architecture
  2. Edit directly in chat — leverage Gemini's conversational reasoning to guide video edits
  3. Cross-modal reasoning — e.g., "show me the key moment from this video and explain what happened"
  4. Unified API — developers can generate, edit, and analyze videos within the same Gemini API call

This is a structural bet on video as a first-class modality in LLMs, not a bolt-on feature.

Usage limits and pricing hints

@testingcatalog noted that Gemini Omni samples are tied to high usage limits in the Gemini app, suggesting:

  1. Premium feature — likely part of Gemini Advanced or a new "Gemini Ultra" tier
  2. Compute-intensive — video generation is expensive, so high limits indicate Google is targeting enterprise and creator use cases
  3. Not free-tier — unlike Gemini 1.5 Flash, which is broadly accessible, Omni will likely require paid access

For comparison:

  • Runway Gen-3 costs ~$0.10-0.20 per second of video
  • Pika offers limited free generations, then paid plans
  • Luma Dream Machine has free and paid tiers

Google may follow a similar model, or bundle Gemini Omni into a paid subscription such as Google One AI Premium.
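
To make the per-second figure concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the approximate $0.10-0.20 per second range quoted above for Runway Gen-3. The function name `estimate_cost_range` is made up for illustration, and Gemini Omni's actual pricing has not been announced.

```python
from typing import Tuple


def estimate_cost_range(seconds: float,
                        low_rate: float = 0.10,
                        high_rate: float = 0.20) -> Tuple[float, float]:
    """Return a (low, high) cost estimate in USD for a clip of the given length.

    The default rates are the approximate Runway Gen-3 per-second range
    quoted in this article, NOT published Gemini Omni pricing.
    """
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return (round(seconds * low_rate, 2), round(seconds * high_rate, 2))


if __name__ == "__main__":
    low, high = estimate_cost_range(15)
    print(f"15-second clip: ${low:.2f} to ${high:.2f}")
```

At those assumed rates, a 15-second clip lands somewhere between $1.50 and $3.00 per generation, which adds up quickly once you start iterating.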

What this leak tells us about Google I/O 2026

The timing of this leak, one week before Google I/O 2026, may well be intentional marketing, similar to:

  • OpenAI's GPT-4o leak before the Spring Update event
  • Anthropic's Opus 4.7 teasers before official launch
  • Meta's Llama 3.1 preview ahead of Connect

What we can expect at Google I/O (May 19-20):

  1. Official Gemini Omni announcement — likely part of the keynote
  2. API access — developers will get access through Vertex AI and Google AI Studio
  3. Pricing details — per-second or per-generation costs
  4. Integration with Google Workspace — e.g., generate videos in Google Slides, Docs with video narration
  5. Gemini Advanced or Gemini Ultra tier — bundling Omni with other premium features
  6. Developer tools — templates, editing workflows, conversational video editing APIs

Related announcements likely include:

  • Gemini 2.0 (next-generation reasoning model)
  • Gemini Code (competitor to GitHub Copilot and Claude Code)
  • Gemini for Enterprise (security, compliance, on-prem options)
  • Google Cloud AI Platform updates (Vertex AI, TPU v7, etc.)

Practical use cases for Gemini Omni

If Gemini Omni delivers on the early samples, here are the most compelling use cases:

1. Marketing and social media

Prompt: "Generate a 15-second video of our product being used in a coffee shop, modern aesthetic, shot from multiple angles"

  • No need for stock footage or expensive shoots
  • Edit directly in chat to tweak lighting, angles, or pacing

2. Educational content

Prompt: "Show a video of how photosynthesis works, with zooming into a leaf cell, chloroplasts visible, narrated explanation"

  • Complex scientific concepts visualized instantly
  • Math coherence ensures accurate representations

3. Product demos

Prompt: "Create a video demo of our app's onboarding flow, showing a user's hand tapping through screens"

  • Rapid prototyping without filming or screen recording
  • Iterate in chat to adjust timing or UI elements

4. Creative storytelling

Prompt: "Generate a scene where a detective enters a rainy noir-style alley, camera pans from above, neon signs reflecting in puddles"

  • Cinematic quality for indie filmmakers and creators
  • Object swaps to change props, characters, or settings

5. A/B testing video ads

Prompt: "Create three variations of this ad with different color grading and pacing"

  • Rapid iteration for performance marketing
  • Test creative hypotheses before expensive production
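
If you want to run this kind of test systematically, you can generate the variation prompts in code rather than typing each one. The sketch below is plain Python string handling and does not call any video API. The helper name `build_variation_prompts` and the example color grades and pacings are assumptions for illustration.

```python
from itertools import product
from typing import List, Sequence


def build_variation_prompts(base_prompt: str,
                            color_grades: Sequence[str],
                            pacings: Sequence[str]) -> List[str]:
    """Return one prompt per (color grade, pacing) combination."""
    return [
        f"{base_prompt} Color grading: {grade}. Pacing: {pacing}."
        for grade, pacing in product(color_grades, pacings)
    ]


# Hypothetical ad brief; swap in your own base prompt and options.
prompts = build_variation_prompts(
    "A 15-second ad for a coffee brand.",
    color_grades=["warm", "cool", "high-contrast"],
    pacings=["slow", "fast"],
)

for prompt in prompts:
    print(prompt)
```

Three color grades and two pacings give six prompts, so keep the option lists short if each generation is billed separately.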

How developers should prepare for Gemini Omni

If Google announces Gemini Omni at I/O 2026, here's how to prepare:

1. Explore the API early

  • Sign up for Google AI Studio or Vertex AI access
  • Test prompt engineering patterns for video generation
  • Benchmark costs against Runway, Pika, and other tools
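
One way to test prompt patterns before any API exists is to keep prompts structured, so you can vary one element at a time and compare results across tools. This is a minimal sketch: the `VideoPrompt` class and its fields are hypothetical and do not reflect any Gemini or Runway schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoPrompt:
    """A structured text-to-video prompt (field names are illustrative only)."""
    subject: str
    setting: str
    style: str = ""
    camera: str = ""
    duration_seconds: Optional[int] = None

    def render(self) -> str:
        """Flatten the fields into a single comma-separated prompt string."""
        parts = [f"{self.subject} in {self.setting}"]
        if self.style:
            parts.append(f"{self.style} style")
        if self.camera:
            parts.append(f"camera: {self.camera}")
        if self.duration_seconds is not None:
            parts.append(f"{self.duration_seconds} seconds long")
        return ", ".join(parts)


noir = VideoPrompt(
    subject="A detective entering an alley",
    setting="a rainy city at night",
    style="noir",
    camera="pans from above",
)
print(noir.render())
```

Because the rendered output is just a string, the same prompt can be sent to whichever tools you are benchmarking.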

2. Build conversational video editing workflows

  • Leverage Gemini's chat interface for iterative edits
  • Design workflows where users can refine videos in natural language
  • Integrate with existing tools (e.g., video editing in Figma, Notion, Slack)
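
A conversational editing workflow mostly comes down to tracking a chain of versions and the instruction that produced each one. The sketch below models that with a pluggable backend. Everything here is an assumption: `EditSession`, the backend signature, and `fake_backend` are stand-ins, since no Gemini Omni editing API has been published.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical backend: takes (current video reference, instruction)
# and returns a reference to the edited video.
EditBackend = Callable[[str, str], str]


@dataclass
class EditSession:
    """Tracks video versions and the natural-language edits between them."""
    backend: EditBackend
    versions: List[str]  # video references, oldest first
    instructions: List[str] = field(default_factory=list)

    def apply(self, instruction: str) -> str:
        """Send an edit instruction to the backend and record the result."""
        new_ref = self.backend(self.versions[-1], instruction)
        self.versions.append(new_ref)
        self.instructions.append(instruction)
        return new_ref

    def undo(self) -> str:
        """Drop the latest edit, if any, and return the current version."""
        if len(self.versions) > 1:
            self.versions.pop()
            self.instructions.pop()
        return self.versions[-1]


def fake_backend(video_ref: str, instruction: str) -> str:
    # Stand-in for a real video-editing call; it only tags the reference.
    return f"{video_ref}+edit"


session = EditSession(backend=fake_backend, versions=["clip_v0"])
session.apply("swap the wine glasses for coffee cups")
session.apply("make the lighting warmer")
session.undo()  # discard the lighting change
print(session.versions)
```

Keeping the backend as a plain callable means the same session logic works whether the edits eventually go to Gemini or to another tool.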

3. Combine with other Gemini capabilities

  • Long-context reasoning — generate videos from entire documents or transcripts
  • Multimodal search — find moments in videos and remix them
  • Voice integration — narrate videos using Gemini's voice synthesis

4. Monitor pricing and usage limits

  • Video generation is expensive—track costs carefully
  • Consider hybrid workflows (Gemini for generation, cheaper models for iteration)
  • Evaluate Google One AI Premium if bundled access is cheaper
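
For cost tracking, a simple guard that refuses a generation once a budget would be exceeded goes a long way. This is a minimal sketch under stated assumptions: the `GenerationBudget` class is hypothetical, and the $0.20 per second rate is borrowed from the upper end of the Runway estimate above, not from any Gemini Omni price list.

```python
class BudgetExceeded(Exception):
    """Raised when a generation would push spend past the limit."""


class GenerationBudget:
    """Tracks estimated video-generation spend against a fixed limit."""

    def __init__(self, limit_usd: float, rate_per_second: float) -> None:
        self.limit_usd = limit_usd
        self.rate_per_second = rate_per_second  # assumed rate, not real pricing
        self.spent_usd = 0.0

    @property
    def remaining_usd(self) -> float:
        return self.limit_usd - self.spent_usd

    def charge(self, seconds: float) -> float:
        """Record a generation of the given length, or raise if over budget."""
        cost = seconds * self.rate_per_second
        if self.spent_usd + cost > self.limit_usd:
            raise BudgetExceeded(
                f"{seconds}s clip costs ${cost:.2f}, "
                f"only ${self.remaining_usd:.2f} left"
            )
        self.spent_usd += cost
        return cost


budget = GenerationBudget(limit_usd=5.00, rate_per_second=0.20)
budget.charge(15)  # first clip fits within the limit

try:
    budget.charge(15)  # second clip would exceed the $5.00 limit
except BudgetExceeded as exc:
    print(f"Blocked: {exc}")
```

Call `charge` before submitting each generation so a runaway iteration loop fails early instead of showing up on the invoice.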

How Gemini Omni fits into Google's AI strategy

Gemini Omni is part of a broader push to make Gemini the default multimodal foundation for Google's ecosystem:

| Product | Gemini integration | Video capability |
| --- | --- | --- |
| Gemini Advanced | Core reasoning model | Likely gets Omni access |
| Google Workspace | Docs, Sheets, Slides, Gmail | Generate videos in Slides, narrate Docs |
| YouTube | Video understanding, summaries | Remix and edit YouTube videos directly |
| Google Cloud | Vertex AI, Gemini API | Enterprise video generation at scale |
| Android | On-device Gemini Nano | Local video editing on Pixel devices |
| Google Search | AI Overviews | Generate video explainers in search results |

This is a platform play—Google wants video generation to be as accessible as text generation across all its products.

Bottom line

Gemini Omni is Google's unreleased video generation model, which surfaced in early Gemini app tests just days before Google I/O 2026. Early samples show strong prompt adherence, smooth motion, editing in chat, and object swaps, with at least one tester placing it slightly above Runway Gen-3 in quality.

The "Omni" branding suggests Google is unifying video generation with Gemini's reasoning capabilities, making video a first-class modality across its ecosystem. The model is tied to high usage limits, indicating it will likely be a premium feature in Gemini Advanced or a new Gemini Ultra tier.

If Google announces Gemini Omni at I/O 2026 on May 19-20, expect:

  • API access through Vertex AI and Google AI Studio
  • Pricing details (likely per-second or per-generation)
  • Integration with Google Workspace (Slides, Docs, YouTube)
  • Developer tools for conversational video editing

For creators, marketers, and developers, Gemini Omni represents a step-function improvement in accessible, high-quality video generation—with chat-based editing as a potential killer feature that distinguishes it from Runway, Pika, and other tools.

Watch the Google I/O 2026 keynote on May 19 for the official announcement.


Early reports via X: @testingcatalog, @chatgpt21, @chetaslua. Google I/O 2026: May 19-20. ExplainX is not affiliated with Google.