May 19, 2026

Google introduces Gemini Omni for next-generation multimodal AI creation

Google has unveiled Gemini Omni, a new multimodal AI model capable of generating and editing video, audio, images, and text from a single conversational interface.

Google has introduced Gemini Omni, a new family of multimodal AI models designed to generate and edit content across video, audio, images, and text from a unified conversational interface.

The first release, Gemini Omni Flash, lets users create videos from text prompts, existing videos, images, or audio, and edit the outputs directly through natural-language instructions.

Google says the model combines Gemini reasoning capabilities with advanced media generation to support “anything from any input.” Gemini Omni is rolling out across the Gemini app, Google Flow, and YouTube Shorts as part of Google’s broader push toward agentic and multimodal AI experiences.

