Google Drops Nano Banana 2 Lite and Gemini Omni Flash for Developers

Google released a pair of generative media models today that together cover the pipeline from image generation to video editing.

Nano Banana 2 Lite is Google’s fastest and cheapest image model yet. It delivers text-to-image in 4 seconds at $0.034 per 1K resolution image. Gemini Omni Flash brings high-quality video generation and conversational editing to developers for the first time, priced at $0.10 per second of output. Both are available today in Google AI Studio, the Gemini API, and the Gemini Enterprise Agent Platform.

Google is positioning them as complementary tools you can chain together: generate an image with Nano Banana 2 Lite, then animate it into a video with Omni Flash.

Today we're releasing two generative media models for developers and enterprises with strong cost-performance.

🍌 Nano Banana 2 Lite: our fastest, most cost-efficient image model in the Nano Banana family.

🌐 Gemini Omni Flash: our natively multimodal high quality, cost… pic.twitter.com/3EyONu2vhX
— Google (@Google) June 30, 2026

Nano Banana 2 Lite: speed and cost first

Nano Banana 2 Lite (model name gemini-3.1-flash-lite-image) is built for rapid ideation and high-volume pipelines where latency and budget are the primary constraints. Google says it’s the recommended replacement for anyone still using the first-gen Nano Banana (gemini-2.5-flash-image).

Key specs:

Text-to-image in 4 seconds
$0.034 per 1K resolution image
Strong prompt adherence with good character consistency and in-image text rendering
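
For budgeting a high-volume pipeline, the per-image price above turns into a simple multiplication. Here is a minimal sketch, assuming the $0.034 figure applies as a flat rate to every 1K image with no volume discount or free tier (the article doesn't say either way). The function name is illustrative and not part of any Google SDK.

```python
from decimal import Decimal

# Price quoted above for one 1K-resolution image.
PRICE_PER_1K_IMAGE_USD = Decimal("0.034")


def image_batch_cost(num_images: int) -> Decimal:
    """Estimated USD cost for a batch of 1K images (flat-rate assumption)."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return PRICE_PER_1K_IMAGE_USD * num_images


if __name__ == "__main__":
    # 1,000 images at the quoted rate.
    print(image_batch_cost(1000))  # 34.000
```

Decimal is used instead of float so that small per-unit prices don't pick up rounding noise when multiplied across large batches.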

It sits at the bottom of the Nano Banana family, with Nano Banana 2 (the generalist workhorse) and Nano Banana Pro (for complex professional use) above it.

Beyond developer platforms, Nano Banana 2 Lite is rolling into Google consumer surfaces too, including AI Mode in Search, the Gemini app, NotebookLM, Google Photos, Stitch, Google Flow, and Google Ads.

Gemini Omni Flash: video generation for developers

Gemini Omni Flash (gemini-omni-flash-preview) is the model Google previewed at I/O that brings Gemini’s multimodal reasoning into video generation and editing.

It supports:

Video generation from text, image, and video inputs
Conversational editing, so you can refine videos with natural language
Multimodal referencing (combine images, text, and video for scene consistency)
$0.10 per second of output (same pricing as Veo 3.1 Fast)

There are some important limitations to note since this is a preview release:

Maximum 10-second video generations (longer clips are coming)
Audio reference uploading and scene extension are not supported yet
Video references over 3 seconds are not correctly processed yet
Character consistency across scene changes has some limitations
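
The numeric limits and the per-second price are easy to check client-side before sending a request. The sketch below assumes the 10-second cap and 3-second reference limit are hard thresholds and that billing is a flat $0.10 per output second. It makes no API calls, and the function names are hypothetical.

```python
from decimal import Decimal
from typing import Optional

PRICE_PER_SECOND_USD = Decimal("0.10")  # quoted output price
MAX_OUTPUT_SECONDS = 10                 # preview cap on clip length
MAX_VIDEO_REFERENCE_SECONDS = 3         # longer references are not processed correctly yet


def check_video_request(output_seconds: float,
                        reference_seconds: Optional[float] = None) -> list:
    """Return a list of problems; an empty list means the request fits the preview limits."""
    problems = []
    if output_seconds <= 0:
        problems.append("output length must be positive")
    if output_seconds > MAX_OUTPUT_SECONDS:
        problems.append(f"output exceeds the {MAX_OUTPUT_SECONDS}-second preview cap")
    if reference_seconds is not None and reference_seconds > MAX_VIDEO_REFERENCE_SECONDS:
        problems.append(f"video reference is longer than {MAX_VIDEO_REFERENCE_SECONDS} seconds")
    return problems


def video_cost(output_seconds: float) -> Decimal:
    """Estimated USD cost for a clip of the given length (flat-rate assumption)."""
    return PRICE_PER_SECOND_USD * Decimal(str(output_seconds))
```

Under these assumptions, a maximum-length 10-second clip works out to $1.00, and `check_video_request(12, reference_seconds=5)` reports two problems.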

Chaining them together

The real differentiator is how these models work together. Google released three demo apps to show the pipeline:

Anywhere: upload a selfie, Nano Banana 2 Lite transports you to landmarks, Omni Flash animates the result
Space Lift: interior design app that generates room concepts and animates them into walkthrough videos
Omni Product Studio: converts static product images into cinematic e-commerce videos

Using the Interactions API, developers can chain up to three sequential edits while maintaining session history and context.
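
To make the three-edit limit concrete, here is a minimal local model of a session that keeps its history and refuses a fourth edit. It does not call the Interactions API. The class and method names are hypothetical, and it assumes the cap applies per session, which the announcement doesn't spell out.

```python
from dataclasses import dataclass, field
from typing import List

MAX_SEQUENTIAL_EDITS = 3  # limit stated above


@dataclass
class EditSession:
    """Tracks an initial prompt plus the follow-up edits made in one session."""
    initial_prompt: str
    edits: List[str] = field(default_factory=list)

    def add_edit(self, instruction: str) -> None:
        """Record one more edit, or raise once the session limit is reached."""
        if len(self.edits) >= MAX_SEQUENTIAL_EDITS:
            raise RuntimeError(
                f"session already has {MAX_SEQUENTIAL_EDITS} edits; start a new one"
            )
        self.edits.append(instruction)

    def history(self) -> List[str]:
        """Full context in order: the initial prompt, then each edit."""
        return [self.initial_prompt, *self.edits]


if __name__ == "__main__":
    session = EditSession("Animate the living room concept as a slow walkthrough")
    session.add_edit("Make the lighting warmer")
    session.add_edit("Slow the camera down")
    print(len(session.history()))  # 3
```

A real client would send each instruction along with whatever session identifier the API provides. The point here is only that the edit count is worth tracking before the request goes out.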

This is a follow-up to the Gemini Omni guide I wrote a few weeks ago when Google first showed the concept at I/O. Now it’s actually shippable.

Reviewed & Written By

Tony Simons

Independent tech reviewer and creator of Tony Reviews Things. 14 years of hands-on testing, software auditing, and workflow automation. I test the gear so you don't waste your money on junk.

