Google Drops Nano Banana 2 Lite and Gemini Omni Flash for Developers

Google released Nano Banana 2 Lite for fast, cheap image generation and Gemini Omni Flash for video creation and conversational editing. Both are available today in AI Studio and the Gemini API.

Google dropped a pair of new generative media models today that cover pretty much the full pipeline from image generation to video editing.

Nano Banana 2 Lite is Google’s fastest and cheapest image model yet. It turns a text prompt into an image in 4 seconds at $0.034 per 1K resolution image. Gemini Omni Flash brings high-quality video generation and conversational editing to developers for the first time, priced at $0.10 per second of output. Both are available today in Google AI Studio, the Gemini API, and the Gemini Enterprise Agent Platform.

Google is positioning them as complementary tools you can chain together: generate an image with Nano Banana 2 Lite, then animate it into a video with Omni Flash.

Nano Banana 2 Lite: speed and cost first

Nano Banana 2 Lite (model name gemini-3.1-flash-lite-image) is built for rapid ideation and high-volume pipelines where latency and budget are the primary constraints. Google says it’s the recommended replacement for anyone still using the first-gen Nano Banana (gemini-2.5-flash-image).

Key specs:

  • Text-to-image in 4 seconds
  • $0.034 per 1K resolution image
  • Strong prompt adherence with good character consistency and in-image text rendering

It sits at the bottom of the Nano Banana family, with Nano Banana 2 (the generalist workhorse) and Nano Banana Pro (for complex professional use) above it.

Beyond developer platforms, Nano Banana 2 Lite is rolling into Google consumer surfaces too, including AI Mode in Search, the Gemini app, NotebookLM, Google Photos, Stitch, Google Flow, and Google Ads.
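If you're sizing up a high-volume pipeline, the per-image price quoted above turns into a quick budget check. Here's a minimal sketch in Python. The function names are my own, it doesn't call the Gemini API, and it assumes every image is billed at the flat $0.034 rate for 1K resolution with no other charges.

```python
from decimal import Decimal

# Price quoted in this article for one 1K resolution image (USD).
# Decimal avoids floating-point drift when multiplying currency amounts.
PRICE_PER_1K_IMAGE = Decimal("0.034")


def estimate_image_cost(num_images: int) -> Decimal:
    """Return the estimated USD cost of generating num_images 1K images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return PRICE_PER_1K_IMAGE * num_images


def images_within_budget(budget_usd: str) -> int:
    """Return how many whole 1K images fit in a budget given as a decimal string."""
    return int(Decimal(budget_usd) // PRICE_PER_1K_IMAGE)
```

At that rate, 1,000 images come to $34.00, and a $100 budget covers 2,941 images.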

Gemini Omni Flash: video generation for developers

Gemini Omni Flash (gemini-omni-flash-preview) is the model Google previewed at I/O that brings Gemini’s multimodal reasoning into video generation and editing.

It supports:

  • Video generation from text, image, and video inputs
  • Conversational editing, so you can refine videos with natural language
  • Multimodal referencing (combine images, text, and video for scene consistency)
  • $0.10 per second of output (same pricing as Veo 3.1 Fast)

There are some important limitations to note since this is a preview release:

  • Maximum 10-second video generations (longer clips are coming)
  • Audio reference uploading and scene extension are not supported yet
  • Video references over 3 seconds are not correctly processed yet
  • Character consistency across scene changes has some limitations
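Those preview limits are easy to check on your side before you spend money on a request. The sketch below is one way to do that in Python. The function and parameter names are hypothetical, nothing here talks to the Gemini API, and the numbers are just the ones quoted in this article, so treat them as assumptions that may change after the preview.

```python
from decimal import Decimal

# Figures quoted in this article for the preview release.
PRICE_PER_SECOND = Decimal("0.10")   # USD per second of generated video
MAX_OUTPUT_SECONDS = 10              # cap on generated clip length
MAX_VIDEO_REFERENCE_SECONDS = 3      # longer video references are not processed correctly


def check_video_request(output_seconds, video_reference_seconds=None,
                        has_audio_reference=False):
    """Return a list of problems; an empty list means the request fits the limits."""
    problems = []
    if output_seconds <= 0:
        problems.append("output length must be positive")
    elif output_seconds > MAX_OUTPUT_SECONDS:
        problems.append("output length exceeds the 10-second preview cap")
    if (video_reference_seconds is not None
            and video_reference_seconds > MAX_VIDEO_REFERENCE_SECONDS):
        problems.append("video references over 3 seconds are not processed correctly yet")
    if has_audio_reference:
        problems.append("audio reference uploads are not supported yet")
    return problems


def estimate_video_cost(output_seconds):
    """Return the estimated USD cost for a clip of the given length."""
    return PRICE_PER_SECOND * Decimal(str(output_seconds))
```

A maximum-length 10-second clip works out to $1.00 at the quoted rate.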

Chaining them together

The real differentiator is how these models work together. Google released three demo apps to show the pipeline:

  • Anywhere: upload a selfie, and Nano Banana 2 Lite places you at landmarks before Omni Flash animates the result
  • Space Lift: interior design app that generates room concepts and animates them into walkthrough videos
  • Omni Product Studio: converts static product images into cinematic e-commerce videos

Using the Interactions API, developers can chain up to three sequential edits while maintaining session history and context.
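If you're building on top of this, it helps to track the edit count yourself so you don't send a fourth edit into a session that only supports three. This is a rough client-side sketch in Python, not the Interactions API itself. The EditSession class and its methods are hypothetical names I made up for illustration, and the cap of three is simply the figure stated above.

```python
from dataclasses import dataclass, field

MAX_SEQUENTIAL_EDITS = 3  # limit stated in this article


@dataclass
class EditSession:
    """Hypothetical local record of one generation plus its follow-up edits."""
    initial_prompt: str
    edits: list[str] = field(default_factory=list)

    def add_edit(self, instruction: str) -> None:
        """Record a follow-up edit, refusing once the cap is reached."""
        if len(self.edits) >= MAX_SEQUENTIAL_EDITS:
            raise RuntimeError("this session already has three sequential edits")
        self.edits.append(instruction)

    def history(self) -> list[str]:
        """Return the initial prompt followed by every edit, in order."""
        return [self.initial_prompt, *self.edits]


session = EditSession("A selfie in front of a landmark, animated")
session.add_edit("Make it sunset")
session.add_edit("Add light rain")
session.add_edit("Slow the camera pan")
```

After those three edits, another call to add_edit raises an error, which is your cue to start a fresh session.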

This is a follow-up to the Gemini Omni guide I wrote a few weeks ago when Google first showed the concept at I/O. Now it’s actually shippable.

Reviewed & Written By Tony Simons

Independent tech reviewer and creator of Tony Reviews Things. 14 years of hands-on testing, software auditing, and workflow automation. I test the gear so you don't waste your money on junk.
