AI Development · New Release · 8 min read · Published June 30, 2026

Fast image + conversational video · ~4s images · $0.10/sec video, preview

Nano Banana 2 Lite & Gemini Omni Flash: Image to Video

Google shipped two creative models on June 30, 2026. Nano Banana 2 Lite makes a text-to-image in about four seconds at $0.034 a shot — fast and cheap enough for high-volume creative pipelines. Gemini Omni Flash, new in public preview, turns those images into video at $0.10 per second, edited in plain language. Chained together they are an image-to-video assembly line for product and ad creative.

Digital Applied Team
Senior AI engineers · Published Jun 30, 2026
Source: Google blog + 1 more

  • Image latency: ~4s (Nano Banana 2 Lite · Google-stated)
  • Image cost: $0.034 per 1K-resolution image
  • Video cost: $0.10/sec (Omni Flash · 10-second clips)
  • Sequential edits: 3 (via the Interactions API)

Google shipped two creative-AI models on June 30, 2026 that are more interesting together than apart. Nano Banana 2 Lite is the fastest, cheapest model in Google’s Nano Banana image family — a text-to-image in roughly four seconds at $0.034 per 1K-resolution image. Gemini Omni Flash, launching in public preview, generates and conversationally edits video. The headline is not either model on its own; it is the pipeline they form — generate an image cheaply, then animate it — at a price and latency that make high-volume creative production practical.

For a marketing team that is the whole point. The expensive, slow part of creative has always been volume: a hundred product variations, a dozen ad cuts, localized versions for every market. A four-second image and a per-second video price change what is worth producing at all. Google’s own launch demos lean directly into this — one, “Omni Product Studio,” turns static product shots into cinematic e-commerce videos.

This guide covers what each model actually is, where Nano Banana 2 Lite sits in the four-tier family, what Omni Flash can and cannot do yet (it is a preview, with real limitations), how to chain the two, and what the release means for a marketing or e-commerce team that produces creative at scale. Every latency and price figure here is Google-stated.

Key takeaways
  1. Nano Banana 2 Lite is built for speed and volume. Google's fastest Nano Banana image model (gemini-3.1-flash-lite-image) returns a text-to-image in about 4 seconds at $0.034 per 1K image. It is tuned for near-real-time, high-volume creative pipelines while keeping prompt adherence, character consistency, and legible in-image text.
  2. Gemini Omni Flash brings conversational video. A new public-preview model (gemini-omni-flash-preview) for video generation and natural-language editing at $0.10 per second of output. It produces 10-second clips today, with longer durations described as coming soon, and supports multimodal referencing of images, text, and video.
  3. The real product is the chain. Generate an image with Nano Banana 2 Lite, pass it to Omni Flash for animation, and keep context across up to three sequential edits via the Interactions API. That image-to-video loop is the assembly line behind Google's product, interior, and 'Anywhere' demos.
  4. It is a preview, with honest gaps. Omni Flash is a public preview: no audio-reference uploads in the API yet, character consistency wobbles across scene changes, and video references are not fully functional. Treat it as something to prototype with, not to wire into production untested.
  5. Marketers are the obvious early adopters. Cheap, fast image and video generation maps straight onto e-commerce product media, ad-creative variation, and localized content at volume. Nano Banana 2 Lite is also rolling out inside Google Ads, AI Mode in Search, the Gemini app, and Photos.

01 · What Shipped · Two models, one creative pipeline.

Both models landed on June 30, 2026 through the Gemini API, Google AI Studio, and the Gemini Enterprise Agent Platform, with consumer surfaces following. Nano Banana 2 Lite is generally rolling out; Gemini Omni Flash is a public preview (gemini-omni-flash-preview) and available to developers for the first time, alongside placements in the Gemini app and Google Flow.

The naming is worth untangling, because Google now ships four image models under the Nano Banana banner. Nano Banana 2 Lite is the new, speed-focused entry; Nano Banana 2 is the generalist; Nano Banana Pro is the high-end option; and the original Nano Banana (the model we covered in our first Nano Banana guide) is now the legacy tier Google recommends migrating away from. Omni Flash sits outside that family entirely: it is a video model, not an image model.

Read in sequence, this is Google doing to video roughly what Nano Banana did to image generation a year ago: making a capable model fast and cheap enough that the constraint stops being the model and starts being your imagination. The person who leads Google AI Studio and the Gemini API put it more plainly the morning it shipped.

"The speed of Nano Banana 2 Lite is going to enable so many new use cases where there is a high degree of latency sensitivity, honestly feels like magic. I also expect Omni to open up a whole new category of (video) use cases like Nano Banana itself did!"
Logan Kilpatrick, Google (AI Studio & Gemini API), June 30, 2026

02 · Nano Banana 2 Lite · Speed is the product.

Nano Banana 2 Lite (gemini-3.1-flash-lite-image) is positioned as the fastest, most cost-efficient model in the family. Google’s two headline numbers tell the story: a text-to-image in about four seconds, at $0.034 per 1K-resolution image. That latency is the part that changes behavior — four seconds is fast enough to put image generation inside an interactive loop, a high-velocity pipeline, or a consumer feature where a user is waiting on the result.

The interesting claim is what Google says it did not trade away for that speed. Lite is described as keeping reliable prompt adherence, strong character consistency, and legible in-image text rendering — the three things that usually break first when an image model is optimized for throughput. In-image text in particular has been a perennial weak spot for generative image models; keeping it legible at this price point is the meaningful part of the claim, if it holds up in your own testing.

  • Latency · Text-to-image · ~4s · near-real-time
    Google-stated time to a generated image. Fast enough to sit inside an interactive feature or a high-volume batch loop rather than a slow offline job.
  • Cost · Per 1K image · $0.034 · high-volume friendly
    Per 1K-resolution image. At about 3.4 cents each, a hundred-variation creative test costs roughly $3.40, so it stops being a budget line and starts being a default step.
  • Quality kept · Things speed didn't break · 3 · Google-stated
    Prompt adherence, character consistency, and legible in-image text are the three quality axes Google says Lite holds despite the speed focus. Verify on your own prompts.
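
The per-image arithmetic above is easy to check for your own batch sizes. This is a minimal sketch that assumes billing is a flat $0.034 per 1K-resolution image with no volume discounts, minimums, or other fees; the function name is ours, not part of any Google SDK.

```python
from decimal import Decimal

# Google-stated price from the announcement: $0.034 per 1K-resolution image.
# Assumption: flat per-image billing, no discounts or additional fees.
PRICE_PER_1K_IMAGE = Decimal("0.034")


def image_batch_cost(num_images: int,
                     price_per_image: Decimal = PRICE_PER_1K_IMAGE) -> Decimal:
    """Return the total generation cost in dollars for a batch of images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return price_per_image * num_images


if __name__ == "__main__":
    # Print the cost of a single image, a variation test, and a large catalog run.
    for n in (1, 100, 10_000):
        print(f"{n:>6} images: ${image_batch_cost(n):.2f}")
```

Using `Decimal` rather than floats keeps the totals exact, which matters once the same function feeds a budget report.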

Where Lite fits is clearest against the rest of the lineup. Google now frames the Nano Banana family as a four-tier ladder — Lite for speed, Nano Banana 2 as the generalist workhorse, Nano Banana Pro for complex professional work, and the original Nano Banana as a legacy model to migrate off. Only Lite has published latency and price figures in the announcement; the table below uses Google’s positioning for the others rather than inventing numbers.

The four-tier Nano Banana image-model family as positioned in Google's June 30, 2026 announcement. Only Nano Banana 2 Lite has published latency and price figures (about four seconds, $0.034 per 1K image); the other tiers are described by Google's stated positioning, not by figures, and all claims are Google-stated.

| Model | Model ID | Best for | Headline spec |
| --- | --- | --- | --- |
| Nano Banana 2 Lite | gemini-3.1-flash-lite-image | Near-real-time, high-volume pipelines | ~4s · $0.034 / 1K image |
| Nano Banana 2 | gemini-3.1-flash-image | Everyday generalist work | Balanced speed and cost |
| Nano Banana Pro | gemini-3-pro-image | Complex, professional output | Highest quality, premium tier |
| Nano Banana (legacy) | gemini-2.5-flash-image | Existing apps (upgrade recommended) | Superseded by Nano Banana 2 |

The flash-lite pattern, now for images
Lite extends a pattern Google has run before: take the cheapest, fastest tier of a model line and make it good enough that price stops being a reason not to use it. We saw it with the text model in our look at Gemini 3.1 Flash Lite — and gemini-3.1-flash-lite-image is the same idea applied to pixels. The strategic bet is identical: win on the volume of everyday work, not the ceiling of the hardest job.

03 · Gemini Omni Flash · Conversational video, in preview.

Gemini Omni Flash is the more experimental half of the release: a multimodal model that takes text, image, and video inputs and specializes in video generation and conversational editing. Google calls it a “high quality, cost-efficient model for video generation and conversational editing,” priced at $0.10 per second of video output. It generates 10-second clips today, with longer durations described as coming soon.
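
To put the per-second price in per-clip terms, here is a small sketch. It assumes cost scales linearly with seconds of output and that a clip is the 10-second length the preview produces today; whether conversational edits are billed as fresh output is not stated in the announcement, so this covers initial generation only. The function names are illustrative.

```python
from decimal import Decimal

# Google-stated preview price: $0.10 per second of video output.
PRICE_PER_SECOND = Decimal("0.10")
# Clip length the preview generates today, per the announcement.
CLIP_SECONDS = 10


def clip_cost(seconds: int = CLIP_SECONDS) -> Decimal:
    """Cost in dollars of one clip, assuming linear per-second billing."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return PRICE_PER_SECOND * seconds


def clip_batch_cost(num_clips: int, seconds: int = CLIP_SECONDS) -> Decimal:
    """Cost in dollars of a batch of equal-length clips."""
    if num_clips < 0:
        raise ValueError("num_clips must be non-negative")
    return clip_cost(seconds) * num_clips


if __name__ == "__main__":
    print(f"one clip: ${clip_cost():.2f}")
    print(f"40 clips: ${clip_batch_cost(40):.2f}")
```

Under those assumptions a single 10-second clip comes to $1.00, roughly the price of 29 Lite images, which is why the video step is the one to budget carefully.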

“Conversational editing” is the differentiator. Rather than re-prompting from scratch, you refine a clip in natural language — change the camera move, swap an element, adjust the action — and the model draws on Gemini’s broader world knowledge (history, biology, narrative logic) to keep edits coherent. It also supports multimodal referencing, combining image, text, and video inputs to steer a generation.

  • Edit · Conversational editing · natural language · describe the change
    Refine a generated clip by describing the change instead of re-prompting: adjust the camera, the action, or an element, and keep the rest of the shot. The headline capability of the model.
  • Reference · Multimodal inputs · image · text · video · steer the output
    Combine image, text, and video references to steer a generation. Video references up to 3 seconds are accepted, though Google notes they are not yet fully processed in the preview.
  • Reason · World knowledge · history · biology · logic · coherent scenes
    Omni Flash draws on Gemini's knowledge (history, biology, narrative logic) plus text-and-action synchronization, so a generated scene holds together rather than drifting frame to frame.
What does not work yet — stated plainly
Omni Flash is a public preview, and Google is upfront about the gaps: no audio-reference uploads in the Gemini API yet, character consistency that can break across scene changes, and video references that are accepted but not yet correctly processed. That is the right framing for it today — a model to prototype against and pressure-test, not one to wire into a customer-facing pipeline without a careful review of every clip it produces.

04 · The Chain · Image to video, in one loop.

The two models are designed to hand off to each other. The intended pattern: generate a still cheaply and quickly with Nano Banana 2 Lite, then pass that image to Omni Flash to animate it into a clip — and keep refining, maintaining context across up to three sequential edits through the Interactions API. That image-to-video loop is the engine behind all three of Google’s launch demos.

  1. Anywhere · Photo to places to motion · image → video
     Upload a photo, let Nano Banana 2 Lite relocate the subject to new landmarks, then have Omni Flash animate each into a short video clip. Travel and lifestyle content, generated end to end.
  2. Space Lift · Interior concepts, animated · concept → showcase
     Generate room-design concepts as images, then animate them into a cinematic walkthrough. The same loop applied to interiors, real estate, and home retail.
  3. Omni Product Studio · Static shots to e-commerce video · product → ad clip
     Turn flat product images into cinematic e-commerce videos. This is the demo that maps most directly onto a marketing team's day-to-day creative output.
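
The hand-off behind these demos can be sketched as a two-step function. The announcement does not show SDK calls, so this sketch deliberately avoids them: `generate_image` and `animate` are hypothetical adapters you would implement around whatever client you actually use, and the fake versions below exist only so the example runs offline. Only the two model IDs come from the article.

```python
from dataclasses import dataclass
from typing import Callable

# Model IDs as given in the article.
IMAGE_MODEL = "gemini-3.1-flash-lite-image"
VIDEO_MODEL = "gemini-omni-flash-preview"

# Hypothetical adapter signatures (not a real SDK interface):
ImageFn = Callable[[str, str], bytes]         # (model_id, prompt) -> image bytes
VideoFn = Callable[[str, bytes, str], bytes]  # (model_id, image, motion prompt) -> video bytes


@dataclass
class Creative:
    prompt: str
    image: bytes
    video: bytes


def image_to_video(prompt: str, motion_prompt: str,
                   generate_image: ImageFn, animate: VideoFn) -> Creative:
    """Generate a still first, then pass that still to the video step."""
    image = generate_image(IMAGE_MODEL, prompt)
    video = animate(VIDEO_MODEL, image, motion_prompt)
    return Creative(prompt=prompt, image=image, video=video)


# Stand-in adapters so the sketch runs without network access.
def fake_generate_image(model_id: str, prompt: str) -> bytes:
    return f"IMG[{model_id}]:{prompt}".encode()


def fake_animate(model_id: str, image: bytes, motion_prompt: str) -> bytes:
    return f"VID[{model_id}]:".encode() + image + b"|" + motion_prompt.encode()


if __name__ == "__main__":
    creative = image_to_video("red sneaker on a white plinth", "slow orbit",
                              fake_generate_image, fake_animate)
    print(len(creative.image), "image bytes;", len(creative.video), "video bytes")
```

Injecting the two calls keeps the orchestration testable and means a change in the preview API only touches the adapters.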

The three-edit context window is the practical detail that makes this a workflow rather than a party trick. Because the Interactions API holds state across sequential edits, you can iterate on a clip conversationally — generate, adjust the framing, then tune the action — without losing the thread. For a fuller picture of where Omni Flash sits against the dedicated video models, our Omni vs Sora vs Veo 3 comparison covers the trade-offs, and it shares a lineage with the native computer-use work in Gemini 3.5 Flash.
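
One way to respect the three-edit window on the client side is to track the edit history yourself. This is a sketch of bookkeeping only, not of the Interactions API: the class, its methods, and the choice to raise on a fourth edit are all assumptions, since the source does not say what the API does past the third edit.

```python
from dataclasses import dataclass, field
from typing import List

# Per the article: context is kept across up to three sequential edits.
MAX_SEQUENTIAL_EDITS = 3


class EditLimitReached(RuntimeError):
    """Raised when a session has used all of its sequential edits."""


@dataclass
class ClipEditSession:
    """Client-side record of one clip's prompt and its conversational edits."""
    base_prompt: str
    edits: List[str] = field(default_factory=list)

    @property
    def remaining(self) -> int:
        return MAX_SEQUENTIAL_EDITS - len(self.edits)

    def add_edit(self, instruction: str) -> List[str]:
        """Record an edit and return the full history, oldest first."""
        if self.remaining <= 0:
            # Assumption: start a fresh session rather than risk lost context.
            raise EditLimitReached("start a new session from the latest clip")
        self.edits.append(instruction)
        return [self.base_prompt, *self.edits]


if __name__ == "__main__":
    session = ClipEditSession("sneaker on a plinth, slow orbit")
    session.add_edit("tighten the framing")
    session.add_edit("make the orbit faster")
    print(session.remaining, "edit(s) left")
```

Surfacing the remaining count in a review tool lets an editor decide when to spend the last edit and when to regenerate from scratch.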

05 · What It Means For Marketers · Where cheap, fast creative actually pays.

Strip away the demos and the value to a marketing team is about volume economics. Most creative work that does not get made is blocked on cost and time, not ideas: the fortieth product video, the localized ad cut for a small market, the weekly batch of social variations. A four-second image and a ten-cent-per-second video move those from “not worth it” to “default.” That Nano Banana 2 Lite is rolling out inside Google Ads is the clearest signal of where Google expects it to be used.

  • Highest leverage · E-commerce product media · Start here for retail
    Generate product images with Lite and animate hero shots into short clips with Omni Flash (the 'Omni Product Studio' pattern). Most stores have catalogs far larger than their video budget allows; this closes that gap.
  • High volume · Ad-creative variation · Variant testing at scale
    Spin up dozens of image and short-video variants for testing instead of one expensive hero asset. At Google's stated prices that is about 3.4 cents per image and about $1 per 10-second clip, and Lite's speed makes it practical to test broadly.
  • Localized · Market-specific content · Per-market at low cost
    Regenerate creative per market (different settings, products, on-image copy) without a per-market shoot. Watch in-image text rendering closely when localizing copy.
  • Guardrails first · Anything customer-facing · Review before publish
    Omni Flash is a preview with consistency gaps, and brand and rights review still matter. Keep a human approving every asset that ships, and verify usage rights before publishing generated media.
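
The variation and localization patterns both reduce to building a grid of prompts, one per product and market. A minimal sketch follows; the prompt template, the `Market` fields, and the sample values are illustrative assumptions, not a recommended prompt format.

```python
from itertools import product
from typing import List, NamedTuple


class Market(NamedTuple):
    code: str       # market identifier, e.g. "de"
    setting: str    # scene description for this market
    copy: str       # localized on-image text


class Job(NamedTuple):
    item: str
    market: str
    prompt: str


def build_jobs(items: List[str], markets: List[Market]) -> List[Job]:
    """Return one image-generation job per (item, market) pair."""
    jobs = []
    for item, market in product(items, markets):
        prompt = (f"{item}, photographed in {market.setting}, "
                  f'with the on-image text "{market.copy}"')
        jobs.append(Job(item=item, market=market.code, prompt=prompt))
    return jobs


if __name__ == "__main__":
    markets = [
        Market("de", "a Berlin side street", "Jetzt kaufen"),
        Market("fr", "a Paris café terrace", "Acheter maintenant"),
    ]
    for job in build_jobs(["canvas sneaker", "leather backpack"], markets):
        print(job.market, "|", job.prompt)
```

Because every job carries its market code, a reviewer can check the rendered on-image copy against the intended string for that market.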

The honest caveat sits alongside the opportunity. Generative media at this volume needs a review layer, not just a generate button: brand consistency, factual accuracy on product detail, and usage rights all still require a human in the loop — and Omni Flash’s preview limitations make that doubly true for video. The teams that win here will treat these models as a throughput multiplier on a governed creative process, not a replacement for one. That is the same balance we cover in our look at image-generation API pricing and the higher-end Nano Banana Pro for marketing.
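
A review layer can be as simple as a status field that the publish step refuses to ignore. This sketch assumes a two-part gate (a named human approval plus a rights check); the statuses, field names, and functions are hypothetical and would map onto whatever asset system you already run.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Asset:
    asset_id: str
    status: Status = Status.PENDING
    rights_cleared: bool = False
    reviewer: Optional[str] = None


def review(asset: Asset, reviewer: str, approved: bool, rights_cleared: bool) -> Asset:
    """Record a human decision; every decision carries the reviewer's name."""
    if not reviewer:
        raise ValueError("a named reviewer is required")
    asset.status = Status.APPROVED if approved else Status.REJECTED
    asset.rights_cleared = rights_cleared
    asset.reviewer = reviewer
    return asset


def publishable(assets: List[Asset]) -> List[Asset]:
    """Only assets that are approved and rights-cleared may ship."""
    return [a for a in assets
            if a.status is Status.APPROVED and a.rights_cleared]


if __name__ == "__main__":
    batch = [Asset("clip-001"), Asset("clip-002"), Asset("clip-003")]
    review(batch[0], "sam", approved=True, rights_cleared=True)
    review(batch[1], "sam", approved=True, rights_cleared=False)
    print([a.asset_id for a in publishable(batch)])
```

Defaulting every new asset to pending means nothing generated can reach the publish step without a recorded decision.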

06 · Where To Start · One workflow, this week.

The practical opening is narrow on purpose. Pick a single high-volume creative task you already do by hand (product-image variations, social cutdowns, localized banners) and rebuild just that one as a Nano Banana 2 Lite pipeline, with a human approving the output. Prove the quality and the cost on a real workflow before you add Omni Flash video on top, since the image side is rolling out generally and the video side is still a preview.

From there the sequencing mirrors how we roll any generative system out with clients: prove value on one governed workflow, add the video step once the image quality earns trust, and only then widen to more creative types — keeping brand review and rights checks in the loop throughout. That scoping is exactly where our AI content engine and e-commerce engagements begin, before any tool commitment.

07 · Conclusion · The constraint moves from cost to taste.

The shape of creative AI, June 2026

When image and video generation get this cheap and fast, the bottleneck stops being production.

Nano Banana 2 Lite and Gemini Omni Flash are best read as one release: a fast, cheap image model and a conversational video model designed to chain into a single image-to-video pipeline. The individual specs — about four seconds per image at $0.034, video at $0.10 a second — matter less than what they unlock together, which is high-volume creative production at a price that makes most of it viable for the first time.

Keep the framing precise. Nano Banana 2 Lite is generally rolling out; Gemini Omni Flash is a public preview with real gaps — no audio references yet, shaky cross-scene consistency, video references that do not fully work. The figures are Google-stated, not independently verified. Prototype eagerly, but ship deliberately, with a human reviewing what goes out.

The forward read is the same one that followed the original Nano Banana. When production gets cheap, the differentiator stops being who can make the asset and becomes who makes the right one — on-brand, accurate, and worth a customer’s attention. The teams that win will pair this throughput with judgment about what is actually worth producing. That, not the price per image, is what this release changes.

Turn generative media into a governed pipeline

Cheap, fast creative is only an advantage when it stays on brand.

We help marketing and e-commerce teams turn models like Nano Banana 2 Lite and Gemini Omni Flash into governed creative pipelines — product media and ad-variant generation at volume, with brand review and rights checks in the loop, delivered in days not quarters.

Free consultation · Expert guidance · Tailored solutions
What we work on

Generative creative engagements

  • Product-media pipelines — image to short-form video
  • Ad-creative variation and testing at volume
  • Localized content without per-market shoots
  • Brand review and rights checks in the loop
  • Cost modelling per asset across the Nano Banana tiers
FAQ · Nano Banana 2 Lite & Gemini Omni Flash

The questions we get every week.

What is Nano Banana 2 Lite?
Nano Banana 2 Lite (model ID gemini-3.1-flash-lite-image) is the fastest, most cost-efficient model in Google's Nano Banana image-generation family, announced June 30, 2026. Google states it returns a text-to-image in about four seconds at $0.034 per 1K-resolution image, and that it keeps reliable prompt adherence, strong character consistency, and legible in-image text despite the speed focus. It is built for near-real-time, high-volume creative pipelines, and is available through Google AI Studio, the Gemini API, and the Gemini Enterprise Agent Platform, with rollout to consumer products including AI Mode in Search, the Gemini app, NotebookLM, Google Photos, and Google Ads.