Google releases Gemini native image models to general availability, adds video-to-image generation
Google has moved two of its native Gemini image generation models from preview to general availability on the Gemini API: gemini-3.1-flash-image and gemini-3-pro-image. The GA release also introduces video-to-image generation, a new capability that lets developers pass video files or YouTube URLs as inputs to generate thumbnails, cinematic posters, and summary infographics.
What's new
From the Gemini API changelog for May 28, 2026:
"Released gemini-3.1-flash-image (Nano Banana 2) and gemini-3-pro-image (Nano Banana Pro), the generally available (GA) versions of our native visual models"
The two models serve distinct use cases:
- gemini-3.1-flash-image: optimized for speed and cost, suited to high-throughput image generation tasks
- gemini-3-pro-image: the higher-capability option, targeting professional-grade creative and commercial output
Video-to-image generation is the new capability that ships alongside the GA promotion. Developers can now pass a video file or a YouTube URL alongside a text prompt, and the model generates a relevant image — the changelog specifies use cases including "high-quality thumbnails, cinematic movie posters, or summary infographics."
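As a rough illustration of what such a request might look like, the sketch below builds a JSON body that pairs a YouTube URL with a text prompt. The helper name build_video_to_image_request is hypothetical, and the part layout (a file_data part followed by a text part) is borrowed from how the Gemini API accepts YouTube URLs for video understanding. The changelog excerpt does not confirm that the image models use the same layout, so treat it as an assumption. The video URL is a placeholder, and nothing is sent over the network.

```python
import json

# Assumption: the GA model ID comes from the changelog; the endpoint follows the
# Gemini API's generateContent REST pattern and is shown for orientation only.
MODEL_ID = "gemini-3.1-flash-image"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL_ID}:generateContent"
)


def build_video_to_image_request(video_url: str, prompt: str) -> dict:
    """Build a request body pairing a video reference with a text prompt.

    Hypothetical helper: the part layout mirrors the Gemini API's video
    understanding requests; it is assumed, not confirmed, for image models.
    """
    if not video_url.startswith(("http://", "https://")):
        raise ValueError("expected an http(s) video URL")
    return {
        "contents": [
            {
                "parts": [
                    {"file_data": {"file_uri": video_url}},  # video input
                    {"text": prompt},  # instruction for the image to generate
                ]
            }
        ]
    }


body = build_video_to_image_request(
    "https://www.youtube.com/watch?v=VIDEO_ID",  # placeholder, not a real video
    "Generate a high-quality thumbnail for this video.",
)
payload = json.dumps(body, indent=2)
print(payload)
```

Keeping the body construction separate from the HTTP call makes it easy to inspect or log the request before sending it with whichever client a project already uses.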
The corresponding preview model versions — gemini-3.1-flash-image-preview and gemini-3-pro-image-preview — will shut down on June 25, 2026. Developers on the preview models have approximately four weeks to migrate.
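For teams that store model IDs in configuration, the migration can be as small as a lookup table. The following is a minimal sketch: the model IDs and the shutdown date come from the changelog as reported here, while the helper names resolve_model_id and days_until_shutdown are illustrative and not part of any Google SDK.

```python
from datetime import date

# Model IDs and shutdown date as reported from the changelog.
PREVIEW_TO_GA = {
    "gemini-3.1-flash-image-preview": "gemini-3.1-flash-image",
    "gemini-3-pro-image-preview": "gemini-3-pro-image",
}
PREVIEW_SHUTDOWN = date(2026, 6, 25)


def resolve_model_id(model_id: str) -> str:
    """Return the GA ID for a preview ID; pass any other ID through unchanged."""
    return PREVIEW_TO_GA.get(model_id, model_id)


def days_until_shutdown(today: date) -> int:
    """Days left before the preview models shut down (negative once past)."""
    return (PREVIEW_SHUTDOWN - today).days


print(resolve_model_id("gemini-3-pro-image-preview"))  # gemini-3-pro-image
print(days_until_shutdown(date(2026, 5, 28)))  # 28
```

Measured from the May 28 changelog entry, the window is 28 days, which matches the roughly four weeks noted above.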
Context
These models are Google's native multimodal image generation capability within the Gemini API, distinct from third-party image generation models available through Vertex AI. They are designed to accept text, images, and now video as inputs, and produce images as output — making them natural fits for workflows that need coherent cross-modal transformation.
The GA promotion follows Google I/O 2026, where Google outlined its broader Gemini 3.x family strategy. The image models are part of an effort to give enterprise developers production-stable access to Google's image generation capabilities without relying on preview-tier SLAs.
The video-to-image capability is a notable addition to the API surface. While text-to-image generation is well-established, video-as-input opens a distinct class of use cases: automatically generating cover images for video content at scale, creating visual summaries of video material, or building tools that extract keyframe-aligned imagery from video inputs.
Why it matters
GA status signals production-readiness in Google's release taxonomy: these models now carry stability commitments and predictable lifecycle windows that preview models do not. For enterprise developers, that distinction matters — building production pipelines on preview models carries deprecation risk, while GA models provide a contractual basis for SLA planning.
The video-to-image capability, while not widely previewed before this changelog entry, fills a practical gap in the market. Most image generation APIs are text-first; adding video as a first-class input type positions Gemini's image models for media workflows that text-first APIs do not directly serve.
The June 25 shutdown of the preview versions creates a short migration window. Developers using the preview model IDs in production should treat this as an immediate action item.
Corroborating sources
- Gemini API changelog (ai.google.dev)
https://ai.google.dev/gemini-api/docs/changelog.md.txt
“Released `gemini-3.1-flash-image` (Nano Banana 2) and `gemini-3-pro-image` (Nano Banana Pro), the generally available (GA) versions of our native visual models”