Google Retires All Gemini 2.0 Flash Models, Directs Developers to Gemini 3.5 Flash
Google shut down all four Gemini 2.0 Flash model variants on June 1, 2026, completing the retirement of the 2.0 Flash generation and clearing a path for Gemini 3.5 Flash as the standard fast-inference model in the Gemini API.
What Changed
Four model IDs were deactivated on June 1, 2026:
gemini-2.0-flashgemini-2.0-flash-001gemini-2.0-flash-litegemini-2.0-flash-lite-001
Google's documentation directs developers to migrate to either gemini-3.5-flash or gemini-3.1-flash-lite, depending on their latency and cost requirements. Requests to any of the retired model IDs now return errors.
Context
Gemini 2.0 Flash launched in early 2025 as Google's fast, cost-efficient model optimized for high-frequency inference. It was widely adopted for search grounding, short-context Q&A, and high-throughput production workloads. The Lite variant targeted even lower-latency use cases.
Both successors are now generally available:
gemini-3.5-flash: GA as of May 19, 2026, targeting agentic tasks and coding with a significant capability step up from 2.0 Flashgemini-3.1-flash-lite: GA as the cost-efficient fast-inference replacement forgemini-2.0-flash-lite, targeting high-volume, lower-complexity use cases
The gemini-3.1-flash-lite-preview model was shut down on May 25, 2026 — one week before the full 2.0 Flash line retired — giving developers who used the preview a brief window to migrate.
Why It Matters
For developers with production systems on gemini-2.0-flash, the forced migration requires testing prompt compatibility and output consistency against the new model versions. Unlike soft deprecations with grace-period fallbacks, the retired IDs return errors immediately — any app that hasn't updated its model ID is now broken.
The retirement illustrates Google's accelerating model turnover pace in 2026. Gemini 2.0 Flash launched roughly 18 months before retirement — a lifecycle that matches OpenAI and Anthropic's increasingly rapid model replacement cadence. Developers building on Google's models should factor in similar transitions for current 3.x models as a 4.x family emerges.
The migration path from gemini-2.0-flash to gemini-3.5-flash is not a lateral swap: 3.5 Flash is positioned as a more capable model, not just a rename, which means developers may see improved results but should validate regression-sensitive workflows before relying on it in production.
Corroborating sources
- Ai.google
https://ai.google.dev/gemini-api/docs/changelog.md.txt
“The following Gemini 2.0 models are now shut down.”