Google Retires All Gemini 2.0 Flash Models, Directs Developers to Gemini 3.5 Flash

Jun 1, 2026 · published Jun 10, 2026

Google shut down all four Gemini 2.0 Flash model variants on June 1, 2026, completing the retirement of the 2.0 Flash generation and clearing a path for Gemini 3.5 Flash as the standard fast-inference model in the Gemini API.

What Changed

Four model IDs were deactivated on June 1, 2026:

gemini-2.0-flash
gemini-2.0-flash-001
gemini-2.0-flash-lite
gemini-2.0-flash-lite-001

Google's documentation directs developers to migrate to either gemini-3.5-flash or gemini-3.1-flash-lite, depending on their latency and cost requirements. Requests to any of the retired model IDs now return errors.
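As a rough illustration of the migration described above, the sketch below maps each retired ID to a successor and flags retired IDs in a list of configured models. The pairing is an assumption for illustration (Flash to gemini-3.5-flash, Lite to gemini-3.1-flash-lite); the right target depends on each workload's latency and cost requirements. The helper names `resolve_model_id` and `find_retired` are hypothetical and not part of any Google SDK.

```python
# Retired on June 1, 2026 (IDs taken from the list above).
# The successor chosen for each ID is an assumed default, not an official mapping.
RETIRED_TO_SUCCESSOR = {
    "gemini-2.0-flash": "gemini-3.5-flash",
    "gemini-2.0-flash-001": "gemini-3.5-flash",
    "gemini-2.0-flash-lite": "gemini-3.1-flash-lite",
    "gemini-2.0-flash-lite-001": "gemini-3.1-flash-lite",
}


def resolve_model_id(model_id: str) -> str:
    """Return the assumed successor for a retired ID; pass other IDs through."""
    return RETIRED_TO_SUCCESSOR.get(model_id, model_id)


def find_retired(configured_ids):
    """Return the configured IDs that are retired, in their original order."""
    return [m for m in configured_ids if m in RETIRED_TO_SUCCESSOR]


if __name__ == "__main__":
    configured = ["gemini-2.0-flash-001", "gemini-3.5-flash"]
    for model_id in find_retired(configured):
        print(f"{model_id} is retired; consider {resolve_model_id(model_id)}")
```

A check like this could run at startup or in CI to surface stale model IDs before they fail in production.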

Context

Gemini 2.0 Flash launched in early 2025 as Google's fast, cost-efficient model optimized for high-frequency inference. It was widely adopted for search grounding, short-context Q&A, and high-throughput production workloads. The Lite variant targeted even lower-latency use cases.

Both successors are now generally available:

gemini-3.5-flash: GA as of May 19, 2026, targeting agentic tasks and coding with a significant capability step up from 2.0 Flash
gemini-3.1-flash-lite: GA as the cost-efficient fast-inference replacement for gemini-2.0-flash-lite, targeting high-volume, lower-complexity use cases

The gemini-3.1-flash-lite-preview model was shut down on May 25, 2026 — one week before the full 2.0 Flash line retired — giving developers who used the preview a brief window to migrate.

Why It Matters

For developers with production systems on gemini-2.0-flash, the forced migration requires testing prompt compatibility and output consistency against the new model versions. Unlike soft deprecations with grace-period fallbacks, the retired IDs return errors immediately — any app that hasn't updated its model ID is now broken.

The retirement illustrates Google's accelerating model turnover in 2026. Gemini 2.0 Flash launched roughly 18 months before its retirement, a lifecycle in line with OpenAI's and Anthropic's increasingly rapid model replacement cadence. Developers building on Google's models should plan for similar transitions for current 3.x models as a 4.x family emerges.

The migration path from gemini-2.0-flash to gemini-3.5-flash is not a lateral swap: 3.5 Flash is positioned as a more capable model, not just a rename, which means developers may see improved results but should validate regression-sensitive workflows before relying on it in production.
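One way to validate regression-sensitive workflows is a small harness that replays known prompts against the new model and applies a per-prompt check. The sketch below is a minimal version under stated assumptions: `generate` is a hypothetical callable that wraps whatever client an application already uses, and `fake_generate` is a stand-in with canned text so the example runs without network access. Neither is part of the Gemini API.

```python
from typing import Callable, List, Tuple

# A case pairs a prompt with a check that returns True for acceptable output.
Case = Tuple[str, Callable[[str], bool]]


def run_regression(
    generate: Callable[[str, str], str],
    model_id: str,
    cases: List[Case],
) -> List[str]:
    """Return the prompts whose output failed its check for the given model."""
    failures = []
    for prompt, check in cases:
        output = generate(model_id, prompt)
        if not check(output):
            failures.append(prompt)
    return failures


def fake_generate(model_id: str, prompt: str) -> str:
    # Hypothetical stand-in for a real API call; returns canned responses.
    canned = {
        "Reply with the word OK.": "OK",
        "Return a JSON object.": "Sure! {}",
    }
    return canned.get(prompt, "")


if __name__ == "__main__":
    cases = [
        ("Reply with the word OK.", lambda out: out.strip() == "OK"),
        ("Return a JSON object.", lambda out: out.strip().startswith("{")),
    ]
    failed = run_regression(fake_generate, "gemini-3.5-flash", cases)
    print(f"{len(failed)} of {len(cases)} cases failed: {failed}")
```

Passing the client in as a function keeps the checks independent of any particular SDK, so the same cases can be replayed against each candidate model.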

Corroborating sources

Ai.google
https://ai.google.dev/gemini-api/docs/changelog.md.txt
“The following Gemini 2.0 models are now shut down.”
