Google shuts down Gemini 2.0 Flash and Lite models, urges migration to 3.x series
Google retired four Gemini 2.0 API models on June 1, 2026, ending support for the Flash and Flash Lite variants that anchored the 2.0 generation. Developers still hitting these endpoints receive errors; Google's official recommendation is to migrate to Gemini 3.5 Flash or Gemini 3.1 Flash Lite.
What's new
- Shut down:
gemini-2.0-flash,gemini-2.0-flash-001,gemini-2.0-flash-lite,gemini-2.0-flash-lite-001 - Effective date: June 1, 2026
- Recommended replacements:
gemini-3.5-flash— high-performance successor, described as Google's most capable model for agentic and coding tasksgemini-3.1-flash-lite— lower-cost, high-throughput alternative
Context
The Gemini 2.0 family was Google's primary API offering through late 2025 and early 2026, widely used for its speed and cost profile. It was superseded by the Gemini 3.x series, which Google launched at I/O 2026 in May with Gemini 3.5 Flash, Gemini Spark, and Gemini Omni — a broader, more capable lineup with native managed agents and expanded multimodal capabilities.
This shutdown follows a string of earlier cleanups. In late May, Google retired the gemini-3.1-flash-lite-preview endpoint. On May 28, Google moved its visual generation models to GA and simultaneously deprecated the preview variants. The June 1 Gemini 2.0 shutdown is the largest batch retirement yet.
The 2.0 Flash models were particularly common in high-throughput use cases — document processing, real-time chat, agentic pipelines — because of their latency characteristics. Gemini 3.1 Flash Lite is the closest structural successor for cost-conscious workloads; Gemini 3.5 Flash is the performance upgrade path.
Why it matters
This is a hard shutdown, not a soft deprecation. Any code calling the retired model IDs as of June 1 fails immediately with no grace period. Teams that built integrations during the Gemini 2.0 era and haven't acted on earlier migration guidance are now seeing live errors in production.
The compressed timeline from 2.0 to 3.x — roughly six months from the 3.x launch to 2.0 shutdown — reflects Google's accelerating model release cadence. Gemini API users should treat model version pinning as a risk factor rather than a stability feature: Google's deprecation windows are measured in months, not years.
For teams migrating to Gemini 3.5 Flash, the capability gain is substantial: Google positioned it as their frontier agentic model at I/O 2026. The tradeoff is that it carries higher pricing than 2.0 Flash. Teams sensitive to cost should benchmark 3.1 Flash Lite before committing to the higher-tier model.
Corroborating sources
- Ai.google
https://ai.google.dev/gemini-api/docs/changelog.md.txt
“The following Gemini 2.0 models are now shut down: gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite, gemini-2.0-flash-lite-001”