Google shuts down Gemini 2.0 Flash family on June 1, directing developers to Gemini 3.5 Flash
Google formally ended API access to all four Gemini 2.0 Flash models on June 1, 2026. Requests to gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite, and gemini-2.0-flash-lite-001 now fail. The Gemini API changelog instructs developers to migrate to gemini-3.5-flash or gemini-3.1-flash-lite.
What's new
According to the official Gemini API changelog at ai.google.dev:
"The following Gemini 2.0 models are now shut down: gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite, gemini-2.0-flash-lite-001. Use gemini-3.5-flash or gemini-3.1-flash-lite instead."
All four models are no longer accessible via the Gemini API. No grace-period extension was announced.
Recommended migration paths:
- gemini-2.0-flash → gemini-3.5-flash: Higher capability, updated tokenizer, new pricing tier
- gemini-2.0-flash-lite → gemini-3.1-flash-lite: Cost-optimized path for high-volume, latency-sensitive applications
Context
Google announced the June 1 shutdown in its Gemini API changelog on February 18, 2026, giving developers approximately 3.5 months of advance notice. The Gemini 2.0 Flash family launched in late 2024 as Google's speed-and-cost efficiency tier and was widely adopted in production applications — particularly for high-throughput inference tasks where token cost is the primary constraint.
The Gemini 3.x series represents a substantial generational jump. Gemini 3.5 Flash introduces the updated Gemini 3 tokenizer, extended context support, and improved reasoning capabilities. However, the tokenizer change means prompt costs are not directly comparable between the 2.0 and 3.x series — developers need to recount tokens under the new model before estimating cost impact.
At the same time, the May 28, 2026 Gemini API changelog shows gemini-3.1-flash-image and gemini-3-pro-image reaching general availability, and preview versions of those models face deprecation by June 25.
Why it matters
For production teams that built on Gemini 2.0 Flash as a stable, versioned API endpoint, June 1 was a hard cutover: applications using the model ID without fallback logic have been broken since that date. The shutdown also marks the end of the Gemini 2.0 generation as a live inference option — pushing all Google API users onto the Gemini 3.x lineup or newer.
The migration affects not only model IDs in code but also cost modeling, latency benchmarks, and output quality validations. Teams running A/B tests or fine-tuning pipelines will need to re-calibrate against the new model versions. The parallel sunset of the gemini-3.1-flash-image preview (June 25 deadline) adds urgency for any teams building image-generation workflows on Gemini.
Corroborating sources
- Ai.google
https://ai.google.dev/gemini-api/docs/changelog.md.txt
“The following Gemini 2.0 models are now shut down: gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite, gemini-2.0-flash-lite-001”