Google shuts down the Gemini 2.0 Flash family on the Gemini API
Google on June 1 shut down the entire Gemini 2.0 Flash family on the Gemini API, retiring gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite, and gemini-2.0-flash-lite-001 in a single coordinated turn-off. Requests to those model IDs no longer route; Google is steering developers to gemini-3.5-flash and gemini-3.1-flash-lite as the recommended replacements.
What's new
- Four model IDs in the Gemini 2.0 Flash family are now non-functional on the Gemini API:
gemini-2.0-flash,gemini-2.0-flash-001,gemini-2.0-flash-lite,gemini-2.0-flash-lite-001. - The official migration paths in Google's changelog are
gemini-3.5-flash(the latest balanced Flash tier) for general-purpose workloads andgemini-3.1-flash-litefor cost-sensitive or latency-critical workloads. - The shutdown is on the Gemini API specifically. Developers using Vertex AI typically see model retirements on a separate, often longer timeline.
Context
Gemini 2.0 Flash was introduced in late 2024 and became one of the most heavily used model IDs across the Gemini API thanks to its combination of low latency and very low per-token pricing -- for many production teams it was the default workhorse model through most of 2025. Google began signaling its retirement well in advance: when Gemini 3 launched, the 2.0 family moved into a publicly scheduled deprecation window, and Google has been pushing developers toward the 3.x Flash tiers in changelog notices and developer relations posts for months.
The shutdown closes a deprecation cycle that has been quietly running since early 2026. It also lands the same week that Google promoted Gemini 3.5 Flash to general availability on the Gemini API and shipped the Nano Banana 2 and Nano Banana Pro image models, tightening the supported model lineup around the 3.x line.
Why it matters
For any team still pinning gemini-2.0-flash or one of its -001 variants in a production call site, this is a hard cutover, not a deprecation warning: those IDs are off, and outgoing requests will start failing immediately until they are repointed. The migration to gemini-3.5-flash and gemini-3.1-flash-lite is mostly mechanical -- the request shape on the Gemini API is broadly compatible -- but token accounting, default safety behavior, and per-call cost will all change, so teams should expect a short period of monitoring to recalibrate budgets and dashboards. More broadly, the shutdown is a useful data point on Google's model-lifecycle cadence: the 2.0 generation lasted roughly 18 months from launch to full retirement on the API, in line with Google's recent pattern of moving aggressively to consolidate around the latest generation rather than carrying old IDs indefinitely. Developers planning long-lived workloads on the Gemini API should size that 18-month horizon into their model-version risk planning.
Corroborating sources
- Changelog
https://ai.google.dev/gemini-api/docs/changelog
“The following Gemini 2.0 models are now shut down:”