Google shuts down the Gemini 2.0 Flash family on the Gemini API

Jun 1, 2026 · published Jun 4, 2026

Google on June 1 shut down the entire Gemini 2.0 Flash family on the Gemini API, retiring gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite, and gemini-2.0-flash-lite-001 in a single coordinated shutdown. Requests to those model IDs are no longer routed; Google is steering developers to gemini-3.5-flash and gemini-3.1-flash-lite as the recommended replacements.

What's new

Four model IDs in the Gemini 2.0 Flash family are now non-functional on the Gemini API: gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite, gemini-2.0-flash-lite-001.
The official migration paths in Google's changelog are gemini-3.5-flash (the latest balanced Flash tier) for general-purpose workloads and gemini-3.1-flash-lite for cost-sensitive or latency-critical workloads.
The shutdown applies to the Gemini API specifically. Developers using Vertex AI typically see model retirements on a separate, often longer timeline.
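
For teams auditing where these IDs are still pinned, the check can be sketched as a plain text scan. This is a minimal illustration, not an official tool: the function name find_retired_ids and the matching pattern are assumptions made for this example, and only the four model IDs come from the list above.

```python
import re

# The four retired IDs listed above; any other model ID is ignored.
RETIRED_IDS = {
    "gemini-2.0-flash",
    "gemini-2.0-flash-001",
    "gemini-2.0-flash-lite",
    "gemini-2.0-flash-lite-001",
}

# Capture each whole "gemini-..." token so a longer ID that merely starts
# with a retired ID is compared in full rather than flagged by prefix.
_MODEL_ID = re.compile(r"(?<![\w-])gemini-[\w.-]*\w")


def find_retired_ids(text):
    """Return (line_number, model_id) pairs for each retired ID found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _MODEL_ID.finditer(line):
            if match.group(0) in RETIRED_IDS:
                hits.append((lineno, match.group(0)))
    return hits
```

Running this over configuration files and source text gives a list of line numbers to repoint. Because it compares whole tokens, an ID with an extra suffix is not reported.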

Context

Gemini 2.0 Flash was introduced in late 2024 and became one of the most heavily used model IDs across the Gemini API thanks to its combination of low latency and very low per-token pricing. For many production teams it was the default workhorse model through most of 2025. Google began signaling its retirement well in advance: when Gemini 3 launched, the 2.0 family moved into a publicly scheduled deprecation window, and Google has been pushing developers toward the 3.x Flash tiers in changelog notices and developer relations posts for months.

The shutdown closes a deprecation cycle that has been running since early 2026. It also lands the same week that Google promoted Gemini 3.5 Flash to general availability on the Gemini API and shipped the Nano Banana 2 and Nano Banana Pro image models, tightening the supported model lineup around the 3.x line.

Why it matters

For any team still pinning gemini-2.0-flash, gemini-2.0-flash-lite, or one of their -001 variants in a production call site, this is a hard cutover, not a deprecation warning: those IDs are off, and requests to them will fail until they are repointed. The migration to gemini-3.5-flash and gemini-3.1-flash-lite is mostly mechanical, since the request shape on the Gemini API is broadly compatible, but token accounting, default safety behavior, and per-call cost will all change, so teams should expect a short period of monitoring to recalibrate budgets and dashboards.

More broadly, the shutdown is a useful data point on Google's model-lifecycle cadence: the 2.0 generation lasted roughly 18 months from launch to full retirement on the API, in line with Google's recent pattern of consolidating aggressively around the latest generation rather than carrying old IDs indefinitely. Developers planning long-lived workloads on the Gemini API should factor that 18-month horizon into their model-version risk planning.
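
The repointing step can be sketched as a small lookup applied before a request is built. This is a hedged illustration only: the helper name resolve_model is hypothetical, and the pairing of each retired ID with a replacement is an assumption (Flash to gemini-3.5-flash, Flash-Lite to gemini-3.1-flash-lite), since the changelog as quoted here names the two replacements by workload type rather than per ID.

```python
# Assumed pairing of retired IDs to replacements; adjust per workload.
# The article names the two replacement models but not a per-ID mapping.
REPLACEMENTS = {
    "gemini-2.0-flash": "gemini-3.5-flash",
    "gemini-2.0-flash-001": "gemini-3.5-flash",
    "gemini-2.0-flash-lite": "gemini-3.1-flash-lite",
    "gemini-2.0-flash-lite-001": "gemini-3.1-flash-lite",
}


def resolve_model(model_id, replacements=REPLACEMENTS):
    """Return (model_to_use, was_remapped) for a configured model ID."""
    replacement = replacements.get(model_id)
    if replacement is None:
        # Not a retired ID: pass it through unchanged.
        return model_id, False
    return replacement, True
```

Returning the was_remapped flag lets a caller log or count remapped requests during the monitoring period, so the temporary shim can be removed once every call site has been updated.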

Corroborating sources

Changelog
https://ai.google.dev/gemini-api/docs/changelog
“The following Gemini 2.0 models are now shut down:”
