The cheapest and fastest model in the Gemini 2.5 family: a lightweight, low-latency variant optimized for high-volume, cost-sensitive workloads. Keeps the 1M-token context window and multimodal input but trades peak accuracy for throughput and price.
Profile normalized against the 47-model field
Context window · 1.05M of 1.05M · 100%
Max output · 66K of 272K · 24%
Output speed · 320 tok/s · 100%
Affordability · $0.40 / Mtok out · 100%
Capability breadth · 9 of 11 · 82%
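The percentages in this profile appear to be each value divided by the largest value in the field, rounded to a whole percent. The sketch below reproduces two of them under that assumption. The function name `normalized_percent` is hypothetical, and the inputs are the rounded figures shown on the card (66K, 272K), not exact token counts. Affordability is left out because the card does not say how a price is turned into a score.

```python
def normalized_percent(value: float, field_max: float) -> int:
    """Return `value` as a whole-number percentage of the field maximum.

    Assumption: the card normalizes by simple division against the best
    value in the field; the page does not state this explicitly.
    """
    if field_max <= 0:
        raise ValueError("field_max must be positive")
    return round(100 * value / field_max)


# Max output, in thousands of tokens (66K of 272K).
max_output_pct = normalized_percent(66, 272)
# Capability switches enabled (9 of 11).
breadth_pct = normalized_percent(9, 11)
print(max_output_pct, breadth_pct)
```

Both results match the percentages printed on the card, which is consistent with the division assumption but does not confirm it.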
Capability switches · 9 of 11
Reasoning mode
Tool / function use
Streaming
JSON mode
Structured outputs
Prompt caching
Fine-tuning
Web search
Code execution
Vision input
Audio input
Specifications
Every value carries a primary source and a verification date.
Capacity
Context window
1.05M
Max output
66K
Pricing
Input $/Mtok
$0.10 / Mtok
Cached input $/Mtok
$0.01 / Mtok (text/image/video)
Output $/Mtok
$0.40 / Mtok
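As a rough illustration of how these rates combine, the sketch below estimates the cost of a single request. It assumes cached input tokens are billed at the cached rate in place of the standard input rate, and it ignores anything the card does not list (cache storage fees, separate audio rates, batch discounts). The function name `estimate_cost_usd` is hypothetical.

```python
# Prices in USD per million tokens, copied from the pricing figures above.
INPUT_PER_MTOK = 0.10
CACHED_INPUT_PER_MTOK = 0.01  # listed for text/image/video input
OUTPUT_PER_MTOK = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached tokens are a subset of input_tokens and are billed
    at the cached rate in place of the standard input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    fresh_input_tokens = input_tokens - cached_input_tokens
    total = (fresh_input_tokens * INPUT_PER_MTOK
             + cached_input_tokens * CACHED_INPUT_PER_MTOK
             + output_tokens * OUTPUT_PER_MTOK)
    return total / 1_000_000


# Example: a 200K-token prompt, half of it served from cache, 2K tokens out.
print(f"${estimate_cost_usd(200_000, 2_000, cached_input_tokens=100_000):.4f}")
```

With these figures, output tokens cost four times as much as fresh input tokens, so long generations can dominate the bill even when prompts are large.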
Performance
Output speed
320 tok/s
Time to first token
210 ms
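The two performance figures can be combined into a back-of-the-envelope response time: time to first token plus output length divided by output speed. The sketch below assumes both figures hold steady for the whole response, which real traffic will not guarantee; the function name `estimate_response_seconds` is hypothetical.

```python
# Performance figures copied from the card above.
OUTPUT_TOKENS_PER_SECOND = 320.0
TIME_TO_FIRST_TOKEN_S = 0.210


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate wall-clock seconds to receive a full response.

    Assumption: constant decode speed after the first token, with no
    queueing, network, or reasoning-mode overhead.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TIME_TO_FIRST_TOKEN_S + output_tokens / OUTPUT_TOKENS_PER_SECOND


# Example: a 1,000-token answer.
print(f"{estimate_response_seconds(1_000):.3f} s")
```

For short answers the time to first token dominates; for long ones the decode rate does.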
Capabilities
Input modalities
Text, Image, Video, Audio, PDF
Output modalities
Text
Reasoning mode
Yes
Tool / function use
Yes
Streaming
Yes
JSON mode
Yes
Structured outputs
Yes
Prompt caching
Yes
Web search
Yes
Vision input
Yes
Audio input
Yes
API
API model ID
gemini-2.5-flash-lite
Batch API
Yes
General
Knowledge cutoff
January 2025
Benchmarks
Sourced evaluation scores, each verified against its primary source.