The Pro tier of the original Gemini 3 line: a high-accuracy multimodal model with a 2M-token context window and strong reasoning and coding performance. Still current and widely deployed, sitting just below Gemini 3.5 Pro on quality and price.
Every value carries a primary source and a verification date.
Sourced evaluation scores, each verified against its primary source.
GPQA Diamond: 88.9%
SWE-bench Verified: 74.2%
Humanity's Last Exam (no tools): 28.4%
AIME 2025: 91.2%
MMMU-Pro: 78.6%
MMLU-Pro: 87.3%
LiveCodeBench: 76.8%