Gemini 2.5 Pro Benchmark Results

Oct 14, 2025

Gemini 2.5 Pro is Google DeepMind's most capable multimodal LLM, excelling in coding, reasoning, math, and long-context comprehension. The model outperforms Gemini 1.5 Pro by more than 120 Elo points on LM Arena and offers a 1-million-token context window, dynamic thinking, and advanced tool use.
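
For context, under the standard Elo expectation formula used by LM Arena-style leaderboards, a 120-point gap corresponds to roughly a two-in-three expected preference rate for the stronger model. The snippet below is only an illustrative check of that arithmetic, not part of the benchmark itself:

```python
# Expected preference rate for a model rated `delta` Elo points above its opponent,
# using the standard Elo expectation formula E = 1 / (1 + 10 ** (-delta / 400)).
def expected_win_rate(delta: float) -> float:
    return 1.0 / (1.0 + 10 ** (-delta / 400.0))

print(f"{expected_win_rate(120):.3f}")  # ~0.666, i.e. about two wins in three head-to-head votes
```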

Gemini 2.5 Pro is a sparse Mixture-of-Experts transformer trained on Google's TPUv5p clusters with native multimodal input support (text, vision, audio, video, and code). It achieves state-of-the-art results across core benchmarks: 74.2% on LiveCodeBench, 82.2% on Aider Polyglot, 86.4% on GPQA Diamond, 88.0% on AIME 2025, and 82.0% on MMMU. Long-context retrieval reaches 99.8% accuracy at 1 million tokens. The model supports tool use, dynamic thinking budgets, and audio-visual dialog generation. Its training-data cutoff is January 2025, and it was post-trained with reinforcement learning from human feedback for helpfulness and safety.
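
As a rough illustration of how the thinking budget is exposed to developers, the sketch below assumes the google-genai Python SDK and the public model name "gemini-2.5-pro"; exact parameter names and defaults may differ between SDK releases:

```python
# Minimal sketch only (assumes the google-genai Python SDK; names may vary by release).
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Summarize the trade-offs between sparse MoE and dense transformers.",
    config=types.GenerateContentConfig(
        # Dynamic thinking: cap the tokens the model may spend on internal reasoning.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
        max_output_tokens=512,
    ),
)
print(response.text)
```

Tool use and multimodal inputs follow the same request pattern: tools are declared in the request config, and image, audio, or video parts are passed alongside the text prompt.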

Status: Current
License: Proprietary
Training: Pretrained, instruction-tuned, reinforcement learning, continual learning

Comparison 

Sourced on: October 14, 2025

Gemini 2.5 Pro achieves state-of-the-art results in reasoning, coding, and multimodal tasks, ranking #2 overall on LM Arena behind GPT-4o and ahead of Claude 4 Opus. It scores 74.2% on LiveCodeBench, 82.2% on Aider Polyglot, 87.8% on FACTS Grounding, and 88.0% on AIME 2025. In video understanding it scores 83.6% on Video-MMMU and 86.9% on Video-MME. Together, these results make Gemini 2.5 Pro the most capable Google LLM to date.

| Benchmark | Gemini 2.5 Pro | OpenAI o3-high | OpenAI o4-mini | Claude 4 Sonnet | Claude 4 Opus | Grok 3 Beta | DeepSeek R1 0528 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LiveCodeBench | 74.2 | 72.0 | 75.8 | 48.9 | 51.1 | – | 70.5 |
| Aider Polyglot | 82.2 | 79.6 | 72.0 | 61.3 | 72.0 | 53.3 | 71.6 |
| SWE-Bench (single attempt) | 59.6 | 69.1 | 68.1 | 72.7 | 72.5 | – | – |
| SWE-Bench (multi-attempt) | 67.2 | – | – | 80.2 | 79.4 | – | 57.6 |
| GPQA (Diamond) | 86.4 | 83.3 | 81.4 | 75.4 | 79.6 | 80.2 | 81.0 |
| Humanity's Last Exam (no tools) | 21.6 | 20.3 | 18.1 | 7.8 | 10.7 | – | 14.0 |
| SimpleQA | 54.0 | 48.6 | 19.3 | – | – | – | 27.8 |
| FACTS Grounding | 87.8 | 69.9 | 62.1 | 79.1 | 77.7 | 74.8 | 82.4 |
| AIME 2025 | 88.0 | 88.9 | 92.7 | 70.5 | 75.5 | 77.3 | 87.5 |
| LOFT (≤128K context) | 87.0 | 77.0 | 60.5 | 81.6 | 73.1 | – | – |
| LOFT (1M context) | 69.8 | – | – | – | – | – | – |
| MRCR-V2 (≤128K context) | 58.0 | 57.1 | 36.3 | 39.1 | 16.1 | – | 34.0 |
| MRCR-V2 (1M context) | 16.4 | – | – | – | – | – | – |
| MMMU (Multimodal Reasoning) | 82.0 | 82.9 | 81.6 | 74.4 | 76.5 | 76.0 | – |

All scores are percentages; "–" indicates that no result was reported for that model.
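
The Video-MMMU and Video-MME results quoted above rely on the model's native video input. As a rough sketch of how a short clip might be passed alongside a text prompt, again assuming the google-genai Python SDK, with "lecture.mp4" as a purely hypothetical file name:

```python
# Sketch only: one request combining a short video and a text prompt
# (assumes the google-genai Python SDK; "lecture.mp4" is a placeholder file).
from google import genai
from google.genai import types

client = genai.Client()

with open("lecture.mp4", "rb") as f:
    video_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents=[
        types.Part.from_bytes(data=video_bytes, mime_type="video/mp4"),
        "List the key steps demonstrated in this clip.",
    ],
)
print(response.text)
```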

Team 

Gemini 2.5 Pro was developed by the Google DeepMind Gemini Team in 2025. The team integrated advances in multimodal architecture, Mixture-of-Experts training, and long-context optimization, and involved over 200 researchers across DeepMind, Google Brain, and Google Cloud. Training ran on TPUv5p infrastructure under new safety frameworks to enable scalable training, with dedicated sub-groups for code, multimodality, safety, and alignment.

Community 

Gemini 2.5 Pro has a rapidly growing developer community through Google AI Studio and Vertex AI, and it is widely integrated into Google Search, NotebookLM, and Project Astra. Community feedback is active on Google's developer forums and AI Studio channels, with research collaboration from academia and enterprise developers.

Active Members: 1,001–5,000
Engagement Level: High
