Gemini 2.5 Pro is a sparse Mixture-of-Experts (MoE) transformer trained on Google’s TPUv5p clusters, with native multimodal input support spanning text, vision, audio, video, and code. It achieves state-of-the-art results across core benchmarks: 74.2% on LiveCodeBench, 82.2% on Aider Polyglot, 86.4% on GPQA Diamond, 88.0% on AIME 2025, and 82.0% on MMMU. Long-context retrieval reaches 99.8% accuracy at 1 million tokens. The model supports tool use, configurable thinking budgets, and audio-visual dialog generation. Its knowledge cutoff is January 2025, and post-training uses reinforcement learning and human feedback to improve helpfulness and safety.
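As a rough illustration of the configurable thinking budget mentioned above, the minimal sketch below uses the google-genai Python SDK; the model string, budget value, API key placeholder, and prompt are illustrative assumptions, not values taken from this section.

```python
# Minimal sketch (illustrative, not from the report): request a response from
# Gemini 2.5 Pro with an explicit cap on its internal "thinking" token budget.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder API key

response = client.models.generate_content(
    model="gemini-2.5-pro",  # assumed model identifier
    contents="Explain the trade-offs of sparse Mixture-of-Experts routing.",
    config=types.GenerateContentConfig(
        # Limit how many tokens the model may spend on internal reasoning.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```

Raising or lowering `thinking_budget` trades answer quality on hard reasoning tasks against latency and cost.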
Gemini 2.5 Pro achieves state-of-the-art results in reasoning, coding, and multimodal tasks, debuting at #1 overall on the LMArena leaderboard. It records 74.2% on LiveCodeBench, 82.2% on Aider Polyglot, 87.8% on FACTS Grounding, and 88.0% on AIME 2025. In video understanding it scores 83.6% on Video-MMMU and 86.9% on VideoMME. Together, these results make Gemini 2.5 Pro Google’s most capable model to date.
Benchmark | Gemini 2.5 Pro | OpenAI o3-high | OpenAI o4-mini | Claude 4 Sonnet | Claude 4 Opus | Grok 3 Beta | DeepSeek R1 0528 |
---|---|---|---|---|---|---|---|
LiveCodeBench | 74.2 | 72.0 | 75.8 | 48.9 | 51.1 | 70.5 | |
Aider Polyglot | 82.2 | 79.6 | 72.0 | 61.3 | 72.0 | 53.3 | 71.6 |
SWE-bench (single attempt) | 59.6 | 69.1 | 68.1 | 72.7 | 72.5 | | |
SWE-bench (multiple attempts) | 67.2 | | | 80.2 | 79.4 | | 57.6 |
GPQA (Diamond) | 86.4 | 83.3 | 81.4 | 75.4 | 79.6 | 80.2 | 81.0 |
Humanity’s Last Exam (no tools) | 21.6 | 20.3 | 18.1 | 7.8 | 10.7 | 14.0 | |
SimpleQA | 54.0 | 48.6 | 19.3 | 27.8 | | | |
FACTS Grounding | 87.8 | 69.9 | 62.1 | 79.1 | 77.7 | 74.8 | 82.4 |
AIME 2025 | 88.0 | 88.9 | 92.7 | 70.5 | 75.5 | 77.3 | 87.5 |
LOFT (≤128K context) | 87.0 | 77.0 | 60.5 | 81.6 | 73.1 | | |
LOFT (1M context) | 69.8 | | | | | | |
MRCR-V2 (≤128K) | 58.0 | 57.1 | 36.3 | 39.1 | 16.1 | 34.0 | |
MRCR-V2 (1M) | 16.4 | | | | | | |
MMMU (Multimodal Reasoning) | 82.0 | 82.9 | 81.6 | 74.4 | 76.5 | 76.0 | |
Gemini 2.5 Pro was developed by the Google DeepMind Gemini Team (2025), integrating advances in multimodal architecture, Mixture-of-Experts training, and long-context optimization. Development involved over 200 researchers across DeepMind, Brain, and Google Cloud, with dedicated sub-groups for code, multimodality, safety, and alignment. The team relied on TPUv5p infrastructure and new safety frameworks to enable scalable training.
Gemini 2.5 Pro has a rapidly growing developer community built around Google AI Studio and Vertex AI. It is widely integrated into Google Search, NotebookLM, and Project Astra. Community feedback is active on developer forums and AI Studio channels, with research collaborations spanning academia and enterprise developers.
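For teams accessing the model through Vertex AI rather than an AI Studio API key, a minimal sketch with the same google-genai SDK is shown below; the project ID and region are hypothetical placeholders.

```python
# Minimal sketch (illustrative): route the request through Vertex AI instead
# of the AI Studio API-key endpoint.
from google import genai

client = genai.Client(
    vertexai=True,             # use Vertex AI as the backend
    project="my-gcp-project",  # hypothetical GCP project ID
    location="us-central1",    # hypothetical region
)

response = client.models.generate_content(
    model="gemini-2.5-pro",  # assumed model identifier
    contents="Draft three test cases for a URL parser.",
)
print(response.text)
```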