Chatbot Arena

Leaderboard Illusion in AI Benchmarks

Posted by Fede Nolasco | Mar 11, 2025

Delve into the ‘Leaderboard Illusion’ paper, which reveals systematic flaws in AI benchmarks, and what those flaws mean for the AI community.

Gemini Pro vs GPT-4 Turbo: New Bard Update Surpasses GPT-4

Posted by Fede Nolasco | Aug 14, 2024

Google’s Bard with Gemini Pro has surpassed GPT-4 on the Chatbot Arena leaderboard. Explore the new internet-enabled features and performance comparisons.