LLM

OpenCompass Leaderboard

The OpenCompass Leaderboard is a large language model evaluation system: an open-source hub for efficient model evaluation.

EvalPlus provides enhanced testing of LLM-generated code with HumanEval+ and MBPP+, which extend the HumanEval and MBPP benchmarks with additional test cases.
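To make the idea of enhanced testing concrete, here is a minimal sketch of why extra test cases matter. It does not use the EvalPlus API; the function `buggy_is_sorted`, the task, and both test lists are hypothetical and chosen only for illustration.

```python
def buggy_is_sorted(xs):
    # Hypothetical model-generated solution for "is the list in
    # non-decreasing order?". It wrongly uses a strict comparison,
    # so lists containing equal neighbours are rejected.
    return all(a < b for a, b in zip(xs, xs[1:]))


def passes(candidate, tests):
    # A candidate passes only if every (input, expected) pair matches.
    return all(candidate(inp) == expected for inp, expected in tests)


# A small base suite (assumed) that happens to miss the bug.
base_tests = [([1, 2, 3], True), ([3, 1, 2], False)]

# Additional cases (assumed) of the kind an extended suite might add.
extra_tests = [([1, 1, 2], True), ([], True)]

print(passes(buggy_is_sorted, base_tests))                # True
print(passes(buggy_is_sorted, base_tests + extra_tests))  # False
```

The buggy solution looks correct under the small suite and is only exposed once the duplicate-element case is added.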

Chatbot Arena benchmarks large language models through community participation, ranking models from crowdsourced votes on side-by-side comparisons of their responses.
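One common way to turn pairwise community votes into a ranking is an Elo-style rating update. The sketch below illustrates that general technique only; the text above does not say which rating method Chatbot Arena uses, and the starting rating of 1000 and the factor `k=32` are assumptions.

```python
def expected_score(r_a, r_b):
    # Probability that model A is preferred over model B under the Elo model.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(r_a, r_b, score_a, k=32.0):
    # score_a is 1.0 if A wins the vote, 0.0 if B wins, 0.5 for a tie.
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (score_a - e_a)
    # B's update mirrors A's, so the total rating is conserved.
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


# Two models start equal (assumed rating); a voter prefers model A.
a, b = update_elo(1000.0, 1000.0, score_a=1.0)
print(a, b)  # 1016.0 984.0
```

Repeating this update over many votes moves frequently preferred models up the ranking; the result depends on the order of votes, which is one reason order-independent statistical fits are sometimes used instead.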