LLM benchmark

ARC-c

ARC-c (ARC-Challenge) is the harder split of the AI2 Reasoning Challenge (ARC), a benchmark of grade-school science multiple-choice questions used to assess the reasoning ability of large language models. The Challenge split contains only the questions that simple retrieval-based and word co-occurrence baselines answered incorrectly, which makes it substantially more difficult for models than the companion ARC-Easy split.
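As a minimal sketch of how the split is typically accessed, the example below loads ARC-Challenge with the Hugging Face `datasets` library; the `allenai/ai2_arc` dataset id, the `ARC-Challenge` configuration name, and the field names shown are assumptions based on the commonly published schema.

```python
# Minimal sketch: load the ARC-Challenge split via Hugging Face datasets.
# Assumes the "allenai/ai2_arc" dataset id and its "ARC-Challenge" config;
# the fields (question, choices, answerKey) follow the commonly used schema.
from datasets import load_dataset

arc_c = load_dataset("allenai/ai2_arc", "ARC-Challenge", split="test")

example = arc_c[0]
print(example["question"])                   # question stem
for label, text in zip(example["choices"]["label"], example["choices"]["text"]):
    print(f"  {label}. {text}")              # multiple-choice options
print("answer:", example["answerKey"])       # gold label, e.g. "A"-"D"
```

In a typical evaluation, a model is prompted with the question stem and the labeled options, and its accuracy is the fraction of questions for which it selects the gold `answerKey`.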
