ARC-e

ARC-e is an acronym for AI2 Reasoning Challenge – Easy. It is one of the two partitions of the ARC dataset, a large-scale collection of grade-school science multiple-choice questions that require reasoning and commonsense knowledge to answer. The Easy partition contains 5,197 questions that can typically be answered by retrieval or word co-occurrence methods; the harder, more diverse questions that defeat such methods make up the separate Challenge partition (ARC-c).

Areas of application

ARC-e is a benchmark for testing the question answering and commonsense reasoning ability of language models such as BERT, GPT-3, and RoBERTa. It is used to evaluate how well these models understand natural language and answer questions that involve implicit knowledge, causal relations, and logical inference. As the easier of the two ARC partitions, ARC-e typically serves as a baseline alongside the harder ARC-c split, and the ARC benchmark as a whole is intended to encourage the development of more robust and generalizable models that can handle various domains and tasks.
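In practice, each ARC-e item is a multiple-choice record, and evaluation reduces to picking one option label per question and comparing it with the answer key. Below is a minimal sketch of that loop; the record layout follows the style of the official ARC JSONL release, and `score_option` is a hypothetical stand-in for a real model score (e.g. the likelihood an LLM assigns to each option):

```python
import random

# One ARC-e style item: a question stem, labeled choices, and a gold answerKey.
item = {
    "question": {
        "stem": "Which of these is most likely to be found in a desert?",
        "choices": [
            {"label": "A", "text": "A camel"},
            {"label": "B", "text": "A penguin"},
            {"label": "C", "text": "A polar bear"},
            {"label": "D", "text": "A whale"},
        ],
    },
    "answerKey": "A",
}

def score_option(stem: str, option_text: str) -> float:
    """Hypothetical stand-in for a model score, e.g. the log-likelihood
    an LLM assigns to the option text given the question stem."""
    random.seed(stem + option_text)  # deterministic toy score, not a real model
    return random.random()

def predict(item: dict) -> str:
    """Return the label of the choice with the highest model score."""
    stem = item["question"]["stem"]
    best = max(item["question"]["choices"],
               key=lambda c: score_option(stem, c["text"]))
    return best["label"]

pred = predict(item)
print(pred, pred == item["answerKey"])
```

Swapping `score_option` for a real model's per-option likelihood turns this toy into the standard likelihood-based multiple-choice evaluation used for ARC.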

Example

Here are some examples of questions from the ARC-e dataset, taken from different domains:
  • Which of these is most likely to be found in a desert?
    • A) A camel
    • B) A penguin
    • C) A polar bear
    • D) A whale
      • The correct answer is A, as camels are well-adapted to live in hot and dry environments, while the other animals are not.
  • Why do we need to breathe oxygen?
    • A) To make water
    • B) To make carbon dioxide
    • C) To make energy
    • D) To make nitrogen
      • The correct answer is C, as oxygen is essential for cellular respiration, which is the process of converting glucose into energy.
  • Which of these is an example of a metamorphic rock?
    • A) Granite
    • B) Marble
    • C) Pumice
    • D) Sandstone
      • The correct answer is B, as marble is formed from limestone that has been subjected to high temperature and pressure, which is the process that defines metamorphic rocks.
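The three questions above can be treated as a tiny evaluation set: the benchmark metric is simply accuracy, the fraction of questions whose predicted label matches the answer key. A short sketch, with the predictions hard-coded rather than produced by a model:

```python
# Gold answer keys for the three example questions above,
# paired with hypothetical model predictions.
gold = {"desert": "A", "oxygen": "C", "rock": "B"}
preds = {"desert": "A", "oxygen": "C", "rock": "D"}  # the rock question is missed

def accuracy(gold: dict, preds: dict) -> float:
    """Fraction of questions whose predicted label matches the answer key."""
    correct = sum(preds[q] == answer for q, answer in gold.items())
    return correct / len(gold)

print(f"accuracy = {accuracy(gold, preds):.2f}")  # 2 of 3 correct -> 0.67
```

Reported ARC-e scores are exactly this quantity computed over the full Easy test split.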