PIQA stands for Physical Interaction: Question Answering. It is a dataset of roughly 16,000 training examples about everyday physical tasks: each item pairs a goal with two candidate solutions, and the model must choose the more appropriate one. The examples are filtered with an adversarial filtering algorithm (AFLite), which ensures that simple heuristics or word associations are not enough to solve them.
PIQA is a benchmark for testing the commonsense reasoning ability of large language models (LLMs) such as BERT, GPT-3, and RoBERTa. It can be used to evaluate how well these models understand natural language and reason about physical tasks in domains such as cooking, cleaning, and gardening. PIQA also aims to encourage the development of more robust and generalizable models that can handle diverse and challenging scenarios.
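To make the goal/two-solution format concrete, here is a minimal sketch of how a PIQA-style item might be scored. The item below is illustrative, not drawn from the real dataset, and `score_solution` is a hypothetical placeholder: a real evaluation would instead use a language model's log-probability of each solution given the goal.

```python
def score_solution(goal: str, solution: str) -> float:
    """Placeholder scorer: counts words the solution shares with the goal.
    A real harness would replace this with an LLM's log-likelihood score.
    (Note: exactly the kind of shallow word-overlap heuristic that PIQA's
    adversarial filtering is designed to defeat.)"""
    goal_words = set(goal.lower().split())
    return sum(1 for w in solution.lower().split() if w in goal_words)

def choose(item: dict) -> int:
    """Return the index (0 or 1) of the higher-scoring candidate solution."""
    scores = [score_solution(item["goal"], sol) for sol in item["solutions"]]
    return max(range(2), key=lambda i: scores[i])

# Illustrative item in PIQA's goal/solution format (not a real dataset entry).
item = {
    "goal": "Soak up spilled water on the kitchen floor.",
    "solutions": [
        "Press a dry towel onto the spilled water until it is absorbed.",
        "Press a dry fork onto it.",
    ],
    "label": 0,
}

pred = choose(item)
print("predicted:", pred, "correct:", pred == item["label"])
```

Accuracy over the whole dev or test split is then just the fraction of items where the predicted index matches the label.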