NQ (Natural Questions) is a large-scale dataset of real user questions issued to Google search, and answers found from Wikipedia by annotators. NQ is designed for the training and evaluation of automatic question answering systems that can handle open-domain questions that require reasoning and commonsense knowledge
NQ is a benchmark for testing the natural language understanding and question answering ability of large language models (LLMs), such as BERT, GPT-3, and RoBERTa. It can be used to evaluate how well these models can understand natural language and answer questions that are naturally posed by humans in a search engine context. NQ is also intended to encourage the development of more robust and generalizable models that can handle various domains and tasks.
Here are some examples of questions from the NQ dataset, taken from different domains1: