Text-to-SQL Evaluation with the BIRD Dataset

Text-to-SQL evaluation is a core problem in semantic parsing, and the BIRD dataset is a large benchmark built for it: 12,751 unique question-SQL pairs over 95 databases spanning 37 professional domains. BIRD targets real-world use by also considering how efficient the generated SQL queries are, which is a critical concern in business analysis. A related work, Tapilot-Crossing, introduces an interactive benchmark for evaluating Large Language Model agents on data analysis tasks.
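
To make the evaluation idea concrete, here is a minimal sketch of an execution-based check: a predicted query counts as correct when it returns the same rows as the reference (gold) query on the same database. This is an illustration only, not the official BIRD evaluation script. The `orders` table, the example queries, and the helper name `execution_match` are all hypothetical, and the sketch ignores query efficiency entirely.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries yield the same set of rows.

    Hypothetical helper: compares results as unordered sets, so row order
    and duplicate rows are ignored. A predicted query that fails to run
    counts as a mismatch.
    """
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    return set(predicted_rows) == set(gold_rows)


# Toy in-memory database standing in for one benchmark database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'EU', 120.0), (2, 'US', 80.0), (3, 'EU', 40.0);
    """
)

gold = "SELECT region, SUM(amount) FROM orders GROUP BY region"
good_prediction = (
    "SELECT region, SUM(amount) AS total FROM orders "
    "GROUP BY region ORDER BY region"
)
bad_prediction = "SELECT region, COUNT(*) FROM orders GROUP BY region"

print(execution_match(conn, good_prediction, gold))
print(execution_match(conn, bad_prediction, gold))
```

Comparing results instead of query strings lets differently written but equivalent queries count as correct. A fuller evaluation would also need to handle ordering-sensitive questions and measure runtime, which this sketch leaves out.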

Bird-bench
March 28, 2024
BIRD-bench GitHub Home Page
Understanding the Effects of Noise in Text-to-SQL: An Examination of the BIRD-Bench Benchmark - arXiv