TACO: A Benchmark for Open-Domain Text-to-SQL with Ambiguous and Cross-Database Queries
Summary: TACO benchmarks open-domain text-to-SQL beyond the standard closed-schema setting, targeting ambiguous questions, unspecified databases, and cross-database queries. It combines 1.5K real smart-city examples with 13K synthesized open-data queries, and its TACO-SQL baseline reveals a large gap to human-written SQL. (summarized by gpt-5.4-mini on Apr 12 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Chao Deng
2. Ju Fan
3. Yuyu Luo
4. Qinliang Xue
5. Meihao Fan
6. Yuxin Zhang
7. Min Zhang
8. Xiaofeng Jia
9. Jing Zhang
10. Xiaoyong Du
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 984 | Natural language to SQL: Where are we today? | 2020 | VLDB | 0.00014857465 |
| 998 | CodeS: Towards Building Open-source Language Models for Text-to-SQL | 2024 | SIGMOD | 0.00014729379 |
| 2,945 | Few-shot Text-to-SQL Translation using Structure and Content Prompt Learning | 2023 | SIGMOD | 7.8377395e-05 |
| 4,908 | Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL | 2024 | VLDB | 5.8339245e-05 |
| 5,437 | SNAILS: Schema Naming Assessments for Improved LLM-Based SQL Inference | 2025 | SIGMOD | 5.5033018e-05 |
| 10,682 | AutoPrep: Natural Language Question-Aware Data Preparation with a Multi-Agent Framework | 2025 | VLDB | 4.1945683e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,354 | Reliable Text-to-SQL with Adaptive Abstention | 2025 | SIGMOD | 4.7529612e-05 |
| 2,945 | Few-shot Text-to-SQL Translation using Structure and Content Prompt Learning | 2023 | SIGMOD | 7.8377395e-05 |
| 10,451 | RTS+: Reliable Text to SQL | 2025 | SIGMOD | 4.1945683e-05 |
| 998 | CodeS: Towards Building Open-source Language Models for Text-to-SQL | 2024 | SIGMOD | 0.00014729379 |
| 984 | Natural language to SQL: Where are we today? | 2020 | VLDB | 0.00014857465 |
| 369 | Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation | 2024 | VLDB | 0.0002547515 |
| 3,978 | OmniSQL: Synthesizing High-quality Text-to-SQL Data at Scale | 2025 | VLDB | 6.5725884e-05 |
| 3,359 | Text2SQL is Not Enough: Unifying AI and Databases with TAG | 2025 | CIDR | 7.1744146e-05 |
| 10,268 | OpenSQL: Data-Efficient Text-to-SQL for Open-Source LLMs via Synthesized Intermediate Supervision | 2026 | VLDB | 4.1945683e-05 |
| 5,353 | An In-Depth Benchmarking of Text-to-SQL Systems | 2021 | SIGMOD | 5.5521332e-05 |