OpenSQL: Data-Efficient Text-to-SQL for Open-Source LLMs via Synthesized Intermediate Supervision
Summary: OpenSQL targets data-efficient Text-to-SQL with open-source LLMs by synthesizing intermediate supervision from scarce (question, SQL) pairs. It combines global-local schema linking, reasoning-enhanced candidate selection at the clause and semantic levels, and task-aware augmentation. With only 14K samples, the 32B model reaches 70% on BIRD-dev, outperforming OmniSQL (2.5M samples).
(summarized by gpt-5.4-mini on May 27 2026)
- Paper ID: 14305
- Venue: VLDB
- Year: 2026
- Pagerank: 4.1945683e-05
- Overall Rank: 10,268 | 28.57%
- DOI: 10.14778/3801059.3801074
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 15 of 15 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 300 | Deep Learning for Entity Matching: A Design Space Exploration | 2018 | SIGMOD | 0.00028441466 |
| 369 | Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation | 2024 | VLDB | 0.0002547515 |
| 513 | TURL: Table Understanding through Representation Learning | 2021 | VLDB | 0.00021288342 |
| 998 | CodeS: Towards Building Open-source Language Models for Text-to-SQL | 2024 | SIGMOD | 0.00014729379 |
| 1,732 | CatSQL: Towards Real World Natural Language to SQL Applications | 2023 | VLDB | 0.00010732004 |
| 1,914 | Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks | 2020 | SIGMOD | 0.00010109102 |
| 2,433 | ScienceBenchmark: A Complex Real-World Benchmark for Evaluating Natural Language to SQL Systems | 2024 | VLDB | 8.8285962e-05 |
| 2,945 | Few-shot Text-to-SQL Translation using Structure and Content Prompt Learning | 2023 | SIGMOD | 7.8377395e-05 |
| 3,662 | The Dawn of Natural Language to SQL: Are We Fully Ready? | 2024 | VLDB | 6.8672143e-05 |
| 3,859 | OpenSearch-SQL: Enhancing Text-to-SQL with Dynamic Few-shot and Consistency Alignment | 2025 | SIGMOD | 6.6907933e-05 |
| 3,978 | OmniSQL: Synthesizing High-quality Text-to-SQL Data at Scale | 2025 | VLDB | 6.5725884e-05 |
| 4,908 | Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL | 2024 | VLDB | 5.8339245e-05 |
| 7,705 | AOP: Automated and Interactive LLM Pipeline Orchestration for Answering Complex Queries | 2025 | CIDR | 4.6730494e-05 |
| 9,151 | The Power of Constraints in Natural Language to SQL Translation | 2025 | VLDB | 4.3849295e-05 |
| 10,800 | Unify: A System For Unstructured Data Analytics | 2025 | VLDB | 4.1945683e-05 |
Semantically Similar Papers