SQLBarber: A System Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads
Summary: SQLBarber uses LLMs to synthesize realistic, customizable SQL workloads from natural-language constraints, avoiding hand-written templates. It couples self-correcting template generation with Bayesian cost targeting to match production-derived cardinality/plan-cost distributions and scales to large benchmark synthesis.
(summarized by gpt-5-mini on Apr 11 2026)
- Paper ID: 7524
- Venue: SIGMOD
- Year: 2026
- Pagerank: 4.1945683e-05
- Overall Rank: 10,212 | 28.96%
- DOI: 10.1145/3786699
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Outgoing Citations (Sorted by Pagerank)
Showing 19 of 19 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 71 | How Good Are Query Optimizers, Really? | 2016 | VLDB | 0.00059038975 |
| 998 | CodeS: Towards Building Open-source Language Models for Text-to-SQL | 2024 | SIGMOD | 0.00014729379 |
| 1,082 | CAESURA: Language Models as Multi-Modal Query Planners | 2024 | CIDR | 0.00014214232 |
| 1,407 | DB-BERT: A Database Tuning Tool that "Reads the Manual" | 2022 | SIGMOD | 0.00012146739 |
| 1,643 | CodexDB: Synthesizing Code for Query Processing from Natural Language Instructions using GPT-3 Codex | 2022 | VLDB | 0.0001104256 |
| 2,106 | Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing | 2025 | CIDR | 9.5342543e-05 |
| 2,277 | Generating Targeted Queries for Database Testing | 2008 | SIGMOD | 9.1241198e-05 |
| 3,114 | GPTuner: A Manual-Reading Database Tuning System via GPT-Guided Bayesian Optimization | 2024 | VLDB | 7.5451724e-05 |
| 3,178 | Why TPC Is Not Enough: An Analysis of the Amazon Redshift Fleet | 2024 | VLDB | 7.4325992e-05 |
| 3,429 | Real-time Workload Pattern Analysis for Large-scale Cloud Databases | 2023 | VLDB | 7.1010535e-05 |
| 3,472 | LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency | 2025 | VLDB | 7.0639229e-05 |
| 4,717 | Cloud Analytics Benchmark | 2023 | VLDB | 5.9751539e-05 |
| 5,214 | ThalamusDB: Approximate Query Processing on Multi-Modal Data | 2024 | SIGMOD | 5.624434e-05 |
| 5,371 | LearnedSQLGen: Constraint-aware SQL Generation using Reinforcement Learning | 2022 | SIGMOD | 5.5428776e-05 |
| 5,942 | SAM: Database Generation from Query Workloads with Supervised Autoregressive Models | 2022 | SIGMOD | 5.2634242e-05 |
| 7,990 | Blueprinting the Cloud: Unifying and Automatically Optimizing Cloud Data Infrastructures with BRAD | 2024 | VLDB | 4.6117441e-05 |
| 8,186 | E2ETune: End-to-End Knob Tuning via Fine-tuned Generative Language Model | 2025 | VLDB | 4.5651684e-05 |
| 8,207 | SQLStorm: Taking Database Benchmarking into the LLM Era | 2025 | VLDB | 4.5583637e-05 |
| 9,392 | Demonstrating SQLBarber: Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads | 2025 | SIGMOD | 4.3441378e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,908 | Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL | 2024 | VLDB | 5.8339245e-05 |
| 10,217 | This is Going to Sound Crazy, But What If We Used Large Language Models to Boost Automatic Database Tuning Algorithms By Leveraging Prior History? We Will Find Better Configurations More Quickly Than Retraining From Scratch! | 2026 | SIGMOD | 4.1945683e-05 |
| 5,023 | GenRewrite: Query Rewriting via Large Language Models | 2026 | SIGMOD | 5.75363e-05 |
| 8,896 | SQL-Factory: A Multi-Agent Framework for High-Quality and Large-Scale SQL Generation | 2026 | VLDB | 4.427232e-05 |
| 10,221 | NL2SQLBench: A Modular Benchmarking Framework for LLM-Enabled NL2SQL Solutions | 2026 | VLDB | 4.1945683e-05 |
| 9,974 | BenchPress: A Human-in-the-Loop Annotation System for Rapid Text-to-SQL Benchmark Curation | 2026 | CIDR | 4.1945683e-05 |
| 9,993 | Leveraging Query Optimizers to Verify the Soundness of LLM-based Query Rewrites for Real-World Workloads, and More! | 2026 | CIDR | 4.1945683e-05 |
| 369 | Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation | 2024 | VLDB | 0.0002547515 |
| 8,207 | SQLStorm: Taking Database Benchmarking into the LLM Era | 2025 | VLDB | 4.5583637e-05 |
| 9,392 | Demonstrating SQLBarber: Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads | 2025 | SIGMOD | 4.3441378e-05 |