Database Paper Browser

SQLStorm: Taking Database Benchmarking into the LLM Era

Summary: SQLStorm is an LLM-driven methodology and v1.0 benchmark (1 GB, 12 GB, and 220 GB; more than 18K queries) that cheaply (about $15) generates large, realistic SQL workloads covering far more SQL constructs than TPC-H, TPC-DS, or JOB. It enables cross-system compatibility testing, bug and crash discovery, and targeted analysis of optimizers, cardinality estimation, and performance robustness. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 14033
Venue: VLDB
Year: 2025
Pagerank: 4.5583637e-05
Overall Rank: 8,207 | 42.91%
DOI: 10.14778/3749646.3749683

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 6 of 6 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 31 of 31 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
71 | How Good Are Query Optimizers, Really? | 2016 | VLDB | 0.00059038975
204 | Learned Cardinalities: Estimating Correlated Joins with Deep Learning | 2019 | CIDR | 0.00034784455
517 | Can Foundation Models Wrangle Your Data? | 2023 | VLDB | 0.00021169035
629 | Preventing Bad Plans by Bounding the Impact of Cardinality Estimation Errors | 2009 | VLDB | 0.00018942366
640 | Bao: Making Learned Query Optimization Practical | 2021 | SIGMOD | 0.00018759152
1,407 | DB-BERT: A Database Tuning Tool that "Reads the Manual" | 2022 | SIGMOD | 0.00012146739
1,638 | Cardinality Estimation in DBMS: A Comprehensive Benchmark Evaluation | 2022 | VLDB | 0.00011049779
1,643 | CodexDB: Synthesizing Code for Query Processing from Natural Language Instructions using GPT-3 Codex | 2022 | VLDB | 0.0001104256
1,956 | D-Bot: Database Diagnosis System using Large Language Models | 2024 | VLDB | 9.960627e-05
2,916 | Quantifying TPC-H Choke Points and Their Optimizations | 2020 | VLDB | 7.9068048e-05
2,985 | DSB: A Decision Support Benchmark for Workload-Driven and Traditional Database Systems | 2021 | VLDB | 7.7795847e-05
3,114 | GPTuner: A Manual-Reading Database Tuning System via GPT-Guided Bayesian Optimization | 2024 | VLDB | 7.5451724e-05
3,178 | Why TPC Is Not Enough: An Analysis of the Amazon Redshift Fleet | 2024 | VLDB | 7.4325992e-05
3,472 | LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency | 2025 | VLDB | 7.0639229e-05
3,789 | DIAMetrics: Benchmarking Query Engines at Scale | 2020 | VLDB | 6.7644737e-05
3,952 | Exact Cardinality Query Optimization for Optimizer Testing | 2009 | VLDB | 6.5939652e-05
4,717 | Cloud Analytics Benchmark | 2023 | VLDB | 5.9751539e-05
5,023 | GenRewrite: Query Rewriting via Large Language Models | 2026 | SIGMOD | 5.75363e-05
5,633 | Analyzing the Impact of Cardinality Estimation on Execution Plans in Microsoft SQL Server | 2023 | VLDB | 5.4011156e-05
6,389 | Chat2Data: An Interactive Data Analysis System with RAG, Vector Databases and LLMs | 2024 | VLDB | 5.0844009e-05
6,765 | Automatic Database Configuration Debugging using Retrieval-Augmented Language Models | 2025 | SIGMOD | 4.9325583e-05
7,020 | LLM for Data Management | 2024 | VLDB | 4.8595728e-05
7,035 | R-Bot: An LLM-based Query Rewrite System | 2025 | VLDB | 4.8548467e-05
7,126 | Debunking the Myth of Join Ordering: Toward Robust SQL Analytics | 2025 | SIGMOD | 4.8232367e-05
7,344 | Join Size Bounds using l_p-Norms on Degree Sequences | 2024 | PODS | 4.7565607e-05
7,467 | Yannakakis+: Practical Acyclic Query Evaluation with Theoretical Guarantees | 2025 | SIGMOD | 4.7218691e-05
8,488 | Can Large Language Models Be Query Optimizer for Relational Databases? | 2026 | SIGMOD | 4.4998609e-05
8,718 | Parachute: Single-Pass Bi-Directional Information Passing | 2025 | VLDB | 4.4612599e-05
8,884 | Workload Insights From The Snowflake Data Cloud: What Do Production Analytic Queries Really Look Like? | 2025 | VLDB | 4.4283999e-05
8,974 | DataLoom: Simplifying Data Loading with LLMs | 2024 | VLDB | 4.4184286e-05
9,587 | Low Rank Learning for Offline Query Optimization | 2025 | SIGMOD | 4.3215645e-05

Semantically Similar Papers