Database Paper Browser

PBench: Workload Synthesizer with Real Statistics for Cloud Analytics Benchmarking

Summary: PBench synthesizes cloud analytics workloads that replicate real execution statistics (performance metrics, operator distributions, and temporal dynamics) by selecting and combining benchmark components and augmenting the pieces that are missing. Its key innovations are multi-objective optimization for component selection, progressive timestamp assignment, and LLM-based component augmentation to preserve statistical fidelity, with up to 6x lower approximation error than prior work. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 14008
Venue: VLDB
Year: 2025
Pagerank: 4.1945683e-05
Overall Rank: 10,707 | 25.52%
DOI: 10.14778/3749646.3749661

Incoming Non-self Citations Over Time

No non-self incoming citations found for this paper in this database.

Authors

Incoming Citations (Sorted by Pagerank)

Showing 0 of 0 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 20 of 20 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
71 | How Good Are Query Optimizers, Really? | 2016 | VLDB | 0.00059038975
167 | The Snowflake Elastic Data Warehouse | 2016 | SIGMOD | 0.00039180521
183 | Automatic Database Management System Tuning Through Large-scale Machine Learning | 2017 | SIGMOD | 0.00036721403
1,284 | Amazon Redshift Re-invented | 2022 | SIGMOD | 0.00012837822
2,568 | Towards Cost-Optimal Query Processing in the Cloud | 2021 | VLDB | 8.5239227e-05
2,916 | Quantifying TPC-H Choke Points and Their Optimizations | 2020 | VLDB | 7.9068048e-05
3,178 | Why TPC Is Not Enough: An Analysis of the Amazon Redshift Fleet | 2024 | VLDB | 7.4325992e-05
3,789 | DIAMetrics: Benchmarking Query Engines at Scale | 2020 | VLDB | 6.7644737e-05
3,951 | Why You Should Run TPC-DS: A Workload Analysis | 2007 | VLDB | 6.5953162e-05
4,593 | Auto-WLM: Machine Learning Enhanced Workload Management in Amazon Redshift | 2023 | SIGMOD | 6.0606891e-05
4,717 | Cloud Analytics Benchmark | 2023 | VLDB | 5.9751539e-05
4,884 | Relational Data Synthesis using Generative Adversarial Networks: A Design Space Exploration | 2020 | VLDB | 5.8540287e-05
5,037 | Keep It Simple: Testing Databases via Differential Query Plans | 2024 | SIGMOD | 5.7434825e-05
5,371 | LearnedSQLGen: Constraint-aware SQL Generation using Reinforcement Learning | 2022 | SIGMOD | 5.5428776e-05
5,634 | Intelligent Scaling in Amazon Redshift | 2024 | SIGMOD | 5.4000904e-05
5,832 | Stage: Query Execution Time Prediction in Amazon Redshift | 2024 | SIGMOD | 5.3111109e-05
5,861 | Machine Learning for Databases | 2021 | VLDB | 5.298883e-05
5,923 | HyBench: A New Benchmark for HTAP Databases | 2024 | VLDB | 5.2721765e-05
7,543 | Cloud Databases: New Techniques, Challenges, and Opportunities | 2022 | VLDB | 4.715241e-05
7,889 | Cost-Intelligent Data Analytics in the Cloud | 2024 | CIDR | 4.6253386e-05

Semantically Similar Papers