Database Paper Browser

Steering Query Optimizers: A Practical Take on Big Data Workloads

Summary: Adapts Bao-style query optimizer steering to SCOPE for big data workloads. The paper introduces rule signatures, a pipeline for recurring configurations, and a learning method for unseen workloads. It is evaluated on 150K daily jobs and reports 7–30% latency savings, with up to 90% on a subset. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 6257
Venue: SIGMOD
Year: 2021
Pagerank: 5.2412035e-05
Overall Rank: 6,040 | 57.99%
DOI: 10.1145/3448016.3457568

Incoming Citations (Sorted by Pagerank)

Showing 21 of 21 citing papers.

Rank Citing Paper Year Venue Pagerank
3,248 A Learned Query Rewrite System using Monte Carlo Tree Search 2022 VLDB 7.3258782e-05
3,348 Lero: A Learning-to-Rank Query Optimizer 2023 VLDB 7.1904529e-05
3,990 FactorJoin: A New Cardinality Estimation Framework for Join Queries 2023 SIGMOD 6.5581983e-05
4,593 Auto-WLM: Machine Learning Enhanced Workload Management in Amazon Redshift 2023 SIGMOD 6.0606891e-05
4,690 Deploying a Steered Query Optimizer in Production at Microsoft 2022 SIGMOD 5.997226e-05
5,334 LEON: A New Framework for ML-Aided Query Optimization 2023 VLDB 5.5649836e-05
5,640 AutoSteer: Learned Query Optimization for Any SQL Database 2023 VLDB 5.3933314e-05
6,297 Towards instance-optimized data systems 2021 VLDB 5.1227886e-05
6,885 PilotScope: Steering Databases with Machine Learning Drivers 2024 VLDB 4.895386e-05
7,655 Machine Learning for Cloud Data Systems: the Progress so far and the Path Forward 2021 VLDB 4.6872456e-05
8,164 Efficiently Computing Join Orders with Heuristic Search 2023 SIGMOD 4.5718104e-05
8,197 SparkCruise: Workload Optimization in Managed Spark Clusters at Microsoft 2021 VLDB 4.5607121e-05
8,220 PerfGuard: Deploying ML-for-Systems without Performance Regressions, Almost! 2021 VLDB 4.5557328e-05
8,416 Towards Building Autonomous Data Services on Azure 2023 SIGMOD 4.5196199e-05
8,582 Towards Query Optimizer as a Service (QOaaS) in a Unified LakeHouse Ecosystem: Can One QO Rule Them All? 2025 CIDR 4.492033e-05
8,659 Learned Offline Query Planning via Bayesian Optimization 2025 SIGMOD 4.4722928e-05
8,783 GEqO: ML-Accelerated Semantic Equivalence Detection 2023 SIGMOD 4.452825e-05
9,006 Hit the Gym: Accelerating Query Execution to Efficiently Bootstrap Behavior Models for Self-Driving Database Management Systems 2024 VLDB 4.4101482e-05
9,587 Low Rank Learning for Offline Query Optimization 2025 SIGMOD 4.3215645e-05
9,710 QO-Insight: Inspecting Steered Query Optimizers 2023 VLDB 4.299267e-05
10,491 Intra-Query Runtime Elasticity for Cloud-Native Data Analysis 2025 SIGMOD 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 17 of 17 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank Cited Paper Year Venue Pagerank
22 SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets 2008 VLDB 0.0008456613
71 How Good Are Query Optimizers, Really? 2016 VLDB 0.00059038975
167 The Snowflake Elastic Data Warehouse 2016 SIGMOD 0.00039180521
204 Learned Cardinalities: Estimating Correlated Joins with Deep Learning 2019 CIDR 0.00034784455
333 Neo: A Learned Query Optimizer 2019 VLDB 0.00027206884
544 Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources 2018 SIGMOD 0.00020521965
758 Deep Unsupervised Cardinality Estimation 2020 VLDB 0.0001706608
906 F1: A Distributed SQL Database That Scales 2013 VLDB 0.00015448884
910 NeuroCard: One Cardinality Estimator for All Tables 2021 VLDB 0.00015423056
1,254 Selectivity Estimation for Range Predicates using Lightweight Models 2019 VLDB 0.00013027411
1,300 The Picasso Database Query Optimizer Visualizer 2010 VLDB 0.00012733214
2,083 Towards a Learning Optimizer for Shared Clouds 2019 VLDB 9.5834572e-05
3,625 Cost Models for Big Data Query Processing: Learning, Retrofitting, and Our Findings 2020 SIGMOD 6.9055212e-05
3,954 Efficiently Approximating Selectivity Functions using Low Overhead Regression Models 2020 VLDB 6.5926838e-05
4,174 Computation Reuse in Analytics Job Service at Microsoft 2018 SIGMOD 6.3856219e-05
6,763 Robustness Metrics for Relational Query Execution Plans 2018 VLDB 4.9338479e-05
7,684 AutoToken: Predicting Peak Parallelism for Big Data Analytics at Microsoft 2020 VLDB 4.6796855e-05

Semantically Similar Papers