Database Paper Browser

Dynamically Optimizing Queries over Large Scale Data Platforms

Summary: The paper describes dynamic optimization of queries on large-scale data platforms, handling opaque UDFs and correlations across relations. Pilot runs estimate selectivities to seed a cost-based plan, and the plan is then adapted during execution, yielding gains of up to 2x (Jaql) and 4x (Hive) over hand-written left-deep baselines. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 4902
Venue: SIGMOD
Year: 2014
Pagerank: 5.7586174e-05
Overall Rank: 5,014 | 65.12%
DOI: 10.1145/2588555.2610531

Incoming Citations (Sorted by Pagerank)

Showing 11 of 11 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 22 of 22 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
1 | Access Path Selection in a Relational Database Management System | 1979 | SIGMOD | 0.0040449103
42 | A Comparison of Approaches to Large-Scale Data Analysis | 2009 | SIGMOD | 0.00073498298
70 | Hive - A Warehousing Solution Over a Map-Reduce Framework | 2009 | VLDB | 0.00059533166
99 | On the Propagation of Errors in the Size of Join Results | 1991 | SIGMOD | 0.00050022914
106 | Extensible/Rule Based Query Rewrite Optimization in Starburst | 1992 | SIGMOD | 0.00048400734
220 | Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans | 1998 | SIGMOD | 0.00033194808
224 | CORDS: Automatic Discovery of Correlations and Soft Functional Dependencies | 2004 | SIGMOD | 0.00032746205
378 | Towards Estimation Error Guarantees for Distinct Values | 2000 | PODS | 0.0002497492
542 | Shark: SQL and Rich Analytics at Scale | 2013 | SIGMOD | 0.00020595648
650 | Robust Query Processing through Progressive Optimization | 2004 | SIGMOD | 0.00018659177
727 | On Synopses for Distinct-Value Estimation Under Multiset Operations | 2007 | SIGMOD | 0.00017508726
780 | Building a High-Level Dataflow System on top of Map-Reduce: The Pig Experience | 2009 | VLDB | 0.00016775082
960 | A Comparison of Join Algorithms for Log Processing in MapReduce | 2010 | SIGMOD | 0.00015012242
1,265 | Jaql: A Scripting Language for Large Scale Semistructured Data Analysis | 2011 | VLDB | 0.00012947629
1,272 | Proactive Re-Optimization | 2005 | SIGMOD | 0.00012920076
1,464 | Online Aggregation for Large MapReduce Jobs | 2011 | VLDB | 0.00011865546
1,727 | BigBench: Towards an Industry Standard Benchmark for Big Data Analytics | 2013 | SIGMOD | 0.00010740936
1,797 | Effective Use of Block-Level Sampling in Statistics Estimation | 2004 | SIGMOD | 0.00010523169
2,611 | Opening the Black Boxes in Data Flow Optimization | 2012 | VLDB | 8.4536967e-05
2,747 | Stubby: A Transformation-based Optimizer for MapReduce Workflows | 2012 | VLDB | 8.1828918e-05
5,297 | Continuous Cloud-Scale Query Optimization and Processing | 2013 | VLDB | 5.5801669e-05
8,165 | Progressive Optimization in a Shared-Nothing Parallel Database | 2007 | SIGMOD | 4.5717277e-05

Semantically Similar Papers