Database Paper Browser

MRShare: Sharing Across Multiple Queries in MapReduce

Summary: MRShare merges related MapReduce jobs into groups and runs each group as a single query, so that work is shared across the jobs in that group. Using a MapReduce-specific cost model, it derives an optimal grouping that maximizes the shared work, and a Hadoop prototype demonstrates substantial cost savings. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10099
Venue: VLDB
Year: 2010
Pagerank: 0.00015114576
Overall Rank: 947 | 93.42%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 27 of 27 citing papers.

Rank Citing Paper Year Venue Pagerank
868 Profiling, What-if Analysis, and Cost-based Optimization of MapReduce Programs 2011 VLDB 0.00015789681
1,071 Starfish: A Self-tuning System for Big Data Analytics 2011 CIDR 0.00014312777
1,402 Hybrid Parallelization Strategies for Large-Scale Machine Learning in SystemML 2014 VLDB 0.00012180605
1,922 Selecting Subexpressions to Materialize at Datacenter Scale 2018 VLDB 0.00010082599
2,205 ReStore: Reusing Results of MapReduce Jobs 2012 VLDB 9.2920002e-05
2,674 Minimal MapReduce Algorithms 2013 SIGMOD 8.3328645e-05
2,688 Accelerating Recommendation System Training by Leveraging Popular Choices 2022 VLDB 8.2991144e-05
2,747 Stubby: A Transformation-based Optimizer for MapReduce Workflows 2012 VLDB 8.1828918e-05
2,848 Exploiting Matrix Dependency for Efficient Distributed Matrix Computation 2015 SIGMOD 8.0208832e-05
2,998 Major Technical Advancements in Apache Hive 2014 SIGMOD 7.753765e-05
3,062 Efficient Multi-way Theta-Join Processing Using MapReduce 2012 VLDB 7.6343994e-05
3,550 Chi: A Scalable and Programmable Control Plane for Distributed Stream Processing Systems 2018 VLDB 6.9843512e-05
3,562 MISO: Souping Up Big Data Query Processing with a Multistore System 2014 SIGMOD 6.9694564e-05
3,703 Multi-Query Optimization in MapReduce Framework 2014 VLDB 6.8289978e-05
4,174 Computation Reuse in Analytics Job Service at Microsoft 2018 SIGMOD 6.3856219e-05
5,105 Only Aggressive Elephants are Fast Elephants 2012 VLDB 5.694494e-05
6,075 Opportunistic Physical Design for Big Data Analytics 2014 SIGMOD 5.223901e-05
6,173 Exploiting Soft and Hard Correlations in Big Data Query Optimization 2016 VLDB 5.1699414e-05
6,469 Materialization and Reuse Optimizations for Production Data Science Pipelines 2022 SIGMOD 5.0519488e-05
7,351 Distributed Outlier Detection using Compressive Sensing 2015 SIGMOD 4.7545562e-05
7,689 ROBUS: Fair Cache Allocation for Data-parallel Workloads 2017 SIGMOD 4.6765769e-05
9,344 Hippo: Sharing Computations in Hyper-Parameter Optimization 2022 VLDB 4.3539442e-05
11,835 An Efficient MapReduce Cube Algorithm for Varied Data Distributions 2016 SIGMOD 4.1945683e-05
11,882 Parallel Evaluation of Multi-Semi-Joins 2016 VLDB 4.1945683e-05
11,958 Shared Execution of Recurring Workloads in MapReduce 2015 VLDB 4.1945683e-05
11,976 Anti-Combining for MapReduce 2014 SIGMOD 4.1945683e-05
12,071 Mosquito: Another One Bites the Data Upload STream 2013 VLDB 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 16 of 16 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank Cited Paper Year Venue Pagerank
3 Pig Latin: A Not-So-Foreign Language for Data Processing 2008 SIGMOD 0.0024183614
15 Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters 2007 SIGMOD 0.0010654262
22 SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets 2008 VLDB 0.0008456613
42 A Comparison of Approaches to Large-Scale Data Analysis 2009 SIGMOD 0.00073498298
70 Hive - A Warehousing Solution Over a Map-Reduce Framework 2009 VLDB 0.00059533166
88 Common Expression Analysis in Database Applications 1982 SIGMOD 0.00052316625
157 HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads 2009 VLDB 0.00040397359
168 MAD Skills: New Analysis Practices for Big Data 2009 VLDB 0.00038946305
515 QPipe: A Simultaneously Pipelined Relational Query Engine 2005 SIGMOD 0.00021214633
780 Building a High-Level Dataflow System on top of Map-Reduce: The Pig Experience 2009 VLDB 0.00016775082
830 Main-Memory Scan Sharing For Multi-Core CPUs 2008 VLDB 0.00016171897
1,026 Cooperative Scans: Dynamic Bandwidth Sharing in a DBMS 2007 VLDB 0.00014589172
1,355 SQL/MapReduce: A practical approach to self-describing, polymorphic, and parallelizable user-defined functions 2009 VLDB 0.00012404572
1,429 A Scalable, Predictable Join Operator for Highly Concurrent Data Warehouses 2009 VLDB 0.00012033518
1,476 Efficient Exploitation of Similar Subexpressions for Query Processing 2007 SIGMOD 0.00011779092
3,539 Scheduling Shared Scans of Large Data Files 2008 VLDB 6.9956521e-05

Semantically Similar Papers

Overall Rank Paper Year Venue Pagerank
447 Efficient Parallel Set-Similarity Joins Using MapReduce 2010 SIGMOD 0.00022900171
2,674 Minimal MapReduce Algorithms 2013 SIGMOD 8.3328645e-05
1,464 Online Aggregation for Large MapReduce Jobs 2011 VLDB 0.00011865546
3,062 Efficient Multi-way Theta-Join Processing Using MapReduce 2012 VLDB 7.6343994e-05
2,476 A Platform for Scalable One-Pass Analytics using MapReduce 2011 SIGMOD 8.6960139e-05
15 Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters 2007 SIGMOD 0.0010654262
11,958 Shared Execution of Recurring Workloads in MapReduce 2015 VLDB 4.1945683e-05
1,615 The Performance of MapReduce: An In-depth Study 2010 VLDB 0.00011132319
4,147 Exploiting MapReduce-based Similarity Joins 2012 SIGMOD 6.4096022e-05
3,703 Multi-Query Optimization in MapReduce Framework 2014 VLDB 6.8289978e-05