Database Paper Browser

The Performance of MapReduce: An In-depth Study

Summary: An in-depth performance study of Hadoop MapReduce on a 100-node EC2 cluster that identifies five design factors shaping throughput. Tuning these factors yields gains of 2.5x to 3.5x, narrowing the gap with parallel database systems and making elastic cloud processing more economical. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10020
Venue: VLDB
Year: 2010
Pagerank: 0.00011132319
Overall Rank: 1,615 | 88.77%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 21 of 21 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
868 | Profiling, What-if Analysis, and Cost-based Optimization of MapReduce Programs | 2011 | VLDB | 0.00015789681
1,071 | Starfish: A Self-tuning System for Big Data Analytics | 2011 | CIDR | 0.00014312777
1,534 | PerfXplain: Debugging MapReduce Job Performance | 2012 | VLDB | 0.00011468393
1,800 | epiC: an Extensible and Scalable System for Processing Big Data | 2014 | VLDB | 0.00010512649
1,931 | Efficient Processing of k Nearest Neighbor Joins using MapReduce | 2012 | VLDB | 0.00010040427
2,439 | CoHadoop: Flexible Data Placement and Its Exploitation in Hadoop | 2011 | VLDB | 8.8190594e-05
2,476 | A Platform for Scalable One-Pass Analytics using MapReduce | 2011 | SIGMOD | 8.6960139e-05
3,062 | Efficient Multi-way Theta-Join Processing Using MapReduce | 2012 | VLDB | 7.6343994e-05
3,066 | HAWQ: A Massively Parallel Processing SQL Engine in Hadoop | 2014 | SIGMOD | 7.6221974e-05
3,115 | Llama: Leveraging Columnar Storage for Scalable Join Processing in the MapReduce Framework | 2011 | SIGMOD | 7.543505e-05
3,208 | Column-Oriented Storage Techniques for MapReduce | 2011 | VLDB | 7.3781897e-05
3,710 | Optimizing Analytic Data Flows for Multiple Execution Engines | 2012 | SIGMOD | 6.8238962e-05
5,105 | Only Aggressive Elephants are Fast Elephants | 2012 | VLDB | 5.694494e-05
5,903 | Building Wavelet Histograms on Large Data in MapReduce | 2012 | VLDB | 5.2791351e-05
6,173 | Exploiting Soft and Hard Correlations in Big Data Query Optimization | 2016 | VLDB | 5.1699414e-05
6,268 | Speedup Your Analytics: Automatic Parameter Tuning for Databases and Big Data Systems | 2019 | VLDB | 5.133857e-05
8,084 | ScalaGiST: Scalable Generalized Search Trees for MapReduce Systems [Innovative Systems Paper] | 2014 | VLDB | 4.5902866e-05
8,464 | Piranha: Optimizing Short Jobs in Hadoop | 2013 | VLDB | 4.5052127e-05
9,375 | Efficient Big Data Processing in Hadoop MapReduce | 2012 | VLDB | 4.347384e-05
11,694 | An Experimental Evaluation of Garbage Collectors on Big Data Applications | 2019 | VLDB | 4.1945683e-05
11,987 | DGFIndex for Smart Grid: Enhancing Hive with a Cost-Effective Multidimensional Range Index | 2014 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 9 of 9 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
