Database Paper Browser

Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters

Summary: Extends Map-Reduce with a Merge phase to efficiently process related heterogeneous relational data on large clusters. Merge consolidates data already partitioned and sorted (or hashed) by map and reduce, enabling relational algebra operators and multiple join algorithms. (summarized by gpt-5-nano on Feb 09 2026)
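
The merge step described in the summary can be sketched in a few lines. This is a minimal single-process illustration, not the paper's implementation: the helper names `map_reduce` and `merge_join`, the sample `sales` and `depts` records, and the choice of a sort-merge join are all assumptions made for the example. It assumes each map-reduce lineage emits one value per key, sorted by key, so the merge phase can join the two outputs in a single pass.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run one map-reduce lineage; return (key, value) pairs sorted by key."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # One reduced value per key, emitted in key order.
    return sorted((key, reducer(key, values)) for key, values in groups.items())


def merge_join(left, right, merger):
    """Sort-merge join of two key-sorted outputs (keys unique on each side)."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, left_value = left[i]
        right_key, right_value = right[j]
        if left_key == right_key:
            out.append(merger(left_key, left_value, right_value))
            i += 1
            j += 1
        elif left_key < right_key:
            i += 1  # key exists only on the left side; skip it
        else:
            j += 1  # key exists only on the right side; skip it
    return out


# Hypothetical heterogeneous inputs that share a department key.
sales = [("d1", 100), ("d2", 50), ("d1", 25), ("d3", 10)]
depts = [("d1", 1.5), ("d2", 2.0), ("d4", 3.0)]

totals = map_reduce(sales, lambda r: [(r[0], r[1])], lambda k, vs: sum(vs))
rates = map_reduce(depts, lambda r: [(r[0], r[1])], lambda k, vs: vs[0])
joined = merge_join(totals, rates, lambda k, total, rate: (k, total * rate))
```

Here `joined` holds only the keys present in both lineages, which corresponds to an inner equi-join. A hash-based merge would replace the two-pointer loop with a dictionary lookup on one side.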

Paper ID: 3927
Venue: SIGMOD
Year: 2007
Pagerank: 0.0010654262
Overall Rank: 15 | 99.90%
DOI: -

Authors

Incoming Citations (Sorted by Pagerank)

Showing 29 of 29 citing papers.

Rank Citing Paper Year Venue Pagerank
3 Pig Latin: A Not-So-Foreign Language for Data Processing 2008 SIGMOD 0.0024183614
22 SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets 2008 VLDB 0.0008456613
447 Efficient Parallel Set-Similarity Joins Using MapReduce 2010 SIGMOD 0.00022900171
794 Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) 2010 VLDB 0.00016605103
886 Fast Personalized PageRank on MapReduce 2011 SIGMOD 0.00015597161
913 Tenzing A SQL Implementation On The MapReduce Framework 2011 VLDB 0.00015408131
947 MRShare: Sharing Across Multiple Queries in MapReduce 2010 VLDB 0.00015114576
960 A Comparison of Join Algorithms for Log Processing in MapReduce 2010 SIGMOD 0.00015012242
1,074 Processing Theta-Joins using MapReduce 2011 SIGMOD 0.00014260096
1,280 Automatic Optimization for MapReduce Programs 2011 VLDB 0.0001285503
1,440 Provenance for Generalized Map and Reduce Workflows 2011 CIDR 0.00011961469
1,615 The Performance of MapReduce: An In-depth Study 2010 VLDB 0.00011132319
1,800 epiC: an Extensible and Scalable System for Processing Big Data 2014 VLDB 0.00010512649
1,863 Cheetah: A High Performance, Custom Data Warehouse on Top of MapReduce 2010 VLDB 0.00010286531
2,337 Efficient Processing of Data Warehousing Queries in a Split Execution Environment 2011 SIGMOD 9.0098186e-05
2,413 Automated Partitioning Design in Parallel Database Systems 2011 SIGMOD 8.8672223e-05
2,476 A Platform for Scalable One-Pass Analytics using MapReduce 2011 SIGMOD 8.6960139e-05
3,382 Scalable and Adaptive Online Joins 2014 VLDB 7.1597145e-05
3,517 Integrating Hadoop and Parallel DBMS 2010 SIGMOD 7.0199423e-05
3,626 Behavioral Simulations in MapReduce 2010 VLDB 6.9047458e-05
4,147 Exploiting MapReduce-based Similarity Joins 2012 SIGMOD 6.4096022e-05
5,045 Massive Scale-out of Expensive Continuous Queries 2011 VLDB 5.740793e-05
6,568 Efficient Parallel Skyline Processing using Hyperplane Projections 2011 SIGMOD 5.0068521e-05
8,888 E = MC^3: Managing Uncertain Enterprise Data in a Cluster-Computing Environment 2009 SIGMOD 4.4278238e-05
11,358 Scaling Equi-Joins 2022 SIGMOD 4.1945683e-05
12,010 An Approach towards the Study of Symmetric Queries 2014 VLDB 4.1945683e-05
12,226 Indexing Multi-dimensional Data in a Cloud System 2010 SIGMOD 4.1945683e-05
12,287 LifeRaft: Data-Driven, Batch Processing for the Exploration of Scientific Databases 2009 CIDR 4.1945683e-05
12,400 Ad-Hoc Data Processing in the Cloud 2008 VLDB 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 3 of 3 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank Cited Paper Year Venue Pagerank
20 GAMMA - A High Performance Dataflow Database Machine 1986 VLDB 0.00086459551
78 Multiprocessor Hash-Based Join Algorithms 1985 VLDB 0.00056413752
236 High-Performance Sorting on Networks of Workstations 1997 SIGMOD 0.00031779642

Semantically Similar Papers