Database Paper Browser

Processing Theta-Joins using MapReduce

Summary: Maps arbitrary theta-joins onto MapReduce using a simple data-flow model based on key equality, so non-equi joins run without any change to the MapReduce framework. Introduces 1-Bucket-Theta, a randomized, memory-aware algorithm that needs only the input cardinality and is near-optimal for many joins. (summarized by gpt-5-nano on Feb 09 2026)
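
The randomized idea named in the summary can be illustrated with a minimal sketch. It assumes the |S| x |T| join matrix is cut into a uniform grid of regions, one region per reducer. The function name one_bucket_theta and the parameters grid_rows, grid_cols, and seed are hypothetical, chosen for illustration; this is a single-process simulation, not the paper's implementation.

```python
import random
from collections import defaultdict


def one_bucket_theta(S, T, theta, grid_rows, grid_cols, seed=0):
    """Simulate a randomized grid-partitioned theta-join in one process.

    Assumption: the join matrix is split into grid_rows x grid_cols
    equal regions, each handled by one (simulated) reducer.
    """
    rng = random.Random(seed)
    # region id -> (S tuples received, T tuples received)
    regions = defaultdict(lambda: ([], []))

    # "Map" phase: each S tuple picks a random row band and is sent to
    # every region in that band; each T tuple does the same for columns.
    for s in S:
        band = rng.randrange(grid_rows)
        for col in range(grid_cols):
            regions[(band, col)][0].append(s)
    for t in T:
        band = rng.randrange(grid_cols)
        for row in range(grid_rows):
            regions[(row, band)][1].append(t)

    # "Reduce" phase: each region joins only what it received.
    # A given (s, t) pair meets in exactly one region, so no duplicates.
    out = []
    for s_part, t_part in regions.values():
        out.extend((s, t) for s in s_part for t in t_part if theta(s, t))
    return out


if __name__ == "__main__":
    pairs = one_bucket_theta([1, 5, 3, 8], [2, 4, 7], lambda s, t: s < t, 2, 3)
    print(sorted(pairs))
```

Because the assignment ignores attribute values, the same routing works for any join predicate; the grid shape is the only tuning knob in this sketch, and choosing it is where the input cardinality would come in.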

Paper ID: 4445
Venue: SIGMOD
Year: 2011
Pagerank: 0.00014260096
Overall Rank: 1,074 | 92.54%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 35 of 35 citing papers.

Rank Citing Paper Year Venue Pagerank
1,206 Rack-Scale In-Memory Join Processing using RDMA 2015 SIGMOD 0.00013281657
1,308 Upper and Lower Bounds on the Cost of a Map-Reduce Computation 2013 VLDB 0.00012661651
1,931 Efficient Processing of k Nearest Neighbor Joins using MapReduce 2012 VLDB 0.00010040427
2,175 Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services 2017 SIGMOD 9.3644117e-05
2,526 Track Join: Distributed Joins with Minimal Network Traffic 2014 SIGMOD 8.5968612e-05
2,674 Minimal MapReduce Algorithms 2013 SIGMOD 8.3328645e-05
2,946 BigDansing: A System for Big Data Cleansing 2015 SIGMOD 7.8372441e-05
3,062 Efficient Multi-way Theta-Join Processing Using MapReduce 2012 VLDB 7.6343994e-05
3,129 Scalable Big Graph Processing in MapReduce 2014 SIGMOD 7.5008242e-05
3,141 ClusterJoin: A Similarity Joins Framework using Map-Reduce 2014 VLDB 7.4829448e-05
3,382 Scalable and Adaptive Online Joins 2014 VLDB 7.1597145e-05
3,528 Distributed Data Deduplication 2016 VLDB 7.0066139e-05
3,571 Lightning Fast and Space Efficient Inequality Joins 2015 VLDB 6.9580858e-05
4,167 Scalable Distributed Stream Join Processing 2015 SIGMOD 6.3919506e-05
4,273 Cleaning Denial Constraint Violations through Relaxation 2020 SIGMOD 6.3003864e-05
4,775 Set Similarity Joins on MapReduce: An Experimental Survey 2018 VLDB 5.9315784e-05
5,902 The Communication Complexity of Distributed Set-Joins with Applications to Matrix Multiplication 2015 PODS 5.2796864e-05
6,507 Similarity Join over Array Data 2016 SIGMOD 5.0337166e-05
6,619 Near-Optimal Distributed Band-Joins through Recursive Partitioning 2020 SIGMOD 4.9910152e-05
7,153 Submodularity of Distributed Join Computation 2018 SIGMOD 4.8153963e-05
7,237 CleanM: An Optimizable Query Language for Unified Scale-Out Data Cleaning 2017 VLDB 4.7928651e-05
7,573 Squall: Scalable Real-time Analytics 2016 VLDB 4.7071608e-05
7,599 Quill: Efficient, Transferable, and Rich Analytics at Scale 2016 VLDB 4.7003593e-05
8,857 Distributed Evaluation of Top-k Temporal Joins 2016 SIGMOD 4.4345027e-05
9,115 MapReduce Algorithms for Big Data Analysis 2012 VLDB 4.3932167e-05
9,347 Rank Join Queries in NoSQL Databases 2014 VLDB 4.3526718e-05
9,375 Efficient Big Data Processing in Hadoop MapReduce 2012 VLDB 4.347384e-05
9,437 BlockJoin: Efficient Matrix Partitioning Through Joins 2017 VLDB 4.3425552e-05
9,519 PAXQuery: Parallel Analytical XML Processing 2015 SIGMOD 4.3323764e-05
10,967 Low-Latency Adaptive Distributed Stream Join System Based on a Flexible Join Model 2024 SIGMOD 4.1945683e-05
11,358 Scaling Equi-Joins 2022 SIGMOD 4.1945683e-05
11,835 An Efficient MapReduce Cube Algorithm for Varied Data Distributions 2016 SIGMOD 4.1945683e-05
11,882 Parallel Evaluation of Multi-Semi-Joins 2016 VLDB 4.1945683e-05
11,976 Anti-Combining for MapReduce 2014 SIGMOD 4.1945683e-05
13,446 Scolopax: Exploratory Analysis of Scientific Data 2013 VLDB -

Outgoing Citations (Sorted by Pagerank)

Showing 7 of 7 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers