Parallel Evaluation of Multi-Semi-Joins

Summary: Introduces a Multi-Semi-Join (MSJ) MapReduce operator that evaluates a set of semi-joins in a single job for Strictly Guarded Fragment (SGF) queries, enabling parallel query plans that beat sequential plans on net time. Adds optimizations that reduce total time while keeping net time low; because these optimizations are NP-hard, greedy algorithms are used. Implemented as Gumbo on Hadoop, with experiments showing scalability and an improvement over Pig and Hive. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 11363
Venue: VLDB
Year: 2016
Pagerank: 4.1905499e-05
Overall Rank: 11,890 | 17.37%
DOI: -

Incoming Non-self Citations Over Time

No non-self incoming citations found for this paper in this database.

Incoming Citations (Sorted by Pagerank)

Showing 0 of 0 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 14 of 14 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank	Cited Paper	Year	Venue	Pagerank
3	Pig Latin: A Not-So-Foreign Language for Data Processing	2008	SIGMOD	0.0024217964
454	An Overview of Query Optimization in Relational Systems	1998	PODS	0.00022796106
539	Shark: SQL and Rich Analytics at Scale	2013	SIGMOD	0.00020615453
947	MRShare: Sharing Across Multiple Queries in MapReduce	2010	VLDB	0.00015112344
962	A Comparison of Join Algorithms for Log Processing in MapReduce	2010	SIGMOD	0.00015003834
1,073	Processing Theta-Joins using MapReduce	2011	SIGMOD	0.00014255717
1,114	Parallel Evaluation of Conjunctive Queries	2011	PODS	0.00013871948
1,312	Upper and Lower Bounds on the Cost of a Map-Reduce Computation	2013	VLDB	0.00012650678
1,411	Communication Steps for Parallel Query Processing	2013	PODS	0.00012118832
1,938	From Theory to Practice: Efficient Join Query Evaluation in a Parallel Database System	2015	SIGMOD	0.00010025547
2,216	Skew in Parallel Query Processing	2014	PODS	9.2693784e-05
2,714	Minimal MapReduce Algorithms	2013	SIGMOD	8.2426646e-05
3,384	Scalable and Adaptive Online Joins	2014	VLDB	7.153329e-05
3,709	Multi-Query Optimization in MapReduce Framework	2014	VLDB	6.8211506e-05

Semantically Similar Papers

Overall Rank	Paper	Year	Venue	Pagerank
6,322	Revisiting Pipelined Parallelism in Multi-Join Query Processing	2005	VLDB	5.1074949e-05
3,068	Efficient Multi-way Theta-Join Processing Using MapReduce	2012	VLDB	7.6241861e-05
15	Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters	2007	SIGMOD	0.0010668335
8,857	Distributed Evaluation of Top-k Temporal Joins	2016	SIGMOD	4.4302523e-05
3,709	Multi-Query Optimization in MapReduce Framework	2014	VLDB	6.8211506e-05
1,938	From Theory to Practice: Efficient Join Query Evaluation in a Parallel Database System	2015	SIGMOD	0.00010025547
11,805	Runtime Optimization of Join Location in Parallel Data Management Systems	2017	VLDB	4.1905499e-05
7,117	Parallel Algorithms for Sparse Matrix Multiplication and Join-Aggregate Queries	2020	PODS	4.8205884e-05
1,114	Parallel Evaluation of Conjunctive Queries	2011	PODS	0.00013871948
442	Efficient Parallel Set-Similarity Joins Using MapReduce	2010	SIGMOD	0.00023095823