Database Paper Browser


MG-Join: A Scalable Join for Massively Parallel Multi-GPU Architectures

Summary: MG-Join is a scalable partitioned hash join for single-machine multi-GPU architectures. Its adaptive multi-hop routing of cross-GPU traffic minimizes congestion and achieves up to 97% bisection-bandwidth utilization, yielding join speedups of up to 2.5x, with TPC-H gains of 4.5x over OmniSci. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 6145
Venue: SIGMOD
Year: 2021
Pagerank: 6.545665e-05
Overall Rank: 4,002 | 72.17%
DOI: 10.1145/3448016.3457254

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 20 of 20 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
5,019 | Orchestrating Data Placement and Query Execution in Heterogeneous CPU-GPU DBMS | 2022 | VLDB | 5.7559197e-05
5,040 | Tile-based Lightweight Integer Compression in GPU | 2022 | SIGMOD | 5.7425187e-05
5,247 | Triton Join: Efficiently Scaling to a Large Join State on GPUs with Fast Interconnects | 2022 | SIGMOD | 5.6057839e-05
5,426 | RTIndeX: Exploiting Hardware-Accelerated GPU Raytracing for Database Indexing | 2023 | VLDB | 5.5096704e-05
6,066 | GPU Database Systems Characterization and Optimization | 2024 | VLDB | 5.2290447e-05
6,223 | Distributed GPU Joins on Fast RDMA-capable Networks | 2023 | SIGMOD | 5.1496398e-05
6,453 | Vortex: Overcoming Memory Capacity Limitations in GPU-Accelerated Large-Scale Data Analytics | 2025 | VLDB | 5.0571108e-05
7,155 | Evaluating Multi-GPU Sorting with Modern Interconnects | 2022 | SIGMOD | 4.8149812e-05
7,328 | BOSS - An Architecture for Database Kernel Composition | 2024 | VLDB | 4.7610909e-05
7,568 | Powerful GPUs or Fast Interconnects: Analyzing Relational Workloads on Modern GPUs | 2025 | VLDB | 4.7084322e-05
7,751 | Efficiently Processing Joins and Grouped Aggregations on GPUs | 2025 | SIGMOD | 4.6603427e-05
7,916 | Terabyte-Scale Analytics in the Blink of an Eye | 2026 | VLDB | 4.6173899e-05
8,846 | Scaling your Hybrid CPU-GPU DBMS to Multiple GPUs | 2024 | VLDB | 4.4372012e-05
9,142 | Design and Analysis of a Processing-in-DIMM Join Algorithm: A Case Study with UPMEM DIMMs | 2023 | SIGMOD | 4.3853149e-05
9,838 | Efficiently Joining Large Relations on Multi-GPU Systems | 2025 | VLDB | 4.2740344e-05
10,253 | Scalable GPU Acceleration of Scalar Functions in Analytical Databases: Compilation, Benchmarking, and Optimization | 2026 | VLDB | 4.1945683e-05
10,749 | Scaling GPU-Accelerated Databases beyond GPU Memory Size | 2025 | VLDB | 4.1945683e-05
10,993 | SPID-Join: A Skew-resistant Processing-in-DIMM Join Algorithm Exploiting the Bank- and Rank-level Parallelisms of DIMMs | 2024 | SIGMOD | 4.1945683e-05
11,020 | Accelerating Merkle Patricia Trie with GPU | 2024 | VLDB | 4.1945683e-05
11,358 | Scaling Equi-Joins | 2022 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 15 of 15 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
775 | Relational Joins on Graphics Processors | 2008 | SIGMOD | 0.00016823862
1,206 | Rack-Scale In-Memory Join Processing using RDMA | 2015 | SIGMOD | 0.00013281657
1,273 | The Yin and Yang of Processing Data Warehousing Queries on GPU Devices | 2013 | VLDB | 0.00012912938
1,804 | An Experimental Comparison of Thirteen Relational Equi-Joins in Main Memory | 2016 | SIGMOD | 0.00010501185
2,519 | Revisiting Co-Processing for Hash Joins on the Coupled CPU-GPU Architecture | 2013 | VLDB | 8.6078505e-05
2,526 | Track Join: Distributed Joins with Minimal Network Traffic | 2014 | SIGMOD | 8.5968612e-05
2,651 | HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines | 2019 | VLDB | 8.3694317e-05
3,363 | CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers | 2019 | VLDB | 7.1731921e-05
3,670 | A Distributed Multi-GPU System for Fast Graph Processing | 2018 | VLDB | 6.8567044e-05
4,085 | In-Cache Query Co-Processing on Coupled CPU-GPU Architectures | 2015 | VLDB | 6.4620277e-05
4,363 | Hardware-conscious Query Processing in GPU-accelerated Analytical Engines | 2019 | CIDR | 6.2552614e-05
5,197 | Data-Parallel Query Processing on Non-Uniform Data | 2020 | VLDB | 5.6347409e-05
5,578 | Ocelot/HyPE: Optimized Data Processing on Heterogeneous Hardware | 2014 | VLDB | 5.4252837e-05
6,369 | Improving Execution Efficiency of Just-in-time Compilation based Query Processing on GPUs | 2021 | VLDB | 5.0936663e-05
7,060 | SquirrelJoin: Network-Aware Distributed Join Processing with Lazy Partitioning | 2017 | VLDB | 4.8465382e-05

Semantically Similar Papers