Database Paper Browser

Triton Join: Efficiently Scaling to a Large Join State on GPUs with Fast Interconnects

Summary: Triton Join scales joins with a large state on GPUs by using a fast interconnect (NVLink 2.0) to spill join state to main memory. It delivers more than 100x speedup over a non-partitioned GPU hash join and up to 2.5x over a CPU radix join, enabling GPU DBMSs to process joins whose state exceeds GPU memory. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 6363
Venue: SIGMOD
Year: 2022
Pagerank: 5.6057839e-05
Overall Rank: 5,247 | 63.50%
DOI: 10.1145/3514221.3517911

Incoming Citations (Sorted by Pagerank)

Showing 15 of 15 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 41 of 41 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
9 | Implementation Techniques For Main Memory Database Systems | 1984 | SIGMOD | 0.0014279444
81 | Cache Conscious Algorithms for Relational Query Processing | 1994 | VLDB | 0.00055548574
351 | Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs | 2009 | VLDB | 0.0002636504
404 | Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited | 2014 | VLDB | 0.00024143076
775 | Relational Joins on Graphics Processors | 2008 | SIGMOD | 0.00016823862
930 | Fast Sort on CPUs and GPUs: A Case for Bandwidth Oblivious SIMD Sort | 2010 | SIGMOD | 0.00015238545
958 | Rethinking SIMD Vectorization for In-Memory Databases | 2015 | SIGMOD | 0.00015045316
1,016 | Memory-Efficient Hash Joins | 2015 | VLDB | 0.00014638492
1,206 | Rack-Scale In-Memory Join Processing using RDMA | 2015 | SIGMOD | 0.00013281657
1,273 | The Yin and Yang of Processing Data Warehousing Queries on GPU Devices | 2013 | VLDB | 0.00012912938
1,287 | Hardware-Oblivious Parallelism for In-Memory Column-Stores | 2013 | VLDB | 0.00012820443
1,696 | A Seven-Dimensional Analysis of Hashing Methods and its Implications on Query Processing | 2016 | VLDB | 0.00010881034
1,804 | An Experimental Comparison of Thirteen Relational Equi-Joins in Main Memory | 2016 | SIGMOD | 0.00010501185
2,040 | A Study of the Fundamental Performance Characteristics of GPUs and CPUs for Database Analytics | 2020 | SIGMOD | 9.7057698e-05
2,287 | Pipelined Query Processing in Coprocessor Environments | 2018 | SIGMOD | 9.0972606e-05
2,519 | Revisiting Co-Processing for Hash Joins on the Coupled CPU-GPU Architecture | 2013 | VLDB | 8.6078505e-05
2,651 | HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines | 2019 | VLDB | 8.3694317e-05
3,151 | A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs | 2017 | SIGMOD | 7.4720668e-05
3,305 | Robust Query Processing in Co-Processor-accelerated Databases | 2016 | SIGMOD | 7.2460965e-05
3,327 | Pump Up the Volume: Processing Large Data on GPUs with Fast Interconnects | 2020 | SIGMOD | 7.2205738e-05
3,443 | Distributed Join Algorithms on Thousands of Cores | 2017 | VLDB | 7.0887214e-05
3,465 | GPL: A GPU-based Pipelined Query Processing Engine | 2016 | SIGMOD | 7.0695873e-05
3,721 | To Partition, or Not to Partition, That is the Join Question in a Real System | 2021 | SIGMOD | 6.8179379e-05
3,722 | Cache-Conscious Radix-Decluster Projections | 2004 | VLDB | 6.8176075e-05
3,898 | Efficient Join Algorithms For Large Database Tables in a Multi-GPU Environment | 2021 | VLDB | 6.6551268e-05
3,993 | Improving Main Memory Hash Joins on Intel Xeon Phi Processors: An Experimental Approach | 2015 | VLDB | 6.5534805e-05
4,002 | MG-Join: A Scalable Join for Massively Parallel Multi-GPU Architectures | 2021 | SIGMOD | 6.545665e-05
4,033 | In-RDBMS Hardware Acceleration of Advanced Analytics | 2018 | VLDB | 6.5113267e-05
4,085 | In-Cache Query Co-Processing on Coupled CPU-GPU Architectures | 2015 | VLDB | 6.4620277e-05
4,363 | Hardware-conscious Query Processing in GPU-accelerated Analytical Engines | 2019 | CIDR | 6.2552614e-05
5,125 | The Art of Balance: A RateupDB™ Experience of Building a CPU/GPU Hybrid Database Product | 2021 | VLDB | 5.679423e-05
5,178 | FPGA-based Data Partitioning | 2017 | SIGMOD | 5.6438393e-05
5,653 | On the Surprising Difficulty of Simple Things: the Case of Radix Partitioning | 2015 | VLDB | 5.3889513e-05
5,670 | Joins on Encoded and Partitioned Data | 2014 | VLDB | 5.3804618e-05
5,699 | EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal in GPUs | 2021 | VLDB | 5.3654927e-05
5,714 | MCJoin: A Memory-Constrained Join for Column-Store Main-Memory Databases | 2012 | SIGMOD | 5.3578116e-05
5,721 | FPGA-based Multithreading for In-Memory Hash Joins | 2015 | CIDR | 5.3525009e-05
6,540 | Data Partitioning for In-Memory Systems: Myths, Challenges, and Opportunities | 2019 | CIDR | 5.0219214e-05
7,155 | Evaluating Multi-GPU Sorting with Modern Interconnects | 2022 | SIGMOD | 4.8149812e-05
7,209 | GPU-accelerated data management under the test of time | 2020 | CIDR | 4.7996023e-05
9,842 | A four-dimensional Analysis of Partitioned Approximate Filters | 2021 | VLDB | 4.2722447e-05

Semantically Similar Papers