Database Paper Browser

A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs

Summary: A memory-bandwidth-aware hybrid radix sort for GPUs halves memory transfers, delivering a ~2.3× speedup on uniform data. A pipelined heterogeneous mode handles off-GPU or large inputs, yielding strong end-to-end gains over CPU radix sort on large key-value workloads. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 5419
Venue: SIGMOD
Year: 2017
Pagerank: 7.4720668e-05
Overall Rank: 3,151 | 78.09%
DOI: 10.1145/3035918.3064043

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 22 of 22 citing papers.

Rank Citing Paper Year Venue Pagerank
2,040 A Study of the Fundamental Performance Characteristics of GPUs and CPUs for Database Analytics 2020 SIGMOD 9.7057698e-05
2,651 HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines 2019 VLDB 8.3694317e-05
3,327 Pump Up the Volume: Processing Large Data on GPUs with Fast Interconnects 2020 SIGMOD 7.2205738e-05
4,363 Hardware-conscious Query Processing in GPU-accelerated Analytical Engines 2019 CIDR 6.2552614e-05
5,019 Orchestrating Data Placement and Query Execution in Heterogeneous CPU-GPU DBMS 2022 VLDB 5.7559197e-05
5,040 Tile-based Lightweight Integer Compression in GPU 2022 SIGMOD 5.7425187e-05
5,125 The Art of Balance: A RateupDB™ Experience of Building a CPU/GPU Hybrid Database Product 2021 VLDB 5.679423e-05
5,247 Triton Join: Efficiently Scaling to a Large Join State on GPUs with Fast Interconnects 2022 SIGMOD 5.6057839e-05
6,404 ColumnML: Column-Store Machine Learning with On-The-Fly Data Transformation 2019 VLDB 5.0786954e-05
6,476 Parallel Index-based Stream Join on a Multicore CPU 2020 SIGMOD 5.0496617e-05
6,540 Data Partitioning for In-Memory Systems: Myths, Challenges, and Opportunities 2019 CIDR 5.0219214e-05
7,155 Evaluating Multi-GPU Sorting with Modern Interconnects 2022 SIGMOD 4.8149812e-05
7,328 BOSS - An Architecture for Database Kernel Composition 2024 VLDB 4.7610909e-05
7,360 ParPaRaw: Massively Parallel Parsing of Delimiter-Separated Raw Data 2020 VLDB 4.7525925e-05
7,551 Efficient Top-K Query Processing on Massively Parallel Hardware 2018 SIGMOD 4.7134746e-05
8,432 SPRINTER: A Fast n-ary Join Query Processing Method for Complex OLAP Queries 2020 SIGMOD 4.5153924e-05
8,846 Scaling your Hybrid CPU-GPU DBMS to Multiple GPUs 2024 VLDB 4.4372012e-05
9,030 A Case for Ecological Efficiency in Database Server Lifecycles 2025 CIDR 4.4039656e-05
9,838 Efficiently Joining Large Relations on Multi-GPU Systems 2025 VLDB 4.2740344e-05
9,953 Distributed Stream KNN Join 2021 SIGMOD 4.2405999e-05
10,749 Scaling GPU-Accelerated Databases beyond GPU Memory Size 2025 VLDB 4.1945683e-05
11,020 Accelerating Merkle Patricia Trie with GPU 2024 VLDB 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 14 of 14 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank Cited Paper Year Venue Pagerank
145 Quickly Generating Billion-Record Synthetic Databases 1994 SIGMOD 0.0004138408
239 GPUTeraSort: High Performance Graphics Co-processor Sorting for Large Database Management 2006 SIGMOD 0.00031617428
351 Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs 2009 VLDB 0.0002636504
396 One Trillion Edges: Graph Processing at Facebook-Scale 2015 VLDB 0.00024424102
404 Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited 2014 VLDB 0.00024143076
585 Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems 2012 VLDB 0.00019706145
930 Fast Sort on CPUs and GPUs: A Case for Bandwidth Oblivious SIMD Sort 2010 SIGMOD 0.00015238545
946 Efficient Implementation of Sorting on Multi-Core SIMD CPU Architecture 2008 VLDB 0.0001513324
1,607 A Comprehensive Study of Main-Memory Partitioning and its Application to Large-Scale Comparison- and Radix-Sort 2014 SIGMOD 0.00011162682
3,655 CloudRAMSort: Fast and Efficient Large-Scale Distributed RAM Sort on Shared-Nothing Cluster 2012 SIGMOD 6.8718304e-05
3,777 A Hybrid B+-tree as Solution for In-Memory Indexing on CPU-GPU Heterogeneous Computing Platforms 2016 SIGMOD 6.7750901e-05
4,042 PARADIS: An Efficient Parallel Algorithm for In-place Radix Sort 2015 VLDB 6.5026989e-05
4,655 SIMD- and Cache-Friendly Algorithm for Sorting an Array of Structures 2015 VLDB 6.0221672e-05
6,434 Patience is a Virtue: Revisiting Merge and Sort on Modern Processors 2014 SIGMOD 5.0640194e-05

Semantically Similar Papers