A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs
Summary: A memory-bandwidth-aware hybrid GPU radix sort halves memory transfers, delivering a ~2.3× speedup on uniform data. A pipelined heterogeneous mode handles large inputs that reside outside GPU memory, enabling strong end-to-end gains over CPU radix sort on large key-value workloads.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 5419
- Venue: SIGMOD
- Year: 2017
- Pagerank: 7.4720668e-05
- Overall Rank: 3,151 | 78.09%
- DOI: 10.1145/3035918.3064043
Incoming Citations (Sorted by Pagerank)
Showing 22 of 22 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 2,040 | A Study of the Fundamental Performance Characteristics of GPUs and CPUs for Database Analytics | 2020 | SIGMOD | 9.7057698e-05 |
| 2,651 | HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines | 2019 | VLDB | 8.3694317e-05 |
| 3,327 | Pump Up the Volume: Processing Large Data on GPUs with Fast Interconnects | 2020 | SIGMOD | 7.2205738e-05 |
| 4,363 | Hardware-conscious Query Processing in GPU-accelerated Analytical Engines | 2019 | CIDR | 6.2552614e-05 |
| 5,019 | Orchestrating Data Placement and Query Execution in Heterogeneous CPU-GPU DBMS | 2022 | VLDB | 5.7559197e-05 |
| 5,040 | Tile-based Lightweight Integer Compression in GPU | 2022 | SIGMOD | 5.7425187e-05 |
| 5,125 | The Art of Balance: A RateupDB™ Experience of Building a CPU/GPU Hybrid Database Product | 2021 | VLDB | 5.679423e-05 |
| 5,247 | Triton Join: Efficiently Scaling to a Large Join State on GPUs with Fast Interconnects | 2022 | SIGMOD | 5.6057839e-05 |
| 6,404 | ColumnML: Column-Store Machine Learning with On-The-Fly Data Transformation | 2019 | VLDB | 5.0786954e-05 |
| 6,476 | Parallel Index-based Stream Join on a Multicore CPU | 2020 | SIGMOD | 5.0496617e-05 |
| 6,540 | Data Partitioning for In-Memory Systems: Myths, Challenges, and Opportunities | 2019 | CIDR | 5.0219214e-05 |
| 7,155 | Evaluating Multi-GPU Sorting with Modern Interconnects | 2022 | SIGMOD | 4.8149812e-05 |
| 7,328 | BOSS - An Architecture for Database Kernel Composition | 2024 | VLDB | 4.7610909e-05 |
| 7,360 | ParPaRaw: Massively Parallel Parsing of Delimiter-Separated Raw Data | 2020 | VLDB | 4.7525925e-05 |
| 7,551 | Efficient Top-K Query Processing on Massively Parallel Hardware | 2018 | SIGMOD | 4.7134746e-05 |
| 8,432 | SPRINTER: A Fast n-ary Join Query Processing Method for Complex OLAP Queries | 2020 | SIGMOD | 4.5153924e-05 |
| 8,846 | Scaling your Hybrid CPU-GPU DBMS to Multiple GPUs | 2024 | VLDB | 4.4372012e-05 |
| 9,030 | A Case for Ecological Efficiency in Database Server Lifecycles | 2025 | CIDR | 4.4039656e-05 |
| 9,838 | Efficiently Joining Large Relations on Multi-GPU Systems | 2025 | VLDB | 4.2740344e-05 |
| 9,953 | Distributed Stream KNN Join | 2021 | SIGMOD | 4.2405999e-05 |
| 10,749 | Scaling GPU-Accelerated Databases beyond GPU Memory Size | 2025 | VLDB | 4.1945683e-05 |
| 11,020 | Accelerating Merkle Patricia Trie with GPU | 2024 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 14 of 14 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.