Scaling your Hybrid CPU-GPU DBMS to Multiple GPUs
Summary: Jointly scales hybrid CPU–GPU DBMSs across multiple GPUs by optimizing data placement and distributed query execution. Introduces cache-aware replication that accounts for shuffle cost and coordinates caching/replication; Lancelot implements distributed hybrid execution and achieves 2–12× speedups on SSB and TPC‑H.
(summarized by gpt-5-mini on Feb 09 2026)
- Paper ID: 13715
- Venue: VLDB
- Year: 2024
- Pagerank: 4.4372012e-05
- Overall Rank: 8,846 | 38.47%
- DOI: 10.14778/3704965.3704977
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 29 of 29 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 145 | Quickly Generating Billion-Record Synthetic Databases | 1994 | SIGMOD | 0.0004138408 |
| 185 | DuckDB: an Embeddable Analytical Database | 2019 | SIGMOD | 0.00036538405 |
| 775 | Relational Joins on Graphics Processors | 2008 | SIGMOD | 0.00016823862 |
| 1,273 | The Yin and Yang of Processing Data Warehousing Queries on GPU Devices | 2013 | VLDB | 0.00012912938 |
| 1,287 | Hardware-Oblivious Parallelism for In-Memory Column-Stores | 2013 | VLDB | 0.00012820443 |
| 2,040 | A Study of the Fundamental Performance Characteristics of GPUs and CPUs for Database Analytics | 2020 | SIGMOD | 9.7057698e-05 |
| 2,067 | HippogriffDB: Balancing I/O and GPU Bandwidth in Big Data Analytics | 2016 | VLDB | 9.6392739e-05 |
| 2,287 | Pipelined Query Processing in Coprocessor Environments | 2018 | SIGMOD | 9.0972606e-05 |
| 2,519 | Revisiting Co-Processing for Hash Joins on the Coupled CPU-GPU Architecture | 2013 | VLDB | 8.6078505e-05 |
| 2,651 | HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines | 2019 | VLDB | 8.3694317e-05 |
| 3,151 | A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs | 2017 | SIGMOD | 7.4720668e-05 |
| 3,254 | Query Processing on Tensor Computation Runtimes | 2022 | VLDB | 7.3161051e-05 |
| 3,305 | Robust Query Processing in Co-Processor-accelerated Databases | 2016 | SIGMOD | 7.2460965e-05 |
| 3,327 | Pump Up the Volume: Processing Large Data on GPUs with Fast Interconnects | 2020 | SIGMOD | 7.2205738e-05 |
| 3,696 | Why it is time for a HyPE: A Hybrid Query Processing Engine for Efficient GPU Coprocessing in DBMS | 2013 | VLDB | 6.834483e-05 |
| 3,898 | Efficient Join Algorithms For Large Database Tables in a Multi-GPU Environment | 2021 | VLDB | 6.6551268e-05 |
| 4,002 | MG-Join: A Scalable Join for Massively Parallel Multi-GPU Architectures | 2021 | SIGMOD | 6.545665e-05 |
| 4,363 | Hardware-conscious Query Processing in GPU-accelerated Analytical Engines | 2019 | CIDR | 6.2552614e-05 |
| 4,678 | OmniDB: Towards Portable and Efficient Query Processing on Parallel CPU/GPU Architectures | 2013 | VLDB | 6.0046271e-05 |
| 4,999 | Adaptive Work Placement for Query Processing on Heterogeneous Computing Resources | 2017 | VLDB | 5.7752801e-05 |
| 5,019 | Orchestrating Data Placement and Query Execution in Heterogeneous CPU-GPU DBMS | 2022 | VLDB | 5.7559197e-05 |
| 5,040 | Tile-based Lightweight Integer Compression in GPU | 2022 | SIGMOD | 5.7425187e-05 |
| 5,197 | Data-Parallel Query Processing on Non-Uniform Data | 2020 | VLDB | 5.6347409e-05 |
| 5,247 | Triton Join: Efficiently Scaling to a Large Join State on GPUs with Fast Interconnects | 2022 | SIGMOD | 5.6057839e-05 |
| 5,814 | Towards a Hybrid Design for Fast Query Processing in DB2 with BLU Acceleration Using Graphical Processing Units: A Technology Demonstration | 2016 | SIGMOD | 5.3167137e-05 |
| 6,066 | GPU Database Systems Characterization and Optimization | 2024 | VLDB | 5.2290447e-05 |
| 6,223 | Distributed GPU Joins on Fast RDMA-capable Networks | 2023 | SIGMOD | 5.1496398e-05 |
| 6,369 | Improving Execution Efficiency of Just-in-time Compilation based Query Processing on GPUs | 2021 | VLDB | 5.0936663e-05 |
| 6,861 | HetCache: Synergising NVMe Storage and GPU acceleration for Memory-Efficient Analytics | 2023 | CIDR | 4.905263e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 3,898 | Efficient Join Algorithms For Large Database Tables in a Multi-GPU Environment | 2021 | VLDB | 6.6551268e-05 |
| 4,363 | Hardware-conscious Query Processing in GPU-accelerated Analytical Engines | 2019 | CIDR | 6.2552614e-05 |
| 2,067 | HippogriffDB: Balancing I/O and GPU Bandwidth in Big Data Analytics | 2016 | VLDB | 9.6392739e-05 |
| 3,696 | Why it is time for a HyPE: A Hybrid Query Processing Engine for Efficient GPU Coprocessing in DBMS | 2013 | VLDB | 6.834483e-05 |
| 7,916 | Terabyte-Scale Analytics in the Blink of an Eye | 2026 | VLDB | 4.6173899e-05 |
| 3,305 | Robust Query Processing in Co-Processor-accelerated Databases | 2016 | SIGMOD | 7.2460965e-05 |
| 2,330 | Concurrent Analytical Query Processing with GPUs | 2014 | VLDB | 9.0192228e-05 |
| 6,066 | GPU Database Systems Characterization and Optimization | 2024 | VLDB | 5.2290447e-05 |
| 5,019 | Orchestrating Data Placement and Query Execution in Heterogeneous CPU-GPU DBMS | 2022 | VLDB | 5.7559197e-05 |
| 10,749 | Scaling GPU-Accelerated Databases beyond GPU Memory Size | 2025 | VLDB | 4.1945683e-05 |