Database Paper Browser

Scaling GPU-Accelerated Databases beyond GPU Memory Size

Summary: The paper proposes hybrid CPU-GPU query processing: the CPU performs high-throughput selective filtering to reduce PCIe transfers, while the GPU executes compute-heavy operators (joins). On TPC-H at scale factors up to SF1000 (1 TB) with a single A100 (80 GB), the approach scales beyond GPU memory and beats a top CPU-only database on both performance and cost. (summarized by gpt-5-mini on Feb 09 2026)
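
The division of labor described in the summary can be illustrated with a small, purely host-side sketch. It is not the paper's implementation: the two-column row layout, the 8-byte row size, the threshold predicate, and the function names `cpu_filter`, `transfer_bytes`, and `device_hash_join` are all assumptions made for illustration, and the "device" join is ordinary Python standing in for a GPU kernel. The point is only that filtering on the CPU first shrinks the data that would have to cross PCIe before the join runs.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, int]  # (join_key, value); hypothetical two-column layout


def cpu_filter(rows: List[Row], threshold: int) -> List[Row]:
    """Host-side selective filter: keep rows whose value is below threshold."""
    return [r for r in rows if r[1] < threshold]


def transfer_bytes(rows: List[Row], bytes_per_row: int = 8) -> int:
    """Bytes that would cross PCIe if these rows were shipped to the device.

    The 8-byte row size is an assumption (two 4-byte integers).
    """
    return len(rows) * bytes_per_row


def device_hash_join(build: List[Row], probe: List[Row]) -> List[Tuple[int, int, int]]:
    """Stand-in for the GPU join: build a hash table, then probe it."""
    table: Dict[int, List[int]] = {}
    for key, val in build:
        table.setdefault(key, []).append(val)
    # Emit (key, build_value, probe_value) for every matching pair.
    return [(key, b, val) for key, val in probe for b in table.get(key, [])]


# Toy inputs; table names echo TPC-H, but the contents are made up.
orders = [(k, k * 10) for k in range(1000)]        # build side
lineitem = [(i % 1000, i) for i in range(10000)]   # probe side

filtered = cpu_filter(lineitem, threshold=500)     # selective predicate on the CPU
saved = transfer_bytes(lineitem) - transfer_bytes(filtered)
result = device_hash_join(orders, filtered)        # join only the surviving rows

print(f"rows sent: {len(filtered)} of {len(lineitem)}, bytes saved: {saved}")
print(f"join output rows: {len(result)}")
```

In this toy setup the predicate keeps 5% of the probe rows, so the modeled transfer volume drops by the same factor. A real system would also have to weigh CPU filter throughput against PCIe bandwidth, which this sketch does not model.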

Paper ID: 14063
Venue: VLDB
Year: 2025
Pagerank: 4.1945683e-05
Overall Rank: 10,749 | 25.23%
DOI: 10.14778/3749646.3749710

Incoming Non-self Citations Over Time

No non-self incoming citations found for this paper in this database.

Authors

Incoming Citations (Sorted by Pagerank)

Showing 4 of 4 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 40 of 40 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
21 | C-Store: A Column-oriented DBMS | 2005 | VLDB | 0.00086087497
131 | Integrating Compression and Execution in Column-Oriented Database Systems | 2006 | SIGMOD | 0.0004370331
232 | A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment | 1989 | SIGMOD | 0.00032122485
305 | SIMD-Scan: Ultra Fast in-Memory Table Scan using on-Chip Vector Processing Units | 2009 | VLDB | 0.00028248614
343 | Implementing Database Operations Using SIMD Instructions | 2002 | SIGMOD | 0.00026768139
365 | On the Power of Magic | 1987 | PODS | 0.00025585898
775 | Relational Joins on Graphics Processors | 2008 | SIGMOD | 0.00016823862
958 | Rethinking SIMD Vectorization for In-Memory Databases | 2015 | SIGMOD | 0.00015045316
1,270 | BitWeaving: Fast Scans for Main Memory Data Processing | 2013 | SIGMOD | 0.00012926086
1,273 | The Yin and Yang of Processing Data Warehousing Queries on GPU Devices | 2013 | VLDB | 0.00012912938
1,287 | Hardware-Oblivious Parallelism for In-Memory Column-Stores | 2013 | VLDB | 0.00012820443
1,618 | Row-wise Parallel Predicate Evaluation | 2008 | VLDB | 0.00011114015
2,014 | Voodoo - A Vector Algebra for Portable Database Performance on Modern Hardware | 2016 | VLDB | 9.7904029e-05
2,040 | A Study of the Fundamental Performance Characteristics of GPUs and CPUs for Database Analytics | 2020 | SIGMOD | 9.7057698e-05
2,287 | Pipelined Query Processing in Coprocessor Environments | 2018 | SIGMOD | 9.0972606e-05
2,390 | ByteSlice: Pushing the Envelop of Main Memory Data Processing with a New Storage Layout | 2015 | SIGMOD | 8.9084657e-05
2,651 | HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines | 2019 | VLDB | 8.3694317e-05
2,882 | Database Compression on Graphics Processors | 2010 | VLDB | 7.9661218e-05
3,151 | A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs | 2017 | SIGMOD | 7.4720668e-05
3,254 | Query Processing on Tensor Computation Runtimes | 2022 | VLDB | 7.3161051e-05
3,898 | Efficient Join Algorithms For Large Database Tables in a Multi-GPU Environment | 2021 | VLDB | 6.6551268e-05
4,002 | MG-Join: A Scalable Join for Massively Parallel Multi-GPU Architectures | 2021 | SIGMOD | 6.545665e-05
4,085 | In-Cache Query Co-Processing on Coupled CPU-GPU Architectures | 2015 | VLDB | 6.4620277e-05
4,276 | Looking Ahead Makes Query Plans Robust: Making the Initial Case with In-Memory Star Schema Data Warehouse Workloads | 2017 | VLDB | 6.2976602e-05
4,518 | The FastLanes Compression Layout: Decoding >100 Billion Integers per Second with Scalar Code | 2023 | VLDB | 6.117844e-05
5,019 | Orchestrating Data Placement and Query Execution in Heterogeneous CPU-GPU DBMS | 2022 | VLDB | 5.7559197e-05
5,040 | Tile-based Lightweight Integer Compression in GPU | 2022 | SIGMOD | 5.7425187e-05
5,194 | Bitvector-aware Query Optimization for Decision Support Queries | 2020 | SIGMOD | 5.6368209e-05
5,300 | Applying Hash Filters To Improving The Execution Of Bushy Trees | 1993 | VLDB | 5.5793265e-05
5,765 | Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries | 2024 | CIDR | 5.336442e-05
5,814 | Towards a Hybrid Design for Fast Query Processing in DB2 with BLU Acceleration Using Graphical Processing Units: A Technology Demonstration | 2016 | SIGMOD | 5.3167137e-05
6,066 | GPU Database Systems Characterization and Optimization | 2024 | VLDB | 5.2290447e-05
6,223 | Distributed GPU Joins on Fast RDMA-capable Networks | 2023 | SIGMOD | 5.1496398e-05
6,369 | Improving Execution Efficiency of Just-in-time Compilation based Query Processing on GPUs | 2021 | VLDB | 5.0936663e-05
6,604 | MotherDuck: DuckDB in the cloud and in the client | 2024 | CIDR | 4.9971118e-05
7,427 | Selection Pushdown in Column Stores using Bit Manipulation Instructions | 2023 | SIGMOD | 4.7327406e-05
8,415 | Pruning in Snowflake: Working Smarter, Not Harder | 2025 | SIGMOD | 4.5197687e-05
8,506 | New Query Optimization Techniques in the Spark Engine of Azure Synapse | 2022 | VLDB | 4.4957661e-05
8,846 | Scaling your Hybrid CPU-GPU DBMS to Multiple GPUs | 2024 | VLDB | 4.4372012e-05
9,695 | Share the Tensor Tea: How Databases can Leverage the Machine Learning Ecosystem | 2022 | VLDB | 4.3025567e-05

Semantically Similar Papers