Database Paper Browser

Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs

Summary: Revisits the comparison of hash join and sort-merge join on modern multi-core CPUs using optimized parallel implementations. Hash join exceeds 100M tuples/s on an Intel Core i7, while sort-merge join delivers 47–80M tuples/s. Analytical models indicate that wider SIMD and more cores favor sort-merge join, suggesting that future architectures may shift the advantage away from hash join. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 9865
Venue: VLDB
Year: 2009
Pagerank: 0.0002636504
Overall Rank: 351 | 97.57%
DOI: -

Authors

Incoming Citations (Sorted by Pagerank)

Showing 30 of 80 citing papers.

Rank Citing Paper Year Venue Pagerank
6,221 Charting the Design Space of Query Execution using VOILA 2021 VLDB 5.1512158e-05
6,304 Elastic Pipelining in an In-Memory Database Cluster 2016 SIGMOD 5.1210182e-05
6,434 Patience is a Virtue: Revisiting Merge and Sort on Modern Processors 2014 SIGMOD 5.0640194e-05
6,524 The 3D Hash Join: Building On Non-Unique Join Attributes 2022 CIDR 5.0274964e-05
6,525 Database Technology for the Masses: Sub-Operators as First-Class Entities 2021 VLDB 5.027205e-05
6,540 Data Partitioning for In-Memory Systems: Myths, Challenges, and Opportunities 2019 CIDR 5.0219214e-05
7,097 Fast Multi-Column Sorting in Main-Memory Column-Stores 2016 SIGMOD 4.8336115e-05
7,492 Krypton: Real-time Serving and Analytical SQL Engine at ByteDance 2023 VLDB 4.7180617e-05
8,018 Parallelizing Intra-Window Join on Multicores: An Experimental Study 2021 SIGMOD 4.6046381e-05
8,094 Modularis: Modular Relational Analytics over Heterogeneous Distributed Platforms 2021 VLDB 4.5867812e-05
8,417 The Case for Learned In-Memory Joins 2023 VLDB 4.5194164e-05
8,432 SPRINTER: A Fast n-ary Join Query Processing Method for Complex OLAP Queries 2020 SIGMOD 4.5153924e-05
8,478 Analyzing Vectorized Hash Tables Across CPU Architectures 2023 VLDB 4.5015937e-05
8,514 UPLIFT: Parallelization Strategies for Feature Transformations in Machine Learning Workloads 2022 VLDB 4.4944285e-05
8,626 Adaptive Code Generation for Data-Intensive Analytics 2021 VLDB 4.4829152e-05
8,680 A Practical Approach to Groupjoin and Nested Aggregates 2021 VLDB 4.4694927e-05
8,781 Accelerate Distributed Joins with Predicate Transfer 2025 SIGMOD 4.4534753e-05
8,855 A Design Space Exploration and Evaluation for Main-Memory Hash Joins in Storage Class Memory 2023 VLDB 4.4348906e-05
9,142 Design and Analysis of a Processing-in-DIMM Join Algorithm: A Case Study with UPMEM DIMMs 2023 SIGMOD 4.3853149e-05
9,838 Efficiently Joining Large Relations on Multi-GPU Systems 2025 VLDB 4.2740344e-05
9,944 Out-of-order Execution of Database Queries 2020 VLDB 4.2446672e-05
10,121 TQEx: Tensor-based Query Engine Enhanced by Bridging the Gap 2026 SIGMOD 4.1945683e-05
10,372 Data Chunk Compaction in Vectorized Execution 2025 SIGMOD 4.1945683e-05
10,494 Nested Parquet Is Flat, Why Not Use It? How To Scan Nested Data With On-the-Fly Key Generation and Joins 2025 SIGMOD 4.1945683e-05
10,981 Enabling Adaptive Sampling for Intra-Window Join: Simultaneously Optimizing Quantity and Quality 2024 SIGMOD 4.1945683e-05
10,993 SPID-Join: A Skew-resistant Processing-in-DIMM Join Algorithm Exploiting the Bank- and Rank-level Parallelisms of DIMMs 2024 SIGMOD 4.1945683e-05
11,142 Cache-Efficient Top-k Aggregation over High Cardinality Large Datasets 2024 VLDB 4.1945683e-05
11,237 Cracking-Like Join for Trusted Execution Environments 2023 VLDB 4.1945683e-05
11,358 Scaling Equi-Joins 2022 SIGMOD 4.1945683e-05
11,381 Origami: A High-Performance Mergesort Framework 2022 VLDB 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 17 of 17 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank Cited Paper Year Venue Pagerank
52 Database Architecture Optimized for the new Bottleneck: Memory Access 1999 VLDB 0.00066474881
78 Multiprocessor Hash-Based Join Algorithms 1985 VLDB 0.00056413752
81 Cache Conscious Algorithms for Relational Query Processing 1994 VLDB 0.00055548574
84 AlphaSort: A RISC Machine Sort 1994 SIGMOD 0.00053866006
145 Quickly Generating Billion-Record Synthetic Databases 1994 SIGMOD 0.0004138408
233 A Study of Index Structures for Main Memory Database Management Systems 1986 VLDB 0.00032021526
239 GPUTeraSort: High Performance Graphics Co-processor Sorting for Large Database Management 2006 SIGMOD 0.00031617428
343 Implementing Database Operations Using SIMD Instructions 2002 SIGMOD 0.00026768139
714 Adaptive Aggregation on Chip Multiprocessors 2007 VLDB 0.00017730584
775 Relational Joins on Graphics Processors 2008 SIGMOD 0.00016823862
946 Efficient Implementation of Sorting on Multi-Core SIMD CPU Architecture 2008 VLDB 0.0001513324
1,079 What happens during a Join? Dissecting CPU and Memory Optimization Effects 2000 VLDB 0.00014233415
1,365 Handling Data Skew in Multiprocessor Database Computers Using Partition Tuning 1991 VLDB 0.00012368421
1,856 An Adaptive Hash Join Algorithm for Multiuser Environments 1990 VLDB 0.00010304993
2,619 Hash-Based Join Algorithms for Multiprocessor Computers with Shared Memory 1990 VLDB 8.4431973e-05
2,763 Executing Stream Joins on the Cell Processor 2007 VLDB 8.1579306e-05
2,778 Database Servers on Chip Multiprocessors: Limitations and Opportunities 2007 CIDR 8.1321802e-05

Semantically Similar Papers