FPGA-based Multithreading for In-Memory Hash Joins
Summary: The first end-to-end in-memory hash join on an FPGA. It uses massive multithreading, with hundreds of on-chip thread contexts, to hide memory latency during the build and probe phases. Reported throughput is 2–3.4× that of multicore implementations at similar memory bandwidth (up to 76.8 GB/s and 1.6 billion tuples per second); the benefit drops under extreme skew. (summarized by gpt-5-mini on Feb 09 2026)
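For readers unfamiliar with the build and probe phases named in the summary, here is a minimal software sketch of a generic two-phase hash join. It is not the paper's FPGA design: the function name `hash_join` and the sample relations are hypothetical, chosen only for illustration. Each hash-table lookup in the probe loop stands in for the kind of random memory access whose latency the summary says multithreading is used to hide.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows):
    """Equi-join two lists of (key, payload) tuples on key.

    Illustrative sketch only; names and data layout are assumptions.
    """
    # Build phase: index the (usually smaller) relation by join key.
    table = defaultdict(list)
    for key, payload in build_rows:
        table[key].append(payload)

    # Probe phase: look up each probe tuple and emit one result per match.
    # .get() avoids inserting empty buckets for keys with no match.
    results = []
    for key, payload in probe_rows:
        for build_payload in table.get(key, ()):
            results.append((key, build_payload, payload))
    return results


# Hypothetical sample relations.
r = [(1, "a"), (2, "b"), (2, "c")]
s = [(2, "x"), (3, "y"), (1, "z")]
joined = hash_join(r, s)
```

In this sketch a duplicated build key (key 2) yields one output tuple per matching build row, and a probe key with no match (key 3) yields nothing.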
Incoming Citations (Sorted by Pagerank)
Showing 6 of 6 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,178 | FPGA-based Data Partitioning | 2017 | SIGMOD | 5.6438393e-05 |
| 5,247 | Triton Join: Efficiently Scaling to a Large Join State on GPUs with Fast Interconnects | 2022 | SIGMOD | 5.6057839e-05 |
| 8,018 | Parallelizing Intra-Window Join on Multicores: An Experimental Study | 2021 | SIGMOD | 4.6046381e-05 |
| 9,785 | Is FPGA Useful for Hash Joins? Exploring Hash Joins on Coupled CPU-FPGA Architecture | 2020 | CIDR | 4.284797e-05 |
| 9,838 | Efficiently Joining Large Relations on Multi-GPU Systems | 2025 | VLDB | 4.2740344e-05 |
| 10,993 | SPID-Join: A Skew-resistant Processing-in-DIMM Join Algorithm Exploiting the Bank- and Rank-level Parallelisms of DIMMs | 2024 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 52 | Database Architecture Optimized for the new Bottleneck: Memory Access | 1999 | VLDB | 0.00066474881 |
| 81 | Cache Conscious Algorithms for Relational Query Processing | 1994 | VLDB | 0.00055548574 |
| 145 | Quickly Generating Billion-Record Synthetic Databases | 1994 | SIGMOD | 0.0004138408 |
| 351 | Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs | 2009 | VLDB | 0.0002636504 |
| 404 | Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited | 2014 | VLDB | 0.00024143076 |
| 540 | Design and Evaluation of Main Memory Hash Join Algorithms for Multi-core CPUs | 2011 | SIGMOD | 0.0002063443 |
| 585 | Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems | 2012 | VLDB | 0.00019706145 |
| 1,694 | How Soccer Players Would do Stream Joins | 2011 | SIGMOD | 0.00010893764 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,460 | Performance Analysis of a Load Balancing Hash-Join Algorithm for a Shared Memory Multiprocessor | 1991 | VLDB | 6.1635864e-05 |
| 404 | Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited | 2014 | VLDB | 0.00024143076 |
| 4,781 | On Parallel Execution Of Multiple Pipelined Hash Joins | 1994 | SIGMOD | 5.9261504e-05 |
| 2,519 | Revisiting Co-Processing for Hash Joins on the Coupled CPU-GPU Architecture | 2013 | VLDB | 8.6078505e-05 |
| 78 | Multiprocessor Hash-Based Join Algorithms | 1985 | VLDB | 0.00056413752 |
| 2,619 | Hash-Based Join Algorithms for Multiprocessor Computers with Shared Memory | 1990 | VLDB | 8.4431973e-05 |
| 351 | Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs | 2009 | VLDB | 0.0002636504 |
| 5,178 | FPGA-based Data Partitioning | 2017 | SIGMOD | 5.6438393e-05 |
| 540 | Design and Evaluation of Main Memory Hash Join Algorithms for Multi-core CPUs | 2011 | SIGMOD | 0.0002063443 |
| 9,785 | Is FPGA Useful for Hash Joins? Exploring Hash Joins on Coupled CPU-FPGA Architecture | 2020 | CIDR | 4.284797e-05 |