Database Paper Browser

DiskJoin: Large-scale Vector Similarity Join with SSD

Summary: DiskJoin is the first disk-based similarity join for billion-scale vectors on a single machine. It leverages NVMe SSDs to avoid costly cluster communication, minimizes read amplification through SSD-aware access, and combines dynamic caching and eviction policies with probabilistic pruning to achieve 50×–1000× speedups. (summarized by gpt-5-mini on Feb 11 2026)

Paper ID: 7376
Venue: SIGMOD
Year: 2026
Pagerank: 4.1945683e-05
Overall Rank: 10,068 | 29.96%
DOI: 10.1145/3769780

Incoming Non-self Citations Over Time

No non-self incoming citations found for this paper in this database.

Incoming Citations (Sorted by Pagerank)

Showing 0 of 0 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 17 of 17 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
91 | M-tree: An Efficient Access Method for Similarity Search in Metric Spaces | 1997 | VLDB | 0.0005181666
148 | Efficient Processing of Spatial Joins Using R-trees | 1993 | SIGMOD | 0.00041182766
266 | Efficient Exact Set-Similarity Joins | 2006 | VLDB | 0.00029718727
562 | Query-Aware Locality-Sensitive Hashing for Approximate Nearest Neighbor Search | 2016 | VLDB | 0.00020091752
605 | Locality-Sensitive Hashing Scheme Based on Dynamic Collision Counting | 2012 | SIGMOD | 0.000193396
940 | SharedDB: Killing One Thousand Queries With One Stone | 2012 | VLDB | 0.00015173166
1,234 | Ed-Join: An Efficient Algorithm for Similarity Joins With Edit Distance Constraints | 2008 | VLDB | 0.00013122499
1,676 | Speedup Graph Processing by Graph Ordering | 2016 | SIGMOD | 0.00010946423
2,281 | Epsilon Grid Order: An Algorithm for the Similarity Join on Massive High-Dimensional Data | 2001 | SIGMOD | 9.1077704e-05
2,690 | Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment | 2024 | SIGMOD | 8.293714e-05
3,141 | ClusterJoin: A Similarity Joins Framework using Map-Reduce | 2014 | VLDB | 7.4829448e-05
3,514 | Spatio-Textual Similarity Joins | 2013 | VLDB | 7.0226998e-05
3,609 | Similarity search in the blink of an eye with compressed indices | 2023 | VLDB | 6.9215236e-05
3,624 | SeRF: Segment Graph for Range-Filtering Approximate Nearest Neighbor Search | 2024 | SIGMOD | 6.9056e-05
3,774 | Efficient Exact Edit Similarity Query Processing with the Asymmetric Signature Scheme | 2011 | SIGMOD | 6.7757301e-05
4,598 | Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search | 2025 | SIGMOD | 6.0586236e-05
7,544 | A Topology-Aware Localized Update Strategy for Graph-Based ANN Index | 2026 | VLDB | 4.7149033e-05

Semantically Similar Papers

Overall Rank | Paper | Year | Venue | Pagerank
5,220 | Similarity Join Size Estimation using Locality Sensitive Hashing | 2011 | VLDB | 5.6216111e-05
10,706 | Extensible and Robust Evaluation of Similarity Queries | 2025 | VLDB | 4.1945683e-05
250 | Efficient set joins on similarity predicates | 2004 | SIGMOD | 0.00030661988
9,143 | Similarity Query Processing Using Disk Arrays | 1998 | SIGMOD | 4.3850454e-05
7,765 | Cache-oblivious High-performance Similarity Join | 2019 | SIGMOD | 4.6572085e-05
13,473 | Exploiting Database Similarity Joins for Metric Spaces | 2012 | VLDB | -
6,507 | Similarity Join over Array Data | 2016 | SIGMOD | 5.0337166e-05
10,930 | Similarity Joins of Sparse Features | 2024 | SIGMOD | 4.1945683e-05
3,141 | ClusterJoin: A Similarity Joins Framework using Map-Reduce | 2014 | VLDB | 7.4829448e-05
8,899 | Fast Approximate Similarity Join in Vector Databases | 2025 | SIGMOD | 4.427232e-05