Database Paper Browser


Similarity Search in High Dimensions via Hashing

Summary: Presents a hashing-based scheme for approximate nearest-neighbor search in high-dimensional data. The hash functions are chosen so that nearby points collide with higher probability than points that are far apart. Experiments show substantial speedups over hierarchical tree-based methods and scalability beyond 50 dimensions, which addresses the curse of dimensionality. (summarized by gpt-5-nano on Feb 09 2026)
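
The collision idea in the summary can be illustrated with a minimal sketch. It assumes binary vectors compared by Hamming distance and uses bit sampling as the hash family; the class name BitSamplingLSH, the helper hamming, and the parameters num_tables and bits_per_key are hypothetical names chosen for illustration, and the default values are arbitrary rather than taken from the paper.

```python
import random
from collections import defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


class BitSamplingLSH:
    """Illustrative locality-sensitive index for binary vectors (assumed setup)."""

    def __init__(self, dim, num_tables=8, bits_per_key=12, seed=0):
        rng = random.Random(seed)
        # Each table hashes a point by reading a fixed random subset of its bits.
        # Points that differ in few positions are likely to agree on that subset.
        self.projections = [
            [rng.randrange(dim) for _ in range(bits_per_key)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.points = []

    def _key(self, point, positions):
        return tuple(point[i] for i in positions)

    def add(self, point):
        """Store a point in every table and return its index."""
        idx = len(self.points)
        self.points.append(point)
        for positions, table in zip(self.projections, self.tables):
            table[self._key(point, positions)].append(idx)
        return idx

    def query(self, point):
        """Return the index of the closest colliding point, or None if no collision."""
        candidates = set()
        for positions, table in zip(self.projections, self.tables):
            # .get avoids creating empty buckets in the defaultdict.
            candidates.update(table.get(self._key(point, positions), ()))
        if not candidates:
            return None
        # Only the colliding candidates are compared exactly, not the whole dataset.
        return min(candidates, key=lambda i: hamming(self.points[i], point))
```

In this sketch, more tables raise the chance that a true near neighbor collides with the query at least once, while more bits per key make each bucket smaller and more selective. The result is approximate: a near neighbor that collides in no table is missed.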

Paper ID: 8595
Venue: VLDB
Year: 1999
Pagerank: 0.00076637636
Overall Rank: 34 | 99.77%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 50 of 124 citing papers.

Rank Citing Paper Year Venue Pagerank
212 Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph 2019 VLDB 0.00033913475
266 Efficient Exact Set-Similarity Joins 2006 VLDB 0.00029718727
400 Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search 2007 VLDB 0.0002427237
447 Efficient Parallel Set-Similarity Joins Using MapReduce 2010 SIGMOD 0.00022900171
495 Milvus: A Purpose-Built Vector Data Management System 2021 SIGMOD 0.00021767688
539 Fast Time Sequence Indexing for Arbitrary L_p Norms 2000 VLDB 0.00020666392
562 Query-Aware Locality-Sensitive Hashing for Approximate Nearest Neighbor Search 2016 VLDB 0.00020091752
605 Locality-Sensitive Hashing Scheme Based on Dynamic Collision Counting 2012 SIGMOD 0.000193396
682 Quality and Efficiency in High Dimensional Nearest Neighbor Search 2009 SIGMOD 0.00018201541
709 Efficient Similarity Search and Classification via Rank Aggregation 2003 SIGMOD 0.00017768547
736 AnalyticDB-V: A Hybrid Analytical Engine Towards Query Fusion for Structured and Unstructured Data 2020 VLDB 0.00017447617
754 Distributed Representations of Tuples for Entity Resolution 2018 VLDB 0.00017117211
801 SageDB: A Learned Database System 2019 CIDR 0.00016505496
867 SRS: Solving c-Approximate Nearest Neighbor Queries in High Dimensional Euclidean Space with a Tiny Index 2015 VLDB 0.00015792021
1,005 FREDDY: Fast Word Embeddings in Database Systems 2018 SIGMOD 0.00014692665
1,229 SK-LSH : An Efficient Index Structure for Approximate Nearest Neighbor Search 2014 VLDB 0.00013157271
1,234 Ed-Join: An Efficient Algorithm for Similarity Joins With Edit Distance Constraints 2008 VLDB 0.00013122499
1,269 Cache locality is not enough: High-Performance Nearest Neighbor Search with Product Quantization Fast Scan 2016 VLDB 0.00012930432
1,298 Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms 2019 VLDB 0.00012758104
1,305 Bayesian Locality Sensitive Hashing for Fast Similarity Search 2012 VLDB 0.00012687101
1,925 The A-tree: An Index Structure for High-Dimensional Spaces Using Relative Approximation 2000 VLDB 0.00010073407
1,931 Efficient Processing of k Nearest Neighbor Joins using MapReduce 2012 VLDB 0.00010040427
1,976 Towards Effective Partition Management for Large Graphs 2012 SIGMOD 9.8844201e-05
2,073 Extending Autocompletion To Tolerate Errors 2009 SIGMOD 9.6142791e-05
2,181 PM-LSH: A Fast and Accurate LSH Framework for High-Dimensional Approximate NN Search 2020 VLDB 9.3451821e-05
2,262 Manu: A Cloud Native Vector Database Management System 2022 VLDB 9.1624446e-05
2,523 ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data 2024 SIGMOD 8.604576e-05
2,542 Scalable Discovery of Best Clusters on Large Graphs 2010 VLDB 8.5794502e-05
2,641 Locality-Sensitive Hashing for Earthquake Detection: A Case Study of Scaling Data-Driven Science 2018 VLDB 8.3905374e-05
2,681 NET-FLi: On-the-fly Compression, Archiving and Indexing of Streaming Network Traffic 2010 VLDB 8.3232427e-05
2,836 Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning 2023 VLDB 8.0443826e-05
2,971 Towards Efficient Index Construction and Approximate Nearest Neighbor Search in High-Dimensional Spaces 2023 VLDB 7.7970531e-05
3,056 DSH: Data Sensitive Hashing for High-Dimensional k-NN Search 2014 SIGMOD 7.6432146e-05
3,225 DeltaPQ: Lossless Product Quantization Code Compression for High Dimensional Similarity Search 2020 VLDB 7.3463484e-05
3,294 Approximate Embedding-Based Subsequence Matching of Time Series 2008 SIGMOD 7.2619257e-05
3,459 An Empirical Evaluation of Set Similarity Join Techniques 2016 VLDB 7.072508e-05
3,490 Leveraging Set Relations in Exact Set Similarity Join 2017 VLDB 7.0465856e-05
3,629 The Lernaean Hydra of Data Series Similarity Search: An Experimental Evaluation of the State of the Art 2019 VLDB 6.902069e-05
3,680 SingleStore-V: An Integrated Vector Database System in SingleStore 2024 VLDB 6.8496415e-05
3,772 FEXIPRO: Fast and Exact Inner Product Retrieval in Recommender Systems 2017 SIGMOD 6.7761705e-05
3,774 Efficient Exact Edit Similarity Query Processing with the Asymmetric Signature Scheme 2011 SIGMOD 6.7757301e-05
3,823 Automatic Discovery of Attributes in Relational Databases 2011 SIGMOD 6.7261168e-05
3,868 An Efficient Filter for Approximate Membership Checking 2008 SIGMOD 6.6822543e-05
3,915 A Benchmarking Study of Embedding-based Entity Alignment for Knowledge Graphs 2020 VLDB 6.6332294e-05
4,090 Finding Near Neighbors Through Cluster Pruning 2007 PODS 6.4577834e-05
4,243 Locality-Sensitive Hashing Scheme based on Longest Circular Co-Substring 2020 SIGMOD 6.32976e-05
4,250 Local Similarity Search for Unstructured Text 2016 SIGMOD 6.3241139e-05
4,278 Similarity Query Processing for High-Dimensional Data 2020 VLDB 6.2953764e-05
4,353 Overlap Set Similarity Joins with Theoretical Guarantees 2018 SIGMOD 6.263585e-05
4,401 LEMP: Fast Retrieval of Large Entries in a Matrix Product 2015 SIGMOD 6.2211271e-05

Outgoing Citations (Sorted by Pagerank)

Showing 9 of 9 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers