Database Paper Browser

Similarity Query Processing for High-Dimensional Data

Summary: A tutorial surveying similarity query processing for high-dimensional data, connecting database (DB) and machine learning (ML) research through embeddings, auto-encoders, and pre-trained models. It reviews exact and approximate methods (cover trees, LSH, product quantization, proximity graphs) and ML-driven selectivity estimation, and discusses DB–ML synergy as a way to encourage both ML-for-DB and DB-for-ML solutions. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 12223
Venue: VLDB
Year: 2020
Pagerank: 6.2953764e-05
Overall Rank: 4,278 | 70.24%
DOI: 10.14778/3415478.3415564

Authors

Incoming Citations (Sorted by Pagerank)

Showing 13 of 13 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
2,324 | RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search | 2024 | SIGMOD | 9.0326444e-05
2,811 | High-Dimensional Approximate Nearest Neighbor Search: with Reliable and Efficient Distance Comparison Operations | 2023 | SIGMOD | 8.0806307e-05
3,335 | DeepJoin: Joinable Table Discovery with Pre-trained Language Models | 2023 | VLDB | 7.2065006e-05
4,551 | iRangeGraph: Improvising Range-dedicated Graphs for Range-filtering Nearest Neighbor Search | 2024 | SIGMOD | 6.092287e-05
4,598 | Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search | 2025 | SIGMOD | 6.0586236e-05
5,184 | SymphonyQG: Towards Symphonious Integration of Quantization and Graph for Approximate Nearest Neighbor Search | 2025 | SIGMOD | 5.6406991e-05
7,369 | Using VDMS to Index and Search 100M Images | 2021 | VLDB | 4.750437e-05
7,611 | UNIFY: Unified Index for Range Filtered Approximate Nearest Neighbors Search | 2025 | VLDB | 4.6964271e-05
7,654 | LiteHST: A Tree Embedding based Method for Similarity Search | 2023 | SIGMOD | 4.687476e-05
8,783 | GEqO: ML-Accelerated Semantic Equivalence Detection | 2023 | SIGMOD | 4.452825e-05
10,042 | Accelerating High-Dimensional ANN Search via Skipping Redundant Distance Computations | 2026 | SIGMOD | 4.1945683e-05
10,841 | Filtered Vector Search: State-of-the-art and Research Opportunities | 2025 | VLDB | 4.1945683e-05
11,378 | Interactive Mining with Ordered and Unordered Attributes | 2022 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 22 of 22 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
34 | Similarity Search in High Dimensions via Hashing | 1999 | VLDB | 0.00076637636
79 | A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces | 1998 | VLDB | 0.00056242144
91 | M-tree: An Efficient Access Method for Similarity Search in Metric Spaces | 1997 | VLDB | 0.0005181666
102 | The Case for Learned Index Structures | 2018 | SIGMOD | 0.00049545203
212 | Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph | 2019 | VLDB | 0.00033913475
300 | Deep Learning for Entity Matching: A Design Space Exploration | 2018 | SIGMOD | 0.00028441466
643 | Corleone: Hands-Off Crowdsourcing for Entity Matching | 2014 | SIGMOD | 0.00018754451
682 | Quality and Efficiency in High Dimensional Nearest Neighbor Search | 2009 | SIGMOD | 0.00018201541
867 | SRS: Solving c-Approximate Nearest Neighbor Queries in High Dimensional Euclidean Space with a Tiny Index | 2015 | VLDB | 0.00015792021
931 | The Pyramid-Technique: Towards Breaking the Curse of Dimensionality | 1998 | SIGMOD | 0.00015238406
1,757 | VHP: Approximate Nearest Neighbor Search via Virtual Hypersphere Partitioning | 2020 | VLDB | 0.00010660932
1,806 | Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces | 2000 | VLDB | 0.00010490769
2,141 | LSH Ensemble: Internet-Scale Domain Search | 2016 | VLDB | 9.4542625e-05
2,165 | Self-Tuning, GPU-Accelerated Kernel Density Models for Multidimensional Selectivity Estimation | 2015 | SIGMOD | 9.389622e-05
2,175 | Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services | 2017 | SIGMOD | 9.3644117e-05
2,181 | PM-LSH: A Fast and Accurate LSH Framework for High-Dimensional Approximate NN Search | 2020 | VLDB | 9.3451821e-05
3,459 | An Empirical Evaluation of Set Similarity Join Techniques | 2016 | VLDB | 7.072508e-05
5,622 | Monotonic Cardinality Estimation of Similarity Selection: A Deep Learning Approach | 2020 | SIGMOD | 5.4060403e-05
5,958 | Fine-grained Concept Linking using Neural Networks in Healthcare | 2018 | SIGMOD | 5.2563968e-05
6,074 | Pigeonring: A Principle for Faster Thresholded Similarity Search | 2019 | VLDB | 5.2242306e-05
7,005 | Indexing the Edges – A simple and yet efficient approach to high-dimensional indexing | 2000 | PODS | 4.8654221e-05
8,384 | Consistent and Flexible Selectivity Estimation for High-Dimensional Data | 2021 | SIGMOD | 4.5304673e-05

Semantically Similar Papers