Database Paper Browser


VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams

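Summary: VGRAM selects high-quality variable-length grams from a string collection to speed up approximate string queries. It specifies how to generate a string's grams from the preselected grams and how the similarity of two strings' gram sets relates to their edit distance, so existing approximate string algorithms can adopt it with minimal changes; experiments on real data sets show performance improvements on existing algorithms. (summarized by gpt-5-nano on Feb 09 2026)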
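
The link between gram-set similarity and edit distance can be illustrated with the classic count filter for fixed-length q-grams, which is the baseline that variable-length grams generalize. The sketch below is not VGRAM's own bound or gram-selection procedure; it assumes unpadded q-grams, and all function names are hypothetical. If two strings are within edit distance k, each edit can destroy at most q grams, so they must share at least max(|a|, |b|) - q + 1 - k*q grams.

```python
from collections import Counter


def qgrams(s: str, q: int) -> Counter:
    """Bag of overlapping length-q substrings of s (no padding; assumption)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def common_grams(a: str, b: str, q: int) -> int:
    """Size of the bag intersection of the two strings' q-gram bags."""
    return sum((qgrams(a, q) & qgrams(b, q)).values())


def count_filter_bound(a: str, b: str, q: int, k: int) -> int:
    """Minimum number of shared q-grams if edit_distance(a, b) <= k."""
    return max(len(a), len(b)) - q + 1 - k * q


def passes_count_filter(a: str, b: str, q: int, k: int) -> bool:
    """False means the pair cannot be within edit distance k and can be pruned.

    True only means the pair is a candidate that still needs verification.
    """
    return common_grams(a, b, q) >= count_filter_bound(a, b, q, k)


if __name__ == "__main__":
    # One deletion apart: survives the filter as a candidate.
    print(passes_count_filter("similarity", "similarty", q=2, k=1))
    # Too few shared grams: pruned without computing edit distance.
    print(passes_count_filter("similarity", "simulation", q=2, k=1))
```

The filter is only a necessary condition, so surviving pairs still require an exact edit-distance check. Longer grams make inverted lists shorter but weaken the bound, which is the trade-off that motivates choosing gram lengths per gram rather than fixing a single q.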

Paper ID: 9585
Venue: VLDB
Year: 2007
Pagerank: 0.00013326298
Overall Rank: 1,202 | 91.64%
DOI: -

[Chart: Incoming Non-self Citations Over Time]

Authors

Incoming Citations (Sorted by Pagerank)

Showing 28 of 28 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
509 | On Active Learning of Record Matching Packages | 2010 | SIGMOD | 0.00021409518
936 | Framework for Evaluating Clustering Algorithms in Duplicate Detection | 2009 | VLDB | 0.0001521549
1,234 | Ed-Join: An Efficient Algorithm for Similarity Joins With Edit Distance Constraints | 2008 | VLDB | 0.00013122499
1,396 | Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search | 2012 | SIGMOD | 0.00012204748
1,944 | WHAM: A High-throughput Sequence Alignment Method | 2011 | SIGMOD | 0.00010004608
2,193 | Cost-Based Variable-Length-Gram Selection for String Collections to Support Approximate Queries Efficiently | 2008 | SIGMOD | 9.3178557e-05
3,578 | Efficient Approximate Entity Extraction with Edit Distance Constraints | 2009 | SIGMOD | 6.9503858e-05
3,774 | Efficient Exact Edit Similarity Query Processing with the Asymmetric Signature Scheme | 2011 | SIGMOD | 6.7757301e-05
4,216 | Trie-Join: Efficient Trie-based String Similarity Joins with Edit-Distance Constraints | 2010 | VLDB | 6.3521675e-05
4,359 | Astrid: Accurate Selectivity Estimation for String Predicates using Deep Learning | 2021 | VLDB | 6.2569955e-05
4,435 | Sampling Dirty Data for Matching Attributes | 2010 | SIGMOD | 6.1918164e-05
4,901 | Probabilistic String Similarity Joins | 2010 | SIGMOD | 5.8411648e-05
5,073 | Faerie: Efficient Filtering Algorithms for Approximate Dictionary-based Entity Extraction | 2011 | SIGMOD | 5.7177424e-05
5,291 | Fast Subtrajectory Similarity Search in Road Networks under Weighted Edit Distance Constraints | 2020 | VLDB | 5.5826473e-05
5,812 | Reference-Based Alignment in Large Sequence Databases | 2009 | VLDB | 5.3172025e-05
5,887 | Efficient Approximate Search on String Collections (Tutorial) | 2009 | VLDB | 5.2879769e-05
6,074 | Pigeonring: A Principle for Faster Thresholded Similarity Search | 2019 | VLDB | 5.2242306e-05
6,351 | SigMatch: Fast and Scalable Multi-Pattern Matching | 2010 | VLDB | 5.1005697e-05
6,726 | A Pivotal Prefix Based Filtering Algorithm for String Similarity Search | 2014 | SIGMOD | 4.9484027e-05
6,983 | A Generic Framework for Efficient and Effective Subsequence Retrieval | 2012 | VLDB | 4.8732757e-05
7,109 | Efficient Similarity Join and Search on Multi-Attribute Data | 2015 | SIGMOD | 4.8292998e-05
7,708 | Efficient Top-k Algorithms for Approximate Substring Matching | 2013 | SIGMOD | 4.6721808e-05
9,439 | On-the-Fly Token Similarity Joins in Relational Databases | 2014 | SIGMOD | 4.3423824e-05
9,832 | Balance-Aware Distributed String Similarity-Based Query Processing System | 2019 | VLDB | 4.2751057e-05
9,932 | Local Filtering: Improving the Performance of Approximate Queries on String Collections | 2015 | SIGMOD | 4.2500258e-05
9,933 | Efficient and Effective KNN Sequence Search with Approximate n-grams | 2014 | VLDB | 4.2500258e-05
10,216 | The Case For Language Model Approximated LIKE Predicate | 2026 | SIGMOD | 4.1945683e-05
11,724 | ZigZag: Supporting Similarity Queries on Vector Space Models | 2018 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 8 of 8 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers