Database Paper Browser

Efficient Approximate Search on String Collections (Tutorial)

Summary: A tutorial surveying efficient approximate search on string collections. It covers indexes, search algorithms, filtering, selectivity estimation, and related work, analyzes the merits and limits of these techniques, and synthesizes them into guidance for scalable, practical design. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 9983
Venue: VLDB
Year: 2009
Pagerank: 5.2879769e-05
Overall Rank: 5,887 | 59.05%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 4 of 4 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 19 of 19 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972
250 | Efficient set joins on similarity predicates | 2004 | SIGMOD | 0.00030661988
266 | Efficient Exact Set-Similarity Joins | 2006 | VLDB | 0.00029718727
322 | Record Linkage: Similarity Measures and Algorithms | 2006 | SIGMOD | 0.00027518768
1,202 | VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams | 2007 | VLDB | 0.00013326298
1,234 | Ed-Join: An Efficient Algorithm for Similarity Joins With Edit Distance Constraints | 2008 | VLDB | 0.00013122499
1,533 | Example-driven Design of Efficient Record Matching Queries | 2007 | VLDB | 0.00011471971
1,830 | Relaxing Join and Selection Queries | 2006 | VLDB | 0.000103862
2,073 | Extending Autocompletion To Tolerate Errors | 2009 | SIGMOD | 9.6142791e-05
2,193 | Cost-Based Variable-Length-Gram Selection for String Collections to Support Approximate Queries Efficiently | 2008 | SIGMOD | 9.3178557e-05
2,386 | Leveraging Aggregate Constraints For Deduplication | 2007 | SIGMOD | 8.9231895e-05
2,779 | Hashed Samples: Selectivity Estimators For Set Similarity Selection Queries | 2008 | VLDB | 8.1320575e-05
3,226 | Extending Q-Grams to Estimate Selectivity of String Matching with Low Edit Distance | 2007 | VLDB | 7.3433307e-05
3,868 | An Efficient Filter for Approximate Membership Checking | 2008 | SIGMOD | 6.6822543e-05
4,414 | Efficient Type-Ahead Search on Relational Data: a TASTIER Approach | 2009 | SIGMOD | 6.2056993e-05
4,438 | Selectivity Estimation for Fuzzy String Predicates in Large Data Sets | 2005 | VLDB | 6.1898903e-05
4,988 | Incremental Maintenance of Length Normalized Indexes for Approximate String Matching | 2009 | SIGMOD | 5.783959e-05
7,669 | Incorporating String Transformations in Record Matching | 2008 | SIGMOD | 4.6833751e-05
7,777 | Indexing Mixed Types for Approximate Retrieval | 2005 | VLDB | 4.653704e-05
Semantically Similar Papers