Database Paper Browser

Extending Q-Grams to Estimate Selectivity of String Matching with Low Edit Distance

Summary: Extends q-grams with wildcards to estimate the selectivity of approximate string matching with low edit distance. Builds on a replacement semi-lattice and a string hierarchy; presents the BasicEQ algorithm and OptEQ, which enhances BasicEQ with two improvements. Experiments on three benchmarks show that OptEQ outperforms SEPIA. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 9575
Venue: VLDB
Year: 2007
Pagerank: 7.3433307e-05
Overall Rank: 3,226 | 77.56%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 15 of 15 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
1,234 | Ed-Join: An Efficient Algorithm for Similarity Joins With Edit Distance Constraints | 2008 | VLDB | 0.00013122499
2,193 | Cost-Based Variable-Length-Gram Selection for String Collections to Support Approximate Queries Efficiently | 2008 | SIGMOD | 9.3178557e-05
2,592 | Pass-Join: A Partition-based Method for Similarity Joins | 2012 | VLDB | 8.4795761e-05
2,779 | Hashed Samples: Selectivity Estimators For Set Similarity Selection Queries | 2008 | VLDB | 8.1320575e-05
3,578 | Efficient Approximate Entity Extraction with Edit Distance Constraints | 2009 | SIGMOD | 6.9503858e-05
4,359 | Astrid: Accurate Selectivity Estimation for String Predicates using Deep Learning | 2021 | VLDB | 6.2569955e-05
4,873 | Power-Law Based Estimation of Set Similarity Join Size | 2009 | VLDB | 5.8602304e-05
4,901 | Probabilistic String Similarity Joins | 2010 | SIGMOD | 5.8411648e-05
5,073 | Faerie: Efficient Filtering Algorithms for Approximate Dictionary-based Entity Extraction | 2011 | SIGMOD | 5.7177424e-05
5,887 | Efficient Approximate Search on String Collections (Tutorial) | 2009 | VLDB | 5.2879769e-05
7,186 | LPLM: A Neural Language Model for Cardinality Estimation of LIKE-Queries | 2024 | SIGMOD | 4.8063731e-05
7,474 | Cardinality Estimation of Approximate Substring Queries using Deep Learning | 2022 | VLDB | 4.7194345e-05
7,708 | Efficient Top-k Algorithms for Approximate Substring Matching | 2013 | SIGMOD | 4.6721808e-05
9,726 | Cardinality Estimation of LIKE Predicate Queries using Deep Learning | 2025 | SIGMOD | 4.2943379e-05
9,945 | SSCard: Substring Cardinality Estimation using Suffix Tree-Guided Learned FM-Index | 2026 | SIGMOD | 4.2432653e-05

Outgoing Citations (Sorted by Pagerank)

Showing 9 of 9 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
