Extending Q-Grams to Estimate Selectivity of String Matching with Low Edit Distance
Summary: Extends q-grams with wildcards to estimate the selectivity of approximate string matching under low edit distance. Introduces the replacement semi-lattice and the string hierarchy, presents the BasicEQ algorithm, and derives OptEQ from it through two improvements. Experiments on three benchmarks compare OptEQ with SEPIA and show that OptEQ outperforms it. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Hongrae Lee
2. Raymond T. Ng
3. Kyuseok Shim
Incoming Citations (Sorted by Pagerank)
Showing 15 of 15 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 36 | Fast Algorithms for Mining Association Rules | 1994 | VLDB | 0.00076161096 |
| 125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972 |
| 155 | Robust and Efficient Fuzzy Match for Online Data Cleaning | 2003 | SIGMOD | 0.00040637896 |
| 1,046 | Estimating the Selectivity of XML Path Expressions for Internet Scale Applications | 2001 | VLDB | 0.00014462307 |
| 1,146 | Estimating Alphanumeric Selectivity in the Presence of Wildcards | 1996 | SIGMOD | 0.00013679782 |
| 1,379 | Substring Selectivity Estimation | 1999 | PODS | 0.00012286879 |
| 4,438 | Selectivity Estimation for Fuzzy String Predicates in Large Data Sets | 2005 | VLDB | 0.000061898903 |
| 4,660 | XPathLearner: An On-Line Self-Tuning Markov Histogram for XML Path Selectivity Estimation | 2002 | VLDB | 0.00006014625 |
| 7,777 | Indexing Mixed Types for Approximate Retrieval | 2005 | VLDB | 0.00004653704 |