Substring Selectivity Estimation
Summary: Introduces MO (Maximal Overlap), a substring selectivity estimator that works over pruned count-suffix trees and uses all maximal substrings of the query to produce better estimates. Proves that MO dominates KVI when strings exhibit the short-memory property, and presents the MOC and MOLC algorithms, which trade accuracy against cost. (summarized by gpt-5-mini on Feb 09 2026)
Authors
1. H. V. Jagadish
2. Raymond T. Ng
3. Divesh Srivastava
Incoming Citations (Sorted by Pagerank)
Showing 17 of 17 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Overall Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1 | Access Path Selection in a Relational Database Management System | 1979 | SIGMOD | 0.0040449103 |
| 64 | Improved Histograms for Selectivity Estimation of Range Predicates | 1996 | SIGMOD | 0.00063612837 |
| 67 | The Merge/Purge Problem for Large Databases | 1995 | SIGMOD | 0.00061348205 |
| 116 | Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries | 1988 | SIGMOD | 0.00046148737 |
| 326 | Optimal Histograms with Quality Guarantees | 1998 | VLDB | 0.00027358981 |
| 327 | Balancing Histogram Optimality and Practicality for Query Result Size Estimation | 1995 | SIGMOD | 0.00027308479 |
| 762 | Query Size Estimation by Adaptive Sampling (Extended Abstract) | 1990 | PODS | 0.00017036868 |
| 808 | Universality of Serial Histograms | 1993 | VLDB | 0.00016432772 |
| 1,146 | Estimating Alphanumeric Selectivity in the Presence of Wildcards | 1996 | SIGMOD | 0.00013679782 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,143 | Approximate Substring Matching over Uncertain Strings | 2011 | VLDB | 4.5768015e-05 |
| 4,438 | Selectivity Estimation for Fuzzy String Predicates in Large Data Sets | 2005 | VLDB | 6.1898903e-05 |
| 7,708 | Efficient Top-k Algorithms for Approximate Substring Matching | 2013 | SIGMOD | 4.6721808e-05 |
| 9,945 | SSCard: Substring Cardinality Estimation using Suffix Tree-Guided Learned FM-Index | 2026 | SIGMOD | 4.2432653e-05 |
| 1,046 | Estimating the Selectivity of XML Path Expressions for Internet Scale Applications | 2001 | VLDB | 0.00014462307 |
| 5,813 | Space-efficient Substring Occurrence Estimation | 2011 | PODS | 5.3170565e-05 |
| 1,146 | Estimating Alphanumeric Selectivity in the Presence of Wildcards | 1996 | SIGMOD | 0.00013679782 |
| 3,226 | Extending Q-Grams to Estimate Selectivity of String Matching with Low Edit Distance | 2007 | VLDB | 7.3433307e-05 |
| 2,171 | Selectivity Estimation For Boolean Queries | 2000 | PODS | 9.3807165e-05 |
| 3,035 | Multi-Dimensional Substring Selectivity Estimation | 1999 | VLDB | 7.6748073e-05 |