Cost-Based Variable-Length-Gram Selection for String Collections to Support Approximate Queries Efficiently
Summary: Cost-based selection of variable-length grams for approximate string queries over VGRAM-style indexes. A dynamic programming algorithm computes a tighter lower bound on the number of grams that two similar strings must share, and a gram dictionary is computed automatically for a given query workload, relating the choice of grams to index structure and query performance. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Xiaochun Yang
2. Bin Wang
3. Chen Li
Incoming Citations (Sorted by Pagerank)
Showing 17 of 17 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 10 of 10 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972 |
| 155 | Robust and Efficient Fuzzy Match for Online Data Cleaning | 2003 | SIGMOD | 0.00040637896 |
| 250 | Efficient Set Joins on Similarity Predicates | 2004 | SIGMOD | 0.00030661988 |
| 322 | Record Linkage: Similarity Measures and Algorithms | 2006 | SIGMOD | 0.00027518768 |
| 1,146 | Estimating Alphanumeric Selectivity in the Presence of Wildcards | 1996 | SIGMOD | 0.00013679782 |
| 1,202 | VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams | 2007 | VLDB | 0.00013326298 |
| 1,379 | Substring Selectivity Estimation | 1999 | PODS | 0.00012286879 |
| 2,213 | n-Gram/2L: A Space and Time Efficient Two-Level n-Gram Inverted Index Structure | 2005 | VLDB | 0.000092765152 |
| 3,226 | Extending Q-Grams to Estimate Selectivity of String Matching with Low Edit Distance | 2007 | VLDB | 0.000073433307 |
| 4,438 | Selectivity Estimation for Fuzzy String Predicates in Large Data Sets | 2005 | VLDB | 0.000061898903 |