An Efficient Index Structure for String Databases
Summary: Substrings are mapped to an integer space using wavelet coefficients, enabling scalable, on-disk substring search. Indexing with minimum bounding rectangles (MBRs) and a lower-bound proxy for edit distance supports pruning in nearest-neighbor (NN) and range queries, with reported pruning rates of 50–95% and substantial I/O and CPU savings. (summarized by gpt-5-nano on Feb 09 2026)
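To make the idea of a lower-bound proxy for edit distance concrete, here is a minimal Python sketch of filter-then-verify range search. It assumes a simple character-frequency bound as the proxy. The summary does not spell out the exact bound or the wavelet and MBR machinery, so the function names (`frequency_distance`, `edit_distance`, `range_search`) and the choice of bound are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def frequency_distance(u: str, v: str) -> int:
    """Cheap lower bound on edit distance (assumed proxy, for illustration).

    Each insert, delete, or substitution changes at most one character
    count upward and at most one downward, so at least
    max(surplus, deficit) edits are needed to turn u into v.
    """
    fu, fv = Counter(u), Counter(v)
    alphabet = fu.keys() | fv.keys()
    surplus = sum(max(fu[c] - fv[c], 0) for c in alphabet)
    deficit = sum(max(fv[c] - fu[c], 0) for c in alphabet)
    return max(surplus, deficit)


def edit_distance(u: str, v: str) -> int:
    """Standard dynamic-programming edit distance with unit costs."""
    prev = list(range(len(v) + 1))
    for i, cu in enumerate(u, 1):
        cur = [i]
        for j, cv in enumerate(v, 1):
            cur.append(min(prev[j] + 1,                # delete cu
                           cur[j - 1] + 1,             # insert cv
                           prev[j - 1] + (cu != cv)))  # substitute or match
        prev = cur
    return prev[-1]


def range_search(query: str, candidates: list[str], radius: int) -> list[str]:
    """Return candidates within `radius` edits of `query`.

    The cheap bound is checked first; the quadratic edit distance is
    computed only for candidates the bound cannot rule out.
    """
    hits = []
    for s in candidates:
        if frequency_distance(query, s) > radius:
            continue  # pruned: true distance must also exceed radius
        if edit_distance(query, s) <= radius:
            hits.append(s)
    return hits
```

Because the proxy never exceeds the true edit distance, pruning on it cannot discard a correct answer; it only avoids the more expensive exact computation. An index would apply the same test to bounding regions of many substrings at once instead of to individual strings.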
Authors
1. Tamer Kahveci
2. Ambuj K. Singh
Incoming Citations (Sorted by Pagerank)
Showing 5 of 5 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,497 | OASIS: An Online and Accurate Technique for Local-alignment Searches on Biological Sequences | 2003 | VLDB | 8.6472036e-05 |
| 3,774 | Efficient Exact Edit Similarity Query Processing with the Asymmetric Signature Scheme | 2011 | SIGMOD | 6.7757301e-05 |
| 6,464 | Reference-Based Indexing of Sequence Databases | 2006 | VLDB | 5.0532607e-05 |
| 11,979 | Similarity Joins for Uncertain Strings | 2014 | SIGMOD | 4.1945683e-05 |
| 12,648 | Searching on the Secondary Structure of Protein Sequences | 2002 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 2 of 2 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 665 | Fast Nearest Neighbor Search in Medical Image Databases | 1996 | VLDB | 0.00018451109 |
| 802 | Optimal Multi-Step k-Nearest Neighbor Search | 1998 | SIGMOD | 0.00016502317 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,932 | Local Filtering: Improving the Performance of Approximate Queries on String Collections | 2015 | SIGMOD | 4.2500258e-05 |
| 2,583 | Practical Suffix Tree Construction | 2004 | VLDB | 8.497732e-05 |
| 13,272 | On the String Matching with k Differences in DNA Databases | 2021 | VLDB | - |
| 6,464 | Reference-Based Indexing of Sequence Databases | 2006 | VLDB | 5.0532607e-05 |
| 4,897 | The Wavelet Trie: Maintaining an Indexed Sequence of Strings in Compressed Space | 2012 | PODS | 5.8469152e-05 |
| 2,376 | Bed-Tree: An All-Purpose Index Structure for String Similarity Search Based on Edit Distance | 2010 | SIGMOD | 8.9424361e-05 |
| 5,813 | Space-efficient Substring Occurrence Estimation | 2011 | PODS | 5.3170565e-05 |
| 6,097 | Two-dimensional Substring Indexing | 2001 | PODS | 5.2119402e-05 |
| 7,708 | Efficient Top-k Algorithms for Approximate Substring Matching | 2013 | SIGMOD | 4.6721808e-05 |
| 1,184 | On Effective Multi-Dimensional Indexing for Strings | 2000 | SIGMOD | 0.00013455208 |