Reference-Based Indexing of Sequence Databases
Summary: Proposes a reference-based index for searching large sequence databases under edit distance, reducing the number of expensive distance computations while using limited memory. Two novel reference-selection strategies and a new reference-assignment method prune 20x to 30x more candidates than Omni and frequency vectors, and the approach scales to long sequences. (summarized by gpt-5-nano on Feb 09 2026)
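As a rough illustration of the general idea behind reference-based filtering, the sketch below precomputes edit distances from each database sequence to a few reference sequences, then uses the triangle inequality to discard or accept candidates without computing their distance to the query. It is a minimal sketch under stated assumptions, not the paper's method: the references are picked by hand, every sequence is compared against every reference, and the names `edit_distance`, `build_index`, and `range_query` are hypothetical. The paper's reference-selection and assignment strategies are not reproduced here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                 # delete ca
                curr[j - 1] + 1,             # insert cb
                prev[j - 1] + (ca != cb),    # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def build_index(database, references):
    """Precompute the distance from every database sequence to every reference."""
    return [[edit_distance(s, r) for r in references] for s in database]


def range_query(query, radius, database, references, index):
    """Return (matches, count of query-to-database distance computations)."""
    q_to_ref = [edit_distance(query, r) for r in references]
    matches, computed = [], 0
    for s, s_to_ref in zip(database, index):
        # Triangle inequality: |d(q,r) - d(s,r)| <= d(q,s) <= d(q,r) + d(s,r)
        lower = max(abs(qr - sr) for qr, sr in zip(q_to_ref, s_to_ref))
        upper = min(qr + sr for qr, sr in zip(q_to_ref, s_to_ref))
        if lower > radius:
            continue              # pruned: cannot be within the radius
        if upper <= radius:
            matches.append(s)     # accepted without computing d(q, s)
            continue
        computed += 1             # bounds are inconclusive, so compute exactly
        if edit_distance(query, s) <= radius:
            matches.append(s)
    return matches, computed


# Toy data; the references here are chosen by hand (an assumption).
database = ["ACGT", "ACGA", "TTTT", "GGGGGGGG"]
references = ["ACGT", "TTTT"]
index = build_index(database, references)
matches, computed = range_query("ACGG", 1, database, references, index)
```

On this toy input only one of the four database sequences needs an exact distance computation; the others are resolved from the precomputed reference distances alone. How many candidates are pruned in practice depends heavily on which references are chosen, which is the question the paper's selection strategies address.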
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,294 | Approximate Embedding-Based Subsequence Matching of Time Series | 2008 | SIGMOD | 7.2619257e-05 |
| 5,812 | Reference-Based Alignment in Large Sequence Databases | 2009 | VLDB | 5.3172025e-05 |
| 6,983 | A Generic Framework for Efficient and Effective Subsequence Retrieval | 2012 | VLDB | 4.8732757e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 91 | M-tree: An Efficient Access Method for Similarity Search in Metric Spaces | 1997 | VLDB | 0.0005181666 |
| 125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972 |
| 575 | Distance-Based Indexing For High-Dimensional Metric Spaces | 1997 | SIGMOD | 0.00019882723 |
| 1,118 | A Database Index to Large Biological Sequences | 2001 | VLDB | 0.00013879121 |
| 2,497 | OASIS: An Online and Accurate Technique for Local-alignment Searches on Biological Sequences | 2003 | VLDB | 8.6472036e-05 |
| 4,333 | An Efficient Index Structure for String Databases | 2001 | VLDB | 6.2805237e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,376 | Bed-Tree: An All-Purpose Index Structure for String Similarity Search Based on Edit Distance | 2010 | SIGMOD | 8.9424361e-05 |
| 12,086 | RCSI: Scalable similarity search in thousand(s) of genomes | 2013 | VLDB | 4.1945683e-05 |
| 7,777 | Indexing Mixed Types for Approximate Retrieval | 2005 | VLDB | 4.653704e-05 |
| 7,708 | Efficient Top-k Algorithms for Approximate Substring Matching | 2013 | SIGMOD | 4.6721808e-05 |
| 6,671 | Discovering Longest-lasting Correlation in Sequence Databases | 2013 | VLDB | 4.9669225e-05 |
| 3,774 | Efficient Exact Edit Similarity Query Processing with the Asymmetric Signature Scheme | 2011 | SIGMOD | 6.7757301e-05 |
| 4,333 | An Efficient Index Structure for String Databases | 2001 | VLDB | 6.2805237e-05 |
| 6,983 | A Generic Framework for Efficient and Effective Subsequence Retrieval | 2012 | VLDB | 4.8732757e-05 |
| 9,933 | Efficient and Effective KNN Sequence Search with Approximate n-grams | 2014 | VLDB | 4.2500258e-05 |
| 5,812 | Reference-Based Alignment in Large Sequence Databases | 2009 | VLDB | 5.3172025e-05 |