Reference-Based Alignment in Large Sequence Databases
Summary: RBSA speeds up optimal subsequence retrieval under edit distance and Smith-Waterman through reference-based filtering, assuming the best match deviates only slightly from the query. Fixed-length query segments, precomputed reference scores, and alphabet collapsing enable strong pruning; both exact and approximate RBSA outperform q-gram, BLAST, and BWT-based methods. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,944 | WHAM: A High-throughput Sequence Alignment Method | 2011 | SIGMOD | 0.00010004608 |
| 6,074 | Pigeonring: A Principle for Faster Thresholded Similarity Search | 2019 | VLDB | 5.2242306e-05 |
| 6,983 | A Generic Framework for Efficient and Effective Subsequence Retrieval | 2012 | VLDB | 4.8732757e-05 |
| 12,086 | RCSI: Scalable similarity search in thousand(s) of genomes | 2013 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,202 | VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams | 2007 | VLDB | 0.00013326298 |
| 2,193 | Cost-Based Variable-Length-Gram Selection for String Collections to Support Approximate Queries Efficiently | 2008 | SIGMOD | 9.3178557e-05 |
| 2,213 | n-Gram/2L: A Space and Time Efficient Two-Level n-Gram Inverted Index Structure | 2005 | VLDB | 9.2765152e-05 |
| 2,497 | OASIS: An Online and Accurate Technique for Local-alignment Searches on Biological Sequences | 2003 | VLDB | 8.6472036e-05 |
| 3,294 | Approximate Embedding-Based Subsequence Matching of Time Series | 2008 | SIGMOD | 7.2619257e-05 |
| 5,397 | Fast nGram-Based String Search Over Data Encoded Using Algebraic Signatures | 2007 | VLDB | 5.5299002e-05 |
| 6,464 | Reference-Based Indexing of Sequence Databases | 2006 | VLDB | 5.0532607e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,878 | Ranked Subsequence Matching in Time-Series Databases | 2007 | VLDB | 5.2916009e-05 |
| 2,497 | OASIS: An Online and Accurate Technique for Local-alignment Searches on Biological Sequences | 2003 | VLDB | 8.6472036e-05 |
| 9,933 | Efficient and Effective KNN Sequence Search with Approximate n-grams | 2014 | VLDB | 4.2500258e-05 |
| 13,272 | On the String Matching with k Differences in DNA Databases | 2021 | VLDB | - |
| 12,086 | RCSI: Scalable similarity search in thousand(s) of genomes | 2013 | VLDB | 4.1945683e-05 |
| 3,294 | Approximate Embedding-Based Subsequence Matching of Time Series | 2008 | SIGMOD | 7.2619257e-05 |
| 6,983 | A Generic Framework for Efficient and Effective Subsequence Retrieval | 2012 | VLDB | 4.8732757e-05 |
| 8,706 | ALAE: Accelerating Local Alignment with Affine Gap Exactly in Biosequence Databases | 2012 | VLDB | 4.4642586e-05 |
| 7,708 | Efficient Top-k Algorithms for Approximate Substring Matching | 2013 | SIGMOD | 4.6721808e-05 |
| 6,464 | Reference-Based Indexing of Sequence Databases | 2006 | VLDB | 5.0532607e-05 |