Fast Processing and Querying of 170TB of Genomics Data via a Repeated And Merged BloOm Filter (RAMBO)
Summary: RAMBO uses a Repeated And Merged Bloom Filter to turn genome search into count-min-style set membership tests, delivering zero false negatives and a small index. It supports streaming updates and indexes 170 TB in 9 hours, beating COBS, HowDeSBT, and SSBT. (summarized by gpt-5-nano on Feb 09 2026)
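The idea in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each repetition hashes every dataset into one of a few buckets, that each bucket keeps one Bloom filter merging the k-mers of all datasets assigned to it, and that a query intersects the candidate datasets across repetitions. The names `BloomFilter`, `RamboSketch`, `_hash`, and all parameter values are hypothetical choices made for illustration.

```python
import hashlib


def _hash(value: str, seed: int, modulus: int) -> int:
    """Deterministic seeded hash of a string into [0, modulus) (illustrative helper)."""
    digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class BloomFilter:
    """Plain Bloom filter; one byte per bit position to keep the sketch readable."""

    def __init__(self, num_bits: int = 1 << 12, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def add(self, item: str) -> None:
        for seed in range(self.num_hashes):
            self.bits[_hash(item, seed, self.num_bits)] = 1

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[_hash(item, seed, self.num_bits)]
            for seed in range(self.num_hashes)
        )


class RamboSketch:
    """Assumed layout: num_repetitions tables, each with num_buckets merged filters."""

    def __init__(self, num_buckets: int = 4, num_repetitions: int = 3):
        self.num_buckets = num_buckets
        self.num_repetitions = num_repetitions
        self.filters = [
            [BloomFilter() for _ in range(num_buckets)]
            for _ in range(num_repetitions)
        ]
        # Which datasets were merged into each bucket of each repetition.
        self.members = [
            [set() for _ in range(num_buckets)] for _ in range(num_repetitions)
        ]

    def insert(self, dataset: str, kmers) -> None:
        """Streaming insert: a new dataset only touches one bucket per repetition."""
        for r in range(self.num_repetitions):
            b = _hash(dataset, 1000 + r, self.num_buckets)
            self.members[r][b].add(dataset)
            for kmer in kmers:
                self.filters[r][b].add(kmer)

    def query(self, kmer: str) -> set:
        """Return candidate datasets containing kmer (may include false positives)."""
        result = None
        for r in range(self.num_repetitions):
            candidates = set()
            for b in range(self.num_buckets):
                if kmer in self.filters[r][b]:
                    candidates |= self.members[r][b]
            result = candidates if result is None else result & candidates
        return result if result is not None else set()
```

Under these assumptions, a dataset that contains a k-mer always sets the relevant bits in its bucket in every repetition, so it survives the intersection; this is where the zero-false-negative property comes from. False positives remain possible, both from the Bloom filters and from datasets that share buckets in every repetition.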
Incoming Non-self Citations Over Time
Authors
1. Gaurav Gupta
2. Minghao Yan
3. Benjamin Coleman
4. Bryce Kille
5. R. A. Leo Elworth
6. Tharun Medini
7. Todd Treangen
8. Anshumali Shrivastava
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,732 | Double-Anonymous Sketch: Achieving Top-K-fairness for Finding Global Top-K Frequent Items | 2023 | SIGMOD | 4.6657123e-05 |
| 8,171 | GTS: GPU-based Tree Index for Fast Similarity Search | 2024 | SIGMOD | 4.5688498e-05 |
| 11,289 | ChainDash: An Ad-Hoc Blockchain Data Analytics System | 2023 | VLDB | 4.1945683e-05 |
| 11,374 | New Wine in an Old Bottle: Data-Aware Hash Functions for Bloom Filters | 2022 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 1 of 1 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 781 | Spectral Bloom Filters | 2003 | SIGMOD | 0.00016741046 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,894 | Building Highly-Optimized, Low-Latency Pipelines for Genomic Data Analysis | 2015 | CIDR | 4.1945683e-05 |
| 7,902 | Building Highly-Optimized, Low-Latency Pipelines for Genomic Data Analysis | 2015 | CIDR | 4.6215911e-05 |
| 8,634 | Building Fast and Compact Sketches for Approximately Multi-Set Multi-Membership Querying | 2021 | SIGMOD | 4.4801584e-05 |
| 9,933 | Efficient and Effective KNN Sequence Search with Approximate n-grams | 2014 | VLDB | 4.2500258e-05 |
| 1,118 | A Database Index to Large Biological Sequences | 2001 | VLDB | 0.00013879121 |
| 13,455 | Memory Efficient Minimum Substring Partitioning | 2013 | VLDB | - |
| 6,464 | Reference-Based Indexing of Sequence Databases | 2006 | VLDB | 5.0532607e-05 |
| 4,550 | Serial and Parallel Methods for I/O Efficient Suffix Tree Construction | 2009 | SIGMOD | 6.0924864e-05 |
| 12,086 | RCSI: Scalable similarity search in thousand(s) of genomes | 2013 | VLDB | 4.1945683e-05 |
| 2,250 | Genome-scale Disk-based Suffix Tree Indexing | 2007 | SIGMOD | 9.2009942e-05 |