Fast Processing and Querying of 170TB of Genomics Data via a Repeated And Merged BloOm Filter (RAMBO)

Summary: RAMBO (Repeated And Merged Bloom Filter) turns genome search into set-membership tests arranged in the style of a count-min sketch, giving zero false negatives and a small index. It supports streaming updates and indexes 170 TB in 9 hours, outperforming COBS, HowDeSBT, and SSBT. (summarized by gpt-5-nano on Feb 09 2026)
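
The idea in the summary can be illustrated with a minimal sketch. It assumes the usual reading of "repeated and merged": each dataset is hashed to one bucket in each of R independent repetitions, every bucket holds a single Bloom filter over the merged k-mers of its datasets, and a query intersects the candidate datasets found in each repetition. The names `BloomFilter`, `RamboSketch`, `_hash`, and all parameter values are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import hashlib


def _hash(value: str, seed: int, modulus: int) -> int:
    """Deterministic seeded hash of a string into range(modulus)."""
    digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class BloomFilter:
    """Toy Bloom filter (one byte per bit, for readability)."""

    def __init__(self, num_bits: int = 1 << 12, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def add(self, item: str) -> None:
        for seed in range(self.num_hashes):
            self.bits[_hash(item, seed, self.num_bits)] = 1

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[_hash(item, seed, self.num_bits)]
            for seed in range(self.num_hashes)
        )


class RamboSketch:
    """R repetitions, each with B buckets; one merged Bloom filter per bucket."""

    def __init__(self, num_buckets: int = 4, num_repetitions: int = 3):
        self.num_buckets = num_buckets
        self.num_repetitions = num_repetitions
        self.filters = [
            [BloomFilter() for _ in range(num_buckets)]
            for _ in range(num_repetitions)
        ]
        # Which datasets were merged into each bucket.
        self.members = [
            [set() for _ in range(num_buckets)]
            for _ in range(num_repetitions)
        ]

    def insert(self, dataset: str, kmers) -> None:
        """Streaming insert: merge the dataset's k-mers into one bucket per repetition."""
        kmers = list(kmers)
        for r in range(self.num_repetitions):
            b = _hash(dataset, 1000 + r, self.num_buckets)
            self.members[r][b].add(dataset)
            for kmer in kmers:
                self.filters[r][b].add(kmer)

    def query(self, kmer: str) -> set:
        """Return candidate datasets: union within a repetition, intersection across."""
        candidates = None
        for r in range(self.num_repetitions):
            hits = set()
            for b in range(self.num_buckets):
                if kmer in self.filters[r][b]:
                    hits |= self.members[r][b]
            candidates = hits if candidates is None else candidates & hits
        return candidates or set()


index = RamboSketch(num_buckets=4, num_repetitions=3)
index.insert("genome_A", ["ACGT", "CGTA"])
index.insert("genome_B", ["CGTA", "GTAC"])
index.insert("genome_C", ["TTTT"])
result = index.query("CGTA")  # always contains genome_A and genome_B
```

Because a Bloom filter never reports a stored item as absent, every dataset that truly contains the k-mer survives the intersection, which is where the zero-false-negative property comes from. False positives remain possible, and in this sketch they shrink as the number of repetitions grows.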

Paper ID: 6224
Venue: SIGMOD
Year: 2021
Pagerank: 6.2345844e-05
Overall Rank: 4,375 | 69.60%
DOI: 10.1145/3448016.3457333

Incoming Citations (Sorted by Pagerank)

Showing 4 of 4 citing papers.

Rank	Citing Paper	Year	Venue	Pagerank
7,731	Double-Anonymous Sketch: Achieving Top-K-fairness for Finding Global Top-K Frequent Items	2023	SIGMOD	4.6612382e-05
8,167	GTS: GPU-based Tree Index for Fast Similarity Search	2024	SIGMOD	4.567569e-05
11,291	ChainDash: An Ad-Hoc Blockchain Data Analytics System	2023	VLDB	4.1905499e-05
11,376	New Wine in an Old Bottle: Data-Aware Hash Functions for Bloom Filters	2022	VLDB	4.1905499e-05

Outgoing Citations (Sorted by Pagerank)

Showing 1 of 1 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank	Cited Paper	Year	Venue	Pagerank
778	Spectral Bloom Filters	2003	SIGMOD	0.00016729191

Semantically Similar Papers

Overall Rank	Paper	Year	Venue	Pagerank
11,902	Building Highly-Optimized, Low-Latency Pipelines for Genomic Data Analysis	2015	CIDR	4.1905499e-05
7,899	Building Highly-Optimized, Low-Latency Pipelines for Genomic Data Analysis	2015	CIDR	4.6176856e-05
8,634	Building Fast and Compact Sketches for Approximately Multi-Set Multi-Membership Querying	2021	SIGMOD	4.4758632e-05
9,934	Efficient and Effective KNN Sequence Search with Approximate n-grams	2014	VLDB	4.245954e-05
1,115	A Database Index to Large Biological Sequences	2001	VLDB	0.00013865352
13,468	Memory Efficient Minimum Substring Partitioning	2013	VLDB	-
6,458	Reference-Based Indexing of Sequence Databases	2006	VLDB	5.0484248e-05
4,550	Serial and Parallel Methods for I/O Efficient Suffix Tree Construction	2009	SIGMOD	6.0866239e-05
12,094	RCSI: Scalable similarity search in thousand(s) of genomes	2013	VLDB	4.1905499e-05
2,251	Genome-scale Disk-based Suffix Tree Indexing	2007	SIGMOD	9.1921005e-05