Building Highly-Optimized, Low-Latency Pipelines for Genomic Data Analysis

Summary: Presents an end-to-end system that combines a principled optimization framework with a parallel execution platform to turn long-running, deep genomic analysis pipelines into highly optimized, low-latency workflows. The system is validated on NY Genome Center workloads, with the aim of cutting multi-day runtimes for clinical-scale sequencing. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 255
Venue: CIDR
Year: 2015
Pagerank: 4.6176856e-05
Overall Rank: 7,899 | 45.11%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 2 of 2 citing papers.

Rank	Citing Paper	Year	Venue	Pagerank
3,060	Rethinking Data-Intensive Science Using Scalable Analytics Systems	2015	SIGMOD	7.639315e-05
11,798	Massively Parallel Processing of Whole Genome Sequence Data: An In-Depth Performance Study	2017	SIGMOD	4.1905499e-05

Outgoing Citations (Sorted by Pagerank)

Showing 8 of 8 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank	Cited Paper	Year	Venue	Pagerank
3	Pig Latin: A Not-So-Foreign Language for Data Processing	2008	SIGMOD	0.0024217964
42	A Comparison of Approaches to Large-Scale Data Analysis	2009	SIGMOD	0.00073570328
602	Mining Quantitative Association Rules in Large Relational Tables	1996	SIGMOD	0.00019350521
909	Tenzing: A SQL Implementation On The MapReduce Framework	2011	VLDB	0.0001540191
1,947	WHAM: A High-throughput Sequence Alignment Method	2011	SIGMOD	9.9963826e-05
2,441	CoHadoop: Flexible Data Placement and Its Exploitation in Hadoop	2011	VLDB	8.8106295e-05
2,476	A Platform for Scalable One-Pass Analytics using MapReduce	2011	SIGMOD	8.6907971e-05
12,147	Massive Genomic Data Processing and Deep Analysis	2012	VLDB	4.1905499e-05

Semantically Similar Papers

Rank	Similar Paper	Year	Venue	Pagerank
11,927	ShareInsights - An Unified Approach to Full-stack Data Processing	2015	SIGMOD	4.1905499e-05
4,375	Fast Processing and Querying of 170TB of Genomics Data via a Repeated And Merged BloOm Filter (RAMBO)	2021	SIGMOD	6.2345844e-05
2,627	GenBase: A Complex Analytics Genomics Benchmark	2014	SIGMOD	8.4292895e-05
9,505	Supporting Scalable Analytics with Latency Constraints	2015	VLDB	4.3300131e-05
12,297	Data Management for High-Throughput Genomics	2009	CIDR	4.1905499e-05
6,408	Managing Data from High-Throughput Genomic Processing: A Case Study	2004	VLDB	5.0686661e-05
3,060	Rethinking Data-Intensive Science Using Scalable Analytics Systems	2015	SIGMOD	7.639315e-05
11,798	Massively Parallel Processing of Whole Genome Sequence Data: An In-Depth Performance Study	2017	SIGMOD	4.1905499e-05
12,147	Massive Genomic Data Processing and Deep Analysis	2012	VLDB	4.1905499e-05
11,902	Building Highly-Optimized, Low-Latency Pipelines for Genomic Data Analysis	2015	CIDR	4.1905499e-05