Rethinking Data-Intensive Science Using Scalable Analytics Systems
Summary: Maps scientific pipelines onto commodity big-data platforms (Spark/Parquet) for scalable data-intensive science. ADAM delivers a 28x genomics speedup and 63% cost savings; the astronomy workloads see 2.8–8.9x gains. Also presents techniques for efficient analyses on big-data platforms. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Frank Austin Nothaft
2. Matt Massie
3. Timothy Danford
4. Zhao Zhang
5. Uri Laserson
6. Carl Yeksigian
7. Jey Kottalam
8. Arun Ahuja
9. Jeff Hammerbacher
10. Michael Linderman
11. Michael J. Franklin
12. Anthony D. Joseph
13. David A. Patterson
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801 |
| 746 | Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores | 2020 | VLDB | 0.00017326979 |
| 2,972 | ForkBase: An Efficient Storage Engine for Blockchain and Forkable Applications | 2018 | VLDB | 0.0000779259 |
| 3,535 | Scaling Spark in the Real World: Performance and Usability | 2015 | VLDB | 0.000069992495 |
Outgoing Citations (Sorted by Pagerank)
Showing 10 of 10 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801 |
| 109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983 |
| 131 | Integrating Compression and Execution in Column-Oriented Database Systems | 2006 | SIGMOD | 0.0004370331 |
| 310 | The Vertica Analytic Database: C-Store 7 Years Later | 2012 | VLDB | 0.00028132402 |
| 318 | Overview of SciDB: Large Scale Array Storage, Processing and Analysis | 2010 | SIGMOD | 0.00027795661 |
| 476 | Impala: A Modern, Open-Source SQL Engine for Hadoop | 2015 | CIDR | 0.00022226941 |
| 1,261 | Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce | 2013 | VLDB | 0.00012989236 |
| 1,944 | WHAM: A High-throughput Sequence Alignment Method | 2011 | SIGMOD | 0.00010004608 |
| 7,902 | Building Highly-Optimized, Low-Latency Pipelines for Genomic Data Analysis | 2015 | CIDR | 0.000046215911 |
| 7,903 | A Demonstration of Iterative Parallel Array Processing in Support of Telescope Image Analysis | 2013 | VLDB | 0.000046215911 |