Comparative Evaluation of Big-Data Systems on Scientific Image Analytics Workloads
Summary: The first cross-system, large-scale evaluation of image analytics on real scientific workloads across SciDB, Myria, Spark, Dask, and TensorFlow. It reveals shortcomings that affect implementation and performance, and outlines directions for improving efficiency and usability. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Parmita Mehta
2. Sven Dorkenwald
3. Dongfang Zhao
4. Tomer Kaftan
5. Alvin Cheung
6. Magdalena Balazinska
7. Ariel Rokem
8. Andrew Connolly
9. Jacob Vanderplas
10. Yusra AlSayyad
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,934 | AIDA - Abstraction for Advanced In-Database Analytics | 2018 | VLDB | 7.8595778e-05 |
| 3,948 | A Comparative Evaluation of Systems for Scalable Linear Algebra-based Analytics | 2018 | VLDB | 6.5959084e-05 |
| 3,982 | The Myria Big Data Management and Analytics System and Cloud Service | 2017 | CIDR | 6.5651188e-05 |
| 7,917 | Array DBMS: Past, Present, and (Near) Future | 2021 | VLDB | 4.6173899e-05 |
| 11,184 | Toward Efficient Homomorphic Encryption for Outsourced Databases through Parallel Caching | 2023 | SIGMOD | 4.1945683e-05 |
| 11,408 | SimDB in Action: Road Traffic Simulations Completely Inside Array DBMS | 2022 | VLDB | 4.1945683e-05 |
| 11,489 | Convergence of Array DBMS and Cellular Automata: A Road Traffic Simulation Case | 2021 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 42 | A Comparison of Approaches to Large-Scale Data Analysis | 2009 | SIGMOD | 0.00073498298 |
| 318 | Overview of SciDB: Large Scale Array Storage, Processing and Analysis | 2010 | SIGMOD | 0.00027795661 |
| 1,071 | Starfish: A Self-tuning System for Big Data Analytics | 2011 | CIDR | 0.00014312777 |
| 1,343 | NoDB: Efficient Query Execution on Raw Data Files | 2012 | SIGMOD | 0.00012482538 |
| 2,623 | GenBase: A Complex Analytics Genomics Benchmark | 2014 | SIGMOD | 8.4374366e-05 |
| 2,757 | Parallel Data Analysis Directly on Scientific File Formats | 2014 | SIGMOD | 8.1679384e-05 |
| 3,377 | Demonstration of the Myria Big Data Management Service | 2014 | SIGMOD | 7.1624478e-05 |
| 3,982 | The Myria Big Data Management and Analytics System and Cloud Service | 2017 | CIDR | 6.5651188e-05 |
| 4,437 | Clash of the Titans: MapReduce vs. Spark for Large Scale Data Analytics | 2015 | VLDB | 6.1907793e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,557 | Distributed Deep Learning on Data Systems: A Comparative Analysis of Approaches | 2021 | VLDB | 6.087611e-05 |
| 3,200 | Big Data Analytics with Datalog Queries on Spark | 2016 | SIGMOD | 7.3912411e-05 |
| 318 | Overview of SciDB: Large Scale Array Storage, Processing and Analysis | 2010 | SIGMOD | 0.00027795661 |
| 11,949 | Big Data Research: Will Industry Solve all the Problems? | 2015 | VLDB | 4.1945683e-05 |
| 2,757 | Parallel Data Analysis Directly on Scientific File Formats | 2014 | SIGMOD | 8.1679384e-05 |
| 3,058 | Rethinking Data-Intensive Science Using Scalable Analytics Systems | 2015 | SIGMOD | 7.6410159e-05 |
| 42 | A Comparison of Approaches to Large-Scale Data Analysis | 2009 | SIGMOD | 0.00073498298 |
| 13,356 | Big Data Science Needs Big Data Middleware | 2015 | CIDR | - |
| 3,982 | The Myria Big Data Management and Analytics System and Cloud Service | 2017 | CIDR | 6.5651188e-05 |
| 3,948 | A Comparative Evaluation of Systems for Scalable Linear Algebra-based Analytics | 2018 | VLDB | 6.5959084e-05 |