Shark: SQL and Rich Analytics at Scale
Summary: Shark combines SQL querying and rich analytics such as machine learning in a single scalable cluster engine built on a distributed memory abstraction. In-memory columnar storage, mid-query replanning, and fault tolerance let it run SQL and ML workloads up to 100x faster than Hive/Hadoop, with performance competitive with MPP databases. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
1. Reynold S. Xin
2. Josh Rosen
3. Matei Zaharia
4. Michael J. Franklin
5. Scott Shenker
6. Ion Stoica
Incoming Citations (Sorted by Pagerank)
Showing 4 of 54 citing papers.
| Overall Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,948 | Tutorial: SQL-on-Hadoop Systems | 2015 | VLDB | 4.1945683e-05 |
| 11,974 | DoomDB - Kill the Query | 2014 | SIGMOD | 4.1945683e-05 |
| 11,993 | A Partitioning Framework for Aggressive Data Skipping | 2014 | VLDB | 4.1945683e-05 |
| 11,999 | Getting Your Big Data Priorities Straight: A Demonstration of Priority-based QoS using Social-network-driven Stock Recommendation | 2014 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 18 of 18 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,059 | Adaptive and Robust Query Execution for Lakehouses at Scale | 2024 | VLDB | 4.8477825e-05 |
| 2,127 | SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures | 2014 | VLDB | 9.4863172e-05 |
| 7,599 | Quill: Efficient, Transferable, and Rich Analytics at Scale | 2016 | VLDB | 4.7003593e-05 |
| 66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801 |
| 3,066 | HAWQ: A Massively Parallel Processing SQL Engine in Hadoop | 2014 | SIGMOD | 7.6221974e-05 |
| 3,973 | Apache Hive: From MapReduce to Enterprise-grade Big Data Warehousing | 2019 | SIGMOD | 6.5758017e-05 |
| 1,152 | Blink and It's Done: Interactive Queries on Very Large Data | 2012 | VLDB | 0.00013645792 |
| 4,713 | SharkDB: An In-Memory Storage System for Massive Trajectory Data | 2015 | SIGMOD | 5.9786915e-05 |
| 1,071 | Starfish: A Self-tuning System for Big Data Analytics | 2011 | CIDR | 0.00014312777 |
| 2,488 | Shark: Fast Data Analysis Using Coarse-grained Distributed Memory | 2012 | SIGMOD | 8.6683713e-05 |