SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures
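Summary: Compares Hive (on MapReduce and on Tez) with Impala in a shared-nothing SQL-on-Hadoop setting, using the ORC and Parquet file formats, the TPC-H and TPC-DS workloads, and I/O micro-benchmarks. Impala is faster throughout: by 3.3–4.4x over Hive on MapReduce and 2.1–2.8x over Hive on Tez for TPC-H, and by 8.2–10x over Hive on MapReduce and about 4x over Hive on Tez for TPC-DS. The paper also analyzes the causes of these gaps and each system's strengths and limitations. (summarized by gpt-5-nano on Feb 09 2026)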
Incoming Citations (Sorted by Pagerank)
Showing 14 of 14 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 21 | C-Store: A Column-oriented DBMS | 2005 | VLDB | 0.00086087497 |
| 109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983 |
| 157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359 |
| 542 | Shark: SQL and Rich Analytics at Scale | 2013 | SIGMOD | 0.00020595648 |
| 1,590 | Column-oriented Database Systems | 2009 | VLDB | 0.00011233838 |
| 1,977 | Split Query Processing in Polybase | 2013 | SIGMOD | 0.000098824589 |
| 3,066 | HAWQ: A Massively Parallel Processing SQL Engine in Hadoop | 2014 | SIGMOD | 0.000076221974 |
| 3,208 | Column-Oriented Storage Techniques for MapReduce | 2011 | VLDB | 0.000073781897 |
| 3,247 | Can the Elephants Handle the NoSQL Onslaught? | 2012 | VLDB | 0.000073260831 |