Impala: A Modern, Open-Source SQL Engine for Hadoop
Summary: Impala is an open-source MPP SQL engine for Hadoop that provides low-latency, high-concurrency execution for BI and read-mostly analytic queries, where batch frameworks (e.g., Hive) fall short. The paper presents its architecture and components, along with an empirical evaluation reporting better performance than other SQL-on-Hadoop systems. (summarized by gpt-5-mini on Feb 09 2026)
Authors
1. Marcel Kornacker
2. Alexander Behm
3. Victor Bittorf
4. Taras Bobrovytsky
5. Casey Ching
6. Alan Choi
7. Justin Erickson
8. Martin Grund
9. Daniel Hecht
10. Matthew Jacobs
11. Ishaan Joshi
12. Lenni Kuff
13. Dileep Kumar
14. Alex Leblang
15. Nong Li
16. Ippokratis Pandis
17. Henry Robinson
18. David Rorke
19. Silvius Rus
20. John Russell
21. Dimitris Tsirogiannis
22. Skye Wanderman-Milne
23. Michael Yoder
Incoming Citations (Sorted by Pagerank)
Showing 50 of 62 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Overall Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 80 | Weaving Relations for Cache Performance | 2001 | VLDB | 0.00055721729 |
| 109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983 |
| 113 | Encapsulation of Parallelism in the Volcano Query Processing System | 1990 | SIGMOD | 0.00046764513 |
| 241 | DB2 with BLU Acceleration: So Much More than Just a Column Store | 2013 | VLDB | 0.00031420034 |
| 305 | SIMD-Scan: Ultra Fast in-Memory Table Scan using on-Chip Vector Processing Units | 2009 | VLDB | 0.00028248614 |
| 2,127 | SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures | 2014 | VLDB | 9.4863172e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,400 | iOLAP: Managing Uncertainty for Efficient Incremental OLAP | 2016 | SIGMOD | 5.0803518e-05 |
| 11,690 | Integration of Large-Scale Data Processing Systems and Traditional Parallel Database Technology | 2019 | VLDB | 4.1945683e-05 |
| 9,504 | Supporting Scalable Analytics with Latency Constraints | 2015 | VLDB | 4.3341665e-05 |
| 7,270 | Oracle In-Database Hadoop: When MapReduce Meets RDBMS | 2012 | SIGMOD | 4.7813984e-05 |
| 7,866 | Operational Analytics Data Management Systems | 2016 | VLDB | 4.6321795e-05 |
| 157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359 |
| 3,066 | HAWQ: A Massively Parallel Processing SQL Engine in Hadoop | 2014 | SIGMOD | 7.6221974e-05 |
| 11,948 | Tutorial: SQL-on-Hadoop Systems | 2015 | VLDB | 4.1945683e-05 |
| 3,973 | Apache Hive: From MapReduce to Enterprise-grade Big Data Warehousing | 2019 | SIGMOD | 6.5758017e-05 |
| 2,127 | SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures | 2014 | VLDB | 9.4863172e-05 |