Apache Hive: From MapReduce to Enterprise-grade Big Data Warehousing
Summary: Apache Hive evolves from MapReduce to an enterprise-grade data warehouse via a hybrid MPP/big-data architecture that integrates SQL, storage formats, and cloud concepts. Innovations cover transactions, the optimizer, the runtime, and federation, with experiments on typical workloads and a forward-looking community roadmap.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 5657
- Venue: SIGMOD
- Year: 2019
- Pagerank: 6.5758017e-05
- Overall Rank: 3,973 | 72.37%
- DOI: 10.1145/3299869.3314045
Incoming Citations (Sorted by Pagerank)
Showing 12 of 12 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,530 | Big Metadata: When Metadata is Big Data | 2021 | VLDB | 6.1075429e-05 |
| 6,340 | Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine | 2024 | SIGMOD | 5.1051018e-05 |
| 7,907 | Petabyte-Scale Row-Level Operations in Data Lakehouses | 2024 | VLDB | 4.6205839e-05 |
| 8,608 | Unity Catalog: Open and Universal Governance for the Lakehouse and Beyond | 2025 | SIGMOD | 4.4853979e-05 |
| 8,785 | Bringing Cloud-Native Storage to SAP IQ | 2021 | SIGMOD | 4.4522556e-05 |
| 9,232 | AutoComp: Automated Data Compaction for Log-Structured Tables in Data Lakes | 2025 | SIGMOD | 4.3690661e-05 |
| 9,689 | LST-Bench: Benchmarking Log-Structured Tables in the Cloud | 2024 | SIGMOD | 4.3043822e-05 |
| 9,692 | GHive: A Demonstration of GPU-Accelerated Query Processing in Apache Hive | 2022 | SIGMOD | 4.302852e-05 |
| 10,404 | Dynamic Pruning for Recursive Joins | 2025 | SIGMOD | 4.1945683e-05 |
| 11,197 | QaaD (Query-as-a-Data): Scalable Execution of Massive Number of Small Queries in Spark | 2023 | SIGMOD | 4.1945683e-05 |
| 11,291 | ADOps: An Anomaly Detection Pipeline in Structured Logs | 2023 | VLDB | 4.1945683e-05 |
| 11,690 | Integration of Large-Scale Data Processing Systems and Traditional Parallel Database Technology | 2019 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 20 of 20 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11 | Implementing Data Cubes Efficiently | 1996 | SIGMOD | 0.0011708144 |
| 109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983 |
| 158 | Automated Selection of Materialized Views and Indexes for SQL Databases | 2000 | VLDB | 0.00040071492 |
| 167 | The Snowflake Elastic Data Warehouse | 2016 | SIGMOD | 0.00039180521 |
| 182 | LEO - DB2's LEarning Optimizer | 2001 | VLDB | 0.00036962631 |
| 258 | DB2 Design Advisor: Integrated Automatic Physical Database Design | 2004 | VLDB | 0.0003022091 |
| 349 | Serializable Isolation for Snapshot Databases | 2008 | SIGMOD | 0.00026440605 |
| 426 | Amazon Redshift and the Case for Simpler Data Warehouses | 2015 | SIGMOD | 0.00023594359 |
| 476 | Impala: A Modern, Open-Source SQL Engine for Hadoop | 2015 | CIDR | 0.00022226941 |
| 481 | Incremental Maintenance of Views with Duplicates | 1995 | SIGMOD | 0.00022167223 |
| 544 | Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources | 2018 | SIGMOD | 0.00020521965 |
| 731 | Optimizing Queries Using Materialized Views: A Practical, Scalable Solution | 2001 | SIGMOD | 0.00017468889 |
| 1,223 | Enhancements to SQL Server Column Stores | 2013 | SIGMOD | 0.00013207641 |
| 1,588 | Druid: A Real-time Analytical Data Store | 2014 | SIGMOD | 0.00011239313 |
| 2,249 | Orca: A Modular Query Optimizer Architecture for Big Data | 2014 | SIGMOD | 9.2034693e-05 |
| 2,998 | Major Technical Advancements in Apache Hive | 2014 | SIGMOD | 7.753765e-05 |
| 4,174 | Computation Reuse in Analytics Job Service at Microsoft | 2018 | SIGMOD | 6.3856219e-05 |
| 4,188 | Apache Tez: A Unifying Framework for Modeling and Building Data Processing Applications | 2015 | SIGMOD | 6.3753681e-05 |
| 4,571 | Adaptive Statistics in Oracle 12c | 2017 | VLDB | 6.0773174e-05 |
| 7,557 | Invisible Glue: Scalable Self-Tuning Multi-Stores | 2015 | CIDR | 4.7112819e-05 |
Semantically Similar Papers