Apache Tez: A Unifying Framework for Modeling and Building Data Processing Applications
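Summary: Tez is an open-source framework for building data-flow engines on YARN. It encourages component reuse while leaving the performance-critical data plane customizable. By giving engines shared building blocks, it reduces fragmentation, and it supports runtime optimizations such as dynamic partition pruning. Tez-based implementations of Hive, Pig, Spark, and Cascading outperform their original YARN implementations on TPC-DS and TPC-H. (summarized by gpt-5-nano on Feb 09 2026)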
Authors
1. Bikas Saha
2. Hitesh Shah
3. Siddharth Seth
4. Gopal Vijayaraghavan
5. Arun Murthy
6. Carlo Curino
Incoming Citations (Sorted by Pagerank)
Showing 9 of 9 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 544 | Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources | 2018 | SIGMOD | 0.00020521965 |
| 2,501 | DBEst: Revisiting Approximate Query Processing Engines with Machine Learning Models | 2019 | SIGMOD | 8.6453446e-05 |
| 3,973 | Apache Hive: From MapReduce to Enterprise-grade Big Data Warehousing | 2019 | SIGMOD | 6.5758017e-05 |
| 5,441 | Using Cloud Functions as Accelerator for Elastic Data Analytics | 2023 | SIGMOD | 5.5028093e-05 |
| 6,117 | REEF: Retainable Evaluator Execution Framework | 2015 | SIGMOD | 5.2036631e-05 |
| 10,883 | IcedTea: Efficient and Responsive Time-Travel Debugging in Dataflow Systems | 2025 | VLDB | 4.1945683e-05 |
| 11,531 | Fangorn: Adaptive Execution Framework for Heterogeneous Workloads on Shared Clusters | 2021 | VLDB | 4.1945683e-05 |
| 11,690 | Integration of Large-Scale Data Processing Systems and Traditional Parallel Database Technology | 2019 | VLDB | 4.1945683e-05 |
| 11,948 | Tutorial: SQL-on-Hadoop Systems | 2015 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3 | Pig Latin: A Not-So-Foreign Language for Data Processing | 2008 | SIGMOD | 0.0024183614 |
| 70 | Hive - A Warehousing Solution Over a Map-Reduce Framework | 2009 | VLDB | 0.00059533166 |
| 109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983 |
| 476 | Impala: A Modern, Open-Source SQL Engine for Hadoop | 2015 | CIDR | 0.00022226941 |
| 2,928 | WANalytics: Analytics for a Geo-Distributed Data-Intensive World | 2015 | CIDR | 7.8812874e-05 |
| 7,050 | REEF: Retainable Evaluator Execution Framework | 2013 | VLDB | 4.85001e-05 |