Piranha: Optimizing Short Jobs in Hadoop
Summary: Piranha optimizes short, latency-sensitive Hadoop jobs on existing clusters without impacting long-running tasks. It exploits short-job patterns learned from Yahoo production workloads to reduce query latency by up to 71% on unmodified Hadoop. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,207 | Kodiak: Leveraging Materialized Views For Very Low-Latency Analytics Over High-Dimensional Web-Scale Data | 2016 | VLDB | 4.800763e-05 |
| 8,978 | SpongeFiles: Mitigating Data Skew in MapReduce Using Distributed Memory | 2014 | SIGMOD | 4.417225e-05 |
| 11,197 | QaaD (Query-as-a-Data): Scalable Execution of Massive Number of Small Queries in Spark | 2023 | SIGMOD | 4.1945683e-05 |
| 11,341 | Juggler: Autonomous Cost Optimization and Performance Prediction of Big Data Applications | 2022 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 10 of 10 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3 | Pig Latin: A Not-So-Foreign Language for Data Processing | 2008 | SIGMOD | 0.0024183614 |
| 70 | Hive - A Warehousing Solution Over a Map-Reduce Framework | 2009 | VLDB | 0.00059533166 |
| 109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983 |
| 157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359 |
| 542 | Shark: SQL and Rich Analytics at Scale | 2013 | SIGMOD | 0.00020595648 |
| 794 | Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) | 2010 | VLDB | 0.00016605103 |
| 913 | Tenzing A SQL Implementation On The MapReduce Framework | 2011 | VLDB | 0.00015408131 |
| 1,615 | The Performance of MapReduce: An In-depth Study | 2010 | VLDB | 0.00011132319 |
| 1,721 | Distributed Data-Parallel Computing Using a High-Level Programming Language | 2009 | SIGMOD | 0.00010762918 |
| 1,863 | Cheetah: A High Performance, Custom Data Warehouse on Top of MapReduce | 2010 | VLDB | 0.00010286531 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,933 | FP-Hadoop: Efficient Execution of Parallel Jobs Over Skewed Data | 2015 | VLDB | 4.1945683e-05 |
| 1,863 | Cheetah: A High Performance, Custom Data Warehouse on Top of MapReduce | 2010 | VLDB | 0.00010286531 |
| 13,380 | Job Scheduling with Minimizing Data Communication Costs | 2015 | SIGMOD | - |
| 11,958 | Shared Execution of Recurring Workloads in MapReduce | 2015 | VLDB | 4.1945683e-05 |
| 12,101 | Optimization Strategies for A/B Testing on HADOOP | 2013 | VLDB | 4.1945683e-05 |
| 9,504 | Supporting Scalable Analytics with Latency Constraints | 2015 | VLDB | 4.3341665e-05 |
| 794 | Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) | 2010 | VLDB | 0.00016605103 |
| 2,337 | Efficient Processing of Data Warehousing Queries in a Split Execution Environment | 2011 | SIGMOD | 9.0098186e-05 |
| 3,703 | Multi-Query Optimization in MapReduce Framework | 2014 | VLDB | 6.8289978e-05 |
| 1,615 | The Performance of MapReduce: An In-depth Study | 2010 | VLDB | 0.00011132319 |