Only Aggressive Elephants are Fast Elephants
Summary: HAIL extends the HDFS upload pipeline to build per-block clustered indexes on every replica, enabling fast selective access from MapReduce jobs. With the default replication factor of three, aggressive indexing yields up to 60% faster uploads and up to 68x faster MapReduce queries, as demonstrated across six large clusters. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Jens Dittrich
2. Jorge-Arnulfo Quiané-Ruiz
3. Stefan Richter
4. Stefan Schuh
5. Alekh Jindal
6. Jörg Schad
Incoming Citations (Sorted by Pagerank)
Showing 8 of 8 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,322 | Instant Loading for Main Memory Databases | 2013 | VLDB | 9.034874e-05 |
| 3,129 | Scalable Big Graph Processing in MapReduce | 2014 | SIGMOD | 7.5008242e-05 |
| 7,918 | Indexing HDFS Data in PDW: Splitting the data from the index | 2014 | VLDB | 4.6170838e-05 |
| 7,958 | CARTILAGE: Adding Flexibility to the Hadoop Skeleton | 2013 | SIGMOD | 4.613363e-05 |
| 8,084 | ScalaGiST: Scalable Generalized Search Trees for MapReduce Systems [Innovative Systems Paper] | 2014 | VLDB | 4.5902866e-05 |
| 8,366 | WWHow! Freeing Data Storage from Cages | 2013 | CIDR | 4.5357016e-05 |
| 9,375 | Efficient Big Data Processing in Hadoop MapReduce | 2012 | VLDB | 4.347384e-05 |
| 12,071 | Mosquito: Another One Bites the Data Upload STream | 2013 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 20 of 20 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,439 | CoHadoop: Flexible Data Placement and Its Exploitation in Hadoop | 2011 | VLDB | 8.8190594e-05 |
| 2,476 | A Platform for Scalable One-Pass Analytics using MapReduce | 2011 | SIGMOD | 8.6960139e-05 |
| 2,337 | Efficient Processing of Data Warehousing Queries in a Split Execution Environment | 2011 | SIGMOD | 9.0098186e-05 |
| 3,208 | Column-Oriented Storage Techniques for MapReduce | 2011 | VLDB | 7.3781897e-05 |
| 3,279 | Early Accurate Results for Advanced Analytics on MapReduce | 2012 | VLDB | 7.2855494e-05 |
| 11,958 | Shared Execution of Recurring Workloads in MapReduce | 2015 | VLDB | 4.1945683e-05 |
| 12,101 | Optimization Strategies for A/B Testing on HADOOP | 2013 | VLDB | 4.1945683e-05 |
| 413 | HaLoop: Efficient Iterative Data Processing on Large Clusters | 2010 | VLDB | 2.3904409e-04 |
| 1,615 | The Performance of MapReduce: An In-depth Study | 2010 | VLDB | 1.1132319e-04 |
| 794 | Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) | 2010 | VLDB | 1.6605103e-04 |