HaLoop: Efficient Iterative Data Processing on Large Clusters
Summary: HaLoop is a modified version of Hadoop MapReduce that adds loop-aware task scheduling and caching to support iterative data mining workloads. By reusing loop-invariant data across iterations, it reduces runtimes by a factor of 1.85 on average compared with Hadoop and shuffles only 4% of the data between mappers and reducers. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Yingyi Bu
2. Bill Howe
3. Magdalena Balazinska
4. Michael D. Ernst
Incoming Citations (Sorted by Pagerank)
Showing 39 of 39 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3 | Pig Latin: A Not-So-Foreign Language for Data Processing | 2008 | SIGMOD | 0.0024183614 |
| 4 | Pregel: A System for Large-Scale Graph Processing | 2010 | SIGMOD | 0.0019005923 |
| 42 | A Comparison of Approaches to Large-Scale Data Analysis | 2009 | SIGMOD | 0.00073498298 |
| 77 | An Amateur's Introduction to Recursive Query Processing Strategies | 1986 | SIGMOD | 0.00057043861 |
| 157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 979 | Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads | 2012 | VLDB | 0.0001488055 |
| 7,687 | Experimental Analysis of Distributed Graph Systems | 2018 | VLDB | 0.00004677974 |
| 1,685 | Fast Iterative Graph Computation with Block Updates | 2013 | VLDB | 0.0001091808 |
| 2,736 | Online Aggregation and Continuous Query support in MapReduce | 2010 | SIGMOD | 0.000082043187 |
| 2,172 | Spinning Fast Iterative Data Flows | 2012 | VLDB | 0.000093706587 |
| 794 | Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) | 2010 | VLDB | 0.00016605103 |
| 9,266 | Redoop Infrastructure for Recurring Big Data Queries | 2014 | VLDB | 0.000043667196 |
| 1,615 | The Performance of MapReduce: An In-depth Study | 2010 | VLDB | 0.00011132319 |
| 11,958 | Shared Execution of Recurring Workloads in MapReduce | 2015 | VLDB | 0.000041945683 |
| 2,476 | A Platform for Scalable One-Pass Analytics using MapReduce | 2011 | SIGMOD | 0.000086960139 |