Database Paper Browser

HaLoop: Efficient Iterative Data Processing on Large Clusters

Summary: HaLoop, a modified version of Hadoop MapReduce, adds loop-aware task scheduling and caching to support iterative data mining workloads. By reusing loop-invariant data across iterations, it reduces query runtimes by a factor of 1.85 on average compared with Hadoop and shuffles only 4% of the data between mappers and reducers. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10084
Venue: VLDB
Year: 2010
Pagerank: 0.00023904409
Overall Rank: 413 | 97.13%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 39 of 39 citing papers.

Rank Citing Paper Year Venue Pagerank
396 One Trillion Edges: Graph Processing at Facebook-Scale 2015 VLDB 0.00024424102
522 Differential dataflow 2013 CIDR 0.00021099241
542 Shark: SQL and Rich Analytics at Scale 2013 SIGMOD 0.00020595648
868 Profiling, What-if Analysis, and Cost-based Optimization of MapReduce Programs 2011 VLDB 0.00015789681
979 Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads 2012 VLDB 0.0001488055
1,158 Simulation of Database-Valued Markov Chains Using SimSQL 2013 SIGMOD 0.0001361064
1,280 Automatic Optimization for MapReduce Programs 2011 VLDB 0.0001285503
1,294 Distributed SociaLite: A Datalog-Based Language for Large-Scale Graph Analysis 2013 VLDB 0.00012779484
1,334 SkewTune: Mitigating Skew in MapReduce Applications 2012 SIGMOD 0.0001250413
1,452 Asynchronous Large-Scale Graph Processing Made Easy 2013 CIDR 0.00011919499
1,800 epiC: an Extensible and Scalable System for Processing Big Data 2014 VLDB 0.00010512649
1,873 An Architecture for Compiling UDF-centric Workflows 2015 VLDB 0.00010253002
2,172 Spinning Fast Iterative Data Flows 2012 VLDB 9.3706587e-05
2,255 LINVIEW: Incremental View Maintenance for Complex Analytical Queries 2014 SIGMOD 9.1884983e-05
2,418 Tupleware: "Big" Data, Big Analytics, Small Clusters 2015 CIDR 8.8556595e-05
2,458 REX: Recursive, Delta-Based Data-Centric Computation 2012 VLDB 8.7683462e-05
2,473 Photon: A Fast Query Engine for Lakehouse Systems 2022 SIGMOD 8.7237281e-05
2,529 Pregelix: Big(ger) Graph Analytics on A Dataflow Engine 2015 VLDB 8.5940768e-05
2,667 Cumulon: Optimizing Statistical Data Analysis in the Cloud 2013 SIGMOD 8.3413995e-05
2,818 Implicit Parallelism through Deep Language Embedding 2015 SIGMOD 8.0665558e-05
2,848 Exploiting Matrix Dependency for Efficient Distributed Matrix Computation 2015 SIGMOD 8.0208832e-05
3,081 Knowledge Expansion over Probabilistic Knowledge Bases 2014 SIGMOD 7.6031501e-05
3,279 Early Accurate Results for Advanced Analytics on MapReduce 2012 VLDB 7.2855494e-05
3,504 M3R: Increased Performance for In-Memory Hadoop Jobs 2012 VLDB 7.0347515e-05
3,694 Keys for Graphs 2015 VLDB 6.8345712e-05
4,696 Asynchronous and Fault-Tolerant Recursive Datalog Evaluation in Shared-Nothing Engines 2015 VLDB 5.9911301e-05
5,395 Large-scale Predictive Analytics in Vertica: Fast Data Transfer, Distributed Model Creation, and In-database Prediction 2015 SIGMOD 5.5318806e-05
5,688 PREDIcT: Towards Predicting the Runtime of Large Scale Iterative Analytics 2013 VLDB 5.3702808e-05
6,173 Exploiting Soft and Hard Correlations in Big Data Query Optimization 2016 VLDB 5.1699414e-05
7,294 Optimization for iterative queries on MapReduce 2014 VLDB 4.773119e-05
7,511 Hone: "Scaling Down" Hadoop on Shared-Memory Systems 2013 VLDB 4.7180617e-05
7,794 Large-scale Complex Analytics on Semi-structured Datasets using AsterixDB and Spark 2016 VLDB 4.6482977e-05
8,300 sPCA: Scalable Principal Component Analysis for Big Data on Distributed Platforms 2015 SIGMOD 4.5435639e-05
9,266 Redoop Infrastructure for Recurring Big Data Queries 2014 VLDB 4.3667196e-05
9,282 Hybrid Pulling/Pushing for I/O-Efficient Distributed and Iterative Graph Computing 2016 SIGMOD 4.3634964e-05
9,547 Optimistic Recovery for Iterative Dataflows in Action 2015 SIGMOD 4.3259935e-05
11,949 Big Data Research: Will Industry Solve all the Problems? 2015 VLDB 4.1945683e-05
11,976 Anti-Combining for MapReduce 2014 SIGMOD 4.1945683e-05
12,140 SkewTune in Action: Mitigating Skew in MapReduce Applications 2012 VLDB 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 5 of 5 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
