Fast Failure Recovery in Distributed Graph Processing Systems
Summary: Proposes a partitioned, cost-aware recovery scheme for distributed graph processing that parallelizes recovery work across the remaining nodes after a failure. It augments checkpoint- and log-based schemes with a graph-partitioning mechanism and achieves up to 30x faster recovery on Giraph (40-node cluster). (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Yanyan Shen
2. Gang Chen
3. H. V. Jagadish
4. Wei Lu
5. Beng Chin Ooi
6. Bogdan Marius Tudor
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,696 | Asynchronous and Fault-Tolerant Recursive Datalog Evaluation in Shared-Nothing Engines | 2015 | VLDB | 5.9911301e-05 |
| 8,829 | A Distributed System for Large-scale n-gram Language Models at Tencent | 2019 | VLDB | 4.4406886e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4 | Pregel: A System for Large-Scale Graph Processing | 2010 | SIGMOD | 0.0019005923 |
| 37 | Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud | 2012 | VLDB | 0.0007522744 |
| 558 | Trinity: A Distributed Graph Engine on a Memory Cloud | 2013 | SIGMOD | 0.00020168032 |
| 1,800 | epiC: an Extensible and Scalable System for Processing Big Data | 2014 | VLDB | 0.00010512649 |
| 1,931 | Efficient Processing of k Nearest Neighbor Joins using MapReduce | 2012 | VLDB | 0.00010040427 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,115 | Start Late or Finish Early: A Distributed Graph Processing System with Redundancy Reduction | 2019 | VLDB | 4.5816155e-05 |
| 3,129 | Scalable Big Graph Processing in MapReduce | 2014 | SIGMOD | 7.5008242e-05 |
| 2,494 | Streaming Graph Partitioning: An Experimental Study | 2018 | VLDB | 8.6508229e-05 |
| 8,254 | A Study of Partitioning Policies for Graph Analytics on Large-scale Distributed Platforms | 2019 | VLDB | 4.5491792e-05 |
| 1,953 | Distributed Evaluation of Subgraph Queries Using Worst-case Optimal Low-Memory Dataflows | 2018 | VLDB | 9.9665955e-05 |
| 1,968 | An Experimental Comparison of Partitioning Strategies in Distributed Graph Processing | 2017 | VLDB | 9.9071968e-05 |
| 4,830 | Systems for Big-Graphs | 2014 | VLDB | 5.8924342e-05 |
| 9,448 | Cost-based Fault-tolerance for Parallel Data Processing | 2015 | SIGMOD | 4.3401906e-05 |
| 1,877 | Large-Scale Distributed Graph Computing Systems: An Experimental Evaluation | 2015 | VLDB | 0.00010236803 |
| 3,232 | Managing Large Dynamic Graphs Efficiently | 2012 | SIGMOD | 7.336861e-05 |