Optimistic Recovery for Iterative Dataflows in Action
Summary: Optimistic recovery for iterative dataflows avoids checkpointing intermediate state. After a failure, a compensation function restores a consistent state, and the iteration continues from there. Demonstrated on Apache Flink with graph algorithms, the approach provides fault tolerance without checkpoint overhead and keeps failure-free performance near optimal. (summarized by gpt-5-nano on Feb 09 2026)
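The idea in the summary can be illustrated with a minimal sketch, assuming PageRank as the graph algorithm and a compensation function that spreads the lost probability mass uniformly over the lost vertices. The function names (`pagerank_step`, `compensate`, `run`), the toy graph, and the simulated failure are hypothetical and are not taken from the paper or from the Apache Flink API.

```python
def pagerank_step(ranks, graph, damping=0.85):
    """One PageRank iteration over a graph given as {vertex: [out-neighbours]}."""
    n = len(graph)
    new_ranks = {v: (1.0 - damping) / n for v in graph}
    for v, outs in graph.items():
        # Dangling vertices spread their rank over all vertices.
        targets = outs if outs else list(graph)
        share = damping * ranks[v] / len(targets)
        for u in targets:
            new_ranks[u] += share
    return new_ranks


def compensate(ranks, lost):
    """Assumed compensation: keep surviving ranks and give the lost vertices
    equal shares of the missing probability mass, so ranks sum to 1 again."""
    surviving_mass = sum(r for v, r in ranks.items() if v not in lost)
    fill = (1.0 - surviving_mass) / len(lost)
    return {v: (fill if v in lost else r) for v, r in ranks.items()}


def run(graph, iterations, fail_at=None, lost=()):
    """Iterate without checkpoints; on a simulated failure, compensate and continue."""
    ranks = {v: 1.0 / len(graph) for v in graph}
    for i in range(iterations):
        ranks = pagerank_step(ranks, graph)
        if i == fail_at and lost:
            # The state of the 'lost' vertices is gone; no rollback is performed.
            ranks = compensate(ranks, set(lost))
    return ranks


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
baseline = run(graph, iterations=200)
recovered = run(graph, iterations=200, fail_at=5, lost=["b", "d"])
```

In this sketch the run with a simulated failure at iteration 5 reaches the same ranks as the failure-free run, because the compensated state is still a valid probability distribution and the fixed-point iteration converges from any such state.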
Authors
1. Sergey Dudoladov
2. Chen Xu
3. Sebastian Schelter
4. Asterios Katsifodimos
5. Stephan Ewen
6. Kostas Tzoumas
7. Volker Markl
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,368 | Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing | 2022 | VLDB | 5.5457532e-05 |
| 8,617 | A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning | 2024 | VLDB | 4.4846425e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4 | Pregel: A System for Large-Scale Graph Processing | 2010 | SIGMOD | 0.0019005923 |
| 22 | SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets | 2008 | VLDB | 0.0008456613 |
| 37 | Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud | 2012 | VLDB | 0.0007522744 |
| 413 | HaLoop: Efficient Iterative Data Processing on Large Clusters | 2010 | VLDB | 0.00023904409 |
| 979 | Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads | 2012 | VLDB | 0.0001488055 |
| 2,172 | Spinning Fast Iterative Data Flows | 2012 | VLDB | 9.3706587e-05 |
| 2,575 | A Latency and Fault-Tolerance Optimizer for Online Parallel Query Plans | 2011 | SIGMOD | 8.5133576e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,696 | Asynchronous and Fault-Tolerant Recursive Datalog Evaluation in Shared-Nothing Engines | 2015 | VLDB | 5.9911301e-05 |
| 3,886 | Fault-tolerant Stream Processing using a Distributed, Replicated File System | 2008 | VLDB | 6.6661649e-05 |
| 1,990 | Fault-Tolerance in the Borealis Distributed Stream Processing System | 2005 | SIGMOD | 9.8472819e-05 |
| 1,226 | Integrating Scale Out and Fault Tolerance in Stream Processing using Operator State Management | 2013 | SIGMOD | 0.00013180799 |
| 1,357 | Highly Available, Fault-Tolerant, Parallel Dataflows | 2004 | SIGMOD | 0.00012392275 |
| 11,804 | State Management in Apache Flink | 2017 | VLDB | 4.1945683e-05 |
| 9,448 | Cost-based Fault-tolerance for Parallel Data Processing | 2015 | SIGMOD | 4.3401906e-05 |
| 7,125 | Fast Failure Recovery in Distributed Graph Processing Systems | 2015 | VLDB | 4.8246382e-05 |
| 2,172 | Spinning Fast Iterative Data Flows | 2012 | VLDB | 9.3706587e-05 |
| 13,322 | Fault-Tolerance for Distributed Iterative Dataflows in Action | 2018 | VLDB | - |