Database Paper Browser

Integrating Scale Out and Fault Tolerance in Stream Processing using Operator State Management

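Summary: Exposing operator state to the stream processing system through explicit state management primitives enables both dynamic scale-out and fault-tolerant recovery for stateful operators. To scale out, an operator's checkpointed state is partitioned across new VMs; to recover, the state is restored from a checkpoint and unprocessed tuples are replayed. The approach is demonstrated on Amazon EC2 with up to 50 VMs at a Linear Road Benchmark load factor of L=350. (summarized by gpt-5-nano on Feb 09 2026)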

Paper ID: 4691
Venue: SIGMOD
Year: 2013
Pagerank: 0.00013180799
Overall Rank: 1,226 | 91.48%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 40 of 40 citing papers.

| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,613 | Realtime Data Processing at Facebook | 2016 | SIGMOD | 0.00011140777 |
| 2,264 | S-Store: Streaming Meets Transaction Processing | 2015 | VLDB | 9.1575142e-05 |
| 2,338 | Samza: Stateful Scalable Stream Processing at LinkedIn | 2017 | VLDB | 9.00711e-05 |
| 3,210 | Frontier: Resilient Edge Processing for the Internet of Things | 2018 | VLDB | 7.3746627e-05 |
| 3,382 | Scalable and Adaptive Online Joins | 2014 | VLDB | 7.1597145e-05 |
| 3,550 | Chi: A Scalable and Programmable Control Plane for Distributed Stream Processing Systems | 2018 | VLDB | 6.9843512e-05 |
| 3,762 | SABER: Window-Based Hybrid Stream Processing for Heterogeneous Architectures | 2016 | SIGMOD | 6.7804471e-05 |
| 4,044 | Megaphone: Latency-conscious state migration for distributed streaming dataflows | 2019 | VLDB | 6.4995312e-05 |
| 4,488 | Analyzing Efficient Stream Processing on Modern Hardware | 2019 | VLDB | 6.145117e-05 |
| 4,795 | Rhino: Efficient Management of Very Large Distributed State for Stream Processing Engines | 2020 | SIGMOD | 5.9158043e-05 |
| 4,822 | Consistency and Completeness: Rethinking Distributed Stream Processing in Apache Kafka | 2021 | SIGMOD | 5.8959131e-05 |
| 5,193 | LightSaber: Efficient Window Aggregation on Multi-core Processors | 2020 | SIGMOD | 5.6371049e-05 |
| 5,286 | StreamOps: Cloud-Native Runtime Management for Streaming Services in ByteDance | 2023 | VLDB | 5.5838392e-05 |
| 5,644 | FluxQuery: An Execution Framework for Highly Interactive Query Workloads | 2016 | SIGMOD | 5.3924275e-05 |
| 5,657 | BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures | 2019 | SIGMOD | 5.3864606e-05 |
| 5,939 | Clonos: Consistent Causal Recovery for Highly-Available Streaming Dataflows | 2021 | SIGMOD | 5.2641681e-05 |
| 6,629 | A Holistic View of Stream Partitioning Costs | 2017 | VLDB | 4.9880986e-05 |
| 6,648 | Grizzly: Efficient Stream Processing Through Adaptive Query Compilation | 2020 | SIGMOD | 4.9771723e-05 |
| 6,721 | Beyond Analytics: The Evolution of Stream Processing Systems | 2020 | SIGMOD | 4.9492015e-05 |
| 6,856 | Liquid: Unifying Nearline and Offline Big Data Integration | 2015 | CIDR | 4.9060615e-05 |
| 7,234 | MgCrab: Transaction Crabbing for Live Migration in Deterministic Database Systems | 2019 | VLDB | 4.7941449e-05 |
| 7,373 | Hazelcast Jet: Low-latency Stream Processing at the 99.99th Percentile | 2021 | VLDB | 4.7494183e-05 |
| 8,001 | Rethinking Stateful Stream Processing with RDMA | 2022 | SIGMOD | 4.6092573e-05 |
| 8,078 | Meta-Dataflows: Efficient Exploratory Dataflow Jobs | 2018 | SIGMOD | 4.5914967e-05 |
| 8,480 | Optimization of Threshold Functions over Streams | 2021 | VLDB | 4.5011552e-05 |
| 9,194 | Phoebe: A Learning-based Checkpoint Optimizer | 2021 | VLDB | 4.3761777e-05 |
| 9,217 | Elasticutor: Rapid Elasticity for Realtime Stateful Stream Processing | 2019 | SIGMOD | 4.3712054e-05 |
| 9,496 | Scabbard: Single-Node Fault-Tolerant Stream Processing | 2022 | VLDB | 4.3341665e-05 |
| 9,504 | Supporting Scalable Analytics with Latency Constraints | 2015 | VLDB | 4.3341665e-05 |
| 9,733 | ContTune: Continuous Tuning by Conservative Bayesian Optimization for Distributed Stream Data Processing Systems | 2023 | VLDB | 4.2942813e-05 |
| 9,883 | Towards Resource-adaptive Query Execution in Cloud Native Databases | 2024 | CIDR | 4.2635782e-05 |
| 10,043 | Accelerating Stream Processing Engines via Hardware Offloading | 2026 | SIGMOD | 4.1945683e-05 |
| 10,259 | Scarf: Self-Adaptive Tuning via Multi-Objective Reinforcement Learning for Apache Flink | 2026 | VLDB | 4.1945683e-05 |
| 10,402 | CloudJump II: Optimizing Cloud Databases for Shared Storage | 2025 | SIGMOD | 4.1945683e-05 |
| 11,243 | Fries: Fast and Consistent Runtime Reconfiguration in Dataflow Systems with Transactional Guarantees | 2023 | VLDB | 4.1945683e-05 |
| 11,695 | Minimizing Cost by Reducing Scaling Operations in Distributed Stream Processing | 2019 | VLDB | 4.1945683e-05 |
| 11,804 | State Management in Apache Flink | 2017 | VLDB | 4.1945683e-05 |
| 11,819 | Toward High-Performance Distributed Stream Processing via Approximate Fault Tolerance | 2017 | VLDB | 4.1945683e-05 |
| 11,909 | CE-Storm: Confidential Elastic Processing of Data Streams | 2015 | SIGMOD | 4.1945683e-05 |
| 11,925 | Smooth Task Migration in Apache Storm | 2015 | SIGMOD | 4.1945683e-05 |

Outgoing Citations (Sorted by Pagerank)

Showing 8 of 8 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers