Database Paper Browser

MillWheel: Fault-Tolerant Stream Processing at Internet Scale

Summary: MillWheel is a framework for fault-tolerant, low-latency stream processing at Internet scale. Users express an application as a directed graph of computations, and the system manages persistent state and the continuous flow of records between them. The paper describes the programming model, including aggregations based on logical time, explains how its fault tolerance scales, and presents a Google case study (an anomaly detector) to illustrate its applicability to data-intensive, real-time analytics. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10541
Venue: VLDB
Year: 2013
Pagerank: 0.00028084774
Overall Rank: 314 | 97.82%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 16 of 66 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
9,604 | GeaFlow: A Graph Extended and Accelerated Dataflow System | 2023 | SIGMOD | 4.3177432e-05
10,410 | Oceanus: Enable SLO-Aware Vertical Autoscaling for Cloud-Native Streaming Services in Tencent | 2025 | SIGMOD | 4.1945683e-05
10,417 | Streaming Democratized: Ease Across the Latency Spectrum with Delayed View Semantics and Snowflake Dynamic Tables | 2025 | SIGMOD | 4.1945683e-05
10,431 | CORE+: A Complex Event Recognition Engine in C++ | 2025 | SIGMOD | 4.1945683e-05
10,766 | Scribe: How Meta transports terabytes per second in real time | 2025 | VLDB | 4.1945683e-05
11,261 | Out-of-Order Sliding-Window Aggregation with Efficient Bulk Evictions and Insertions | 2023 | VLDB | 4.1945683e-05
11,307 | Making Cache Monotonic and Consistent | 2023 | VLDB | 4.1945683e-05
11,435 | Synchronization Schemas | 2021 | PODS | 4.1945683e-05
11,468 | Klink: Progress-Aware Scheduling for Streaming Data Systems | 2021 | SIGMOD | 4.1945683e-05
11,625 | InvaliDB: Scalable Push-Based Real-Time Queries on Top of Pull-Based Databases (Extended) | 2020 | VLDB | 4.1945683e-05
11,673 | Online Template Induction for Machine-Generated Emails | 2019 | VLDB | 4.1945683e-05
11,709 | Robust, Scalable, Real-Time Event Time Series Aggregation at Twitter | 2018 | SIGMOD | 4.1945683e-05
11,728 | Challenges and Experiences in Building an Efficient Apache Beam Runner For IBM Streams | 2018 | VLDB | 4.1945683e-05
11,804 | State Management in Apache Flink | 2017 | VLDB | 4.1945683e-05
11,819 | Toward High-Performance Distributed Stream Processing via Approximate Fault Tolerance | 2017 | VLDB | 4.1945683e-05
11,849 | The Challenges of Global-scale Data Management | 2016 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 10 of 10 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers