Consistency and Completeness: Rethinking Distributed Stream Processing in Apache Kafka
Summary: Kafka's persistent log is the basis for strong correctness guarantees under failures and out-of-order data. Kafka Streams runs a read–process–write cycle in which state updates are captured as log appends, achieves exactly-once semantics through idempotent and transactional writes, and uses revision-based speculative processing to emit early results. Together these enable scalable trade-offs among correctness, performance, and cost.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 6245
- Venue: SIGMOD
- Year: 2021
- Pagerank: 5.8959131e-05
- Overall Rank: 4,822 | 66.46%
- DOI: 10.1145/3448016.3457556
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 20 of 20 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 22 | SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets | 2008 | VLDB | 0.0008456613 |
| 142 | TelegraphCQ: Continuous Dataflow Processing for an Uncertain World | 2003 | CIDR | 0.00041725802 |
| 191 | The Design of the Borealis Stream Processing Engine | 2005 | CIDR | 0.00035738595 |
| 314 | MillWheel: Fault-Tolerant Stream Processing at Internet Scale | 2013 | VLDB | 0.00028084774 |
| 323 | Gigascope: A Stream Database for Network Applications | 2003 | SIGMOD | 0.00027492196 |
| 432 | Flexible Time Management in Data Stream Systems | 2004 | PODS | 0.00023368424 |
| 538 | The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing | 2015 | VLDB | 0.00020678804 |
| 1,084 | Dhalion: Self-Regulating Stream Processing in Heron | 2017 | VLDB | 0.00014209714 |
| 1,098 | Trill: A High-Performance Incremental Query Processor for Diverse Analytics | 2015 | VLDB | 0.00014114442 |
| 1,226 | Integrating Scale Out and Fault Tolerance in Stream Processing using Operator State Management | 2013 | SIGMOD | 0.00013180799 |
| 1,551 | Out-of-Order Processing: A New Architecture for High-Performance Stream Systems | 2008 | VLDB | 0.00011416058 |
| 2,182 | Aurora: A Data Stream Management System | 2003 | SIGMOD | 9.3449021e-05 |
| 2,264 | S-Store: Streaming Meets Transaction Processing | 2015 | VLDB | 9.1575142e-05 |
| 2,338 | Samza: Stateful Scalable Stream Processing at LinkedIn | 2017 | VLDB | 9.00711e-05 |
| 3,550 | Chi: A Scalable and Programmable Control Plane for Distributed Stream Processing Systems | 2018 | VLDB | 6.9843512e-05 |
| 4,044 | Megaphone: Latency-conscious state migration for distributed streaming dataflows | 2019 | VLDB | 6.4995312e-05 |
| 5,263 | Consistent Regions: Guaranteed Tuple Processing in IBM Streams | 2016 | VLDB | 5.5976361e-05 |
| 5,753 | Building a Replicated Logging System with Apache Kafka | 2015 | VLDB | 5.3404371e-05 |
| 5,971 | Optimal and General Out-of-Order Sliding-Window Aggregation | 2019 | VLDB | 5.2480159e-05 |
| 6,856 | Liquid: Unifying Nearline and Offline Big Data Integration | 2015 | CIDR | 4.9060615e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,767 | Watermarks in Stream Processing Systems: Semantics and Comparative Analysis of Apache Flink and Google Cloud Dataflow | 2021 | VLDB | 4.9322174e-05 |
| 9,411 | Kora: A Cloud-Native Event Streaming Platform For Kafka | 2023 | VLDB | 4.3441378e-05 |
| 7,938 | Correctness in Stream Processing: Challenges and Opportunities | 2022 | CIDR | 4.613363e-05 |
| 2,338 | Samza: Stateful Scalable Stream Processing at LinkedIn | 2017 | VLDB | 9.00711e-05 |
| 11,819 | Toward High-Performance Distributed Stream Processing via Approximate Fault Tolerance | 2017 | VLDB | 4.1945683e-05 |
| 1,990 | Fault-Tolerance in the Borealis Distributed Stream Processing System | 2005 | SIGMOD | 9.8472819e-05 |
| 11,360 | KafkaDirect: Zero-copy Data Access for Apache Kafka over RDMA Networks | 2022 | SIGMOD | 4.1945683e-05 |
| 11,804 | State Management in Apache Flink | 2017 | VLDB | 4.1945683e-05 |
| 10,862 | How Reliable Are Streams? End-to-End Processing-Guarantee Validation and Performance Benchmarking of Stream Processing Systems | 2025 | VLDB | 4.1945683e-05 |
| 5,753 | Building a Replicated Logging System with Apache Kafka | 2015 | VLDB | 5.3404371e-05 |