Building a Replicated Logging System with Apache Kafka
Summary: Describes a replicated logging backbone built on Kafka's distributed commit log, extending Kafka's use from messaging to data storage and processing. Draws on LinkedIn's design and engineering experience in replicating Kafka logs for source-of-truth storage and stream processing. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Guozhang Wang
2. Joel Koshy
3. Sriram Subramanian
4. Kartik Paramasivam
5. Mammad Zadeh
6. Neha Narkhede
7. Jun Rao
8. Jay Kreps
9. Joe Stein
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,822 | Consistency and Completeness: Rethinking Distributed Stream Processing in Apache Kafka | 2021 | SIGMOD | 5.8959131e-05 |
| 5,939 | Clonos: Consistent Causal Recovery for Highly-Available Streaming Dataflows | 2021 | SIGMOD | 5.2641681e-05 |
| 6,759 | AStream: Ad-hoc Shared Stream Processing | 2019 | SIGMOD | 4.9352213e-05 |
| 7,479 | Amazon MemoryDB: A Fast and Durable Memory-First Cloud Database | 2024 | SIGMOD | 4.7180617e-05 |
| 8,128 | Lotus: Scalable Multi-Partition Transactions on Single-Threaded Partitioned Databases | 2022 | VLDB | 4.5785914e-05 |
| 10,546 | Evaluating Continuous Queries with Inconsistency Annotations | 2025 | VLDB | 4.1945683e-05 |
| 11,360 | KafkaDirect: Zero-copy Data Access for Apache Kafka over RDMA Networks | 2022 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 1 of 1 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,853 | On Brewing Fresh Espresso: LinkedIn’s Distributed Data Serving Platform | 2013 | SIGMOD | 0.00010320369 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,804 | State Management in Apache Flink | 2017 | VLDB | 4.1945683e-05 |
| 6,856 | Liquid: Unifying Nearline and Offline Big Data Integration | 2015 | CIDR | 4.9060615e-05 |
| 1,853 | On Brewing Fresh Espresso: LinkedIn’s Distributed Data Serving Platform | 2013 | SIGMOD | 0.00010320369 |
| 6,123 | Data Ingestion for the Connected World | 2017 | CIDR | 5.1991194e-05 |
| 11,566 | Rethinking Message Brokers on RDMA and NVM | 2020 | SIGMOD | 4.1945683e-05 |
| 2,338 | Samza: Stateful Scalable Stream Processing at LinkedIn | 2017 | VLDB | 9.00711e-05 |
| 11,802 | Query-able Kafka: An agile data analytics pipeline for mobile wireless networks | 2017 | VLDB | 4.1945683e-05 |
| 9,411 | Kora: A Cloud-Native Event Streaming Platform For Kafka | 2023 | VLDB | 4.3441378e-05 |
| 11,360 | KafkaDirect: Zero-copy Data Access for Apache Kafka over RDMA Networks | 2022 | SIGMOD | 4.1945683e-05 |
| 4,822 | Consistency and Completeness: Rethinking Distributed Stream Processing in Apache Kafka | 2021 | SIGMOD | 5.8959131e-05 |