Database Paper Browser


Realtime Data Processing at Facebook

Summary: Facebook's realtime data processing systems handle hundreds of GB/s across their pipelines. The paper identifies five design decisions that affect ease of use, performance, fault tolerance, scalability, and correctness. Targeting seconds of latency and using a persistent message bus for data transport underpin Puma, Swift, and Stylus, which deliver scalable, fault-tolerant stream processing, and the paper closes with design lessons. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 5204
Venue: SIGMOD
Year: 2016
Pagerank: 0.00011140777
Overall Rank: 1,613 | 88.79%
DOI: 10.1145/2882903.2904441

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 23 of 23 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
569 | Optimizing Space Amplification in RocksDB | 2017 | CIDR | 0.00019924098
1,610 | MyRocks: LSM-Tree Database Storage Engine Serving Facebook's Social Graph | 2020 | VLDB | 0.00011148094
2,798 | Chucky: A Succinct Cuckoo Filter for LSM-Tree | 2021 | SIGMOD | 8.1080111e-05
3,544 | Rosetta: A Robust Space-Time Optimized Range Filter for Key-Value Stores | 2020 | SIGMOD | 6.9898874e-05
3,965 | Spooky: Granulating LSM-Tree Compactions Correctly | 2022 | VLDB | 6.5820028e-05
5,286 | StreamOps: Cloud-Native Runtime Management for Streaming Services in ByteDance | 2023 | VLDB | 5.5838392e-05
6,436 | Providing Streaming Joins as a Service at Facebook | 2018 | VLDB | 5.0636254e-05
6,629 | A Holistic View of Stream Partitioning Costs | 2017 | VLDB | 4.9880986e-05
6,715 | Shared Foundations: Modernizing Meta's Data Lakehouse | 2023 | CIDR | 4.9509939e-05
7,995 | BP-tree: Overcoming the Point-Range Operation Tradeoff for In-Memory B-trees | 2023 | VLDB | 4.6109825e-05
8,009 | CAMAL: Optimizing LSM-trees via Active Learning | 2024 | SIGMOD | 4.6066863e-05
9,317 | Are Joins over LSM-trees Ready? Take RocksDB as an Example | 2025 | VLDB | 4.3556432e-05
9,318 | Disaggregated State Management in Apache Flink® 2.0 | 2025 | VLDB | 4.3556432e-05
9,363 | BonsaiKV: Towards Fast, Scalable, and Persistent Key-Value Stores with Tiered, Heterogeneous Memory System | 2024 | VLDB | 4.3503444e-05
10,021 | Hourglass: An Adaptive Range Filter with Lightweight Hybrid Encoding | 2026 | SIGMOD | 4.1945683e-05
10,191 | PartitionKV: Redesigning LSM-tree KV Stores on NVMs with Adaptive Partitioning for Reducing Write Stalls and Amplification | 2026 | SIGMOD | 4.1945683e-05
10,419 | Unified Lineage System: Tracking Data Provenance at Scale | 2025 | SIGMOD | 4.1945683e-05
10,643 | Keigo: Co-designing Log-Structured Merge Key-Value Stores with a Non-Volatile, Concurrency-aware Storage Hierarchy | 2025 | VLDB | 4.1945683e-05
10,766 | Scribe: How Meta transports terabytes per second in real time | 2025 | VLDB | 4.1945683e-05
11,485 | Real-time Data Infrastructure at Uber | 2021 | SIGMOD | 4.1945683e-05
11,625 | InvaliDB: Scalable Push-Based Real-Time Queries on Top of Pull-Based Databases (Extended) | 2020 | VLDB | 4.1945683e-05
11,805 | CarStream: An Industrial System of Big Data Processing for Internet-of-Vehicles | 2017 | VLDB | 4.1945683e-05
11,807 | Upsortable: Programming Top-K Queries Over Data Streams | 2017 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 11 of 11 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers