Apache Hadoop Goes Realtime at Facebook
Summary: Facebook Messages runs on Hadoop with HBase to scale to billions of messages per day. The paper argues that Hadoop and HBase meet its realtime and scalability requirements better than Cassandra or Voldemort, outlines the enhancements made to Hadoop and the configuration tradeoffs involved, and presents Hadoop-based solutions as a preferable model to sharded MySQL for web-scale applications.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 4457
- Venue: SIGMOD
- Year: 2011
- Pagerank: 0.00011675192
- Overall Rank: 1,499 | 89.58%
- DOI: not listed
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 19 of 19 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 281 | LinkBench: a Database Benchmark Based on the Facebook Social Graph | 2013 | SIGMOD | 0.0002906793 |
| 379 | bLSM: A General Purpose Log Structured Merge Tree | 2012 | SIGMOD | 0.0002493527 |
| 979 | Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads | 2012 | VLDB | 0.0001488055 |
| 3,034 | How to Fit when No One Size Fits | 2013 | CIDR | 7.6752083e-05 |
| 3,062 | Efficient Multi-way Theta-Join Processing Using MapReduce | 2012 | VLDB | 7.6343994e-05 |
| 3,504 | M3R: Increased Performance for In-Memory Hadoop Jobs | 2012 | VLDB | 7.0347515e-05 |
| 3,922 | Pushing Data-Induced Predicates Through Joins in Big-Data Clusters | 2020 | VLDB | 6.6291079e-05 |
| 4,572 | The Unified Logging Infrastructure for Data Analytics at Twitter | 2012 | VLDB | 6.0760183e-05 |
| 4,857 | The "Big Data" Ecosystem at LinkedIn | 2013 | SIGMOD | 5.8736144e-05 |
| 6,131 | Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture | 2013 | SIGMOD | 5.1956688e-05 |
| 6,170 | PolarDB-IMCI: A Cloud-Native HTAP Database System at Alibaba | 2023 | SIGMOD | 5.171601e-05 |
| 6,246 | Taking Omid to the Clouds: Fast, Scalable Transactions for Real-Time Cloud Analytics | 2018 | VLDB | 5.1389356e-05 |
| 6,856 | Liquid: Unifying Nearline and Offline Big Data Integration | 2015 | CIDR | 4.9060615e-05 |
| 6,988 | CrocodileDB: Efficient Database Execution through Intelligent Deferment | 2020 | CIDR | 4.8718019e-05 |
| 8,658 | Modernization of Databases in the Cloud Era: Building Databases that Run like Legos | 2023 | VLDB | 4.4729338e-05 |
| 9,504 | Supporting Scalable Analytics with Latency Constraints | 2015 | VLDB | 4.3341665e-05 |
| 11,802 | Query-able Kafka: An agile data analytics pipeline for mobile wireless networks | 2017 | VLDB | 4.1945683e-05 |
| 11,805 | CarStream: An Industrial System of Big Data Processing for Internet-of-Vehicles | 2017 | VLDB | 4.1945683e-05 |
| 12,047 | Execution and Optimization of Continuous Queries with Cyclops | 2013 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 0 of 0 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Semantically Similar Papers