Real-time Data Infrastructure at Uber
Summary: Describes Uber's real-time data stack, which handles petabyte-scale ingestion and processing on open-source software with Uber-specific refinements to meet latency and scale requirements. Highlights three scaling challenges per component and covers real-time use cases (incentives, fraud, ML) along with key lessons.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 6241
- Venue: SIGMOD
- Year: 2021
- Pagerank: 4.1945683e-05
- Overall Rank: 11,485 | 20.11%
- DOI: 10.1145/3448016.3457552
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Outgoing Citations (Sorted by Pagerank)
Showing 16 of 16 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801 |
| 488 | TiDB: A Raft-based HTAP Database | 2020 | VLDB | 0.000220409 |
| 538 | The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing | 2015 | VLDB | 0.00020678804 |
| 824 | Twitter Heron: Stream Processing at Scale | 2015 | SIGMOD | 0.0001623129 |
| 890 | F1 – The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business | 2012 | SIGMOD | 0.00015570935 |
| 1,098 | Trill: A High-Performance Incremental Query Processor for Diverse Analytics | 2015 | VLDB | 0.00014114442 |
| 1,286 | Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams | 2013 | SIGMOD | 0.0001282373 |
| 1,507 | BatchDB: Efficient Isolated Execution of Hybrid OLTP+OLAP Workloads for Interactive Applications | 2017 | SIGMOD | 0.00011617967 |
| 1,588 | Druid: A Real-time Analytical Data Store | 2014 | SIGMOD | 0.00011239313 |
| 1,613 | Realtime Data Processing at Facebook | 2016 | SIGMOD | 0.00011140777 |
| 1,943 | Procella: Unifying serving and analytical data at YouTube | 2019 | VLDB | 0.00010012569 |
| 2,062 | Dremel: A Decade of Interactive SQL Analysis at Web Scale | 2020 | VLDB | 9.6481955e-05 |
| 3,768 | F1 Lightning: HTAP as a Service | 2020 | VLDB | 6.7782774e-05 |
| 4,688 | Alibaba Hologres: A Cloud-Native Service for Hybrid Serving/Analytical Processing | 2020 | VLDB | 5.9980609e-05 |
| 4,767 | Pinot: Realtime OLAP for 530 Million Users | 2018 | SIGMOD | 5.9364731e-05 |
| 6,242 | Helios: Hyperscale Indexing for the Cloud & Edge | 2020 | VLDB | 5.1408379e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,789 | Ursa: A Lakehouse-Native Data Streaming Engine for Kafka | 2025 | VLDB | 4.1945683e-05 |
| 12,005 | Design and Implementation of a Real-Time Interactive Analytics System for Large Spatio-Temporal Data | 2014 | VLDB | 4.1945683e-05 |
| 11,635 | Automated Performance Management for the Big Data Stack | 2019 | CIDR | 4.1945683e-05 |
| 2,844 | Towards Scalable Real-time Analytics: An Architecture for Scale-out of OLxP Workloads | 2015 | VLDB | 8.0243849e-05 |
| 6,131 | Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture | 2013 | SIGMOD | 5.1956688e-05 |
| 3,556 | Solving Big Data Challenges for Enterprise Application Performance Management | 2012 | VLDB | 6.9770145e-05 |
| 9,504 | Supporting Scalable Analytics with Latency Constraints | 2015 | VLDB | 4.3341665e-05 |
| 3,535 | Scaling Spark in the Real World: Performance and Usability | 2015 | VLDB | 6.9992495e-05 |
| 2,658 | Data Warehousing and Analytics Infrastructure at Facebook | 2010 | SIGMOD | 8.3607429e-05 |
| 1,613 | Realtime Data Processing at Facebook | 2016 | SIGMOD | 0.00011140777 |