Presto: A Decade of SQL Analytics at Meta
Summary: Presto is Meta's open-source distributed SQL engine, used for a decade of exabyte-scale analytics and optimized for low latency on elastic containers. Upgrades include hierarchical caching, vectorized execution, materialized views, and Presto on Spark, which together unify analytics under one engine.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 6692
- Venue: SIGMOD
- Year: 2023
- Pagerank: 5.4549499e-05
- Overall Rank: 5,531 | 61.53%
- DOI: 10.1145/3589769
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 15 of 15 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 7,059 | Adaptive and Robust Query Execution for Lakehouses at Scale | 2024 | VLDB | 4.8477825e-05 |
| 8,617 | A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning | 2024 | VLDB | 4.4846425e-05 |
| 8,781 | Accelerate Distributed Joins with Predicate Transfer | 2025 | SIGMOD | 4.4534753e-05 |
| 8,834 | ByteCard: Enhancing ByteDance’s Data Warehouse with Learned Cardinality Estimation | 2024 | SIGMOD | 4.4394021e-05 |
| 8,856 | Composable Data Management: An Execution Overview | 2024 | VLDB | 4.4346165e-05 |
| 9,587 | Low Rank Learning for Offline Query Optimization | 2025 | SIGMOD | 4.3215645e-05 |
| 9,981 | Survivorship Bias in Industrial Database Workloads | 2026 | CIDR | 4.1945683e-05 |
| 10,196 | PTO: A Workload-driven Predictive Table Optimizer for Lakehouse Systems | 2026 | SIGMOD | 4.1945683e-05 |
| 10,491 | Intra-Query Runtime Elasticity for Cloud-Native Data Analysis | 2025 | SIGMOD | 4.1945683e-05 |
| 10,633 | AQETuner: Reliable Query-level Configuration Tuning for Analytical Query Engines | 2025 | VLDB | 4.1945683e-05 |
| 10,766 | Scribe: How Meta transports terabytes per second in real time | 2025 | VLDB | 4.1945683e-05 |
| 10,854 | LiquidCache: Efficient Pushdown Caching for Cloud-Native Data Analytics | 2025 | VLDB | 4.1945683e-05 |
| 10,872 | LASER: Buffer-Aware Learned Query Scheduling in Master-Standby Databases | 2025 | VLDB | 4.1945683e-05 |
| 11,084 | Presto’s History-based Query Optimizer | 2024 | VLDB | 4.1945683e-05 |
| 11,090 | Simple (yet Efficient) Function Authoring for Vectorized Engines | 2024 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 19 of 19 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 9 | Implementation Techniques For Main Memory Database Systems | 1984 | SIGMOD | 0.0014279444 |
| 66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801 |
| 109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983 |
| 167 | The Snowflake Elastic Data Warehouse | 2016 | SIGMOD | 0.00039180521 |
| 185 | DuckDB: an Embeddable Analytical Database | 2019 | SIGMOD | 0.00036538405 |
| 396 | One Trillion Edges: Graph Processing at Facebook-Scale | 2015 | VLDB | 0.00024424102 |
| 746 | Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores | 2020 | VLDB | 0.00017326979 |
| 789 | Cypher: An Evolving Query Language for Property Graphs | 2018 | SIGMOD | 0.00016634256 |
| 964 | G-CORE: A Core for Future Graph Query Languages | 2018 | SIGMOD | 0.0001497475 |
| 1,284 | Amazon Redshift Re-invented | 2022 | SIGMOD | 0.00012837822 |
| 1,943 | Procella: Unifying serving and analytical data at YouTube | 2019 | VLDB | 0.00010012569 |
| 2,062 | Dremel: A Decade of Interactive SQL Analysis at Web Scale | 2020 | VLDB | 9.6481955e-05 |
| 2,473 | Photon: A Fast Query Engine for Lakehouse Systems | 2022 | SIGMOD | 8.7237281e-05 |
| 2,505 | Graph Pattern Matching in GQL and SQL/PGQ | 2022 | SIGMOD | 8.634551e-05 |
| 2,528 | Velox: Meta’s Unified Execution Engine | 2022 | VLDB | 8.59454e-05 |
| 3,355 | F1 Query: Declarative Querying at Scale | 2018 | VLDB | 7.1829142e-05 |
| 4,688 | Alibaba Hologres: A Cloud-Native Service for Hybrid Serving/Analytical Processing | 2020 | VLDB | 5.9980609e-05 |
| 6,715 | Shared Foundations: Modernizing Meta's Data Lakehouse | 2023 | CIDR | 4.9509939e-05 |
| 8,357 | Cubrick: Indexing Millions of Records per Second for Interactive Analytics | 2016 | VLDB | 4.5373339e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 9,111 | Meta's Next-generation Realtime Monitoring and Analytics Platform | 2022 | VLDB | 4.3942367e-05 |
| 8,781 | Accelerate Distributed Joins with Predicate Transfer | 2025 | SIGMOD | 4.4534753e-05 |
| 10,196 | PTO: A Workload-driven Predictive Table Optimizer for Lakehouse Systems | 2026 | SIGMOD | 4.1945683e-05 |
| 6,715 | Shared Foundations: Modernizing Meta's Data Lakehouse | 2023 | CIDR | 4.9509939e-05 |
| 9,689 | LST-Bench: Benchmarking Log-Structured Tables in the Cloud | 2024 | SIGMOD | 4.3043822e-05 |
| 11,998 | Pronto: A Software-Defined Networking based System for Performance Management of Analytical Queries on Distributed Data Stores | 2014 | VLDB | 4.1945683e-05 |
| 2,528 | Velox: Meta’s Unified Execution Engine | 2022 | VLDB | 8.59454e-05 |
| 4,804 | Efficient Deep Learning Pipelines for Accurate Cost Estimations Over Large Scale Query Workload | 2021 | SIGMOD | 5.910467e-05 |
| 10,491 | Intra-Query Runtime Elasticity for Cloud-Native Data Analysis | 2025 | SIGMOD | 4.1945683e-05 |
| 11,084 | Presto’s History-based Query Optimizer | 2024 | VLDB | 4.1945683e-05 |