Shared Foundations: Modernizing Meta's Data Lakehouse
Summary: Meta's Shared Foundations refactors a fragmented lakehouse into composable components with common APIs to eliminate duplication. Outcome: a unified stack that improves performance, developer velocity, and UX for large-scale data processing. (summarized by gpt-5-mini on Feb 09 2026)
Authors
1. Biswapesh Chattopadhyay
2. Pedro Pedreira
3. Sameer Agarwal
4. Yutian "James" Sun
5. Suketu Vakharia
6. Peng Li
7. Weiran Liu
8. Sundaram Narayanan
Incoming Citations (Sorted by Pagerank)
Showing 9 of 9 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,239 | The Composable Data Management System Manifesto | 2023 | VLDB | 6.3318452e-05 |
| 4,514 | An Empirical Evaluation of Columnar Storage Formats | 2024 | VLDB | 6.1204636e-05 |
| 5,531 | Presto: A Decade of SQL Analytics at Meta | 2023 | SIGMOD | 5.4549499e-05 |
| 7,469 | Bullion: A Column Store for Machine Learning | 2025 | CIDR | 4.7204398e-05 |
| 8,856 | Composable Data Management: An Execution Overview | 2024 | VLDB | 4.4346165e-05 |
| 9,455 | GraphScope Flex: LEGO-like Graph Computing Stack | 2024 | SIGMOD | 4.3388007e-05 |
| 9,760 | Adaptive data transformations for QaaS | 2025 | CIDR | 4.2856106e-05 |
| 10,777 | Magnus: A Holistic Approach to Data Management for Large-Scale Machine Learning Workloads | 2025 | VLDB | 4.1945683e-05 |
| 11,090 | Simple (yet Efficient) Function Authoring for Vectorized Engines | 2024 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801 |
| 167 | The Snowflake Elastic Data Warehouse | 2016 | SIGMOD | 0.00039180521 |
| 746 | Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores | 2020 | VLDB | 0.00017326979 |
| 1,377 | Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics | 2021 | CIDR | 0.00012296941 |
| 1,487 | Scuba: Diving into Data at Facebook | 2013 | VLDB | 0.00011701099 |
| 1,613 | Realtime Data Processing at Facebook | 2016 | SIGMOD | 0.00011140777 |
| 2,249 | Orca: A Modular Query Optimizer Architecture for Big Data | 2014 | SIGMOD | 9.2034693e-05 |
| 2,528 | Velox: Meta’s Unified Execution Engine | 2022 | VLDB | 8.59454e-05 |
| 8,357 | Cubrick: Indexing Millions of Records per Second for Interactive Analytics | 2016 | VLDB | 4.5373339e-05 |