Analyzing and Comparing Lakehouse Storage Systems
Summary: A systematic comparative analysis of Delta Lake, Apache Hudi, and Apache Iceberg that exposes design tradeoffs in metadata, transactions, compaction, and read/write paths for lakehouse storage. It evaluates performance and features across the three systems along multiple axes and releases LHBench, an open benchmark for reproducible lakehouse design comparisons. (summarized by gpt-5-mini on Feb 09 2026)
Authors
1. Paras Jain
2. Peter Kraft
3. Conor Power
4. Tathagata Das
5. Ion Stoica
6. Matei Zaharia
Incoming Citations (Sorted by Pagerank)
Showing 10 of 10 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,495 | ClickHouse - Lightning Fast Analytics for Everyone | 2024 | VLDB | 6.1410277e-05 |
| 5,562 | A Deep Dive into Common Open Formats for Analytical DBMSs | 2023 | VLDB | 5.4331334e-05 |
| 7,879 | PDX: A Data Layout for Vector Similarity Search | 2025 | SIGMOD | 4.6292417e-05 |
| 7,907 | Petabyte-Scale Row-Level Operations in Data Lakehouses | 2024 | VLDB | 4.6205839e-05 |
| 9,689 | LST-Bench: Benchmarking Log-Structured Tables in the Cloud | 2024 | SIGMOD | 4.3043822e-05 |
| 10,196 | PTO: A Workload-driven Predictive Table Optimizer for Lakehouse Systems | 2026 | SIGMOD | 4.1945683e-05 |
| 10,571 | Quantum Data Management in the NISQ Era | 2025 | VLDB | 4.1945683e-05 |
| 10,736 | TreeCat: Standalone Catalog Engine for Large Data Systems | 2025 | VLDB | 4.1945683e-05 |
| 10,767 | The HANA Native Query Engine for Lakehouse Systems | 2025 | VLDB | 4.1945683e-05 |
| 10,777 | Magnus: A Holistic Approach to Data Management for Large-Scale Machine Learning Workloads | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 21 | C-Store: A Column-oriented DBMS | 2005 | VLDB | 0.00086087497 |
| 659 | The Making of TPC-DS | 2006 | VLDB | 0.00018500853 |
| 720 | Building a Database on S3 | 2008 | SIGMOD | 0.00017615431 |
| 746 | Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores | 2020 | VLDB | 0.00017326979 |
| 1,377 | Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics | 2021 | CIDR | 0.00012296941 |
| 2,473 | Photon: A Fast Query Engine for Lakehouse Systems | 2022 | SIGMOD | 8.7237281e-05 |
| 3,787 | White-box Compression: Learning and Exploiting Compact Table Representations | 2020 | CIDR | 6.7674374e-05 |
| 6,279 | Self-Organizing Data Containers | 2022 | CIDR | 5.1295282e-05 |