Database Paper Browser

Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores

Summary: Delta Lake provides ACID transactions over cloud object stores using a transaction log compacted into Apache Parquet format, which also enables time travel and faster metadata operations. The same design supports upserts, automatic data layout optimization, caching, and audit logs, and tables can be accessed from Spark, Hive, Presto, Redshift, and other engines. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 12219
Venue: VLDB
Year: 2020
Pagerank: 0.00017326979
Overall Rank: 746 | 94.82%
DOI: 10.14778/3415478.3415560

Incoming Citations (Sorted by Pagerank)

Showing 50 of 68 citing papers.

Rank Citing Paper Year Venue Pagerank
1,284 Amazon Redshift Re-invented 2022 SIGMOD 0.00012837822
1,377 Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics 2021 CIDR 0.00012296941
1,541 Symphony: Towards Natural Language Query Answering over Multi-modal Data Lakes 2023 CIDR 0.00011456579
2,473 Photon: A Fast Query Engine for Lakehouse Systems 2022 SIGMOD 8.7237281e-05
3,407 End-to-end Optimization of Machine Learning Prediction Queries 2022 SIGMOD 7.1295646e-05
3,844 The evolution of Amazon Redshift (extended abstract) 2021 VLDB 6.7076451e-05
4,514 An Empirical Evaluation of Columnar Storage Formats 2024 VLDB 6.1204636e-05
4,530 Big Metadata: When Metadata is Big Data 2021 VLDB 6.1075429e-05
4,870 Exploiting Cloud Object Storage for High-Performance Analytics 2023 VLDB 5.8613885e-05
5,318 Analyzing and Comparing Lakehouse Storage Systems 2023 CIDR 5.5715872e-05
5,441 Using Cloud Functions as Accelerator for Elastic Data Analytics 2023 SIGMOD 5.5028093e-05
5,476 Containerized Execution of UDFs: An Experimental Evaluation 2022 VLDB 5.4866534e-05
5,531 Presto: A Decade of SQL Analytics at Meta 2023 SIGMOD 5.4549499e-05
5,678 Cloud-Native Transactions and Analytics in SingleStore 2022 SIGMOD 5.3746593e-05
5,966 Cornus: Atomic Commit for a Cloud DBMS with Storage Disaggregation 2023 VLDB 5.2517881e-05
6,149 Crystal: A Unified Cache Storage System for Analytical Databases 2021 VLDB 5.1847534e-05
6,261 The Cosmos Big Data Platform at Microsoft: Over a Decade of Progress and a Decade to Look Forward 2021 VLDB 5.1350714e-05
6,340 Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine 2024 SIGMOD 5.1051018e-05
6,402 BigLake: BigQuery’s Evolution toward a Multi-Cloud Lakehouse 2024 SIGMOD 5.079818e-05
6,715 Shared Foundations: Modernizing Meta's Data Lakehouse 2023 CIDR 4.9509939e-05
6,972 Predicate Caching: Query-Driven Secondary Indexing for Cloud Data Warehouses 2024 SIGMOD 4.8785237e-05
7,059 Adaptive and Robust Query Execution for Lakehouses at Scale 2024 VLDB 4.8477825e-05
7,296 Multi-Tenant Cloud Data Services: State-of-the-Art, Challenges and Opportunities 2022 SIGMOD 4.7723197e-05
7,427 Selection Pushdown in Column Stores using Bit Manipulation Instructions 2023 SIGMOD 4.7327406e-05
7,469 Bullion: A Column Store for Machine Learning 2025 CIDR 4.7204398e-05
7,663 Optimizing Collections of Bloom Filters within a Space Budget 2024 VLDB 4.6857816e-05
7,814 Deep Lake: a Lakehouse for Deep Learning 2023 CIDR 4.6439001e-05
7,876 Two Birds With One Stone: Designing a Hybrid Cloud Storage Engine for HTAP 2024 VLDB 4.6298182e-05
7,907 Petabyte-Scale Row-Level Operations in Data Lakehouses 2024 VLDB 4.6205839e-05
8,519 Extending Polaris to Support Transactions 2024 SIGMOD 4.494088e-05
8,608 Unity Catalog: Open and Universal Governance for the Lakehouse and Beyond 2025 SIGMOD 4.4853979e-05
8,731 Columnar Formats for Schemaless LSM-based Document Stores 2022 VLDB 4.4577278e-05
8,758 Hyperspace: The Indexing Subsystem of Azure Synapse 2021 VLDB 4.456315e-05
8,785 Bringing Cloud-Native Storage to SAP IQ 2021 SIGMOD 4.4522556e-05
8,945 The Five-Minute Rule for the Cloud: Caching in Analytics Systems 2025 CIDR 4.4254423e-05
9,016 Making Data Engineering Declarative 2023 CIDR 4.4094312e-05
9,093 Databricks Lakeguard: Supporting Fine-grained Access Control and Multi-user Capabilities for Apache Spark Workloads 2025 SIGMOD 4.398149e-05
9,201 F3: The Open-Source Data File Format for the Future 2026 SIGMOD 4.3743539e-05
9,232 AutoComp: Automated Data Compaction for Log-Structured Tables in Data Lakes 2025 SIGMOD 4.3690661e-05
9,236 The Hopsworks Feature Store for Machine Learning 2024 SIGMOD 4.3690661e-05
9,689 LST-Bench: Benchmarking Log-Structured Tables in the Cloud 2024 SIGMOD 4.3043822e-05
9,699 The Story of AWS Glue 2023 VLDB 4.3018844e-05
9,701 Towards Functional Decomposition of Storage Formats 2025 CIDR 4.3008468e-05
9,904 TiQuE: Improving the Transactional Performance of Analytical Systems for True Hybrid Workloads 2023 VLDB 4.258022e-05
9,917 Check Out the Big Brain on BRAD: Simplifying Cloud Data Processing with Learned Automated Data Meshes 2023 VLDB 4.2561557e-05
10,196 PTO: A Workload-driven Predictive Table Optimizer for Lakehouse Systems 2026 SIGMOD 4.1945683e-05
10,238 TurboLynx: Schemaless Graph Engine Strikes Back for General-Purpose Analytics 2026 VLDB 4.1945683e-05
10,248 Active Data Lakes: Regaining Physical Data Independence Without Losing Interoperability 2026 VLDB 4.1945683e-05
10,385 Optimizing Block Skipping for High-Dimensional Data with Learned Adaptive Curve 2025 SIGMOD 4.1945683e-05
10,415 SAP HANA Cloud: Data Management for Modern Enterprise Applications 2025 SIGMOD 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 9 of 9 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers