Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores
Summary: Delta Lake provides ACID transactions over cloud object stores using a transaction log that is compacted into Apache Parquet format, which enables time travel and fast metadata operations. Upserts, data layout optimization, caching, and audit logs, together with access from multiple engines (Spark, Hive, Presto, Redshift), support analytics across the ecosystem. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Michael Armbrust
2. Tathagata Das
3. Liwen Sun
4. Burak Yavuz
5. Shixiong Zhu
6. Mukul Murthy
7. Joseph Torres
8. Herman van Hovell
9. Adrian Ionescu
10. Alicja Łuszczak
11. Michał Świtakowski
12. Michał Szafrański
13. Xiao Li
14. Takuya Ueshin
15. Mostafa Mokhtar
16. Peter Boncz
17. Ali Ghodsi
18. Sameer Paranjpye
19. Pieter Senster
20. Reynold Xin
21. Matei Zaharia
Incoming Citations (Sorted by Pagerank)
Showing 50 of 68 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 21 | C-Store: A Column-oriented DBMS | 2005 | VLDB | 0.00086087497 |
| 66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801 |
| 156 | Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases | 2017 | SIGMOD | 0.00040504295 |
| 167 | The Snowflake Elastic Data Warehouse | 2016 | SIGMOD | 0.00039180521 |
| 664 | Relational Cloud: A Database-as-a-Service for the Cloud | 2011 | CIDR | 0.00018465843 |
| 720 | Building a Database on S3 | 2008 | SIGMOD | 0.00017615431 |
| 1,548 | Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark | 2018 | SIGMOD | 0.00011431383 |
| 2,016 | Bolt-on Causal Consistency | 2013 | SIGMOD | 0.00009789465 |
| 3,058 | Rethinking Data-Intensive Science Using Scalable Analytics Systems | 2015 | SIGMOD | 0.000076410159 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,248 | Active Data Lakes: Regaining Physical Data Independence Without Losing Interoperability | 2026 | VLDB | 4.1945683e-05 |
| 6,402 | BigLake: BigQuery’s Evolution toward a Multi-Cloud Lakehouse | 2024 | SIGMOD | 5.079818e-05 |
| 7,059 | Adaptive and Robust Query Execution for Lakehouses at Scale | 2024 | VLDB | 4.8477825e-05 |
| 3,644 | BtrBlocks: Efficient Columnar Compression for Data Lakes | 2023 | SIGMOD | 6.8854928e-05 |
| 9,689 | LST-Bench: Benchmarking Log-Structured Tables in the Cloud | 2024 | SIGMOD | 4.3043822e-05 |
| 9,232 | AutoComp: Automated Data Compaction for Log-Structured Tables in Data Lakes | 2025 | SIGMOD | 4.3690661e-05 |
| 13,124 | Delta Sharing: An Open Protocol for Cross-Platform Data Sharing | 2025 | VLDB | - |
| 7,907 | Petabyte-Scale Row-Level Operations in Data Lakehouses | 2024 | VLDB | 4.6205839e-05 |
| 11,545 | Pixels: Multiversion Wide Table Store for Data Lakes | 2020 | CIDR | 4.1945683e-05 |
| 5,318 | Analyzing and Comparing Lakehouse Storage Systems | 2023 | CIDR | 5.5715872e-05 |