Deep Lake: a Lakehouse for Deep Learning
Summary: Deep Lake is an open-source lakehouse that stores images, video, annotations, and tabular data as tensors, bringing ACID, time-travel, and SQL semantics to non‑tabular deep‑learning datasets. It streams tensors efficiently to its Tensor Query Language, an in‑browser visualizer, and PyTorch, TensorFlow, and JAX, which preserves GPU utilization and lets it integrate with MLOps tooling. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
1. Sasun Hambardzumyan
2. Abhinav Tuli
3. Levon Ghukasyan
4. Fariz Rahman
5. Hrant Topchyan
6. David Isayan
7. Mark McQuade
8. Mikayel Harutyunyan
9. Tatevik Hakobyan
10. Ivo Stranic
11. Davit Buniatyan
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,415 | SAP HANA Cloud: Data Management for Modern Enterprise Applications | 2025 | SIGMOD | 4.1945683e-05 |
| 10,767 | The HANA Native Query Engine for Lakehouse Systems | 2025 | VLDB | 4.1945683e-05 |
| 10,777 | Magnus: A Holistic Approach to Data Management for Large-Scale Machine Learning Workloads | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 167 | The Snowflake Elastic Data Warehouse | 2016 | SIGMOD | 0.00039180521 |
| 426 | Amazon Redshift and the Case for Simpler Data Warehouses | 2015 | SIGMOD | 0.00023594359 |
| 495 | Milvus: A Purpose-Built Vector Data Management System | 2021 | SIGMOD | 0.00021767688 |
| 734 | The TileDB Array Data Storage Manager | 2017 | VLDB | 0.00017455248 |
| 746 | Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores | 2020 | VLDB | 0.00017326979 |
| 1,377 | Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics | 2021 | CIDR | 0.00012296941 |
| 2,473 | Photon: A Fast Query Engine for Lakehouse Systems | 2022 | SIGMOD | 8.7237281e-05 |
| 2,528 | Velox: Meta’s Unified Execution Engine | 2022 | VLDB | 8.59454e-05 |
| 3,254 | Query Processing on Tensor Computation Runtimes | 2022 | VLDB | 7.3161051e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,377 | Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics | 2021 | CIDR | 0.00012296941 |
| 5,318 | Analyzing and Comparing Lakehouse Storage Systems | 2023 | CIDR | 5.5715872e-05 |
| 7,061 | Serving Deep Learning Models with Deduplication from Relational Databases | 2022 | VLDB | 4.8463881e-05 |
| 11,732 | CoreKG: a Knowledge Lake Service | 2018 | VLDB | 4.1945683e-05 |
| 6,402 | BigLake: BigQuery’s Evolution toward a Multi-Cloud Lakehouse | 2024 | SIGMOD | 5.079818e-05 |
| 8,864 | Cerebro: A Layered Data Platform for Scalable Deep Learning | 2021 | CIDR | 4.4326439e-05 |
| 4,557 | Distributed Deep Learning on Data Systems: A Comparative Analysis of Approaches | 2021 | VLDB | 6.087611e-05 |
| 4,003 | Data Platform for Machine Learning | 2019 | SIGMOD | 6.54347e-05 |
| 3,335 | DeepJoin: Joinable Table Discovery with Pre-trained Language Models | 2023 | VLDB | 7.2065006e-05 |
| 13,171 | Reimagining Deep Learning Systems Through the Lens of Data Systems | 2024 | VLDB | - |