The Challenge of Building Effective Data Lakes
Summary: Data lakes offer scalable cloud storage for analytics but lack context, quality, and discoverability, which erodes trust in analytics. The paper emphasizes real-world implementation patterns and the need for automated, learning-based discovery at enterprise scale. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
- Awez Syed
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 0 of 0 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 13,503 | Is It Still "Big Data" If It Fits In My Pocket? | 2011 | VLDB | - |
| 9,973 | End-to-End Declarative Data Analytics: Co-designing Engines, Interfaces, and Cloud Infrastructure | 2026 | CIDR | 4.1945683e-05 |
| 13,244 | Deep Data Integration | 2021 | SIGMOD | - |
| 3,281 | Constance: An Intelligent Data Lake System | 2016 | SIGMOD | 7.2823287e-05 |
| 6,165 | When the Web is your Data Lake: Creating a Search Engine for Datasets on the Web | 2020 | SIGMOD | 5.1728052e-05 |
| 10,836 | Data Discovery in Data Lakes: Operations, Indexes, Systems | 2025 | VLDB | 4.1945683e-05 |
| 1,377 | Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics | 2021 | CIDR | 0.00012296941 |
| 9,555 | Bringing the Operational and Analytical Worlds Together with Lakebase | 2025 | VLDB | 4.3254416e-05 |
| 1,833 | Data Wrangling: The Challenging Journey from the Wild to the Lake | 2015 | CIDR | 0.00010378976 |
| 939 | Data Lake Management: Challenges and Opportunities | 2019 | VLDB | 0.00015187344 |