Data Discovery in Data Lakes: Operations, Indexes, Systems
Summary: A tutorial-style synthesis of data discovery in data lakes. It catalogs operations, index designs, and system architectures, with emphasis on indexing structures and scalable algorithms for join and union discovery, and it identifies tradeoffs and open challenges in holistic systems, evaluation methodologies, and federated discovery. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Outgoing Citations (Sorted by Pagerank)
Showing 16 of 16 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 12,114 | Database Techniques for Linked Data Management | 2012 | SIGMOD | 4.1945683e-05 |
| 7,643 | Cross Modal Data Discovery over Structured and Unstructured Data Lakes | 2023 | VLDB | 4.6901105e-05 |
| 7,582 | LakeCompass: An End-to-End System for Data Maintenance, Search and Analysis in Data Lakes | 2024 | VLDB | 4.7046388e-05 |
| 11,063 | Searching Data Lakes for Nested and Joined Data | 2024 | VLDB | 4.1945683e-05 |
| 8,116 | LakeBench: A Benchmark for Discovering Joinable and Unionable Tables in Data Lakes | 2024 | VLDB | 4.581507e-05 |
| 8,079 | High-Dimensional Index Structures: Database Support for Next Decade's Applications | 1998 | SIGMOD | 4.5914115e-05 |
| 10,197 | Qualitative Join Discovery in Data Lakes using Examples | 2026 | SIGMOD | 4.1945683e-05 |
| 13,277 | The Challenge of Building Effective Data Lakes | 2020 | SIGMOD | - |
| 1,552 | Overview of Data Exploration Techniques | 2015 | SIGMOD | 0.00011408814 |
| 939 | Data Lake Management: Challenges and Opportunities | 2019 | VLDB | 0.00015187344 |