Integrating Data Lake Tables
Summary: ALITE is the first scalable system to compute the Full Disjunction for integrating tables discovered from data lakes (via join, union, or related-table search). It relaxes the earlier assumptions of identical attribute names, completeness (no nulls), and acyclic joins; it empirically outperforms prior algorithms and supplies three real-data benchmarks. (summarized by gpt-5-mini on Feb 09 2026)
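For readers unfamiliar with the Full Disjunction, the following is a minimal sketch of its textbook definition: combine every connected, join-consistent set of tuples (at most one per table), pad missing attributes with nulls, and keep only tuples not subsumed by another. It is a naive, exponential-time illustration and not ALITE's algorithm. It assumes null-free inputs and identical attribute names across tables, which are exactly the assumptions the paper relaxes. The function names (`full_disjunction`, `_consistent`, `_connected`, `_subsumes`) and the sample tables are hypothetical.

```python
from itertools import combinations, product


def _consistent(rows):
    """True if every pair of rows agrees on all attributes they share."""
    return all(
        a[k] == b[k]
        for a, b in combinations(rows, 2)
        for k in a.keys() & b.keys()
    )


def _connected(rows):
    """True if the rows form one component when linked by shared attributes."""
    seen, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for j in range(len(rows)):
            if j not in seen and rows[i].keys() & rows[j].keys():
                seen.add(j)
                frontier.append(j)
    return len(seen) == len(rows)


def _subsumes(u, t):
    """True if u strictly subsumes t: u matches every non-null value of t."""
    return u != t and all(tv is None or tv == uv for uv, tv in zip(u, t))


def full_disjunction(tables):
    """Naive Full Disjunction over tables given as lists of dicts.

    Assumption: input rows contain no nulls and shared attributes
    already carry the same name in every table.
    """
    attrs = sorted({k for table in tables for row in table for k in row})
    candidates = set()
    # Try every choice of tables, then every choice of one row per chosen table.
    for size in range(1, len(tables) + 1):
        for chosen in combinations(tables, size):
            for rows in product(*chosen):
                if _consistent(rows) and _connected(rows):
                    merged = {}
                    for row in rows:
                        merged.update(row)
                    # Pad attributes that no chosen row supplies with None.
                    candidates.add(tuple(merged.get(a) for a in attrs))
    # Keep only maximal tuples (those not subsumed by another candidate).
    maximal = {t for t in candidates if not any(_subsumes(u, t) for u in candidates)}
    return attrs, maximal


# Hypothetical sample tables R(A, B), S(B, C), T(C, D).
r = [{"A": 1, "B": 2}]
s = [{"B": 2, "C": 3}, {"B": 5, "C": 6}]
t = [{"C": 6, "D": 7}, {"C": 9, "D": 9}]
attrs, result = full_disjunction([r, s, t])
```

On these sample tables, `result` holds three tuples over `(A, B, C, D)`: `(1, 2, 3, None)`, `(None, 5, 6, 7)`, and `(None, None, 9, 9)`. Each partial match survives only when no larger combination already contains its values.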
Incoming Citations (Sorted by Pagerank)
Showing 15 of 15 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 30 of 30 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,735 | Auto-Join: Joining Tables by Leveraging Transformations | 2017 | VLDB | 6.8061318e-05 |
| 11,312 | Amalur: Next-generation Data Integration in Data Lakes | 2022 | CIDR | 4.1945683e-05 |
| 1,187 | JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes | 2019 | SIGMOD | 0.00013443639 |
| 3,335 | DeepJoin: Joinable Table Discovery with Pre-trained Language Models | 2023 | VLDB | 7.2065006e-05 |
| 2,836 | Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning | 2023 | VLDB | 8.0443826e-05 |
| 1,178 | Table Union Search on Open Data | 2018 | VLDB | 0.00013468118 |
| 10,197 | Qualitative Join Discovery in Data Lakes using Examples | 2026 | SIGMOD | 4.1945683e-05 |
| 1,644 | Finding Related Tables in Data Lakes for Interactive Data Science | 2020 | SIGMOD | 0.00011041787 |
| 11,063 | Searching Data Lakes for Nested and Joined Data | 2024 | VLDB | 4.1945683e-05 |
| 8,116 | LakeBench: A Benchmark for Discovering Joinable and Unionable Tables in Data Lakes | 2024 | VLDB | 4.581507e-05 |