| 939 | Data Lake Management: Challenges and Opportunities | 2019 | VLDB | 0.00015187344 |
| 1,178 | Table Union Search on Open Data | 2018 | VLDB | 0.00013468118 |
| 1,187 | JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes | 2019 | SIGMOD | 0.00013443639 |
| 1,644 | Finding Related Tables in Data Lakes for Interactive Data Science | 2020 | SIGMOD | 0.00011041787 |
| 2,730 | Open Data Integration | 2018 | VLDB | 8.2126735e-05 |
| 2,836 | Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning | 2023 | VLDB | 8.0443826e-05 |
| 3,000 | SANTOS: Relationship-based Semantic Table Union Search | 2023 | SIGMOD | 7.7462128e-05 |
| 3,335 | DeepJoin: Joinable Table Discovery with Pre-trained Language Models | 2023 | VLDB | 7.2065006e-05 |
| 3,358 | Organizing Data Lakes for Navigation | 2020 | SIGMOD | 7.1784949e-05 |
| 3,400 | ELPIS: Graph-Based Similarity Search for Scalable Data Science | 2023 | VLDB | 7.1405533e-05 |
| 3,824 | Correlation Sketches for Approximate Join-Correlation Queries | 2021 | SIGMOD | 6.7260705e-05 |
| 3,963 | Pytheas: Pattern-based Table Discovery in CSV Files | 2020 | VLDB | 6.5840643e-05 |
| 4,278 | Similarity Query Processing for High-Dimensional Data | 2020 | VLDB | 6.2953764e-05 |
| 4,967 | Leva: Boosting Machine Learning Performance with Relational Embedding Data Augmentation | 2022 | SIGMOD | 5.7956612e-05 |
| 5,024 | Towards Distribution-aware Query Answering in Data Markets | 2022 | VLDB | 5.7535043e-05 |
| 5,200 | SetSketch: Filling the Gap between MinHash and HyperLogLog | 2021 | VLDB | 5.6337581e-05 |
| 5,789 | Interactive Navigation of Open Data Linkages | 2017 | VLDB | 5.3269741e-05 |
| 5,794 | Discovering Related Data At Scale | 2021 | VLDB | 5.3245122e-05 |
| 5,976 | Responsible Data Integration: Next-generation Challenges | 2022 | SIGMOD | 5.245976e-05 |
| 6,092 | Observatory: Characterizing Embeddings of Relational Tables | 2024 | VLDB | 5.2138566e-05 |
| 6,270 | MATE: Multi-Attribute Table Extraction | 2022 | VLDB | 5.1337451e-05 |
| 6,438 | RONIN: Data Lake Exploration | 2021 | VLDB | 5.0620163e-05 |
| 6,798 | Searching Web Data using MinHash LSH | 2016 | SIGMOD | 4.9235251e-05 |
| 7,061 | Serving Deep Learning Models with Deduplication from Relational Databases | 2022 | VLDB | 4.8463881e-05 |
| 7,582 | LakeCompass: An End-to-End System for Data Maintenance, Search and Analysis in Data Lakes | 2024 | VLDB | 4.7046388e-05 |
| 7,643 | Cross Modal Data Discovery over Structured and Unstructured Data Lakes | 2023 | VLDB | 4.6901105e-05 |
| 8,116 | LakeBench: A Benchmark for Discovering Joinable and Unionable Tables in Data Lakes | 2024 | VLDB | 4.581507e-05 |
| 8,618 | Nexus: Correlation Discovery over Collections of Spatio-Temporal Tabular Data | 2024 | SIGMOD | 4.4838259e-05 |
| 8,717 | Scotch: Generating FPGA-Accelerators for Sketching at Line Rate | 2021 | VLDB | 4.4614498e-05 |
| 8,729 | OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance From Database Query Event Logs | 2023 | VLDB | 4.4582221e-05 |
| 8,910 | R2D2: Reducing Redundancy and Duplication in Data Lakes | 2023 | SIGMOD | 4.427232e-05 |
| 9,211 | Optimistic Data Parallelism for FPGA-Accelerated Sketching | 2023 | VLDB | 4.3726511e-05 |
| 10,197 | Qualitative Join Discovery in Data Lakes using Examples | 2026 | SIGMOD | 4.1945683e-05 |
| 10,540 | Discovering Approximate Inclusion Dependencies | 2025 | VLDB | 4.1945683e-05 |
| 10,685 | LakeVisage: Towards Scalable, Flexible and Interactive Visualization Recommendation for Data Discovery over Data Lakes | 2025 | VLDB | 4.1945683e-05 |
| 10,836 | Data Discovery in Data Lakes: Operations, Indexes, Systems | 2025 | VLDB | 4.1945683e-05 |
| 10,951 | Determining the Largest Overlap between Tables | 2024 | SIGMOD | 4.1945683e-05 |
| 11,025 | Sampling Methods for Inner Product Sketching | 2024 | VLDB | 4.1945683e-05 |
| 11,063 | Searching Data Lakes for Nested and Joined Data | 2024 | VLDB | 4.1945683e-05 |
| 11,168 | Weighted Minwise Hashing Beats Linear Sketching for Inner Product Estimation | 2023 | PODS | 4.1945683e-05 |
| 11,183 | Matching Roles from Temporal Data | 2023 | SIGMOD | 4.1945683e-05 |