Database Paper Browser

SEMA-JOIN: Joining Semantically-Related Tables Using Big Table Corpora

Summary: SEMA-JOIN automates semantic joins that go beyond equi-joins by mining a big table corpus (>100M tables) to learn row- and column-level correlations. It frames join discovery as maximizing aggregate correlation and solves it with a linear-program relaxation and a 2-approximation algorithm, reporting high precision on public and enterprise data. (summarized by gpt-5-nano on Feb 09 2026)
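
To make the idea of corpus-mined correlation concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it scores value pairs with pointwise mutual information (PMI) over row-level co-occurrence in a tiny made-up corpus, and it picks the best match for each left value greedily instead of solving a linear-program relaxation. The function names (build_stats, pmi, greedy_semantic_join) and the toy corpus are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def build_stats(corpus):
    """Count the rows each value appears in, and the rows each value pair shares."""
    value_counts = Counter()
    pair_counts = Counter()
    total_rows = 0
    for table in corpus:
        for row in table:
            values = set(row)
            total_rows += 1
            value_counts.update(values)
            pair_counts.update(
                frozenset(pair) for pair in combinations(sorted(values), 2)
            )
    return value_counts, pair_counts, total_rows


def pmi(a, b, stats):
    """Pointwise mutual information of two values co-occurring in a corpus row."""
    value_counts, pair_counts, total_rows = stats
    joint = pair_counts[frozenset((a, b))]
    if joint == 0:
        return float("-inf")  # never seen together in any row
    return math.log(joint * total_rows / (value_counts[a] * value_counts[b]))


def greedy_semantic_join(left, right, stats):
    """Map each left value to the right value with the highest positive PMI.

    Assumption: a per-value greedy choice stands in for the global
    optimization described in the summary.
    """
    if not right:
        return {}
    result = {}
    for l_val in left:
        best = max(right, key=lambda r_val: pmi(l_val, r_val, stats))
        if pmi(l_val, best, stats) > 0:
            result[l_val] = best
    return result


# Hypothetical toy corpus: each table is a list of rows, each row a tuple of cells.
corpus = [
    [("Washington", "WA"), ("Oregon", "OR"), ("Texas", "TX")],
    [("WA", "Washington"), ("OR", "Oregon")],
    [("Texas", "TX", "Austin"), ("Washington", "WA", "Olympia")],
]
stats = build_stats(corpus)
matches = greedy_semantic_join(
    ["Washington", "Oregon", "Texas"], ["TX", "WA", "OR"], stats
)
print(matches)  # {'Washington': 'WA', 'Oregon': 'OR', 'Texas': 'TX'}
```

State names and abbreviations share no characters that an equi-join could match on, yet they pair up here because they co-occur in corpus rows. A greedy choice can assign the same right value to several left values, which is one reason a global formulation such as the one in the summary is preferable.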

Paper ID: 11014
Venue: VLDB
Year: 2015
Pagerank: 5.8768452e-05
Overall Rank: 4,850 | 66.27%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 16 of 16 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
1,469 | BlinkFill: Semi-supervised Programming By Example for Syntactic String Transformations | 2016 | VLDB | 0.00011836053
2,158 | Uni-Detect: A Unified Approach to Automated Error Detection in Tables | 2019 | SIGMOD | 9.4141354e-05
2,506 | Auto-Detect: Data-Driven Error Detection in Tables | 2018 | SIGMOD | 8.6335464e-05
3,252 | Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks | 2020 | SIGMOD | 7.3178277e-05
3,335 | DeepJoin: Joinable Table Discovery with Pre-trained Language Models | 2023 | VLDB | 7.2065006e-05
3,735 | Auto-Join: Joining Tables by Leveraging Transformations | 2017 | VLDB | 6.8061318e-05
5,096 | Auto-Transform: Learning-to-Transform by Patterns | 2020 | VLDB | 5.7011825e-05
5,434 | Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples | 2021 | SIGMOD | 5.5045402e-05
5,691 | Putting Things into Context: Rich Explanations for Query Answers using Join Graphs | 2021 | SIGMOD | 5.3684557e-05
6,800 | DTT: An Example-Driven Tabular Transformer for Joinability by Leveraging Large Language Models | 2024 | SIGMOD | 4.9231471e-05
8,499 | Synthesizing Mapping Relationships Using Table Corpus | 2017 | SIGMOD | 4.4975851e-05
9,142 | Design and Analysis of a Processing-in-DIMM Join Algorithm: A Case Study with UPMEM DIMMs | 2023 | SIGMOD | 4.3853149e-05
9,399 | TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations | 2025 | VLDB | 4.3441378e-05
9,490 | Auto-BI: Automatically Build BI-Models Leveraging Local Join Prediction and Global Schema Graph | 2023 | VLDB | 4.3341665e-05
10,598 | Auto-Prep: Holistic Prediction of Data Preparation Steps for Self-Service Business Intelligence | 2025 | VLDB | 4.1945683e-05
11,087 | Dealing with Acronyms, Abbreviations, and Typos in Real-World Entity Matching | 2024 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 7 of 7 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers