SEMA-JOIN: Joining Semantically-Related Tables Using Big Table Corpora
Summary: SEMA-JOIN automates semantic joins beyond equi-joins by mining a big table corpus (>100M tables) to learn row- and column-level correlations. Join discovery is framed as maximizing aggregate correlation with a linear-program relaxation and a 2-approximation, yielding high precision on public and enterprise data. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Yeye He
2. Kris Ganjam
3. Xu Chu
Incoming Citations (Sorted by Pagerank)
Showing 16 of 16 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Overall Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1 | Access Path Selection in a Relational Database Management System | 1979 | SIGMOD | 0.0040449103 |
| 9 | Implementation Techniques For Main Memory Database Systems | 1984 | SIGMOD | 0.0014279444 |
| 62 | Freebase: A Collaboratively Created Graph Database For Structuring Human Knowledge | 2008 | SIGMOD | 0.0006429466 |
| 420 | InfoGather: Entity Augmentation and Attribute Discovery By Holistic Matching with Web Tables | 2012 | SIGMOD | 0.00023719065 |
| 518 | Data Integration for the Relational Web | 2009 | VLDB | 0.00021158934 |
| 3,328 | Multi-column Substring Matching for Database Schema Translation | 2006 | VLDB | 7.2174278e-05 |
| 3,992 | Discovering Linkage Points over Web Data | 2013 | VLDB | 6.5544834e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,499 | Synthesizing Mapping Relationships Using Table Corpus | 2017 | SIGMOD | 4.4975851e-05 |
| 266 | Efficient Exact Set-Similarity Joins | 2006 | VLDB | 0.00029718727 |
| 9,563 | Towards a Unified Framework for String Similarity Joins | 2019 | VLDB | 4.3254416e-05 |
| 3,490 | Leveraging Set Relations in Exact Set Similarity Join | 2017 | VLDB | 7.0465856e-05 |
| 4,119 | A System for Semantic Query Optimization | 1987 | SIGMOD | 6.4365852e-05 |
| 3,335 | DeepJoin: Joinable Table Discovery with Pre-trained Language Models | 2023 | VLDB | 7.2065006e-05 |
| 3,824 | Correlation Sketches for Approximate Join-Correlation Queries | 2021 | SIGMOD | 6.7260705e-05 |
| 8,899 | Fast Approximate Similarity Join in Vector Databases | 2025 | SIGMOD | 4.427232e-05 |
| 10,197 | Qualitative Join Discovery in Data Lakes using Examples | 2026 | SIGMOD | 4.1945683e-05 |
| 3,735 | Auto-Join: Joining Tables by Leveraging Transformations | 2017 | VLDB | 6.8061318e-05 |