Auto-Join: Joining Tables by Leveraging Transformations
Summary: Auto-Join searches a space of operators to synthesize a transformation that makes join columns stored in different representations equi-joinable. An optimal sampling strategy lets the search scale to large tables while still finding the join with high probability. The approach is validated on web and enterprise tables.
(summarized by gpt-5-nano on Feb 09 2026)
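To make the idea in the summary concrete, here is a minimal toy sketch of searching over string operators for a transformation that makes two columns equi-joinable. It is not the paper's algorithm: the operator set (`OPERATORS`, `split_take`), the brute-force enumeration, the hit-count score, and the names `synthesize` and `auto_join` are all assumptions made for illustration, and the sampling strategy mentioned in the summary is omitted entirely.

```python
from itertools import product

def split_take(sep, idx):
    """Hypothetical operator: split on `sep`, keep piece `idx`, trim spaces."""
    def op(s):
        parts = s.split(sep)
        if -len(parts) <= idx < len(parts):
            return parts[idx].strip()
        return None  # operator does not apply to this value
    op.__name__ = f"split({sep!r})[{idx}]"
    return op

# Assumed, deliberately tiny operator space (not the paper's operator set).
OPERATORS = [str.lower, str.upper] + [
    split_take(sep, idx) for sep in (",", " ", "@") for idx in (0, 1, -1)
]

def run(program, value):
    """Apply a sequence of operators left to right; None means failure."""
    for op in program:
        if value is None:
            return None
        value = op(value)
    return value

def synthesize(source, target, max_depth=2):
    """Brute-force search for the operator sequence whose output on the
    source column matches the most keys in the target column."""
    keys = set(target)
    best, best_hits = (), -1
    for depth in range(max_depth + 1):
        for program in product(OPERATORS, repeat=depth):
            hits = sum(run(program, s) in keys for s in source)
            if hits > best_hits:  # strict: shorter/earlier programs win ties
                best, best_hits = program, hits
    return best, best_hits

def auto_join(source, target):
    """Equi-join the transformed source values against the target keys."""
    program, _ = synthesize(source, target)
    keys = set(target)
    out = []
    for s in source:
        key = run(program, s)
        if key in keys:
            out.append((s, key))
    return out

# Usage: two columns that name the same entities in different formats.
source = ["Obama, Barack", "Bush, George", "Clinton, Bill"]
target = ["obama", "bush", "clinton"]
program, hits = synthesize(source, target)
print([op.__name__ for op in program], hits)
pairs = auto_join(source, target)
print(pairs)
```

The exhaustive enumeration here grows exponentially with `max_depth`, so it only works for toy inputs; the summary indicates the paper addresses scale through its search and sampling design, which this sketch does not attempt to reproduce.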
- Paper ID: 11390
- Venue: VLDB
- Year: 2017
- Pagerank: 6.8061318e-05
- Overall Rank: 3,735 | 74.02%
- DOI:
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 25 of 25 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,914 | Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks | 2020 | SIGMOD | 0.00010109102 |
| 2,158 | Uni-Detect: A Unified Approach to Automated Error Detection in Tables | 2019 | SIGMOD | 9.4141354e-05 |
| 2,587 | Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks | 2024 | SIGMOD | 8.4924618e-05 |
| 2,730 | Open Data Integration | 2018 | VLDB | 8.2126735e-05 |
| 3,252 | Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks | 2020 | SIGMOD | 7.3178277e-05 |
| 3,335 | DeepJoin: Joinable Table Discovery with Pre-trained Language Models | 2023 | VLDB | 7.2065006e-05 |
| 3,478 | Transform-Data-by-Example (TDE): An Extensible Search Engine for Data Transformations | 2018 | VLDB | 7.054159e-05 |
| 4,859 | Integrating Data Lake Tables | 2023 | VLDB | 5.8732433e-05 |
| 5,096 | Auto-Transform: Learning-to-Transform by Patterns | 2020 | VLDB | 5.7011825e-05 |
| 5,275 | Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples | 2023 | VLDB | 5.5905507e-05 |
| 5,280 | Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V | 2023 | VLDB | 5.5896735e-05 |
| 5,383 | Auto-Pipeline: Synthesizing Complex Data Pipelines By-Target Using Reinforcement Learning and Search | 2021 | VLDB | 5.5393038e-05 |
| 5,434 | Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples | 2021 | SIGMOD | 5.5045402e-05 |
| 5,691 | Putting Things into Context: Rich Explanations for Query Answers using Join Graphs | 2021 | SIGMOD | 5.3684557e-05 |
| 6,270 | MATE: Multi-Attribute Table Extraction | 2022 | VLDB | 5.1337451e-05 |
| 6,800 | DTT: An Example-Driven Tabular Transformer for Joinability by Leveraging Large Language Models | 2024 | SIGMOD | 4.9231471e-05 |
| 7,643 | Cross Modal Data Discovery over Structured and Unstructured Data Lakes | 2023 | VLDB | 4.6901105e-05 |
| 7,858 | ConnectionLens: Finding Connections Across Heterogeneous Data Sources | 2018 | VLDB | 4.6342491e-05 |
| 9,142 | Design and Analysis of a Processing-in-DIMM Join Algorithm: A Case Study with UPMEM DIMMs | 2023 | SIGMOD | 4.3853149e-05 |
| 9,371 | Auto-Formula: Recommend Formulas in Spreadsheets using Contrastive Learning for Table Representations | 2024 | SIGMOD | 4.3480692e-05 |
| 9,399 | TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations | 2025 | VLDB | 4.3441378e-05 |
| 9,490 | Auto-BI: Automatically Build BI-Models Leveraging Local Join Prediction and Global Schema Graph | 2023 | VLDB | 4.3341665e-05 |
| 10,595 | Optimized Batch Prompting for Cost-effective LLMs | 2025 | VLDB | 4.1945683e-05 |
| 10,598 | Auto-Prep: Holistic Prediction of Data Preparation Steps for Self-Service Business Intelligence | 2025 | VLDB | 4.1945683e-05 |
| 11,343 | SPINE: Scaling up Programming-by-Negative-Example for String Filtering and Transformation | 2022 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,740 | String Similarity Joins: An Experimental Evaluation | 2014 | VLDB | 8.1980628e-05 |
| 1,572 | Reverse Engineering Complex Join Queries | 2013 | SIGMOD | 0.00011298251 |
| 9,490 | Auto-BI: Automatically Build BI-Models Leveraging Local Join Prediction and Global Schema Graph | 2023 | VLDB | 4.3341665e-05 |
| 3,335 | DeepJoin: Joinable Table Discovery with Pre-trained Language Models | 2023 | VLDB | 7.2065006e-05 |
| 5,096 | Auto-Transform: Learning-to-Transform by Patterns | 2020 | VLDB | 5.7011825e-05 |
| 10,598 | Auto-Prep: Holistic Prediction of Data Preparation Steps for Self-Service Business Intelligence | 2025 | VLDB | 4.1945683e-05 |
| 6,800 | DTT: An Example-Driven Tabular Transformer for Joinability by Leveraging Large Language Models | 2024 | SIGMOD | 4.9231471e-05 |
| 5,434 | Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples | 2021 | SIGMOD | 5.5045402e-05 |
| 4,850 | SEMA-JOIN: Joining Semantically-Related Tables Using Big Table Corpora | 2015 | VLDB | 5.8768452e-05 |
| 5,275 | Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples | 2023 | VLDB | 5.5905507e-05 |