Database Paper Browser

Auto-Join: Joining Tables by Leveraging Transformations

Summary: Auto-Join searches a space of operators to synthesize a transformation that makes join columns with different representations equi-joinable. An optimal sampling strategy lets it scale to large tables while ensuring that joins succeed with high probability. The approach is validated on web and enterprise tables. (summarized by gpt-5-nano on Feb 09 2026)
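
The idea in the summary can be illustrated with a toy sketch. It is not the paper's algorithm: the operator set, the brute-force enumeration, and the names `OPERATORS`, `split_take`, `synthesize`, and `transform_join` are all hypothetical, and the sampling strategy is not modeled. The sketch assumes a few example pairs of (source value, target key) are given and searches for a short operator sequence that maps every source value to its target key, after which an ordinary equi-join applies.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Op = Callable[[str], Optional[str]]


def split_take(sep: str, idx: int) -> Op:
    """Build an operator that splits on `sep` and keeps part `idx` (None if absent)."""
    def op(s: str) -> Optional[str]:
        parts = s.split(sep)
        return parts[idx].strip() if -len(parts) <= idx < len(parts) else None
    return op


# Hypothetical, deliberately tiny operator space (an assumption for illustration).
OPERATORS: Dict[str, Op] = {
    "lower": lambda s: s.lower(),
    "split(',')[0]": split_take(",", 0),
    "split(',')[1]": split_take(",", 1),
    "split(' ')[0]": split_take(" ", 0),
    "split(' ')[-1]": split_take(" ", -1),
}


def apply_program(program: Sequence[str], value: str) -> Optional[str]:
    """Apply the named operators in order; None means the program does not apply."""
    out: Optional[str] = value
    for name in program:
        out = OPERATORS[name](out)
        if out is None:
            return None
    return out


def synthesize(examples: Sequence[Tuple[str, str]],
               max_len: int = 2) -> Optional[Tuple[str, ...]]:
    """Return the first (shortest) operator sequence consistent with all examples."""
    for length in range(1, max_len + 1):
        for program in product(OPERATORS, repeat=length):
            if all(apply_program(program, src) == dst for src, dst in examples):
                return program
    return None


def transform_join(left: Sequence[str], right: Sequence[str],
                   program: Sequence[str]) -> List[Tuple[str, str]]:
    """Equi-join the transformed left values against the right keys."""
    right_keys = set(right)
    joined = []
    for value in left:
        key = apply_program(program, value)
        if key is not None and key in right_keys:
            joined.append((value, key))
    return joined


examples = [("Obama, Barack", "obama"), ("Bush, George", "bush")]
program = synthesize(examples)
print(program)  # ('lower', "split(',')[0]")
print(transform_join(["Clinton, Bill", "Obama, Barack"], ["clinton", "reagan"], program))
```

Exhaustive enumeration is only workable here because the operator set and program length are tiny. A realistic operator space would need a pruned or guided search.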

Paper ID: 11390
Venue: VLDB
Year: 2017
Pagerank: 6.8061318e-05
Overall Rank: 3,735 | 74.02%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 25 of 25 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
1,914 | Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks | 2020 | SIGMOD | 0.00010109102
2,158 | Uni-Detect: A Unified Approach to Automated Error Detection in Tables | 2019 | SIGMOD | 9.4141354e-05
2,587 | Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks | 2024 | SIGMOD | 8.4924618e-05
2,730 | Open Data Integration | 2018 | VLDB | 8.2126735e-05
3,252 | Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks | 2020 | SIGMOD | 7.3178277e-05
3,335 | DeepJoin: Joinable Table Discovery with Pre-trained Language Models | 2023 | VLDB | 7.2065006e-05
3,478 | Transform-Data-by-Example (TDE): An Extensible Search Engine for Data Transformations | 2018 | VLDB | 7.054159e-05
4,859 | Integrating Data Lake Tables | 2023 | VLDB | 5.8732433e-05
5,096 | Auto-Transform: Learning-to-Transform by Patterns | 2020 | VLDB | 5.7011825e-05
5,275 | Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples | 2023 | VLDB | 5.5905507e-05
5,280 | Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V | 2023 | VLDB | 5.5896735e-05
5,383 | Auto-Pipeline: Synthesizing Complex Data Pipelines By-Target Using Reinforcement Learning and Search | 2021 | VLDB | 5.5393038e-05
5,434 | Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples | 2021 | SIGMOD | 5.5045402e-05
5,691 | Putting Things into Context: Rich Explanations for Query Answers using Join Graphs | 2021 | SIGMOD | 5.3684557e-05
6,270 | MATE: Multi-Attribute Table Extraction | 2022 | VLDB | 5.1337451e-05
6,800 | DTT: An Example-Driven Tabular Transformer for Joinability by Leveraging Large Language Models | 2024 | SIGMOD | 4.9231471e-05
7,643 | Cross Modal Data Discovery over Structured and Unstructured Data Lakes | 2023 | VLDB | 4.6901105e-05
7,858 | ConnectionLens: Finding Connections Across Heterogeneous Data Sources | 2018 | VLDB | 4.6342491e-05
9,142 | Design and Analysis of a Processing-in-DIMM Join Algorithm: A Case Study with UPMEM DIMMs | 2023 | SIGMOD | 4.3853149e-05
9,371 | Auto-Formula: Recommend Formulas in Spreadsheets using Contrastive Learning for Table Representations | 2024 | SIGMOD | 4.3480692e-05
9,399 | TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations | 2025 | VLDB | 4.3441378e-05
9,490 | Auto-BI: Automatically Build BI-Models Leveraging Local Join Prediction and Global Schema Graph | 2023 | VLDB | 4.3341665e-05
10,595 | Optimized Batch Prompting for Cost-effective LLMs | 2025 | VLDB | 4.1945683e-05
10,598 | Auto-Prep: Holistic Prediction of Data Preparation Steps for Self-Service Business Intelligence | 2025 | VLDB | 4.1945683e-05
11,343 | SPINE: Scaling up Programming-by-Negative-Example for String Filtering and Transformation | 2022 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 7 of 7 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers