OmniMatch: Joinability Discovery in Data Products
Summary: OmniMatch is a joinability discovery method for curated data products. It fuses multiple column-pair similarity measures with a self-supervised GNN that exploits graph neighborhood information to boost recall. Automated negative-pair generation raises precision, yielding F1/AUC gains of up to 14% without per-metric thresholds.
(summarized by gpt-5-mini on Feb 09 2026)
- Paper ID: 14069
- Venue: VLDB
- Year: 2025
- Pagerank: 4.1945683e-05
- Overall Rank: 10,754 | 25.19%
- DOI: 10.14778/3749646.3749715
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 29 of 29 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 303 | Generic Schema Matching with Cupid | 2001 | VLDB | 0.00028301477 |
| 382 | COMA - A system for flexible combination of schema matching approaches | 2002 | VLDB | 0.00024823252 |
| 420 | InfoGather: Entity Augmentation and Attribute Discovery By Holistic Matching with Web Tables | 2012 | SIGMOD | 0.00023719065 |
| 513 | TURL: Table Understanding through Representation Learning | 2021 | VLDB | 0.00021288342 |
| 984 | Natural language to SQL: Where are we today? | 2020 | VLDB | 0.00014857465 |
| 1,178 | Table Union Search on Open Data | 2018 | VLDB | 0.00013468118 |
| 1,187 | JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes | 2019 | SIGMOD | 0.00013443639 |
| 1,463 | ARDA: Automatic Relational Data Augmentation for Machine Learning | 2020 | VLDB | 0.00011869295 |
| 1,644 | Finding Related Tables in Data Lakes for Interactive Data Science | 2020 | SIGMOD | 0.00011041787 |
| 1,664 | On Multi-Column Foreign Key Discovery | 2010 | VLDB | 0.00010976887 |
| 1,914 | Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks | 2020 | SIGMOD | 0.00010109102 |
| 2,836 | Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning | 2023 | VLDB | 8.0443826e-05 |
| 3,000 | SANTOS: Relationship-based Semantic Table Union Search | 2023 | SIGMOD | 7.7462128e-05 |
| 3,335 | DeepJoin: Joinable Table Discovery with Pre-trained Language Models | 2023 | VLDB | 7.2065006e-05 |
| 3,823 | Automatic Discovery of Attributes in Relational Databases | 2011 | SIGMOD | 6.7261168e-05 |
| 4,703 | Medical Entity Disambiguation Using Graph Neural Networks | 2021 | SIGMOD | 5.9855056e-05 |
| 4,859 | Integrating Data Lake Tables | 2023 | VLDB | 5.8732433e-05 |
| 4,967 | Leva: Boosting Machine Learning Performance with Relational Embedding Data Augmentation | 2022 | SIGMOD | 5.7956612e-05 |
| 5,179 | SilkMoth: An Efficient Method for Finding Related Sets with Maximum Matching Constraints | 2017 | VLDB | 5.6428428e-05 |
| 5,434 | Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples | 2021 | SIGMOD | 5.5045402e-05 |
| 5,449 | Transformers for Tabular Data Representation: A Tutorial on Models and Applications | 2022 | VLDB | 5.5008652e-05 |
| 5,794 | Discovering Related Data At Scale | 2021 | VLDB | 5.3245122e-05 |
| 7,006 | Synthesizing Products for Online Catalogs | 2011 | VLDB | 4.8653916e-05 |
| 7,048 | Magneto: Combining Small and Large Language Models for Schema Matching | 2025 | VLDB | 4.8520651e-05 |
| 7,613 | ADnEV: Cross-Domain Schema Matching using Deep Similarity Matrix Adjustment and Evaluation | 2020 | VLDB | 4.6961059e-05 |
| 8,137 | Customizable and Scalable Fuzzy Join for Big Data | 2019 | VLDB | 4.5774794e-05 |
| 8,193 | WarpGate: A Semantic Join Discovery System for Cloud Data Warehouses | 2023 | CIDR | 4.5618596e-05 |
| 8,503 | A Demonstration of KGLac: A Data Discovery and Enrichment Platform for Data Science | 2021 | VLDB | 4.496339e-05 |
| 8,958 | FlexER: Flexible Entity Resolution for Multiple Intents | 2023 | SIGMOD | 4.4210635e-05 |
Semantically Similar Papers