Database Paper Browser


Table Overlap Estimation through Graph Embeddings

Summary: Armadillo uses graph neural networks to learn table representations such that the cosine similarity between two table embeddings estimates their overlap ratio. The paper introduces GitTables- and Wikipedia-based datasets (1.32M pairs) and reports speedups over Sloth with strong accuracy. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 7279
Venue: SIGMOD
Year: 2025
Pagerank: 4.1945683e-05
Overall Rank: 10,510 | 26.89%
DOI: 10.1145/3725365

Incoming Non-self Citations Over Time

No non-self incoming citations found for this paper in this database.

Authors

Incoming Citations (Sorted by Pagerank)

Showing 0 of 0 citing papers.


Outgoing Citations (Sorted by Pagerank)

Showing 19 of 19 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
513 | TURL: Table Understanding through Representation Learning | 2021 | VLDB | 0.00021288342
818 | Finding Related Tables | 2012 | SIGMOD | 0.00016311524
939 | Data Lake Management: Challenges and Opportunities | 2019 | VLDB | 0.00015187344
1,178 | Table Union Search on Open Data | 2018 | VLDB | 0.00013468118
1,187 | JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes | 2019 | SIGMOD | 0.00013443639
1,211 | Truth Finding on the Deep Web: Is the Problem Solved? | 2013 | VLDB | 0.00013257101
1,644 | Finding Related Tables in Data Lakes for Interactive Data Science | 2020 | SIGMOD | 0.00011041787
1,914 | Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks | 2020 | SIGMOD | 0.00010109102
2,517 | Annotating Columns with Pre-trained Language Models | 2022 | SIGMOD | 8.6092139e-05
2,836 | Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning | 2023 | VLDB | 8.0443826e-05
3,000 | SANTOS: Relationship-based Semantic Table Union Search | 2023 | SIGMOD | 7.7462128e-05
3,335 | DeepJoin: Joinable Table Discovery with Pre-trained Language Models | 2023 | VLDB | 7.2065006e-05
3,520 | GitTables: A Large-Scale Corpus of Relational Tables | 2023 | SIGMOD | 7.0131061e-05
3,942 | Ember: No-Code Context Enrichment via Similarity-Based Keyless Joins | 2022 | VLDB | 6.6114622e-05
4,859 | Integrating Data Lake Tables | 2023 | VLDB | 5.8732433e-05
5,449 | Transformers for Tabular Data Representation: A Tutorial on Models and Applications | 2022 | VLDB | 5.5008652e-05
5,506 | Exploring Change – A New Dimension of Data Analytics | 2019 | VLDB | 5.473324e-05
6,270 | MATE: Multi-Attribute Table Extraction | 2022 | VLDB | 5.1337451e-05
10,951 | Determining the Largest Overlap between Tables | 2024 | SIGMOD | 4.1945683e-05

Semantically Similar Papers