Observatory: Characterizing Embeddings of Relational Tables

Summary: Introduces Observatory, an extensible framework of eight primitive properties and quantitative measures for systematically characterizing table embeddings. Applied to nine models, it uncovers sensitivity to column order, weak encoding of functional dependencies, and lower sample fidelity in specialized table models. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 13760
Venue: VLDB
Year: 2024
Pagerank: 5.2231739e-05
Overall Rank: 6,071 | 57.81%
DOI: 10.14778/3636218.3636237

Authors

Incoming Citations (Sorted by Pagerank)

Showing 4 of 4 citing papers.

Rank	Citing Paper	Year	Venue	Pagerank
1,839	DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing	2025	VLDB	0.00010351287
10,597	Birdie: Natural Language-Driven Table Discovery Using Differentiable Search Index	2025	VLDB	4.1905499e-05
10,759	Cents: A Flexible and Cost-Effective Framework for LLM-Based Table Understanding	2025	VLDB	4.1905499e-05
10,848	Panel on Neural Relational Data: Tabular Foundation Models, LLMs... or both?	2025	VLDB	4.1905499e-05

Outgoing Citations (Sorted by Pagerank)

Showing 17 of 17 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank	Cited Paper	Year	Venue	Pagerank
7	Extending the Data Base Relational Model to Capture More Meaning	1979	SIGMOD	0.0015464728
219	Deep Entity Matching with Pre-Trained Language Models	2021	VLDB	0.00033354456
420	InfoGather: Entity Augmentation and Attribute Discovery By Holistic Matching with Web Tables	2012	SIGMOD	0.00023700634
514	TURL: Table Understanding through Representation Learning	2021	VLDB	0.00021280726
516	Can Foundation Models Wrangle Your Data?	2023	VLDB	0.00021194444
890	A Hybrid Approach to Functional Dependency Discovery	2016	SIGMOD	0.00015542177
1,179	Table Union Search on Open Data	2018	VLDB	0.00013458551
1,185	JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes	2019	SIGMOD	0.00013432692
1,661	On Multi-Column Foreign Key Discovery	2010	VLDB	0.00010967185
1,914	Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks	2020	SIGMOD	0.00010111859
2,142	LSH Ensemble: Internet-Scale Domain Search	2016	VLDB	9.4461701e-05
2,348	RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation	2021	VLDB	8.9903659e-05
2,513	Annotating Columns with Pre-trained Language Models	2022	SIGMOD	8.6155767e-05
2,895	Sato: Contextual Semantic Type Detection in Tables	2020	VLDB	7.9539265e-05
3,003	Chorus: Foundation Models for Unified Data Discovery and Exploration	2024	VLDB	7.7358219e-05
5,457	Transformers for Tabular Data Representation: A Tutorial on Models and Applications	2022	VLDB	5.4960654e-05
8,194	WarpGate: A Semantic Join Discovery System for Cloud Data Warehouses	2023	CIDR	4.5574831e-05

Semantically Similar Papers

Overall Rank	Paper	Year	Venue	Pagerank
2,585	Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks	2024	SIGMOD	8.4909917e-05
5,993	Tables As a Paradigm for Querying and Restructuring	1996	PODS	5.2378977e-05
8,915	Making Table Understanding Work in Practice	2022	CIDR	4.4229886e-05
3,520	GitTables: A Large-Scale Corpus of Relational Tables	2023	SIGMOD	7.0136102e-05
10,269	Database Views as Explanations for Relational Deep Learning	2026	VLDB	4.1905499e-05
514	TURL: Table Understanding through Representation Learning	2021	VLDB	0.00021280726
9,885	Scalable and Usable Relational Learning With Automatic Language Bias	2021	SIGMOD	4.2580321e-05
3,326	DeepJoin: Joinable Table Discovery with Pre-trained Language Models	2023	VLDB	7.2148323e-05
5,457	Transformers for Tabular Data Representation: A Tutorial on Models and Applications	2022	VLDB	5.4960654e-05
1,914	Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks	2020	SIGMOD	0.00010111859