Database Paper Browser

Back to papers

TURL: Table Understanding through Representation Learning

Summary: TURL pre-trains on relational Web tables using a structure-aware Transformer and a Masked Entity Recovery objective for unsupervised representation learning. The universal model adapts to multiple tasks (relation extraction, cell filling) with minimal fine-tuning, outperforming prior methods on 6 benchmarks. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID
12513
Venue
VLDB
Year
2021
Pagerank
0.00021288342
Overall Rank
513 | 96.44%
DOI
10.14778/3430915.3430921

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 50 of 55 citing papers.

Rank Citing Paper Year Venue Pagerank
517 Can Foundation Models Wrangle Your Data? 2023 VLDB 0.00021169035
2,349 RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation 2021 VLDB 8.9876423e-05
2,517 Annotating Columns with Pre-trained Language Models 2022 SIGMOD 8.6092139e-05
2,587 Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks 2024 SIGMOD 8.4924618e-05
2,836 Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning 2023 VLDB 8.0443826e-05
3,000 SANTOS: Relationship-based Semantic Table Union Search 2023 SIGMOD 7.7462128e-05
3,015 Chorus: Foundation Models for Unified Data Discovery and Exploration 2024 VLDB 7.7092391e-05
3,335 DeepJoin: Joinable Table Discovery with Pre-trained Language Models 2023 VLDB 7.2065006e-05
3,520 GitTables: A Large-Scale Corpus of Relational Tables 2023 SIGMOD 7.0131061e-05
3,995 How Large Language Models Will Disrupt Data Management 2023 VLDB 6.5513237e-05
4,212 Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration 2023 SIGMOD 6.3555142e-05
4,661 PreQR: Pre-training Representation for SQL Understanding 2022 SIGMOD 6.0137947e-05
4,859 Integrating Data Lake Tables 2023 VLDB 5.8732433e-05
4,967 Leva: Boosting Machine Learning Performance with Relational Embedding Data Augmentation 2022 SIGMOD 5.7956612e-05
5,099 ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models 2024 VLDB 5.6997784e-05
5,275 Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples 2023 VLDB 5.5905507e-05
5,449 Transformers for Tabular Data Representation: A Tutorial on Models and Applications 2022 VLDB 5.5008652e-05
5,840 Logical and Physical Optimizations for SQL Query Execution over Large Language Models 2025 SIGMOD 5.3042561e-05
6,092 Observatory: Characterizing Embeddings of Relational Tables 2024 VLDB 5.2138566e-05
6,280 Self-supervised and Interpretable Data Cleaning with Sequence Generative Adversarial Networks 2023 VLDB 5.1290457e-05
6,775 A Unified Transferable Model for ML-Enhanced DBMS 2022 CIDR 4.9299192e-05
6,800 DTT: An Example-Driven Tabular Transformer for Joinability by Leveraging Large Language Models 2024 SIGMOD 4.9231471e-05
7,643 Cross Modal Data Discovery over Structured and Unstructured Data Lakes 2023 VLDB 4.6901105e-05
8,193 WarpGate: A Semantic Join Discovery System for Cloud Data Warehouses 2023 CIDR 4.5618596e-05
8,579 RECA: Related Tables Enhanced Column Semantic Type Annotation Framework 2023 VLDB 4.4922446e-05
8,712 ANN Softmax: Acceleration of Extreme Classification Training 2022 VLDB 4.4626362e-05
8,736 Unveiling Challenges for LLMs in Enterprise Data Engineering 2026 VLDB 4.456315e-05
8,743 CtxPipe: Context-aware Data Preparation Pipeline Construction for Machine Learning 2024 SIGMOD 4.456315e-05
8,847 Towards Foundation Database Models 2025 CIDR 4.4371897e-05
8,852 Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation 2023 SIGMOD 4.4356508e-05
8,913 Making Table Understanding Work in Practice 2022 CIDR 4.427232e-05
9,077 VerifAI: Verified Generative AI 2024 CIDR 4.4010762e-05
9,348 GIDCL: A Graph-Enhanced Interpretable Data Cleaning Framework with Large Language Models 2024 SIGMOD 4.3526427e-05
9,399 TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations 2025 VLDB 4.3441378e-05
9,479 Data Imputation with Limited Data Redundancy Using Data Lakes 2025 VLDB 4.3341665e-05
9,777 Data Augmentation for ML-driven Data Preparation and Integration 2021 VLDB 4.2856106e-05
9,961 QueryArtisan: Generating Data Manipulation Codes for Ad-hoc Analysis in Data Lakes 2025 VLDB 4.2294678e-05
10,059 Burr: A Benchmark for Ontology Learning from Relational Databases 2026 SIGMOD 4.1945683e-05
10,109 Retrieve-and-Verify: A Table Context Selection Framework for Accurate Column Annotations 2026 SIGMOD 4.1945683e-05
10,142 AutoDDG: Automated Dataset Description Generation using Large Language Models 2026 SIGMOD 4.1945683e-05
10,197 Qualitative Join Discovery in Data Lakes using Examples 2026 SIGMOD 4.1945683e-05
10,268 OpenSQL: Data-Efficient Text-to-SQL for Open-Source LLMs via Synthesized Intermediate Supervision 2026 VLDB 4.1945683e-05
10,498 PLM4NDV: Minimizing Data Access for Number of Distinct Values Estimation with Pre-trained Language Models 2025 SIGMOD 4.1945683e-05
10,510 Table Overlap Estimation through Graph Embeddings 2025 SIGMOD 4.1945683e-05
10,589 Birdie: Natural Language-Driven Table Discovery Using Differentiable Search Index 2025 VLDB 4.1945683e-05
10,753 Cents: A Flexible and Cost-Effective Framework for LLM-Based Table Understanding 2025 VLDB 4.1945683e-05
10,754 OmniMatch: Joinability Discovery in Data Products 2025 VLDB 4.1945683e-05
10,844 Panel on Neural Relational Data: Tabular Foundation Models, LLMs... or both? 2025 VLDB 4.1945683e-05
10,951 Determining the Largest Overlap between Tables 2024 SIGMOD 4.1945683e-05
10,973 Unstructured Data Fusion for Schema and Data Extraction 2024 SIGMOD 4.1945683e-05
Previous Page 1 / 2 Next

Outgoing Citations (Sorted by Pagerank)

Showing 5 of 5 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Previous Page 1 / 1 Next

Semantically Similar Papers