Annotating and Searching Web Tables Using Entities, Types and Relationships

Summary: Annotates web tables with entities (cells), types (columns), and relations (column pairs) using a joint graphical model that assigns all of these labels simultaneously. Demonstrates gains in relational Web search over text-only indexing on 25M+ HTML tables using YAGO/DBpedia/Wikipedia. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10068
Venue: VLDB
Year: 2010
Pagerank: 0.00025616694
Overall Rank: 365 | 97.47%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 39 of 39 citing papers.

Rank	Citing Paper	Year	Venue	Pagerank
420	InfoGather: Entity Augmentation and Attribute Discovery By Holistic Matching with Web Tables	2012	SIGMOD	0.00023700634
514	TURL: Table Understanding through Representation Learning	2021	VLDB	0.00021280726
814	Finding Related Tables	2012	SIGMOD	0.00016298739
1,005	Recovering Semantics of Tables on the Web	2011	VLDB	0.00014694038
1,179	Table Union Search on Open Data	2018	VLDB	0.00013458551
1,544	KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing	2015	SIGMOD	0.00011438274
2,142	LSH Ensemble: Internet-Scale Domain Search	2016	VLDB	9.4461701e-05
2,513	Annotating Columns with Pre-trained Language Models	2022	SIGMOD	8.6155767e-05
2,638	Schema Extraction for Tabular Data on the Web	2013	VLDB	8.3995765e-05
2,842	Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning	2023	VLDB	8.0366354e-05
2,895	Sato: Contextual Semantic Type Detection in Tables	2020	VLDB	7.9539265e-05
3,001	SANTOS: Relationship-based Semantic Table Union Search	2023	SIGMOD	7.739698e-05
3,163	Ten Years of WebTables	2018	VLDB	7.4611268e-05
3,232	InfoGather+: Semantic Matching and Annotation of Numeric and Time-Varying Attributes in Web Tables	2013	SIGMOD	7.3324293e-05
3,293	Biperpedia: An Ontology for Search Applications	2014	VLDB	7.2598242e-05
3,360	Organizing Data Lakes for Navigation	2020	SIGMOD	7.1719486e-05
3,693	Navigating the Data Lake with DATAMARAN: Automatically Extracting Structure from Log Datasets	2018	SIGMOD	6.8326441e-05
3,798	Stitching Web Tables for Improving Matching Quality	2017	VLDB	6.753528e-05
4,630	Knowledge Graphs 2021: A Data Odyssey	2021	VLDB	6.030304e-05
4,838	Finding Patterns in a Knowledge Base using Keywords to Compose Table Answers	2014	VLDB	5.883146e-05
4,860	Integrating Data Lake Tables	2023	VLDB	5.867964e-05
5,738	KATARA: Reliable Data Cleaning with Knowledge Bases and Crowdsourcing	2015	VLDB	5.3454984e-05
6,585	Web Data Management	2011	SIGMOD	4.9975374e-05
7,590	Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases	2013	VLDB	4.6987033e-05
8,783	QuTE: Answering Quantity Queries from Web Tables	2021	SIGMOD	4.4478256e-05
8,849	SourceSight: Enabling Effective Source Selection	2016	SIGMOD	4.4326582e-05
8,852	Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation	2023	SIGMOD	4.4313992e-05
9,017	Knowledge Exploration Using Tables on the Web	2017	VLDB	4.4052897e-05
9,258	Joint Open Knowledge Base Canonicalization and Linking	2021	SIGMOD	4.3648789e-05
10,521	Auto-Test: Learning Semantic-Domain Constraints for Unsupervised Error Detection in Tables	2025	SIGMOD	4.1905499e-05
10,759	Cents: A Flexible and Cost-Effective Framework for LLM-Based Table Understanding	2025	VLDB	4.1905499e-05
10,954	Determining the Largest Overlap between Tables	2024	SIGMOD	4.1905499e-05
11,783	Building Structured Databases of Factual Knowledge from Massive Text Corpora	2017	SIGMOD	4.1905499e-05
11,855	Automatic Entity Recognition and Typing in Massive Text Data	2016	SIGMOD	4.1905499e-05
11,903	Finding Quality in Quantity: The Challenge of Discovering Valuable Sources for Integration	2015	CIDR	4.1905499e-05
11,938	ConfSeer: Leveraging Customer Support Knowledge Bases for Automated Misconfiguration Detection	2015	VLDB	4.1905499e-05
11,979	Mining Latent Entity Structures from Massive Unstructured and Interconnected Data	2014	SIGMOD	4.1905499e-05
12,052	Knowledge Harvesting in the Big-Data Era	2013	SIGMOD	4.1905499e-05
12,209	AIDA: An Online Tool for Accurate Disambiguation of Named Entities in Text and Tables	2011	VLDB	4.1905499e-05

Outgoing Citations (Sorted by Pagerank)

Showing 3 of 3 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank	Cited Paper	Year	Venue	Pagerank
108	WebTables: Exploring the Power of Tables on the Web	2008	VLDB	0.00048345996
1,140	EntityRank: Searching Entities Directly and Holistically	2007	VLDB	0.00013709412
1,585	Answering Table Augmentation Queries from Unstructured Lists on the Web	2009	VLDB	0.00011245609

Semantically Similar Papers

Overall Rank	Paper	Year	Venue	Pagerank
7,590	Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases	2013	VLDB	4.6987033e-05
1,585	Answering Table Augmentation Queries from Unstructured Lists on the Web	2009	VLDB	0.00011245609
420	InfoGather: Entity Augmentation and Attribute Discovery By Holistic Matching with Web Tables	2012	SIGMOD	0.00023700634
3,232	InfoGather+: Semantic Matching and Annotation of Numeric and Time-Varying Attributes in Web Tables	2013	SIGMOD	7.3324293e-05
4,097	Structured Annotations of Web Queries	2010	SIGMOD	6.4504937e-05
1,316	Harvesting Relational Tables from Lists on the Web	2009	VLDB	0.00012616422
1,370	Answering Table Queries on the Web using Column Keywords	2012	VLDB	0.00012339543
814	Finding Related Tables	2012	SIGMOD	0.00016298739
108	WebTables: Exploring the Power of Tables on the Web	2008	VLDB	0.00048345996
1,005	Recovering Semantics of Tables on the Web	2011	VLDB	0.00014694038