Database Paper Browser


WebTables: Exploring the Power of Tables on the Web

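Summary: WebTables extracts 14.1B raw HTML tables from a web crawl and filters them down to a corpus of an estimated 154M high-quality relational tables, each of which can be treated as a small database. It introduces the attribute correlation statistics database (ACSDb), which records corpus-wide attribute co-occurrence statistics and supports better table search as well as tools such as schema auto-complete, attribute synonym finding, and join-graph traversal. (summarized by gpt-5-nano on Feb 09 2026)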
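To make the idea of corpus-wide attribute co-occurrence concrete, here is a minimal Python sketch of schema auto-complete driven by schema counts. It is a simplified greedy illustration, not the paper's exact algorithm: the function names, the toy schemas, and the 0.5 threshold are all assumptions chosen for the example.

```python
from collections import Counter


def build_cooccurrence_db(schemas):
    """Count how often each distinct attribute set (schema) appears."""
    return Counter(frozenset(schema) for schema in schemas)


def conditional_prob(db, attr, given):
    """Estimate p(attr | given) as a ratio of schema counts."""
    given = frozenset(given)
    containing = [(s, c) for s, c in db.items() if given <= s]
    total = sum(c for _, c in containing)
    if total == 0:
        return 0.0
    return sum(c for s, c in containing if attr in s) / total


def suggest_attributes(db, seed, threshold=0.5):
    """Greedily add the most likely next attribute until none clears the threshold."""
    current = set(seed)
    suggestions = []
    vocabulary = set().union(*db.keys())
    while True:
        # Sort candidates so ties are broken deterministically.
        candidates = sorted(vocabulary - current)
        if not candidates:
            break
        best = max(candidates, key=lambda a: conditional_prob(db, a, current))
        if conditional_prob(db, best, current) < threshold:
            break
        suggestions.append(best)
        current.add(best)
    return suggestions


# Hypothetical toy corpus of table schemas (not data from the paper).
toy_schemas = [
    ["make", "model", "year"],
    ["make", "model", "year"],
    ["make", "model", "price"],
    ["name", "address", "phone"],
    ["name", "phone"],
]
db = build_cooccurrence_db(toy_schemas)
print(suggest_attributes(db, ["make"]))  # ['model', 'year']
```

Starting from the single attribute "make", the sketch proposes "model" and then "year", because those attributes co-occur with the seed in most of the toy schemas. It stops once no remaining attribute reaches the threshold.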

Paper ID: 9694
Venue: VLDB
Year: 2008
Pagerank: 0.00048377684
Overall Rank: 107 | 99.26%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 34 of 84 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
6,133 | DIADEM: Thousands of Websites to a Single Database | 2014 | VLDB | 5.1954702e-05
6,412 | CERES: Distantly Supervised Relation Extraction from the Semi-Structured Web | 2018 | VLDB | 5.0740036e-05
6,416 | Synthesizing Type-Detection Logic for Rich Semantic Data Types using Open-source Code | 2018 | SIGMOD | 5.072267e-05
6,557 | Knowledge Verification for Long-Tail Verticals | 2017 | VLDB | 5.0124455e-05
6,586 | Web Data Management | 2011 | SIGMOD | 5.0023398e-05
6,792 | Automatically Incorporating New Sources in Keyword Search-Based Data Integration | 2010 | SIGMOD | 4.9249098e-05
6,981 | Dataset Relationship Management | 2019 | CIDR | 4.8743957e-05
7,026 | Mind the Data Gap: Bridging LLMs to Enterprise Data Integration | 2025 | CIDR | 4.8570811e-05
7,424 | Table Extraction and Understanding for Scientific and Enterprise Applications | 2020 | VLDB | 4.7339251e-05
7,588 | Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases | 2013 | VLDB | 4.7030914e-05
7,648 | User Guidance for Efficient Fact Checking | 2019 | VLDB | 4.6889787e-05
7,868 | Solo: Data Discovery Using Natural Language Questions Via A Self-Supervised Approach | 2023 | SIGMOD | 4.6319504e-05
7,919 | DEXTER: Large-Scale Discovery and Extraction of Product Specifications on the Web | 2015 | VLDB | 4.616746e-05
8,135 | Applying WebTables in Practice | 2015 | CIDR | 4.5777549e-05
8,307 | Automatic Web-Scale Information Extraction | 2012 | SIGMOD | 4.5435639e-05
8,751 | Generations of Knowledge Graphs: The Crazy Ideas and the Business Impact | 2023 | VLDB | 4.456315e-05
8,852 | Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation | 2023 | SIGMOD | 4.4356508e-05
9,014 | Knowledge Exploration Using Tables on the Web | 2017 | VLDB | 4.4095176e-05
9,248 | Web Record Extraction with Invariants | 2023 | VLDB | 4.3690661e-05
10,753 | Cents: A Flexible and Cost-Effective Framework for LLM-Based Table Understanding | 2025 | VLDB | 4.1945683e-05
10,951 | Determining the Largest Overlap between Tables | 2024 | SIGMOD | 4.1945683e-05
11,063 | Searching Data Lakes for Nested and Joined Data | 2024 | VLDB | 4.1945683e-05
11,344 | Simplifying Access to Large-scale Structured Datasets by Meta-Profiling with Scalable Training Set Enrichment | 2022 | SIGMOD | 4.1945683e-05
11,391 | Blueprint: A Constraint-solving Approach For Document Extraction | 2022 | VLDB | 4.1945683e-05
11,754 | Constraint-based Explanation and Repair of Filter-based Transformations | 2018 | VLDB | 4.1945683e-05
11,775 | Building Structured Databases of Factual Knowledge from Massive Text Corpora | 2017 | SIGMOD | 4.1945683e-05
11,892 | Looking at Everything in Context | 2015 | CIDR | 4.1945683e-05
11,910 | Demonstrating "Data Near Here": Scientific Data Search | 2015 | SIGMOD | 4.1945683e-05
11,939 | Annotating Database Schemas to Help Enterprise Search | 2015 | VLDB | 4.1945683e-05
12,170 | Schema-As-You-Go: On Probabilistic Tagging and Querying of Wide Tables | 2011 | SIGMOD | 4.1945683e-05
12,256 | QUICK: Expressive and Flexible Search over Knowledge Bases and Text Collections | 2010 | VLDB | 4.1945683e-05
12,301 | Privacy Preservation of Aggregates in Hidden Databases: Why and How? | 2009 | SIGMOD | 4.1945683e-05
12,320 | Vispedia: On-demand Data Integration for Interactive Visualization and Exploration | 2009 | SIGMOD | 4.1945683e-05
12,349 | Answering Web Questions Using Structured Data – Dream or Reality? Panel Discussion | 2009 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 6 of 6 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers

Overall Rank | Paper | Year | Venue | Pagerank
1,851 | An Analysis of Structured Data on the Web | 2012 | VLDB | 0.00010327871
5,672 | Effective Keyword-based Selection of Relational Databases | 2007 | SIGMOD | 5.3784128e-05
7,326 | Answering Web Queries Using Structured Data Sources | 2009 | SIGMOD | 4.7612871e-05
818 | Finding Related Tables | 2012 | SIGMOD | 0.00016311524
1,317 | Harvesting Relational Tables from Lists on the Web | 2009 | VLDB | 0.00012625853
1,001 | Recovering Semantics of Tables on the Web | 2011 | VLDB | 0.00014706505
1,367 | Answering Table Queries on the Web using Column Keywords | 2012 | VLDB | 0.00012349783
2,633 | Schema Extraction for Tabular Data on the Web | 2013 | VLDB | 8.4063569e-05
364 | Annotating and Searching Web Tables Using Entities, Types and Relationships | 2010 | VLDB | 0.00025637562
8,135 | Applying WebTables in Practice | 2015 | CIDR | 4.5777549e-05