Harvesting Relational Tables from Lists on the Web
Summary: Presents an unsupervised method for extracting relational tables from lists on the web, which must infer field delimiters and cope with missing fields. The method uses a corpus of HTML tables to validate candidate splits and alignments, produces an extraction score, and scales to roughly 100k lists, implying that tens of millions of usable tables could be harvested. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Hazem Elmeleegy
2. Jayant Madhavan
3. Alon Halevy
Incoming Citations (Sorted by Pagerank)
Showing 21 of 21 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 107 | WebTables: Exploring the Power of Tables on the Web | 2008 | VLDB | 0.00048377684 |
| 533 | RoadRunner: Towards Automatic Data Extraction from Large Web Sites | 2001 | VLDB | 0.00020757722 |
| 587 | Extracting Structured Data from Web Pages | 2003 | SIGMOD | 0.00019648348 |
| 637 | Automatic segmentation of text into structured records | 2001 | SIGMOD | 0.00018824614 |
| 1,537 | Google's Deep-Web Crawl | 2008 | VLDB | 0.00011465704 |
| 2,005 | Record-Boundary Discovery in Web Documents | 1999 | SIGMOD | 0.000098112591 |
| 3,285 | Using the Structure of Web Sites for Automatic Segmentation of Tables | 2004 | SIGMOD | 0.000072759001 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,135 | Applying WebTables in Practice | 2015 | CIDR | 0.000045777549 |
| 818 | Finding Related Tables | 2012 | SIGMOD | 0.00016311524 |
| 1,001 | Recovering Semantics of Tables on the Web | 2011 | VLDB | 0.00014706505 |
| 1,367 | Answering Table Queries on the Web using Column Keywords | 2012 | VLDB | 0.00012349783 |
| 3,742 | TEGRA: Table Extraction by Global Record Alignment | 2015 | SIGMOD | 0.000067966898 |
| 2,633 | Schema Extraction for Tabular Data on the Web | 2013 | VLDB | 0.000084063569 |
| 364 | Annotating and Searching Web Tables Using Entities, Types and Relationships | 2010 | VLDB | 0.00025637562 |
| 107 | WebTables: Exploring the Power of Tables on the Web | 2008 | VLDB | 0.00048377684 |
| 3,285 | Using the Structure of Web Sites for Automatic Segmentation of Tables | 2004 | SIGMOD | 0.000072759001 |
| 1,585 | Answering Table Augmentation Queries from Unstructured Lists on the Web | 2009 | VLDB | 0.00011255098 |