Database Paper Browser

Using the Structure of Web Sites for Automatic Segmentation of Tables

Summary: Automatically extracts and segments records from web tables without user input, leveraging the common layout in which a table or list page links to detail pages. The paper presents two algorithms: a constraint-based one that casts segmentation as a constraint satisfaction problem (CSP) using detail-page constraints, and one based on probabilistic inference. The approach is domain-independent and was tested on twelve sites. (summarized by gpt-5-nano on Feb 09 2026)
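
The constraint-based idea mentioned in the summary can be illustrated with a toy sketch. Everything in it is an assumption made for illustration rather than the paper's formulation: the function name `segment`, the use of plain substring matching to decide whether a list-page string appears on a detail page, and the ordering constraint that record indices never decrease along the list (which keeps each record's strings contiguous). It also assumes every list-page string appears on at least one detail page.

```python
def segment(extracts, detail_pages):
    """Toy CSP: assign each list-page string to one record.

    extracts      -- strings from the list page, in page order
    detail_pages  -- text of the detail page linked from each record, in order
    Returns every assignment (list of record indices) that satisfies:
      1. a string may only go to a record whose detail page contains it;
      2. record indices never decrease, so each record is a contiguous run.
    """
    # Domain of each variable: records whose detail page contains the string.
    candidates = [
        [r for r, page in enumerate(detail_pages) if text in page]
        for text in extracts
    ]
    solutions = []

    def backtrack(i, assignment):
        if i == len(extracts):
            solutions.append(list(assignment))
            return
        for r in candidates[i]:
            # Ordering constraint: never go back to an earlier record.
            if assignment and r < assignment[-1]:
                continue
            assignment.append(r)
            backtrack(i + 1, assignment)
            assignment.pop()

    backtrack(0, [])
    return solutions


# Hypothetical list page with two records; "New York" is ambiguous because
# it appears on both detail pages, and the ordering constraint resolves it.
extracts = ["John Smith", "New York", "555-1234",
            "Jane Doe", "New York", "555-9876"]
detail_pages = [
    "John Smith lives in New York, phone 555-1234",
    "Jane Doe lives in New York, phone 555-9876",
]
print(segment(extracts, detail_pages))  # [[0, 0, 0, 1, 1, 1]]
```

Exhaustive backtracking is used here only because the example is tiny; a real solver would be needed for pages with many records or inconsistent data.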

Paper ID
3512
Venue
SIGMOD
Year
2004
Pagerank
7.2759001e-05
Overall Rank
3,285 | 77.15%
DOI
-

Authors

Incoming Citations (Sorted by Pagerank)

Showing 5 of 5 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 5 of 5 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
234 | Crawling the Hidden Web | 2001 | VLDB | 0.00032018108
533 | RoadRunner: Towards Automatic Data Extraction from Large Web Sites | 2001 | VLDB | 0.00020757722
587 | Extracting Structured Data from Web Pages | 2003 | SIGMOD | 0.00019648348
637 | Automatic segmentation of text into structured records | 2001 | SIGMOD | 0.00018824614
2,005 | Record-Boundary Discovery in Web Documents | 1999 | SIGMOD | 9.8112591e-05
