Database Paper Browser

Robust and Noise Resistant Wrapper Induction

Summary: Robust wrapper induction for template-based web data extraction that tolerates page changes and noisy samples. A restricted subset of XPath yields discriminative anchor nodes; although inducing an optimal query is infeasible, the framework learns long-lived, noise-resistant wrappers, validated on Internet Archive snapshots. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 5241
Venue: SIGMOD
Year: 2016
Pagerank: 4.4051668e-05
Overall Rank: 9,026 | 37.21%
DOI: 10.1145/2882903.2915214

Authors

Incoming Citations (Sorted by Pagerank)

Showing 1 of 1 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
7,826 | The Smallest Extraction Problem | 2021 | VLDB | 4.6416742e-05

Outgoing Citations (Sorted by Pagerank)

Showing 8 of 8 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers