Database Paper Browser

DIADEM: Thousands of Websites to a Single Database

Summary: DIADEM performs automatic full-site extraction at scale using a self-adaptive network of relational transducers. It produces exhaustive wrappers for thousands of sites across domains, reaching 97% precision on more than 90% of sites by combining phenomenological and ontological knowledge. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10882
Venue: VLDB
Year: 2014
Pagerank: 5.1954702e-05
Overall Rank: 6,133 | 57.34%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 7 of 7 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
5,705 | Datalog Unchained | 2021 | PODS | 5.3621239e-05
6,195 | WADaR: Joint Wrapper and Data Repair | 2015 | VLDB | 5.1618114e-05
6,412 | CERES: Distantly Supervised Relation Extraction from the Semi-Structured Web | 2018 | VLDB | 5.0740036e-05
7,826 | The Smallest Extraction Problem | 2021 | VLDB | 4.6416742e-05
7,919 | DEXTER: Large-Scale Discovery and Extraction of Product Specifications on the Web | 2015 | VLDB | 4.616746e-05
9,026 | Robust and Noise Resistant Wrapper Induction | 2016 | SIGMOD | 4.4051668e-05
11,543 | Migrating a Privacy-Safe Information Extraction System to a Software 2.0 Design | 2020 | CIDR | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 13 of 13 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers