Database Paper Browser

Extraction and Integration of Partially Overlapping Web Sources

Summary: WEIR performs unsupervised extraction and integration of partially overlapping web sources by deriving extraction rules. It uses the overlaps among sources to prune rules and align source traits, comes with correctness guarantees and a redundancy analysis, and shows empirical gains over baselines. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID
10742
Venue
VLDB
Year
2013
Pagerank
8.4462621e-05
Overall Rank
2,617 | 81.80%
DOI
-

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 14 of 14 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 13 of 13 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
107 | WebTables: Exploring the Power of Tables on the Web | 2008 | VLDB | 0.00048377684
518 | Data Integration for the Relational Web | 2009 | VLDB | 0.00021158934
587 | Extracting Structured Data from Web Pages | 2003 | SIGMOD | 0.00019648348
916 | On Schema Matching with Opaque Column Names and Data Values | 2003 | SIGMOD | 0.00015379422
1,211 | Truth Finding on the Deep Web: Is the Problem Solved? | 2013 | VLDB | 0.00013257101
1,317 | Harvesting Relational Tables from Lists on the Web | 2009 | VLDB | 0.00012625853
1,527 | Generic Schema Matching, Ten Years Later | 2011 | VLDB | 0.00011499442
1,851 | An Analysis of Structured Data on the Web | 2012 | VLDB | 0.00010327871
3,477 | Toward Best-Effort Information Extraction | 2008 | SIGMOD | 7.0583481e-05
3,678 | Automatic Wrappers for Large Scale Web Extraction | 2011 | VLDB | 6.8517545e-05
3,747 | Context-Aware Wrapping: Synchronized Data Extraction | 2007 | VLDB | 6.7917216e-05
4,137 | Exploiting Content Redundancy for Web Information Extraction | 2010 | VLDB | 6.4181549e-05
4,229 | Harnessing the Deep Web: Present and Future | 2009 | CIDR | 6.3399547e-05

Semantically Similar Papers