Context-Aware Wrapping: Synchronized Data Extraction
Summary: Context-aware wrappers enable synchronized extraction across Web sources, leveraging peer wrappers and domain knowledge to improve downstream matching. The turbo syncer, inspired by turbo codes, interconnects extraction and matching to boost F-measure to ~78–94% and reduce errors to ~1–11%. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,617 | Extraction and Integration of Partially Overlapping Web Sources | 2013 | VLDB | 8.4462621e-05 |
| 5,937 | DataXFormer: Leveraging the Web for Semantic Transformations | 2015 | CIDR | 5.2650964e-05 |
| 6,195 | WADaR: Joint Wrapper and Data Repair | 2015 | VLDB | 5.1618114e-05 |
| 7,919 | DEXTER: Large-Scale Discovery and Extraction of Product Specifications on the Web | 2015 | VLDB | 4.616746e-05 |
| 9,026 | Robust and Noise Resistant Wrapper Induction | 2016 | SIGMOD | 4.4051668e-05 |
| 12,230 | ONDUX: On-Demand Unsupervised Learning for Information Extraction | 2010 | SIGMOD | 4.1945683e-05 |
| 12,258 | ObjectRunner: Lightweight, Targeted Extraction and Querying of Structured Web Data | 2010 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 4 of 4 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 533 | RoadRunner: Towards Automatic Data Extraction from Large Web Sites | 2001 | VLDB | 0.00020757722 |
| 587 | Extracting Structured Data from Web Pages | 2003 | SIGMOD | 0.00019648348 |
| 672 | An Interactive Clustering-based Approach to Integrating Source Query Interfaces on the Deep Web | 2004 | SIGMOD | 0.00018355746 |
| 3,285 | Using the Structure of Web Sites for Automatic Segmentation of Tables | 2004 | SIGMOD | 7.2759001e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,440 | Robust Web Extraction: An Approach Based on a Probabilistic Tree-Edit Model | 2009 | SIGMOD | 6.187819e-05 |
| 5,774 | A Hierarchical Approach to Model Web Query Interfaces for Web Source Integration | 2009 | VLDB | 5.3313642e-05 |
| 6,751 | Optimal Schemes for Robust Web Extraction | 2011 | VLDB | 4.939042e-05 |
| 6,403 | RoadRunner: Automatic Data Extraction from Data-Intensive Web Sites | 2002 | SIGMOD | 5.0797045e-05 |
| 908 | Fusing Data with Correlations | 2014 | SIGMOD | 0.00015431241 |
| 9,026 | Robust and Noise Resistant Wrapper Induction | 2016 | SIGMOD | 4.4051668e-05 |
| 8,322 | An XML-based Wrapper Generator for Web Information Extraction | 1999 | SIGMOD | 4.5435639e-05 |
| 3,678 | Automatic Wrappers for Large Scale Web Extraction | 2011 | VLDB | 6.8517545e-05 |
| 672 | An Interactive Clustering-based Approach to Integrating Source Query Interfaces on the Deep Web | 2004 | SIGMOD | 0.00018355746 |
| 2,617 | Extraction and Integration of Partially Overlapping Web Sources | 2013 | VLDB | 8.4462621e-05 |