Building light-weight wrappers for legacy Web data-sources using W4F
Summary: W4F is a toolkit for rapid design, generation, and integration of lightweight HTML wrappers for legacy web data. Declarative specs, WYSIWYG extraction, XML mappings, and Java deployment enable CGI gateways and XML access to HTML content.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 8621
- Venue: VLDB
- Year: 1999
- Pagerank: 0.00013777657
- Overall Rank: 1,132 | 92.13%
- DOI:
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 14 of 14 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 637 | Automatic segmentation of text into structured records | 2001 | SIGMOD | 0.00018824614 |
| 1,221 | A Web of Concepts | 2009 | PODS | 0.00013219242 |
| 2,224 | The SphereSearch Engine for Unified Ranked Retrieval of Heterogeneous XML and Web Documents | 2005 | VLDB | 9.251962e-05 |
| 2,698 | Visual Web Information Extraction with Lixto* | 2001 | VLDB | 8.2753317e-05 |
| 3,678 | Automatic Wrappers for Large Scale Web Extraction | 2011 | VLDB | 6.8517545e-05 |
| 4,440 | Robust Web Extraction: An Approach Based on a Probabilistic Tree-Edit Model | 2009 | SIGMOD | 6.187819e-05 |
| 5,609 | Documentum ECI Self-Repairing Wrappers: Performance Analysis | 2006 | SIGMOD | 5.4129892e-05 |
| 6,751 | Optimal Schemes for Robust Web Extraction | 2011 | VLDB | 4.939042e-05 |
| 6,996 | Web Data Extraction using Hybrid Program Synthesis: A Combination of Top-down and Bottom-up Inference | 2020 | SIGMOD | 4.8681362e-05 |
| 7,919 | DEXTER: Large-Scale Discovery and Extraction of Product Specifications on the Web | 2015 | VLDB | 4.616746e-05 |
| 8,307 | Automatic Web-Scale Information Extraction | 2012 | SIGMOD | 4.5435639e-05 |
| 8,603 | OXPath: A Language for Scalable, Memory-efficient Data Extraction from Web Applications | 2011 | VLDB | 4.4866461e-05 |
| 12,634 | From Focused Crawling to Expert Information: an Application Framework for Web Exploration and Portal Generation | 2003 | VLDB | 4.1945683e-05 |
| 12,691 | Toward Learning Based Web Query Processing | 2000 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 1 of 1 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 3,678 | Automatic Wrappers for Large Scale Web Extraction | 2011 | VLDB | 6.8517545e-05 |
| 6,638 | The SQL-based All-Declarative FORWARD Web Application Development Framework | 2011 | CIDR | 4.9808044e-05 |
| 4,440 | Robust Web Extraction: An Approach Based on a Probabilistic Tree-Edit Model | 2009 | SIGMOD | 6.187819e-05 |
| 13,194 | Web Connector: A Unified API Wrapper to Simplify Web Data Collection | 2023 | VLDB | - |
| 4,766 | Building and Customizing Data-Intensive Web Sites using Weave | 2000 | VLDB | 5.9377055e-05 |
| 9,200 | Databases on the Web: Technologies for Federation Architectures and Case Studies | 1997 | SIGMOD | 4.3744643e-05 |
| 13,861 | Enabling End-users to Construct Data-intensive Web-sites from XML Repositories: An Example-based Approach | 2001 | VLDB | - |
| 346 | Don't Scrap It, Wrap It! A Wrapper Architecture for Legacy Data Sources | 1997 | VLDB | 0.00026656272 |
| 2,698 | Visual Web Information Extraction with Lixto* | 2001 | VLDB | 8.2753317e-05 |
| 8,322 | An XML-based Wrapper Generator for Web Information Extraction | 1999 | SIGMOD | 4.5435639e-05 |