NEAR-Miner: Mining Evolution Associations of Web Site Directories for Efficient Maintenance of Web Archives
Summary: NEAR-Miner mines evolution patterns in web directory hierarchies to guide selective re-downloads for web archives. It discovers negatively correlated ancestor–descendant rules offline and uses them to skip subdirectories at crawl time, improving maintenance efficiency at the cost of a minor loss in freshness. (summarized by gpt-5-nano on Feb 09 2026)
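The idea in the summary can be illustrated with a minimal sketch. This is not the paper's algorithm: the change histories, the phi coefficient used as the correlation measure, the `threshold` value, and all function names are assumptions made here for illustration. The sketch mines ancestor–descendant directory pairs whose change histories are negatively correlated, then uses those pairs at crawl time to decide which subdirectories to skip.

```python
from math import sqrt


def phi(history_a, history_d):
    """Phi coefficient between two equal-length 0/1 change histories.

    Assumption: phi stands in for whatever correlation measure the paper uses.
    """
    pairs = list(zip(history_a, history_d))
    n11 = sum(1 for a, d in pairs if a and d)
    n10 = sum(1 for a, d in pairs if a and not d)
    n01 = sum(1 for a, d in pairs if not a and d)
    n00 = sum(1 for a, d in pairs if not a and not d)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # A constant history has no defined correlation; treat it as 0.
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom


def mine_skip_rules(changes, threshold=-0.5):
    """Offline step: return (ancestor, descendant) pairs with phi <= threshold.

    `changes` maps a directory path to a list of 0/1 flags, one per pair of
    consecutive archive snapshots (1 = the directory changed). Hypothetical format.
    """
    rules = []
    for anc, anc_hist in changes.items():
        prefix = anc.rstrip("/") + "/"
        for desc, desc_hist in changes.items():
            if desc.startswith(prefix) and phi(anc_hist, desc_hist) <= threshold:
                rules.append((anc, desc))
    return sorted(rules)


def directories_to_skip(rules, changed_now):
    """Crawl-time step: skip descendants negatively correlated with a changed ancestor."""
    return {desc for anc, desc in rules if anc in changed_now}


# Hypothetical change histories over six snapshot intervals.
changes = {
    "/news": [1, 1, 0, 1, 0, 1],
    "/news/archive": [0, 0, 1, 0, 1, 0],
    "/news/today": [1, 1, 0, 1, 0, 1],
}

rules = mine_skip_rules(changes)
skipped = directories_to_skip(rules, changed_now={"/news"})
print(rules)
print(skipped)
```

With these made-up histories, `/news/archive` changes only when `/news` does not, so it is the one subdirectory skipped once `/news` is observed to have changed. Lowering `threshold` toward -1 skips fewer directories, which trades efficiency for freshness.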
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Ling Chen
2. Sourav S Bhowmick
3. Wolfgang Nejdl
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 3 of 3 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 728 | Meaningful Change Detection in Structured Data | 1997 | SIGMOD | 0.00017494982 |
| 744 | Beyond Market Baskets: Generalizing Association Rules to Correlations | 1997 | SIGMOD | 0.00017333019 |
| 6,928 | The Evolution of the Web and Implications for an Incremental Crawler | 2000 | VLDB | 0.000048925595 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,617 | Extraction and Integration of Partially Overlapping Web Sources | 2013 | VLDB | 0.000084462621 |
| 9,248 | Web Record Extraction with Invariants | 2023 | VLDB | 0.000043690661 |
| 3,683 | Finding replicated web collections | 2000 | SIGMOD | 0.000068477289 |
| 559 | Maintaining Data Privacy in Association Rule Mining | 2002 | VLDB | 0.00020147576 |
| 13 | Mining Association Rules between Sets of Items in Large Databases | 1993 | SIGMOD | 0.0010864752 |
| 7,768 | Accurate and Efficient Crawling for Relevant Websites | 2004 | VLDB | 0.000046563056 |
| 6,928 | The Evolution of the Web and Implications for an Incremental Crawler | 2000 | VLDB | 0.000048925595 |
| 12,231 | Optimizing Content Freshness of Relations Extracted From the Web Using Keyword Search | 2010 | SIGMOD | 0.000041945683 |
| 547 | An Efficient Algorithm for Mining Association Rules in Large Databases | 1995 | VLDB | 0.00020420717 |
| 12,334 | SHARC: Framework for Quality-Conscious Web Archiving | 2009 | VLDB | 0.000041945683 |