SHARC: Framework for Quality-Conscious Web Archiving
Summary: SHARC is a framework for quality-aware web archiving that uses data-quality metrics and crawl scheduling to maximize captures under resource limits. It covers offline crawling that is optimal when change rates are known, online learning, and revisit policies, and was tested in a lab and on daily crawls. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
- 1. Dimitar Denev
- 2. Arturas Mazeika
- 3. Marc Spaniol
- 4. Gerhard Weikum
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 614 | Logical Modeling Of Temporal Data | 1987 | SIGMOD | 0.00019177247 |
| 1,282 | Best-Effort Cache Synchronization with Source Cooperation | 2002 | SIGMOD | 0.00012852655 |
| 1,304 | Synchronizing a Database to Improve Freshness | 2000 | SIGMOD | 0.00012691283 |
| 3,002 | Supporting Multiple View Maintenance Policies | 1997 | SIGMOD | 7.7399579e-05 |
| 5,442 | RankMass Crawler: A Crawler with High Personalized PageRank Coverage Guarantee | 2007 | VLDB | 5.5026403e-05 |
| 8,320 | Effective Change Detection Using Sampling | 2002 | VLDB | 4.5435639e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 234 | Crawling the Hidden Web | 2001 | VLDB | 0.00032018108 |
| 3,683 | Finding Replicated Web Collections | 2000 | SIGMOD | 6.8477289e-05 |
| 5,442 | RankMass Crawler: A Crawler with High Personalized PageRank Coverage Guarantee | 2007 | VLDB | 5.5026403e-05 |
| 6,928 | The Evolution of the Web and Implications for an Incremental Crawler | 2000 | VLDB | 4.8925595e-05 |
| 9,548 | Optimal Algorithms for Crawling a Hidden Database in the Web | 2012 | VLDB | 4.3258142e-05 |
| 7,768 | Accurate and Efficient Crawling for Relevant Websites | 2004 | VLDB | 4.6563056e-05 |
| 3,487 | Longitudinal Analytics on Web Archive Data: It's About Time! | 2011 | CIDR | 7.0480733e-05 |
| 12,231 | Optimizing Content Freshness of Relations Extracted From the Web Using Keyword Search | 2010 | SIGMOD | 4.1945683e-05 |
| 8,678 | Progressive Deep Web Crawling Through Keyword Queries For Data Enrichment | 2019 | SIGMOD | 4.4702119e-05 |
| 12,333 | NEAR-Miner: Mining Evolution Associations of Web Site Directories for Efficient Maintenance of Web Archives | 2009 | VLDB | 4.1945683e-05 |