Dealing with Web Data: History and Look ahead
Summary: A survey of Web data management since 2000, grounded in historical Web data and incremental crawling. It revisits the finding that allocating fetch resources to changing data items does not guarantee timeliness, and outlines challenges for scalable data management. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Outgoing Citations (Sorted by Pagerank)
Showing 1 of 1 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,928 | The Evolution of the Web and Implications for an Incremental Crawler | 2000 | VLDB | 4.8925595e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 13,650 | Database Issues for the 21st century | 2005 | SIGMOD | - |
| 3,683 | Finding replicated web collections | 2000 | SIGMOD | 6.8477289e-05 |
| 7,768 | Accurate and Efficient Crawling for Relevant Websites | 2004 | VLDB | 4.6563056e-05 |
| 1,851 | An Analysis of Structured Data on the Web | 2012 | VLDB | 0.00010327871 |
| 12,231 | Optimizing Content Freshness of Relations Extracted From the Web Using Keyword Search | 2010 | SIGMOD | 4.1945683e-05 |
| 13,588 | Databases on the Web | 2007 | SIGMOD | - |
| 3,487 | Longitudinal Analytics on Web Archive Data: It's About Time! | 2011 | CIDR | 7.0480733e-05 |
| 14,014 | Future Directions and Research Problems in the World Wide Web | 1996 | PODS | - |
| 9,548 | Optimal Algorithms for Crawling a Hidden Database in the Web | 2012 | VLDB | 4.3258142e-05 |
| 6,928 | The Evolution of the Web and Implications for an Incremental Crawler | 2000 | VLDB | 4.8925595e-05 |