Optimizing Content Freshness of Relations Extracted From the Web Using Keyword Search
Summary: Defines content freshness for Web-derived data and keeps local copies up to date via a keyword-search interface. Selectivity-based keyword queries are chosen to maximize freshness with minimal web traffic, outperforming naive baselines. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Mohan Yang
2. Haixun Wang
3. Lipyeow Lim
4. Min Wang
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 4 of 4 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 234 | Crawling the Hidden Web | 2001 | VLDB | 0.00032018108 |
| 1,282 | Best-Effort Cache Synchronization with Source Cooperation | 2002 | SIGMOD | 0.00012852655 |
| 1,304 | Synchronizing a Database to Improve Freshness | 2000 | SIGMOD | 0.00012691283 |
| 1,537 | Google's Deep-Web Crawl | 2008 | VLDB | 0.00011465704 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,548 | Optimal Algorithms for Crawling a Hidden Database in the Web | 2012 | VLDB | 4.3258142e-05 |
| 8,766 | Toward Scalable Keyword Search over Relational Data | 2010 | VLDB | 4.456315e-05 |
| 6,729 | Keyword Query Cleaning | 2008 | VLDB | 4.9483065e-05 |
| 6,607 | Balancing Performance and Data Freshness in Web Database Servers | 2003 | VLDB | 4.9962123e-05 |
| 7,768 | Accurate and Efficient Crawling for Relevant Websites | 2004 | VLDB | 4.6563056e-05 |
| 2,319 | Expressive and Flexible Access to Web-Extracted Data: A Keyword-based Structured Query Language | 2010 | SIGMOD | 9.0387108e-05 |
| 4,592 | Keyword Search on Relational Data Streams | 2007 | SIGMOD | 6.0613645e-05 |
| 5,672 | Effective Keyword-based Selection of Relational Databases | 2007 | SIGMOD | 5.3784128e-05 |
| 8,678 | Progressive Deep Web Crawling Through Keyword Queries For Data Enrichment | 2019 | SIGMOD | 4.4702119e-05 |
| 1,304 | Synchronizing a Database to Improve Freshness | 2000 | SIGMOD | 0.00012691283 |