Determining Text Databases to Search in the Internet
Summary: Predicts the usefulness of a text database for a query as the number of documents in it that are sufficiently similar to the query. Proposes techniques for determining similarity thresholds for local databases in a collection-fusion setting where the databases use heterogeneous similarity functions. (summarized by gpt-5-nano on Feb 09 2026)
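To make the usefulness measure in the summary concrete, here is a minimal sketch that counts the documents whose similarity to a query exceeds a threshold. It is only an illustration of the definition, not the paper's method: it scans the documents directly rather than estimating the count, and the cosine similarity over raw term frequencies, the whitespace tokenizer, and the names `cosine_similarity` and `usefulness` are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters (assumed measure)."""
    # Counter returns 0 for missing terms, so only shared terms contribute.
    dot = sum(weight * b[term] for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def usefulness(query, documents, threshold):
    """Count the documents whose similarity to the query exceeds the threshold.

    Hypothetical helper: tokenization is a plain lowercase whitespace split.
    """
    query_vector = Counter(query.lower().split())
    return sum(
        1
        for doc in documents
        if cosine_similarity(query_vector, Counter(doc.lower().split())) > threshold
    )


if __name__ == "__main__":
    docs = [
        "text database selection",
        "query processing in text databases",
        "cooking recipes",
    ]
    # A stricter threshold admits fewer documents, so usefulness can only drop.
    print(usefulness("text database", docs, 0.5))
    print(usefulness("text database", docs, 0.3))
```

Under this definition a database's usefulness depends on the threshold chosen, which is why thresholds must be set per database when the local similarity functions differ.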
Authors
1. Weiyi Meng
2. King-Lup Liu
3. Clement Yu
4. Xiaodong Wang
5. Yuhsi Chang
6. Naphtali Rishe
Incoming Citations (Sorted by Pagerank)
Showing 8 of 8 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 805 | Evaluating Top-k Selection Queries | 1999 | VLDB | 0.00016437265 |
| 1,431 | Computing Geographical Scopes of Web Resources | 2000 | VLDB | 0.00012021056 |
| 1,492 | Distributed Search over the Hidden Web: Hierarchical Database Sampling and Selection | 2002 | VLDB | 0.00011694396 |
| 2,095 | Knocking the Door to the Deep Web: Integrating Web Query Interfaces | 2004 | SIGMOD | 0.000095505068 |
| 3,950 | Probe, Count, and Classify: Categorizing Hidden-Web Databases | 2001 | SIGMOD | 0.000065953844 |
| 8,691 | Efficient and Effective Metasearch for Text Databases Incorporating Linkages among Documents | 2001 | SIGMOD | 0.00004466355 |
| 12,447 | AllInOneNews: Development and Evaluation of a Large-Scale News Metasearch Engine | 2007 | SIGMOD | 0.000041945683 |
| 12,575 | When one Sample is not Enough: Improving Text Database Selection Using Shrinkage | 2004 | SIGMOD | 0.000041945683 |
Outgoing Citations (Sorted by Pagerank)
Showing 2 of 2 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,256 | Generalizing GLOSS to Vector-Space Databases and Broker Hierarchies | 1995 | VLDB | 0.00013022726 |
| 1,899 | Merging Ranks from Heterogeneous Internet Sources | 1997 | VLDB | 0.00010170921 |