Enhanced hypertext categorization using hyperlinks
Summary: Proposes robust statistical models with relaxation labeling to exploit hyperlinks for hypertext classification, mitigating noisy link signals in a local neighborhood. Demonstrates large gains over text-only classifiers on Yahoo and patent data, adapting to partial supervision with small labeled neighborhoods (error ~21%). (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Soumen Chakrabarti
2. Byron Dom
3. Piotr Indyk
Incoming Citations (Sorted by Pagerank)
Showing 6 of 6 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 771 | Distributed Hypertext Resource Discovery Through Examples | 1999 | VLDB | 0.00016887664 |
| 1,431 | Computing Geographical Scopes of Web Resources | 2000 | VLDB | 0.00012021056 |
| 2,325 | Building Hierarchical Classifiers Using Class Proximity | 1999 | VLDB | 9.0304462e-05 |
| 7,768 | Accurate and Efficient Crawling for Relevant Websites | 2004 | VLDB | 4.6563056e-05 |
| 12,615 | The BINGO! System for Information Portal Generation and Expert Web Search | 2003 | CIDR | 4.1945683e-05 |
| 12,691 | Toward Learning Based Web Query Processing | 2000 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 3 of 3 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,107 | SPRINT: A Scalable Parallel Classifier for Data Mining | 1996 | VLDB | 0.00013985717 |
| 1,289 | Using Probabilistic Information in Data Integration | 1997 | VLDB | 0.00012804879 |
| 3,485 | Using taxonomy, discriminants, and signatures for navigating in text databases | 1997 | VLDB | 7.0504959e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 13,925 | Hypertext Databases and Data Mining | 1999 | SIGMOD | - |
| 12,669 | Self-similarity in the web | 2001 | VLDB | 4.1945683e-05 |
| 4,342 | LinkClus: Efficient Clustering via Heterogeneous Semantic Links | 2006 | VLDB | 6.2758722e-05 |
| 2,325 | Building Hierarchical Classifiers Using Class Proximity | 1999 | VLDB | 9.0304462e-05 |
| 3,950 | Probe, Count, and Classify: Categorizing Hidden-Web Databases | 2001 | SIGMOD | 6.5953844e-05 |
| 8,691 | Efficient and Effective Metasearch for Text Databases Incorporating Linkages among Documents | 2001 | SIGMOD | 4.466355e-05 |
| 771 | Distributed Hypertext Resource Discovery Through Examples | 1999 | VLDB | 0.00016887664 |
| 13,808 | A Method of Re-ranking Web Search Results Using their Hidden Hyperlink Structure | 2002 | VLDB | - |
| 12,928 | Indexing in a Hypertext Database | 1990 | VLDB | 4.1945683e-05 |
| 3,485 | Using taxonomy, discriminants, and signatures for navigating in text databases | 1997 | VLDB | 7.0504959e-05 |