Web Scale Taxonomy Cleansing
Summary: The paper addresses web-scale taxonomy matching using external positive and negative evidence, formulating the task as an optimal multi-way graph cut problem. A Monte Carlo greedy-cut algorithm makes the cleansing scalable, and extensive experiments against three baselines demonstrate its advantage. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Taesung Lee
2. Zhongyuan Wang
3. Haixun Wang
4. Seung-won Hwang
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,066 | Probase: A Probabilistic Taxonomy for Text Understanding | 2012 | SIGMOD | 0.0001433416 |
| 7,475 | Optimizing Index for Taxonomy Keyword Search | 2012 | SIGMOD | 4.7191809e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 62 | Freebase: A Collaboratively Created Graph Database For Structuring Human Knowledge | 2008 | SIGMOD | 0.0006429466 |
| 229 | Reference Reconciliation in Complex Information Spaces | 2005 | SIGMOD | 0.00032242633 |
| 280 | Eliminating Fuzzy Duplicates in Data Warehouses | 2002 | VLDB | 0.00029113044 |
| 509 | On Active Learning of Record Matching Packages | 2010 | SIGMOD | 0.00021409518 |
| 936 | Framework for Evaluating Clustering Algorithms in Duplicate Detection | 2009 | VLDB | 0.0001521549 |
| 1,950 | Leveraging Data and Structure in Ontology Integration | 2007 | SIGMOD | 9.9756731e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,345 | Entity Matching: How Similar Is Similar | 2011 | VLDB | 0.00012468408 |
| 2,514 | Comparative Analysis of Approximate Blocking Techniques for Entity Resolution | 2016 | VLDB | 8.6139012e-05 |
| 11,373 | Generalized Supervised Meta-blocking | 2022 | VLDB | 4.1945683e-05 |
| 902 | Statistical Schema Matching across Web Query Interfaces | 2003 | SIGMOD | 0.00015486247 |
| 8,409 | Ontology-based Entity Matching in Attributed Graphs | 2019 | VLDB | 4.5205877e-05 |
| 5,362 | Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach | 2016 | SIGMOD | 5.5473503e-05 |
| 5,798 | Exploiting Context Analysis for Combining Multiple Entity Resolution Systems | 2009 | SIGMOD | 5.3231654e-05 |
| 319 | Evaluation of entity resolution approaches on real-world match problems | 2010 | VLDB | 0.00027781866 |
| 8,005 | Online Topic-Aware Entity Resolution Over Incomplete Data Streams | 2021 | SIGMOD | 4.6081461e-05 |
| 7,588 | Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases | 2013 | VLDB | 4.7030914e-05 |