Mining Document Collections to Facilitate Accurate Approximate Entity Matching
Summary: The paper mines document collections to expand a reference entity table with variations of each entity. Approximate matching of document substrings is thereby reduced to exact matching against the expanded table. The paper also introduces a new architecture and techniques to address efficiency, and reports experiments showing accuracy and scalability. (summarized by gpt-5-nano on Feb 09 2026)
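The reduction described in the summary can be illustrated with a minimal sketch. This is not the paper's algorithm or architecture: the variations are assumed to have been mined already and are written by hand here, and the function names (`build_expanded_table`, `find_mentions`), the sample entity, and the whitespace tokenization are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple


def build_expanded_table(variations: Dict[str, List[str]]) -> Dict[Tuple[str, ...], str]:
    """Map each tokenized variation, plus the canonical name itself, to its entity."""
    table: Dict[Tuple[str, ...], str] = {}
    for entity, variants in variations.items():
        for name in [entity, *variants]:
            table[tuple(name.lower().split())] = entity
    return table


def find_mentions(text: str, table: Dict[Tuple[str, ...], str]) -> List[Tuple[int, int, str]]:
    """Return (start_token, end_token, entity) for each token span that exactly matches a key."""
    tokens = text.lower().split()
    max_len = max((len(key) for key in table), default=0)
    mentions: List[Tuple[int, int, str]] = []
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(tokens):
                break
            # Exact lookup only: no similarity function is evaluated at match time.
            entity = table.get(tuple(tokens[start:end]))
            if entity is not None:
                mentions.append((start, end, entity))
    return mentions


# Hypothetical reference entity with hand-written variations (assumed, not mined).
VARIATIONS = {
    "Microsoft Xbox 360 Console": ["xbox 360", "microsoft xbox 360"],
}

table = build_expanded_table(VARIATIONS)
mentions = find_mentions("I bought an Xbox 360 yesterday", table)
```

Because every variation is stored as a key, a mention that differs from the canonical name is found by a plain dictionary lookup. The cost moves from match time to the offline step that produces the variations, which this sketch does not attempt.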
Authors
1. Surajit Chaudhuri
2. Venkatesh Ganti
3. Dong Xin
Incoming Citations (Sorted by Pagerank)
Showing 6 of 6 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,451 | Learning String Transformations From Examples | 2009 | VLDB | 7.0822216e-05 |
| 5,073 | Faerie: Efficient Filtering Algorithms for Approximate Dictionary-based Entity Extraction | 2011 | SIGMOD | 5.7177424e-05 |
| 5,536 | On Indexing Error-Tolerant Set Containment | 2010 | SIGMOD | 5.4532734e-05 |
| 5,652 | From Information to Knowledge: Harvesting Entities and Relationships from Web Sources | 2010 | PODS | 5.3903671e-05 |
| 6,580 | Query Portals: Dynamically Generating Portals for Entity-Oriented Web Queries | 2010 | SIGMOD | 5.0034092e-05 |
| 11,982 | Matching Titles with Cross Title Web-Search Enrichment and Community Detection | 2014 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 22 | SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets | 2008 | VLDB | 0.0008456613 |
| 181 | Mining Frequent Patterns without Candidate Generation | 2000 | SIGMOD | 0.00036992674 |
| 229 | Reference Reconciliation in Complex Information Spaces | 2005 | SIGMOD | 0.00032242633 |
| 322 | Record Linkage: Similarity Measures and Algorithms | 2006 | SIGMOD | 0.00027518768 |
| 3,451 | Learning String Transformations From Examples | 2009 | VLDB | 7.0822216e-05 |
| 3,868 | An Efficient Filter for Approximate Membership Checking | 2008 | SIGMOD | 6.6822543e-05 |
| 5,379 | Scalable Ad-hoc Entity Extraction from Text Collections | 2008 | VLDB | 5.5405989e-05 |