Dealing with Acronyms, Abbreviations, and Typos in Real-World Entity Matching
Summary: Smash is a similarity measure, computed with a dynamic-programming algorithm, that jointly handles acronyms, abbreviations, and typos in entity/record matching without needing pre-specified synonym rules. Two optimizations and OpenRefine integration yield large F-score gains over strong baselines (including GPT-4). (summarized by gpt-5-mini on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Joshua Wu
2. Dixin Tang
3. Nithin Chalapathi
4. Tristan Chambers
5. Julie Ciccolini
6. Cheryl Phillips
7. Lisa Pickoff-White
8. Aditya Parameswaran
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 16 of 16 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,026 | Flexible String Matching Against Large Databases in Practice | 2004 | VLDB | 6.5169976e-05 |
| 221 | Deep Entity Matching with Pre-Trained Language Models | 2021 | VLDB | 0.00033121824 |
| 319 | Evaluation of entity resolution approaches on real-world match problems | 2010 | VLDB | 0.00027781866 |
| 7,669 | Incorporating String Transformations in Record Matching | 2008 | SIGMOD | 4.6833751e-05 |
| 4,951 | Mining Document Collections to Facilitate Accurate Approximate Entity Matching | 2009 | VLDB | 5.8100413e-05 |
| 5,151 | String Similarity Measures and Joins with Synonyms | 2013 | SIGMOD | 5.6609851e-05 |
| 3,578 | Efficient Approximate Entity Extraction with Edit Distance Constraints | 2009 | SIGMOD | 6.9503858e-05 |
| 3,451 | Learning String Transformations From Examples | 2009 | VLDB | 7.0822216e-05 |
| 1,345 | Entity Matching: How Similar Is Similar | 2011 | VLDB | 0.00012468408 |
| 4,684 | Approximate String Joins with Abbreviations | 2018 | VLDB | 6.0006406e-05 |