Learning String Transformations From Examples
Summary: The paper learns string transformations from examples so that matching can handle synonyms, abbreviations, and aliases. It formulates the learning of a concise transformation set as an NP-hard optimization problem and proposes a greedy approximation, evaluated in experiments on real data. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
- Arvind Arasu
- Surajit Chaudhuri
- Raghav Kaushik
Incoming Citations (Sorted by Pagerank)
Showing 8 of 8 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,894 | Baran: Effective Error Correction via a Unified Context Representation and Transfer Learning | 2020 | VLDB | 0.0001018378 |
| 3,230 | Learning Semantic String Transformations from Examples | 2012 | VLDB | 7.339123e-05 |
| 4,684 | Approximate String Joins with Abbreviations | 2018 | VLDB | 6.0006406e-05 |
| 4,951 | Mining Document Collections to Facilitate Accurate Approximate Entity Matching | 2009 | VLDB | 5.8100413e-05 |
| 5,536 | On Indexing Error-Tolerant Set Containment | 2010 | SIGMOD | 5.4532734e-05 |
| 6,818 | NLyze: Interactive Programming by Natural Language for SpreadSheet Data Analysis and Manipulation | 2014 | SIGMOD | 4.916347e-05 |
| 9,563 | Towards a Unified Framework for String Similarity Joins | 2019 | VLDB | 4.3254416e-05 |
| 11,343 | SPINE: Scaling up Programming-by-Negative-Example for String Filtering and Transformation | 2022 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 229 | Reference Reconciliation in Complex Information Spaces | 2005 | SIGMOD | 0.00032242633 |
| 322 | Record Linkage: Similarity Measures and Algorithms | 2006 | SIGMOD | 0.00027518768 |
| 992 | XTRACT: A System for Extracting Document Type Descriptors from XML Documents | 2000 | SIGMOD | 0.00014799689 |
| 1,533 | Example-driven Design of Efficient Record Matching Queries | 2007 | VLDB | 0.00011471971 |
| 4,951 | Mining Document Collections to Facilitate Accurate Approximate Entity Matching | 2009 | VLDB | 5.8100413e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,536 | On Indexing Error-Tolerant Set Containment | 2010 | SIGMOD | 5.4532734e-05 |
| 3,578 | Efficient Approximate Entity Extraction with Edit Distance Constraints | 2009 | SIGMOD | 6.9503858e-05 |
| 4,026 | Flexible String Matching Against Large Databases in Practice | 2004 | VLDB | 6.5169976e-05 |
| 3,230 | Learning Semantic String Transformations from Examples | 2012 | VLDB | 7.339123e-05 |
| 11,087 | Dealing with Acronyms, Abbreviations, and Typos in Real-World Entity Matching | 2024 | VLDB | 4.1945683e-05 |
| 2,740 | String Similarity Joins: An Experimental Evaluation | 2014 | VLDB | 8.1980628e-05 |
| 7,708 | Efficient Top-k Algorithms for Approximate Substring Matching | 2013 | SIGMOD | 4.6721808e-05 |
| 4,684 | Approximate String Joins with Abbreviations | 2018 | VLDB | 6.0006406e-05 |
| 5,151 | String Similarity Measures and Joins with Synonyms | 2013 | SIGMOD | 5.6609851e-05 |
| 7,669 | Incorporating String Transformations in Record Matching | 2008 | SIGMOD | 4.6833751e-05 |