Database Paper Browser

Incorporating String Transformations in Record Matching

Summary: Extends record matching by allowing user-defined string transformations (e.g., Robert/Bob) to define string similarity. Proposes an index-aware fuzzy-lookup framework that combines transformations with a base similarity, achieving better match quality and faster retrieval. (summarized by gpt-5-nano on Feb 09 2026)
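The idea of combining transformations with a base similarity can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes token-level rewrite rules, uses token-set Jaccard as the base similarity, and scores a pair by the best base similarity over all rewritten variants. The rule table `RULES` and the names `jaccard`, `variants`, and `transformed_similarity` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical user-defined transformation rules (an assumption for this sketch):
# each token maps to the alternatives it may be rewritten to.
RULES = {
    "bob": {"robert"},
    "robert": {"bob"},
    "st": {"street"},
    "street": {"st"},
}


def jaccard(a, b):
    """Token-set Jaccard similarity, used here as the base similarity."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def variants(tokens):
    """Yield every token sequence obtained by keeping or rewriting each token once."""
    options = [sorted({t} | RULES.get(t, set())) for t in tokens]
    for combo in product(*options):
        yield combo


def transformed_similarity(s, t):
    """Best base similarity over all rewritten variants of both strings."""
    s_tokens = s.lower().split()
    t_tokens = t.lower().split()
    return max(
        jaccard(u, v)
        for u in variants(s_tokens)
        for v in variants(t_tokens)
    )
```

With these rules, "Bob Smith" and "Robert Smith" share only one of three distinct tokens under plain Jaccard, but match exactly once "bob" may be rewritten to "robert". The sketch enumerates every variant pair, which grows exponentially with the number of rewritable tokens; the index-based lookup that the summary mentions is not shown here.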

Paper ID: 4070
Venue: SIGMOD
Year: 2008
Pagerank: 4.6833751e-05
Overall Rank: 7,669 | 46.65%
DOI: -

Authors

Incoming Citations (Sorted by Pagerank)

Showing 2 of 2 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
3,578 | Efficient Approximate Entity Extraction with Edit Distance Constraints | 2009 | SIGMOD | 6.9503858e-05
5,887 | Efficient Approximate Search on String Collections (Tutorial) | 2009 | VLDB | 5.2879769e-05

Outgoing Citations (Sorted by Pagerank)

Showing 4 of 4 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
34 | Similarity Search in High Dimensions via Hashing | 1999 | VLDB | 0.00076637636
266 | Efficient Exact Set-Similarity Joins | 2006 | VLDB | 0.00029718727
322 | Record Linkage: Similarity Measures and Algorithms | 2006 | SIGMOD | 0.00027518768
7,725 | Data Cleaning in Microsoft SQL Server 2005 | 2005 | SIGMOD | 4.6670883e-05

Semantically Similar Papers

Overall Rank | Paper | Year | Venue | Pagerank
125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972
9,563 | Towards a Unified Framework for String Similarity Joins | 2019 | VLDB | 4.3254416e-05
4,901 | Probabilistic String Similarity Joins | 2010 | SIGMOD | 5.8411648e-05
4,684 | Approximate String Joins with Abbreviations | 2018 | VLDB | 6.0006406e-05
11,979 | Similarity Joins for Uncertain Strings | 2014 | SIGMOD | 4.1945683e-05
1,533 | Example-driven Design of Efficient Record Matching Queries | 2007 | VLDB | 0.00011471971
5,536 | On Indexing Error-Tolerant Set Containment | 2010 | SIGMOD | 5.4532734e-05
5,151 | String Similarity Measures and Joins with Synonyms | 2013 | SIGMOD | 5.6609851e-05
4,026 | Flexible String Matching Against Large Databases in Practice | 2004 | VLDB | 6.5169976e-05
3,451 | Learning String Transformations From Examples | 2009 | VLDB | 7.0822216e-05