Flexible String Matching Against Large Databases in Practice
Summary: Extends tf-idf-based fuzzy string matching to multi-attribute queries and known semantic equivalences in large databases. Reports practical performance optimizations, including accuracy-speed trade-offs, demonstrated on real AT&T datasets. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Nick Koudas
2. Amit Marathe
3. Divesh Srivastava
Incoming Citations (Sorted by Pagerank)
Showing 9 of 9 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,527 | Generic Schema Matching, Ten Years Later | 2011 | VLDB | 0.00011499442 |
| 1,533 | Example-driven Design of Efficient Record Matching Queries | 2007 | VLDB | 0.00011471971 |
| 3,267 | Benchmarking Declarative Approximate Selection Predicates | 2007 | SIGMOD | 7.3058429e-05 |
| 3,328 | Multi-column Substring Matching for Database Schema Translation | 2006 | VLDB | 7.2174278e-05 |
| 4,988 | Incremental Maintenance of Length Normalized Indexes for Approximate String Matching | 2009 | SIGMOD | 5.783959e-05 |
| 6,351 | SigMatch: Fast and Scalable Multi-Pattern Matching | 2010 | VLDB | 5.1005697e-05 |
| 7,708 | Efficient Top-k Algorithms for Approximate Substring Matching | 2013 | SIGMOD | 4.6721808e-05 |
| 12,544 | SPIDER: Flexible Matching in Databases | 2005 | SIGMOD | 4.1945683e-05 |
| 13,612 | Using SPIDER: An Experience Report | 2006 | SIGMOD | - |
Outgoing Citations (Sorted by Pagerank)
Showing 1 of 1 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 155 | Robust and Efficient Fuzzy Match for Online Data Cleaning | 2003 | SIGMOD | 0.00040637896 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 13,612 | Using SPIDER: An Experience Report | 2006 | SIGMOD | - |
| 9,563 | Towards a Unified Framework for String Similarity Joins | 2019 | VLDB | 4.3254416e-05 |
| 3,529 | Merging the Results of Approximate Match Operations | 2004 | VLDB | 7.0059524e-05 |
| 125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972 |
| 4,435 | Sampling Dirty Data for Matching Attributes | 2010 | SIGMOD | 6.1918164e-05 |
| 11,979 | Similarity Joins for Uncertain Strings | 2014 | SIGMOD | 4.1945683e-05 |
| 4,901 | Probabilistic String Similarity Joins | 2010 | SIGMOD | 5.8411648e-05 |
| 7,669 | Incorporating String Transformations in Record Matching | 2008 | SIGMOD | 4.6833751e-05 |
| 2,740 | String Similarity Joins: An Experimental Evaluation | 2014 | VLDB | 8.1980628e-05 |
| 155 | Robust and Efficient Fuzzy Match for Online Data Cleaning | 2003 | SIGMOD | 0.00040637896 |