Ed-Join: An Efficient Algorithm for Similarity Joins With Edit Distance Constraints
Summary: Ed-Join uses mismatching q-grams to derive two novel edit-distance lower bounds, enabling tighter filtering for similarity joins under edit-distance constraints. It substantially reduces candidate sets and runtime, outperforming prior q-gram-based approaches on large real datasets. (summarized by gpt-5-nano on Feb 09 2026)
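To make the idea of a lower bound from mismatching q-grams concrete, here is a minimal Python sketch. It is a simplified illustration and not the paper's algorithm: it treats a q-gram of `s` as mismatching when it occurs nowhere in `t` (ignoring positions), and the function names `qgrams`, `min_edit_errors`, and `edit_distance` are hypothetical. The reasoning is that every edit operation can only destroy q-grams covering one position of `s`, so the smallest number of positions needed to hit every mismatching q-gram can never exceed the true edit distance.

```python
def qgrams(s, q):
    """Return (start, gram) pairs for every length-q substring of s."""
    return [(i, s[i:i + q]) for i in range(len(s) - q + 1)]


def min_edit_errors(s, t, q):
    """Lower bound on ed(s, t) from the q-grams of s that are absent from t.

    Assumption for this sketch: a q-gram mismatches if it appears nowhere
    in t. Each mismatching q-gram spans positions [start, start + q - 1],
    and at least one edit must touch that span. A greedy left-to-right scan
    finds the fewest positions that touch every span.
    """
    t_grams = {g for _, g in qgrams(t, q)}
    starts = [i for i, g in qgrams(s, q) if g not in t_grams]  # ascending
    count = 0
    last = -1  # rightmost position chosen so far
    for i in starts:
        if i > last:            # span not yet touched by a chosen position
            last = i + q - 1    # choose the last position of this span
            count += 1
    return count


def edit_distance(s, t):
    """Standard dynamic-programming Levenshtein distance, for verification."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        cur = [i]
        for j, ct in enumerate(t, start=1):
            cur.append(min(prev[j] + 1,                 # delete cs
                           cur[j - 1] + 1,              # insert ct
                           prev[j - 1] + (cs != ct)))   # match or substitute
        prev = cur
    return prev[-1]


def may_be_within(s, t, q, tau):
    """Filter step: False means the pair can be pruned without verification."""
    return min_edit_errors(s, t, q) <= tau
```

In a join with threshold `tau`, a pair for which `may_be_within` returns `False` can be discarded without running `edit_distance`; pairs that pass still need verification, since the bound is only one-sided.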
Authors
1. Chuan Xiao
2. Wei Wang
3. Xuemin Lin
Incoming Citations (Sorted by Pagerank)
Showing 37 of 37 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 12 of 12 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,979 | Similarity Joins for Uncertain Strings | 2014 | SIGMOD | 4.1945683e-05 |
| 3,226 | Extending Q-Grams to Estimate Selectivity of String Matching with Low Edit Distance | 2007 | VLDB | 7.3433307e-05 |
| 3,774 | Efficient Exact Edit Similarity Query Processing with the Asymmetric Signature Scheme | 2011 | SIGMOD | 6.7757301e-05 |
| 250 | Efficient Set Joins on Similarity Predicates | 2004 | SIGMOD | 0.00030661988 |
| 125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972 |
| 4,901 | Probabilistic String Similarity Joins | 2010 | SIGMOD | 5.8411648e-05 |
| 2,740 | String Similarity Joins: An Experimental Evaluation | 2014 | VLDB | 8.1980628e-05 |
| 7,708 | Efficient Top-k Algorithms for Approximate Substring Matching | 2013 | SIGMOD | 4.6721808e-05 |
| 4,216 | Trie-Join: Efficient Trie-based String Similarity Joins with Edit-Distance Constraints | 2010 | VLDB | 6.3521675e-05 |
| 2,592 | Pass-Join: A Partition-based Method for Similarity Joins | 2012 | VLDB | 8.4795761e-05 |