Database Paper Browser

TokenJoin: Efficient Filtering for Set Similarity Join with Maximum Weighted Bipartite Matching

Summary: TokenJoin is a token-based filtering method for fuzzy set similarity joins that use maximum-weight bipartite matching. It replaces expensive element-pair similarity checks with cheaper token comparisons. It supports top-k and early termination, and achieves up to an order-of-magnitude speedup over state-of-the-art element-based filters. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 13332
Venue: VLDB
Year: 2023
Pagerank: 4.1945683e-05
Overall Rank: 11,305 | 21.36%
DOI: 10.14778/3574245.3574263

Incoming Non-self Citations Over Time

No non-self incoming citations found for this paper in this database.

Authors

Incoming Citations (Sorted by Pagerank)

Showing 0 of 0 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 16 of 16 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972 |
| 250 | Efficient set joins on similarity predicates | 2004 | SIGMOD | 0.00030661988 |
| 266 | Efficient Exact Set-Similarity Joins | 2006 | VLDB | 0.00029718727 |
| 1,187 | JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes | 2019 | SIGMOD | 0.00013443639 |
| 1,234 | Ed-Join: An Efficient Algorithm for Similarity Joins With Edit Distance Constraints | 2008 | VLDB | 0.00013122499 |
| 1,305 | Bayesian Locality Sensitive Hashing for Fast Similarity Search | 2012 | VLDB | 0.00012687101 |
| 1,396 | Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search | 2012 | SIGMOD | 0.00012204748 |
| 2,024 | ATLAS: A Probabilistic Algorithm for High Dimensional Similarity Search | 2011 | SIGMOD | 9.7519678e-05 |
| 2,376 | Bed-Tree: An All-Purpose Index Structure for String Similarity Search Based on Edit Distance | 2010 | SIGMOD | 8.9424361e-05 |
| 2,740 | String Similarity Joins: An Experimental Evaluation | 2014 | VLDB | 8.1980628e-05 |
| 3,459 | An Empirical Evaluation of Set Similarity Join Techniques | 2016 | VLDB | 7.072508e-05 |
| 3,490 | Leveraging Set Relations in Exact Set Similarity Join | 2017 | VLDB | 7.0465856e-05 |
| 3,514 | Spatio-Textual Similarity Joins | 2013 | VLDB | 7.0226998e-05 |
| 3,774 | Efficient Exact Edit Similarity Query Processing with the Asymmetric Signature Scheme | 2011 | SIGMOD | 6.7757301e-05 |
| 4,775 | Set Similarity Joins on MapReduce: An Experimental Survey | 2018 | VLDB | 5.9315784e-05 |
| 5,179 | SilkMoth: An Efficient Method for Finding Related Sets with Maximum Matching Constraints | 2017 | VLDB | 5.6428428e-05 |

Semantically Similar Papers