Efficient Exact Set-Similarity Joins
Summary: Presents algorithms for the exact set-similarity join (SSJoin), which finds all pairs of similar sets, one from each of two input collections. The algorithms are the first to be both exact and backed by performance guarantees; earlier algorithms with performance guarantees were only probabilistically approximate. Validated on real and synthetic datasets. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Arvind Arasu
2. Venkatesh Ganti
3. Raghav Kaushik
Incoming Citations (Sorted by Pagerank)
Showing 28 of 78 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 12 of 12 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,899 | Fast Approximate Similarity Join in Vector Databases | 2025 | SIGMOD | 4.427232e-05 |
| 4,775 | Set Similarity Joins on MapReduce: An Experimental Survey | 2018 | VLDB | 5.9315784e-05 |
| 10,930 | Similarity Joins of Sparse Features | 2024 | SIGMOD | 4.1945683e-05 |
| 7,847 | Set Similarity Join on Probabilistic Data | 2010 | VLDB | 4.6365272e-05 |
| 2,740 | String Similarity Joins: An Experimental Evaluation | 2014 | VLDB | 8.1980628e-05 |
| 250 | Efficient set joins on similarity predicates | 2004 | SIGMOD | 3.0661988e-04 |
| 4,353 | Overlap Set Similarity Joins with Theoretical Guarantees | 2018 | SIGMOD | 6.263585e-05 |
| 3,459 | An Empirical Evaluation of Set Similarity Join Techniques | 2016 | VLDB | 7.072508e-05 |
| 3,490 | Leveraging Set Relations in Exact Set Similarity Join | 2017 | VLDB | 7.0465856e-05 |
| 4,050 | An Efficient Partition Based Method for Exact Set Similarity Joins | 2016 | VLDB | 6.4953612e-05 |