SyncSignature: A Simple, Efficient, Parallelizable Framework for Tree Similarity Joins

Summary: SyncSignature is the first fully parallelizable framework for tree similarity joins under edit distance. It uses implicitly synchronized signature generation so that candidate pairs can be generated with a hash join. It outperforms prior work in parallel settings and, on large trees, in single-threaded runs as well. The paper also includes a theoretical analysis of signature schemes. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 13166
Venue: VLDB
Year: 2023
Pagerank: 4.7939964e-05
Overall Rank: 7,215 | 49.86%
DOI: 10.14778/3565816.3565833

Incoming Non-self Citations Over Time

Authors

1. Nikolai Karpov
2. Qin Zhang

Incoming Citations (Sorted by Pagerank)

Showing 2 of 2 citing papers.

Rank	Citing Paper	Year	Venue	Pagerank
10,714	Extensible and Robust Evaluation of Similarity Queries	2025	VLDB	4.1905499e-05
11,016	X-TED: Massive Parallelization of Tree Edit Distance	2024	VLDB	4.1905499e-05

Outgoing Citations (Sorted by Pagerank)

Showing 16 of 16 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank	Cited Paper	Year	Venue	Pagerank
34	Similarity Search in High Dimensions via Hashing	1999	VLDB	0.00076824554
125	Approximate String Joins in a Database (Almost) for Free	2001	VLDB	0.00044946098
264	Efficient Exact Set-Similarity Joins	2006	VLDB	0.00029950264
442	Efficient Parallel Set-Similarity Joins Using MapReduce	2010	SIGMOD	0.00023095823
962	A Comparison of Join Algorithms for Log Processing in MapReduce	2010	SIGMOD	0.00015003834
1,232	Ed-Join: An Efficient Algorithm for Similarity Joins With Edit Distance Constraints	2008	VLDB	0.00013133604
1,396	Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search	2012	SIGMOD	0.00012215253
2,588	Pass-Join: A Partition-based Method for Similarity Joins	2012	VLDB	8.4872437e-05
2,786	Approximate XML Joins	2002	SIGMOD	8.1223413e-05
3,206	Similarity Evaluation on Tree-structured Data	2005	SIGMOD	7.3855924e-05
3,302	RTED: A Robust Algorithm for the Tree Edit Distance	2012	VLDB	7.2445358e-05
3,779	Efficient Exact Edit Similarity Query Processing with the Asymmetric Signature Scheme	2011	SIGMOD	6.7709545e-05
4,215	Trie-Join: Efficient Trie-based String Similarity Joins with Edit-Distance Constraints	2010	VLDB	6.3464157e-05
4,402	Approximate Matching of Hierarchical Data Using pq-Grams	2005	VLDB	6.2086707e-05
6,240	Scaling Similarity Joins over Tree-Structured Data	2015	VLDB	5.1362097e-05
7,593	Exact Single-Source SimRank Computation on Large Graphs	2020	SIGMOD	4.6984572e-05

Semantically Similar Papers

Overall Rank	Paper	Year	Venue	Pagerank
6,805	Indexing for Subtree Similarity-Search using Edit Distance	2013	SIGMOD	4.9170518e-05
6,351	SigMatch: Fast and Scalable Multi-Pattern Matching	2010	VLDB	5.0956764e-05
9,563	Towards a Unified Framework for String Similarity Joins	2019	VLDB	4.3212967e-05
1,232	Ed-Join: An Efficient Algorithm for Similarity Joins With Edit Distance Constraints	2008	VLDB	0.00013133604
4,215	Trie-Join: Efficient Trie-based String Similarity Joins with Edit-Distance Constraints	2010	VLDB	6.3464157e-05
5,152	String Similarity Measures and Joins with Synonyms	2013	SIGMOD	5.6550754e-05
11,249	A Two-Level Signature Scheme for Stable Set Similarity Joins	2023	VLDB	4.1905499e-05
3,779	Efficient Exact Edit Similarity Query Processing with the Asymmetric Signature Scheme	2011	SIGMOD	6.7709545e-05
3,206	Similarity Evaluation on Tree-structured Data	2005	SIGMOD	7.3855924e-05
6,240	Scaling Similarity Joins over Tree-Structured Data	2015	VLDB	5.1362097e-05