Efficient Similarity Join and Search on Multi-Attribute Data
Summary: Introduces a prefix-tree index that enables holistic pruning across multiple attributes for similarity join and search. Guided by a cost model, greedy and budget-based multi-tree strategies, combined with a hybrid verifier, yield strong empirical gains. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Guoliang Li
2. Jian He
3. Dong Deng
4. Jian Li
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
| Overall Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,175 | Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services | 2017 | SIGMOD | 9.3644117e-05 |
| 4,250 | Local Similarity Search for Unstructured Text | 2016 | SIGMOD | 6.3241139e-05 |
| 4,402 | Smurf: Self-Service String Matching Using Random Forests | 2019 | VLDB | 6.2195162e-05 |
| 5,469 | Learned Cardinality Estimation for Similarity Queries | 2021 | SIGMOD | 5.4898192e-05 |
| 6,270 | MATE: Multi-Attribute Table Extraction | 2022 | VLDB | 5.1337451e-05 |
| 7,668 | Human-in-the-loop Data Integration | 2017 | VLDB | 4.6834075e-05 |
| 9,832 | Balance-Aware Distributed String Similarity-Based Query Processing System | 2019 | VLDB | 4.2751057e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 14 of 14 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,050 | An Efficient Partition Based Method for Exact Set Similarity Joins | 2016 | VLDB | 6.4953612e-05 |
| 7,522 | Efficient and Tunable Similar Set Retrieval | 2001 | SIGMOD | 4.7180617e-05 |
| 4,216 | Trie-Join: Efficient Trie-based String Similarity Joins with Edit-Distance Constraints | 2010 | VLDB | 6.3521675e-05 |
| 5,615 | A Scalable Index for Top-k Subtree Similarity Queries | 2019 | SIGMOD | 5.4101086e-05 |
| 8,899 | Fast Approximate Similarity Join in Vector Databases | 2025 | SIGMOD | 4.427232e-05 |
| 6,241 | Scaling Similarity Joins over Tree-Structured Data | 2015 | VLDB | 5.1411469e-05 |
| 3,199 | Similarity Evaluation on Tree-structured Data | 2005 | SIGMOD | 7.3927291e-05 |
| 3,490 | Leveraging Set Relations in Exact Set Similarity Join | 2017 | VLDB | 7.0465856e-05 |
| 250 | Efficient Set Joins on Similarity Predicates | 2004 | SIGMOD | 3.0661988e-04 |
| 1,396 | Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search | 2012 | SIGMOD | 1.2204748e-04 |