| 1,187 | JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes | 2019 | SIGMOD | 0.00013443639 |
| 2,641 | Locality-Sensitive Hashing for Earthquake Detection: A Case Study of Scaling Data-Driven Science | 2018 | VLDB | 8.3905374e-05 |
| 2,730 | Open Data Integration | 2018 | VLDB | 8.2126735e-05 |
| 2,740 | String Similarity Joins: An Experimental Evaluation | 2014 | VLDB | 8.1980628e-05 |
| 3,263 | QASCA: A Quality-Aware Task Assignment System for Crowdsourcing Applications | 2015 | SIGMOD | 7.3097573e-05 |
| 3,459 | An Empirical Evaluation of Set Similarity Join Techniques | 2016 | VLDB | 7.072508e-05 |
| 3,490 | Leveraging Set Relations in Exact Set Similarity Join | 2017 | VLDB | 7.0465856e-05 |
| 4,050 | An Efficient Partition Based Method for Exact Set Similarity Joins | 2016 | VLDB | 6.4953612e-05 |
| 4,250 | Local Similarity Search for Unstructured Text | 2016 | SIGMOD | 6.3241139e-05 |
| 4,353 | Overlap Set Similarity Joins with Theoretical Guarantees | 2018 | SIGMOD | 6.263585e-05 |
| 4,684 | Approximate String Joins with Abbreviations | 2018 | VLDB | 6.0006406e-05 |
| 4,808 | On the Complexity of Inner Product Similarity Join | 2016 | PODS | 5.908896e-05 |
| 5,151 | String Similarity Measures and Joins with Synonyms | 2013 | SIGMOD | 5.6609851e-05 |
| 5,365 | Question Answering Over Knowledge Graphs: Question Understanding Via Template Decomposition | 2018 | VLDB | 5.5461187e-05 |
| 5,434 | Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples | 2021 | SIGMOD | 5.5045402e-05 |
| 5,469 | Learned Cardinality Estimation for Similarity Queries | 2021 | SIGMOD | 5.4898192e-05 |
| 6,074 | Pigeonring: A Principle for Faster Thresholded Similarity Search | 2019 | VLDB | 5.2242306e-05 |
| 6,605 | Dima: A Distributed In-Memory Similarity-Based Query Processing System | 2017 | VLDB | 4.9965703e-05 |
| 6,726 | A Pivotal Prefix Based Filtering Algorithm for String Similarity Search | 2014 | SIGMOD | 4.9484027e-05 |
| 7,109 | Efficient Similarity Join and Search on Multi-Attribute Data | 2015 | SIGMOD | 4.8292998e-05 |
| 7,215 | SyncSignature: A Simple, Efficient, Parallelizable Framework for Tree Similarity Joins | 2023 | VLDB | 4.7985991e-05 |
| 7,588 | Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases | 2013 | VLDB | 4.7030914e-05 |
| 7,635 | Allign: Aligning All-Pair Near-Duplicate Passages in Long Texts | 2021 | SIGMOD | 4.6908858e-05 |
| 7,668 | Human-in-the-loop Data Integration | 2017 | VLDB | 4.6834075e-05 |
| 8,291 | TxtAlign: Efficient Near-Duplicate Text Alignment Search via Bottom-k Sketches for Plagiarism Detection | 2022 | SIGMOD | 4.5435639e-05 |
| 8,618 | Nexus: Correlation Discovery over Collections of Spatio-Temporal Tabular Data | 2024 | SIGMOD | 4.4838259e-05 |
| 9,439 | On-the-Fly Token Similarity Joins in Relational Databases | 2014 | SIGMOD | 4.3423824e-05 |
| 9,563 | Towards a Unified Framework for String Similarity Joins | 2019 | VLDB | 4.3254416e-05 |
| 9,832 | Balance-Aware Distributed String Similarity-Based Query Processing System | 2019 | VLDB | 4.2751057e-05 |
| 9,876 | Near-Duplicate Sequence Search at Scale for Large Language Model Memorization Evaluation | 2023 | SIGMOD | 4.2667743e-05 |
| 9,932 | Local Filtering: Improving the Performance of Approximate Queries on String Collections | 2015 | SIGMOD | 4.2500258e-05 |
| 9,933 | Efficient and Effective KNN Sequence Search with Approximate n-grams | 2014 | VLDB | 4.2500258e-05 |
| 10,706 | Extensible and Robust Evaluation of Similarity Queries | 2025 | VLDB | 4.1945683e-05 |
| 11,087 | Dealing with Acronyms, Abbreviations, and Typos in Real-World Entity Matching | 2024 | VLDB | 4.1945683e-05 |
| 11,175 | Grouping Time Series for Efficient Columnar Storage | 2023 | SIGMOD | 4.1945683e-05 |
| 11,247 | A Two-Level Signature Scheme for Stable Set Similarity Joins | 2023 | VLDB | 4.1945683e-05 |
| 11,305 | TokenJoin: Efficient Filtering for Set Similarity Join with Maximum Weighted Bipartite Matching | 2023 | VLDB | 4.1945683e-05 |
| 11,347 | OpenTFV: An Open Domain Table-Based Fact Verification System | 2022 | SIGMOD | 4.1945683e-05 |
| 11,504 | LES3: Learning-based Exact Set Similarity Search | 2021 | VLDB | 4.1945683e-05 |
| 11,724 | ZigZag: Supporting Similarity Queries on Vector Space Models | 2018 | SIGMOD | 4.1945683e-05 |
| 12,086 | RCSI: Scalable similarity search in thousand(s) of genomes | 2013 | VLDB | 4.1945683e-05 |