| 712 | Magellan: Toward Building Entity Matching Management Systems | 2016 | VLDB | 0.00017732426 |
| 818 | Finding Related Tables | 2012 | SIGMOD | 0.00016311524 |
| 1,074 | Processing Theta-Joins using MapReduce | 2011 | SIGMOD | 0.00014260096 |
| 1,187 | JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes | 2019 | SIGMOD | 0.00013443639 |
| 1,308 | Upper and Lower Bounds on the Cost of a Map-Reduce Computation | 2013 | VLDB | 0.00012661651 |
| 1,396 | Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search | 2012 | SIGMOD | 0.00012204748 |
| 1,715 | V-SMART-Join: A Scalable MapReduce Framework for All-Pair Similarity Joins of Multisets and Vectors | 2012 | VLDB | 0.00010803271 |
| 1,931 | Efficient Processing of k Nearest Neighbor Joins using MapReduce | 2012 | VLDB | 0.00010040427 |
| 2,024 | ATLAS: A Probabilistic Algorithm for High Dimensional Similarity Search | 2011 | SIGMOD | 9.7519678e-05 |
| 2,175 | Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services | 2017 | SIGMOD | 9.3644117e-05 |
| 2,337 | Efficient Processing of Data Warehousing Queries in a Split Execution Environment | 2011 | SIGMOD | 9.0098186e-05 |
| 2,592 | Pass-Join: A Partition-based Method for Similarity Joins | 2012 | VLDB | 8.4795761e-05 |
| 2,674 | Minimal MapReduce Algorithms | 2013 | SIGMOD | 8.3328645e-05 |
| 2,740 | String Similarity Joins: An Experimental Evaluation | 2014 | VLDB | 8.1980628e-05 |
| 3,062 | Efficient Multi-way Theta-Join Processing Using MapReduce | 2012 | VLDB | 7.6343994e-05 |
| 3,129 | Scalable Big Graph Processing in MapReduce | 2014 | SIGMOD | 7.5008242e-05 |
| 3,141 | ClusterJoin: A Similarity Joins Framework using Map-Reduce | 2014 | VLDB | 7.4829448e-05 |
| 3,263 | QASCA: A Quality-Aware Task Assignment System for Crowdsourcing Applications | 2015 | SIGMOD | 7.3097573e-05 |
| 3,459 | An Empirical Evaluation of Set Similarity Join Techniques | 2016 | VLDB | 7.072508e-05 |
| 3,490 | Leveraging Set Relations in Exact Set Similarity Join | 2017 | VLDB | 7.0465856e-05 |
| 3,528 | Distributed Data Deduplication | 2016 | VLDB | 7.0066139e-05 |
| 4,050 | An Efficient Partition Based Method for Exact Set Similarity Joins | 2016 | VLDB | 6.4953612e-05 |
| 4,147 | Exploiting MapReduce-based Similarity Joins | 2012 | SIGMOD | 6.4096022e-05 |
| 4,402 | Smurf: Self-Service String Matching Using Random Forests | 2019 | VLDB | 6.2195162e-05 |
| 4,493 | ASTERIX: An Open Source System for "Big Data" Management and Analysis (Demo) | 2012 | VLDB | 6.141595e-05 |
| 4,775 | Set Similarity Joins on MapReduce: An Experimental Survey | 2018 | VLDB | 5.9315784e-05 |
| 5,434 | Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples | 2021 | SIGMOD | 5.5045402e-05 |
| 5,902 | The Communication Complexity of Distributed Set-Joins with Applications to Matrix Multiplication | 2015 | PODS | 5.2796864e-05 |
| 5,903 | Building Wavelet Histograms on Large Data in MapReduce | 2012 | VLDB | 5.2791351e-05 |
| 6,099 | WOO: A Scalable and Multi-tenant Platform for Continuous Knowledge Base Synthesis | 2013 | VLDB | 5.2104516e-05 |
| 6,507 | Similarity Join over Array Data | 2016 | SIGMOD | 5.0337166e-05 |
| 6,605 | Dima: A Distributed In-Memory Similarity-Based Query Processing System | 2017 | VLDB | 4.9965703e-05 |
| 7,153 | Submodularity of Distributed Join Computation | 2018 | SIGMOD | 4.8153963e-05 |
| 7,215 | SyncSignature: A Simple, Efficient, Parallelizable Framework for Tree Similarity Joins | 2023 | VLDB | 4.7985991e-05 |
| 7,588 | Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases | 2013 | VLDB | 4.7030914e-05 |
| 7,668 | Human-in-the-loop Data Integration | 2017 | VLDB | 4.6834075e-05 |
| 8,137 | Customizable and Scalable Fuzzy Join for Big Data | 2019 | VLDB | 4.5774794e-05 |
| 8,291 | TxtAlign: Efficient Near-Duplicate Text Alignment Search via Bottom-k Sketches for Plagiarism Detection | 2022 | SIGMOD | 4.5435639e-05 |
| 9,115 | MapReduce Algorithms for Big Data Analysis | 2012 | VLDB | 4.3932167e-05 |
| 9,502 | Streaming Similarity Self-Join | 2016 | VLDB | 4.3341665e-05 |
| 9,832 | Balance-Aware Distributed String Similarity-Based Query Processing System | 2019 | VLDB | 4.2751057e-05 |
| 10,930 | Similarity Joins of Sparse Features | 2024 | SIGMOD | 4.1945683e-05 |
| 11,724 | ZigZag: Supporting Similarity Queries on Vector Space Models | 2018 | SIGMOD | 4.1945683e-05 |
| 11,976 | Anti-Combining for MapReduce | 2014 | SIGMOD | 4.1945683e-05 |