| 149 | Trio: A System for Integrated Management of Data, Accuracy, and Lineage | 2005 | CIDR | 0.00041101118 |
| 229 | Reference Reconciliation in Complex Information Spaces | 2005 | SIGMOD | 0.00032242633 |
| 250 | Efficient set joins on similarity predicates | 2004 | SIGMOD | 0.00030661988 |
| 266 | Efficient Exact Set-Similarity Joins | 2006 | VLDB | 0.00029718727 |
| 420 | InfoGather: Entity Augmentation and Attribute Discovery By Holistic Matching with Web Tables | 2012 | SIGMOD | 0.00023719065 |
| 627 | Management of Probabilistic Data: Foundations and Challenges | 2007 | PODS | 0.00018959005 |
| 759 | To Search or to Crawl? Towards a Query Optimizer for Text-Centric Tasks | 2006 | SIGMOD | 0.00017064615 |
| 814 | Entity Resolution: Theory, Practice & Open Challenges | 2012 | VLDB | 0.00016370594 |
| 1,159 | Towards Certain Fixes with Editing Rules and Master Data | 2010 | VLDB | 0.00013592813 |
| 1,202 | VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams | 2007 | VLDB | 0.00013326298 |
| 1,285 | Neighborhood Based Fast Graph Search in Large Networks | 2011 | SIGMOD | 0.00012833377 |
| 1,396 | Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search | 2012 | SIGMOD | 0.00012204748 |
| 1,533 | Example-driven Design of Efficient Record Matching Queries | 2007 | VLDB | 0.00011471971 |
| 1,585 | Answering Table Augmentation Queries from Unstructured Lists on the Web | 2009 | VLDB | 0.00011255098 |
| 2,193 | Cost-Based Variable-Length-Gram Selection for String Collections to Support Approximate Queries Efficiently | 2008 | SIGMOD | 9.3178557e-05 |
| 2,333 | A Platform for Personal Information Management and Integration | 2005 | CIDR | 9.0169986e-05 |
| 2,376 | Bed-Tree: An All-Purpose Index Structure for String Similarity Search Based on Edit Distance | 2010 | SIGMOD | 8.9424361e-05 |
| 2,386 | Leveraging Aggregate Constraints For Deduplication | 2007 | SIGMOD | 8.9231895e-05 |
| 2,592 | Pass-Join: A Partition-based Method for Similarity Joins | 2012 | VLDB | 8.4795761e-05 |
| 2,823 | Interaction between Record Matching and Data Repairing | 2011 | SIGMOD | 8.0593894e-05 |
| 3,226 | Extending Q-Grams to Estimate Selectivity of String Matching with Low Edit Distance | 2007 | VLDB | 7.3433307e-05 |
| 3,267 | Benchmarking Declarative Approximate Selection Predicates | 2007 | SIGMOD | 7.3058429e-05 |
| 3,328 | Multi-column Substring Matching for Database Schema Translation | 2006 | VLDB | 7.2174278e-05 |
| 3,529 | Merging the Results of Approximate Match Operations | 2004 | VLDB | 7.0059524e-05 |
| 3,712 | MOMA - A Mapping-based Object Matching System | 2007 | CIDR | 6.823134e-05 |
| 4,026 | Flexible String Matching Against Large Databases in Practice | 2004 | VLDB | 6.5169976e-05 |
| 4,137 | Exploiting Content Redundancy for Web Information Extraction | 2010 | VLDB | 6.4181549e-05 |
| 4,216 | Trie-Join: Efficient Trie-based String Similarity Joins with Edit-Distance Constraints | 2010 | VLDB | 6.3521675e-05 |
| 4,438 | Selectivity Estimation for Fuzzy String Predicates in Large Data Sets | 2005 | VLDB | 6.1898903e-05 |
| 4,901 | Probabilistic String Similarity Joins | 2010 | SIGMOD | 5.8411648e-05 |
| 5,073 | Faerie: Efficient Filtering Algorithms for Approximate Dictionary-based Entity Extraction | 2011 | SIGMOD | 5.7177424e-05 |
| 5,179 | SilkMoth: An Efficient Method for Finding Related Sets with Maximum Matching Constraints | 2017 | VLDB | 5.6428428e-05 |
| 5,434 | Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples | 2021 | SIGMOD | 5.5045402e-05 |
| 5,486 | Fast Foreign-Key Detection in Microsoft SQL Server PowerPivot for Excel | 2014 | VLDB | 5.4811603e-05 |
| 5,794 | Discovering Related Data At Scale | 2021 | VLDB | 5.3245122e-05 |
| 5,796 | Finding Frequent Items in Probabilistic Data | 2008 | SIGMOD | 5.3240234e-05 |
| 5,869 | Demonstration of Panda: A Weakly Supervised Entity Matching System | 2021 | VLDB | 5.2959029e-05 |
| 5,987 | Sampling Cube: A Framework for Statistical OLAP Over Sampling Data | 2008 | SIGMOD | 5.2432535e-05 |
| 6,074 | Pigeonring: A Principle for Faster Thresholded Similarity Search | 2019 | VLDB | 5.2242306e-05 |
| 6,419 | A Deferred Cleansing Method for RFID Data Analytics | 2006 | VLDB | 5.0690363e-05 |
| 6,726 | A Pivotal Prefix Based Filtering Algorithm for String Similarity Search | 2014 | SIGMOD | 4.9484027e-05 |
| 7,588 | Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases | 2013 | VLDB | 4.7030914e-05 |
| 7,725 | Data Cleaning in Microsoft SQL Server 2005 | 2005 | SIGMOD | 4.6670883e-05 |
| 7,777 | Indexing Mixed Types for Approximate Retrieval | 2005 | VLDB | 4.653704e-05 |
| 8,007 | A Grammar-based Entity Representation Framework for Data Cleaning | 2009 | SIGMOD | 4.6068018e-05 |
| 8,099 | Sparkly: A Simple yet Surprisingly Strong TF/IDF Blocker for Entity Matching | 2023 | VLDB | 4.5859317e-05 |
| 8,137 | Customizable and Scalable Fuzzy Join for Big Data | 2019 | VLDB | 4.5774794e-05 |
| 9,274 | Ranking Distributed Probabilistic Data | 2009 | SIGMOD | 4.3646295e-05 |
| 9,567 | META: An Efficient Matching-Based Method for Error-Tolerant Autocompletion | 2016 | VLDB | 4.3254416e-05 |
| 9,832 | Balance-Aware Distributed String Similarity-Based Query Processing System | 2019 | VLDB | 4.2751057e-05 |