| 221 | Deep Entity Matching with Pre-Trained Language Models | 2021 | VLDB | 0.00033121824 |
| 300 | Deep Learning for Entity Matching: A Design Space Exploration | 2018 | SIGMOD | 0.00028441466 |
| 517 | Can Foundation Models Wrangle Your Data? | 2023 | VLDB | 0.00021169035 |
| 712 | Magellan: Toward Building Entity Matching Management Systems | 2016 | VLDB | 0.00017732426 |
| 754 | Distributed Representations of Tuples for Entity Resolution | 2018 | VLDB | 0.00017117211 |
| 791 | ActiveClean: Interactive Data Cleaning For Statistical Modeling | 2016 | VLDB | 0.00016629664 |
| 1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905 |
| 1,716 | Chimera: Large-Scale Classification using Machine Learning, Rules, and Crowdsourcing | 2014 | VLDB | 0.00010795718 |
| 1,831 | Synthesizing Entity Matching Rules by Examples | 2018 | VLDB | 0.00010384082 |
| 1,914 | Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks | 2020 | SIGMOD | 0.00010109102 |
| 2,175 | Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services | 2017 | SIGMOD | 9.3644117e-05 |
| 2,767 | A Comprehensive Benchmark Framework for Active Learning Methods in Entity Matching | 2020 | SIGMOD | 8.1513883e-05 |
| 3,640 | Deep Learning for Blocking in Entity Matching: A Design Space Exploration | 2021 | VLDB | 6.8891671e-05 |
| 4,104 | Online Entity Resolution Using an Oracle | 2016 | VLDB | 6.4493809e-05 |
| 4,126 | Waldo: An Adaptive Human Interface for Crowd Entity Resolution | 2017 | SIGMOD | 6.4314729e-05 |
| 4,278 | Similarity Query Processing for High-Dimensional Data | 2020 | VLDB | 6.2953764e-05 |
| 4,402 | Smurf: Self-Service String Matching Using Random Forests | 2019 | VLDB | 6.2195162e-05 |
| 4,451 | CLAMShell: Speeding up Crowds for Low-latency Data Labeling | 2016 | VLDB | 6.1738675e-05 |
| 4,607 | Data Integration and Machine Learning: A Natural Synergy | 2018 | SIGMOD | 6.0538827e-05 |
| 4,619 | Crowd-Based Deduplication: An Adaptive Approach | 2015 | SIGMOD | 6.0444854e-05 |
| 4,665 | Argonaut: Macrotask Crowdsourcing for Complex Data Processing | 2015 | VLDB | 6.0125329e-05 |
| 4,837 | Entity Resolution with Hierarchical Graph Attention Networks | 2022 | SIGMOD | 5.8892326e-05 |
| 4,989 | BEER: Blocking for Effective Entity Resolution | 2021 | SIGMOD | 5.7827362e-05 |
| 5,362 | Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach | 2016 | SIGMOD | 5.5473503e-05 |
| 5,445 | QFix: Diagnosing Errors through Query Histories | 2017 | SIGMOD | 5.5020909e-05 |
| 5,622 | Monotonic Cardinality Estimation of Similarity Selection: A Deep Learning Approach | 2020 | SIGMOD | 5.4060403e-05 |
| 5,929 | ActiveClean: An Interactive Data Cleaning Framework For Modern Machine Learning | 2016 | SIGMOD | 5.2682177e-05 |
| 6,111 | Why Big Data Industrial Systems Need Rules and What We Can Do About It | 2015 | SIGMOD | 5.2049579e-05 |
| 6,868 | Cost-Effective Data Annotation using Game-Based Crowdsourcing | 2019 | VLDB | 4.9010083e-05 |
| 6,955 | Inspector Gadget: A Data Programming-based Labeling System for Industrial Images | 2021 | VLDB | 4.8864297e-05 |
| 7,013 | Qualitative Data Cleaning | 2016 | VLDB | 4.8619024e-05 |
| 7,117 | Crowdsourced Data Management: Overview and Challenges | 2017 | SIGMOD | 4.826509e-05 |
| 7,185 | Certus: An Effective Entity Resolution Approach with Graph Differential Dependencies (GDDs) | 2019 | VLDB | 4.8066159e-05 |
| 7,243 | Data Integration and Machine Learning: A Natural Synergy | 2018 | VLDB | 4.7913666e-05 |
| 7,575 | Human-in-the-loop Outlier Detection | 2020 | SIGMOD | 4.7068909e-05 |
| 7,668 | Human-in-the-loop Data Integration | 2017 | VLDB | 4.6834075e-05 |
| 7,780 | A Natural Language Interface for Querying General and Individual Knowledge | 2015 | VLDB | 4.6533677e-05 |
| 8,099 | Sparkly: A Simple yet Surprisingly Strong TF/IDF Blocker for Entity Matching | 2023 | VLDB | 4.5859317e-05 |
| 8,362 | Minimizing Efforts in Validating Crowd Answers | 2015 | SIGMOD | 4.5366717e-05 |
| 8,585 | Robust Entity Resolution using Random Graphs | 2018 | SIGMOD | 4.4905755e-05 |
| 8,593 | Wisteria: Nurturing Scalable Data Cleaning Infrastructure | 2015 | VLDB | 4.4891474e-05 |
| 8,694 | Managing General and Individual Knowledge in Crowd Mining Applications | 2015 | CIDR | 4.4661379e-05 |
| 8,908 | Deep Active Alignment of Knowledge Graph Entities and Schemata | 2023 | SIGMOD | 4.427232e-05 |
| 8,911 | PromptEM: Prompt-tuning for Low-resource Generalized Entity Matching | 2023 | VLDB | 4.427232e-05 |
| 9,056 | A Data Quality Metric (DQM): How to Estimate the Number of Undetected Errors in Data Sets | 2017 | VLDB | 4.4039656e-05 |
| 9,683 | Hierarchical Entity Resolution using an Oracle | 2022 | SIGMOD | 4.3047774e-05 |
| 9,684 | How to Design Robust Algorithms using Noisy Comparison Oracle | 2021 | VLDB | 4.3047774e-05 |
| 9,771 | EasyDR: A Human-in-the-loop Error Detection and Repair Platform for Holistic Table Cleaning | 2022 | VLDB | 4.2856106e-05 |
| 9,896 | Towards Interpretable and Learnable Risk Analysis for Entity Resolution | 2020 | SIGMOD | 4.2600049e-05 |
| 10,022 | In-context Clustering-based Entity Resolution with Large Language Models: A Design Space Exploration | 2026 | SIGMOD | 4.1945683e-05 |