Database Paper Browser


Evaluation of entity resolution approaches on real-world match problems

Summary: A comparative evaluation of entity resolution approaches on real-world match problems, covering approaches with and without machine learning, plus a commercial system. The results reveal substantial quality and efficiency gaps among the approaches and show that matching product entities from online shops often defeats conventional attribute-similarity approaches. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10021
Venue: VLDB
Year: 2010
Pagerank: 0.00027781866
Overall Rank: 319 | 97.79%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 46 of 46 citing papers.

Rank Citing Paper Year Venue Pagerank
221 Deep Entity Matching with Pre-Trained Language Models 2021 VLDB 0.00033121824
263 CrowdER: Crowdsourcing Entity Resolution 2012 VLDB 0.00029862413
300 Deep Learning for Entity Matching: A Design Space Exploration 2018 SIGMOD 0.00028441466
643 Corleone: Hands-Off Crowdsourcing for Entity Matching 2014 SIGMOD 0.00018754451
754 Distributed Representations of Tuples for Entity Resolution 2018 VLDB 0.00017117211
814 Entity Resolution: Theory, Practice & Open Challenges 2012 VLDB 0.00016370594
1,831 Synthesizing Entity Matching Rules by Examples 2018 VLDB 0.00010384082
2,184 A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data 2014 SIGMOD 9.3429789e-05
2,231 Dedoop: Efficient Deduplication with Hadoop 2012 VLDB 9.2304499e-05
2,767 A Comprehensive Benchmark Framework for Active Learning Methods in Entity Matching 2020 SIGMOD 8.1513883e-05
3,140 ZeroER: Entity Resolution using Zero Labeled Examples 2020 SIGMOD 7.4841763e-05
3,528 Distributed Data Deduplication 2016 VLDB 7.0066139e-05
3,840 Revisiting Prompt Engineering via Declarative Crowdsourcing 2024 CIDR 6.7106924e-05
4,185 Arnold: Declarative Crowd-Machine Data Integration 2013 CIDR 6.3776356e-05
4,607 Data Integration and Machine Learning: A Natural Synergy 2018 SIGMOD 6.0538827e-05
4,837 Entity Resolution with Hierarchical Graph Attention Networks 2022 SIGMOD 5.8892326e-05
4,859 Integrating Data Lake Tables 2023 VLDB 5.8732433e-05
5,282 Deep Indexed Active Learning for Matching Heterogeneous Entity Representations 2022 VLDB 5.5864206e-05
5,434 Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples 2021 SIGMOD 5.5045402e-05
5,869 Demonstration of Panda: A Weakly Supervised Entity Matching System 2021 VLDB 5.2959029e-05
5,896 In Search of an Entity Resolution OASIS: Optimal Asymptotic Sequential Importance Sampling 2017 VLDB 5.2847867e-05
6,099 WOO: A Scalable and Multi-tenant Platform for Continuous Knowledge Base Synthesis 2013 VLDB 5.2104516e-05
6,690 Parallel Discrepancy Detection and Incremental Detection 2021 VLDB 4.9621556e-05
7,052 Pre-trained Embeddings for Entity Resolution: An Experimental Analysis 2023 VLDB 4.8497453e-05
7,185 Certus: An Effective Entity Resolution Approach with Graph Differential Dependencies (GDDs) 2019 VLDB 4.8066159e-05
7,243 Data Integration and Machine Learning: A Natural Synergy 2018 VLDB 4.7913666e-05
7,345 Linking Temporal Records for Profiling Entities 2015 SIGMOD 4.756212e-05
7,952 Multi-Source Uncertain Entity Resolution at Yad Vashem: Transforming Holocaust Victim Reports into People 2016 SIGMOD 4.613363e-05
8,005 Online Topic-Aware Entity Resolution Over Incomplete Data Streams 2021 SIGMOD 4.6081461e-05
8,528 Cryptographically Secure Private Record Linkage Using Locality-Sensitive Hashing 2024 VLDB 4.4937074e-05
9,409 Ground Truth Inference for Weakly Supervised Entity Matching 2023 SIGMOD 4.3441378e-05
9,434 Rock: Cleaning Data by Embedding ML in Logic Rules 2024 SIGMOD 4.3430376e-05
9,487 Making It Tractable to Catch Duplicates and Conflicts in Graphs 2023 SIGMOD 4.3341665e-05
9,855 Progressive Entity Matching: A Design Space Exploration 2025 SIGMOD 4.269353e-05
9,963 Parallel Rule Discovery from Large Datasets by Sampling 2022 SIGMOD 4.2294678e-05
10,489 Incremental Rule Discovery in Response to Parameter Updates 2025 SIGMOD 4.1945683e-05
10,624 Evaluating Methods for Efficient Entity Count Estimation 2025 VLDB 4.1945683e-05
11,223 Splitting Tuples of Mismatched Entities 2023 SIGMOD 4.1945683e-05
11,333 LACE: A Logical Approach to Collective Entity Resolution 2022 PODS 4.1945683e-05
11,342 FILA: Online Auditing of Machine Learning Model Accuracy under Finite Labelling Budget 2022 SIGMOD 4.1945683e-05
11,373 Generalized Supervised Meta-blocking 2022 VLDB 4.1945683e-05
11,388 Frost: A Platform for Benchmarking and Exploring Data Matching Results 2022 VLDB 4.1945683e-05
11,438 New Algorithms for Monotone Classification 2021 PODS 4.1945683e-05
11,930 ConfSeer: Leveraging Customer Support Knowledge Bases for Automated Misconfiguration Detection 2015 VLDB 4.1945683e-05
11,982 Matching Titles with Cross Title Web-Search Enrichment and Community Detection 2014 VLDB 4.1945683e-05
12,044 Knowledge Harvesting in the Big-Data Era 2013 SIGMOD 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 9 of 9 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

