Ground Truth Inference for Weakly Supervised Entity Matching
Summary: Applies weak supervision to entity matching (EM) using labeling functions. A simple labeling model outperforms prior methods. For EM, the approach accounts for transitivity, using exact solutions when possible and ML approximations otherwise. DeepMatcher trained on the resulting weak labels rivals fully supervised models.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 6535
- Venue: SIGMOD
- Year: 2023
- Pagerank: 4.3441378e-05
- Overall Rank: 9,409 | 34.55%
- DOI: 10.1145/3588712
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 19 of 19 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 221 | Deep Entity Matching with Pre-Trained Language Models | 2021 | VLDB | 0.00033121824 |
| 254 | Snorkel: Rapid Training Data Creation with Weak Supervision | 2018 | VLDB | 0.00030540555 |
| 300 | Deep Learning for Entity Matching: A Design Space Exploration | 2018 | SIGMOD | 0.00028441466 |
| 319 | Evaluation of entity resolution approaches on real-world match problems | 2010 | VLDB | 0.00027781866 |
| 509 | On Active Learning of Record Matching Packages | 2010 | SIGMOD | 0.00021409518 |
| 667 | Incremental Knowledge Base Construction Using DeepDive | 2015 | VLDB | 0.00018440557 |
| 712 | Magellan: Toward Building Entity Matching Management Systems | 2016 | VLDB | 0.00017732426 |
| 754 | Distributed Representations of Tuples for Entity Resolution | 2018 | VLDB | 0.00017117211 |
| 814 | Entity Resolution: Theory, Practice & Open Challenges | 2012 | VLDB | 0.00016370594 |
| 1,215 | Snuba: Automating Weak Supervision to Label Training Data | 2019 | VLDB | 0.0001323375 |
| 2,567 | Resolving Conflicts in Heterogeneous Data by Truth Discovery and Source Reliability Estimation | 2014 | SIGMOD | 8.5239306e-05 |
| 2,937 | Truth Inference in Crowdsourcing: Is the Problem Solved? | 2017 | VLDB | 7.853108e-05 |
| 3,140 | ZeroER: Entity Resolution using Zero Labeled Examples | 2020 | SIGMOD | 7.4841763e-05 |
| 3,303 | Fonduer: Knowledge Base Construction from Richly Formatted Data | 2018 | SIGMOD | 7.2487486e-05 |
| 4,471 | GOGGLES: Automatic Image Labeling with Affinity Coding | 2020 | SIGMOD | 6.1555681e-05 |
| 4,607 | Data Integration and Machine Learning: A Natural Synergy | 2018 | SIGMOD | 6.0538827e-05 |
| 5,434 | Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples | 2021 | SIGMOD | 5.5045402e-05 |
| 5,869 | Demonstration of Panda: A Weakly Supervised Entity Matching System | 2021 | VLDB | 5.2959029e-05 |
| 7,610 | Learning to be a Statistician: Learned Estimator for Number of Distinct Values | 2022 | VLDB | 4.6965039e-05 |
Semantically Similar Papers