A Comprehensive Benchmark Framework for Active Learning Methods in Entity Matching
Summary: Unifies active learning for Entity Matching into a benchmark framework to compose learning and selection strategies. On public EM data, active learning with fewer labels can match or beat supervised results; optimizations boost F1 ~9% and cut latency up to 10x.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 5810
- Venue: SIGMOD
- Year: 2020
- Pagerank: 8.1513883e-05
- Overall Rank: 2,767 | 80.76%
- DOI: 10.1145/3318464.3380597
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 15 of 15 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 221 | Deep Entity Matching with Pre-Trained Language Models | 2021 | VLDB | 0.00033121824 |
| 4,837 | Entity Resolution with Hierarchical Graph Attention Networks | 2022 | SIGMOD | 5.8892326e-05 |
| 5,282 | Deep Indexed Active Learning for Matching Heterogeneous Entity Representations | 2022 | VLDB | 5.5864206e-05 |
| 5,978 | Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond | 2021 | SIGMOD | 5.2453012e-05 |
| 6,553 | How do Categorical Duplicates Affect ML? A New Benchmark and Empirical Analyses | 2024 | VLDB | 5.0157344e-05 |
| 8,958 | FlexER: Flexible Entity Resolution for Multiple Intents | 2023 | SIGMOD | 4.4210635e-05 |
| 9,460 | The Battleship Approach to the Low Resource Entity Matching Problem | 2023 | SIGMOD | 4.3366491e-05 |
| 9,777 | Data Augmentation for ML-driven Data Preparation and Integration | 2021 | VLDB | 4.2856106e-05 |
| 9,855 | Progressive Entity Matching: A Design Space Exploration | 2025 | SIGMOD | 4.269353e-05 |
| 10,022 | In-context Clustering-based Entity Resolution with Large Language Models: A Design Space Exploration | 2026 | SIGMOD | 4.1945683e-05 |
| 10,617 | Deduplicated Sampling On-Demand | 2025 | VLDB | 4.1945683e-05 |
| 11,223 | Splitting Tuples of Mismatched Entities | 2023 | SIGMOD | 4.1945683e-05 |
| 11,230 | VersaMatch: Ontology Matching with Weak Supervision | 2023 | VLDB | 4.1945683e-05 |
| 11,342 | FILA: Online Auditing of Machine Learning Model Accuracy under Finite Labelling Budget | 2022 | SIGMOD | 4.1945683e-05 |
| 11,438 | New Algorithms for Monotone Classification | 2021 | PODS | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 15 of 15 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers