Deep Learning for Blocking in Entity Matching: A Design Space Exploration
Summary: DeepBlocker defines a design space of deep learning (DL) solutions for blocking in entity matching and presents eight solutions that require no labeled data and draw on transformers and self-supervision. The best of these outperform prior DL and non-DL blockers on dirty and textual data, match them on structured data, and improve further when combined with non-DL solutions.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 12421
- Venue: VLDB
- Year: 2021
- Pagerank: 6.8891671e-05
- Overall Rank: 3,640 | 74.68%
- DOI: 10.14778/3476249.3476294
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 21 of 21 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,942 | Ember: No-Code Context Enrichment via Similarity-Based Keyless Joins | 2022 | VLDB | 6.6114622e-05 |
| 4,212 | Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration | 2023 | SIGMOD | 6.3555142e-05 |
| 6,569 | Domain Adaptation for Deep Entity Resolution | 2022 | SIGMOD | 5.0065379e-05 |
| 6,711 | Analyzing How BERT Performs Entity Matching | 2022 | VLDB | 4.9517546e-05 |
| 7,052 | Pre-trained Embeddings for Entity Resolution: An Experimental Analysis | 2023 | VLDB | 4.8497453e-05 |
| 8,008 | Entity Resolution On-Demand | 2022 | VLDB | 4.6067684e-05 |
| 8,099 | Sparkly: A Simple yet Surprisingly Strong TF/IDF Blocker for Entity Matching | 2023 | VLDB | 4.5859317e-05 |
| 8,911 | PromptEM: Prompt-tuning for Low-resource Generalized Entity Matching | 2023 | VLDB | 4.427232e-05 |
| 8,958 | FlexER: Flexible Entity Resolution for Multiple Intents | 2023 | SIGMOD | 4.4210635e-05 |
| 9,077 | VerifAI: Verified Generative AI | 2024 | CIDR | 4.4010762e-05 |
| 9,355 | Discovering Top-k Rules using Subjective and Objective Criteria | 2023 | SIGMOD | 4.3514328e-05 |
| 9,846 | HyperBlocker: Accelerating Rule-based Blocking in Entity Resolution using GPUs | 2025 | VLDB | 4.2721228e-05 |
| 9,855 | Progressive Entity Matching: A Design Space Exploration | 2025 | SIGMOD | 4.269353e-05 |
| 10,022 | In-context Clustering-based Entity Resolution with Large Language Models: A Design Space Exploration | 2026 | SIGMOD | 4.1945683e-05 |
| 10,040 | 3dSAGER: Geospatial Entity Resolution over 3D Objects | 2026 | SIGMOD | 4.1945683e-05 |
| 10,499 | Privacy and Accuracy-Aware AI/ML Model Deduplication | 2025 | SIGMOD | 4.1945683e-05 |
| 10,617 | Deduplicated Sampling On-Demand | 2025 | VLDB | 4.1945683e-05 |
| 10,624 | Evaluating Methods for Efficient Entity Count Estimation | 2025 | VLDB | 4.1945683e-05 |
| 11,006 | FusionQuery: On-demand Fusion Queries over Multi-source Heterogeneous Data | 2024 | VLDB | 4.1945683e-05 |
| 11,047 | Blocker and Matcher Can Mutually Benefit: A Co-Learning Framework for Low-Resource Entity Resolution | 2024 | VLDB | 4.1945683e-05 |
| 11,223 | Splitting Tuples of Mismatched Entities | 2023 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 14 of 14 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers