Database Paper Browser

Deep Learning for Blocking in Entity Matching: A Design Space Exploration

Summary: DeepBlocker defines a design space of deep learning (DL) solutions for blocking in entity matching, with eight solutions that require no labeled data and draw on transformers and self-supervision. The top DL blocking solutions outperform prior DL and non-DL blockers on dirty and textual data, perform comparably on structured data, and improve further when combined with non-DL blockers. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 12421
Venue: VLDB
Year: 2021
Pagerank: 6.8891671e-05
Overall Rank: 3,640 | 74.68%
DOI: 10.14778/3476249.3476294

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 21 of 21 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
3,942 | Ember: No-Code Context Enrichment via Similarity-Based Keyless Joins | 2022 | VLDB | 6.6114622e-05
4,212 | Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration | 2023 | SIGMOD | 6.3555142e-05
6,569 | Domain Adaptation for Deep Entity Resolution | 2022 | SIGMOD | 5.0065379e-05
6,711 | Analyzing How BERT Performs Entity Matching | 2022 | VLDB | 4.9517546e-05
7,052 | Pre-trained Embeddings for Entity Resolution: An Experimental Analysis | 2023 | VLDB | 4.8497453e-05
8,008 | Entity Resolution On-Demand | 2022 | VLDB | 4.6067684e-05
8,099 | Sparkly: A Simple yet Surprisingly Strong TF/IDF Blocker for Entity Matching | 2023 | VLDB | 4.5859317e-05
8,911 | PromptEM: Prompt-tuning for Low-resource Generalized Entity Matching | 2023 | VLDB | 4.427232e-05
8,958 | FlexER: Flexible Entity Resolution for Multiple Intents | 2023 | SIGMOD | 4.4210635e-05
9,077 | VerifAI: Verified Generative AI | 2024 | CIDR | 4.4010762e-05
9,355 | Discovering Top-k Rules using Subjective and Objective Criteria | 2023 | SIGMOD | 4.3514328e-05
9,846 | HyperBlocker: Accelerating Rule-based Blocking in Entity Resolution using GPUs | 2025 | VLDB | 4.2721228e-05
9,855 | Progressive Entity Matching: A Design Space Exploration | 2025 | SIGMOD | 4.269353e-05
10,022 | In-context Clustering-based Entity Resolution with Large Language Models: A Design Space Exploration | 2026 | SIGMOD | 4.1945683e-05
10,040 | 3dSAGER: Geospatial Entity Resolution over 3D Objects | 2026 | SIGMOD | 4.1945683e-05
10,499 | Privacy and Accuracy-Aware AI/ML Model Deduplication | 2025 | SIGMOD | 4.1945683e-05
10,617 | Deduplicated Sampling On-Demand | 2025 | VLDB | 4.1945683e-05
10,624 | Evaluating Methods for Efficient Entity Count Estimation | 2025 | VLDB | 4.1945683e-05
11,006 | FusionQuery: On-demand Fusion Queries over Multi-source Heterogeneous Data | 2024 | VLDB | 4.1945683e-05
11,047 | Blocker and Matcher Can Mutually Benefit: A Co-Learning Framework for Low-Resource Entity Resolution | 2024 | VLDB | 4.1945683e-05
11,223 | Splitting Tuples of Mismatched Entities | 2023 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 14 of 14 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers