Database Paper Browser

Magellan: Toward Building Entity Matching Management Systems

Summary: Magellan reframes entity matching (EM) as the problem of building an end-to-end EM management system rather than only developing matching algorithms. It offers step-by-step how-to guides, Python-based tools that cover the whole EM pipeline, and an interactive scripting environment, and it was evaluated with 44 users. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 11229
Venue: VLDB
Year: 2016
Pagerank: 0.00017732426
Overall Rank: 712 | 95.05%
DOI: -

Authors

Incoming Citations (Sorted by Pagerank)

Showing 49 of 49 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
221 | Deep Entity Matching with Pre-Trained Language Models | 2021 | VLDB | 0.00033121824
300 | Deep Learning for Entity Matching: A Design Space Exploration | 2018 | SIGMOD | 0.00028441466
754 | Distributed Representations of Tuples for Entity Resolution | 2018 | VLDB | 0.00017117211
2,038 | The return of JedAI: End-to-End Entity Resolution for Structured and Semi-Structured Data | 2018 | VLDB | 9.7098952e-05
2,209 | Data Integration: After the Teenage Years | 2017 | PODS | 9.2868035e-05
2,349 | RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation | 2021 | VLDB | 8.9876423e-05
2,753 | Complaint-driven Training Data Debugging for Query 2.0 | 2020 | SIGMOD | 8.1724339e-05
2,767 | A Comprehensive Benchmark Framework for Active Learning Methods in Entity Matching | 2020 | SIGMOD | 8.1513883e-05
3,140 | ZeroER: Entity Resolution using Zero Labeled Examples | 2020 | SIGMOD | 7.4841763e-05
3,640 | Deep Learning for Blocking in Entity Matching: A Design Space Exploration | 2021 | VLDB | 6.8891671e-05
3,942 | Ember: No-Code Context Enrichment via Similarity-Based Keyless Joins | 2022 | VLDB | 6.6114622e-05
4,018 | Through the Fairness Lens: Experimental Analysis and Evaluation of Entity Matching | 2023 | VLDB | 6.5244015e-05
4,402 | Smurf: Self-Service String Matching Using Random Forests | 2019 | VLDB | 6.2195162e-05
4,464 | Magellan: Toward Building Entity Matching Management Systems over Data Science Stacks | 2016 | VLDB | 6.1606042e-05
4,837 | Entity Resolution with Hierarchical Graph Attention Networks | 2022 | SIGMOD | 5.8892326e-05
4,989 | BEER: Blocking for Effective Entity Resolution | 2021 | SIGMOD | 5.7827362e-05
5,088 | TCUDB: Accelerating Database with Tensor Processors | 2022 | SIGMOD | 5.7072189e-05
5,192 | Pattern Functional Dependencies for Data Cleaning | 2020 | VLDB | 5.6375087e-05
5,282 | Deep Indexed Active Learning for Matching Heterogeneous Entity Representations | 2022 | VLDB | 5.5864206e-05
5,434 | Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples | 2021 | SIGMOD | 5.5045402e-05
5,533 | Dual-Objective Fine-Tuning of BERT for Entity Matching | 2021 | VLDB | 5.4544359e-05
5,869 | Demonstration of Panda: A Weakly Supervised Entity Matching System | 2021 | VLDB | 5.2959029e-05
5,978 | Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond | 2021 | SIGMOD | 5.2453012e-05
6,065 | APEx: Accuracy-Aware Differentially Private Data Exploration | 2019 | SIGMOD | 5.2291685e-05
6,553 | How do Categorical Duplicates Affect ML? A New Benchmark and Empirical Analyses | 2024 | VLDB | 5.0157344e-05
6,747 | Entity Matching Meets Data Science: A Progress Report from the Magellan Project | 2019 | SIGMOD | 4.9408824e-05
7,450 | SystemER: A Human-in-the-loop System for Explainable Entity Resolution | 2019 | VLDB | 4.7265276e-05
7,575 | Human-in-the-loop Outlier Detection | 2020 | SIGMOD | 4.7068909e-05
7,668 | Human-in-the-loop Data Integration | 2017 | VLDB | 4.6834075e-05
8,000 | Data Civilizer 2.0: A Holistic Framework for Data Preparation and Analytics | 2019 | VLDB | 4.6092803e-05
8,099 | Sparkly: A Simple yet Surprisingly Strong TF/IDF Blocker for Entity Matching | 2023 | VLDB | 4.5859317e-05
8,436 | A Critical Re-evaluation of Neural Methods for Entity Alignment | 2022 | VLDB | 4.5138915e-05
8,958 | FlexER: Flexible Entity Resolution for Multiple Intents | 2023 | SIGMOD | 4.4210635e-05
9,221 | VisClean: Interactive Cleaning for Progressive Visualization | 2020 | VLDB | 4.3699444e-05
9,235 | ThriftLLM: On Cost-Effective Selection of Large Language Models for Classification Queries | 2025 | VLDB | 4.3690661e-05
9,409 | Ground Truth Inference for Weakly Supervised Entity Matching | 2023 | SIGMOD | 4.3441378e-05
9,434 | Rock: Cleaning Data by Embedding ML in Logic Rules | 2024 | SIGMOD | 4.3430376e-05
9,460 | The Battleship Approach to the Low Resource Entity Matching Problem | 2023 | SIGMOD | 4.3366491e-05
9,461 | BrewER: Entity Resolution On-Demand | 2023 | VLDB | 4.3366491e-05
9,832 | Balance-Aware Distributed String Similarity-Based Query Processing System | 2019 | VLDB | 4.2751057e-05
10,022 | In-context Clustering-based Entity Resolution with Large Language Models: A Design Space Exploration | 2026 | SIGMOD | 4.1945683e-05
10,040 | 3dSAGER: Geospatial Entity Resolution over 3D Objects | 2026 | SIGMOD | 4.1945683e-05
10,617 | Deduplicated Sampling On-Demand | 2025 | VLDB | 4.1945683e-05
11,117 | FairEM360: A Suite for Responsible Entity Matching | 2024 | VLDB | 4.1945683e-05
11,183 | Matching Roles from Temporal Data | 2023 | SIGMOD | 4.1945683e-05
11,333 | LACE: A Logical Approach to Collective Entity Resolution | 2022 | PODS | 4.1945683e-05
11,388 | Frost: A Platform for Benchmarking and Exploring Data Matching Results | 2022 | VLDB | 4.1945683e-05
11,438 | New Algorithms for Monotone Classification | 2021 | PODS | 4.1945683e-05
11,739 | CloudMatcher: A Hands-Off Cloud/Crowd Service for Entity Matching | 2018 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 10 of 10 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers