Database Paper Browser


CrowdER: Crowdsourcing Entity Resolution

Summary: CrowdER uses a coarse machine-based pass to prune unlikely candidate pairs, so that human workers only verify the remaining, more likely matches. Minimizing the number of verification batches is NP-hard, so the paper proposes a two-tiered batching heuristic; crowdsourced experiments on real datasets show the approach to be efficient and accurate. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10384
Venue: VLDB
Year: 2012
Pagerank: 0.00029862413
Overall Rank: 263 | 98.18%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 50 of 71 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
221 | Deep Entity Matching with Pre-Trained Language Models | 2021 | VLDB | 0.00033121824
300 | Deep Learning for Entity Matching: A Design Space Exploration | 2018 | SIGMOD | 0.00028441466
517 | Can Foundation Models Wrangle Your Data? | 2023 | VLDB | 0.00021169035
643 | Corleone: Hands-Off Crowdsourcing for Entity Matching | 2014 | SIGMOD | 0.00018754451
754 | Distributed Representations of Tuples for Entity Resolution | 2018 | VLDB | 0.00017117211
866 | Leveraging Transitive Relations for Crowdsourced Joins | 2013 | SIGMOD | 0.00015801196
1,242 | Question Selection for Crowd Entity Resolution | 2013 | VLDB | 0.00013096655
1,546 | KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing | 2015 | SIGMOD | 0.00011446851
1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905
1,831 | Synthesizing Entity Matching Rules by Examples | 2018 | VLDB | 0.00010384082
1,841 | Crowdsourcing Algorithms for Entity Resolution | 2014 | VLDB | 0.00010348858
2,175 | Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services | 2017 | SIGMOD | 9.3644117e-05
2,184 | A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data | 2014 | SIGMOD | 9.3429789e-05
2,722 | Progressive Approach to Relational Entity Resolution | 2014 | VLDB | 8.2338356e-05
2,767 | A Comprehensive Benchmark Framework for Active Learning Methods in Entity Matching | 2020 | SIGMOD | 8.1513883e-05
2,792 | Finish Them!: Pricing Algorithms for Human Computation | 2014 | VLDB | 8.1197186e-05
2,797 | Query-Oriented Data Cleaning with Oracles | 2015 | SIGMOD | 8.1108589e-05
2,937 | Truth Inference in Crowdsourcing: Is the Problem Solved? | 2017 | VLDB | 7.853108e-05
3,118 | Scaling Up Crowd-Sourcing to Very Large Datasets: A Case for Active Learning | 2015 | VLDB | 7.5379338e-05
3,263 | QASCA: A Quality-Aware Task Assignment System for Crowdsourcing Applications | 2015 | SIGMOD | 7.3097573e-05
3,322 | iCrowd: An Adaptive Crowdsourcing Framework | 2015 | SIGMOD | 7.2230626e-05
3,490 | Leveraging Set Relations in Exact Set Similarity Join | 2017 | VLDB | 7.0465856e-05
3,582 | NADEEF/ER: Generic and Interactive Entity Resolution | 2014 | SIGMOD | 6.9479263e-05
3,773 | Cleaning Crowdsourced Labels Using Oracles for Statistical Classification | 2019 | VLDB | 6.7758649e-05
3,840 | Revisiting Prompt Engineering via Declarative Crowdsourcing | 2024 | CIDR | 6.7106924e-05
4,050 | An Efficient Partition Based Method for Exact Set Similarity Joins | 2016 | VLDB | 6.4953612e-05
4,104 | Online Entity Resolution Using an Oracle | 2016 | VLDB | 6.4493809e-05
4,126 | Waldo: An Adaptive Human Interface for Crowd Entity Resolution | 2017 | SIGMOD | 6.4314729e-05
4,185 | Arnold: Declarative Crowd-Machine Data Integration | 2013 | CIDR | 6.3776356e-05
4,451 | CLAMShell: Speeding up Crowds for Low-latency Data Labeling | 2016 | VLDB | 6.1738675e-05
4,479 | Optimal Crowd-Powered Rating and Filtering Algorithms | 2014 | VLDB | 6.149053e-05
4,619 | Crowd-Based Deduplication: An Adaptive Approach | 2015 | SIGMOD | 6.0444854e-05
4,665 | Argonaut: Macrotask Crowdsourcing for Complex Data Processing | 2015 | VLDB | 6.0125329e-05
4,827 | An Online Cost Sensitive Decision-Making Method in Crowdsourcing Systems | 2013 | SIGMOD | 5.8938399e-05
4,837 | Entity Resolution with Hierarchical Graph Attention Networks | 2022 | SIGMOD | 5.8892326e-05
5,081 | Reducing Uncertainty of Schema Matching via Crowdsourcing | 2013 | VLDB | 5.7132042e-05
5,282 | Deep Indexed Active Learning for Matching Heterogeneous Entity Representations | 2022 | VLDB | 5.5864206e-05
5,362 | Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach | 2016 | SIGMOD | 5.5473503e-05
5,618 | Explaining Repaired Data with CFDs | 2018 | VLDB | 5.4079415e-05
5,734 | Efficient Algorithms for Crowd-Aided Categorization | 2020 | VLDB | 5.3482904e-05
6,393 | On Uncertain Graphs Modeling and Queries | 2015 | VLDB | 5.0837624e-05
6,584 | Budget Constrained Interactive Search for Multiple Targets | 2021 | VLDB | 5.0027686e-05
6,868 | Cost-Effective Data Annotation using Game-Based Crowdsourcing | 2019 | VLDB | 4.9010083e-05
7,013 | Qualitative Data Cleaning | 2016 | VLDB | 4.8619024e-05
7,023 | Hear the Whole Story: Towards the Diversity of Opinion in Crowdsourcing Markets | 2015 | VLDB | 4.8576599e-05
7,113 | Answering Planning Queries with the Crowd | 2013 | VLDB | 4.8274062e-05
7,117 | Crowdsourced Data Management: Overview and Challenges | 2017 | SIGMOD | 4.826509e-05
7,178 | Towards Globally Optimal Crowdsourcing Quality Management: The Uniform Worker Setting | 2016 | SIGMOD | 4.8085946e-05
7,185 | Certus: An Effective Entity Resolution Approach with Graph Differential Dependencies (GDDs) | 2019 | VLDB | 4.8066159e-05
7,224 | OASSIS: Query Driven Crowd Mining | 2014 | SIGMOD | 4.7959024e-05

Outgoing Citations (Sorted by Pagerank)

Showing 7 of 7 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
94 | CrowdDB: Answering Queries with Crowdsourcing | 2011 | SIGMOD | 0.00051013264
249 | Crowdsourced Databases: Query Processing with People | 2011 | CIDR | 0.00030740523
267 | Human-powered Sorts and Joins | 2012 | VLDB | 0.00029690405
319 | Evaluation of entity resolution approaches on real-world match problems | 2010 | VLDB | 0.00027781866
509 | On Active Learning of Record Matching Packages | 2010 | SIGMOD | 0.00021409518
692 | Pay-as-you-go User Feedback for Dataspace Systems | 2008 | SIGMOD | 0.00018083948
1,885 | CrowdDB: Query Processing with the VLDB Crowd | 2011 | VLDB | 0.0001021098

Semantically Similar Papers