Database Paper Browser

Question Selection for Crowd Entity Resolution

Summary: Proposes a probabilistic framework for crowdsourced entity resolution (ER) that quantifies the impact of each human question on clustering accuracy. Because computing the expected accuracy exactly is #P-hard, the paper offers efficient approximations and reports substantial accuracy gains with far fewer crowd queries on real and synthetic data. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10700
Venue: VLDB
Year: 2013
Pagerank: 0.00013096655
Overall Rank: 1,242 | 91.37%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 34 of 34 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
643 | Corleone: Hands-Off Crowdsourcing for Entity Matching | 2014 | SIGMOD | 0.00018754451
866 | Leveraging Transitive Relations for Crowdsourced Joins | 2013 | SIGMOD | 0.00015801196
1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905
1,841 | Crowdsourcing Algorithms for Entity Resolution | 2014 | VLDB | 0.00010348858
2,175 | Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services | 2017 | SIGMOD | 9.3644117e-05
2,767 | A Comprehensive Benchmark Framework for Active Learning Methods in Entity Matching | 2020 | SIGMOD | 8.1513883e-05
2,797 | Query-Oriented Data Cleaning with Oracles | 2015 | SIGMOD | 8.1108589e-05
2,937 | Truth Inference in Crowdsourcing: Is the Problem Solved? | 2017 | VLDB | 7.853108e-05
3,263 | QASCA: A Quality-Aware Task Assignment System for Crowdsourcing Applications | 2015 | SIGMOD | 7.3097573e-05
3,773 | Cleaning Crowdsourced Labels Using Oracles for Statistical Classification | 2019 | VLDB | 6.7758649e-05
4,104 | Online Entity Resolution Using an Oracle | 2016 | VLDB | 6.4493809e-05
4,126 | Waldo: An Adaptive Human Interface for Crowd Entity Resolution | 2017 | SIGMOD | 6.4314729e-05
4,619 | Crowd-Based Deduplication: An Adaptive Approach | 2015 | SIGMOD | 6.0444854e-05
4,665 | Argonaut: Macrotask Crowdsourcing for Complex Data Processing | 2015 | VLDB | 6.0125329e-05
5,362 | Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach | 2016 | SIGMOD | 5.5473503e-05
5,734 | Efficient Algorithms for Crowd-Aided Categorization | 2020 | VLDB | 5.3482904e-05
6,475 | Explain3D: Explaining Disagreements in Disjoint Datasets | 2019 | VLDB | 5.0497183e-05
6,584 | Budget Constrained Interactive Search for Multiple Targets | 2021 | VLDB | 5.0027686e-05
6,868 | Cost-Effective Data Annotation using Game-Based Crowdsourcing | 2019 | VLDB | 4.9010083e-05
7,023 | Hear the Whole Story: Towards the Diversity of Opinion in Crowdsourcing Markets | 2015 | VLDB | 4.8576599e-05
7,117 | Crowdsourced Data Management: Overview and Challenges | 2017 | SIGMOD | 4.826509e-05
7,668 | Human-in-the-loop Data Integration | 2017 | VLDB | 4.6834075e-05
8,006 | ALEX: Automatic Link Exploration in Linked Data | 2015 | SIGMOD | 4.6080343e-05
8,056 | Where To: Crowd-Aided Path Selection | 2014 | VLDB | 4.5946189e-05
8,585 | Robust Entity Resolution using Random Graphs | 2018 | SIGMOD | 4.4905755e-05
9,678 | Interactive Graph Search for Multiple Targets on DAGs | 2025 | VLDB | 4.3047774e-05
9,683 | Hierarchical Entity Resolution using an Oracle | 2022 | SIGMOD | 4.3047774e-05
9,896 | Towards Interpretable and Learnable Risk Analysis for Entity Resolution | 2020 | SIGMOD | 4.2600049e-05
10,022 | In-context Clustering-based Entity Resolution with Large Language Models: A Design Space Exploration | 2026 | SIGMOD | 4.1945683e-05
10,091 | LLM-Powered Interactive Graph Search: A Scalable and Practical Approach | 2026 | SIGMOD | 4.1945683e-05
11,731 | A Demonstration of PERC: Probabilistic Entity Resolution With Crowd Errors | 2018 | VLDB | 4.1945683e-05
11,770 | Staging User Feedback toward Rapid Conflict Resolution in Data Fusion | 2017 | SIGMOD | 4.1945683e-05
11,788 | CDB: Optimizing Queries with Crowd-Based Selections and Joins | 2017 | SIGMOD | 4.1945683e-05
11,816 | DOCS: Domain-Aware Crowdsourcing System | 2017 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 8 of 8 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
67 | The Merge/Purge Problem for Large Databases | 1995 | SIGMOD | 0.00061348205
94 | CrowdDB: Answering Queries with Crowdsourcing | 2011 | SIGMOD | 0.00051013264
263 | CrowdER: Crowdsourcing Entity Resolution | 2012 | VLDB | 0.00029862413
267 | Human-powered Sorts and Joins | 2012 | VLDB | 0.00029690405
509 | On Active Learning of Record Matching Packages | 2010 | SIGMOD | 0.00021409518
936 | Framework for Evaluating Clustering Algorithms in Duplicate Detection | 2009 | VLDB | 0.0001521549
2,123 | Demonstration of Qurk: A Query Processor for Human Operators | 2011 | SIGMOD | 9.4945521e-05
2,809 | Deco: A System for Declarative Crowdsourcing | 2012 | VLDB | 8.0869896e-05

Semantically Similar Papers