Database Paper Browser

Cleaning Crowdsourced Labels Using Oracles for Statistical Classification

Summary: The paper addresses oracle-based label cleaning for crowdsourced classification data. TARS estimates test performance from noisy labels with confidence bounds and selects which labels to clean to boost training accuracy under a budget, outperforming existing strategies. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 11972
Venue: VLDB
Year: 2019
Pagerank: 6.7758649e-05
Overall Rank: 3,773 | 73.76%
DOI: 10.14778/3297753.3297758

Authors

Incoming Citations (Sorted by Pagerank)

Showing 11 of 11 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 22 of 22 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
263 | CrowdER: Crowdsourcing Entity Resolution | 2012 | VLDB | 0.00029862413
791 | ActiveClean: Interactive Data Cleaning For Statistical Modeling | 2016 | VLDB | 0.00016629664
833 | Guided Data Repair | 2011 | VLDB | 0.00016138432
1,242 | Question Selection for Crowd Entity Resolution | 2013 | VLDB | 0.00013096655
1,491 | CDAS: A Crowdsourcing Data Analytics System | 2012 | VLDB | 0.00011694982
1,546 | KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing | 2015 | SIGMOD | 0.00011446851
1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905
2,175 | Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services | 2017 | SIGMOD | 9.3644117e-05
2,184 | A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data | 2014 | SIGMOD | 9.3429789e-05
2,452 | Data Fusion – Resolving Data Conflicts for Integration | 2009 | VLDB | 8.7839322e-05
2,797 | Query-Oriented Data Cleaning with Oracles | 2015 | SIGMOD | 8.1108589e-05
2,937 | Truth Inference in Crowdsourcing: Is the Problem Solved? | 2017 | VLDB | 7.853108e-05
3,067 | CrowdFill: Collecting Structured Data from the Crowd | 2014 | SIGMOD | 7.6180371e-05
3,118 | Scaling Up Crowd-Sourcing to Very Large Datasets: A Case for Active Learning | 2015 | VLDB | 7.5379338e-05
3,263 | QASCA: A Quality-Aware Task Assignment System for Crowdsourcing Applications | 2015 | SIGMOD | 7.3097573e-05
3,322 | iCrowd: An Adaptive Crowdsourcing Framework | 2015 | SIGMOD | 7.2230626e-05
3,897 | SLiMFast: Guaranteed Results for Data Fusion and Source Reliability | 2017 | SIGMOD | 6.6554845e-05
4,104 | Online Entity Resolution Using an Oracle | 2016 | VLDB | 6.4493809e-05
4,451 | CLAMShell: Speeding up Crowds for Low-latency Data Labeling | 2016 | VLDB | 6.1738675e-05
4,827 | An Online Cost Sensitive Decision-Making Method in Crowdsourcing Systems | 2013 | SIGMOD | 5.8938399e-05
5,405 | Truth Discovery and Crowdsourcing Aggregation: A Unified Perspective | 2015 | VLDB | 5.5257718e-05
8,362 | Minimizing Efforts in Validating Crowd Answers | 2015 | SIGMOD | 4.5366717e-05

Semantically Similar Papers