Database Paper Browser

Query-Oriented Data Cleaning with Oracles

Summary: QOCO is a query-oriented data cleaning approach that improves query results by editing the database, with the edits guided by crowds of domain-expert oracles. The paper shows that minimizing the number of oracle interactions is NP-hard, proposes heuristics, and presents a prototype with experiments on correcting incorrect and missing tuples. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 5000
Venue: SIGMOD
Year: 2015
Pagerank: 8.1108589e-05
Overall Rank: 2,797 | 80.55%
DOI: 10.1145/2723372.2737786

Incoming Citations (Sorted by Pagerank)

Showing 13 of 13 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 28 of 28 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
31 | Provenance Semirings | 2007 | PODS | 0.0007857786
94 | CrowdDB: Answering Queries with Crowdsourcing | 2011 | SIGMOD | 0.00051013264
112 | Potter's Wheel: An Interactive Data Cleaning System | 2001 | VLDB | 0.00047045036
173 | Schema Mapping as Query Discovery | 2000 | VLDB | 0.00038627829
263 | CrowdER: Crowdsourcing Entity Resolution | 2012 | VLDB | 0.00029862413
294 | Using Schema Matching to Simplify Heterogeneous Data Translation | 1998 | VLDB | 0.00028669519
378 | Towards Estimation Error Guarantees for Distinct Values | 2000 | PODS | 0.0002497492
487 | Why Not? | 2009 | SIGMOD | 0.00022050218
652 | On the Provenance of Non-Answers to Queries over Extracted Data | 2008 | VLDB | 0.00018634477
655 | On Propagation of Deletions and Annotations Through Views | 2002 | PODS | 0.00018608845
767 | Explaining differences in multidimensional aggregates | 1999 | VLDB | 0.00016981309
809 | Curated Databases | 2008 | PODS | 0.00016430384
1,119 | The Complexity of Causality and Responsibility for Query Answers and non-Answers | 2011 | VLDB | 0.0001386199
1,125 | How to ConQueR Why-Not Questions | 2010 | SIGMOD | 0.00013845652
1,164 | CrowdScreen: Algorithms for Filtering Data with Humans | 2012 | SIGMOD | 0.00013564823
1,242 | Question Selection for Crowd Entity Resolution | 2013 | VLDB | 0.00013096655
1,699 | Sensitivity Analysis and Explanations for Robust Query Evaluation in Probabilistic Databases | 2011 | SIGMOD | 0.00010858983
2,184 | A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data | 2014 | SIGMOD | 9.3429789e-05
2,334 | Counting with the Crowd | 2013 | VLDB | 9.0161817e-05
2,562 | Explaining Missing Answers to SPJUA Queries | 2010 | VLDB | 8.5386194e-05
2,722 | Progressive Approach to Relational Entity Resolution | 2014 | VLDB | 8.2338356e-05
2,790 | Artemis: A System for Analyzing Missing Answers | 2009 | VLDB | 8.1239026e-05
3,067 | CrowdFill: Collecting Structured Data from the Crowd | 2014 | SIGMOD | 7.6180371e-05
3,100 | Crowd Mining | 2013 | SIGMOD | 7.5634778e-05
4,416 | CrowdMatcher: Crowd-Assisted Schema Matching | 2014 | SIGMOD | 6.2039225e-05
4,479 | Optimal Crowd-Powered Rating and Filtering Algorithms | 2014 | VLDB | 6.149053e-05
4,971 | Maximizing Conjunctive Views in Deletion Propagation | 2011 | PODS | 5.7938195e-05
8,875 | CerFix: A System for Cleaning Data with Certain Fixes | 2011 | VLDB | 4.430475e-05

Semantically Similar Papers

Overall Rank | Paper | Year | Venue | Pagerank
11,837 | QFix: Demonstrating Error Diagnosis in Query Histories | 2016 | SIGMOD | 4.1945683e-05
143 | Optimization of Nonrecursive Queries | 1986 | VLDB | 0.00041510555
9,278 | Interactive and Deterministic Data Cleaning: A Tossed Stone Raises a Thousand Ripples | 2016 | SIGMOD | 4.3639892e-05
5,445 | QFix: Diagnosing Errors through Query Histories | 2017 | SIGMOD | 5.5020909e-05
623 | Improving Data Quality: Consistency and Accuracy | 2007 | VLDB | 0.00018996374
6,729 | Keyword Query Cleaning | 2008 | VLDB | 4.9483065e-05
6,806 | Query Optimization over Crowdsourced Data | 2013 | VLDB | 4.9218336e-05
9,351 | On Efficient Approximate Queries over Machine Learning Models | 2023 | VLDB | 4.3524472e-05
9,043 | Query-Guided Resolution in Uncertain Databases | 2023 | SIGMOD | 4.4039656e-05
9,196 | QOCO: A Query Oriented Data Cleaning System with Oracles | 2015 | VLDB | 4.3749064e-05