Database Paper Browser

Qualitative Data Cleaning

Summary: Proposes a taxonomy of qualitative error detection and repair methods for data cleaning in database research. Discusses the challenges of scaling these methods to large, distributed data and contrasts human-in-the-loop repair with automated, script-based repair. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 11294
Venue: VLDB
Year: 2016
Pagerank: 4.8619024e-05
Overall Rank: 7,013 | 51.22%
DOI: -

Authors

Incoming Citations (Sorted by Pagerank)

Showing 3 of 3 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 25 of 25 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
112 | Potter's Wheel: An Interactive Data Cleaning System | 2001 | VLDB | 0.00047045036
214 | Scorpion: Explaining Away Outliers in Aggregate Queries | 2013 | VLDB | 0.0003363692
263 | CrowdER: Crowdsourcing Entity Resolution | 2012 | VLDB | 0.00029862413
265 | A Cost-Based Model and Effective Heuristic for Repairing Constraints by Value Modification | 2005 | SIGMOD | 0.00029763412
280 | Eliminating Fuzzy Duplicates in Data Warehouses | 2002 | VLDB | 0.00029113044
489 | Data Curation at Scale: The Data Tamer System | 2013 | CIDR | 0.00022030728
555 | Discovering Denial Constraints | 2013 | VLDB | 0.00020254908
623 | Improving Data Quality: Consistency and Accuracy | 2007 | VLDB | 0.00018996374
643 | Corleone: Hands-Off Crowdsourcing for Entity Matching | 2014 | SIGMOD | 0.00018754451
833 | Guided Data Repair | 2011 | VLDB | 0.00016138432
1,012 | NADEEF: A Commodity Data Cleaning System | 2013 | SIGMOD | 0.0001464733
1,159 | Towards Certain Fixes with Editing Rules and Master Data | 2010 | VLDB | 0.00013592813
1,188 | On Generating Near-Optimal Tableaux for Conditional Functional Dependencies | 2008 | VLDB | 0.00013441729
1,197 | The LLUNATIC Data-Cleaning Framework | 2013 | VLDB | 0.00013390321
1,546 | KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing | 2015 | SIGMOD | 0.00011446851
1,624 | Sampling the Repairs of Functional Dependency Violations under Hard Constraints | 2010 | VLDB | 0.00011099222
2,184 | A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data | 2014 | SIGMOD | 9.3429789e-05
2,231 | Dedoop: Efficient Deduplication with Hadoop | 2012 | VLDB | 9.2304499e-05
2,602 | Tracing Data Errors with View-Conditioned Causality | 2011 | SIGMOD | 8.4667197e-05
2,823 | Interaction between Record Matching and Data Repairing | 2011 | SIGMOD | 8.0593894e-05
2,946 | BigDansing: A System for Big Data Cleansing | 2015 | SIGMOD | 7.8372441e-05
3,192 | Towards Dependable Data Repairing with Fixing Rules | 2014 | SIGMOD | 7.4095761e-05
3,360 | Modeling and Querying Possible Repairs in Duplicate Detection | 2009 | VLDB | 7.1742067e-05
3,528 | Distributed Data Deduplication | 2016 | VLDB | 7.0066139e-05
5,660 | Descriptive and Prescriptive Data Cleaning | 2014 | SIGMOD | 5.3847321e-05

Semantically Similar Papers