Descriptive and Prescriptive Data Cleaning
Summary: Introduces a system for descriptive and prescriptive data cleaning that ties quality rules defined on transformation outputs back to the source data, yielding both explanations and fixes. The system supports scalable detection, propagation, and explanation of errors, and is evaluated on TPC-H with diverse quality rules. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Anup Chalamalla
2. Ihab F. Ilyas
3. Mourad Ouzzani
4. Paolo Papotti
Incoming Citations (Sorted by Pagerank)
Showing 11 of 11 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,277 | The Data Civilizer System | 2017 | CIDR | 0.00012879695 |
| 1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905 |
| 2,280 | SMOKE: Fine-grained Lineage at Interactive Speed | 2018 | VLDB | 9.1111033e-05 |
| 4,801 | CLAMS: Bringing Quality to Data Lakes | 2016 | SIGMOD | 5.9115269e-05 |
| 5,445 | QFix: Diagnosing Errors through Query Histories | 2017 | SIGMOD | 5.5020909e-05 |
| 5,618 | Explaining Repaired Data with CFDs | 2018 | VLDB | 5.4079415e-05 |
| 7,013 | Qualitative Data Cleaning | 2016 | VLDB | 4.8619024e-05 |
| 7,780 | A Natural Language Interface for Querying General and Individual Knowledge | 2015 | VLDB | 4.6533677e-05 |
| 8,694 | Managing General and Individual Knowledge in Crowd Mining Applications | 2015 | CIDR | 4.4661379e-05 |
| 8,728 | Stale View Cleaning: Getting Fresh Answers from Stale Materialized Views | 2015 | VLDB | 4.4589711e-05 |
| 9,849 | Reptile: Aggregation-level Explanations for Hierarchical Data | 2022 | SIGMOD | 4.2721228e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 10 of 10 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 214 | Scorpion: Explaining Away Outliers in Aggregate Queries | 2013 | VLDB | 0.0003363692 |
| 1,012 | NADEEF: A Commodity Data Cleaning System | 2013 | SIGMOD | 0.0001464733 |
| 1,125 | How to ConQueR Why-Not Questions | 2010 | SIGMOD | 0.00013845652 |
| 1,197 | The LLUNATIC Data-Cleaning Framework | 2013 | VLDB | 0.00013390321 |
| 1,624 | Sampling the Repairs of Functional Dependency Violations under Hard Constraints | 2010 | VLDB | 0.00011099222 |
| 1,699 | Sensitivity Analysis and Explanations for Robust Query Evaluation in Probabilistic Databases | 2011 | SIGMOD | 0.00010858983 |
| 2,562 | Explaining Missing Answers to SPJUA Queries | 2010 | VLDB | 8.5386194e-05 |
| 2,602 | Tracing Data Errors with View-Conditioned Causality | 2011 | SIGMOD | 8.4667197e-05 |
| 6,384 | A Demonstration of DBWipes: Clean as You Query | 2012 | VLDB | 5.0880333e-05 |
| 6,385 | Propagating Functional Dependencies with Conditions | 2008 | VLDB | 5.0875028e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,482 | Automating Large-Scale Data Quality Verification | 2018 | VLDB | 0.00011725533 |
| 3,396 | Automatic Data Repair: Are We Ready to Deploy? | 2024 | VLDB | 7.1455126e-05 |
| 1,159 | Towards Certain Fixes with Editing Rules and Master Data | 2010 | VLDB | 0.00013592813 |
| 623 | Improving Data Quality: Consistency and Accuracy | 2007 | VLDB | 0.00018996374 |
| 507 | Data Quality and Data Cleaning: An Overview | 2003 | SIGMOD | 0.00021473263 |
| 1,612 | Detecting Data Errors: Where are we and what needs to be done? | 2016 | VLDB | 0.00011142794 |
| 3,192 | Towards Dependable Data Repairing with Fixing Rules | 2014 | SIGMOD | 7.4095761e-05 |
| 7,013 | Qualitative Data Cleaning | 2016 | VLDB | 4.8619024e-05 |
| 732 | Discovering Data Quality Rules | 2008 | VLDB | 0.00017465093 |
| 1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905 |