Database Paper Browser

Discovering Data Quality Rules

Summary: Data-driven discovery of context-dependent conditional functional dependencies (CFDs) for data quality in dirty databases. The approach finds minimal CFDs and near-misses, reports each rule with its context and the non-conforming records, and uses interest metrics with scalable pruning. (summarized by gpt-5-nano on Feb 09 2026)
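
As a rough illustration of what "a rule with its context and the non-conforming records" can mean, here is a minimal sketch that checks a single, already-known CFD against a handful of rows. It is not the paper's discovery algorithm. The function name `cfd_violations`, the sample rows, and the choice to treat the minority value in each group as non-conforming are all assumptions made for this example.

```python
from collections import defaultdict

def cfd_violations(rows, lhs, rhs, condition):
    """Return indices of rows that break the rule lhs -> rhs inside a context.

    Hypothetical helper: `condition` is a dict of attribute/value pairs that
    defines the context in which the dependency is expected to hold.
    """
    # Keep only the rows that fall inside the context.
    in_context = [
        i for i, row in enumerate(rows)
        if all(row[attr] == value for attr, value in condition.items())
    ]

    # Group row indices by their lhs values, then by the rhs value they carry.
    groups = defaultdict(lambda: defaultdict(list))
    for i in in_context:
        key = tuple(rows[i][attr] for attr in lhs)
        groups[key][rows[i][rhs]].append(i)

    violations = []
    for by_rhs in groups.values():
        if len(by_rhs) > 1:
            # Assumption: the most frequent rhs value is taken as correct and
            # every other row in the group is reported as non-conforming.
            majority = max(by_rhs.values(), key=len)
            for indices in by_rhs.values():
                if indices is not majority:
                    violations.extend(indices)
    return sorted(violations)

# Made-up sample data: within country = "UK", zip should determine city.
rows = [
    {"country": "UK", "zip": "EH4", "city": "Edinburgh"},
    {"country": "UK", "zip": "EH4", "city": "Edinburgh"},
    {"country": "UK", "zip": "EH4", "city": "London"},
    {"country": "US", "zip": "EH4", "city": "Springfield"},
    {"country": "UK", "zip": "W1", "city": "London"},
]

violating = cfd_violations(rows, ["zip"], "city", {"country": "UK"})
print(violating)
```

The US row shares the zip value but sits outside the context, so it is never compared against the UK rows. That is the difference between a conditional dependency and a plain functional dependency over the whole table.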

Paper ID: 9744
Venue: VLDB
Year: 2008
Pagerank: 0.00017465093
Overall Rank: 732 | 94.91%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 25 of 25 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
555 | Discovering Denial Constraints | 2013 | VLDB | 0.00020254908
1,159 | Towards Certain Fixes with Editing Rules and Master Data | 2010 | VLDB | 0.00013592813
1,546 | KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing | 2015 | SIGMOD | 0.00011446851
2,158 | Uni-Detect: A Unified Approach to Automated Error Detection in Tables | 2019 | SIGMOD | 9.4141354e-05
2,159 | Sequential Dependencies | 2009 | VLDB | 9.4130956e-05
2,266 | Estimating the Confidence of Conditional Functional Dependencies | 2009 | SIGMOD | 9.1540815e-05
2,483 | Discovery of Approximate (and Exact) Denial Constraints | 2020 | VLDB | 8.6864916e-05
2,823 | Interaction between Record Matching and Data Repairing | 2011 | SIGMOD | 8.0593894e-05
3,440 | Approximate Denial Constraints | 2020 | VLDB | 7.0918817e-05
3,976 | UGuide – User-Guided Discovery of FD-Detectable Errors | 2017 | SIGMOD | 6.5736462e-05
4,904 | Temporal Rules Discovery for Web Data Cleaning | 2016 | VLDB | 5.8399195e-05
5,096 | Auto-Transform: Learning-to-Transform by Patterns | 2020 | VLDB | 5.7011825e-05
5,618 | Explaining Repaired Data with CFDs | 2018 | VLDB | 5.4079415e-05
6,703 | Discovering Graph Functional Dependencies | 2018 | SIGMOD | 4.9555163e-05
7,564 | PIClean: A Probabilistic and Interactive Data Cleaning System | 2019 | SIGMOD | 4.7093702e-05
7,766 | ICARUS: Minimizing Human Effort in Iterative Data Completion | 2018 | VLDB | 4.6564959e-05
7,838 | Auto-Validate: Unsupervised Data Validation Using Data-Domain Patterns Inferred from Data Lakes | 2021 | SIGMOD | 4.6377995e-05
8,086 | Determining the Relative Accuracy of Attributes | 2013 | SIGMOD | 4.5899469e-05
9,177 | Cost-efficient Data Acquisition on Online Data Marketplaces for Correlation Analysis | 2019 | VLDB | 4.3834281e-05
9,278 | Interactive and Deterministic Data Cleaning: A Tossed Stone Raises a Thousand Ripples | 2016 | SIGMOD | 4.3639892e-05
9,301 | Repairing Data through Regular Expressions | 2016 | VLDB | 4.3587281e-05
9,649 | DAFDiscover: Robust Mining Algorithm for Dynamic Approximate Functional Dependencies on Dirty Data | 2024 | VLDB | 4.3109001e-05
10,679 | How and Why False Denial Constraints are Discovered | 2025 | VLDB | 4.1945683e-05
10,791 | FDepHunter: Harnessing Negative Examples to Expose Fakes and Reveal Ghosts | 2025 | VLDB | 4.1945683e-05
11,682 | IHCS: An Integrated Hybrid Cleaning System | 2019 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 9 of 9 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
