Database Paper Browser

A Cost-Based Model and Effective Heuristic for Repairing Constraints by Value Modification

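Summary: Introduces a cost-based model for repairing integrity-constraint violations by modifying attribute values, which allows record-linkage techniques to guide the search for low-cost repairs. Finding a minimum-cost repair is NP-complete in the size of the database, so the paper proposes two greedy heuristics based on equivalence classes of attribute values. The basic algorithms take time cubic in the database size; optimizations inspired by duplicate-record detection make them scalable with little loss in repair quality. (summarized by gpt-5-nano on Feb 09 2026)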
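
To make the idea of a value-modification cost concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's algorithm: the per-cell weights, the normalized edit distance, and the names `class_cost` and `cheapest_target` are all hypothetical. It treats an equivalence class as a set of cells that must end up with the same value and picks the candidate value with the lowest weighted modification cost.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def distance(a: str, b: str) -> float:
    """Normalized distance in [0, 1].

    Assumption: a stand-in for whatever record-linkage similarity
    measure a real system would plug in.
    """
    if a == b:
        return 0.0
    return edit_distance(a, b) / max(len(a), len(b))


def class_cost(cells, target: str) -> float:
    """Weighted cost of rewriting every cell in the class to `target`.

    `cells` is a list of (value, weight) pairs; the weight is an assumed
    confidence in the tuple that holds the value.
    """
    return sum(weight * distance(value, target) for value, weight in cells)


def cheapest_target(cells) -> str:
    """Pick the value already present in the class that is cheapest to adopt."""
    candidates = sorted({value for value, _ in cells})
    return min(candidates, key=lambda t: class_cost(cells, t))


# Three cells that a constraint forces to agree (hypothetical data).
cells = [("NYC", 1.0), ("NYC", 1.0), ("NY", 0.5)]
print(cheapest_target(cells))  # prints "NYC"
```

In this toy class, adopting "NYC" changes only the low-weight "NY" cell, while adopting "NY" would change two full-weight cells, so "NYC" is the cheaper choice.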

Paper ID: 3631
Venue: SIGMOD
Year: 2005
Pagerank: 0.00029763412
Overall Rank: 265 | 98.16%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 50 of 73 citing papers.

Rank Citing Paper Year Venue Pagerank
192 HoloClean: Holistic Data Repairs with Probabilistic Inference 2017 VLDB 0.00035728858
560 Dependencies Revisited for Improving Data Quality 2008 PODS 0.00020141923
623 Improving Data Quality: Consistency and Accuracy 2007 VLDB 0.00018996374
656 ERACER: A Database Approach for Statistical Inference and Data Cleaning 2010 SIGMOD 0.00018588729
833 Guided Data Repair 2011 VLDB 0.00016138432
1,012 NADEEF: A Commodity Data Cleaning System 2013 SIGMOD 0.0001464733
1,159 Towards Certain Fixes with Editing Rules and Master Data 2010 VLDB 0.00013592813
1,197 The LLUNATIC Data-Cleaning Framework 2013 VLDB 0.00013390321
1,350 Northstar: An Interactive Data Science System 2018 VLDB 0.00012431059
1,401 Extending Dependencies with Conditions 2007 VLDB 0.00012187775
1,546 KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing 2015 SIGMOD 0.00011446851
1,624 Sampling the Repairs of Functional Dependency Violations under Hard Constraints 2010 VLDB 0.00011099222
1,627 Data Cleaning: Overview and Emerging Challenges 2016 SIGMOD 0.00011086905
2,158 Uni-Detect: A Unified Approach to Automated Error Detection in Tables 2019 SIGMOD 9.4141354e-05
2,349 RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation 2021 VLDB 8.9876423e-05
2,386 Leveraging Aggregate Constraints For Deduplication 2007 SIGMOD 8.9231895e-05
2,460 Combining Quantitative and Logical Data Cleaning 2016 VLDB 8.7617484e-05
2,566 Database Repairs and Consistent Query Answering: Origins and Further Developments 2019 PODS 8.5243847e-05
2,638 Messing Up with BART: Error Generation for Evaluating Data-Cleaning Algorithms 2016 VLDB 8.399764e-05
2,823 Interaction between Record Matching and Data Repairing 2011 SIGMOD 8.0593894e-05
2,946 BigDansing: A System for Big Data Cleansing 2015 SIGMOD 7.8372441e-05
3,133 Time Series Data Cleaning: From Anomaly Detection to Anomaly Repairing 2017 VLDB 7.4978041e-05
3,192 Towards Dependable Data Repairing with Fixing Rules 2014 SIGMOD 7.4095761e-05
3,299 SCODED: Statistical Constraint Oriented Data Error Detection 2020 SIGMOD 7.2546659e-05
3,713 GDR: A System for Guided Data Repair 2010 SIGMOD 6.8224341e-05
3,845 On Repairing Structural Problems In Semi-structured Data 2013 VLDB 6.7073366e-05
4,668 PrivateClean: Data Cleaning and Differential Privacy 2016 SIGMOD 6.0115918e-05
4,904 Temporal Rules Discovery for Web Data Cleaning 2016 VLDB 5.8399195e-05
5,002 Sequential Data Cleaning: A Statistical Approach 2016 SIGMOD 5.7671075e-05
5,096 Auto-Transform: Learning-to-Transform by Patterns 2020 VLDB 5.7011825e-05
5,153 Horizon: Scalable Dependency-driven Data Cleaning 2021 VLDB 5.6607963e-05
5,253 Enriching Data Imputation with Extensive Similarity Neighbors 2015 VLDB 5.6014916e-05
5,506 Exploring Change – A New Dimension of Data Analytics 2019 VLDB 5.473324e-05
5,618 Explaining Repaired Data with CFDs 2018 VLDB 5.4079415e-05
5,803 Semandaq: A Data Quality System Based on Conditional Functional Dependencies 2008 VLDB 5.3205861e-05
5,852 Repairing Vertex Labels under Neighborhood Constraints 2014 VLDB 5.3007132e-05
5,978 Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond 2021 SIGMOD 5.2453012e-05
6,134 Finding Label and Model Errors in Perception Data With Learned Observation Assertions 2022 SIGMOD 5.1943414e-05
6,280 Self-supervised and Interpretable Data Cleaning with Sequence Generative Adversarial Networks 2023 VLDB 5.1290457e-05
6,350 NADEEF: A Generalized Data Cleaning System 2013 VLDB 5.101815e-05
6,451 Multivariate Time Series Cleaning under Speed Constraints 2024 SIGMOD 5.0583324e-05
6,475 Explain3D: Explaining Disagreements in Disjoint Datasets 2019 VLDB 5.0497183e-05
6,583 SCREEN: Stream Data Cleaning under Speed Constraints 2015 SIGMOD 5.0027988e-05
6,705 Consistent Query Answers in Inconsistent Probabilistic Databases 2010 SIGMOD 4.9549359e-05
6,739 Benchmarking Approximate Consistent Query Answering 2021 PODS 4.9449088e-05
6,810 Record Linkage with Uniqueness Constraints and Erroneous Values 2010 VLDB 4.9203397e-05
7,013 Qualitative Data Cleaning 2016 VLDB 4.8619024e-05
7,223 Akane: Perplexity-Guided Time Series Data Cleaning 2024 SIGMOD 4.7965857e-05
7,391 Time Series Data Validity 2023 SIGMOD 4.7429293e-05
7,561 Efficient Recovery of Missing Events 2013 VLDB 4.7102455e-05

Outgoing Citations (Sorted by Pagerank)

Showing 6 of 6 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers

Overall Rank Paper Year Venue Pagerank
1,197 The LLUNATIC Data-Cleaning Framework 2013 VLDB 0.00013390321
9,369 Constraint-Variance Tolerant Data Repairing 2016 SIGMOD 4.3481081e-05
3,360 Modeling and Querying Possible Repairs in Duplicate Detection 2009 VLDB 7.1742067e-05
10,235 Repairing Property Graphs under PG-Constraints 2026 VLDB 4.1945683e-05
3,042 Dichotomies in the Complexity of Preferred Repairs 2015 PODS 7.669374e-05
7,605 The Computation of Optimal Subset Repairs 2020 VLDB 4.697534e-05
623 Improving Data Quality: Consistency and Accuracy 2007 VLDB 0.00018996374
7,702 Counting and Enumerating (Preferred) Database Repairs 2017 PODS 4.6736471e-05
2,823 Interaction between Record Matching and Data Repairing 2011 SIGMOD 8.0593894e-05
8,840 The Cost of Representation by Subset Repairs 2025 VLDB 4.4388652e-05