Database Paper Browser

SCODED: Statistical Constraint Oriented Data Error Detection

Summary: SCODED uses statistical constraints (SCs) for data cleaning, in the same spirit as integrity constraints, to support data insight and downstream use. It has two parts: SC violation detection and top-k error drill-down. Experiments on synthetic data show that the SC-based approach outperforms recent methods. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 5781
Venue: SIGMOD
Year: 2020
Pagerank: 7.2546659e-05
Overall Rank: 3,299 | 77.06%
DOI: 10.1145/3318464.3380568

Incoming Citations (Sorted by Pagerank)

Showing 16 of 16 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
3,252 | Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks | 2020 | SIGMOD | 7.3178277e-05
5,222 | Enabling SQL-based Training Data Debugging for Federated Learning | 2022 | VLDB | 5.6210545e-05
6,690 | Parallel Discrepancy Detection and Incremental Detection | 2021 | VLDB | 4.9621556e-05
6,944 | DataPrism: Exposing Disconnect between Data and Systems | 2022 | SIGMOD | 4.8912787e-05
7,202 | Conformance Constraint Discovery: Measuring Trust in Data-Driven Systems | 2021 | SIGMOD | 4.8023314e-05
7,449 | OTClean: Data Cleaning for Conditional Independence Violations using Optimal Transport | 2024 | SIGMOD | 4.7269357e-05
7,667 | Fast Detection of Denial Constraint Violations | 2022 | VLDB | 4.683767e-05
7,838 | Auto-Validate: Unsupervised Data Validation Using Data-Domain Patterns Inferred from Data Lakes | 2021 | SIGMOD | 4.6377995e-05
7,926 | CoCo: Interactive Exploration of Conformance Constraints for Data Understanding and Data Cleaning | 2021 | SIGMOD | 4.6144554e-05
9,410 | Leveraging Application Data Constraints to Optimize Database-Backed Web Applications | 2023 | VLDB | 4.3441378e-05
9,434 | Rock: Cleaning Data by Embedding ML in Logic Rules | 2024 | SIGMOD | 4.3430376e-05
9,560 | MTSClean: Efficient Constraint-based Cleaning for Multi-Dimensional Time Series Data | 2024 | VLDB | 4.3254416e-05
10,019 | Guardrail: Automated Integrity Constraint Synthesis From Noisy Data | 2026 | SIGMOD | 4.1945683e-05
10,026 | Minimum Change ≠ Best Cleaning: Parallel and Incremental Error Detection under Integrity Constraints | 2026 | SIGMOD | 4.1945683e-05
10,213 | Stress-Testing Causal Claims via Cardinality Repairs | 2026 | SIGMOD | 4.1945683e-05
10,598 | Auto-Prep: Holistic Prediction of Data Preparation Steps for Self-Service Business Intelligence | 2025 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 22 of 22 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858
214 | Scorpion: Explaining Away Outliers in Aggregate Queries | 2013 | VLDB | 0.0003363692
224 | CORDS: Automatic Discovery of Correlations and Soft Functional Dependencies | 2004 | SIGMOD | 0.00032746205
265 | A Cost-Based Model and Effective Heuristic for Repairing Constraints by Value Modification | 2005 | SIGMOD | 0.00029763412
555 | Discovering Denial Constraints | 2013 | VLDB | 0.00020254908
656 | ERACER: A Database Approach for Statistical Inference and Data Cleaning | 2010 | SIGMOD | 0.00018588729
881 | Don’t be SCAREd: Use SCalable Automatic REpairing with Maximal Likelihood and Bounded Changes | 2013 | SIGMOD | 0.00015661103
942 | A Formal Approach to Finding Explanations for Database Queries | 2014 | SIGMOD | 0.00015155714
1,041 | Interventional Fairness: Causal Database Repair for Algorithmic Fairness | 2019 | SIGMOD | 0.00014482047
1,337 | HoloDetect: Few-Shot Learning for Error Detection | 2019 | SIGMOD | 0.00012497164
1,546 | KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing | 2015 | SIGMOD | 0.00011446851
1,612 | Detecting Data Errors: Where are we and what needs to be done? | 2016 | VLDB | 0.00011142794
1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905
2,158 | Uni-Detect: A Unified Approach to Automated Error Detection in Tables | 2019 | SIGMOD | 9.4141354e-05
2,506 | Auto-Detect: Data-Driven Error Detection in Tables | 2018 | SIGMOD | 8.6335464e-05
2,797 | Query-Oriented Data Cleaning with Oracles | 2015 | SIGMOD | 8.1108589e-05
2,810 | Bias in OLAP Queries: Detection, Explanation, and Removal (Or Think Twice About Your AVG-Query) | 2018 | SIGMOD | 8.0810163e-05
2,968 | Raha: A Configuration-Free Error Detection System | 2019 | SIGMOD | 7.7985097e-05
3,105 | Data X-Ray: A Diagnostic Tool for Data Errors | 2015 | SIGMOD | 7.5568954e-05
5,445 | QFix: Diagnosing Errors through Query Histories | 2017 | SIGMOD | 5.5020909e-05
5,929 | ActiveClean: An Interactive Data Cleaning Framework For Modern Machine Learning | 2016 | SIGMOD | 5.2682177e-05
7,262 | HypDB: A Demonstration of Detecting, Explaining and Resolving Bias in OLAP queries | 2018 | VLDB | 4.78584e-05

Semantically Similar Papers