Database Paper Browser

Outliers: The Good, the Bad and the Ugly

Summary: Distinguishes good (novel), bad (errors), and ugly (feature-influential errors) outliers and shows that only the ugly ones harm ML classifier accuracy. Proposes OMRs, predicate rules that combine outlier detectors and statistics, to learn and repair ugly outliers rather than delete them, improving accuracy by 7.2% on average and by up to 34.8%. (summarized by gpt-5-mini on Feb 11 2026)

Paper ID: 7334
Venue: SIGMOD
Year: 2026
Pagerank: 4.1945683e-05
Overall Rank: 10,029 | 30.24%
DOI: 10.1145/3749177

Incoming Non-self Citations Over Time

No non-self incoming citations found for this paper in this database.

Incoming Citations (Sorted by Pagerank)

Showing 0 of 0 citing papers.


Outgoing Citations (Sorted by Pagerank)

Showing 23 of 23 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 49 | Consistent Query Answers in Inconsistent Databases | 1999 | PODS | 0.00067660624 |
| 161 | LOF: Identifying Density-Based Local Outliers | 2000 | SIGMOD | 0.00039846974 |
| 181 | Mining Frequent Patterns without Candidate Generation | 2000 | SIGMOD | 0.00036992674 |
| 555 | Discovering Denial Constraints | 2013 | VLDB | 0.00020254908 |
| 774 | Algorithms for Mining Distance-Based Outliers in Large Datasets | 1998 | VLDB | 0.00016865771 |
| 791 | ActiveClean: Interactive Data Cleaning For Statistical Modeling | 2016 | VLDB | 0.00016629664 |
| 894 | A Hybrid Approach to Functional Dependency Discovery | 2016 | SIGMOD | 0.00015556428 |
| 1,099 | Interpretable and Informative Explanations of Outcomes | 2015 | VLDB | 0.00014096312 |
| 1,188 | On Generating Near-Optimal Tableaux for Conditional Functional Dependencies | 2008 | VLDB | 0.00013441729 |
| 2,483 | Discovery of Approximate (and Exact) Denial Constraints | 2020 | VLDB | 8.6864916e-05 |
| 4,448 | The Interaction between Functional Dependencies and Template Dependencies | 1980 | SIGMOD | 6.1785017e-05 |
| 4,456 | AutoOD: Automatic Outlier Detection | 2023 | SIGMOD | 6.1704203e-05 |
| 5,128 | CAPE: Explaining Outliers by Counterbalancing | 2019 | VLDB | 5.6758584e-05 |
| 6,600 | Missing Data Imputation with Uncertainty-Driven Network | 2024 | SIGMOD | 4.9972581e-05 |
| 7,575 | Human-in-the-loop Outlier Detection | 2020 | SIGMOD | 4.7068909e-05 |
| 7,796 | CHEF: A Cheap and Fast Pipeline for Iteratively Cleaning Label Uncertainties | 2021 | VLDB | 4.6482625e-05 |
| 7,926 | CoCo: Interactive Exploration of Conformance Constraints for Data Understanding and Data Cleaning | 2021 | SIGMOD | 4.6144554e-05 |
| 8,503 | A Demonstration of KGLac: A Data Discovery and Enrichment Platform for Data Science | 2021 | VLDB | 4.496339e-05 |
| 9,273 | ActiveDeeper: A Model-based Active Data Enrichment System | 2020 | VLDB | 4.3649603e-05 |
| 9,434 | Rock: Cleaning Data by Embedding ML in Logic Rules | 2024 | SIGMOD | 4.3430376e-05 |
| 9,709 | Outlier Summarization via Human Interpretable Rules | 2024 | VLDB | 4.299267e-05 |
| 9,963 | Parallel Rule Discovery from Large Datasets by Sampling | 2022 | SIGMOD | 4.2294678e-05 |
| 11,209 | Enriching Recommendation Models with Logic Conditions | 2023 | SIGMOD | 4.1945683e-05 |

Semantically Similar Papers