Outliers: The Good, the Bad and the Ugly
Summary: Distinguishes good (novel), bad (errors), and ugly (feature‑influential errors) outliers, and shows that only the ugly ones harm ML classifier accuracy. Proposes OMRs, predicate rules that combine outlier detectors and statistics, to learn and repair ugly outliers rather than delete them, boosting accuracy by 7.2% on average and by up to 34.8%. (summarized by gpt-5-mini on Feb 11 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Shenglin Chen
2. Wenfei Fan
3. Ruochun Jin
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Outgoing Citations (Sorted by Pagerank)
Showing 23 of 23 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 701 | Efficient Algorithms for Mining Outliers from Large Data Sets | 2000 | SIGMOD | 0.00017938417 |
| 9,709 | Outlier Summarization via Human Interpretable Rules | 2024 | VLDB | 4.299267e-05 |
| 3,920 | Continuous Outlier Detection in Data Streams: An Extensible Framework and State-Of-The-Art Algorithms | 2013 | SIGMOD | 6.6309693e-05 |
| 4,456 | AutoOD: Automatic Outlier Detection | 2023 | SIGMOD | 6.1704203e-05 |
| 7,575 | Human-in-the-loop Outlier Detection | 2020 | SIGMOD | 4.7068909e-05 |
| 11,467 | Fast and Exact Outlier Detection in Metric Spaces: A Proximity Graph-based Approach | 2021 | SIGMOD | 4.1945683e-05 |
| 10,003 | Clustering with Set Outliers and Applications in Relational Clustering | 2026 | PODS | 4.1945683e-05 |
| 774 | Algorithms for Mining Distance-Based Outliers in Large Datasets | 1998 | VLDB | 0.00016865771 |
| 9,924 | On Saving Outliers for Better Clustering over Noisy Data | 2021 | SIGMOD | 4.2544238e-05 |
| 9,787 | Distance-Based Outlier Detection: Consolidation and Renewed Bearing | 2010 | VLDB | 4.2823546e-05 |