Efficiently Mitigating the Impact of Data Drift on Machine Learning Pipelines
Summary: Not all data drift degrades ML accuracy. The paper defines Data Distributions with Low Accuracy (DDLAs): subregions of the serving data where drift harms predictions. It uses decision-tree proxies to locate DDLAs for black-box models and retrains only on harmful drift, cutting costs while preserving accuracy. (summarized by gpt-5-mini on Feb 09 2026)
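The idea of using a decision tree as a proxy to locate low-accuracy subregions can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes we already know, for each serving record, whether the black-box model's prediction was correct, and it fits a small greedy tree on that correct/incorrect signal. The function names, the synthetic data, the Gini criterion, and the thresholds (`depth`, `min_size`, `acc_threshold`) are all hypothetical choices made for illustration.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels (1 = prediction was correct)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, correct, min_size):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(correct)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [c for r, c in zip(rows, correct) if r[f] <= t]
            right = [c for r, c in zip(rows, correct) if r[f] > t]
            if len(left) < min_size or len(right) < min_size:
                continue  # skip splits that leave a region too small to trust
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def low_accuracy_regions(rows, correct, depth=3, min_size=20,
                         acc_threshold=0.7, path=()):
    """Partition the data recursively; return leaves whose accuracy is below
    acc_threshold as (conditions, accuracy, size) tuples."""
    split = best_split(rows, correct, min_size) if depth > 0 else None
    if split is None:
        acc = sum(correct) / len(correct)
        return [(list(path), acc, len(rows))] if acc < acc_threshold else []
    f, t = split
    regions = []
    for op in ("<=", ">"):
        keep = [i for i, r in enumerate(rows) if (r[f] <= t) == (op == "<=")]
        regions += low_accuracy_regions(
            [rows[i] for i in keep], [correct[i] for i in keep],
            depth - 1, min_size, acc_threshold, path + ((f, op, t),))
    return regions

# Synthetic serving data (an assumption): two features in [0, 1]; the
# hypothetical black-box model is wrong exactly when feature 0 exceeds 0.8.
random.seed(0)
rows = [(random.random(), random.random()) for _ in range(400)]
correct = [0 if x0 > 0.8 else 1 for x0, _ in rows]

regions = low_accuracy_regions(rows, correct)
for conditions, acc, size in regions:
    print(conditions, acc, size)
```

In this toy setting the tree isolates a single region on feature 0 where accuracy is zero, and only the records in that region would be candidates for retraining. Real serving data is noisy, so the resulting leaves would be less clean and the accuracy threshold would need tuning.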
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Sijie Dong
2. Qitong Wang
3. Soror Sahri
4. Themis Palpanas
5. Divesh Srivastava
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 791 | ActiveClean: Interactive Data Cleaning For Statistical Modeling | 2016 | VLDB | 0.00016629664 |
| 1,099 | Interpretable and Informative Explanations of Outcomes | 2015 | VLDB | 0.00014096312 |
| 1,482 | Automating Large-Scale Data Quality Verification | 2018 | VLDB | 0.00011725533 |
| 2,302 | Nearest Neighbor Classifiers over Incomplete Information: From Certain Answers to Certain Predictions | 2021 | VLDB | 0.000090668832 |
| 2,456 | Production Machine Learning Pipelines: Empirical Analysis and Optimization Opportunities | 2021 | SIGMOD | 0.000087733773 |
| 4,110 | Learning to Validate the Predictions of Black Box Classifiers on Unseen Data | 2020 | SIGMOD | 0.000064428544 |