Two Birds with One Stone: Efficient Deep Learning over Mislabeled Data through Subset Selection
Summary: Deem selects a training subset under label uncertainty, using per-example losses and gradients to approximate the full gradient computed on soft labels. It frames selection as an NP-hard subset selection problem with a submodular objective and solves it with a scalable approximation, reporting up to 10× speedups with no accuracy loss. (summarized by gpt-5-nano on Feb 09 2026)
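The general idea in the summary, picking a small weighted subset whose gradients stand in for the full gradient on soft labels, can be illustrated with a minimal sketch. This is not Deem's algorithm: it is a generic greedy facility-location selection, a standard approximation for monotone submodular objectives. It assumes a softmax classifier, so the gradient with respect to the logits is the predicted probabilities minus the soft label. The names `grad_proxy`, `greedy_select`, and the toy `examples` are hypothetical and chosen only for illustration.

```python
import math

def grad_proxy(probs, soft_label):
    # Gradient of softmax cross-entropy w.r.t. the logits: p - y.
    # With a soft label, y is a probability vector rather than one-hot.
    return [p - y for p, y in zip(probs, soft_label)]

def greedy_select(grads, k):
    """Greedily maximise the facility-location objective
    F(S) = sum_i max_{j in S} sim(i, j), a monotone submodular function.
    Returns the chosen indices and, for each, how many examples it represents."""
    n = len(grads)
    if n == 0 or k <= 0:
        return [], []
    dist = [[math.dist(a, b) for b in grads] for a in grads]
    max_dist = max(max(row) for row in dist)
    # Non-negative similarity: closer gradient proxies are more similar.
    sim = [[max_dist - d for d in row] for row in dist]
    best = [0.0] * n  # how well each example is covered by the current subset
    selected = []
    for _ in range(min(k, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding j to the current subset.
            gain = sum(max(sim[i][j] - best[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best = [max(best[i], sim[i][best_j]) for i in range(n)]
    # Weight each selected example by the number of examples closest to it.
    weights = [0] * len(selected)
    for i in range(n):
        owner = max(range(len(selected)), key=lambda s: sim[i][selected[s]])
        weights[owner] += 1
    return selected, weights

# Hypothetical toy data: (predicted probabilities, soft label) per example.
examples = [
    ([0.9, 0.1], [0.0, 1.0]),
    ([0.8, 0.2], [0.0, 1.0]),
    ([0.85, 0.15], [0.0, 1.0]),
    ([0.2, 0.8], [0.9, 0.1]),
    ([0.15, 0.85], [0.9, 0.1]),
]
grads = [grad_proxy(p, y) for p, y in examples]
selected, weights = greedy_select(grads, k=2)
```

On this toy data the two selected examples come from the two visibly different groups, and the weights record how many examples each one represents, so a weighted gradient over the subset approximates the sum over all five. The pairwise similarity matrix makes this sketch quadratic in the number of examples; it is meant to show the objective, not to be scalable.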
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Yuhao Deng
2. Chengliang Chai
3. Kaisen Jin
4. Linan Zheng
5. Lei Cao
6. Ye Yuan
7. Guoren Wang
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 71 | How Good Are Query Optimizers, Really? | 2016 | VLDB | 5.9038975e-04 |
| 791 | ActiveClean: Interactive Data Cleaning For Statistical Modeling | 2016 | VLDB | 1.6629664e-04 |
| 2,302 | Nearest Neighbor Classifiers over Incomplete Information: From Certain Answers to Certain Predictions | 2021 | VLDB | 9.0668832e-05 |
| 4,102 | GoodCore: Data-effective and Data-efficient Machine Learning through Coreset Selection over Incomplete Data | 2023 | SIGMOD | 6.4522929e-05 |
| 7,179 | Coresets over Multiple Tables for Feature-rich and Data-efficient Machine Learning | 2023 | VLDB | 4.8078895e-05 |
| 11,000 | MisDetect: Iterative Mislabel Detection using Early Loss | 2024 | VLDB | 4.1945683e-05 |