Efficient and Effective Data Imputation with Influence Functions
Summary: EDIT uses influence functions to train data-imputation models with accuracy guarantees. IIE estimates the influence of each sample, and RSS selects a small, high-impact subset that is trained with a weighted loss, delivering a speedup of roughly 4x and accuracy gains of more than 11% across ten methods. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
1. Xiaoye Miao
2. Yangyang Wu
3. Lu Chen
4. Yunjun Gao
5. Jun Wang
6. Jianwei Yin
Incoming Citations (Sorted by Pagerank)
Showing 10 of 10 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,396 | Automatic Data Repair: Are We Ready to Deploy? | 2024 | VLDB | 7.1455126e-05 |
| 4,102 | GoodCore: Data-effective and Data-efficient Machine Learning through Coreset Selection over Incomplete Data | 2023 | SIGMOD | 6.4522929e-05 |
| 6,600 | Missing Data Imputation with Uncertainty-Driven Network | 2024 | SIGMOD | 4.9972581e-05 |
| 9,240 | ZIP: Lazy Imputation during Query Processing | 2024 | VLDB | 4.3690661e-05 |
| 9,709 | Outlier Summarization via Human Interpretable Rules | 2024 | VLDB | 4.299267e-05 |
| 9,856 | In-Database Data Imputation | 2024 | SIGMOD | 4.269353e-05 |
| 10,644 | Still More Shades of Null: An Evaluation Suite for Responsible Missing Value Imputation | 2025 | VLDB | 4.1945683e-05 |
| 10,744 | DIM-SUM: Dynamic IMputation for Smart Utility Management | 2025 | VLDB | 4.1945683e-05 |
| 11,190 | Efficient and Effective Cardinality Estimation for Skyline Family | 2023 | SIGMOD | 4.1945683e-05 |
| 11,223 | Splitting Tuples of Mismatched Entities | 2023 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858 |
| 1,894 | Baran: Effective Error Correction via a Unified Context Representation and Transfer Learning | 2020 | VLDB | 0.0001018378 |
| 2,574 | Discovery of Genuine Functional Dependencies from Relational Data with Missing Values | 2018 | VLDB | 8.5173637e-05 |
| 3,818 | Embedded Functional Dependencies and Data-completeness Tailored Database Design | 2019 | VLDB | 6.7300958e-05 |
| 5,028 | Adaptive Data Augmentation for Supervised Learning over Missing Data | 2021 | VLDB | 5.7506746e-05 |
| 7,560 | Complete Approximations of Incomplete Queries | 2013 | VLDB | 4.7102455e-05 |
| 7,561 | Efficient Recovery of Missing Events | 2013 | VLDB | 4.7102455e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,675 | On LLM-Enhanced Mixed-Type Data Imputation with High-Order Message Passing | 2025 | VLDB | 4.1945683e-05 |
| 11,050 | Win-Win: On Simultaneous Clustering and Imputing over Incomplete Data | 2024 | VLDB | 4.1945683e-05 |
| 4,332 | Missing Value Imputation on Multidimensional Time Series | 2021 | VLDB | 6.2805243e-05 |
| 10,644 | Still More Shades of Null: An Evaluation Suite for Responsible Missing Value Imputation | 2025 | VLDB | 4.1945683e-05 |
| 9,479 | Data Imputation with Limited Data Redundancy Using Data Lakes | 2025 | VLDB | 4.3341665e-05 |
| 7,400 | Missing Value Imputation for Multi-attribute Sensor Data Streams via Message Propagation | 2024 | VLDB | 4.7397846e-05 |
| 6,600 | Missing Data Imputation with Uncertainty-Driven Network | 2024 | SIGMOD | 4.9972581e-05 |
| 9,856 | In-Database Data Imputation | 2024 | SIGMOD | 4.269353e-05 |
| 2,573 | Query Optimization for Dynamic Imputation | 2017 | VLDB | 8.518235e-05 |
| 10,953 | Certain and Approximately Certain Models for Statistical Learning | 2024 | SIGMOD | 4.1945683e-05 |