ZIP: Lazy Imputation during Query Processing
Summary: ZIP makes relational operators imputation-aware at query time, using a cost-based policy to decide whether to impute a missing value or defer it, while guaranteeing results equivalent to eager (pre-query) imputation. Its key techniques are outer-join execution, which preserves NULLs, and a Bloom-filter index. The reported speedups are 10–25× over ImputeDB and up to 19,607× over offline imputation. (summarized by gpt-5-mini on Feb 09 2026)
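The idea of deferring imputation while keeping the result identical to eager imputation can be illustrated with a toy example. The sketch below is not ZIP's algorithm: it has no cost model, no outer joins, and no Bloom filter. It only shows, under simplifying assumptions, why filtering on known attributes first can avoid imputations without changing the answer. All names (`make_mean_imputer`, `eager_query`, `lazy_query`, `ROWS`) are hypothetical, and the imputer is assumed to be a deterministic column mean computed once from the base table.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, Optional[float]]

# Toy table: attribute "a" is sometimes missing (None); "b" is always known.
ROWS: List[Row] = [
    {"id": 1, "a": 10.0, "b": 1.0},
    {"id": 2, "a": None, "b": 5.0},
    {"id": 3, "a": None, "b": 0.0},
    {"id": 4, "a": 30.0, "b": 7.0},
    {"id": 5, "a": None, "b": 2.0},
]


def make_mean_imputer(rows: List[Row], attr: str) -> Tuple[Callable[[Row], float], Dict[str, int]]:
    """Return a deterministic mean imputer plus a counter of how often it ran.

    Assumption: the imputed value depends only on the base table, so the
    order in which rows are imputed cannot change the query result.
    """
    known = [r[attr] for r in rows if r[attr] is not None]
    mean = sum(known) / len(known)
    calls = {"n": 0}

    def impute(_row: Row) -> float:
        calls["n"] += 1
        return mean

    return impute, calls


def eager_query(rows, impute_a, pred_a, pred_b) -> List[Row]:
    """Impute every missing "a" first, then evaluate both predicates."""
    cleaned = [
        {**r, "a": r["a"] if r["a"] is not None else impute_a(r)} for r in rows
    ]
    return [r for r in cleaned if pred_a(r["a"]) and pred_b(r["b"])]


def lazy_query(rows, impute_a, pred_a, pred_b) -> List[Row]:
    """Filter on the known attribute "b" first; impute "a" only for survivors."""
    out = []
    for r in rows:
        if not pred_b(r["b"]):
            continue  # row is eliminated, so its missing "a" is never imputed
        a = r["a"] if r["a"] is not None else impute_a(r)
        if pred_a(a):
            out.append({**r, "a": a})
    return out


def run_demo():
    pred_a = lambda a: a >= 15.0
    pred_b = lambda b: b >= 3.0
    eager_imp, eager_calls = make_mean_imputer(ROWS, "a")
    lazy_imp, lazy_calls = make_mean_imputer(ROWS, "a")
    eager = eager_query(ROWS, eager_imp, pred_a, pred_b)
    lazy = lazy_query(ROWS, lazy_imp, pred_a, pred_b)
    return eager, lazy, eager_calls["n"], lazy_calls["n"]
```

Calling `run_demo()` returns the two result sets together with the number of imputer invocations for each strategy, so the saving from deferral can be compared directly on this toy table.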
Authors
1. Yiming Lin
2. Sharad Mehrotra
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,617 | Deduplicated Sampling On-Demand | 2025 | VLDB | 4.1945683e-05 |
| 10,744 | DIM-SUM: Dynamic IMputation for Smart Utility Management | 2025 | VLDB | 4.1945683e-05 |
| 11,069 | Hardware-Efficient Data Imputation through DBMS Extensibility | 2024 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 21 of 21 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,549 | Query Processing over Incomplete Autonomous Databases | 2007 | VLDB | 5.4428494e-05 |
| 11,050 | Win-Win: On Simultaneous Clustering and Imputing over Incomplete Data | 2024 | VLDB | 4.1945683e-05 |
| 11,254 | Asymptotically Better Query Optimization Using Indexed Algebra | 2023 | VLDB | 4.1945683e-05 |
| 3,818 | Embedded Functional Dependencies and Data-completeness Tailored Database Design | 2019 | VLDB | 6.7300958e-05 |
| 8,138 | Fast and Reliable Missing Data Contingency Analysis with Predicate-Constraints | 2020 | SIGMOD | 4.5771031e-05 |
| 8,347 | QPPT: Query Processing on Prefix Trees | 2013 | CIDR | 4.5410746e-05 |
| 5,253 | Enriching Data Imputation with Extensive Similarity Neighbors | 2015 | VLDB | 5.6014916e-05 |
| 9,479 | Data Imputation with Limited Data Redundancy Using Data Lakes | 2025 | VLDB | 4.3341665e-05 |
| 9,856 | In-Database Data Imputation | 2024 | SIGMOD | 4.269353e-05 |
| 2,573 | Query Optimization for Dynamic Imputation | 2017 | VLDB | 8.518235e-05 |