Database Paper Browser

ZIP: Lazy Imputation during Query Processing

Summary: ZIP makes relational operators imputation-aware at query time. A cost-based policy decides, per operator, whether to impute a missing value or defer it, while guaranteeing results equivalent to eager (pre-query) imputation. Its key techniques are outer-join execution, which preserves NULLs, and a Bloom-filter index. Reported speedups are 10–25× over ImputeDB and up to 19,607× over offline imputation. (summarized by gpt-5-mini on Feb 09 2026)
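
The defer-then-impute idea in the summary can be illustrated with a minimal sketch. This is not ZIP's algorithm: the toy table, the column-mean imputer, and the function names (`make_imputer`, `eager_query`, `lazy_query`) are all assumptions made for illustration, and the cost model, outer joins, and Bloom-filter index are left out. It only shows why deferring imputation past a selective predicate can return the same rows as eager imputation with fewer imputation calls.

```python
from statistics import mean

# Toy table; None marks a missing value. All names here are hypothetical.
ROWS = [
    {"id": 1, "city": "A", "temp": 22.0},
    {"id": 2, "city": "A", "temp": None},
    {"id": 3, "city": "B", "temp": None},
    {"id": 4, "city": "B", "temp": 30.0},
    {"id": 5, "city": "A", "temp": 14.0},
    {"id": 6, "city": "B", "temp": None},
]


def make_imputer(rows, column):
    """Stand-in imputer: fills with the mean of the observed values.

    The fill value is fixed up front from the full table, so the imputed
    value does not depend on when imputation happens. Calls are counted.
    """
    fill = mean(r[column] for r in rows if r[column] is not None)
    calls = {"n": 0}

    def impute(row):
        calls["n"] += 1
        return {**row, column: fill}

    return impute, calls


def eager_query(rows, city, min_temp):
    """Impute every missing temp first, then evaluate both predicates."""
    impute, calls = make_imputer(rows, "temp")
    cleaned = [impute(r) if r["temp"] is None else r for r in rows]
    result = [r for r in cleaned if r["city"] == city and r["temp"] > min_temp]
    return result, calls["n"]


def lazy_query(rows, city, min_temp):
    """Filter on the complete column first; impute only surviving rows."""
    impute, calls = make_imputer(rows, "temp")
    # Rows with a missing temp pass through this filter untouched.
    survivors = [r for r in rows if r["city"] == city]
    result = []
    for r in survivors:
        if r["temp"] is None:
            r = impute(r)  # imputation deferred until the value is needed
        if r["temp"] > min_temp:
            result.append(r)
    return result, calls["n"]


eager_rows, eager_calls = eager_query(ROWS, "A", 20.0)
lazy_rows, lazy_calls = lazy_query(ROWS, "A", 20.0)
```

On this toy input both plans return the rows with ids 1 and 2, but the eager plan imputes three values and the lazy plan imputes one. The equivalence holds here only because the stand-in imputer is deterministic and independent of evaluation order.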

Paper ID: 13501
Venue: VLDB
Year: 2024
Pagerank: 4.3690661e-05
Overall Rank: 9,240 | 35.72%
DOI: 10.14778/3617838.3617841

Incoming Citations (Sorted by Pagerank)

Showing 3 of 3 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
10,617 | Deduplicated Sampling On-Demand | 2025 | VLDB | 4.1945683e-05
10,744 | DIM-SUM: Dynamic IMputation for Smart Utility Management | 2025 | VLDB | 4.1945683e-05
11,069 | Hardware-Efficient Data Imputation through DBMS Extensibility | 2024 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 21 of 21 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858
326 | Optimal Histograms with Quality Guarantees | 1998 | VLDB | 0.00027358981
555 | Discovering Denial Constraints | 2013 | VLDB | 0.00020254908
852 | Dynamic Multidimensional Histograms | 2002 | SIGMOD | 0.00015941524
1,159 | Towards Certain Fixes with Editing Rules and Master Data | 2010 | VLDB | 0.00013592813
1,546 | KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing | 2015 | SIGMOD | 0.00011446851
2,276 | Mind the Gap: An Experimental Evaluation of Imputation of Missing Values Techniques in Time Series | 2020 | VLDB | 9.1261944e-05
2,573 | Query Optimization for Dynamic Imputation | 2017 | VLDB | 8.518235e-05
2,790 | Artemis: A System for Analyzing Missing Answers | 2009 | VLDB | 8.1239026e-05
2,946 | BigDansing: A System for Big Data Cleansing | 2015 | SIGMOD | 7.8372441e-05
3,311 | Efficient and Effective Data Imputation with Influence Functions | 2022 | VLDB | 7.2406486e-05
3,488 | Optimal Column Layout for Hybrid Workloads | 2019 | VLDB | 7.0479329e-05
4,273 | Cleaning Denial Constraint Violations through Relaxation | 2020 | SIGMOD | 6.3003864e-05
4,332 | Missing Value Imputation on Multidimensional Time Series | 2021 | VLDB | 6.2805243e-05
5,028 | Adaptive Data Augmentation for Supervised Learning over Missing Data | 2021 | VLDB | 5.7506746e-05
5,253 | Enriching Data Imputation with Extensive Similarity Neighbors | 2015 | VLDB | 5.6014916e-05
5,586 | QuERy: A Framework for Integrating Entity Resolution with Query Processing | 2016 | VLDB | 5.4219548e-05
6,175 | Query-Driven Approach to Entity Resolution | 2013 | VLDB | 5.169496e-05
6,727 | ORBITS: Online Recovery of Missing Values in Multiple Time Series Streams | 2021 | VLDB | 4.9483604e-05
9,049 | JENNER: Just-in-time Enrichment in Query Processing | 2022 | VLDB | 4.4039656e-05
11,536 | LOCATER: Cleaning WiFi Connectivity Datasets for Semantic Localization | 2021 | VLDB | 4.1945683e-05

Semantically Similar Papers