Estimating the Impact of Unknown Unknowns on Aggregate Query Results
Summary: Uses the overlap between data sources to estimate both the number and the values of unobserved data items, and so quantifies the impact of unknown unknowns on simple aggregate queries. The approach is parameter-free and distribution-agnostic, enabling uncertainty-aware assessment of results over integrated data without requiring priors. (summarized by gpt-5-nano on Feb 09 2026)
[Chart: Incoming Non-self Citations Over Time]
Authors
1. Yeounoh Chung
2. Michael Lind Mortensen
3. Carsten Binnig
4. Tim Kraska
Incoming Citations (Sorted by Pagerank)
Showing 8 of 8 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,350 | Northstar: An Interactive Data Science System | 2018 | VLDB | 0.00012431059 |
| 1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905 |
| 2,132 | Towards Sustainable Insights or why polygamy is bad for you | 2017 | CIDR | 9.4770432e-05 |
| 6,912 | CYADB: A Database that Covers Your Ask | 2018 | VLDB | 4.8925595e-05 |
| 7,634 | ReStore - Neural Data Completion for Relational Databases | 2021 | SIGMOD | 4.6911382e-05 |
| 8,047 | Thrifty Query Execution via Incrementability | 2020 | SIGMOD | 4.5983505e-05 |
| 9,056 | A Data Quality Metric (DQM): How to Estimate the Number of Undetected Errors in Data Sets | 2017 | VLDB | 4.4039656e-05 |
| 9,849 | Reptile: Aggregation-level Explanations for Hierarchical Data | 2022 | SIGMOD | 4.2721228e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 12 of 12 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 12,312 | Anonymized Data: Generation, Models, Usage | 2009 | SIGMOD | 4.1945683e-05 |
| 721 | Data Integration with Uncertainty | 2007 | VLDB | 0.00017570539 |
| 2,808 | A Robust, Optimization-Based Approach for Approximate Answering of Aggregate Queries | 2001 | SIGMOD | 8.0870741e-05 |
| 8,684 | Unbiased Estimation of Size and Other Aggregates Over Hidden Web Databases | 2010 | SIGMOD | 4.4677591e-05 |
| 12,272 | Conditioning and Aggregating Uncertain Data Streams: Going Beyond Expectations | 2010 | VLDB | 4.1945683e-05 |
| 2,573 | Query Optimization for Dynamic Imputation | 2017 | VLDB | 8.518235e-05 |
| 7,941 | Efficient Uncertainty Tracking for Complex Queries with Attribute-level Bounds | 2021 | SIGMOD | 4.613363e-05 |
| 11,161 | Querying Incomplete Numerical Data: Between Certain and Possible Answers | 2023 | PODS | 4.1945683e-05 |
| 6,079 | Querying Uncertain Data with Aggregate Constraints | 2011 | SIGMOD | 5.2223439e-05 |
| 8,138 | Fast and Reliable Missing Data Contingency Analysis with Predicate-Constraints | 2020 | SIGMOD | 4.5771031e-05 |