Provenance-based Data Skipping
Summary: Proposes provenance-based data skipping (PBDS), which builds compact provenance sketches that record which data is relevant to a query, for example a query with a HAVING clause or a top-k query. These sketches speed up subsequent queries and can exploit physical design artifacts such as indexes and zone maps. (summarized by gpt-5-nano on Feb 09 2026)
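As a rough illustration of the idea in the summary, the following Python sketch captures which fragments of a range partition contain the rows that a top-k query actually needs, then reuses that set to skip the other fragments on a later run. It is a minimal sketch under stated assumptions, not the paper's implementation: the table `ROWS`, the partition boundaries `BOUNDS`, and the helper names `fragment_of`, `capture_sketch`, and `top_k_with_sketch` are all hypothetical, and representing a sketch as a set of range-partition fragment ids is an assumption not stated in the summary.

```python
from bisect import bisect_right

# Hypothetical table of (city, sales) rows; not taken from the paper.
ROWS = [("A", 5), ("B", 40), ("C", 12), ("D", 95),
        ("E", 33), ("F", 88), ("G", 7), ("H", 61)]

# Hypothetical range partition on sales:
# fragment i covers the half-open interval [BOUNDS[i], BOUNDS[i + 1]).
BOUNDS = [0, 25, 50, 75, 100]


def fragment_of(value):
    """Return the id of the range fragment that contains `value`."""
    return bisect_right(BOUNDS, value) - 1


def top_k(rows, k):
    """Plain top-k by sales, scanning every row."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:k]


def capture_sketch(rows, k):
    """Run the query once and record the fragments holding relevant rows.

    For a top-k query the relevant rows are the result rows themselves,
    so the sketch is the set of fragment ids they fall into.
    """
    return {fragment_of(sales) for _, sales in top_k(rows, k)}


def top_k_with_sketch(rows, k, sketch):
    """Re-run the query, skipping rows outside the sketched fragments.

    A real system would turn the sketch into a range predicate so that
    indexes or zone maps can skip data; here the filter is a plain scan.
    """
    relevant = [r for r in rows if fragment_of(r[1]) in sketch]
    return top_k(relevant, k)


sketch = capture_sketch(ROWS, 2)
result = top_k_with_sketch(ROWS, 2, sketch)
```

With this toy data, both top-2 rows fall into the same fragment, so the second run only needs that one fragment and still returns the same answer as the full scan.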
Authors
1. Xing Niu
2. Boris Glavic
3. Ziyu Liu
4. Pengyuan Li
5. Dieter Gawlick
6. Vasudha Krishnaswamy
7. Zhen Hua Liu
8. Danica Porobic
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,415 | Pruning in Snowflake: Working Smarter, Not Harder | 2025 | SIGMOD | 4.5197687e-05 |
| 10,886 | FaDE: More Than a Million What-ifs Per Second | 2025 | VLDB | 4.1945683e-05 |
| 10,895 | Towards an Objective Metric for Data Value Through Relevance | 2024 | CIDR | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 37 of 37 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,029 | Provenance for SQL through Abstract Interpretation: Value-less, but Worthwhile | 2015 | VLDB | 4.4040532e-05 |
| 10,546 | Evaluating Continuous Queries with Inconsistency Annotations | 2025 | VLDB | 4.1945683e-05 |
| 11,993 | A Partitioning Framework for Aggressive Data Skipping | 2014 | VLDB | 4.1945683e-05 |
| 6,809 | Adaptive Data Skipping in Main-Memory Systems | 2016 | SIGMOD | 4.9206606e-05 |
| 8,729 | OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance From Database Query Event Logs | 2023 | VLDB | 4.4582221e-05 |
| 8,394 | Hypothetical Reasoning via Provenance Abstraction | 2019 | SIGMOD | 4.527807e-05 |
| 6,186 | On Provenance Minimization | 2011 | PODS | 5.166082e-05 |
| 6,466 | Pando: Enhanced Data Skipping with Logical Data Partitioning | 2023 | VLDB | 5.0528281e-05 |
| 2,173 | Querying Data Provenance | 2010 | SIGMOD | 9.3676609e-05 |
| 3,922 | Pushing Data-Induced Predicates Through Joins in Big-Data Clusters | 2020 | VLDB | 6.6291079e-05 |