OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance From Database Query Event Logs
Summary: OneProvenance is a log-based system that efficiently extracts coarse-grained provenance from database query event logs. It recovers query execution dependencies through lightweight log analysis and novel event transformations, and adds filtering optimizations to cut noise and capture execution dependencies, yielding extraction up to ~18× faster than prior work. The system is deployed at scale in Microsoft Purview. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,429 | Real-time Workload Pattern Analysis for Large-scale Cloud Databases | 2023 | VLDB | 7.1010535e-05 |
| 10,419 | Unified Lineage System: Tracking Data Provenance at Scale | 2025 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 23 of 23 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,851 | Provenance for Natural Language Queries | 2017 | VLDB | 5.8768322e-05 |
| 8,163 | Capturing and Querying Fine-grained Provenance of Preprocessing Pipelines in Data Science | 2021 | VLDB | 4.5723431e-05 |
| 1,106 | Provenance for Aggregate Queries | 2011 | PODS | 1.398766e-04 |
| 8,960 | Computing How-Provenance for SPARQL Queries via Query Rewriting | 2021 | VLDB | 4.4206222e-05 |
| 2,892 | Data Provenance at Internet Scale: Architecture, Experiences, and the Road Ahead | 2017 | CIDR | 7.9480559e-05 |
| 8,230 | You Say 'What', I Hear 'Where' and 'Why' - (Mis-)Interpreting SQL to Derive Fine-Grained Provenance | 2018 | VLDB | 4.5541444e-05 |
| 7,556 | Interactive Query Explanations Using Fine Grained Provenance | 2022 | SIGMOD | 4.7117814e-05 |
| 2,173 | Querying Data Provenance | 2010 | SIGMOD | 9.3676609e-05 |
| 6,186 | On Provenance Minimization | 2011 | PODS | 5.166082e-05 |
| 11,665 | Ursprung: Provenance for Large-Scale Analytics Environments | 2019 | SIGMOD | 4.1945683e-05 |