Database Paper Browser

OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance From Database Query Event Logs

Summary: OneProvenance is a log-based system that efficiently extracts coarse-grained provenance from database query event logs. It recovers query execution dependencies through lightweight log analysis and novel event transformations, and it adds filtering optimizations that cut noise and capture execution dependencies. Extraction is up to ~18× faster than prior work, and the system is deployed at scale in Microsoft Purview. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 13195
Venue: VLDB
Year: 2023
Pagerank: 4.4582221e-05
Overall Rank: 8,729 | 39.28%
DOI: 10.14778/3611540.3611555

Incoming Citations (Sorted by Pagerank)

Showing 2 of 2 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
3,429 | Real-time Workload Pattern Analysis for Large-scale Cloud Databases | 2023 | VLDB | 7.1010535e-05
10,419 | Unified Lineage System: Tracking Data Provenance at Scale | 2025 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 23 of 23 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
149 | Trio: A System for Integrated Management of Data, Accuracy, and Lineage | 2005 | CIDR | 0.00041101118
518 | Data Integration for the Relational Web | 2009 | VLDB | 0.00021158934
561 | An Annotation Management System for Relational Databases | 2004 | VLDB | 0.00020115419
610 | Goods: Organizing Google's Datasets | 2016 | SIGMOD | 0.00019232674
833 | Guided Data Repair | 2011 | VLDB | 0.00016138432
939 | Data Lake Management: Challenges and Opportunities | 2019 | VLDB | 0.00015187344
1,178 | Table Union Search on Open Data | 2018 | VLDB | 0.00013468118
1,277 | The Data Civilizer System | 2017 | CIDR | 0.00012879695
1,281 | DataHub: Collaborative Data Science & Dataset Version Management at Scale | 2015 | CIDR | 0.00012854744
1,413 | VisTrails: Visualization meets Data Management | 2006 | SIGMOD | 0.00012121257
1,565 | Principles of Dataset Versioning: Exploring the Recreation/Storage Tradeoff | 2015 | VLDB | 0.00011345567
1,677 | Graceful Database Schema Evolution: the PRISM Workbench | 2008 | VLDB | 0.00010939366
1,858 | Bootstrapping Pay-As-You-Go Data Integration Systems | 2008 | SIGMOD | 0.00010301124
1,866 | Update Exchange with Mappings and Provenance | 2007 | VLDB | 0.00010272139
2,028 | Putting Lipstick on Pig: Enabling Database-style Workflow Provenance | 2012 | VLDB | 9.7433981e-05
2,141 | LSH Ensemble: Internet-Scale Domain Search | 2016 | VLDB | 9.4542625e-05
2,237 | Procedural Extensions of SQL: Understanding their usage in the wild | 2021 | VLDB | 9.2212748e-05
2,269 | Ground: A Data Context Service | 2017 | CIDR | 9.147379e-05
2,280 | SMOKE: Fine-grained Lineage at Interactive Speed | 2018 | VLDB | 9.1111033e-05
4,783 | Ibis: A Provenance Manager for Multi-Layer Systems | 2011 | CIDR | 5.9253575e-05
4,801 | CLAMS: Bringing Quality to Data Lakes | 2016 | SIGMOD | 5.9115269e-05
5,086 | Improving Reproducibility of Data Science Pipelines through Transparent Provenance Capture | 2020 | VLDB | 5.7078462e-05
7,833 | Dependency-Driven Analytics: a Compass for Uncharted Data Oceans | 2017 | CIDR | 4.6382648e-05

Semantically Similar Papers