Database Paper Browser

Efficient Provenance Storage

Summary: Proposes three techniques for storing provenance efficiently: factorization and two inheritance-based methods. They reduce the volume of provenance data while preserving queryability. Applied to MiMI, Karma, and PReServ, the techniques yield up to 20x storage reduction with efficient queries and cheap incremental updates. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 4048
Venue: SIGMOD
Year: 2008
Pagerank: 0.00010287053
Overall Rank: 1,861 | 87.06%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 18 of 18 citing papers.

Rank Citing Paper Year Venue Pagerank
1,440 Provenance for Generalized Map and Reduce Workflows 2011 CIDR 0.00011961469
2,173 Querying Data Provenance 2010 SIGMOD 9.3676609e-05
4,851 Provenance for Natural Language Queries 2017 VLDB 5.8768322e-05
5,691 Putting Things into Context: Rich Explanations for Query Answers using Join Graphs 2021 SIGMOD 5.3684557e-05
5,802 An Optimal Labeling Scheme for Workflow Provenance Using Skeleton Labels 2010 SIGMOD 5.3209459e-05
6,084 Distributed Provenance Compression 2017 SIGMOD 5.2196728e-05
6,186 On Provenance Minimization 2011 PODS 5.166082e-05
6,662 Selective Provenance for Datalog Programs Using Top-K Queries 2015 VLDB 4.9704872e-05
6,696 Approximate Summaries for Why and Why-not Provenance 2020 VLDB 4.9581958e-05
7,370 Detecting and Resolving Unsound Workflow Views for Correct Provenance Analysis 2009 SIGMOD 4.7500735e-05
7,482 Provenance-Enabled Explainable AI 2024 SIGMOD 4.7180617e-05
8,394 Hypothetical Reasoning via Provenance Abstraction 2019 SIGMOD 4.527807e-05
8,886 Provenance-based Data Skipping 2022 VLDB 4.4279829e-05
9,179 Equivalence-Invariant Algebraic Provenance for Hyperplane Update Queries 2020 SIGMOD 4.3820222e-05
9,202 Compact, Tamper-Resistant Archival of Fine-Grained Provenance 2021 VLDB 4.3742967e-05
9,622 NLProv: Natural Language Provenance 2016 VLDB 4.3163112e-05
10,419 Unified Lineage System: Tracking Data Provenance at Scale 2025 SIGMOD 4.1945683e-05
11,452 Flow Provenance in Temporal Interaction Networks 2021 SIGMOD 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 7 of 7 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank Cited Paper Year Venue Pagerank
101 ULDBs: Databases with Uncertainty and Lineage 2006 VLDB 0.0004955674
561 An Annotation Management System for Relational Databases 2004 VLDB 0.00020115419
611 Lineage Tracing for General Data Warehouse Transformations 2001 VLDB 0.00019231115
676 Archiving Scientific Data 2002 SIGMOD 0.00018281665
926 XMill: an Efficient Compressor for XML Data 2000 SIGMOD 0.00015251799
2,524 Provenance Management in Curated Databases 2006 SIGMOD 8.6017899e-05
4,768 The Virtual Data Grid: A New Model and Architecture for Data-Intensive Collaboration 2003 CIDR 5.9356093e-05

Semantically Similar Papers

Overall Rank Paper Year Venue Pagerank
6,084 Distributed Provenance Compression 2017 SIGMOD 5.2196728e-05
6,696 Approximate Summaries for Why and Why-not Provenance 2020 VLDB 4.9581958e-05
2,892 Data Provenance at Internet Scale: Architecture, Experiences, and the Road Ahead 2017 CIDR 7.9480559e-05
2,173 Querying Data Provenance 2010 SIGMOD 9.3676609e-05
8,394 Hypothetical Reasoning via Provenance Abstraction 2019 SIGMOD 4.527807e-05
9,202 Compact, Tamper-Resistant Archival of Fine-Grained Provenance 2021 VLDB 4.3742967e-05
11,471 On Optimizing the Trade-off between Privacy and Utility in Data Provenance 2021 SIGMOD 4.1945683e-05
1,765 Efficient Lineage Tracking For Scientific Workflows 2008 SIGMOD 0.00010630348
6,186 On Provenance Minimization 2011 PODS 5.166082e-05
2,524 Provenance Management in Curated Databases 2006 SIGMOD 8.6017899e-05