Provenance and Scientific Workflows: Challenges and Opportunities
Summary: Provenance in scientific workflows spans data products and workflow specifications to enable reproducibility, sharing, and reuse. The paper surveys provenance support in current systems, highlights emerging applications, and sketches open problems and directions for database research.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 4097
- Venue: SIGMOD
- Year: 2008
- Pagerank: 0.0001527609
- Overall Rank: 923 | 93.59%
- DOI: (none listed)
[Chart: Incoming Non-self Citations Over Time]
Incoming Citations (Sorted by Pagerank)
Showing 29 of 29 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,440 | Provenance for Generalized Map and Reduce Workflows | 2011 | CIDR | 0.00011961469 |
| 2,028 | Putting Lipstick on Pig: Enabling Database-style Workflow Provenance | 2012 | VLDB | 9.7433981e-05 |
| 2,359 | Data Market Platforms: Trading Data Assets to Solve Data Problems | 2020 | VLDB | 8.9607667e-05 |
| 3,105 | Data X-Ray: A Diagnostic Tool for Data Errors | 2015 | SIGMOD | 7.5568954e-05 |
| 4,301 | Packing Experiments for Sharing and Publication | 2013 | SIGMOD | 6.2885419e-05 |
| 4,851 | Provenance for Natural Language Queries | 2017 | VLDB | 5.8768322e-05 |
| 5,288 | Bolt-on, Compact, and Rapid Program Slicing for Notebooks | 2022 | VLDB | 5.5836876e-05 |
| 5,372 | ReproZip: Computational Reproducibility With Ease | 2016 | SIGMOD | 5.5428429e-05 |
| 6,295 | Your notebook is not crumby enough, REPLace it | 2020 | CIDR | 5.1249204e-05 |
| 6,409 | Fine-Grained Lineage for Safer Notebook Interactions | 2021 | VLDB | 5.0756653e-05 |
| 6,662 | Selective Provenance for Datalog Programs Using Top-K Queries | 2015 | VLDB | 4.9704872e-05 |
| 7,229 | Sailing the Information Ocean with Awareness of Currents: Discovery and Application of Source Dependence | 2009 | CIDR | 4.7950172e-05 |
| 7,370 | Detecting and Resolving Unsound Workflow Views for Correct Provenance Analysis | 2009 | SIGMOD | 4.7500735e-05 |
| 7,549 | SOLOMON: Seeking the Truth Via Copying Detection | 2010 | VLDB | 4.7137426e-05 |
| 7,720 | Provenance: On and Behind the Screens | 2016 | SIGMOD | 4.6684701e-05 |
| 7,833 | Dependency-Driven Analytics: a Compass for Uncharted Data Oceans | 2017 | CIDR | 4.6382648e-05 |
| 8,078 | Meta-Dataflows: Efficient Exploratory Dataflow Jobs | 2018 | SIGMOD | 4.5914967e-05 |
| 8,504 | Distributed Time-aware Provenance | 2013 | VLDB | 4.496125e-05 |
| 9,059 | Tracking Personal Data Use: Provenance And Trust | 2015 | CIDR | 4.4039656e-05 |
| 9,179 | Equivalence-Invariant Algebraic Provenance for Hyperplane Update Queries | 2020 | SIGMOD | 4.3820222e-05 |
| 9,622 | NLProv: Natural Language Provenance | 2016 | VLDB | 4.3163112e-05 |
| 9,907 | PROPOLIS: Provisioned Analysis of Data-Centric Processes | 2013 | VLDB | 4.2577164e-05 |
| 10,024 | LPStream: Fine-grained Lazy Provenance for Stream Processing | 2026 | SIGMOD | 4.1945683e-05 |
| 10,168 | FlowPilot: A Suggestion System for Designing Scientific Workflows | 2026 | SIGMOD | 4.1945683e-05 |
| 11,908 | Even Metadata is Getting Big: Annotation Summarization using InsightNotes | 2015 | SIGMOD | 4.1945683e-05 |
| 11,934 | SAASFEE: Scalable Scientific Workflow Execution Engine | 2015 | VLDB | 4.1945683e-05 |
| 12,014 | A Provenance Framework for Data-Dependent Process Analysis | 2014 | VLDB | 4.1945683e-05 |
| 12,345 | PDiffView: Viewing the Difference in Provenance of Workflow Results | 2009 | VLDB | 4.1945683e-05 |
| 13,548 | WOLVES: Achieving Correct Provenance Analysis by Detecting and Resolving Unsound Workflow Views | 2009 | VLDB | - |
Outgoing Citations (Sorted by Pagerank)
Showing 2 of 2 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,225 | Computational Reproducibility: State-of-the-Art, Challenges, and Database Research Opportunities | 2012 | SIGMOD | 4.369407e-05 |
| 2,892 | Data Provenance at Internet Scale: Architecture, Experiences, and the Road Ahead | 2017 | CIDR | 7.9480559e-05 |
| 13,544 | Storing Scientific Workflows in a Database | 2009 | VLDB | - |
| 13,209 | Theory and Practice of Provenance | 2022 | SIGMOD | - |
| 2,524 | Provenance Management in Curated Databases | 2006 | SIGMOD | 8.6017899e-05 |
| 1,765 | Efficient Lineage Tracking For Scientific Workflows | 2008 | SIGMOD | 0.00010630348 |
| 7,132 | Enabling Privacy in Provenance-Aware Workflow Systems | 2011 | CIDR | 4.8227603e-05 |
| 8,933 | Querying and Re-Using Workflows with VisTrails | 2008 | SIGMOD | 4.427232e-05 |
| 7,720 | Provenance: On and Behind the Screens | 2016 | SIGMOD | 4.6684701e-05 |
| 12,457 | Provenance in Databases (Tutorial Outline) | 2007 | SIGMOD | 4.1945683e-05 |