Database Paper Browser

Putting Lipstick on Pig: Enabling Database-style Workflow Provenance

Summary: Unifies database-style and workflow-style provenance by expressing workflow module functionality in Pig Latin. This exposes module state and fine-grained data dependencies, and supports ZoomIn/ZoomOut operations for adjusting provenance granularity. The Lipstick system implements the approach, and the paper benchmarks its performance. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10483
Venue: VLDB
Year: 2012
Pagerank: 9.7433981e-05
Overall Rank: 2,028 | 85.90%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 24 of 24 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
2,027 | Titian: Data Provenance Support in Spark | 2016 | VLDB | 9.7437067e-05
2,456 | Production Machine Learning Pipelines: Empirical Analysis and Optimization Opportunities | 2021 | SIGMOD | 8.7733773e-05
2,764 | The Semiring Framework for Database Provenance | 2017 | PODS | 8.1574444e-05
3,105 | Data X-Ray: A Diagnostic Tool for Data Errors | 2015 | SIGMOD | 7.5568954e-05
4,424 | PrIU: A Provenance-Based Approach for Incrementally Updating Regression Models | 2020 | SIGMOD | 6.198474e-05
4,774 | LIMA: Fine-grained Lineage Tracing and Reuse in Machine Learning Systems | 2021 | SIGMOD | 5.9316087e-05
5,086 | Improving Reproducibility of Data Science Pipelines through Transparent Provenance Capture | 2020 | VLDB | 5.7078462e-05
5,209 | Explaining Outputs in Modern Data Analytics | 2016 | VLDB | 5.629362e-05
5,445 | QFix: Diagnosing Errors through Query Histories | 2017 | SIGMOD | 5.5020909e-05
6,291 | Lightweight Inspection of Data Preprocessing in Native Machine Learning Pipelines | 2021 | CIDR | 5.1269764e-05
7,678 | To Not Miss the Forest for the Trees - A Holistic Approach for Explaining Missing Answers over Nested Data | 2021 | SIGMOD | 4.6813062e-05
7,720 | Provenance: On and Behind the Screens | 2016 | SIGMOD | 4.6684701e-05
7,833 | Dependency-Driven Analytics: a Compass for Uncharted Data Oceans | 2017 | CIDR | 4.6382648e-05
8,394 | Hypothetical Reasoning via Provenance Abstraction | 2019 | SIGMOD | 4.527807e-05
8,729 | OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance From Database Query Event Logs | 2023 | VLDB | 4.4582221e-05
9,043 | Query-Guided Resolution in Uncertain Databases | 2023 | SIGMOD | 4.4039656e-05
9,059 | Tracking Personal Data Use: Provenance And Trust | 2015 | CIDR | 4.4039656e-05
9,202 | Compact, Tamper-Resistant Archival of Fine-Grained Provenance | 2021 | VLDB | 4.3742967e-05
10,419 | Unified Lineage System: Tracking Data Provenance at Scale | 2025 | SIGMOD | 4.1945683e-05
10,910 | Postulates for Provenance: Instance-based provenance for first-order logic | 2024 | PODS | 4.1945683e-05
11,647 | Ariadne: Online Provenance for Big Graph Analytics | 2019 | SIGMOD | 4.1945683e-05
11,662 | Capturing and Querying Structural Provenance in Spark with Pebble | 2019 | SIGMOD | 4.1945683e-05
11,892 | Looking at Everything in Context | 2015 | CIDR | 4.1945683e-05
12,014 | A Provenance Framework for Data-Dependent Process Analysis | 2014 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 10 of 10 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
3 | Pig Latin: A Not-So-Foreign Language for Data Processing | 2008 | SIGMOD | 0.0024183614
31 | Provenance Semirings | 2007 | PODS | 0.0007857786
923 | Provenance and Scientific Workflows: Challenges and Opportunities | 2008 | SIGMOD | 0.0001527609
1,106 | Provenance for Aggregate Queries | 2011 | PODS | 0.0001398766
1,440 | Provenance for Generalized Map and Reduce Workflows | 2011 | CIDR | 0.00011961469
1,866 | Update Exchange with Mappings and Provenance | 2007 | VLDB | 0.00010272139
2,173 | Querying Data Provenance | 2010 | SIGMOD | 9.3676609e-05
4,783 | Ibis: A Provenance Manager for Multi-Layer Systems | 2011 | CIDR | 5.9253575e-05
5,270 | Annotated XML: Queries and Provenance | 2008 | PODS | 5.5963545e-05
8,935 | Zoom*UserViews: Querying Relevant Provenance in Workflow Systems | 2007 | VLDB | 4.427232e-05

Semantically Similar Papers