C2Metadata: Automating the Capture of Data Transformations from Statistical Scripts in Data Documentation
Summary: Automates the capture of data transformations from statistical scripts into provenance metadata. Introduces SDTA, a compact algebra for transformations, implemented in SDTL; enables provenance tracing across statistical languages (SPSS and Stata, with SAS and R support planned) through a web UI. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Jie Song
2. George Alter
3. H. V. Jagadish
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,216 | Demystifying the QoS and QoE of Edge-hosted Video Streaming Applications in the Wild with SNESet | 2023 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 0 of 0 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,086 | Improving Reproducibility of Data Science Pipelines through Transparent Provenance Capture | 2020 | VLDB | 5.7078462e-05 |
| 4,691 | Managing Derived Data in the Gaea Scientific DBMS | 1993 | VLDB | 5.9970685e-05 |
| 14,220 | Concept Description Language for Statistical Data Modeling | 1990 | VLDB | - |
| 11,396 | DPDS: Assisting Data Science with Data Provenance | 2022 | VLDB | 4.1945683e-05 |
| 5,275 | Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples | 2023 | VLDB | 5.5905507e-05 |
| 2,832 | Intensional Associations Between Data and Metadata | 2007 | SIGMOD | 8.050082e-05 |
| 1,625 | Data Profiling with Metanome | 2015 | VLDB | 0.00011094926 |
| 7,548 | Data Exchange with Data-Metadata Translations | 2008 | VLDB | 4.7143189e-05 |
| 8,163 | Capturing and Querying Fine-grained Provenance of Preprocessing Pipelines in Data Science | 2021 | VLDB | 4.5723431e-05 |
| 748 | Metadata Management for Large Statistical Databases | 1982 | VLDB | 0.00017268814 |