Data Provenance at Internet Scale: Architecture, Experiences, and the Road Ahead
Summary: Recasts database provenance for Internet-scale distributed systems, redesigning models/architecture to handle scale, heterogeneity, and real-time forensic/diagnostic needs. Describes a unified system, deployments for intrusion analysis and SDN debugging/auto-fix, and open challenges. (summarized by gpt-5-mini on Feb 09 2026)
Authors
1. Ang Chen
2. Yang Wu
3. Andreas Haeberlen
4. Boon Thau Loo
5. Wenchao Zhou
Incoming Citations (Sorted by Pagerank)
Showing 8 of 8 citing papers.
| Overall Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,280 | SMOKE: Fine-grained Lineage at Interactive Speed | 2018 | VLDB | 9.1111033e-05 |
| 2,764 | The Semiring Framework for Database Provenance | 2017 | PODS | 8.1574444e-05 |
| 6,160 | A Demonstration of Interactive Analysis of Performance Measurements with Viska | 2017 | SIGMOD | 5.1758344e-05 |
| 7,857 | Fixed It For You: Protocol Repair Using Lineage Graphs | 2019 | CIDR | 4.6345517e-05 |
| 8,341 | BugDoc: Algorithms to Debug Computational Processes | 2020 | SIGMOD | 4.5433282e-05 |
| 8,663 | Transactions Make Debugging Easy | 2023 | CIDR | 4.4722808e-05 |
| 9,220 | BugDoc: A System for Debugging Computational Pipelines | 2020 | SIGMOD | 4.3702188e-05 |
| 10,886 | FaDE: More Than a Million What-ifs Per Second | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 14 of 14 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,665 | Ursprung: Provenance for Large-Scale Analytics Environments | 2019 | SIGMOD | 4.1945683e-05 |
| 8,729 | OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance From Database Query Event Logs | 2023 | VLDB | 4.4582221e-05 |
| 7,132 | Enabling Privacy in Provenance-Aware Workflow Systems | 2011 | CIDR | 4.8227603e-05 |
| 12,014 | A Provenance Framework for Data-Dependent Process Analysis | 2014 | VLDB | 4.1945683e-05 |
| 6,084 | Distributed Provenance Compression | 2017 | SIGMOD | 5.2196728e-05 |
| 8,504 | Distributed Time-aware Provenance | 2013 | VLDB | 4.496125e-05 |
| 11,798 | Privacy-Preserving Network Provenance | 2017 | VLDB | 4.1945683e-05 |
| 2,173 | Querying Data Provenance | 2010 | SIGMOD | 9.3676609e-05 |
| 13,492 | NetTrails: A Declarative Platform for Maintaining and Querying Provenance in Distributed Systems | 2011 | SIGMOD | - |
| 3,584 | Efficient Querying and Maintenance of Network Provenance at Internet-Scale | 2010 | SIGMOD | 6.9460423e-05 |