Explaining Outputs in Modern Data Analytics
Summary: A framework for interactively explaining the outputs of modern data-parallel analytics, including iterative computations. It uses first-occurrence pruning to shrink explanations and a sufficiency-based method to reproduce the target output, both implemented as differential dataflow operators for fast, incremental updates. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Zaheer Chothia
2. John Liagouris
3. Frank McSherry
4. Timothy Roscoe
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,280 | SMOKE: Fine-grained Lineage at Interactive Speed | 2018 | VLDB | 9.1111033e-05 |
| 5,944 | DBSP: Automatic Incremental View Maintenance for Rich Query Languages | 2023 | VLDB | 5.2628186e-05 |
| 8,230 | You Say 'What', I Hear 'Where' and 'Why' - (Mis-)Interpreting SQL to Derive Fine-Grained Provenance | 2018 | VLDB | 4.5541444e-05 |
| 10,816 | mlidea: Interactively Improving ML Data Preparation Code via "Shadow Pipelines" | 2025 | VLDB | 4.1945683e-05 |
| 11,452 | Flow Provenance in Temporal Interaction Networks | 2021 | SIGMOD | 4.1945683e-05 |
| 11,647 | Ariadne: Online Provenance for Big Graph Analytics | 2019 | SIGMOD | 4.1945683e-05 |
| 11,710 | Demonstration of Smoke: A Deep Breath of Data-Intensive Lineage Applications | 2018 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 19 of 19 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,556 | Interactive Query Explanations Using Fine Grained Provenance | 2022 | SIGMOD | 4.7117814e-05 |
| 2,611 | Opening the Black Boxes in Data Flow Optimization | 2012 | VLDB | 8.4536967e-05 |
| 538 | The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing | 2015 | VLDB | 0.00020678804 |
| 3,710 | Optimizing Analytic Data Flows for Multiple Execution Engines | 2012 | SIGMOD | 6.8238962e-05 |
| 942 | A Formal Approach to Finding Explanations for Database Queries | 2014 | SIGMOD | 0.00015155714 |
| 12,039 | Iterative Parallel Data Processing with Stratosphere: An Inside Look | 2013 | SIGMOD | 4.1945683e-05 |
| 10,883 | IcedTea: Efficient and Responsive Time-Travel Debugging in Dataflow Systems | 2025 | VLDB | 4.1945683e-05 |
| 2,035 | Generating Example Data for Dataflow Programs | 2009 | SIGMOD | 9.7149269e-05 |
| 522 | Differential dataflow | 2013 | CIDR | 0.00021099241 |
| 2,172 | Spinning Fast Iterative Data Flows | 2012 | VLDB | 9.3706587e-05 |