Database Paper Browser

DIFF: A Relational Interface for Large-Scale Data Explanation

Summary: DIFF is a relational aggregation operator that unifies the functionality of explanation engines with declarative SQL for large-scale analytics. It is implemented in MB SQL (MacroBase) in both single-node and distributed versions. The operator preserves the semantics of existing explanation engines, enables query optimizations, and yields speedups of up to 10x on production workloads. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 11975
Venue: VLDB
Year: 2019
Pagerank: 9.4208667e-05
Overall Rank: 2,154 | 85.02%
DOI: 10.14778/3297753.3297761

Authors

Incoming Citations (Sorted by Pagerank)

Showing 19 of 19 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
2,753 | Complaint-driven Training Data Debugging for Query 2.0 | 2020 | SIGMOD | 8.1724339e-05
5,222 | Enabling SQL-based Training Data Debugging for Federated Learning | 2022 | VLDB | 5.6210545e-05
5,313 | XInsight: eXplainable Data Analysis Through The Lens of Causality | 2023 | SIGMOD | 5.573009e-05
5,691 | Putting Things into Context: Rich Explanations for Query Answers using Join Graphs | 2021 | SIGMOD | 5.3684557e-05
6,779 | Explaining Inference Queries with Bayesian Optimization | 2021 | VLDB | 4.9280116e-05
7,556 | Interactive Query Explanations Using Fine Grained Provenance | 2022 | SIGMOD | 4.7117814e-05
7,922 | Video-zilla: An Indexing Layer for Large-Scale Video Analytics | 2022 | SIGMOD | 4.615892e-05
8,531 | Sommelier: Curating DNN Models for the Masses | 2022 | SIGMOD | 4.4937074e-05
8,853 | Complaint-Driven Training Data Debugging at Interactive Speeds | 2022 | SIGMOD | 4.4350727e-05
8,862 | TabEE: Tabular Embeddings Explanations | 2024 | SIGMOD | 4.4331977e-05
9,533 | TSExplain: Surfacing Evolving Explanations for Time Series | 2021 | SIGMOD | 4.3269636e-05
9,849 | Reptile: Aggregation-level Explanations for Hierarchical Data | 2022 | SIGMOD | 4.2721228e-05
9,850 | COMPARE: Accelerating Groupwise Comparison in Relational Databases for Data Analytics | 2021 | VLDB | 4.2721228e-05
10,269 | Database Views as Explanations for Relational Deep Learning | 2026 | VLDB | 4.1945683e-05
10,875 | SDEcho: Efficient Explanation of Aggregated Sequence Difference | 2025 | VLDB | 4.1945683e-05
10,886 | FaDE: More Than a Million What-ifs Per Second | 2025 | VLDB | 4.1945683e-05
11,384 | BABOONS: Black-Box Optimization of Data Summaries in Natural Language | 2022 | VLDB | 4.1945683e-05
11,392 | Automated Relational Data Explanation using External Semantic Knowledge | 2022 | VLDB | 4.1945683e-05
11,629 | Leveraging Organizational Resources to Adapt Models to New Data Modalities | 2020 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 30 of 30 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
21 | C-Store: A Column-oriented DBMS | 2005 | VLDB | 0.00086087497
36 | Fast Algorithms for Mining Association Rules | 1994 | VLDB | 0.00076161096
66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801
99 | On the Propagation of Errors in the Size of Join Results | 1991 | SIGMOD | 0.00050022914
109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983
115 | Eddies: Continuously Adaptive Query Processing | 2000 | SIGMOD | 0.00046221215
121 | Improved Query Performance with Variant Indexes | 1997 | SIGMOD | 0.00045447517
214 | Scorpion: Explaining Away Outliers in Aggregate Queries | 2013 | VLDB | 0.0003363692
224 | CORDS: Automatic Discovery of Correlations and Soft Functional Dependencies | 2004 | SIGMOD | 0.00032746205
310 | The Vertica Analytic Database: C-Store 7 Years Later | 2012 | VLDB | 0.00028132402
454 | An Overview of Query Optimization in Relational Systems | 1998 | PODS | 0.00022734812
476 | Impala: A Modern, Open-Source SQL Engine for Hadoop | 2015 | CIDR | 0.00022226941
903 | To Join or Not to Join? Thinking Twice about Joins before Feature Selection | 2016 | SIGMOD | 0.0001547016
942 | A Formal Approach to Finding Explanations for Database Queries | 2014 | SIGMOD | 0.00015155714
1,022 | DBSherlock: A Performance Diagnostic Tool for Transactional Databases | 2016 | SIGMOD | 0.00014614917
1,167 | Learning Generalized Linear Models Over Normalized Data | 2015 | SIGMOD | 0.00013547713
1,272 | Proactive Re-Optimization | 2005 | SIGMOD | 0.00012920076
1,279 | Towards Linear Algebra over Normalized Data | 2017 | VLDB | 0.00012868394
1,534 | PerfXplain: Debugging MapReduce Job Performance | 2012 | VLDB | 0.00011468393
1,588 | Druid: A Real-time Analytical Data Store | 2014 | SIGMOD | 0.00011239313
1,619 | Adaptive Optimization of Very Large Join Queries | 2018 | SIGMOD | 0.00011111678
1,804 | An Experimental Comparison of Thirteen Relational Equi-Joins in Main Memory | 2016 | SIGMOD | 0.00010501185
2,126 | MacroBase: Prioritizing Attention in Fast Data | 2017 | SIGMOD | 9.4887794e-05
2,402 | Causality and Explanations in Databases | 2014 | VLDB | 8.8928361e-05
3,105 | Data X-Ray: A Diagnostic Tool for Data Errors | 2015 | SIGMOD | 7.5568954e-05
3,515 | Scalable Computation of Acyclic Joins (Extended Abstract) | 2006 | PODS | 7.0220813e-05
4,693 | Multi-Structural Databases | 2005 | PODS | 5.9955924e-05
6,370 | Efficient Implementation of Large-Scale Multi-Structural Databases | 2005 | VLDB | 5.0935585e-05
7,273 | Feature Selection in Enterprise Analytics: A Demonstration using an R-based Data Analytics System | 2013 | VLDB | 4.7810804e-05
11,756 | Prioritizing Attention in Fast Data: Principles and Promise | 2017 | CIDR | 4.1945683e-05

Semantically Similar Papers