Debugging Missing Answers for Spark Queries over Nested Data with Breadcrumb
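Summary: Breadcrumb produces query-based explanations for answers missing from the results of Spark queries over nested data: given a query and an expected but absent result, it pinpoints the query operators responsible. It scales to big data and explains common errors in queries over nested and de-normalized data, such as misinterpreted schema semantics, so developers know which parts of the query to fix. (summarized by gpt-5-nano on Feb 09 2026)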
Authors
1. Ralf Diestelkämper
2. Seokki Lee
3. Boris Glavic
4. Melanie Herschel
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,826 | Why Not Yet: Fixing a Top-k Ranking that Is Not Fair to Individuals | 2023 | VLDB | 5.3124507e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 487 | Why Not? | 2009 | SIGMOD | 0.00022050218 |
| 611 | Lineage Tracing for General Data Warehouse Transformations | 2001 | VLDB | 0.00019231115 |
| 968 | Schema and Ontology Matching with COMA++ | 2005 | SIGMOD | 0.0001495703 |
| 6,975 | NLProveNAns: Natural Language Provenance for Non-Answers | 2018 | VLDB | 4.8772572e-05 |
| 7,678 | To Not Miss the Forest for the Trees - A Holistic Approach for Explaining Missing Answers over Nested Data | 2021 | SIGMOD | 4.6813062e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,868 | LEAP: A Low-cost Spark SQL Query Optimizer using Pairwise Comparison | 2025 | VLDB | 4.1945683e-05 |
| 11,405 | SparkCAD: Caching Anomalies Detector for Spark Applications | 2022 | VLDB | 4.1945683e-05 |
| 6,658 | Scalable Querying of Nested Data | 2021 | VLDB | 4.9711629e-05 |
| 8,586 | A Demonstration of DLBD: Database Logic Bug Detection System | 2023 | VLDB | 4.4902778e-05 |
| 1,482 | Automating Large-Scale Data Quality Verification | 2018 | VLDB | 0.00011725533 |
| 11,197 | QaaD (Query-as-a-Data): Scalable Execution of Massive Number of Small Queries in Spark | 2023 | SIGMOD | 4.1945683e-05 |
| 3,200 | Big Data Analytics with Datalog Queries on Spark | 2016 | SIGMOD | 7.3912411e-05 |
| 5,106 | Debugging Big Data Analytics in Spark with BigDebug | 2017 | SIGMOD | 5.6927181e-05 |
| 11,662 | Capturing and Querying Structural Provenance in Spark with Pebble | 2019 | SIGMOD | 4.1945683e-05 |
| 7,678 | To Not Miss the Forest for the Trees - A Holistic Approach for Explaining Missing Answers over Nested Data | 2021 | SIGMOD | 4.6813062e-05 |