Data Canopy: Accelerating Exploratory Statistical Analysis
Summary: Data Canopy maintains an in-memory library of basic aggregates so that statistics computed over overlapping portions of the data can be reused, cutting recomputation during exploratory analysis. It decomposes statistics into reusable basic aggregates, addresses how those aggregates are stored and maintained, and tunes for the underlying hardware, yielding a speedup of roughly 10x after 100 queries compared with the state of the art. (summarized by gpt-5-nano on Feb 09 2026)
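To make the reuse idea in the summary concrete, here is a minimal Python sketch of decomposing the mean and variance into per-chunk sums and sums of squares, which are cached and shared across overlapping range queries. It is an illustration under stated assumptions, not the authors' implementation: the class name `AggregateCache`, the chunk size, the `computed` counter, and the restriction to chunk-aligned ranges are all hypothetical simplifications.

```python
class AggregateCache:
    """Hypothetical cache of per-chunk basic aggregates (sum, sum of squares)."""

    def __init__(self, data, chunk=4):
        self.data = data
        self.chunk = chunk      # assumed fixed chunk size
        self.cache = {}         # chunk index -> (sum, sum of squares)
        self.computed = 0       # chunk aggregates built so far (illustration only)

    def _chunk_aggs(self, i):
        # Build the basic aggregates for chunk i once, then reuse them.
        if i not in self.cache:
            part = self.data[i * self.chunk:(i + 1) * self.chunk]
            self.cache[i] = (sum(part), sum(x * x for x in part))
            self.computed += 1
        return self.cache[i]

    def _range_aggs(self, lo, hi):
        # Simplifying assumption: query ranges [lo, hi) are chunk-aligned.
        if lo % self.chunk or hi % self.chunk or lo >= hi:
            raise ValueError("range must be non-empty and chunk-aligned")
        s = ss = 0
        for i in range(lo // self.chunk, hi // self.chunk):
            cs, css = self._chunk_aggs(i)
            s += cs
            ss += css
        return s, ss, hi - lo

    def mean(self, lo, hi):
        s, _, n = self._range_aggs(lo, hi)
        return s / n

    def variance(self, lo, hi):
        # Population variance from basic aggregates: E[x^2] - (E[x])^2.
        s, ss, n = self._range_aggs(lo, hi)
        return ss / n - (s / n) ** 2


canopy = AggregateCache(list(range(16)), chunk=4)
m = canopy.mean(0, 8)        # builds aggregates for chunks 0 and 1
v = canopy.variance(4, 12)   # reuses chunk 1, builds only chunk 2
```

In this sketch the second query touches two chunks but computes aggregates for only one of them, because the overlapping chunk was already cached by the first query, even though that query asked for a different statistic.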
Authors
1. Abdul Wasay
2. Xinding Wei
3. Niv Dayan
4. Stratos Idreos
Incoming Citations (Sorted by Pagerank)
Showing 9 of 9 citing papers.
| Overall Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,501 | DBEst: Revisiting Approximate Query Processing Engines with Machine Learning Models | 2019 | SIGMOD | 8.6453446e-05 |
| 2,953 | Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries | 2018 | VLDB | 7.8267643e-05 |
| 3,277 | A Layered Aggregate Engine for Analytics Workloads | 2019 | SIGMOD | 7.2871625e-05 |
| 3,606 | EVA: A Symbolic Approach to Accelerating Exploratory Video Analytics with Materialized Views | 2022 | SIGMOD | 6.9260354e-05 |
| 6,008 | Apollo: A Dataset Profiling and Operator Modeling System | 2019 | SIGMOD | 5.2415551e-05 |
| 7,338 | Aero: Adaptive Query Processing of ML Queries | 2025 | SIGMOD | 4.7584583e-05 |
| 8,346 | Deep Learning: Systems and Responsibility | 2021 | SIGMOD | 4.5420668e-05 |
| 9,052 | RawVis: A System for Efficient In-situ Visual Analytics | 2021 | SIGMOD | 4.4039656e-05 |
| 11,474 | Exploring Ratings in Subjective Databases | 2021 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 41 of 41 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,909 | SciBORQ: Scientific data management with Bounds On Runtime and Quality | 2011 | CIDR | 0.00010121304 |
| 4,030 | Revisiting Reuse for Approximate Query Processing | 2017 | VLDB | 6.5129665e-05 |
| 5,203 | Efficient Exploration of Large Scientific Databases | 2002 | VLDB | 5.6316997e-05 |
| 4,675 | Scalable Multi-Query Optimization for Exploratory Queries over Federated Scientific Databases | 2008 | VLDB | 6.0056894e-05 |
| 1,191 | Fast Computation of Sparse Datacubes | 1997 | VLDB | 0.00013434201 |
| 4,758 | Optimization for Active Learning-based Interactive Database Exploration | 2019 | VLDB | 5.9422515e-05 |
| 11,427 | Accelerating Complex Analytics using Speculation | 2021 | CIDR | 4.1945683e-05 |
| 13,359 | Robust Data Transformations | 2015 | CIDR | - |
| 11,411 | High-dimensional Data Cubes | 2022 | VLDB | 4.1945683e-05 |
| 5,981 | DataPrep.EDA: Task-Centric Exploratory Data Analysis for Statistical Modeling in Python | 2021 | SIGMOD | 5.2448986e-05 |