Combining Aggregation and Sampling (Nearly) Optimally for Approximate Query Processing
Summary: Introduces PASS (Precomputation-Assisted Stratified Sampling), which speeds up approximate query processing (AQP) with a tree of partial aggregates built over a partitioned dataset. A depth-first search of the tree answers partition-aligned predicates exactly; partitions that only partially overlap the predicate are approximated from stratified samples. The paper also gives an algorithm for near-optimal partitioning.
(summarized by gpt-5-nano on Feb 09 2026)
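The idea in the summary can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: it handles only one-dimensional range predicates and SUM, it splits partitions by equal row count rather than with the paper's near-optimal partitioning algorithm, and the names `Node`, `build`, and `approx_sum` are hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Row = Tuple[float, float]  # (predicate attribute, value to aggregate)


@dataclass
class Node:
    lo: float                  # smallest key in this partition
    hi: float                  # largest key in this partition
    count: int                 # precomputed COUNT for the partition
    total: float               # precomputed SUM for the partition
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    sample: List[Row] = field(default_factory=list)  # filled for leaves only


def build(rows: List[Row], leaf_size: int, sample_size: int,
          rng: random.Random) -> Node:
    """Build the tree over non-empty rows sorted by key.

    Assumption: equal-count splits stand in for the paper's partitioning.
    """
    node = Node(rows[0][0], rows[-1][0], len(rows), sum(v for _, v in rows))
    if len(rows) <= leaf_size:
        # Each leaf is one stratum with its own uniform sample.
        node.sample = rng.sample(rows, min(sample_size, len(rows)))
    else:
        mid = len(rows) // 2
        node.left = build(rows[:mid], leaf_size, sample_size, rng)
        node.right = build(rows[mid:], leaf_size, sample_size, rng)
    return node


def approx_sum(node: Node, q_lo: float, q_hi: float) -> float:
    """Estimate SUM(value) over rows with q_lo <= key <= q_hi via DFS."""
    if node.hi < q_lo or node.lo > q_hi:
        return 0.0                      # partition is disjoint from the query
    if q_lo <= node.lo and node.hi <= q_hi:
        return node.total               # fully covered: exact partial aggregate
    if node.left is None:
        # Partially overlapping leaf: scale the sample's matching sum
        # by the stratum size divided by the sample size.
        hits = sum(v for k, v in node.sample if q_lo <= k <= q_hi)
        return node.count * hits / len(node.sample)
    return approx_sum(node.left, q_lo, q_hi) + approx_sum(node.right, q_lo, q_hi)


if __name__ == "__main__":
    rng = random.Random(0)
    data = sorted((rng.uniform(0, 100), rng.uniform(0, 10)) for _ in range(10_000))
    tree = build(data, leaf_size=500, sample_size=50, rng=rng)
    exact = sum(v for k, v in data if 20 <= k <= 60)
    print("estimate:", approx_sum(tree, 20, 60), "exact:", exact)
```

Only leaves that straddle a query boundary contribute sampling error in this sketch; every partition lying wholly inside the predicate is answered from its precomputed sum.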
- Paper ID: 6168
- Venue: SIGMOD
- Year: 2021
- Pagerank: 4.944395e-05
- Overall Rank: 6,740 | 53.12%
- DOI: 10.1145/3448016.3457277
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 11 of 11 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 5,214 | ThalamusDB: Approximate Query Processing on Multi-Modal Data | 2024 | SIGMOD | 5.624434e-05 |
| 5,401 | ALECE: An Attention-based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads | 2024 | VLDB | 5.5285035e-05 |
| 8,373 | Hierarchical Residual Encoding for Multiresolution Time Series Compression | 2023 | SIGMOD | 4.5329467e-05 |
| 9,118 | Towards Observability for Production Machine Learning Pipelines | 2022 | VLDB | 4.3928288e-05 |
| 9,431 | PairwiseHist: Fast, Accurate and Space-Efficient Approximate Query Processing with Data Compression | 2024 | VLDB | 4.3434046e-05 |
| 9,848 | Saving Money for Analytical Workloads in the Cloud | 2024 | VLDB | 4.2721228e-05 |
| 10,223 | On Fair Epsilon Net and Geometric Hitting Set | 2026 | VLDB | 4.1945683e-05 |
| 10,359 | Smallest Synthetic Witnesses for Conjunctive Queries | 2025 | PODS | 4.1945683e-05 |
| 10,481 | FAAQP: Fast and Accurate Approximate Query Processing based on Bitmap-augmented Sum-Product Network | 2025 | SIGMOD | 4.1945683e-05 |
| 10,608 | Approximation-First Timeseries Query At Scale | 2025 | VLDB | 4.1945683e-05 |
| 10,927 | Computing A Well-Representative Summary of Conjunctive Query Results | 2024 | PODS | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 29 of 29 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 14 | Online Aggregation | 1997 | SIGMOD | 0.0010801504 |
| 46 | Simple Random Sampling from Relational Databases | 1986 | VLDB | 0.00070894702 |
| 326 | Optimal Histograms with Quality Guarantees | 1998 | VLDB | 0.00027358981 |
| 402 | Mergeable Summaries | 2012 | PODS | 0.00024196343 |
| 608 | DeepDB: Learn from Data, not from Queries! | 2020 | VLDB | 0.00019235898 |
| 647 | Progressive Approximate Aggregate Queries with a Multi-Resolution Tree Structure | 2001 | SIGMOD | 0.00018668224 |
| 739 | Congressional Samples for Approximate Answering of Group-By Queries | 2000 | SIGMOD | 0.00017401518 |
| 758 | Deep Unsupervised Cardinality Estimation | 2020 | VLDB | 0.0001706608 |
| 1,120 | Global Optimization of Histograms | 2001 | SIGMOD | 0.00013856211 |
| 1,204 | VerdictDB: Universalizing Approximate Query Processing | 2018 | SIGMOD | 0.00013319541 |
| 1,260 | Dynamic Sample Selection for Approximate Query Processing | 2003 | SIGMOD | 0.00012993347 |
| 1,323 | Quickr: Lazily Approximating Complex AdHoc Queries in BigData Clusters | 2016 | SIGMOD | 0.00012601997 |
| 1,335 | ICICLES: Self-tuning Samples for Approximate Query Answering | 2000 | VLDB | 0.00012502131 |
| 1,369 | Random Sampling over Joins Revisited | 2018 | SIGMOD | 0.00012339777 |
| 1,477 | Fine-grained Partitioning for Aggressive Data Skipping | 2014 | SIGMOD | 0.00011770865 |
| 1,574 | Approximate Query Processing: No Silver Bullet | 2017 | SIGMOD | 0.00011287495 |
| 2,184 | A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data | 2014 | SIGMOD | 9.3429789e-05 |
| 2,580 | Sample + Seek: Approximating Aggregates with Distribution Precision Guarantee | 2016 | SIGMOD | 8.5058814e-05 |
| 2,588 | Database Learning: Toward a Database that Becomes Smarter Every Time | 2017 | SIGMOD | 8.4909562e-05 |
| 2,808 | A Robust, Optimization-Based Approach for Approximate Answering of Aggregate Queries | 2001 | SIGMOD | 8.0870741e-05 |
| 3,944 | AQP++: Connecting Approximate Query Processing With Aggregate Precomputation for Interactive Analytics | 2018 | SIGMOD | 6.6078243e-05 |
| 4,017 | Optimal Histograms for Hierarchical Range Queries (Extended Abstract) | 2000 | PODS | 6.524501e-05 |
| 4,030 | Revisiting Reuse for Approximate Query Processing | 2017 | VLDB | 6.5129665e-05 |
| 6,491 | Robust Estimation With Sampling and Approximate Pre-Aggregation | 2003 | VLDB | 5.0429323e-05 |
| 7,251 | Learning to Sample: Counting with Complex Queries | 2020 | VLDB | 4.7890519e-05 |
| 8,138 | Fast and Reliable Missing Data Contingency Analysis with Predicate-Constraints | 2020 | SIGMOD | 4.5771031e-05 |
| 8,240 | Experiences with Approximating Queries in Microsoft’s Production Big-Data Clusters | 2019 | VLDB | 4.5522563e-05 |
| 8,673 | CoopStore: Optimizing Precomputed Summaries for Aggregation | 2020 | VLDB | 4.4709116e-05 |
| 8,728 | Stale View Cleaning: Getting Fresh Answers from Stale Materialized Views | 2015 | VLDB | 4.4589711e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 647 | Progressive Approximate Aggregate Queries with a Multi-Resolution Tree Structure | 2001 | SIGMOD | 0.00018668224 |
| 2,808 | A Robust, Optimization-Based Approach for Approximate Answering of Aggregate Queries | 2001 | SIGMOD | 8.0870741e-05 |
| 1,260 | Dynamic Sample Selection for Approximate Query Processing | 2003 | SIGMOD | 0.00012993347 |
| 1,874 | Knowing When You’re Wrong: Building Fast and Reliable Approximate Query Processing Systems | 2014 | SIGMOD | 0.00010244443 |
| 11,285 | Approximate Queries over Concurrent Updates | 2023 | VLDB | 4.1945683e-05 |
| 10,049 | Approximate Query Processing under Updates | 2026 | SIGMOD | 4.1945683e-05 |
| 3,944 | AQP++: Connecting Approximate Query Processing With Aggregate Precomputation for Interactive Analytics | 2018 | SIGMOD | 6.6078243e-05 |
| 6,493 | Joins on Samples: A Theoretical Guide for Practitioners | 2020 | VLDB | 5.0424713e-05 |
| 10,337 | Efficient Approximate Query Processing with Block Sampling | 2025 | CIDR | 4.1945683e-05 |
| 2,580 | Sample + Seek: Approximating Aggregates with Distribution Precision Guarantee | 2016 | SIGMOD | 8.5058814e-05 |