Accelerating Approximate Aggregation Queries with Expensive Predicates
Summary: Proposes ABae, a proxy-based framework for accelerating approximate aggregation queries whose predicates require expensive DNN evaluation. ABae stratifies records by proxy score, draws a pilot sample, and uses plug-in estimates from that pilot to allocate the remaining samples across strata, accounting for draws that fail the predicate. It achieves an optimal convergence rate and reduces labeling cost by up to 2.3×.
(summarized by gpt-5-nano on Feb 09 2026)
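The pipeline described in the summary (stratify by proxy, pilot sample, plug-in allocation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `stratified_estimate`, its parameters, the equal-size strata, the even pilot split, and the allocation weight `sqrt(p_k) * sigma_k` are all assumptions made for the example, since the summary does not spell out these details.

```python
import math
import random
import statistics


def stratified_estimate(records, proxy, oracle, k=4, budget=400,
                        pilot_frac=0.5, seed=0):
    """Two-stage stratified estimate of AVG(value) over records that pass
    an expensive predicate. All names here are hypothetical.

    proxy:  cheap function, record -> score
    oracle: expensive function, record -> (passes, value)
    """
    rng = random.Random(seed)

    # Stratify: sort by proxy score and cut into k roughly equal strata.
    ranked = sorted(records, key=proxy)
    size = len(ranked) // k
    strata = [ranked[i * size:(i + 1) * size] if i < k - 1
              else ranked[i * size:] for i in range(k)]
    for stratum in strata:
        rng.shuffle(stratum)  # a shuffled prefix is a sample without replacement

    drawn = [0] * k                 # oracle calls spent per stratum
    hits = [[] for _ in range(k)]   # values of draws that passed the predicate

    def draw(i, n):
        n = min(n, len(strata[i]) - drawn[i])
        for rec in strata[i][drawn[i]:drawn[i] + n]:
            passes, value = oracle(rec)
            if passes:
                hits[i].append(value)
        drawn[i] += n

    # Stage 1: pilot sample, split evenly across strata.
    pilot = int(budget * pilot_frac) // k
    for i in range(k):
        draw(i, pilot)

    # Plug-in estimates of the pass rate p and spread sigma per stratum.
    # Assumed allocation weight: sqrt(p) * sigma.
    weights = []
    for i in range(k):
        p = len(hits[i]) / drawn[i] if drawn[i] else 0.0
        sigma = statistics.pstdev(hits[i]) if len(hits[i]) > 1 else 0.0
        weights.append(math.sqrt(p) * sigma)

    # Stage 2: spend the remaining budget according to the weights.
    remaining = budget - pilot * k
    total = sum(weights)
    for i in range(k):
        share = weights[i] / total if total > 0 else 1.0 / k
        draw(i, int(remaining * share))

    # Combine: weight each stratum mean by its estimated number of positives.
    num = den = 0.0
    for i in range(k):
        if not hits[i]:
            continue  # no draw passed the predicate in this stratum
        p = len(hits[i]) / drawn[i]
        w = len(strata[i]) * p
        num += w * (sum(hits[i]) / len(hits[i]))
        den += w
    return num / den if den else float("nan")
```

Draws that fail the predicate still consume budget but contribute no value, which is why the sketch tracks `drawn` and `hits` separately and estimates a per-stratum pass rate before combining.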
- Paper ID: 12411
- Venue: VLDB
- Year: 2021
- Pagerank: 5.9787986e-05
- Overall Rank: 4,712 | 67.23%
- DOI: 10.14778/3476249.3476285
[Figure: Incoming Non-self Citations Over Time]
Incoming Citations (Sorted by Pagerank)
Showing 20 of 20 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 3,606 | EVA: A Symbolic Approach to Accelerating Exploratory Video Analytics with Materialized Views | 2022 | SIGMOD | 6.9260354e-05 |
| 4,501 | TASTI: Semantic Indexes for Machine Learning-based Queries over Unstructured Data | 2022 | SIGMOD | 6.137686e-05 |
| 4,567 | Optimizing Video Analytics with Declarative Model Relationships | 2023 | VLDB | 6.080526e-05 |
| 4,641 | VIVA: An End-to-End System for Interactive Video Analytics | 2022 | CIDR | 6.027004e-05 |
| 5,171 | Abacus: A Cost-Based Optimizer for Semantic Operator Systems | 2026 | VLDB | 5.6464993e-05 |
| 5,214 | ThalamusDB: Approximate Query Processing on Multi-Modal Data | 2024 | SIGMOD | 5.624434e-05 |
| 6,877 | Extract-Transform-Load for Video Streams | 2023 | VLDB | 4.8974054e-05 |
| 7,338 | Aero: Adaptive Query Processing of ML Queries | 2025 | SIGMOD | 4.7584583e-05 |
| 7,928 | Accelerating Aggregation Queries on Unstructured Streams of Data | 2023 | VLDB | 4.613455e-05 |
| 8,469 | Semantic Operators and Their Optimization: Enabling LLM-Based Data Processing with Accuracy Guarantees in LOTUS | 2025 | VLDB | 4.5041113e-05 |
| 9,990 | Deep Research is the New Analytics System: Towards Building the Runtime for AI-Driven Analytics | 2026 | CIDR | 4.1945683e-05 |
| 10,064 | Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees | 2026 | SIGMOD | 4.1945683e-05 |
| 10,215 | Task Cascades for Efficient Unstructured Data Processing | 2026 | SIGMOD | 4.1945683e-05 |
| 10,337 | Efficient Approximate Query Processing with Block Sampling | 2025 | CIDR | 4.1945683e-05 |
| 10,382 | MAST: Towards Efficient Analytical Query Processing on Point Cloud Data | 2025 | SIGMOD | 4.1945683e-05 |
| 10,497 | PilotDB: Database-Agnostic Online Approximate Query Processing with A Priori Error Guarantees | 2025 | SIGMOD | 4.1945683e-05 |
| 10,518 | High-Throughput Ingestion for Video Warehouse: Comprehensive Configuration and Effective Exploration | 2025 | SIGMOD | 4.1945683e-05 |
| 10,523 | Scalable Complex Event Processing on Video Streams | 2025 | SIGMOD | 4.1945683e-05 |
| 10,944 | Predictive and Near-Optimal Sampling for View Materialization in Video Databases | 2024 | SIGMOD | 4.1945683e-05 |
| 11,061 | Optimizing Video Queries with Declarative Clues | 2024 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 16 of 16 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 5,072 | Optimizing Machine Learning Inference Queries with Correlative Proxy Models | 2022 | VLDB | 5.7185674e-05 |
| 6,230 | Learned Approximate Query Processing: Make it Light, Accurate and Fast | 2021 | CIDR | 5.145989e-05 |
| 3,558 | Approximate Selection with Guarantees using Proxies | 2020 | VLDB | 6.9765724e-05 |
| 1,260 | Dynamic Sample Selection for Approximate Query Processing | 2003 | SIGMOD | 0.00012993347 |
| 2,580 | Sample + Seek: Approximating Aggregates with Distribution Precision Guarantee | 2016 | SIGMOD | 8.5058814e-05 |
| 9,807 | Demonstration of Accelerating Machine Learning Inference Queries with Correlative Proxy Models | 2022 | VLDB | 4.2805224e-05 |
| 329 | Accelerating Machine Learning Inference with Probabilistic Predicates | 2018 | SIGMOD | 0.00027249545 |
| 6,740 | Combining Aggregation and Sampling (Nearly) Optimally for Approximate Query Processing | 2021 | SIGMOD | 4.944395e-05 |
| 11,426 | Accelerating Queries over Unstructured Data with ML | 2021 | CIDR | 4.1945683e-05 |
| 9,351 | On Efficient Approximate Queries over Machine Learning Models | 2023 | VLDB | 4.3524472e-05 |