Approximate Selection with Guarantees using Proxies
Summary: Introduces algorithms for approximate selection with statistical guarantees, using cheap proxies and a limited number of exact identifications from an oracle. The guarantees target either precision or recall and hold with high probability. The algorithms outperform prior proxy-based methods, with up to 30x improvement on real and synthetic datasets.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 12094
- Venue: VLDB
- Year: 2020
- Pagerank: 6.9765724e-05
- Overall Rank: 3,558 | 75.25%
- DOI: 10.14778/3407790.3407804
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 21 of 21 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,293 | Jointly Optimizing Preprocessing and Inference for DNN-based Visual Analytics | 2021 | VLDB | 7.2629834e-05 |
| 4,501 | TASTI: Semantic Indexes for Machine Learning-based Queries over Unstructured Data | 2022 | SIGMOD | 6.137686e-05 |
| 4,567 | Optimizing Video Analytics with Declarative Model Relationships | 2023 | VLDB | 6.080526e-05 |
| 4,641 | VIVA: An End-to-End System for Interactive Video Analytics | 2022 | CIDR | 6.027004e-05 |
| 4,712 | Accelerating Approximate Aggregation Queries with Expensive Predicates | 2021 | VLDB | 5.9787986e-05 |
| 5,072 | Optimizing Machine Learning Inference Queries with Correlative Proxy Models | 2022 | VLDB | 5.7185674e-05 |
| 6,315 | Seiden: Revisiting Query Processing in Video Database Systems | 2023 | VLDB | 5.1142298e-05 |
| 7,338 | Aero: Adaptive Query Processing of ML Queries | 2025 | SIGMOD | 4.7584583e-05 |
| 7,928 | Accelerating Aggregation Queries on Unstructured Streams of Data | 2023 | VLDB | 4.613455e-05 |
| 8,469 | Semantic Operators and Their Optimization: Enabling LLM-Based Data Processing with Accuracy Guarantees in LOTUS | 2025 | VLDB | 4.5041113e-05 |
| 9,351 | On Efficient Approximate Queries over Machine Learning Models | 2023 | VLDB | 4.3524472e-05 |
| 9,765 | TVM: A Tile-based Video Management Framework | 2024 | VLDB | 4.2856106e-05 |
| 9,770 | Everest: A Top-K Deep Video Analytics System | 2022 | SIGMOD | 4.2856106e-05 |
| 9,786 | RALF: Accuracy-Aware Scheduling for Feature Store Maintenance | 2024 | VLDB | 4.2827012e-05 |
| 9,807 | Demonstration of Accelerating Machine Learning Inference Queries with Correlative Proxy Models | 2022 | VLDB | 4.2805224e-05 |
| 10,064 | Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees | 2026 | SIGMOD | 4.1945683e-05 |
| 10,187 | On Efficient Approximate Aggregate Nearest Neighbor Queries over Learned Representations | 2026 | SIGMOD | 4.1945683e-05 |
| 10,215 | Task Cascades for Efficient Unstructured Data Processing | 2026 | SIGMOD | 4.1945683e-05 |
| 10,382 | MAST: Towards Efficient Analytical Query Processing on Point Cloud Data | 2025 | SIGMOD | 4.1945683e-05 |
| 10,523 | Scalable Complex Event Processing on Video Streams | 2025 | SIGMOD | 4.1945683e-05 |
| 11,426 | Accelerating Queries over Unstructured Data with ML | 2021 | CIDR | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 14 of 14 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,072 | Optimizing Machine Learning Inference Queries with Correlative Proxy Models | 2022 | VLDB | 5.7185674e-05 |
| 8,384 | Consistent and Flexible Selectivity Estimation for High-Dimensional Data | 2021 | SIGMOD | 4.5304673e-05 |
| 8,672 | Optimizing Video Selection LIMIT Queries With Commonsense Knowledge | 2024 | VLDB | 4.4710897e-05 |
| 5,734 | Efficient Algorithms for Crowd-Aided Categorization | 2020 | VLDB | 5.3482904e-05 |
| 4,442 | Approximating Predicates and Expressive Queries on Probabilistic Databases | 2008 | PODS | 6.186154e-05 |
| 11,595 | Minimization of Classifier Construction Cost for Search Queries | 2020 | SIGMOD | 4.1945683e-05 |
| 3,954 | Efficiently Approximating Selectivity Functions using Low Overhead Regression Models | 2020 | VLDB | 6.5926838e-05 |
| 11,426 | Accelerating Queries over Unstructured Data with ML | 2021 | CIDR | 4.1945683e-05 |
| 4,712 | Accelerating Approximate Aggregation Queries with Expensive Predicates | 2021 | VLDB | 5.9787986e-05 |
| 9,351 | On Efficient Approximate Queries over Machine Learning Models | 2023 | VLDB | 4.3524472e-05 |