Accelerating Aggregation Queries on Unstructured Streams of Data
Summary: InQuest is a system for streaming, multimodal aggregation over unstructured data. It combines cheap proxy models with sampling to limit expensive oracle invocations, and it produces real-time approximate query answers with statistical guarantees. In theory, its expected error on stationary streams decays in proportion to 1/(oracle budget). In the evaluation, it matches streaming baselines with up to 5× fewer oracle calls and improves RMSE over a state-of-the-art batch method. (summarized by gpt-5-mini on Feb 09 2026)
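To make the idea of pairing a cheap proxy with a limited oracle budget concrete, here is a minimal sketch of plain proxy-stratified sampling over one buffered segment of a stream. It is not the InQuest algorithm from the paper. The function name `stratified_estimate`, its parameters, the equal split of the budget across strata, and the toy proxy and oracle are all assumptions made for illustration.

```python
import random


def stratified_estimate(records, proxy, oracle, budget, n_strata=4, seed=0):
    """Estimate the mean of oracle(x) over `records` with at most `budget` oracle calls.

    Hypothetical helper: records are ranked by the cheap proxy score, cut into
    roughly equal-size strata, and the oracle is called only on a random sample
    drawn from each stratum.
    """
    n = len(records)
    if n < n_strata or budget < n_strata:
        raise ValueError("need at least one record and one oracle call per stratum")

    rng = random.Random(seed)
    ranked = sorted(records, key=proxy)  # cheap: proxy runs on every record

    # Cut the ranked list into n_strata contiguous slices.
    bounds = [round(i * n / n_strata) for i in range(n_strata + 1)]
    strata = [ranked[bounds[i]:bounds[i + 1]] for i in range(n_strata)]

    per_stratum = budget // n_strata  # assumption: equal allocation per stratum
    estimate, calls = 0.0, 0
    for stratum in strata:
        k = min(per_stratum, len(stratum))
        sample = rng.sample(stratum, k)          # sample without replacement
        values = [oracle(x) for x in sample]     # expensive: only k oracle calls
        calls += k
        # Weight each stratum's sample mean by its share of the segment.
        estimate += (len(stratum) / n) * (sum(values) / k)
    return estimate, calls


# Toy usage: the proxy is a noiseless score and the oracle is a 0/1 predicate.
segment = list(range(100))
est, calls = stratified_estimate(
    segment,
    proxy=lambda x: x / 100,
    oracle=lambda x: 1.0 if x >= 50 else 0.0,
    budget=20,
)
print(est, calls)
```

The better the proxy separates records by their oracle value, the lower the variance inside each stratum, so fewer oracle calls are needed for a given error. A streaming system would additionally have to choose strata and allocations without seeing the whole segment in advance, which this sketch does not attempt.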
Incoming Non-self Citations Over Time
Authors
1. Matthew Russo
2. Tatsunori Hashimoto
3. Daniel Kang
4. Yi Sun
5. Matei Zaharia
Incoming Citations (Sorted by Pagerank)
Showing 9 of 9 citing papers.
| Overall Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,171 | Abacus: A Cost-Based Optimizer for Semantic Operator Systems | 2026 | VLDB | 5.6464993e-05 |
| 9,990 | Deep Research is the New Analytics System: Towards Building the Runtime for AI-Driven Analytics | 2026 | CIDR | 4.1945683e-05 |
| 10,064 | Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees | 2026 | SIGMOD | 4.1945683e-05 |
| 10,215 | Task Cascades for Efficient Unstructured Data Processing | 2026 | SIGMOD | 4.1945683e-05 |
| 10,337 | Efficient Approximate Query Processing with Block Sampling | 2025 | CIDR | 4.1945683e-05 |
| 10,382 | MAST: Towards Efficient Analytical Query Processing on Point Cloud Data | 2025 | SIGMOD | 4.1945683e-05 |
| 10,497 | PilotDB: Database-Agnostic Online Approximate Query Processing with A Priori Error Guarantees | 2025 | SIGMOD | 4.1945683e-05 |
| 10,523 | Scalable Complex Event Processing on Video Streams | 2025 | SIGMOD | 4.1945683e-05 |
| 10,667 | Déjà Vu: Efficient Video-Language Query Engine with Learning-based Inter-Frame Computation Reuse | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 13 of 13 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,841 | Query Mesh: Multi-Route Query Processing Technology | 2009 | VLDB | 4.6370195e-05 |
| 5,072 | Optimizing Machine Learning Inference Queries with Correlative Proxy Models | 2022 | VLDB | 5.7185674e-05 |
| 6,230 | Learned Approximate Query Processing: Make it Light, Accurate and Fast | 2021 | CIDR | 5.145989e-05 |
| 1,064 | Processing Complex Aggregate Queries over Data Streams | 2002 | SIGMOD | 0.00014356481 |
| 10,752 | QUEST: Query Optimization in Unstructured Document Analysis | 2025 | VLDB | 4.1945683e-05 |
| 9,162 | Estimating Quantiles from the Union of Historical and Streaming Data | 2017 | VLDB | 4.3849295e-05 |
| 11,650 | Query-Driven Learning for Next Generation Predictive Modeling & Analytics | 2019 | SIGMOD | 4.1945683e-05 |
| 4,712 | Accelerating Approximate Aggregation Queries with Expensive Predicates | 2021 | VLDB | 5.9787986e-05 |
| 9,351 | On Efficient Approximate Queries over Machine Learning Models | 2023 | VLDB | 4.3524472e-05 |
| 11,426 | Accelerating Queries over Unstructured Data with ML | 2021 | CIDR | 4.1945683e-05 |