Database Paper Browser

Task Cascades for Efficient Unstructured Data Processing

Summary: Task cascades generalize model cascades for LLM-based document processing by varying not only the model but also the queried span of the document and even the operation itself, exploiting simpler correlated sub-tasks and partial evidence. An iterative optimizer, combined with statistical accuracy guarantees, yields 36% lower cost than standard cascades at a 90% target accuracy. (summarized by gpt-5.4-mini on Apr 11 2026)
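
Illustration: the general idea in the summary can be sketched as a list of stages, each pairing a predictor with a portion of the document and a confidence threshold, where a document only reaches the expensive final stage when the cheaper stages are not confident. This is a minimal sketch under stated assumptions, not the paper's algorithm: the Stage fields, the keyword-based predictors standing in for LLM calls, the thresholds, and the per-character costs are all hypothetical, and the optimizer and statistical guarantees mentioned above are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A predictor maps a text span to (label, confidence in [0, 1]).
Predict = Callable[[str], Tuple[bool, float]]


@dataclass
class Stage:
    name: str
    predict: Predict       # hypothetical stand-in for an LLM call
    span_fraction: float   # leading fraction of the document shown to this stage
    threshold: float       # accept the answer when confidence >= threshold
    cost_per_char: float   # illustrative cost, not a real price


def run_cascade(doc: str, stages: List[Stage]) -> Tuple[bool, str, float]:
    """Return (label, name of the stage that answered, total cost spent)."""
    total_cost = 0.0
    for i, stage in enumerate(stages):
        span = doc[: max(1, int(len(doc) * stage.span_fraction))]
        total_cost += stage.cost_per_char * len(span)
        label, confidence = stage.predict(span)
        is_last = i == len(stages) - 1
        # The last stage always answers; earlier stages answer only when confident.
        if is_last or confidence >= stage.threshold:
            return label, stage.name, total_cost
    raise ValueError("cascade needs at least one stage")


def surrogate(span: str) -> Tuple[bool, float]:
    # Simpler correlated operation (assumed): a keyword hit is treated as
    # strong evidence, while a miss is treated as inconclusive.
    return (True, 0.95) if "refund" in span.lower() else (False, 0.5)


def oracle(span: str) -> Tuple[bool, float]:
    # Stand-in for the expensive model running the original task on the full text.
    return ("refund" in span.lower(), 1.0)


stages = [
    Stage("surrogate", surrogate, span_fraction=0.25, threshold=0.9, cost_per_char=0.001),
    Stage("oracle", oracle, span_fraction=1.0, threshold=0.0, cost_per_char=0.01),
]

doc_early = "Customer requests a refund for order 17. " + "filler " * 50
doc_late = "filler " * 50 + "Please issue a refund."

for doc in (doc_early, doc_late):
    label, answered_by, cost = run_cascade(doc, stages)
    print(answered_by, label, round(cost, 3))
```

In this toy setup the first document is resolved by the cheap stage from its opening quarter alone, while the second falls through to the full-document stage, which is where the cost difference comes from.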

Paper ID: 7527
Venue: SIGMOD
Year: 2026
Pagerank: 4.1945683e-05
Overall Rank: 10,215 | 28.94%
DOI: 10.1145/3786702

Incoming Non-self Citations Over Time

No non-self incoming citations found for this paper in this database.

Authors

Incoming Citations (Sorted by Pagerank)

Showing 0 of 0 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 29 of 29 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
139 | Predicate Migration: Optimizing Queries with Expensive Predicates | 1993 | SIGMOD | 0.00042299329
316 | NoScope: Optimizing Neural Network Queries over Video at Scale | 2017 | VLDB | 0.00027988668
329 | Accelerating Machine Learning Inference with Probabilistic Predicates | 2018 | SIGMOD | 0.00027249545
449 | Approximate Query Processing: Taming the TeraBytes! A Tutorial | 2001 | VLDB | 0.00022846068
530 | Random Sampling for Histogram Construction: How much is enough? | 1998 | SIGMOD | 0.00020803682
696 | BlazeIt: Optimizing Declarative Aggregation and Limit Queries for Neural Network-Based Video Analytics | 2020 | VLDB | 0.00018048935
1,082 | CAESURA: Language Models as Multi-Modal Query Planners | 2024 | CIDR | 0.00014214232
1,116 | Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes | 2024 | VLDB | 0.00013890154
1,574 | Approximate Query Processing: No Silver Bullet | 2017 | SIGMOD | 0.00011287495
1,963 | DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing | 2025 | VLDB | 9.929429e-05
2,106 | Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing | 2025 | CIDR | 9.5342543e-05
3,472 | LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency | 2025 | VLDB | 7.0639229e-05
3,558 | Approximate Selection with Guarantees using Proxies | 2020 | VLDB | 6.9765724e-05
3,876 | The Design of an LLM-powered Unstructured Analytics System | 2025 | CIDR | 6.6741456e-05
4,407 | Filtering with Approximate Predicates | 1998 | VLDB | 6.2133426e-05
4,501 | TASTI: Semantic Indexes for Machine Learning-based Queries over Unstructured Data | 2022 | SIGMOD | 6.137686e-05
4,712 | Accelerating Approximate Aggregation Queries with Expensive Predicates | 2021 | VLDB | 5.9787986e-05
5,171 | Abacus: A Cost-Based Optimizer for Semantic Operator Systems | 2026 | VLDB | 5.6464993e-05
5,173 | FiGO: Fine-Grained Query Optimization in Video Analytics | 2022 | SIGMOD | 5.6447253e-05
5,214 | ThalamusDB: Approximate Query Processing on Multi-Modal Data | 2024 | SIGMOD | 5.624434e-05
6,217 | Pneuma: Leveraging LLMs for Tabular Data Representation and Retrieval in an End-to-End System | 2025 | SIGMOD | 5.1534752e-05
7,119 | VectraFlow: Integrating Vectors into Stream Processing | 2025 | CIDR | 4.8262611e-05
7,339 | SpareLLM: Automatically Selecting Task-Specific Minimum-Cost Large Language Models under Equivalence Constraint | 2025 | SIGMOD | 4.7579469e-05
7,705 | AOP: Automated and Interactive LLM Pipeline Orchestration for Answering Complex Queries | 2025 | CIDR | 4.6730494e-05
7,928 | Accelerating Aggregation Queries on Unstructured Streams of Data | 2023 | VLDB | 4.613455e-05
8,204 | ELEET: Efficient Learned Query Execution over Text and Tables | 2024 | VLDB | 4.5594273e-05
8,469 | Semantic Operators and Their Optimization: Enabling LLM-Based Data Processing with Accuracy Guarantees in LOTUS | 2025 | VLDB | 4.5041113e-05
8,969 | A Learned Query Rewrite System | 2023 | VLDB | 4.4189226e-05
9,235 | ThriftLLM: On Cost-Effective Selection of Large Language Models for Classification Queries | 2025 | VLDB | 4.3690661e-05

Semantically Similar Papers