Predicate Pushdown for Data Science Pipelines
Summary: MagicPush takes a search-verification approach to predicate pushdown in data science pipelines: it searches for predicates over the pipeline inputs and proves that pushing them down preserves the outputs, even when the pipeline contains non-relational operators and UDFs. Evaluations on TPC-H and 200 real-world GitHub Notebook pipelines show that it outperforms a strong rule-based baseline: it discovers new pushdown opportunities, reduces running time by up to 99% in 42 pipelines, and matches the baseline's pushdown opportunities in the remaining pipelines. (summarized by gpt-5-nano on Feb 09 2026)
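To make the idea of checking a pushed-down predicate concrete, here is a minimal Python sketch. It is not the MagicPush implementation: the pipeline, the predicates, and the helper `pushdown_preserves_output` are all hypothetical names invented for illustration, and the check only compares outputs on sample data, whereas the summary says MagicPush proves that pushdown preserves outputs.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


def pipeline(rows: List[Row]) -> List[Row]:
    """Hypothetical pipeline: a UDF-style step that derives a new column."""
    return [{**r, "total": r["price"] * r["qty"]} for r in rows]


def output_filter(row: Row) -> bool:
    """Predicate applied to the pipeline output."""
    return row["total"] > 100 and row["qty"] >= 2


def candidate_input_predicate(row: Row) -> bool:
    """Candidate predicate that only refers to input columns."""
    return row["qty"] >= 2


def pushdown_preserves_output(
    rows: List[Row],
    pipe: Callable[[List[Row]], List[Row]],
    out_pred: Callable[[Row], bool],
    in_pred: Callable[[Row], bool],
) -> bool:
    """Compare the filtered output with and without the early input filter.

    This is a test on the given sample only, not a proof of equivalence.
    """
    original = [r for r in pipe(rows) if out_pred(r)]
    pushed = [r for r in pipe([r for r in rows if in_pred(r)]) if out_pred(r)]
    return original == pushed


sample = [
    {"price": 60.0, "qty": 2},
    {"price": 500.0, "qty": 1},
    {"price": 10.0, "qty": 3},
]

ok = pushdown_preserves_output(
    sample, pipeline, output_filter, candidate_input_predicate
)
```

On this sample, `qty >= 2` can be applied to the input without changing the final result, while a stricter candidate such as `qty >= 3` would drop a row that belongs in the output and fail the check. A sample-based check like this can only reject bad candidates; it cannot establish that a candidate is safe for every input.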
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,763 | The UDFBench Benchmark for General-purpose UDF Queries | 2025 | VLDB | 4.2856106e-05 |
| 10,152 | Data-Semantics-Aware Recommendation of Diverse Pivot Tables | 2026 | SIGMOD | 4.1945683e-05 |
| 10,404 | Dynamic Pruning for Recursive Joins | 2025 | SIGMOD | 4.1945683e-05 |
| 10,854 | LiquidCache: Efficient Pushdown Caching for Cloud-Native Data Analytics | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 27 of 27 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,590 | Interactive Demonstration of Probabilistic Predicates | 2018 | SIGMOD | 5.0010949e-05 |
| 3,407 | End-to-end Optimization of Machine Learning Prediction Queries | 2022 | SIGMOD | 7.1295646e-05 |
| 7,427 | Selection Pushdown in Column Stores using Bit Manipulation Instructions | 2023 | SIGMOD | 4.7327406e-05 |
| 5,072 | Optimizing Machine Learning Inference Queries with Correlative Proxy Models | 2022 | VLDB | 5.7185674e-05 |
| 5,718 | Conjunctive Queries with Comparisons | 2022 | SIGMOD | 5.3552123e-05 |
| 3,922 | Pushing Data-Induced Predicates Through Joins in Big-Data Clusters | 2020 | VLDB | 6.6291079e-05 |
| 1,302 | Query Optimization by Predicate Move-Around | 1994 | VLDB | 0.00012705525 |
| 139 | Predicate Migration: Optimizing Queries with Expensive Predicates | 1993 | SIGMOD | 0.00042299329 |
| 10,950 | PLAQUE: Automated Predicate Learning at Query Time | 2024 | SIGMOD | 4.1945683e-05 |
| 329 | Accelerating Machine Learning Inference with Probabilistic Predicates | 2018 | SIGMOD | 0.00027249545 |