Evaluating End-to-End Optimization for Data Analytics Applications in Weld
Summary: Proposes Weld, a common runtime for data analytics libraries that enables cross-library optimizations and pipelining under imperative APIs. An automatic adaptive optimizer uses lightweight measurements to make data-dependent runtime decisions with low overhead, delivering speedups of up to 23x on a single thread and 80x on eight threads, plus 3.75x gains over rule-based optimization with incremental porting of 4–5 operators.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 11595
- Venue: VLDB
- Year: 2018
- Pagerank: 7.9452051e-05
- Overall Rank: 2,896 | 79.86%
- DOI: 10.14778/3213880.3213890
Incoming Citations (Sorted by Pagerank)
Showing 25 of 25 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 1,284 | Amazon Redshift Re-invented | 2022 | SIGMOD | 0.00012837822 |
| 1,350 | Northstar: An Interactive Data Science System | 2018 | VLDB | 0.00012431059 |
| 1,882 | Tuplex: Data Science in Python at Native Code Speed | 2021 | SIGMOD | 0.0001021625 |
| 2,350 | An Intermediate Representation for Optimizing Machine Learning Pipelines | 2019 | VLDB | 8.9788641e-05 |
| 2,473 | Photon: A Fast Query Engine for Lakehouse Systems | 2022 | SIGMOD | 8.7237281e-05 |
| 3,331 | A Demonstration of Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference | 2020 | VLDB | 7.2131599e-05 |
| 3,407 | End-to-end Optimization of Machine Learning Prediction Queries | 2022 | SIGMOD | 7.1295646e-05 |
| 3,606 | EVA: A Symbolic Approach to Accelerating Exploratory Video Analytics with Materialized Views | 2022 | SIGMOD | 6.9260354e-05 |
| 4,774 | LIMA: Fine-grained Lineage Tracing and Reuse in Machine Learning Systems | 2021 | SIGMOD | 5.9316087e-05 |
| 4,924 | User-Defined Operators: Efficiently Integrating Custom Algorithms into Modern Databases | 2022 | VLDB | 5.822682e-05 |
| 5,723 | Evolution of a Compiling Query Engine | 2021 | VLDB | 5.3522361e-05 |
| 5,731 | Babelfish: Efficient Execution of Polyglot Queries | 2022 | VLDB | 5.3502065e-05 |
| 6,863 | Declarative Sub-Operators for Universal Data Processing | 2023 | VLDB | 4.905092e-05 |
| 7,311 | The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development | 2020 | SIGMOD | 4.7656884e-05 |
| 8,080 | Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines | 2024 | VLDB | 4.5911668e-05 |
| 8,094 | Modularis: Modular Relational Analytics over Heterogeneous Distributed Platforms | 2021 | VLDB | 4.5867812e-05 |
| 8,257 | Automating and Optimizing Data-Centric What-If Analyses on Native Machine Learning Pipelines | 2023 | SIGMOD | 4.5487511e-05 |
| 8,583 | Efficient Execution of User-Defined Functions in SQL Queries | 2023 | VLDB | 4.4919445e-05 |
| 8,595 | Towards A Polyglot Framework for Factorized ML | 2021 | VLDB | 4.4889397e-05 |
| 9,326 | BladeDISC: Optimizing Dynamic Shape Machine Learning Workloads via Compiler Approach | 2023 | SIGMOD | 4.3556432e-05 |
| 9,763 | The UDFBench Benchmark for General-purpose UDF Queries | 2025 | VLDB | 4.2856106e-05 |
| 10,177 | InferF: Declarative Factorization of AI/ML Inferences over Joins | 2026 | SIGMOD | 4.1945683e-05 |
| 10,471 | Approximating Opaque Top-k Queries | 2025 | SIGMOD | 4.1945683e-05 |
| 10,714 | Towards Designing Future-Proof Data Processing Systems | 2025 | VLDB | 4.1945683e-05 |
| 10,969 | Query Compilation Without Regrets | 2024 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 13 of 13 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 4,259 | Optimizing I/O for Big Array Analytics | 2012 | VLDB | 6.3147285e-05 |
| 8,617 | A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning | 2024 | VLDB | 4.4846425e-05 |
| 658 | Towards a Unified Architecture for in-RDBMS Analytics | 2012 | SIGMOD | 0.00018506577 |
| 3,918 | On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML | 2018 | VLDB | 6.6315176e-05 |
| 2,172 | Spinning Fast Iterative Data Flows | 2012 | VLDB | 9.3706587e-05 |
| 1,882 | Tuplex: Data Science in Python at Native Code Speed | 2021 | SIGMOD | 0.0001021625 |
| 6,189 | Accelerating Python UDFs in Vectorized Query Execution | 2022 | CIDR | 5.1647573e-05 |
| 10,897 | Welding Natural Language Queries to Analytics IRs with LLMs | 2024 | CIDR | 4.1945683e-05 |
| 1,873 | An Architecture for Compiling UDF-centric Workflows | 2015 | VLDB | 0.00010253002 |
| 1,750 | Weld: A Common Runtime for High Performance Data Analytics | 2017 | CIDR | 0.00010683647 |