Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures
Summary: JellyBean jointly selects AutoML-generated model variants and places them across tiered heterogeneous infrastructure (edge, hubs, edge data centers, cloud) to meet service-level objectives (SLOs) on throughput and accuracy while minimizing serving cost. It yields up to 58% cost reduction on visual question answering (VQA), 36% on vehicle tracking, and up to 5x cost savings versus cloud-only serving. (Summarized by gpt-5-mini on Feb 09 2026.)
Authors
1. Yongji Wu
2. Matthew Lentz
3. Danyang Zhuo
4. Yao Lu
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,567 | Optimizing Video Analytics with Declarative Model Relationships | 2023 | VLDB | 6.080526e-05 |
| 7,289 | DAHA: Accelerating GNN Training with Data and Hardware Aware Execution Planning | 2024 | VLDB | 4.7747168e-05 |
| 7,338 | Aero: Adaptive Query Processing of ML Queries | 2025 | SIGMOD | 4.7584583e-05 |
| 8,080 | Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines | 2024 | VLDB | 4.5911668e-05 |
| 10,325 | KEN: An Execution Engine for Unstructured Database Systems | 2026 | VLDB | 4.1945683e-05 |
| 10,405 | Flux: Unifying Heterogeneous Infrastructure for Alibaba AnalyticDB | 2025 | SIGMOD | 4.1945683e-05 |
| 10,853 | Algorithmic Data Minimization for Machine Learning over Internet-of-Things Data Streams | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 329 | Accelerating Machine Learning Inference with Probabilistic Predicates | 2018 | SIGMOD | 0.00027249545 |
| 454 | An Overview of Query Optimization in Relational Systems | 1998 | PODS | 0.00022734812 |
| 696 | BlazeIt: Optimizing Declarative Aggregation and Limit Queries for Neural Network-Based Video Analytics | 2020 | VLDB | 0.00018048935 |
| 984 | Natural language to SQL: Where are we today? | 2020 | VLDB | 0.00014857465 |
| 5,072 | Optimizing Machine Learning Inference Queries with Correlative Proxy Models | 2022 | VLDB | 5.7185674e-05 |
| 5,185 | Yugong: Geo-Distributed Data and Job Placement at Scale | 2019 | VLDB | 5.6405374e-05 |