Pipemizer: An Optimizer for Analytics Data Pipelines
Summary: Pipemizer is an optimizer and recommender for analytics data pipelines. It introduces pipeline-aware statistics, inter-job operator push-up, and split/merge optimizations to improve cross-job performance. It is demonstrated on large-scale SCOPE workloads with 650k daily jobs and 70% inter-job dependencies, and it enables automated recommendations. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Sunny Gakhar
2. Joyce Cahoon
3. Wangchao Le
4. Xiangnan Li
5. Kaushik Ravichandran
6. Hiren Patel
7. Marc Friedman
8. Brandon Haynes
9. Shi Qiao
10. Alekh Jindal
11. Jyoti Leeka
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,416 | Towards Building Autonomous Data Services on Azure | 2023 | SIGMOD | 4.5196199e-05 |
| 8,582 | Towards Query Optimizer as a Service (QOaaS) in a Unified LakeHouse Ecosystem: Can One QO Rule Them All? | 2025 | CIDR | 4.492033e-05 |
| 10,931 | Proactive Resume and Pause of Resources for Microsoft Azure SQL Database Serverless | 2024 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 179 | Efficient and Extensible Algorithms for Multi Query Optimization | 2000 | SIGMOD | 0.00037672155 |
| 516 | AutoAdmin "What-if" Index Analysis Utility | 1998 | SIGMOD | 0.00021196031 |
| 977 | Pipelining in Multi-Query Optimization | 2001 | PODS | 0.0001488881 |
| 2,456 | Production Machine Learning Pipelines: Empirical Analysis and Optimization Opportunities | 2021 | SIGMOD | 8.7733773e-05 |
| 2,545 | POLARIS: The Distributed SQL Engine in Azure Synapse | 2020 | VLDB | 8.5725413e-05 |
| 2,828 | Automatic Physical Design Tuning: Workload as a Sequence | 2006 | SIGMOD | 8.0548516e-05 |
| 3,703 | Multi-Query Optimization in MapReduce Framework | 2014 | VLDB | 6.8289978e-05 |
| 6,261 | The Cosmos Big Data Platform at Microsoft: Over a Decade of Progress and a Decade to Look Forward | 2021 | VLDB | 5.1350714e-05 |
| 9,194 | Phoebe: A Learning-based Checkpoint Optimizer | 2021 | VLDB | 4.3761777e-05 |