Towards Resource Efficiency: Practical Insights into Large-Scale Spark Workloads at ByteDance
Summary: Governance for large-scale Spark: push-based Cloud Shuffle + ESS to cut I/O stalls, and extended configs (milliCores, memoryBurst, spill modes) for fine-grained resource control. Two-stage auto-tuning on millions of jobs yields +22% CPU, +5% memory, 10% less shuffle time. (summarized by gpt-5-mini on Feb 09 2026)
Authors
1. Yixin Wu
2. Xiuqi Huang
3. Zhongjia Wei
4. Hang Cheng
5. Chaohui Xin
6. Zuzhi Chen
7. Binbin Chen
8. Yufei Wu
9. Hao Wang
10. Tieying Zhang
11. Rui Shi
12. Xiaofeng Gao
13. Yuming Liang
14. Pengwei Zhao
15. Guihai Chen
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,414 | Rockhopper: A Robust Optimizer for Spark Configuration Tuning in Production Environment | 2025 | SIGMOD | 4.1945683e-05 |
| 10,777 | Magnus: A Holistic Approach to Data Management for Large-Scale Machine Learning Workloads | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,269 | iBTune: Individualized Buffer Tuning for Large-scale Cloud Databases | 2019 | VLDB | 7.2998062e-05 |
| 3,522 | ResTune: Resource Oriented Tuning Boosted by Meta-Learning for Cloud Databases | 2021 | SIGMOD | 7.0096727e-05 |
| 3,951 | Why You Should Run TPC-DS: A Workload Analysis | 2007 | VLDB | 6.5953162e-05 |
| 5,833 | LOCAT: Low-Overhead Online Configuration Auto-Tuning of Spark SQL Applications | 2022 | SIGMOD | 5.3106182e-05 |
| 5,888 | Magnet: Push-based Shuffle Service for Large-scale Data Processing | 2020 | VLDB | 5.2873617e-05 |
| 6,209 | AutoExecutor: Predictive Parallelism for Spark SQL Queries | 2021 | VLDB | 5.1565972e-05 |
| 6,871 | Towards General and Efficient Online Tuning for Spark | 2023 | VLDB | 4.8997004e-05 |