Analyzing and Mitigating Data Stalls in DNN Training
Summary: Data-pipeline stalls often dominate DNN training time for computer-vision and audio models. A large-scale study covering nine models, four datasets, and three tasks characterizes these stalls. DS-Analyzer quantifies them, and CoorDL's three data-loading techniques yield speedups of up to 5x over DALI.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 12593
- Venue: VLDB
- Year: 2021
- Pagerank: 0.00011642333
- Overall Rank: 1,504 | 89.54%
- DOI: 10.14778/3446095.3446100
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 16 of 16 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,170 | tf.data: A Machine Learning Data Processing Framework | 2021 | VLDB | 9.3821603e-05 |
| 2,688 | Accelerating Recommendation System Training by Leveraging Popular Choices | 2022 | VLDB | 8.2991144e-05 |
| 3,698 | Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines | 2022 | SIGMOD | 6.8340435e-05 |
| 4,180 | FastFlow: Accelerating Deep Learning Model Training with Smart Offloading of Input Data Pipeline | 2023 | VLDB | 6.3793352e-05 |
| 5,552 | GoldMiner: Elastic Scaling of Training Data Pre-Processing Pipelines for Deep Learning | 2023 | SIGMOD | 5.4402488e-05 |
| 6,057 | Progressive Compressed Records: Taking a Byte out of Deep Learning Data | 2021 | VLDB | 5.2317752e-05 |
| 7,656 | Nautilus: An Optimized System for Deep Transfer Learning over Evolving Training Datasets | 2022 | SIGMOD | 4.6871575e-05 |
| 8,348 | FusionFlow: Accelerating Data Preprocessing for Machine Learning with CPU-GPU Cooperation | 2024 | VLDB | 4.5410024e-05 |
| 8,735 | TensorSocket: Shared Data Loading for Deep Learning Training | 2026 | SIGMOD | 4.456315e-05 |
| 8,737 | Scheduling Data Processing Pipelines for Incremental Training on MLP-based Recommendation Models | 2025 | SIGMOD | 4.456315e-05 |
| 9,677 | Apt-Serve: Adaptive Request Scheduling on Hybrid Cache for Scalable LLM Inference Serving | 2025 | SIGMOD | 4.3047774e-05 |
| 9,805 | MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training | 2025 | SIGMOD | 4.2805224e-05 |
| 10,183 | Mixtera: A Data Plane for Foundation Model Training | 2026 | SIGMOD | 4.1945683e-05 |
| 10,580 | GPEmu: A GPU Emulator for Faster and Cheaper Prototyping and Evaluation of Deep Learning System Research | 2025 | VLDB | 4.1945683e-05 |
| 10,770 | cedar: Optimized and Unified Machine Learning Input Data Pipelines | 2025 | VLDB | 4.1945683e-05 |
| 10,856 | Analyzing Near-Network Hardware Acceleration with Co-Processing on DPUs | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 1 of 1 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,289 | DAHA: Accelerating GNN Training with Data and Hardware Aware Execution Planning | 2024 | VLDB | 4.7747168e-05 |
| 13,171 | Reimagining Deep Learning Systems Through the Lens of Data Systems | 2024 | VLDB | - |
| 5,552 | GoldMiner: Elastic Scaling of Training Data Pre-Processing Pipelines for Deep Learning | 2023 | SIGMOD | 5.4402488e-05 |
| 7,061 | Serving Deep Learning Models with Deduplication from Relational Databases | 2022 | VLDB | 4.8463881e-05 |
| 5,561 | Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses | 2024 | VLDB | 5.4332062e-05 |
| 5,737 | Comprehensive Evaluation of GNN Training Systems: A Data Management Perspective | 2024 | VLDB | 5.3480667e-05 |
| 4,180 | FastFlow: Accelerating Deep Learning Model Training with Smart Offloading of Input Data Pipeline | 2023 | VLDB | 6.3793352e-05 |
| 8,735 | TensorSocket: Shared Data Loading for Deep Learning Training | 2026 | SIGMOD | 4.456315e-05 |
| 3,293 | Jointly Optimizing Preprocessing and Inference for DNN-based Visual Analytics | 2021 | VLDB | 7.2629834e-05 |
| 3,698 | Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines | 2022 | SIGMOD | 6.8340435e-05 |