Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines
Summary: Analyzes data preprocessing pipelines across four domains, exposing bottlenecks and throughput–storage trade-offs. Presents an open-source profiler that auto-tunes preprocessing, delivering 3x–13x throughput gains while keeping the pipelines functionally equivalent. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Alexander Isenko
2. Ruben Mayer
3. Jeffrey Jedele
4. Hans-Arno Jacobsen
Incoming Citations (Sorted by Pagerank)
Showing 9 of 9 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 3 of 3 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,504 | Analyzing and Mitigating Data Stalls in DNN Training | 2021 | VLDB | 1.1642333e-04 |
| 2,170 | tf.data: A Machine Learning Data Processing Framework | 2021 | VLDB | 9.3821603e-05 |
| 3,293 | Jointly Optimizing Preprocessing and Inference for DNN-based Visual Analytics | 2021 | VLDB | 7.2629834e-05 |