TensorSocket: Shared Data Loading for Deep Learning Training
Summary: TensorSocket enables collocated DL training processes to share a single data loader, removing redundant CPU preprocessing and data copies while leveraging GPU–GPU interconnects to serve batches directly. It is pipeline- and hardware-agnostic, supports heterogeneous models and batch sizes, yields up to 2× throughput and ~50% cloud CPU cost savings, and outperforms CoorDL and Joader.
(summarized by gpt-5-mini on Feb 11 2026)
- Paper ID: 7342
- Venue: SIGMOD
- Year: 2026
- Pagerank: 4.456315e-05
- Overall Rank: 8,735 | 39.24%
- DOI: 10.1145/3749185
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 13 of 13 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 515 | QPipe: A Simultaneously Pipelined Relational Query Engine | 2005 | SIGMOD | 0.00021214633 |
| 683 | Cerebro: A Data System for Optimized Deep Learning Model Selection | 2020 | VLDB | 0.00018195476 |
| 940 | SharedDB: Killing One Thousand Queries With One Stone | 2012 | VLDB | 0.00015173166 |
| 1,504 | Analyzing and Mitigating Data Stalls in DNN Training | 2021 | VLDB | 0.00011642333 |
| 2,170 | tf.data: A Machine Learning Data Processing Framework | 2021 | VLDB | 9.3821603e-05 |
| 3,363 | CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers | 2019 | VLDB | 7.1731921e-05 |
| 3,698 | Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines | 2022 | SIGMOD | 6.8340435e-05 |
| 4,180 | FastFlow: Accelerating Deep Learning Model Training with Smart Offloading of Input Data Pipeline | 2023 | VLDB | 6.3793352e-05 |
| 4,680 | To Share or Not to Share? | 2007 | VLDB | 6.0039406e-05 |
| 4,959 | Sharing Data and Work Across Concurrent Analytical Queries | 2013 | VLDB | 5.8029448e-05 |
| 6,519 | Expand your Training Limits! Generating Training Data for ML-based Data Management | 2021 | SIGMOD | 5.0316686e-05 |
| 8,265 | ADDICT: Advanced Instruction Chasing for Transactions | 2014 | VLDB | 4.5461133e-05 |
| 8,348 | FusionFlow: Accelerating Data Preprocessing for Machine Learning with CPU-GPU Cooperation | 2024 | VLDB | 4.5410024e-05 |
Semantically Similar Papers