NeutronTask: Scalable and Efficient Multi-GPU GNN Training with Task Parallelism
Summary: Introduces GNN task parallelism, which partitions each layer's training tasks across GPUs instead of partitioning the graph. This reduces neighbor replication, enables reuse of shared neighbor embeddings within each GPU, and overlaps subgraph computation. A task-decoupled training framework releases intermediate data early to cut memory use. Together these enable billion-scale full-graph multi-GPU training, with 1.27×–5.47× speedups over NeutronStar and Sancus on 4× A5000 GPUs.
(summarized by gpt-5-mini on Feb 09 2026)
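The summary credits task parallelism with reducing neighbor replication. As a rough illustration of the cost being avoided, and not of the paper's own algorithm, the sketch below counts how many vertex copies a vertex-partitioned graph holds once each GPU also stores a replica of every remote in-neighbor. The function name `replication_factor`, the toy ring graph, and the partition maps are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def replication_factor(edges, part_of):
    """Average number of stored copies per vertex under vertex partitioning.

    edges:   iterable of (src, dst) pairs; dst aggregates from src.
    part_of: dict mapping each vertex to its partition (GPU) id.
    """
    held = defaultdict(set)
    # Each partition stores the vertices it owns ...
    for v, p in part_of.items():
        held[p].add(v)
    # ... plus a replica of every in-neighbor its vertices aggregate from.
    for src, dst in edges:
        held[part_of[dst]].add(src)
    copies = sum(len(vs) for vs in held.values())
    return copies / len(part_of)


# Toy graph (an assumption for illustration): a 6-vertex ring, both directions.
ring = [(i, (i + 1) % 6) for i in range(6)]
edges = ring + [(d, s) for s, d in ring]

one_gpu = {v: 0 for v in range(6)}
two_gpus = {v: v // 3 for v in range(6)}  # {0,1,2} on GPU 0, {3,4,5} on GPU 1

print(replication_factor(edges, one_gpu))   # 1.0: no replicas on a single GPU
print(replication_factor(edges, two_gpus))  # 10 copies over 6 vertices, about 1.67
```

Even on this tiny graph, splitting by vertex makes each GPU hold two extra boundary neighbors. Real graphs with high-degree vertices typically replicate far more, which is the overhead the summary says task parallelism is meant to reduce.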
- Paper ID: 13830
- Venue: VLDB
- Year: 2025
- Pagerank: 4.1945683e-05
- Overall Rank: 10,570 | 26.47%
- DOI: 10.14778/3725688.3725700
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Outgoing Citations (Sorted by Pagerank)
Showing 16 of 16 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 278 | AliGraph: A Comprehensive Graph Neural Network Platform | 2019 | VLDB | 0.00029230623 |
| 1,160 | Sancus: Staleness-Aware Communication-Avoiding Full-Graph Decentralized Training in Large-Scale Graph Neural Networks | 2022 | VLDB | 0.00013586221 |
| 2,400 | ByteGNN: Efficient Graph Neural Network Training at Large Scale | 2022 | VLDB | 8.8955105e-05 |
| 2,422 | DUCATI: A Dual-Cache Training System for Graph Neural Networks on Giant Graphs with the GPU | 2023 | SIGMOD | 8.8499665e-05 |
| 3,025 | NeutronStar: Distributed GNN Training with Hybrid Dependency Management | 2022 | SIGMOD | 7.6906935e-05 |
| 3,087 | Scalable and Efficient Full-Graph GNN Training for Large Graphs | 2023 | SIGMOD | 7.5939896e-05 |
| 4,355 | LargeEA: Aligning Entities for Large-scale Knowledge Graphs | 2022 | VLDB | 6.259483e-05 |
| 5,136 | NeutronOrch: Rethinking Sample-based GNN Training under CPU-GPU Heterogeneous Environments | 2024 | VLDB | 5.6723526e-05 |
| 5,443 | Decoupled Graph Neural Networks for Large Dynamic Graphs | 2023 | VLDB | 5.5025808e-05 |
| 6,884 | Lotan: Bridging the Gap between GNNs and Scalable Graph Analytics Engines | 2023 | VLDB | 4.8955332e-05 |
| 6,942 | Efficient Training of Graph Neural Networks on Large Graphs | 2024 | VLDB | 4.8922884e-05 |
| 6,980 | OUTRE: An OUT-of-core De-REdundancy GNN Training Framework for Massive Graphs within A Single Machine | 2024 | VLDB | 4.8744298e-05 |
| 7,091 | HongTu: Scalable Full-Graph GNN Training on Multiple GPUs | 2023 | SIGMOD | 4.8370645e-05 |
| 7,289 | DAHA: Accelerating GNN Training with Data and Hardware Aware Execution Planning | 2024 | VLDB | 4.7747168e-05 |
| 7,545 | XGNN: Boosting Multi-GPU GNN Training via Global GNN Memory Store | 2024 | VLDB | 4.714889e-05 |
| 9,395 | NeutronTP: Load-Balanced Distributed Full-Graph GNN Training with Tensor Parallelism | 2025 | VLDB | 4.3441378e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,091 | HongTu: Scalable Full-Graph GNN Training on Multiple GPUs | 2023 | SIGMOD | 4.8370645e-05 |
| 5,737 | Comprehensive Evaluation of GNN Training Systems: A Data Management Perspective | 2024 | VLDB | 5.3480667e-05 |
| 2,400 | ByteGNN: Efficient Graph Neural Network Training at Large Scale | 2022 | VLDB | 8.8955105e-05 |
| 5,345 | NeutronStream: A Dynamic GNN Training Framework with Sliding Window for Graph Streams | 2024 | VLDB | 5.5567697e-05 |
| 3,087 | Scalable and Efficient Full-Graph GNN Training for Large Graphs | 2023 | SIGMOD | 7.5939896e-05 |
| 10,027 | NeutronHeter: Optimizing Distributed Graph Neural Network Training for Heterogeneous Clusters | 2026 | SIGMOD | 4.1945683e-05 |
| 3,025 | NeutronStar: Distributed GNN Training with Hybrid Dependency Management | 2022 | SIGMOD | 7.6906935e-05 |
| 10,298 | NeutronCloud: Resource-Aware Distributed GNN Training in Fluctuating Cloud Environments | 2026 | VLDB | 4.1945683e-05 |
| 5,136 | NeutronOrch: Rethinking Sample-based GNN Training under CPU-GPU Heterogeneous Environments | 2024 | VLDB | 5.6723526e-05 |
| 9,395 | NeutronTP: Load-Balanced Distributed Full-Graph GNN Training with Tensor Parallelism | 2025 | VLDB | 4.3441378e-05 |