Database Paper Browser

PyTorch Distributed: Experiences on Accelerating Data Parallel Training

Summary: Describes the design, implementation, and evaluation of PyTorch Distributed Data Parallel for scalable data-parallel training on GPUs. Its practical optimizations (gradient bucketing, overlapping computation with communication, and skipping gradient synchronization) achieve near-linear scaling to 256 GPUs. (summarized by gpt-5-nano on Feb 09 2026)
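
As a rough illustration of the gradient bucketing idea named in the summary, the sketch below groups parameters into size-capped buckets so that each bucket could be reduced with one collective call rather than one call per parameter. It is a minimal pure-Python sketch, not the paper's implementation: the function name `assign_buckets`, the byte sizes, and the cap are hypothetical, and the reverse visiting order assumes that gradients of later layers become ready first during the backward pass.

```python
from typing import List, Sequence


def assign_buckets(param_bytes: Sequence[int], bucket_cap_bytes: int) -> List[List[int]]:
    """Group parameter indices into buckets of at most bucket_cap_bytes.

    Parameters are visited in reverse order (assumption: gradients of the
    last layers are ready first). A single parameter larger than the cap
    gets a bucket of its own.
    """
    buckets: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for index in reversed(range(len(param_bytes))):
        size = param_bytes[index]
        # Close the current bucket if adding this parameter would exceed the cap.
        if current and current_bytes + size > bucket_cap_bytes:
            buckets.append(current)
            current, current_bytes = [], 0
        current.append(index)
        current_bytes += size
    if current:
        buckets.append(current)
    return buckets


# Hypothetical per-parameter gradient sizes in bytes, listed in forward order.
sizes = [400, 100, 300, 200, 250]
print(assign_buckets(sizes, bucket_cap_bytes=500))  # [[4, 3], [2, 1], [0]]
```

In a real training loop, each bucket's reduction could start as soon as all of its gradients are ready, which is what lets communication overlap with the rest of the backward pass.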

Paper ID: 12186
Venue: VLDB
Year: 2020
Pagerank: 0.00023906921
Overall Rank: 411 | 97.15%
DOI: 10.14778/3415478.3415530

Incoming Citations (Sorted by Pagerank)

Showing 28 of 28 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
2,677 | HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework | 2022 | VLDB | 8.3268401e-05
2,791 | Towards Demystifying Serverless Machine Learning Training | 2021 | SIGMOD | 8.1206618e-05
2,902 | PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel | 2023 | VLDB | 7.93939e-05
3,025 | NeutronStar: Distributed GNN Training with Hybrid Dependency Management | 2022 | SIGMOD | 7.6906935e-05
3,254 | Query Processing on Tensor Computation Runtimes | 2022 | VLDB | 7.3161051e-05
5,052 | HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training | 2022 | SIGMOD | 5.7337977e-05
5,333 | Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce | 2021 | SIGMOD | 5.5656575e-05
5,720 | BAGUA: Scaling up Distributed Learning with System Relaxations | 2022 | VLDB | 5.3527734e-05
5,821 | Tensor Relational Algebra for Distributed Machine Learning System Design | 2021 | VLDB | 5.3134851e-05
6,377 | Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism | 2023 | VLDB | 5.0911095e-05
7,152 | Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity | 2024 | VLDB | 4.8154191e-05
8,126 | SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training | 2023 | VLDB | 4.5796615e-05
8,520 | mLoRA: Fine-Tuning LoRA Adapters via Highly-Efficient Pipeline Parallelism in Multiple GPUs | 2025 | VLDB | 4.4937074e-05
8,607 | Harmony: Overcoming the Hurdles of GPU Memory Capacity to Train Massive DNN Models on Commodity Servers | 2022 | VLDB | 4.4855009e-05
8,712 | ANN Softmax: Acceleration of Extreme Classification Training | 2022 | VLDB | 4.4626362e-05
8,808 | FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement | 2023 | SIGMOD | 4.4454035e-05
8,864 | Cerebro: A Layered Data Platform for Scalable Deep Learning | 2021 | CIDR | 4.4326439e-05
9,222 | Towards an Optimized GROUP BY Abstraction for Large-Scale Machine Learning | 2021 | VLDB | 4.3698672e-05
9,319 | How Can We Train Deep Learning Models Across Clouds and Continents? An Experimental Study | 2024 | VLDB | 4.3556432e-05
9,326 | BladeDISC: Optimizing Dynamic Shape Machine Learning Workloads via Compiler Approach | 2023 | SIGMOD | 4.3556432e-05
9,603 | Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads | 2024 | VLDB | 4.3177432e-05
9,694 | EinDecomp: Decomposition of Declaratively-Specified Machine Learning and Numerical Computations for Parallel Execution | 2025 | VLDB | 4.3025567e-05
10,089 | Hydraulis: Balancing Large Transformer Model Training via Co-designing Parallel Strategies and Data Assignment | 2026 | SIGMOD | 4.1945683e-05
10,580 | GPEmu: A GPU Emulator for Faster and Cheaper Prototyping and Evaluation of Deep Learning System Research | 2025 | VLDB | 4.1945683e-05
10,626 | LobRA: Multi-tenant Fine-tuning over Heterogeneous Data | 2025 | VLDB | 4.1945683e-05
10,638 | Heta: Distributed Training of Heterogeneous Graph Neural Networks | 2025 | VLDB | 4.1945683e-05
10,656 | Effective and Efficient Distributed Temporal Graph Learning through Hotspot Memory Sharing | 2025 | VLDB | 4.1945683e-05
13,122 | DECK: Experiences on Delta Checkpointing for Industrial Recommendation Systems | 2025 | VLDB | -

Outgoing Citations (Sorted by Pagerank)

Showing 0 of 0 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers