Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent
Summary: Angel-PTM is a production pre-training system that combines hierarchical memory (built on a Page abstraction) with a unified scheduler that coordinates computation, data movement, and communication for efficient training of large Transformer models. SSD-backed, lock-free I/O enables much larger models and higher throughput; the system is validated on GPT-3-175B and T5-MoE. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
1. Xiaonan Nie
2. Yi Liu
3. Fangcheng Fu
4. Jinbao Xue
5. Dian Jiao
6. Xupeng Miao
7. Yangyu Tao
8. Bin Cui
Incoming Citations (Sorted by Pagerank)
Showing 5 of 5 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,993 | DLRover-RM: Resource Optimization for Deep Recommendation Models Training in the Cloud | 2024 | VLDB | 5.2415551e-05 |
| 8,126 | SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training | 2023 | VLDB | 4.5796615e-05 |
| 8,808 | FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement | 2023 | SIGMOD | 4.4454035e-05 |
| 10,492 | Malleus: Straggler-Resilient Hybrid Parallel Training of Large-scale Models via Malleable Data and Model Parallelization | 2025 | SIGMOD | 4.1945683e-05 |
| 10,626 | LobRA: Multi-tenant Fine-tuning over Heterogeneous Data | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,895 | VF2Boost: Very Fast Vertical Federated Gradient Boosting for Cross-Enterprise Learning | 2021 | SIGMOD | 0.00010180896 |
| 2,330 | Concurrent Analytical Query Processing with GPUs | 2014 | VLDB | 9.0192228e-05 |
| 2,677 | HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework | 2022 | VLDB | 8.3268401e-05 |
| 3,808 | SketchML: Accelerating Distributed Machine Learning with Data Sketches | 2018 | SIGMOD | 6.7455428e-05 |
| 5,333 | Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce | 2021 | SIGMOD | 5.5656575e-05 |
| 6,377 | Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism | 2023 | VLDB | 5.0911095e-05 |
| 8,808 | FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement | 2023 | SIGMOD | 4.4454035e-05 |
| 9,966 | Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Updates | 2022 | VLDB | 4.2269436e-05 |