DimBoost: Boosting Gradient Boosting Decision Tree to Higher Dimensions
Summary: DimBoost is a scalable GBDT trainer for ultra-high-dimensional data (330K features). Its performance model reveals collective-communication bottlenecks in existing systems. Key innovations are a scheduler, two-phase split finding, sparsity-aware histograms with parallel indexing, and low-precision gradients, which together yield 2–9x speedups over existing systems. (summarized by gpt-5-nano on Feb 09 2026)
[Chart: Incoming Non-self Citations Over Time]
Authors
1. Jiawei Jiang
2. Bin Cui
3. Ce Zhang
4. Fangcheng Fu
Incoming Citations (Sorted by Pagerank)
Showing 8 of 8 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,895 | VF2Boost: Very Fast Vertical Federated Gradient Boosting for Cross-Enterprise Learning | 2021 | SIGMOD | 1.0180896e-04 |
| 2,791 | Towards Demystifying Serverless Machine Learning Training | 2021 | SIGMOD | 8.1206618e-05 |
| 3,506 | BlindFL: Vertical Federated Machine Learning without Peeking into Your Data | 2022 | SIGMOD | 7.0291192e-05 |
| 4,964 | PS2: Parameter Server on Spark | 2019 | SIGMOD | 5.7965988e-05 |
| 4,975 | An Experimental Evaluation of Large Scale GBDT Systems | 2019 | VLDB | 5.79026e-05 |
| 5,720 | BAGUA: Scaling up Distributed Learning with System Relaxations | 2022 | VLDB | 5.3527734e-05 |
| 5,806 | BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees | 2019 | SIGMOD | 5.3200643e-05 |
| 6,566 | Reliable Data Distillation on Graph Convolutional Network | 2020 | SIGMOD | 5.0074274e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 10 of 10 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 126 | Space-Efficient Online Computation of Quantile Summaries | 2001 | SIGMOD | 4.4744986e-04 |
| 209 | Schism: a Workload-Driven Approach to Database Replication and Partitioning | 2010 | VLDB | 3.4468292e-04 |
| 285 | Automating Physical Database Design in a Parallel Database | 2002 | SIGMOD | 2.899128e-04 |
| 286 | Integrating Vertical and Horizontal Partitioning into Automated Physical Database Design | 2004 | SIGMOD | 2.8990057e-04 |
| 834 | Learning Linear Regression Models over Factorized Joins | 2016 | SIGMOD | 1.6135159e-04 |
| 850 | Scaling Factorization Machines to Relational Data | 2013 | VLDB | 1.5955971e-04 |
| 1,266 | Hybrid-Range Partitioning Strategy: A New Declustering Strategy for Multiprocessor Database Machines | 1990 | VLDB | 1.2946573e-04 |
| 1,942 | Heterogeneity-aware Distributed Parameter Servers | 2017 | SIGMOD | 1.0012691e-04 |
| 3,958 | MLog: Towards Declarative In-Database Machine Learning | 2017 | VLDB | 6.5897636e-05 |
| 11,795 | LDA*: A Robust and Large-scale Topic Modeling System | 2017 | VLDB | 4.1945683e-05 |