PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce
Summary: PLANET uses MapReduce to train tree ensembles on massive datasets. It frames tree learning as a series of distributed MapReduce steps, which allows classification and regression trees, and ensembles of them, to be built scalably on commodity clusters. The approach is demonstrated on computational advertising.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 9905
- Venue: VLDB
- Year: 2009
- Pagerank: 8.4128091e-05
- Overall Rank: 2,630 | 81.71%
- DOI:
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 10 of 10 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 1,158 | Simulation of Database-Valued Markov Chains Using SimSQL | 2013 | SIGMOD | 0.0001361064 |
| 1,895 | VF2Boost: Very Fast Vertical Federated Gradient Boosting for Cross-Enterprise Learning | 2021 | SIGMOD | 0.00010180896 |
| 1,940 | SliceLine: Fast, Linear-Algebra-based Slice Finding for ML Model Debugging | 2021 | SIGMOD | 0.00010020173 |
| 2,337 | Efficient Processing of Data Warehousing Queries in a Split Execution Environment | 2011 | SIGMOD | 9.0098186e-05 |
| 2,642 | Vertica-ML: Distributed Machine Learning in Vertica Database | 2020 | SIGMOD | 8.3851878e-05 |
| 2,674 | Minimal MapReduce Algorithms | 2013 | SIGMOD | 8.3328645e-05 |
| 4,402 | Smurf: Self-Service String Matching Using Random Forests | 2019 | VLDB | 6.2195162e-05 |
| 7,294 | Optimization for iterative queries on MapReduce | 2014 | VLDB | 4.773119e-05 |
| 8,048 | Lowering the Latency of Data Processing Pipelines Through FPGA based Hardware Acceleration | 2020 | VLDB | 4.5977431e-05 |
| 11,251 | Fast Search-By-Classification for Large-Scale Databases Using Index-Aware Decision Trees and Random Forests | 2023 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 4 of 4 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 11,447 | Grouped Learning: Group-By Model Selection Workloads | 2021 | SIGMOD | 4.1945683e-05 |
| 4,557 | Distributed Deep Learning on Data Systems: A Comparative Analysis of Approaches | 2021 | VLDB | 6.087611e-05 |
| 3,087 | Scalable and Efficient Full-Graph GNN Training for Large Graphs | 2023 | SIGMOD | 7.5939896e-05 |
| 8,967 | Planting Trees for scalable and efficient Canonical Hub Labeling | 2020 | VLDB | 4.4190656e-05 |
| 13,343 | M3: Scaling Up Machine Learning via Memory Mapping | 2016 | SIGMOD | - |
| 12,729 | Parallel Mining Algorithms for Generalized Association Rules with Classification Hierarchy | 1998 | SIGMOD | 4.1945683e-05 |
| 1,402 | Hybrid Parallelization Strategies for Large-Scale Machine Learning in SystemML | 2014 | VLDB | 0.00012180605 |
| 9,222 | Towards an Optimized GROUP BY Abstraction for Large-Scale Machine Learning | 2021 | VLDB | 4.3698672e-05 |
| 7,790 | Mining Tree-Structured Data on Multicore Systems | 2009 | VLDB | 4.650649e-05 |
| 1,455 | RainForest - A Framework for Fast Decision Tree Construction of Large Datasets | 1998 | VLDB | 0.00011899821 |