RainForest - A Framework for Fast Decision Tree Construction of Large Datasets
Summary: RainForest provides a unifying framework for fast decision-tree construction on large datasets. It separates scalability concerns from the criteria that determine tree quality, so the framework can be instantiated with C4.5, CART, ID3, SLIQ, SPRINT, or QUEST. The resulting classifiers scale well, with a >5× speedup over SPRINT, at a memory cost proportional to the number of distinct values in each column.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 8507
- Venue: VLDB
- Year: 1998
- Pagerank: 0.00011899821
- Overall Rank: 1,455 | 89.88%
- DOI:
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 10 of 10 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,630 | PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce | 2009 | VLDB | 8.4128091e-05 |
| 2,687 | BOAT—Optimistic Decision Tree Construction | 1999 | SIGMOD | 8.3050259e-05 |
| 2,908 | SPARTAN: A Model-Based Semantic Compression System for Massive Data Tables | 2001 | SIGMOD | 7.9306333e-05 |
| 3,699 | Adaptive Fastest Path Computation on a Road Network: A Traffic Mining Approach | 2007 | VLDB | 6.8337468e-05 |
| 4,685 | PUBLIC: A Decision Tree Classifier that Integrates Building and Pruning | 1998 | VLDB | 5.9994771e-05 |
| 5,079 | Combi-Operator – Database Support for Data Mining Applications | 2003 | VLDB | 5.7140516e-05 |
| 6,544 | A Framework for Measuring Changes in Data Characteristics | 1999 | PODS | 5.0202405e-05 |
| 8,048 | Lowering the Latency of Data Processing Pipelines Through FPGA based Hardware Acceleration | 2020 | VLDB | 4.5977431e-05 |
| 8,179 | Bellwether Analysis: Predicting Global Aggregates from Local Regions | 2006 | VLDB | 4.5669241e-05 |
| 12,572 | FARMER: Finding Interesting Rule Groups in Microarray Datasets | 2004 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,881 | Datamap-Driven Tabular Coreset Selection for Classifier Training | 2025 | VLDB | 4.1945683e-05 |
| 8,956 | T3: Accurate and Fast Performance Prediction for Relational Database Systems With Compiled Decision Trees | 2025 | SIGMOD | 4.4214154e-05 |
| 2,630 | PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce | 2009 | VLDB | 8.4128091e-05 |
| 3,770 | Constructing Efficient Decision Trees by Using Optimized Numeric Association Rules | 1996 | VLDB | 6.7779074e-05 |
| 5,020 | Efficient Construction of Regression Trees with Range and Region Splitting | 1997 | VLDB | 5.7552641e-05 |
| 4,685 | PUBLIC: A Decision Tree Classifier that Integrates Building and Pruning | 1998 | VLDB | 5.9994771e-05 |
| 12,692 | Decision Tables: Scalable Classification Exploring RDBMS Capabilities | 2000 | VLDB | 4.1945683e-05 |
| 2,687 | BOAT—Optimistic Decision Tree Construction | 1999 | SIGMOD | 8.3050259e-05 |
| 11,251 | Fast Search-By-Classification for Large-Scale Databases Using Index-Aware Decision Trees and Random Forests | 2023 | VLDB | 4.1945683e-05 |
| 1,107 | SPRINT: A Scalable Parallel Classifier for Data Mining | 1996 | VLDB | 0.00013985717 |