Random Forests

By Leo Breiman
Published on Jan. 10, 2001

Table of Contents

1. Random Forests
1.1 Introduction
1.2 Outline of Paper
2. Characterizing the Accuracy of Random Forests
2.1 Random Forests Converge
2.2 Strength and Correlation
3. Using Random Features
3.1 Using Out-Of-Bag Estimates to Monitor Error, Strength, and Correlation
4. Random Forests Using Random Input Selection

Summary

Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. As the number of trees becomes large, the generalization error of the forest converges almost surely to a limiting value, so adding more trees does not overfit. The generalization error of a forest of tree classifiers is bounded above in terms of two quantities: the strength of the individual trees and the correlation between them.

Growing an ensemble of trees and letting them vote for the most popular class has produced significant improvements in classification accuracy, and injecting randomness through random vectors, in particular through random selection of the input features considered at each split, drives much of that gain. Out-of-bag estimates, computed on the training cases left out of each tree's bootstrap sample, provide internal estimates of the error, strength, and correlation as the forest grows. Random forests built with random input selection compare favorably with Adaboost in accuracy and are more robust to noise.
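In the paper's notation, with classifiers h(x, Θ_k) grown from independent, identically distributed random vectors Θ_k, the convergence result and the strength/correlation bound can be stated compactly. The LaTeX below restates Theorems 1.2 and 2.3 of the paper, where s is the strength of the set of classifiers and ρ̄ the mean correlation between trees:

    % Margin function: the margin by which the vote for the right
    % class y exceeds the vote for the best wrong class.
    mr(\mathbf{x},y) = P_\Theta\bigl(h(\mathbf{x},\Theta)=y\bigr)
                     - \max_{j \ne y} P_\Theta\bigl(h(\mathbf{x},\Theta)=j\bigr)

    % As the number of trees grows, the generalization error PE^*
    % converges almost surely to the probability of a negative margin:
    PE^* \;\longrightarrow\; P_{\mathbf{X},Y}\bigl(mr(\mathbf{X},Y) < 0\bigr)

    % Upper bound in terms of the strength s = E_{X,Y}[mr(X,Y)] and
    % the mean correlation \bar{\rho} of the trees' margin functions:
    PE^* \;\le\; \bar{\rho}\,(1 - s^2)/s^2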
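To make the out-of-bag idea concrete, the following is a minimal sketch of a forest grown with bagging and random input selection, using each tree's out-of-bag cases to estimate the forest's error. It is not Breiman's implementation: the breast-cancer dataset, scikit-learn's DecisionTreeClassifier as the base tree, 100 trees, and the value of F (the paper tried F = 1 and the first integer less than log2(M) + 1 for M inputs) are assumptions made for this example.

    import numpy as np
    from collections import Counter
    from sklearn.datasets import load_breast_cancer
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X, y = load_breast_cancer(return_X_y=True)
    n, d = X.shape
    F = int(np.log2(d) + 1)   # random inputs tried at each split
    n_trees = 100

    trees = []
    oob_votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        boot = rng.integers(0, n, size=n)        # bootstrap sample (part of the random vector Theta)
        oob = np.setdiff1d(np.arange(n), boot)   # cases this tree never sees
        tree = DecisionTreeClassifier(max_features=F,
                                      random_state=int(rng.integers(1 << 31)))
        tree.fit(X[boot], y[boot])
        trees.append(tree)                       # keep the tree so the forest can vote later
        for i, p in zip(oob, tree.predict(X[oob])):
            oob_votes[i][p] += 1                 # vote only on out-of-bag cases

    # Out-of-bag error: plurality vote among the trees that did not see case i
    pred = np.array([v.most_common(1)[0][0] if v else -1 for v in oob_votes])
    seen = pred != -1
    print(f"out-of-bag error estimate: {np.mean(pred[seen] != y[seen]):.3f}")

Because each tree never sees its out-of-bag cases, the estimate behaves like a built-in test-set error and can be tracked as trees are added; the paper uses the same device to monitor strength and correlation.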