Current location - Trademark Inquiry Complete Network - Trademark inquiry - What problem does the random forest algorithm in decision tree solve?
What problem does the random forest algorithm in decision tree solve?
The emergence of random forest is mainly to understand the big errors and over-fitting problems that may occur in a single decision tree. The core idea of this algorithm is to combine several different decision trees, and use this combination to reduce the one-sidedness and inaccurate judgment that a single decision tree may bring.

Random forest refers to a classifier that uses multiple trees to train and predict samples. The classifier was first proposed by Leo Breiman and Adele Cutler and registered as a trademark.

In machine learning, random forest is a classifier that contains multiple decision trees, and the output category is determined by the category mode of each tree. Leo Breiman and Adele Cutler developed an algorithm to infer random forests. "Random Forest" is their trademark. ?

This term comes from the stochastic decision forest proposed by Tin Kam Ho of Bell Laboratories in 1995.

This method combines the idea of "guided aggregation" of Breimans and "random subspace method" of Ho to establish a set of decision trees.

Learning algorithm:

Each tree is built according to the following algorithm:

1.n represents the number of training cases (samples), and m represents the number of features.

2. Input the feature number m, which is used to determine the decision result of a node on the decision tree; Where m should be much smaller than m.

3. Extract n times from n training cases (samples) by putting back samples, form a bootstrap sampling, predict with the non-extracted use cases (samples), and evaluate its error.

4. For each node, randomly select M features, and the decision of each node in the decision tree is based on these features. According to these m features, the best splitting mode is calculated.

5. Each tree will grow completely without pruning, which can be adopted after establishing a normal tree classifier.