Commit 0189c28

Update random-forest.md
1 parent 86e7c0d commit 0189c28

1 file changed (+6 -31 lines changed)

contrib/machine-learning/random-forest.md

Lines changed: 6 additions & 31 deletions
@@ -2,31 +2,6 @@

Random Forest is a versatile machine learning algorithm capable of performing both regression and classification tasks. It is an ensemble method that operates by constructing a multitude of decision trees during training and outputting the average prediction of the individual trees (for regression) or the mode of the classes (for classification).

-
-- [Random Forest](#random-forest)
-- [Introduction](#introduction)
-- [How Random Forest Works](#how-random-forest-works)
-- [1. Bootstrap Sampling:](#1-bootstrap-sampling)
-- [2. Decision Trees:](#2-decision-trees)
-- [3. Feature Selection:](#3-feature-selection)
-- [4. Voting/Averaging:](#4-votingaveraging)
-- [Detailed Working Mechanism](#detailed-working-mechanism)
-- [Step 3: Aggregation:](#step-3-aggregation)
-- [Advantages and Disadvantages](#advantages-and-disadvantages)
-- [Advantages](#advantages)
-- [Disadvantages](#disadvantages)
-- [Hyperparameters](#hyperparameters)
-- [Key Hyperparameters](#key-hyperparameters)
-- [Tuning Hyperparameters](#tuning-hyperparameters)
-- [Code Examples](#code-examples)
-- [Classification Example](#classification-example)
-- [Feature Importance](#feature-importance)
-- [Hyperparameter Tuning](#hyperparameter-tuning)
-- [Regression Example](#regression-example)
-- [Conclusion](#conclusion)
-- [References](#references)
-
-
## Introduction
Random Forest is an ensemble learning method used for classification and regression tasks. It is built from multiple decision trees and combines their outputs to improve the model's accuracy and control over-fitting.

@@ -41,9 +16,9 @@ Random Forest is an ensemble learning method used for classification and regress
For classification, the mode of the classes predicted by individual trees is taken (majority vote).
For regression, the average of the outputs of the individual trees is taken.
### Detailed Working Mechanism
-* #### Step 1: Bootstrap Sampling:
+#### Step 1: Bootstrap Sampling:
Each tree is trained on a random sample of the original data, drawn with replacement (bootstrap sample). This means some data points may appear multiple times in a sample while others may not appear at all.
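
As a rough illustration of the bootstrap step, the sketch below draws one sample of row indices with replacement using NumPy. The dataset size, the seed, and the variable names are assumptions made for the example and are not part of the file under review.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only for reproducibility
n_samples = 10                       # assumed toy dataset size
indices = np.arange(n_samples)

# One bootstrap sample: draw n_samples row indices with replacement,
# so some rows repeat and others are left out.
bootstrap_idx = rng.choice(indices, size=n_samples, replace=True)

# Rows that were never drawn (often called the "out-of-bag" rows for this tree).
oob_idx = np.setdiff1d(indices, bootstrap_idx)

print("bootstrap sample:", bootstrap_idx)
print("out-of-bag rows: ", oob_idx)
```

For large datasets a bootstrap sample contains roughly 63% of the distinct rows on average, which is why each tree sees a noticeably different view of the data.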
-* #### Step 2: Tree Construction:
+#### Step 2: Tree Construction:
Each node in the tree is split using the best split among a random subset of the features. This process adds an additional layer of randomness, contributing to the robustness of the model.
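
The split search over a random feature subset might look like the following minimal sketch. It assumes Gini impurity as the split criterion (the text only says "best split"). The helper names `gini` and `best_split_on_random_subset` are hypothetical, and the `max_features` argument simply mirrors the name scikit-learn uses for the subset size.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split_on_random_subset(X, y, max_features, rng):
    """Return (feature, threshold, weighted impurity) of the best split,
    searching only a random subset of the features (illustrative helper)."""
    n_samples, n_features = X.shape
    # Pick the candidate features for this node without replacement.
    candidates = rng.choice(n_features, size=max_features, replace=False)
    best = (None, None, np.inf)
    for f in candidates:
        # Skip the largest value so that both sides of the split are non-empty.
        for t in np.unique(X[:, f])[:-1]:
            left = y[X[:, f] <= t]
            right = y[X[:, f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n_samples
            if score < best[2]:
                best = (int(f), float(t), float(score))
    return best

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = (X[:, 2] > 0).astype(int)  # toy labels that depend on feature 2 only
feature, threshold, impurity = best_split_on_random_subset(X, y, max_features=2, rng=rng)
print(feature, threshold, impurity)
```

Because only two of the four features are considered, the informative feature is sometimes not among the candidates. That is the extra randomness the step describes: individual trees get weaker, but they become less correlated with one another.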
#### Step 3: Aggregation:
For classification tasks, the final prediction is based on the majority vote from all the trees. For regression tasks, the final prediction is the average of all the tree predictions.
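
The aggregation step can be sketched with plain NumPy. The per-tree predictions below are made-up numbers used only to show the arithmetic, and `majority_vote` is a hypothetical helper, not a library function.

```python
import numpy as np

# Hypothetical class predictions: 5 trees (rows) x 4 samples (columns).
class_preds = np.array([
    [0, 1, 2, 1],
    [0, 1, 1, 1],
    [0, 2, 2, 0],
    [1, 1, 2, 1],
    [0, 1, 2, 2],
])

def majority_vote(preds):
    """Return the most frequent class in each column."""
    n_classes = preds.max() + 1
    # counts[c, j] = number of trees that voted class c for sample j
    counts = np.stack([(preds == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)

print(majority_vote(class_preds))  # [0 1 2 1]

# Hypothetical regression predictions: 3 trees x 2 samples.
reg_preds = np.array([
    [2.0, 3.0],
    [4.0, 5.0],
    [6.0, 1.0],
])
avg = reg_preds.mean(axis=0)
print(avg)  # [4. 3.]
```

Note that scikit-learn's `RandomForestClassifier` averages the trees' predicted class probabilities rather than counting hard votes, so its output can differ from a strict majority vote when trees are uncertain.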
@@ -73,7 +48,7 @@ Hyperparameter tuning can significantly improve the performance of a Random Fore
#### Classification Example
Below is a simple example of using Random Forest for a classification task with the Iris dataset.

-```
+```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
@@ -109,7 +84,7 @@ print("Classification Report:\n", classification_report(y_test, y_pred))
Random Forest provides a way to measure the importance of each feature in making predictions.


-```
+```python
import matplotlib.pyplot as plt

# Get feature importances
@@ -132,7 +107,7 @@ plt.show()
#### Hyperparameter Tuning
Using Grid Search for hyperparameter tuning.

-```
+```python
from sklearn.model_selection import GridSearchCV

# Define the parameter grid
@@ -155,7 +130,7 @@ print("Best parameters found: ", grid_search.best_params_)
#### Regression Example
Below is a simple example of using Random Forest for a regression task with the Boston housing dataset.

-```
+```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_boston
