Machine Learning Laboratory (PECO6051L)
Laboratory Report
Experiment No - 3
Batch -
Date of Experiment: __________ Date of Submission: __________
Title: To implement CART decision tree algorithm
___________________________________________________________
Evaluation
1) Attendance [2] ----------------
2) Lab Performance [2] ----------------
3) Oral [1] ----------------
Overall Marks [5] ----------------
Subject Incharge
R. C. Patel Institute of Technology, Shirpur
Machine Learning Laboratory (PECO6051L)
Experiment No. 3
TITLE: To implement CART decision tree algorithm
THEORY:
Decision Tree
A decision tree is a non-parametric supervised learning technique: a tree of decision
rules, all of which are derived from the data features. It is one of the easiest machine learning
algorithms to understand and explain, and it is the fundamental building block of Random Forest,
one of the most popular and powerful ML algorithms.
Structure of Decision Tree
A decision tree recursively divides the data into segments or regions. In the tree
analogy, these segments or regions correspond to the nodes and leaves of the tree.
Root Node: The first node, which holds the entire training data set.
Internal Node: A point where a subgroup is split into new sub-groups or leaf nodes.
It is also called a decision node, because this is where the node splits further based
on the best attribute of its sub-group.
Leaf Node: A terminal node reached from an internal node; it holds the final decision.
About nodes in Decision Tree
As mentioned before, a decision tree is a tree-like structure with nested nodes;
one node is split into others based on an attribute and, for numeric attributes, a threshold value.
The decision tree algorithm splits the training set (the root node) into sub-groups (internal nodes),
and any internal node whose sub-group is not split further becomes a leaf node. This process
is called recursive partitioning.
To split a node, the decision tree algorithm needs the best attribute and threshold value. The best
pair (f, t) of attribute and threshold is chosen by a splitting criterion that yields the
purest child nodes. The algorithms below differ in the criterion used to measure the quality of a split:
CART algorithm : Gini Index
ID3 algorithm : Information Gain
C4.5 algorithm : Gain Ratio
CART Algorithm
This algorithm can be used for both classification and regression. For classification, CART uses the Gini index
criterion to split a node into sub-nodes. It starts with the training set as the root node; after successfully
splitting the root node, it splits each subset using the same logic, and then the sub-
subsets, recursively, until further splitting no longer produces purer sub-nodes or the maximum
number of leaves of the growing tree is reached. Restricting the growth of the tree in this way, or cutting
back branches after the tree is grown, is referred to as tree pruning. A minimal sketch of this growing
loop is given below.
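The following Python sketch illustrates the recursive partitioning loop just described. It is an
illustration under assumptions, not a full CART implementation: it performs multiway splits on
nominal features (matching the worked example that follows, whereas strict CART uses binary splits),
stops only when a node is pure or no features remain, and relies on a helper gini_split(rows, feature)
that is defined in a later snippet.

# Sketch of the CART growing loop (assumes gini_split(rows, feature) from a
# later snippet; rows are dicts with a "Decision" label; pruning not shown).
def build_tree(rows, features):
    labels = [row["Decision"] for row in rows]
    if len(set(labels)) == 1 or not features:
        # Pure node, or nothing left to split on: return a leaf holding
        # the majority decision.
        return {"leaf": max(set(labels), key=labels.count)}
    # Pick the attribute whose split gives the lowest weighted Gini index.
    best = min(features, key=lambda f: gini_split(rows, f))
    children = {}
    for value in set(row[best] for row in rows):
        subset = [row for row in rows if row[best] == value]
        children[value] = build_tree(subset, [f for f in features if f != best])
    return {"feature": best, "children": children}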
How to calculate Gini Index?
Gini = 1 – Σ (pᵢ)², where pᵢ is the probability of class i and the sum runs over the c classes.
Data set
There are 14 instances of golf playing decisions based on outlook, temperature, humidity and wind
factors.
Day Outlook Temp. Humidity Wind Decision
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
3 Overcast Hot High Weak Yes
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
7 Overcast Cool Normal Strong Yes
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
10 Rain Mild Normal Weak Yes
11 Sunny Mild Normal Strong Yes
12 Overcast Mild High Strong Yes
13 Overcast Hot Normal Weak Yes
14 Rain Mild High Strong No
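For the implementation part of this experiment, the same 14 instances can be written down as a small
Python data structure (plain dictionaries; the column names follow the table above). The snippets in
the rest of this section build on this list.

dataset = [
    {"Outlook": "Sunny",    "Temp": "Hot",  "Humidity": "High",   "Wind": "Weak",   "Decision": "No"},
    {"Outlook": "Sunny",    "Temp": "Hot",  "Humidity": "High",   "Wind": "Strong", "Decision": "No"},
    {"Outlook": "Overcast", "Temp": "Hot",  "Humidity": "High",   "Wind": "Weak",   "Decision": "Yes"},
    {"Outlook": "Rain",     "Temp": "Mild", "Humidity": "High",   "Wind": "Weak",   "Decision": "Yes"},
    {"Outlook": "Rain",     "Temp": "Cool", "Humidity": "Normal", "Wind": "Weak",   "Decision": "Yes"},
    {"Outlook": "Rain",     "Temp": "Cool", "Humidity": "Normal", "Wind": "Strong", "Decision": "No"},
    {"Outlook": "Overcast", "Temp": "Cool", "Humidity": "Normal", "Wind": "Strong", "Decision": "Yes"},
    {"Outlook": "Sunny",    "Temp": "Mild", "Humidity": "High",   "Wind": "Weak",   "Decision": "No"},
    {"Outlook": "Sunny",    "Temp": "Cool", "Humidity": "Normal", "Wind": "Weak",   "Decision": "Yes"},
    {"Outlook": "Rain",     "Temp": "Mild", "Humidity": "Normal", "Wind": "Weak",   "Decision": "Yes"},
    {"Outlook": "Sunny",    "Temp": "Mild", "Humidity": "Normal", "Wind": "Strong", "Decision": "Yes"},
    {"Outlook": "Overcast", "Temp": "Mild", "Humidity": "High",   "Wind": "Strong", "Decision": "Yes"},
    {"Outlook": "Overcast", "Temp": "Hot",  "Humidity": "Normal", "Wind": "Weak",   "Decision": "Yes"},
    {"Outlook": "Rain",     "Temp": "Mild", "Humidity": "High",   "Wind": "Strong", "Decision": "No"},
]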
Gini index
Gini index is a metric for classification tasks in CART. It is computed from the sum of the squared
probabilities of each class, as formulated below.
Gini = 1 – Σ (pᵢ)², for i = 1 to the number of classes
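As a sketch in Python, this formula maps directly onto a small helper that works on rows like those in
the dataset list defined above (the Decision column is treated as the class label; the helper name is
our own).

from collections import Counter

def gini(rows):
    # Gini = 1 - sum of squared class probabilities over the given rows.
    counts = Counter(row["Decision"] for row in rows)
    total = len(rows)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

print(round(gini(dataset), 3))   # 0.459 for the full 14-row set (9 Yes, 5 No)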
Outlook
Outlook is a nominal feature. It can be sunny, overcast or rain. The decision counts for the outlook
feature are summarized below.
Outlook Yes No Number of instances
Sunny 2 3 5
Overcast 4 0 4
Rain 3 2 5
Gini (Outlook=Sunny) = 1 – (2/5)² – (3/5)² = 1 – 0.16 – 0.36 = 0.48
Gini (Outlook=Overcast) = 1 – (4/4)² – (0/4)² = 0
Gini (Outlook=Rain) = 1 – (3/5)² – (2/5)² = 1 – 0.36 – 0.16 = 0.48
Then we calculate the weighted sum of the Gini indices for the outlook feature.
Gini (Outlook) = (5/14) x 0.48 + (4/14) x 0 + (5/14) x 0.48 = 0.171 + 0 + 0.171 = 0.342
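This weighted sum is the quantity CART minimizes when choosing a split. A small helper (reusing the
gini function and dataset list from the earlier snippets) generalizes it to any nominal feature and
reproduces the outlook value.

def gini_split(rows, feature):
    # Weighted Gini index after splitting the rows on each value of the feature.
    total = len(rows)
    weighted = 0.0
    for value in set(row[feature] for row in rows):
        subset = [row for row in rows if row[feature] == value]
        weighted += len(subset) / total * gini(subset)
    return weighted

print(round(gini_split(dataset, "Outlook"), 3))   # 0.343 (0.342 above, which rounds 0.171 twice)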
Temperature
Similarly, temperature is a nominal feature and it can take 3 different values: Cool, Hot and
Mild. Let’s summarize the decisions for the temperature feature.
Temperature Yes No Number of instances
Hot 2 2 4
Cool 3 1 4
Mild 4 2 6
Gini (Temp=Hot) = 1 – (2/4)² – (2/4)² = 0.5
Gini (Temp=Cool) = 1 – (3/4)² – (1/4)² = 1 – 0.5625 – 0.0625 = 0.375
Gini (Temp=Mild) = 1 – (4/6)² – (2/6)² = 1 – 0.444 – 0.111 = 0.445
We’ll calculate the weighted sum of the Gini indices for the temperature feature:
Gini (Temp) = (4/14) x 0.5 + (4/14) x 0.375 + (6/14) x 0.445 = 0.142 + 0.107 + 0.190 = 0.439
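As a quick check, the gini_split helper from the previous snippet reproduces this score; exact
arithmetic gives 0.440, and the 0.439 above comes from rounding the intermediate terms.

print(round(gini_split(dataset, "Temp"), 3))   # 0.44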
Humidity
Humidity is a binary class feature. It can be high or normal.
Humidity Yes No Number of instances
High 3 4 7
Normal 6 1 7
Gini (Humidity=High) = 1 – (3/7)² – (4/7)² = 1 – 0.183 – 0.326 = 0.489
Gini (Humidity=Normal) = 1 – (6/7)² – (1/7)² = 1 – 0.734 – 0.020 = 0.244
The weighted sum for the humidity feature is calculated next:
Gini (Humidity) = (7/14) x 0.489 + (7/14) x 0.244 = 0.367
Wind
Wind is a binary class feature similar to humidity. It can be weak or strong.
Wind Yes No Number of instances
Weak 6 2 8
Strong 3 3 6
Gini (Wind=Weak) = 1 – (6/8)² – (2/8)² = 1 – 0.5625 – 0.0625 = 0.375
Gini (Wind=Strong) = 1 – (3/6)² – (3/6)² = 1 – 0.25 – 0.25 = 0.5
Gini (Wind) = (8/14) x 0.375 + (6/14) x 0.5 = 0.428
We’ve calculated the Gini index values for each feature. The winner is the outlook feature because
its Gini index is the lowest; the snippet after the table below verifies these scores.
Feature Gini index
Outlook 0.342
Temperature 0.439
Humidity 0.367
Wind 0.428
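Choosing the root split is then just a matter of taking the feature with the lowest weighted Gini
index. The sketch below continues with the gini_split helper and the dataset list from the earlier
snippets; the last digits differ slightly from the table above because the table rounds intermediate
terms.

features = ["Outlook", "Temp", "Humidity", "Wind"]
scores = {f: round(gini_split(dataset, f), 3) for f in features}
print(scores)                       # {'Outlook': 0.343, 'Temp': 0.44, 'Humidity': 0.367, 'Wind': 0.429}
print(min(scores, key=scores.get))  # 'Outlook' becomes the root node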
We’ll put the outlook decision at the top of the tree.
You might notice that the sub-dataset in the overcast branch has only yes decisions. This means that
the overcast branch is pure and ends in a leaf.
Now focus on the sub-dataset for the sunny outlook. We need to find the Gini index scores for the
temperature, humidity and wind features within this subset.
Day Outlook Temp. Humidity Wind Decision
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
11 Sunny Mild Normal Strong Yes
Gini of temperature for sunny outlook
Temperature Yes No Number of instances
Hot 0 2 2
Cool 1 0 1
Mild 1 1 2
Gini (Outlook=Sunny and Temp.=Hot) = 1 – (0/2)² – (2/2)² = 0
Gini (Outlook=Sunny and Temp.=Cool) = 1 – (1/1)² – (0/1)² = 0
Gini (Outlook=Sunny and Temp.=Mild) = 1 – (1/2)² – (1/2)² = 1 – 0.25 – 0.25 = 0.5
Gini (Outlook=Sunny and Temp.) = (2/5) x 0 + (1/5) x 0 + (2/5) x 0.5 = 0.2
Gini of humidity for sunny outlook
Humidity Yes No Number of instances
High 0 3 3
Normal 2 0 2
Gini (Outlook=Sunny and Humidity=High) = 1 – (0/3)² – (3/3)² = 0
Gini (Outlook=Sunny and Humidity=Normal) = 1 – (2/2)² – (0/2)² = 0
Gini (Outlook=Sunny and Humidity) = (3/5) x 0 + (2/5) x 0 = 0
Gini of wind for sunny outlook
Wind Yes No Number of instances
Weak 1 2 3
Strong 1 1 2
Gini (Outlook=Sunny and Wind=Weak) = 1 – (1/3)² – (2/3)² = 1 – 0.111 – 0.444 = 0.444
Gini (Outlook=Sunny and Wind=Strong) = 1 – (1/2)² – (1/2)² = 1 – 0.25 – 0.25 = 0.5
Gini (Outlook=Sunny and Wind) = (3/5) x 0.444 + (2/5) x 0.5 = 0.267 + 0.2 = 0.467
Decision for sunny outlook
We’ve calculated the Gini index scores for each feature when the outlook is sunny. The winner is
humidity because it has the lowest value.
Feature Gini index
Temperature 0.2
Humidity 0
Wind 0.467
We’ll put the humidity check under the sunny outlook branch.
As seen, the decision is always no for high humidity with a sunny outlook. On the other hand, the
decision is always yes for normal humidity with a sunny outlook. This branch is complete.
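The same procedure, applied only to the sunny sub-dataset, confirms these scores (a sketch reusing the
gini_split helper and dataset list from the earlier snippets).

sunny = [row for row in dataset if row["Outlook"] == "Sunny"]
print({f: round(gini_split(sunny, f), 3) for f in ["Temp", "Humidity", "Wind"]})
# {'Temp': 0.2, 'Humidity': 0.0, 'Wind': 0.467} -> split the sunny branch on Humidity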
Now, we need to focus on the rain outlook.
Rain outlook
Day Outlook Temp. Humidity Wind Decision
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
10 Rain Mild Normal Weak Yes
14 Rain Mild High Strong No
We’ll calculate the Gini index scores for the temperature, humidity and wind features when the outlook
is rain.
Gini of temperature for rain outlook
Temperature Yes No Number of instances
Cool 1 1 2
Mild 2 1 3
Gini (Outlook=Rain and Temp.=Cool) = 1 – (1/2)² – (1/2)² = 0.5
Gini (Outlook=Rain and Temp.=Mild) = 1 – (2/3)² – (1/3)² = 0.444
Gini (Outlook=Rain and Temp.) = (2/5) x 0.5 + (3/5) x 0.444 = 0.466
Gini of humidity for rain outlook
Humidity Yes No Number of instances
High 1 1 2
Normal 2 1 3
Gini (Outlook=Rain and Humidity=High) = 1 – (1/2)² – (1/2)² = 0.5
Gini (Outlook=Rain and Humidity=Normal) = 1 – (2/3)² – (1/3)² = 0.444
Gini (Outlook=Rain and Humidity) = (2/5) x 0.5 + (3/5) x 0.444 = 0.466
Gini of wind for rain outlook
Wind Yes No Number of instances
Weak 3 0 3
Strong 0 2 2
Gini (Outlook=Rain and Wind=Weak) = 1 – (3/3)² – (0/3)² = 0
Gini (Outlook=Rain and Wind=Strong) = 1 – (0/2)² – (2/2)² = 0
Gini (Outlook=Rain and Wind) = (3/5) x 0 + (2/5) x 0 = 0
Decision for rain outlook
The winner for the rain outlook is the wind feature, because it has the minimum Gini index score among
the features; the short check after the table below confirms this.
Feature Gini index
Temperature 0.466
Humidity 0.466
Wind 0
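Evaluating the rain sub-dataset the same way confirms that wind gives a perfectly pure split (the
0.467 values are the exact figures; the 0.466 above comes from rounding the intermediate terms).

rain = [row for row in dataset if row["Outlook"] == "Rain"]
print({f: round(gini_split(rain, f), 3) for f in ["Temp", "Humidity", "Wind"]})
# {'Temp': 0.467, 'Humidity': 0.467, 'Wind': 0.0} -> split the rain branch on Wind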
Put the wind feature under the rain outlook branch and inspect the new sub-datasets.
As seen, the decision is always yes when the wind is weak. On the other hand, the decision is always
no when the wind is strong. This means that this branch is complete, and the tree is finished.
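Putting the three splits together, the finished tree can be written down as a nested dictionary and
used for prediction. This is a sketch: the structure mirrors the walkthrough above, while the
dictionary format and function names are our own.

tree = {
    "feature": "Outlook",
    "children": {
        "Overcast": {"leaf": "Yes"},
        "Sunny": {"feature": "Humidity",
                  "children": {"High": {"leaf": "No"}, "Normal": {"leaf": "Yes"}}},
        "Rain": {"feature": "Wind",
                 "children": {"Weak": {"leaf": "Yes"}, "Strong": {"leaf": "No"}}},
    },
}

def predict(node, sample):
    # Follow the splits until a leaf is reached, then return its decision.
    while "leaf" not in node:
        node = node["children"][sample[node["feature"]]]
    return node["leaf"]

print(all(predict(tree, row) == row["Decision"] for row in dataset))   # True: all 14 rows fit

Calling build_tree(dataset, ["Outlook", "Temp", "Humidity", "Wind"]) from the sketch in the theory
section should produce this same structure, since each branch has a unique lowest-Gini feature.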
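For reference, a similar tree can also be grown with scikit-learn's CART implementation. This is a
sketch under assumptions: it reuses the dataset list defined earlier and one-hot encodes the nominal
features, because DecisionTreeClassifier expects numeric inputs; the resulting splits are therefore on
indicator columns rather than directly on the multi-valued attributes.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

df = pd.DataFrame(dataset)                        # the 14-row list from the theory section
X = pd.get_dummies(df.drop(columns="Decision"))   # one-hot encode the nominal features
y = df["Decision"]

clf = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)
print(export_text(clf, feature_names=list(X.columns)))
print(clf.score(X, y))                            # 1.0 on the 14 training rows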
CONCLUSION / RESULT:
In this experiment, we implemented the CART decision tree algorithm on the given play-golf dataset,
selecting the split at each node by the Gini index.