Decision Trees



Important Terminology

• Root Node: Represents the entire population or sample; this further gets divided into two or more homogeneous sets.
• Splitting: The process of dividing a node into two or more sub-nodes.
• Decision Node: When a sub-node splits into further sub-nodes, it is called a decision node.
• Leaf / Terminal Node: A node with no children (no further split) is called a leaf or terminal node.
• Pruning: When we reduce the size of a decision tree by removing nodes (the opposite of splitting), the process is called pruning.
• Branch / Sub-Tree: A sub-section of the decision tree is called a branch or sub-tree.
• Parent and Child Node: A node which is divided into sub-nodes is called the parent node of those sub-nodes, whereas the sub-nodes are the children of the parent node.
ID3 - Entropy

• A decision tree is built top-down from a root node and involves partitioning the data into subsets that contain instances with similar values (homogeneous).
• The ID3 algorithm uses entropy to calculate the homogeneity of a sample.
• If the sample is completely homogeneous, the entropy is zero; if the sample is equally divided, the entropy is one.
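
To make these two extremes concrete, here is a minimal Python sketch (not from the slides; the function name is illustrative) of entropy over a list of class labels:

from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy (base 2) of a list of class labels.
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

print(entropy(["yes"] * 8))               # 0.0 -> completely homogeneous
print(entropy(["yes"] * 4 + ["no"] * 4))  # 1.0 -> equally divided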
ID3 - Entropy

• Entropy using the frequency table of one attribute (the target itself): E(S) = -Σ p(c) log2 p(c), summed over the classes c, where p(c) is the proportion of class c in S.
• Entropy using the frequency table of two attributes (the target against a candidate attribute X): E(S, X) = Σ P(v) * E(Sv), summed over the values v of X, where Sv is the subset of S with X = v.
ID3 Steps

• The information gain is based on the decrease in entropy after a dataset is split on an attribute.
• Constructing a decision tree is all about finding the attribute that returns the highest information gain (i.e., the most homogeneous branches).
• Step 1: Calculate the entropy of the target.
• Step 2: The dataset is then split on the different attributes. The entropy for each branch is calculated and added proportionally to get the total entropy for the split. The resulting entropy is subtracted from the entropy before the split; the result is the information gain, or decrease in entropy (see the sketch below).
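
A minimal sketch of Steps 1-2 (the name info_gain and the list-of-dicts row format are assumptions, reusing the entropy sketch above):

def info_gain(rows, attr, target):
    # Information gain from splitting `rows` (a list of dicts) on `attr`.
    parent = entropy([r[target] for r in rows])          # Step 1: entropy of the target
    weighted = 0.0
    for v in set(r[attr] for r in rows):
        branch = [r[target] for r in rows if r[attr] == v]
        weighted += (len(branch) / len(rows)) * entropy(branch)  # Step 2: proportional sum
    return parent - weighted                             # decrease in entropy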
ID3 Steps

• Step 3: Choose the attribute with the largest information gain as the decision node, divide the dataset by its branches, and repeat the same process on every branch (a selection sketch follows below).
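
With the hypothetical info_gain above, Step 3 reduces to an argmax over the candidate attributes:

def best_attribute(rows, attributes, target):
    # Attribute with the largest information gain on this subset of rows.
    return max(attributes, key=lambda a: info_gain(rows, a, target))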
ID3 Steps
• Step 4a: A branch with an entropy of 0 is a leaf node.
• Step 4b: A branch with an entropy of more than 0 needs further splitting.
• Step 5: The ID3 algorithm is run recursively on the non-leaf branches until all data is classified (a recursive sketch follows below).
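
Putting Steps 3-5 together, a compact recursive sketch built on the hypothetical helpers above (a pure branch becomes a leaf; a majority-class fallback is assumed when no attributes remain):

from collections import Counter  # already imported in the entropy sketch

def id3(rows, attributes, target):
    labels = [r[target] for r in rows]
    # Step 4a: a pure branch (entropy 0) is a leaf node.
    if len(set(labels)) == 1:
        return labels[0]
    # Assumed fallback: no attributes left, predict the majority class.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Step 3: split on the attribute with the largest information gain.
    best = best_attribute(rows, attributes, target)
    tree = {best: {}}
    for v in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == v]
        remaining = [a for a in attributes if a != best]
        # Steps 4b and 5: recurse on branches that still need splitting.
        tree[best][v] = id3(subset, remaining, target)
    return tree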
Example Revisited

• Consider the weather dataset, based on which we will determine whether to play football or not.
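
The dataset itself appears on the slide only as a table image. Every count used below (9 yes / 5 no overall, and the 3/2, 4/0, 2/3 split by Outlook) matches the standard 14-row play-tennis dataset, so, assuming that is the table, it can be written out for the sketches above as:

weather = [
    {"Outlook": "Sunny",    "Temperature": "Hot",  "Humidity": "High",   "Windy": False, "Play": "No"},
    {"Outlook": "Sunny",    "Temperature": "Hot",  "Humidity": "High",   "Windy": True,  "Play": "No"},
    {"Outlook": "Overcast", "Temperature": "Hot",  "Humidity": "High",   "Windy": False, "Play": "Yes"},
    {"Outlook": "Rainy",    "Temperature": "Mild", "Humidity": "High",   "Windy": False, "Play": "Yes"},
    {"Outlook": "Rainy",    "Temperature": "Cool", "Humidity": "Normal", "Windy": False, "Play": "Yes"},
    {"Outlook": "Rainy",    "Temperature": "Cool", "Humidity": "Normal", "Windy": True,  "Play": "No"},
    {"Outlook": "Overcast", "Temperature": "Cool", "Humidity": "Normal", "Windy": True,  "Play": "Yes"},
    {"Outlook": "Sunny",    "Temperature": "Mild", "Humidity": "High",   "Windy": False, "Play": "No"},
    {"Outlook": "Sunny",    "Temperature": "Cool", "Humidity": "Normal", "Windy": False, "Play": "Yes"},
    {"Outlook": "Rainy",    "Temperature": "Mild", "Humidity": "Normal", "Windy": False, "Play": "Yes"},
    {"Outlook": "Sunny",    "Temperature": "Mild", "Humidity": "Normal", "Windy": True,  "Play": "Yes"},
    {"Outlook": "Overcast", "Temperature": "Mild", "Humidity": "High",   "Windy": True,  "Play": "Yes"},
    {"Outlook": "Overcast", "Temperature": "Hot",  "Humidity": "Normal", "Windy": False, "Play": "Yes"},
    {"Outlook": "Rainy",    "Temperature": "Mild", "Humidity": "High",   "Windy": True,  "Play": "No"},
]

With this list, id3(weather, ["Outlook", "Temperature", "Humidity", "Windy"], "Play") reproduces the tree derived in the remaining slides, with Outlook at the root.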
First step - find the parent node for our decision tree

• Find the entropy of the class variable (all logs are base 2).

E(S) = -[(9/14)log(9/14) + (5/14)log(5/14)] = 0.94

• Calculate the average weighted entropy.

E(S, outlook) = (5/14)*E(3,2) + (4/14)*E(4,0) + (5/14)*E(2,3)
= (5/14)(-(3/5)log(3/5) - (2/5)log(2/5)) + (4/14)(0) + (5/14)(-(2/5)log(2/5) - (3/5)log(3/5)) = 0.693

• Find the information gain [the difference between the parent entropy and the average weighted entropy]: IG(S, outlook) = 0.94 - 0.693 = 0.247
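
As a quick sanity check on these numbers (an illustrative sketch using only the counts stated above):

from math import log2

def e(a, b):
    # Entropy of a two-class split with counts a and b.
    total = a + b
    return sum(-(c / total) * log2(c / total) for c in (a, b) if c)

print(e(9, 5))                                               # ≈ 0.940 -> E(S)
weighted = (5/14)*e(3, 2) + (4/14)*e(4, 0) + (5/14)*e(2, 3)
print(weighted)                                              # ≈ 0.694 -> E(S, outlook), the 0.693 above
print(e(9, 5) - weighted)                                    # ≈ 0.247 -> IG(S, outlook)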
First step - find the parent node for our decision tree

• Find the information gain for Temperature, Humidity and Windy:

• IG(S, Temperature) = 0.940 - 0.911 = 0.029

• IG(S, Humidity) = 0.940 - 0.788 = 0.152

• IG(S, Windy) = 0.940 - 0.8932 = 0.048


Step 2: select the feature having the largest information gain

• Here it is Outlook, so it forms the first node (root node) of our decision tree.

Since the Overcast branch contains only examples of class ‘Yes’, we can set it directly as a ‘Yes’ leaf.
Next step: find the next node in our decision tree

• Find the node under Sunny: determine which of Temperature, Humidity or Windy has the highest information gain.

Calculate the parent entropy E(sunny):

E(sunny) = -(3/5)log(3/5) - (2/5)log(2/5) = 0.971
Next step: find the next node in our decision tree

• Calculate the information gain of Temperature, IG(sunny, Temperature):

E(sunny, Temperature) = (2/5)*E(0,2) + (2/5)*E(1,1) + (1/5)*E(1,0) = 2/5 = 0.4

IG(sunny, Temperature) = 0.971 - 0.4 = 0.571

• Compute the information gain for the remaining attributes in the same way:

IG(sunny, Humidity) = 0.971
IG(sunny, Windy) = 0.020
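
Using the info_gain sketch and the assumed dataset listed earlier, these values can be checked directly (again illustrative, not from the slides):

sunny = [r for r in weather if r["Outlook"] == "Sunny"]
for attr in ("Temperature", "Humidity", "Windy"):
    print(attr, info_gain(sunny, attr, "Play"))
# Temperature ≈ 0.571, Humidity ≈ 0.971, Windy ≈ 0.020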
Next step: find the next node in our decision tree

• IG(sunny, Humidity) is the largest value, so Humidity is the node that comes under Sunny.
• Play will occur if Humidity is normal and will not occur if it is high.
Final Decision Tree
Example 2

• https://medium.com/@rakendd/building-decision-trees-and-its-math-711862eea1c0
CART Vs ID3 Vs C4.5
Types of decision trees
