
Decision Tree Basics

Dan Lo
Department of Computer Science
Kennesaw State University
Overview
• Widely used in practice
• Strengths include
– Fast and simple to implement
– Can convert to rules
– Handles noisy data

• Weaknesses include
– Univariate splits (partitioning on only one attribute at a time) limit the types of trees that can be built
– Large decision trees may be hard to understand
– Requires fixed-length feature vectors
– Non-incremental (i.e., batch method)
Tennis Played?
• Columns denote features Xi
• Rows denote labeled instances (xi, yi)
• The class label denotes whether a tennis game was played
Decision Tree
• A possible decision tree for the data:

• Each internal node: tests one attribute Xi
• Each branch from a node: selects one value for Xi
• Each leaf node: predicts Y
Decision Tree – Decision Boundary
• Decision trees divide the feature space into axis-parallel (hyper)rectangles
• Each rectangular region is labeled with one label
  – or with a probability distribution over labels
Decision Tree – Is milk spoiled?
Another Example
• A robot wants to decide which animals in the shop would make a good pet for a child.
First Decision Tree

• This decision tree correctly predicts 5 out of 9 cases.
• The threshold could be lowered to 100 kg to get 6 out of 9 correct.
• We need to build a second decision tree for the lighter cases.
The Second Decision Tree
• One direction is to pick a feature that changes some of the incorrect Yes answers to No, e.g., ruling out the snake by its green color.
What Functions Can be Represented?
• Decision trees can represent any function of the input attributes.
• For Boolean functions, each path to a leaf gives one row of the truth table.
• However, the tree could have exponentially many nodes.
Information Gain
• Which test is more informative?
Impurity/Entropy
• Measures the level of impurity in a group of examples
Impurity
Entropy – a Common Way to Measure Impurity
• Entropy = −Σ_i p_i lg p_i, where p_i is the probability of class i in the node.
• Entropy comes from information theory: the higher the entropy, the higher the information content.
• Another measure: Gini impurity, Gini = 1 − Σ_i p_i².
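As a quick illustration (not from the slides), both impurity measures can be computed from class proportions in a few lines of Python; the function names are my own:

```python
import math

def entropy(proportions):
    """Entropy = sum_i -p_i * lg(p_i), with lg = log base 2 and 0*lg(0) taken as 0."""
    return sum(-p * math.log2(p) for p in proportions if p > 0)

def gini(proportions):
    """Gini impurity = 1 - sum_i p_i^2."""
    return 1 - sum(p * p for p in proportions)

# A pure node and an evenly mixed node (the two 2-class cases discussed next):
print(entropy([1.0]))        # 0.0
print(entropy([0.5, 0.5]))   # 1.0
print(gini([0.5, 0.5]))      # 0.5
```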
2-Class Case
• Entropy(x) = −Σ_{i=1}^{2} p(x = i) lg p(x = i)
• What is the entropy of a group in which all examples belong to the same class?
  – Entropy = −1 lg 1 = 0
  – Not a good training set for learning
• What is the entropy of a group with 50% of each class?
  – Entropy = −0.5 lg 0.5 − 0.5 lg 0.5 = 1
  – A good training set for learning
Sample Entropy

• S is a training sample
• 𝑝⊕ is the proportion of positive examples in S.
• 𝑝⊖ is the proportion of negative examples in S.
• Entropy measures the impurity of S
• Entropy(S) = −p⊕ lg p⊕ − p⊖ lg p⊖
Information Gain
• We want to determine which attribute in a given set of training
feature vectors is most useful for discriminating between the classes
to be learned.
• Information gain tells us how important a given attribute of the
feature vectors is.
• We will use it to decide the ordering of attributes in the nodes of a
decision tree.
• IG = Entropy(parent) – Weighted Sum of Entropy(children)
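Building on the entropy sketch above, information gain for one candidate split can be written as follows; the helper `class_proportions` and the data layout are illustrative assumptions, not part of the slides:

```python
from collections import Counter

def class_proportions(labels):
    """Fraction of examples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return [c / total for c in counts.values()]

def information_gain(parent_labels, children_labels):
    """IG = Entropy(parent) - weighted sum of Entropy(children)."""
    n = len(parent_labels)
    weighted = sum(len(child) / n * entropy(class_proportions(child))
                   for child in children_labels)
    return entropy(class_proportions(parent_labels)) - weighted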
Basic Algorithm for Top-Down Learning of
Decision Trees
ID3 (Iterative Dichotomiser 3, Ross Quinlan, 1986)
node = root of decision tree
Main loop:
1. A ← the “best” decision attribute for the next node.
2. Assign A as the decision attribute for node.
3. For each value of A, create a new descendant of node.
4. Sort the training examples to the leaf nodes.
5. If the training examples are perfectly classified, stop; else, recurse over the new leaf nodes.

Question: How do we choose which attribute is best?


Choosing the Best Attribute
Key problem: choosing which attribute to split a given set of examples on
• Some possibilities are:
  – Random: select any attribute at random
  – Least-Values: choose the attribute with the smallest number of possible values
  – Most-Values: choose the attribute with the largest number of possible values
  – Max-Gain: choose the attribute with the largest expected information gain,
    i.e., the attribute that results in the smallest expected size of the subtrees rooted at its children
• The ID3 algorithm uses the Max-Gain method of selecting the best attribute, as sketched below.
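A minimal recursive sketch of ID3 with Max-Gain selection, reusing the helpers above; the dictionary-based data layout (rows as dicts with a "label" key) and all names are my own assumptions, not Quinlan's original code:

```python
def split_by(rows, attr):
    """Group rows (dicts mapping attribute -> value, plus a 'label' key) by the value of attr."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row)
    return groups

def id3(rows, attributes):
    """Return a nested dict {attribute: {value: subtree-or-label}} built greedily by Max-Gain."""
    labels = [r["label"] for r in rows]
    # Stop when the node is pure or no attributes remain; predict the majority label.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    def gain(attr):
        children = [[r["label"] for r in group] for group in split_by(rows, attr).values()]
        return information_gain(labels, children)
    best = max(attributes, key=gain)                      # Max-Gain attribute selection
    remaining = [a for a in attributes if a != best]
    return {best: {value: id3(group, remaining)
                   for value, group in split_by(rows, best).items()}}
```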
COVID-19 Example
Training data:

Wearing Masks   Fever   Running Nose   COVID-19
     N            Y          Y            Y
     N            N          Y            Y
     Y            N          N            N
     Y            Y          Y            Y
     Y            N          Y            N

Candidate split on Wearing Masks:
• Root: (3/5, 2/5) Yes/No, H = 0.9710
• Wearing Masks = Y branch: (1/3, 2/3), H = 0.9183
• Wearing Masks = N branch: (2/2, 0), H = 0
• IG = 0.971 − 3/5 × 0.9183 − 2/5 × 0 = 0.4200
COVID-19 Example (Cont.)
Candidate split on Fever (same data):
• Root: (3/5, 2/5), H = 0.9710
• Fever = Y branch: (2/2, 0), H = 0
• Fever = N branch: (1/3, 2/3), H = 0.9183
• IG = 0.971 − 3/5 × 0.9183 − 2/5 × 0 = 0.4200
COVID-19 Example (Cont.)
Candidate split on Running Nose (same data):
• Root: (3/5, 2/5), H = 0.9710
• Running Nose = Y branch: (3/4, 1/4), H = 0.8113
• Running Nose = N branch: (0, 1/1), H = 0
• IG = 0.971 − 4/5 × 0.8113 = 0.3220
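For reference, the three IG values above can be reproduced with the earlier helpers; the encoding of the table below (abbreviated attribute names) is my own:

```python
data = [  # columns: Wearing Masks, Fever, Running Nose, COVID-19
    {"masks": "N", "fever": "Y", "nose": "Y", "label": "Y"},
    {"masks": "N", "fever": "N", "nose": "Y", "label": "Y"},
    {"masks": "Y", "fever": "N", "nose": "N", "label": "N"},
    {"masks": "Y", "fever": "Y", "nose": "Y", "label": "Y"},
    {"masks": "Y", "fever": "N", "nose": "Y", "label": "N"},
]
labels = [r["label"] for r in data]
for attr in ("masks", "fever", "nose"):
    children = [[r["label"] for r in g] for g in split_by(data, attr).values()]
    print(attr, round(information_gain(labels, children), 4))
# prints roughly: masks 0.42, fever 0.42, nose 0.3219
```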
COVID-19 Example (Pick Highest IG)
• Wearing Masks has the highest IG (tied with Fever at 0.4200, vs. 0.3220 for Running Nose), so it becomes the root.
• Root: (3/5, 2/5), H = 0.9710
• Wearing Masks = Y branch: (1/3, 2/3), H = 0.9183; Wearing Masks = N branch: (2/2, 0), H = 0
• IG = 0.971 − 3/5 × 0.9183 = 0.4200
COVID-19 Example (Expand Left Tree)
Expand the Wearing Masks = Y branch, which holds (1/3, 2/3), H = 0.9183:
• Split on Fever: Fever = Y gives (1/1, 0), H = 0; Fever = N gives (0, 2/2), H = 0; IG = 0.9183
• Split on Running Nose: Y gives (1/2, 1/2), H = 1; N gives (0, 1/1), H = 0; IG = 0.2516
• Fever has the higher IG, so it is chosen for this node.
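The same check can be run on the Wearing Masks = Y subset to reproduce the two IG values above, continuing the earlier sketch:

```python
subset = [r for r in data if r["masks"] == "Y"]
sub_labels = [r["label"] for r in subset]
for attr in ("fever", "nose"):
    children = [[r["label"] for r in g] for g in split_by(subset, attr).values()]
    print(attr, round(information_gain(sub_labels, children), 4))
# prints roughly: fever 0.9183, nose 0.2516
```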
COVID-19 Example (Expand Right Tree?)
• The other branch, Wearing Masks = N, is already pure: (2/2, 0), H = 0.
• No further split is needed there; it becomes a leaf.
COVID-19 Example (Decision Tree)
The final tree:
• Root: test Wearing Masks; (3/5, 2/5), H = 0.9710, IG = 0.971 − 3/5 × 0.9183 = 0.4200
• Wearing Masks = N: leaf; (2/2, 0), H = 0
• Wearing Masks = Y: test Fever; (1/3, 2/3), H = 0.9183, IG = 0.9183
  – Fever = Y: leaf; (1/1, 0), H = 0
  – Fever = N: leaf; (0, 2/2), H = 0
How to Use Decision Tree
• Fill in the answers at the leaf nodes:
  – Wearing Masks = N: Yes (100%)
  – Wearing Masks = Y, Fever = Y: Yes (100%)
  – Wearing Masks = Y, Fever = N: No (100%)
• Run a test sample from the root, <wearing masks, fever, running nose>:
  – <N, Y, Y> → Yes
  – <Y, Y, N> → Yes
  – <Y, N, Y> → No
• Not all attributes are used!
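With the nested-dict trees returned by the `id3` sketch earlier, running a test sample from the root might look like this (an illustration under my earlier assumptions; the test tuples are the ones on the slide):

```python
def predict(tree, example):
    """Follow attribute tests from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        attr = next(iter(tree))           # the attribute tested at this node
        tree = tree[attr][example[attr]]  # follow the branch matching this example's value
    return tree

tree = id3(data, ["masks", "fever", "nose"])
print(predict(tree, {"masks": "N", "fever": "Y", "nose": "Y"}))  # Y
print(predict(tree, {"masks": "Y", "fever": "Y", "nose": "N"}))  # Y
print(predict(tree, {"masks": "Y", "fever": "N", "nose": "Y"}))  # N
```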
What if IG is negative?
• If IG is zero or negative, the children's weighted entropy is no lower than their parent's.
• I.e., adding child nodes does not improve the classification.
• So stop growing the tree at that branch.
• This is one way of pruning the tree early.
Pruning Tree
• Decision trees may grow fast, which we don't like!
• They may overfit to noise, including incorrect attribute values or class membership.
• Large decision trees require lots of memory and may not be deployable on resource-limited devices.
• A decision tree may fail to capture the features in the training set.
• It is hard to tell whether a single extra node will increase accuracy, the so-called horizon effect.
• One way to prune trees is to set an IG threshold for keeping subtrees,
  – i.e., IG has to be greater than the threshold for the tree to grow at that node (see the sketch below).
• Another way is simply to limit the tree depth or set a maximum bin count.
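As a rough sketch of these pruning ideas (an IG threshold plus a depth limit), the earlier `id3` function could be extended as follows; the parameter names and default values are arbitrary choices of mine:

```python
def id3_pruned(rows, attributes, min_gain=0.1, max_depth=3, depth=0):
    """Like id3, but stop when the best IG is below min_gain or the depth limit is reached."""
    labels = [r["label"] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attributes or depth >= max_depth:
        return majority
    def gain(attr):
        children = [[r["label"] for r in g] for g in split_by(rows, attr).values()]
        return information_gain(labels, children)
    best = max(attributes, key=gain)
    if gain(best) <= min_gain:            # IG threshold: keep the subtree only if it helps enough
        return majority
    remaining = [a for a in attributes if a != best]
    return {best: {v: id3_pruned(g, remaining, min_gain, max_depth, depth + 1)
                   for v, g in split_by(rows, best).items()}}
```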
How About Numeric Attributes?
• In the COVID-19 example we only have Yes/No attributes; what if we also have a person's weight?
• We could sort the weights, take the average of each pair of adjacent values as a candidate threshold, calculate the (weighted children's) entropy of each split W < w_i, and pick the one with the lowest entropy, as sketched below.
• For ranked data (like a 1–4 rating for a question) or ordered categorical data (like low, medium, and high), we may simply encode the values as ordinals, calculate the entropy for each split R < r_i, and pick the one with the lowest entropy.
• For unordered categorical data, like red, green, and blue, we may enumerate all possible subsets and calculate their entropies, such as {C=red}, {C=green}, {C=blue}, {C=red, green}, {C=red, blue}, {C=green, blue}.
• Remember our goal is to split the data, so we don't consider split criteria that do not separate the data, like {C=red, green, blue}.
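A small sketch of the numeric-threshold procedure described above, reusing the earlier entropy helpers; the example weights (kg) and labels are made up purely for illustration:

```python
def best_numeric_threshold(values, labels):
    """Return (threshold, weighted child entropy) for the best split of the form W < w_i."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    vals = [values[i] for i in order]
    labs = [labels[i] for i in order]
    best = None
    for i in range(len(vals) - 1):
        if vals[i] == vals[i + 1]:
            continue
        threshold = (vals[i] + vals[i + 1]) / 2          # midpoint of adjacent sorted values
        left, right = labs[:i + 1], labs[i + 1:]
        h = (len(left) * entropy(class_proportions(left)) +
             len(right) * entropy(class_proportions(right))) / len(labs)
        if best is None or h < best[1]:
            best = (threshold, h)
    return best

# Hypothetical pet weights and "good pet" labels, for illustration only.
print(best_numeric_threshold([3, 5, 40, 90, 300], ["Y", "Y", "Y", "N", "N"]))  # (65.0, 0.0)
```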
