2.3 Decision-Tree-Algorithm


Decision Tree

Definition
• A decision tree is a classifier in the form of a tree structure with two types of nodes:
  – Decision node: specifies a choice or test of some attribute, with one branch for each outcome
  – Leaf node: indicates the classification of an example
• Decision tree learning is a recursive partitioning algorithm.
Decision Tree Example 1
Whether to approve a loan

Employed?
  No  → Credit Score?
          High → Approve
          Low  → Reject
  Yes → Income?
          High → Approve
          Low  → Reject
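Applying such a tree is just a cascade of attribute tests from the root down to a leaf. Below is a minimal sketch of the loan tree above; the field names (employed, credit_score, income) and the "High"/"Low" string encoding are illustrative assumptions, not part of the slides.

```python
# A minimal sketch of applying the loan-approval tree above.
# Field names and value encodings are illustrative assumptions.
def approve_loan(employed: bool, credit_score: str, income: str) -> str:
    if not employed:
        # "No" branch: fall back to the applicant's credit score.
        return "Approve" if credit_score == "High" else "Reject"
    else:
        # "Yes" branch: test income instead.
        return "Approve" if income == "High" else "Reject"

print(approve_loan(employed=True, credit_score="Low", income="High"))  # Approve
```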


Issues
• Given some training examples, what decision tree
should be generated?
• One proposal: prefer the smallest tree that is
consistent with the data (Bias)
– the tree with the least depth?
– the tree with the fewest nodes?
• Possible method:
– search the space of decision trees for the smallest decision
tree that fits the data
Example Data
Training Examples:

       Action  Author   Thread  Length  Where
  e1   skips   known    new     long    home
  e2   reads   unknown  new     short   work
  e3   skips   unknown  old     long    work
  e4   skips   known    old     long    home
  e5   reads   known    new     short   home
  e6   skips   known    old     long    work

New Examples:
  e7   ???     known    new     short   work
  e8   ???     unknown  new     short   work
Possible splits
Whole data set: 9 reads, 9 skips.

Split on length:
  long  → 0 reads, 7 skips
  short → 9 reads, 2 skips

Split on thread:
  new → 7 reads, 3 skips
  old → 2 reads, 6 skips
Two Example DTs
Decision Tree for PlayTennis
• Attributes and their values:
– Outlook: Sunny, Overcast, Rain
– Humidity: High, Normal
– Wind: Strong, Weak
– Temperature: Hot, Mild, Cool

• Target concept - Play Tennis: Yes, No


Decision Tree for PlayTennis

Outlook
  Sunny    → Humidity
               High   → No
               Normal → Yes
  Overcast → Yes
  Rain     → Wind
               Strong → No
               Weak   → Yes
Decision Tree for PlayTennis

Outlook
  Sunny    → Humidity (High → No, Normal → Yes)
  Overcast → Yes
  Rain     → Wind (Strong → No, Weak → Yes)

• Each internal node tests an attribute
• Each branch corresponds to an attribute value
• Each leaf node assigns a classification


Decision Tree for PlayTennis

Classifying the example (Outlook=Sunny, Temperature=Hot, Humidity=High, Wind=Weak):

Outlook
  Sunny    → Humidity (High → No, Normal → Yes)
  Overcast → Yes
  Rain     → Wind (Strong → No, Weak → Yes)

Outlook = Sunny → test Humidity; Humidity = High → PlayTennis = No
Decision Tree
Decision trees represent disjunctions of conjunctions:

Outlook
  Sunny    → Humidity (High → No, Normal → Yes)
  Overcast → Yes
  Rain     → Wind (Strong → No, Weak → Yes)

PlayTennis = Yes iff
  (Outlook = Sunny ∧ Humidity = Normal)
  ∨ (Outlook = Overcast)
  ∨ (Outlook = Rain ∧ Wind = Weak)
Searching for a good tree
• How should you go about building a decision tree?
• The space of decision trees is too big for systematic search, so build the tree greedily, top-down. At each node, either:

• Stop and
  – return a value for the target feature, or
  – return a distribution over target feature values

• Or choose a test (e.g. an input feature) to split on.
  – For each value of the test, build a subtree for those examples with this value for the test.
Top-Down Induction of Decision Trees (ID3)

1. A ← the “best” decision attribute for the next node
2. Assign A as the decision attribute for the node
3. For each value of A, create a new descendant
4. Sort the training examples to the leaf nodes according to the attribute value of the branch
5. If all training examples are perfectly classified (same value of the target attribute), stop; else iterate over the new leaf nodes.

Two questions: 1. Which attribute to test at each node?  2. When to stop?
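A compact sketch of this top-down loop in Python follows. It is a sketch only: the helper names (entropy, info_gain), the list-of-dicts data representation, and the nested-dict tree encoding are my own assumptions, not part of the slides.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return sum(-c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Expected reduction in entropy from splitting on attribute attr."""
    n = len(labels)
    gain = entropy(labels)
    for value in set(r[attr] for r in rows):
        subset = [lab for r, lab in zip(rows, labels) if r[attr] == value]
        gain -= len(subset) / n * entropy(subset)
    return gain

def id3(rows, labels, attrs):
    """rows: list of dicts mapping attribute -> value.
    Returns a nested dict {attr: {value: subtree-or-label}} or a class label."""
    if len(set(labels)) == 1:                 # all examples classified the same
        return labels[0]
    if not attrs:                             # no attributes left: majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, labels, a))      # step 1
    tree = {best: {}}                                                # step 2
    for value in set(r[best] for r in rows):                         # step 3
        keep = [i for i, r in enumerate(rows) if r[best] == value]   # step 4
        tree[best][value] = id3([rows[i] for i in keep],
                                [labels[i] for i in keep],
                                [a for a in attrs if a != best])     # step 5
    return tree
```

Run on the PlayTennis table given later with attrs = ["Outlook", "Temperature", "Humidity", "Wind"], this loop should reproduce the tree derived on the following slides.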
Choices
• When to stop
  – no more input features
  – all examples are classified the same
  – too few examples to make an informative split

• Which test to split on
  – the split that gives the smallest error
  – With multi-valued features:
    • split on all values, or
    • split the values into two subsets (a binary split)
Which Attribute is “best”?

S = [29+,35-]
  A1 = ?   True → [21+, 5-]    False → [8+, 30-]
  A2 = ?   True → [18+, 33-]   False → [11+, 2-]
Principled Criterion
• Selection of an attribute to test at each node -
choosing the most useful attribute for classifying
examples.
• information gain
– measures how well a given attribute separates the training
examples according to their target classification
– This measure is used to select among the candidate
attributes at each step while growing the tree
– Gain is a measure of how much we can reduce
uncertainty (its value lies between 0 and 1)
Entropy
• A measure for
– uncertainty
– purity
– information content
• Information theory: an optimal-length code assigns −log2(p) bits
to a message having probability p
• S is a sample of training examples
  – p+ is the proportion of positive examples in S
  – p− is the proportion of negative examples in S
• Entropy of S: the average optimal number of bits needed to encode
the class of an example drawn from S

  Entropy(S) = p+·(−log2 p+) + p−·(−log2 p−) = −p+ log2 p+ − p− log2 p−
Entropy

• The entropy is 0 if the outcome is “certain”.
• The entropy is maximum (1 bit for two classes) if we have no knowledge
of the system, i.e. every outcome is equally likely.

• S is a sample of training examples
  – p+ is the proportion of positive examples
  – p− is the proportion of negative examples
• Entropy measures the impurity of S:

  Entropy(S) = −p+ log2 p+ − p− log2 p−
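A short sketch of this definition; the function name entropy2 is my own, and the convention 0·log2(0) = 0 is the one noted later on the slides.

```python
import math

def entropy2(p_pos, p_neg):
    """Two-class entropy from the class proportions; 0*log2(0) is taken as 0."""
    return sum(-p * math.log2(p) for p in (p_pos, p_neg) if p > 0)

print(entropy2(0.5, 0.5))    # 1.0   : maximum uncertainty, both outcomes equally likely
print(entropy2(1.0, 0.0))    # 0     : the outcome is certain
print(entropy2(9/14, 5/14))  # ~0.940: the PlayTennis sample used later
```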
Information Gain
Gain(S,A): expected reduction in entropy due to partitioning S on attribute A

  Gain(S,A) = Entropy(S) − Σ_{v ∈ Values(A)} (|Sv| / |S|) · Entropy(Sv)

  Entropy([29+,35-]) = −29/64 log2(29/64) − 35/64 log2(35/64) = 0.99

S = [29+,35-]
  A1 = ?   True → [21+, 5-]    False → [8+, 30-]
  A2 = ?   True → [18+, 33-]   False → [11+, 2-]


Information Gain
  Entropy([21+,5-]) = 0.71     Entropy([18+,33-]) = 0.94
  Entropy([8+,30-]) = 0.74     Entropy([11+,2-]) = 0.62

  Gain(S,A1) = Entropy(S) − (26/64)·Entropy([21+,5-]) − (38/64)·Entropy([8+,30-]) = 0.27
  Gain(S,A2) = Entropy(S) − (51/64)·Entropy([18+,33-]) − (13/64)·Entropy([11+,2-]) = 0.12

S = [29+,35-]
  A1 = ?   True → [21+, 5-]    False → [8+, 30-]
  A2 = ?   True → [18+, 33-]   False → [11+, 2-]

So A1 is the better split.
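These numbers can be checked with a small sketch working directly from the positive/negative counts; the helper names H and gain are my own assumptions.

```python
import math

def H(pos, neg):
    """Two-class entropy from raw counts, with 0*log2(0) = 0."""
    n = pos + neg
    return sum(-c / n * math.log2(c / n) for c in (pos, neg) if c)

def gain(parent, children):
    """Gain(S,A) = Entropy(S) - sum over child nodes of |Sv|/|S| * Entropy(Sv)."""
    n = sum(parent)
    return H(*parent) - sum((p + q) / n * H(p, q) for p, q in children)

print(gain((29, 35), [(21, 5), (8, 30)]))   # Gain(S,A1) ~ 0.27
print(gain((29, 35), [(18, 33), (11, 2)]))  # Gain(S,A2) ~ 0.12
```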
Training Example 1
Day Outlook Temp Humidity Wind Tennis?
D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rain Mild High Weak Yes
D5 Rain Cool Normal Weak Yes
D6 Rain Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rain Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rain Mild High Strong No
Selecting the Next Attribute
S = [9+,5-], Entropy(S) = 0.940

Split on Humidity:
  High   → [3+, 4-], E = 0.985
  Normal → [6+, 1-], E = 0.592
Split on Wind:
  Weak   → [6+, 2-], E = 0.811
  Strong → [3+, 3-], E = 1.0

  Gain(S, Humidity) = 0.940 − (7/14)·0.985 − (7/14)·0.592 = 0.151
  Gain(S, Wind)     = 0.940 − (8/14)·0.811 − (6/14)·1.0   = 0.048
Humidity provides greater info. gain than Wind, w.r.t target classification.
Selecting the Next Attribute
S = [9+,5-], Entropy(S) = 0.940

Split on Outlook:
  Sunny    → [2+, 3-], E = 0.971
  Overcast → [4+, 0-], E = 0.0
  Rain     → [3+, 2-], E = 0.971

  Gain(S, Outlook) = 0.940 − (5/14)·0.971 − (4/14)·0.0 − (5/14)·0.971 = 0.247
Selecting the Next Attribute
The information gain values for the 4 attributes are:
• Gain(S,Outlook) =0.247
• Gain(S,Humidity) =0.151
• Gain(S,Wind) =0.048
• Gain(S,Temperature) =0.029

where S denotes the collection of training examples

Note: 0·log2 0 = 0
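These four values can be reproduced directly from the table; the tuple encoding of the table and the helper names below are my own, not part of the slides.

```python
import math
from collections import Counter

# PlayTennis table D1..D14: (Outlook, Temperature, Humidity, Wind, PlayTennis)
DATA = [
    ("Sunny","Hot","High","Weak","No"),          ("Sunny","Hot","High","Strong","No"),
    ("Overcast","Hot","High","Weak","Yes"),      ("Rain","Mild","High","Weak","Yes"),
    ("Rain","Cool","Normal","Weak","Yes"),       ("Rain","Cool","Normal","Strong","No"),
    ("Overcast","Cool","Normal","Strong","Yes"), ("Sunny","Mild","High","Weak","No"),
    ("Sunny","Cool","Normal","Weak","Yes"),      ("Rain","Mild","Normal","Weak","Yes"),
    ("Sunny","Mild","Normal","Strong","Yes"),    ("Overcast","Mild","High","Strong","Yes"),
    ("Overcast","Hot","Normal","Weak","Yes"),    ("Rain","Mild","High","Strong","No"),
]

def entropy(labels):
    n = len(labels)
    return sum(-c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, col):
    labels = [r[-1] for r in rows]
    g = entropy(labels)
    for v in set(r[col] for r in rows):
        sub = [r[-1] for r in rows if r[col] == v]
        g -= len(sub) / len(rows) * entropy(sub)
    return g

for name, col in [("Outlook", 0), ("Temperature", 1), ("Humidity", 2), ("Wind", 3)]:
    print(f"Gain(S,{name}) = {gain(DATA, col):.3f}")
# -> 0.247, 0.029, 0.152 (the slide rounds Humidity to 0.151), 0.048
```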
ID3 Algorithm
[D1,D2,…,D14]  [9+,5-]

Outlook
  Sunny    → Ssunny = [D1,D2,D8,D9,D11]   [2+,3-]  → ?  (needs a further test)
  Overcast → [D3,D7,D12,D13]              [4+,0-]  → Yes
  Rain     → [D4,D5,D6,D10,D14]           [3+,2-]  → ?

Which test for the Sunny node?
  Gain(Ssunny, Humidity) = 0.970 − (3/5)·0.0 − (2/5)·0.0             = 0.970
  Gain(Ssunny, Temp.)    = 0.970 − (2/5)·0.0 − (2/5)·1.0 − (1/5)·0.0 = 0.570
  Gain(Ssunny, Wind)     = 0.970 − (2/5)·1.0 − (3/5)·0.918           = 0.019
ID3 Algorithm
Outlook
  Sunny    → Humidity
               High   → No   [D1,D2,D8]
               Normal → Yes  [D9,D11]
  Overcast → Yes  [D3,D7,D12,D13]
  Rain     → Wind
               Strong → No   [D6,D14]
               Weak   → Yes  [D4,D5,D10]
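Once the tree is built, classifying a new example is a walk from the root to a leaf. A sketch using the nested-dict encoding from the id3() sketch earlier (the encoding and names are my assumptions):

```python
# The finished PlayTennis tree, written as the structure id3() above would
# produce: attribute -> {value: subtree or class label}.
TREE = {"Outlook": {
    "Sunny":    {"Humidity": {"High": "No",   "Normal": "Yes"}},
    "Overcast": "Yes",
    "Rain":     {"Wind":     {"Strong": "No", "Weak":   "Yes"}},
}}

def classify(tree, example):
    """Follow the example's attribute values from the root down to a leaf."""
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr][example[attr]]
    return tree

print(classify(TREE, {"Outlook": "Sunny", "Temperature": "Hot",
                      "Humidity": "High", "Wind": "Weak"}))   # -> No
```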


Triangles and Squares: Example 2

   #   Color    Outline  Dot   Shape
   1   green    dashed   no    triangle
   2   green    dashed   yes   triangle
   3   yellow   dashed   no    square
   4   red      dashed   no    square
   5   red      solid    no    square
   6   red      solid    yes   triangle
   7   green    solid    no    square
   8   green    dashed   no    triangle
   9   yellow   solid    yes   square
  10   red      solid    no    square
  11   green    solid    yes   square
  12   yellow   dashed   yes   square
  13   yellow   solid    no    square
  14   red      dashed   yes   triangle
Triangles and Squares
Data set: a set of classified objects (the 14 examples above, each labelled
triangle or square).
Entropy
• 5 triangles, 9 squares
• class probabilities:  p(triangle) = 5/14,  p(square) = 9/14
• entropy:  E = −(5/14)·log2(5/14) − (9/14)·log2(9/14) ≈ 0.940 bits
Entropy reduction by data set partitioning

Splitting on Color partitions the data set into three subsets:
  red    → 2 triangles, 3 squares
  green  → 3 triangles, 2 squares
  yellow → 0 triangles, 4 squares
Each subset is purer (has lower entropy) than the whole data set.
Information Gain

Gain(S, Color) = Entropy(S) − Σ_{v ∈ {red, green, yellow}} (|Sv| / |S|) · Entropy(Sv)
(the weighted average entropy of the three Color subsets, subtracted from the
entropy of the whole data set)
Information Gain of the Attributes
• Attributes
  – Gain(Color)   = 0.246
  – Gain(Outline) = 0.151
  – Gain(Dot)     = 0.048
• Heuristic: the attribute with the highest gain is chosen (here, Color)
• This heuristic is local (local minimization of impurity)
After splitting on Color, the algorithm recurses on the impure subsets:

Gains within the red subset (2 triangles, 3 squares; entropy 0.971):
  Gain(Outline) = 0.971 − 0.951 = 0.020 bits
  Gain(Dot)     = 0.971 − 0     = 0.971 bits

Gains within the green subset (3 triangles, 2 squares; entropy 0.971):
  Gain(Outline) = 0.971 − 0     = 0.971 bits
  Gain(Dot)     = 0.971 − 0.951 = 0.020 bits

So the red branch is split on Dot, the green branch on Outline, and the yellow
branch is already pure (all squares).
Decision Tree

Color
  red    → Dot?
             yes → triangle
             no  → square
  green  → Outline?
             dashed → triangle
             solid  → square
  yellow → square


Splitting Rule: GINI Index
• GINI index: a measure of node impurity

  GINI(Node) = 1 − Σ_{c ∈ classes} [p(c)]²

  GINIsplit(A) = Σ_{v ∈ Values(A)} (|Sv| / |S|) · GINI(Nv)
Splitting Based on Continuous Attributes

(i) Binary split:     Taxable Income > 80K?  → Yes / No
(ii) Multi-way split: Taxable Income?        → < 10K, [10K,25K), [25K,50K), [50K,80K), > 80K


Continuous Attribute – Binary Split
• For a continuous attribute A:
  – Partition the continuous values of A into a discrete set of intervals
  – Create a new boolean attribute Ac by looking for a threshold c:

      Ac = true   if A ≥ c
      Ac = false  otherwise

• How to choose c?
  – Consider all possible splits and pick the best cut (e.g. the one that
    minimizes the weighted entropy, as sketched below).
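A brute-force sketch of this threshold search. The slide only says "best cut"; the weighted-entropy criterion, the function names, and the toy numbers are my own assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return sum(-c / n * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Try every candidate cut c (midpoints between consecutive distinct values)
    and keep the one whose binary split A >= c gives the lowest weighted entropy."""
    pairs = sorted(zip(values, labels))
    best_c, best_score = None, float("inf")
    for (v1, _), (v2, _) in zip(pairs, pairs[1:]):
        if v1 == v2:
            continue
        c = (v1 + v2) / 2
        left  = [lab for v, lab in pairs if v <  c]
        right = [lab for v, lab in pairs if v >= c]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_c, best_score = c, score
    return best_c

# Toy, made-up incomes (in K) and labels, purely for illustration:
print(best_threshold([60, 75, 85, 90, 95, 110, 120],
                     ["No", "No", "Yes", "Yes", "Yes", "No", "No"]))  # -> 80.0
```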
Strengths of decision tree methods
• Ability to generate understandable rules
• Ease of calculation at classification time
• Ability to handle both continuous and categorical variables
• Ability to clearly indicate the best attributes


The weaknesses of decision tree methods
• Greedy algorithm: no global optimization

• Error-prone with too many classes: the number of training examples per node
  shrinks quickly in a tree with many levels/branches.

• Expensive to train: sorting, combinations of attributes, calculating quality
  measures, etc.

• Trouble with non-rectangular regions: the rectangular classification boxes
  may not correspond well with the actual distribution of records in the
  decision space.
Practical Issues of Classification
• Underfitting and Overfitting

• Missing Values

• Costs of Classification
