Chapter ML:III
III. Decision Trees
❑ Decision Trees Basics
❑ Impurity Functions
❑ Decision Tree Algorithms
❑ Decision Tree Pruning
Decision Tree Algorithms
ID3 Algorithm
[Quinlan 1986] [CART Algorithm]
Characterization of the model (model world)
[ML Introduction]:
X is a set of feature vectors, also called feature space.
C is a set of classes.
c : X → C is the ideal classifier for X.
D = {(x_1, c(x_1)), ..., (x_n, c(x_n))} ⊆ X × C is a set of examples.
Task: Based on D, construction of a decision tree T to approximate c.
Characteristics of the ID3 algorithm:
1. Each splitting is based on one nominal feature and considers its complete
domain. Splitting based on feature A with domain {a_1, ..., a_k}:
X = {x ∈ X : x|A = a_1} ∪ ... ∪ {x ∈ X : x|A = a_k}
2. Splitting criterion is the information gain.
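As an illustration of characteristic 1 (not part of the original slides; the data layout — a list of (x, c) pairs with x a dict of feature values — is an assumption used throughout the sketches in this section), a complete-domain split can be written as follows. The splitting criterion, the information gain, is sketched together with the full algorithm after the pseudo code below.

from collections import defaultdict

def split_by_feature(D, A):
    """Partition the example set D over the complete domain of the nominal feature A.
    Each block corresponds to one set {x in X : x|A = a}."""
    parts = defaultdict(list)
    for x, c in D:
        parts[x[A]].append((x, c))
    return dict(parts)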
Decision Tree Algorithms
ID3 Algorithm
[Mitchell 1997] [algorithm template]
ID3(D, Attributes, Target)
❑ Create a node t for the tree.
❑ Label t with the most common value of Target in D.
❑ If all examples in D are positive, return the single-node tree t, with label +.
❑ If all examples in D are negative, return the single-node tree t, with label −.
❑ If Attributes is empty, return the single-node tree t.
❑ Otherwise:
  ❑ Let A* be the attribute from Attributes that best classifies examples in D.
  ❑ Assign t the decision attribute A*.
  ❑ For each possible value a in A* do:
    ❑ Add a new tree branch below t, corresponding to the test A* = a.
    ❑ Let D_a be the subset of D that has value a for A*.
    ❑ If D_a is empty:
      Then add a leaf node with label of the most common value of Target in D.
      Else add the subtree ID3(D_a, Attributes \ {A*}, Target).
❑ Return t.
Decision Tree Algorithms
ID3 Algorithm (pseudo code)
[algorithm template]
ID3(D, Attributes, Target)
1. t = createNode()
2. label(t) = mostCommonClass(D, Target)
3. IF ∀⟨x, c(x)⟩ ∈ D : c(x) = c THEN return(t) ENDIF
4. IF Attributes = ∅ THEN return(t) ENDIF
5. A* = argmax_{A ∈ Attributes} (informationGain(D, A))
6. FOREACH a ∈ A* DO
     D_a = {(x, c(x)) ∈ D : x|A* = a}
     IF D_a = ∅ THEN
       t′ = createNode()
       label(t′) = mostCommonClass(D, Target)
       createEdge(t, a, t′)
     ELSE
       createEdge(t, a, ID3(D_a, Attributes \ {A*}, Target))
     ENDIF
   ENDDO
7. return(t)
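The pseudo code translates almost line by line into Python. The following sketch is an illustration only (the function names and the (x, c)-pair data layout are assumptions, not part of the slides); domains maps each attribute to its complete set of possible values, as required by Step 6.

import math
from collections import Counter

def entropy(labels):
    """Entropy (in bits) of the class distribution in a list of labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def information_gain(D, A):
    """Information gain of splitting the example set D on attribute A."""
    gain = entropy([c for _, c in D])
    for a in {x[A] for x, _ in D}:
        D_a = [c for x, c in D if x[A] == a]
        gain -= len(D_a) / len(D) * entropy(D_a)
    return gain

def id3(D, attributes, domains):
    """D: list of (x, c) pairs, x a dict of attribute values; attributes: set of
    attribute names still available; domains: attribute name -> set of values."""
    label = Counter(c for _, c in D).most_common(1)[0][0]          # Step 2
    if len({c for _, c in D}) == 1 or not attributes:              # Steps 3 and 4
        return label                                               # leaf node
    A = max(attributes, key=lambda B: information_gain(D, B))      # Step 5
    branches = {}
    for a in domains[A]:                                           # Step 6
        D_a = [(x, c) for x, c in D if x[A] == a]
        branches[a] = id3(D_a, attributes - {A}, domains) if D_a else label
    return {"attribute": A, "branches": branches}                  # Step 7

A leaf is represented here by its class label, an inner node by a dict holding the tested attribute and one branch per value of its domain.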
Remarks:
❑ Target designates the feature (= attribute) that is comprised of the labels according to
  which an example can be classified. Within Mitchell's algorithm the respective class labels
  are + and −, modeling the binary classification situation. In the pseudo code version,
  Target may be comprised of multiple (more than two) classes.
❑ Step 3 of the ID3 algorithm checks the purity of D and, in this case, assigns the
  unique class c, c ∈ dom(Target), as label to the respective node.
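For illustration (data layout as in the sketches above, not prescribed by the slides), the majority labeling of Step 2 and the purity check of Step 3 reduce to two short helpers:

from collections import Counter

def most_common_class(D):
    """Majority class label in the example set D (Step 2); D is a list of (x, c) pairs."""
    return Counter(c for _, c in D).most_common(1)[0][0]

def is_pure(D):
    """True iff all examples in D carry the same class label (purity check of Step 3)."""
    return len({c for _, c in D}) == 1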
Decision Tree Algorithms
ID3 Algorithm: Example
Example set D for mushrooms, implicitly defining a feature space X over the three
dimensions color, size, and points:
      Color   Size    Points   Eatability
  1   red     small   yes      toxic
  2   brown   small   no       eatable
  3   brown   large   yes      eatable
  4   green   small   no       eatable
  5   red     large   no       eatable
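Written out in the data layout assumed by the sketches in this section (a choice for illustration, not prescribed by the slides), the example set reads:

D = [
    ({"color": "red",   "size": "small", "points": "yes"}, "toxic"),    # 1
    ({"color": "brown", "size": "small", "points": "no"},  "eatable"),  # 2
    ({"color": "brown", "size": "large", "points": "yes"}, "eatable"),  # 3
    ({"color": "green", "size": "small", "points": "no"},  "eatable"),  # 4
    ({"color": "red",   "size": "large", "points": "no"},  "eatable"),  # 5
]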
Decision Tree Algorithms
ID3 Algorithm: Example
(continued)
Top-level call of ID3. Analyze a splitting with regard to the feature color:

  D|color   toxic   eatable
  red         1        1
  brown       0        2
  green       0        1

|D_red| = 2, |D_brown| = 2, |D_green| = 1

Estimated a-priori probabilities:

  p_red = 2/5 = 0.4,   p_brown = 2/5 = 0.4,   p_green = 1/5 = 0.2
Conditional entropy values for all attributes (with the convention 0 · log2(0) := 0):

  H(C | color)  = −(0.4 · (1/2 · log2(1/2) + 1/2 · log2(1/2)) +
                    0.4 · (0/2 · log2(0/2) + 2/2 · log2(2/2)) +
                    0.2 · (0/1 · log2(0/1) + 1/1 · log2(1/1)))  = 0.4

  H(C | size)   ≈ 0.55

  H(C | points) = 0.4
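These values can be checked numerically; a small sketch, assuming the example set D as written out after the table above:

import math
from collections import Counter

def conditional_entropy(D, A):
    """H(C | A) for an example set D of (x, c) pairs, x a dict of feature values."""
    h = 0.0
    for a in {x[A] for x, _ in D}:
        D_a = [c for x, c in D if x[A] == a]
        counts = Counter(D_a).values()
        h -= len(D_a) / len(D) * sum(k / len(D_a) * math.log2(k / len(D_a)) for k in counts)
    return h

# conditional_entropy(D, "color")   -> 0.4
# conditional_entropy(D, "size")    -> 0.5509...  (≈ 0.55)
# conditional_entropy(D, "points")  -> 0.4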
Remarks:
❑ The smaller H(C | feature) is, the larger the information gain becomes. Hence, the
  difference H(C) − H(C | feature) need not be computed, since H(C) is constant within
  each recursion step.
❑ In the example, the information gain in the first recursion step is maximal for the two
  features color and points.
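A quick side calculation (added here for illustration, using the values from the slides above): H(C) = −(1/5 · log2(1/5) + 4/5 · log2(4/5)) ≈ 0.72, so the information gain is about 0.72 − 0.40 = 0.32 for color and points, but only about 0.72 − 0.55 = 0.17 for size.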
Decision Tree Algorithms
ID3 Algorithm: Example
(continued)
Decision tree before the first recursion step:
[Figure: the root of the tree tests the attribute points. The branch points = yes holds the
examples {(red, small, yes, toxic), (brown, large, yes, eatable)}; the branch points = no holds
the examples {(brown, small, no, eatable), (green, small, no, eatable), (red, large, no, eatable)}.]
The feature points was chosen in Step 5 of the ID3 algorithm.
Decision Tree Algorithms
ID3 Algorithm: Example
(continued)
Decision tree before the second recursion step:
[Figure: the root tests the attribute points. Under points = yes, the attribute color is tested:
color = red leads to the example (red, small, yes, toxic), color = brown to the example
(brown, large, yes, eatable), and color = green to an empty example set (-/-). Under
points = no, the examples (brown, small, no, eatable), (green, small, no, eatable), and
(red, large, no, eatable) remain.]
The feature color was chosen in Step 5 of the ID3 algorithm.
Decision Tree Algorithms
ID3 Algorithm: Example
(continued)
Final decision tree after the second recursion step:
[Figure: the root tests the attribute points. points = no leads to a leaf labeled eatable;
points = yes leads to a test of the attribute color with leaves labeled toxic (red),
toxic (green), and eatable (brown).]
Breaking of a tie: the class toxic is chosen for D_green in Step 6 of the ID3 algorithm.
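Written in the nested-dict representation used by the earlier ID3 sketch (an illustration, not part of the slides), the final tree can be applied to unseen feature vectors as follows:

tree = {"attribute": "points",
        "branches": {"no":  "eatable",
                     "yes": {"attribute": "color",
                             "branches": {"red":   "toxic",
                                          "green": "toxic",
                                          "brown": "eatable"}}}}

def classify(tree, x):
    """Follow the branches of the decision tree for the feature vector x (a dict)."""
    while isinstance(tree, dict):
        tree = tree["branches"][x[tree["attribute"]]]
    return tree

# classify(tree, {"color": "red", "size": "small", "points": "yes"})  -> "toxic"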
Decision Tree Algorithms
ID3 Algorithm: Hypothesis Space
[Figure: illustration of the hypothesis space searched by ID3 — a sequence of increasingly
refined partial decision trees over attributes A1, A2, A3, A4 with class labels +, −, o;
further alternatives are indicated by ellipses.]
Decision Tree Algorithms
ID3 Algorithm: Inductive Bias
Inductive bias is the rigidity in applying the (little bit of) knowledge learned from a
training set to the classification of unseen feature vectors.
Observations:
❑ Decision tree search happens in the space of all hypotheses.
  The target concept is a member of the hypothesis space.
❑ To generate a decision tree, the ID3 algorithm needs at most as many decisions per
  branch as there are features.
  → no backtracking takes place
  → local optimization of decision trees
Where the inductive bias of the ID3 algorithm becomes manifest:
❑ Small decision trees are preferred.
❑ Highly discriminative features tend to be closer to the root.

Is this justified?
Remarks:
❑ Let dom(A_j) denote the finite domain (the possible values) of feature A_j, j = 1, ..., p, and
  let C be a set of classes. Then a hypothesis space H that is comprised of all decision trees
  corresponds to the set of all functions h, h : dom(A_1) × ... × dom(A_p) → C. Typically, C = {0, 1}.
❑ The inductive bias of the ID3 algorithm is of a different kind than the inductive bias of the
candidate elimination algorithm (version space algorithm):
1. The underlying hypothesis space H of the candidate elimination algorithm is
incomplete. H corresponds to a coarsened view onto the space of all hypotheses since
H contains only conjunctions of attribute-value pairs as hypotheses. However, this
restricted hypothesis space is searched completely by the candidate elimination
algorithm. Keyword: restriction bias
2. The underlying hypothesis space H of the ID3 algorithm is complete. H corresponds to
the set of all discrete functions (from the Cartesian product of the feature domains onto
the set of classes) that can be represented in the form of a decision tree. However, this
complete hypothesis space is searched incompletely (following a preference).
Keyword: preference bias or search bias
❑ The inductive bias of the ID3 algorithm renders the algorithm robust with respect to noise.
Decision Tree Algorithms
CART Algorithm
[Breiman 1984] [ID3 Algorithm]
Characterization of the model (model world)
[ML Introduction]:
X is a set of feature vectors, also called feature space. No restrictions are
presumed for the measurement scales of the features.
C is a set of classes.
c : X → C is the ideal classifier for X.
D = {(x_1, c(x_1)), ..., (x_n, c(x_n))} ⊆ X × C is a set of examples.
Task: Based on D, construction of a decision tree T to approximate c.
Characteristics of the CART algorithm:
1. Each splitting is binary and considers one feature at a time.
2. Splitting criterion is the information gain or the Gini index.
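As a reminder of one common formulation of the Gini index, ι_Gini = 1 − Σ_c p_c², a minimal Python sketch (data layout as in the earlier sketches; this is an illustration, not the slides' definition):

from collections import Counter

def gini(labels):
    """Gini index 1 - sum_c p_c^2 of the class distribution in a list of labels."""
    n = len(labels)
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())

# gini(["toxic", "eatable", "eatable", "eatable", "eatable"])  -> 0.32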
Decision Tree Algorithms
CART Algorithm
(continued)
1. Let A be a feature with domain dom(A). Ensure a finite number of binary splittings
   for X by applying the following domain partitioning rules:
   – If A is nominal, choose A′ ⊂ dom(A) such that 0 < |A′| ≤ |dom(A) \ A′|.
   – If A is ordinal, choose a ∈ dom(A) such that x_min < a < x_max, where x_min, x_max
     are the minimum and maximum values of feature A in D.
   – If A is numeric, choose a ∈ dom(A) such that a = (x_k + x_l)/2, where x_k, x_l are
     consecutive elements in the ordered value list of feature A in D.
2. For a node t of a decision tree generate all splittings of the above type.

3. Choose a splitting from the set of splittings that maximizes the impurity
   reduction Δι:

   Δι(D(t), {D(t_L), D(t_R)}) = ι(t) − |D_L|/|D| · ι(t_L) − |D_R|/|D| · ι(t_R),

   where t_L and t_R denote the left and right successor of t.
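A sketch of steps 2 and 3 for a single numeric feature, combining the midpoint rule from step 1 with the impurity reduction Δι and the Gini index as impurity measure ι (function names and data layout are assumptions for illustration):

from collections import Counter

def gini(labels):
    """Gini index of the class distribution in a list of labels."""
    n = len(labels)
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())

def best_numeric_split(D, A):
    """Return (impurity reduction, threshold) of the best binary split of D on the
    numeric feature A, or None if A takes only one value in D."""
    values = sorted({x[A] for x, _ in D})
    candidates = [(u + v) / 2 for u, v in zip(values, values[1:])]  # midpoints of consecutive values
    labels = [c for _, c in D]
    best = None
    for a in candidates:
        left  = [c for x, c in D if x[A] <= a]                      # D(t_L)
        right = [c for x, c in D if x[A] > a]                       # D(t_R)
        delta = (gini(labels) - len(left) / len(D) * gini(left)
                              - len(right) / len(D) * gini(right))  # impurity reduction
        if best is None or delta > best[0]:
            best = (delta, a)
    return best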
Decision Tree Algorithms
CART Algorithm
(continued)
Illustration for two numeric features; i.e., the feature space X corresponds to a
two-dimensional plane:
[Figure: a binary decision tree with nodes t_1, ..., t_9 on the left and the corresponding
recursive partitioning of the two-dimensional feature space X = X(t_1) into regions
X(t_2), ..., X(t_9), labeled with classes c_1, c_2, c_3, on the right.]
By a sequence of splittings the feature space X is partitioned into rectangles whose sides
are parallel to the two axes.