
Lecture 4


Introduction to Machine Learning

LECTURE 4: DECISION TREES

A simple example

Other solutions for the same data

Decision tree: the principle

An ordered set of tests


Each test-result points to another test or to a leaf labeled with a class
Induction of decision trees aims at perfect classification of the training data
But many different decision trees can classify the same data equally well; how do we choose the best one?
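
The "ordered set of tests" view can be sketched in a few lines of Python. This is only an illustration: the nested-dictionary layout, the attribute names (`shape`, `size`), and the `classify` helper are assumptions made for this example and do not come from the lecture.

```python
# Hypothetical toy tree: an internal node is a dict holding the attribute to
# test and one branch per test result; a leaf is simply a class label.
tree = {
    "attribute": "shape",
    "branches": {
        "circle": "pos",
        "square": {
            "attribute": "size",
            "branches": {"small": "neg", "big": "pos"},
        },
    },
}


def classify(node, example):
    """Follow test results from the root until a leaf (class label) is reached."""
    while isinstance(node, dict):
        value = example[node["attribute"]]  # result of the current test
        node = node["branches"][value]      # points to another test or a leaf
    return node


label = classify(tree, {"shape": "square", "size": "big"})
```

Note that a circle is classified without ever looking at `size`: only the attributes along the path actually followed need to be known.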

Small trees better than big trees

Easier to interpret
Lower danger of training-data overfitting
Tendency to eliminate irrelevant and redundant attributes
Cheaper classification when attribute values are expensive or difficult to obtain

Induction of small decision trees

An attribute’s information content in a 2-class domain
Information content of the message “example x is pos”:
◦ If most examples are positive, the message is unsurprising, so its information content is low
◦ If most examples are negative, the message is surprising, so its information content is high

The formula to quantify information content:
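
A standard way to quantify this is the negative base-2 logarithm of the message's probability. The symbols below are assumed notation, not taken from the slide: p_pos and p_neg are the relative frequencies of positive and negative examples in the training set, and I is the information content measured in bits.

```latex
I(\text{pos}) = -\log_2 p_{\text{pos}}, \qquad I(\text{neg}) = -\log_2 p_{\text{neg}}
```

For instance, with p_pos = 0.5 the message "example x is pos" carries 1 bit, while with p_pos = 0.25 it carries 2 bits.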

Entropy: average information content

Repeat the same for a series of examples.


For each example, two different messages are possible:
◦ “example is positive”
◦ “example is negative”

Average information content of all these messages:
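
Under the usual definition, this average is the entropy H(T) = -p_pos * log2(p_pos) - p_neg * log2(p_neg), where each message's information content is weighted by how often the message occurs. A minimal Python sketch follows; the function name `entropy` and the argument `p_pos` are chosen for this example only.

```python
import math


def entropy(p_pos):
    """Entropy (in bits) of a two-class set in which a fraction p_pos is positive."""
    total = 0.0
    for p in (p_pos, 1.0 - p_pos):
        if p > 0:  # by convention, 0 * log2(0) is treated as 0
            total -= p * math.log2(p)
    return total


balanced = entropy(0.5)  # classes equally frequent: maximum uncertainty
pure = entropy(1.0)      # all examples positive: no uncertainty
```

Entropy is highest when the two classes are equally frequent and drops to zero when all examples share one class.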

Amount of information contributed by an attribute

Attribute at divides the training set 𝑇 into subsets 𝑇𝑖, each defined by one value of at

Let 𝑃𝑖 = |𝑇𝑖| / |𝑇|

Weighted average of the entropies of the subsets:

Information gain:
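
In the usual formulation, the weighted average is H(T, at) = sum over i of 𝑃𝑖 * H(𝑇𝑖), and the information gain is I(T, at) = H(T) - H(T, at). The symbols H and I are assumed notation here. The sketch below computes both on a small made-up data set and then picks the attribute with the highest gain; the data, attribute names, and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(examples, labels, attribute):
    """H(T) minus the weighted average entropy of the subsets induced by attribute."""
    subsets = defaultdict(list)
    for example, label in zip(examples, labels):
        subsets[example[attribute]].append(label)  # one subset T_i per value
    weighted = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted


# Hypothetical toy data: shape determines the class, color is irrelevant.
examples = [
    {"shape": "circle", "color": "red"},
    {"shape": "circle", "color": "blue"},
    {"shape": "square", "color": "red"},
    {"shape": "square", "color": "blue"},
]
labels = ["pos", "pos", "neg", "neg"]

gains = {at: information_gain(examples, labels, at) for at in ("shape", "color")}
best = max(gains, key=gains.get)  # attribute contributing the most information
```

Here `shape` splits the data into pure subsets and so receives the full 1 bit of gain, whereas `color` leaves each subset as mixed as the original set and receives none.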

Searching for the best attribute

Find the best attribute (example)

Find the best attribute (cont.)

Binary splits of numeric attributes (cont.)
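
One common way to split a numeric attribute in two, which may differ in detail from the procedure on the slide, is to sort the examples by the attribute's value, take the midpoint between each pair of adjacent distinct values as a candidate threshold, and keep the threshold with the highest information gain. The sketch below assumes that approach; `best_threshold` and the sample values are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the best binary split 'value < threshold'."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # equal values cannot be separated by a threshold
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best[1]:
            best = ((lo + hi) / 2, gain)  # midpoint between neighbouring values
    return best


threshold, gain = best_threshold([1.0, 2.0, 3.0, 4.0], ["neg", "neg", "pos", "pos"])
```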

A decision tree is a set of rules
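
Each path from the root to a leaf can be read as one if-then rule: the tests along the path form the condition and the leaf's label is the conclusion. The sketch below illustrates this on a hypothetical toy tree; the nested-dictionary layout and the helper `tree_to_rules` are assumptions for this example.

```python
# Hypothetical toy tree: internal nodes are dicts, leaves are class labels.
tree = {
    "attribute": "shape",
    "branches": {
        "circle": "pos",
        "square": {
            "attribute": "size",
            "branches": {"small": "neg", "big": "pos"},
        },
    },
}


def tree_to_rules(node, conditions=()):
    """Return one (conditions, label) pair per root-to-leaf path."""
    if not isinstance(node, dict):  # leaf: the path so far is a complete rule
        return [(list(conditions), node)]
    rules = []
    for value, child in node["branches"].items():
        rules += tree_to_rules(child, conditions + ((node["attribute"], value),))
    return rules


rules = tree_to_rules(tree)
for conditions, label in rules:
    tests = " and ".join(f"{attribute} = {value}" for attribute, value in conditions)
    print(f"if {tests} then {label}")
```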

Other approaches

Two alternative classification scenarios

• Present the entire vector of attributes describing the example
• Classifier returns the class label

Decision trees
• Decision tree recommends an attribute test
• User finds this attribute’s value and presents it to the tree
• Important: we may not know all attribute values, and may have to measure them at great cost!

Two sources of classifiers’ errors

