ID3 Algorithm & ROC Analysis


Talha KABAKUŞ
talha.kabakus@ibu.edu.tr

Agenda
● Where are we now?
● Decision Trees
● What is ID3?
● Entropy
● Information Gain
● Pros and Cons of ID3
● An Example - The Simpsons
● What is ROC Analysis?
● ROC Space
● ROC Space Example over predictions

Where are we now?

Decision Trees
● One of the most widely used classification approaches, because its model is clear and easy to present
● Classifies records by using their data attributes
● The aim is to estimate the value of the target field from the source fields
● Tree induction
    ● Create the tree
    ● Apply data to the tree to classify it
● Each branch node represents a choice between a number of alternatives
● Each leaf node represents a classification or decision
● Leaf count = rule count
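The point that the leaf count equals the rule count can be illustrated with a small sketch. The tree below is a hypothetical example (its attribute names and labels are illustrative and do not come from the slides); each root-to-leaf path is printed as one rule.

```python
# Hypothetical tree: a branch node is (question, {answer: subtree}); a leaf is a class label.
tree = ("outlook", {
    "sunny": ("windy", {"yes": "stay in", "no": "play"}),
    "rainy": "stay in",
})


def extract_rules(node, conditions=()):
    """Return one (conditions, label) rule per leaf."""
    if isinstance(node, str):  # leaf: the path followed so far is a complete rule
        return [(conditions, node)]
    question, branches = node
    rules = []
    for answer, subtree in branches.items():
        rules.extend(extract_rules(subtree, conditions + ((question, answer),)))
    return rules


def count_leaves(node):
    """Count the leaf nodes of the tree."""
    if isinstance(node, str):
        return 1
    return sum(count_leaves(subtree) for subtree in node[1].values())


rules = extract_rules(tree)
for conditions, label in rules:
    print(" AND ".join(f"{q} = {a}" for q, a in conditions), "=>", label)
```

Every leaf produces exactly one rule, so `len(rules)` and `count_leaves(tree)` agree.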

Decision Trees (Cont.)

Leaves are inserted from top to bottom.

[Figure: example tree with nodes A, B and C]

Sample Decision Tree

Creating Tree Model by Training Data

Decision Tree Classification Task

Apply Model to Test Data

Apply Model to Test Data (Cont.)

Decision Tree Algorithms

● Classification and regression tree algorithms
    ● Twoing
    ● Gini
● Entropy-based algorithms
    ● ID3
    ● C4.5
● Memory-based (sample-based) classification algorithms

Decision Trees by Variable Type

● Single-variable decision trees: classification is done by asking questions about only one variable
● Hybrid decision trees: classification is done by asking questions about both single and multiple variables
● Multi-variable decision trees: classification is done by asking questions about multiple variables

ID3 Algorithm
● Iterative Dichotomiser 3
● Developed by J. Ross Quinlan in 1979
● Based on entropy
● Works only with discrete data
● Cannot work with defective data
● Its advantage over Hunt's algorithm is that it chooses the right attribute at each step of classification (Hunt's algorithm chooses one at random)

Entropy
● A measure of the homogeneity of a sample; it gives an idea of how much information gain each leaf provides
● The entropy of a completely homogeneous sample is 0
● The entropy of a sample divided equally between two classes is 1
● Formula: E(S) = -Σ p_i * log2(p_i), where p_i is the proportion of class i in sample S
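A minimal sketch of the entropy calculation, assuming the class labels of a sample are given as a plain Python list (the function name `entropy` is an illustrative choice):

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Sum -p * log2(p) over the classes that actually occur in the sample.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


homogeneous = entropy(["M", "M", "M", "M"])  # one class only
even_split = entropy(["M", "M", "F", "F"])   # two classes, equally divided
print(homogeneous, even_split)
```

The two calls correspond to the two boundary cases listed above: a homogeneous sample and an equally divided one.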

Information Gain (IG)

Information Gain measures the effective change in entropy after making a decision based on the value of an attribute. It answers the question: which attribute creates the most homogeneous branches?

● First, the entropy of the whole dataset is calculated.
● The dataset is then split on each of the candidate attributes.

Information Gain (Cont.)

● The entropy of each branch is calculated.
● The branch entropies are then added, each weighted by the fraction of records that falls into that branch, to get the total entropy of the split.
● The resulting entropy is subtracted from the entropy before the split.
● The result is the Information Gain, or decrease in entropy.
● The attribute that yields the largest IG is chosen for the decision node.
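The steps above can be condensed into a single expression. The notation is an assumption made here for illustration: S is the dataset before the split, A is the attribute, S_v is the subset of S in which A takes the value v, and E is the entropy.

```latex
\[
\mathrm{Gain}(S, A) = E(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|}\, E(S_v)
\]
```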

Information Gain (Cont.)

A branch set with entropy of 0 is a leaf node. Otherwise, the branch needs further splitting to classify its dataset. The ID3 algorithm is run recursively on the non-leaf branches, until all data is classified.
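The recursion described above can be sketched as follows. This is a minimal illustration rather than Quinlan's original implementation: the rows are hypothetical toy data, the dictionary-based tree format is an assumption, and a majority vote is used when no attributes are left to split on.

```python
import math
from collections import Counter


def entropy(rows, target):
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def gain(rows, attr, target):
    """Entropy before the split minus the weighted entropy of the branches."""
    total = len(rows)
    remainder = 0.0
    for value in {row[attr] for row in rows}:
        subset = [row for row in rows if row[attr] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(rows, target) - remainder


def id3(rows, attrs, target):
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:   # entropy 0: this branch is a leaf
        return labels[0]
    if not attrs:               # nothing left to split on: majority vote (assumption)
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return {best: {value: id3([r for r in rows if r[best] == value], rest, target)
                   for value in sorted({r[best] for r in rows})}}


# Hypothetical toy data, not taken from the slides.
rows = [
    {"weight": "heavy", "turbo": "no", "fast": "no"},
    {"weight": "heavy", "turbo": "yes", "fast": "no"},
    {"weight": "heavy", "turbo": "yes", "fast": "no"},
    {"weight": "light", "turbo": "no", "fast": "no"},
    {"weight": "light", "turbo": "yes", "fast": "yes"},
    {"weight": "light", "turbo": "yes", "fast": "yes"},
]
tree = id3(rows, ["turbo", "weight"], "fast")
print(tree)
```

On this toy data, `weight` has the larger gain and becomes the root; the `heavy` branch is already pure, and only the `light` branch is split again on `turbo`.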

ID3 Algorithm Steps

Pros of ID3 Algorithm

● Builds the decision tree in a minimal number of steps
● The most important point in tree induction is collecting enough reliable data associated with the specific properties
● Asking the right questions determines the tree induction
● Each level benefits from the choices made at the previous levels
● The whole dataset is scanned to create the tree

Cons of ID3 Algorithm

● The tree cannot be updated when new data is classified incorrectly; a new tree must be generated instead
● Only one attribute at a time is tested when making a decision
● Cannot work with defective data
● Cannot work with numerical attributes

An Example - The Simpsons

Person   Hair Length   Weight   Age   Class
Homer    0"            250      36    M
Marge    10"           150      34    F
Bart     2"            90       10    M
Lisa     6"            78       8     F
Maggie   4"            20       1     F
Abe      1"            170      70    M
Selma    8"            160      41    F
Otto     10"           180      38    M
Krusty   6"            200      45    M

Information Gain over Hair Length

Entropy of the whole set:
E(4F, 5M) = -(4/9)log2(4/9) - (5/9)log2(5/9) = 0.9911

Split: Hair Length <= 5?

Yes: E(1F, 3M) = -(1/4)log2(1/4) - (3/4)log2(3/4) = 0.8113

No: E(3F, 2M) = -(3/5)log2(3/5) - (2/5)log2(2/5) = 0.9710

Gain(Hair Length <= 5) = 0.9911 - (4/9 * 0.8113 + 5/9 * 0.9710) = 0.0911

Information Gain over Weight

Entropy of the whole set:
E(4F, 5M) = -(4/9)log2(4/9) - (5/9)log2(5/9) = 0.9911

Split: Weight <= 160?

Yes: E(4F, 1M) = -(4/5)log2(4/5) - (1/5)log2(1/5) = 0.7219

No: E(0F, 4M) = -(0/4)log2(0/4) - (4/4)log2(4/4) = 0 (taking 0 * log2(0) as 0)

Gain(Weight <= 160) = 0.9911 - (5/9 * 0.7219 + 4/9 * 0) = 0.5900

Information Gain over Age

Entropy of the whole set:
E(4F, 5M) = -(4/9)log2(4/9) - (5/9)log2(5/9) = 0.9911

Split: Age <= 40?

Yes: E(3F, 3M) = -(3/6)log2(3/6) - (3/6)log2(3/6) = 1

No: E(1F, 2M) = -(1/3)log2(1/3) - (2/3)log2(2/3) = 0.9183

Gain(Age <= 40) = 0.9911 - (6/9 * 1 + 3/9 * 0.9183) = 0.0183

Results
Attribute           Information Gain (IG)
Hair Length <= 5    0.0911
Weight <= 160       0.5900
Age <= 40           0.0183

As the results show, Weight is the best attribute for classifying this group.
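The three gains can be reproduced with a short script. This is a sketch under stated assumptions: the helper names are illustrative, and the class labels are the ones implied by the branch counts in the calculations (for example 1F and 3M for Hair Length <= 5).

```python
import math

# (name, hair length in inches, weight, age, class)
people = [
    ("Homer", 0, 250, 36, "M"), ("Marge", 10, 150, 34, "F"),
    ("Bart", 2, 90, 10, "M"), ("Lisa", 6, 78, 8, "F"),
    ("Maggie", 4, 20, 1, "F"), ("Abe", 1, 170, 70, "M"),
    ("Selma", 8, 160, 41, "F"), ("Otto", 10, 180, 38, "M"),
    ("Krusty", 6, 200, 45, "M"),
]


def entropy(labels):
    total = len(labels)
    return sum(-(labels.count(c) / total) * math.log2(labels.count(c) / total)
               for c in set(labels))


def threshold_gain(rows, column, threshold):
    """Information gain of the binary split rows[column] <= threshold."""
    yes = [r[-1] for r in rows if r[column] <= threshold]
    no = [r[-1] for r in rows if r[column] > threshold]
    weighted = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(rows)
    return entropy([r[-1] for r in rows]) - weighted


gains = {
    "Hair Length <= 5": threshold_gain(people, 1, 5),
    "Weight <= 160": threshold_gain(people, 2, 160),
    "Age <= 40": threshold_gain(people, 3, 40),
}
for name, value in gains.items():
    print(f"{name}: {value:.4f}")
best = max(gains, key=gains.get)
```

The thresholds 5, 160 and 40 are the ones used on the slides; the sketch does not search for better thresholds.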

Constructed Decision Tree

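[Figure: constructed decision tree. The root node tests Weight <= 160 (Yes / No); a second node tests Hair Length <= 5 (Yes / No); the leaves are labeled Female and Male.]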

Entropy over Nominal Values

If an attribute has nominal values:
● First calculate the entropy of the subset for each attribute value
● Then calculate the information gain of the attribute from those entropies

Example II

Entropy of the whole dataset (5 yes, 10 no):
E(S) = -(5/15)log2(5/15) - (10/15)log2(10/15) ≈ 0.918

Example II (Cont.)
Information Gain over Engine
Engine: 6 small, 5 medium, 4 large
3 values for attribute Engine, so we need 3 entropy calculations:
● small: 5 no, 1 yes. E(small) = -(5/6)log2(5/6) - (1/6)log2(1/6) ≈ 0.65
● medium: 3 no, 2 yes. E(medium) = -(3/5)log2(3/5) - (2/5)log2(2/5) ≈ 0.97
● large: 2 no, 2 yes. E(large) = 1 (evenly distributed subset)
IG(Engine) = E(S) - [(6/15)*E(small) + (5/15)*E(medium) + (4/15)*E(large)]
IG(Engine) = 0.918 - 0.850 = 0.068

Example II (Cont.)
Information Gain over SC/Turbo
SC/Turbo: 4 yes, 11 no
2 values for attribute SC/Turbo, so we need 2 entropy calculations:
● SC/Turbo = yes: 2 yes, 2 no. E(turbo) = 1 (evenly distributed subset)
● SC/Turbo = no: 3 yes, 8 no. E(no turbo) = -(3/11)log2(3/11) - (8/11)log2(8/11) ≈ 0.845
IG(SC/Turbo) = E(S) - [(4/15)*E(turbo) + (11/15)*E(no turbo)]
IG(SC/Turbo) = 0.918 - 0.886 = 0.032

Example II (Cont.)
Information Gain over Weight
Weight: 6 average, 4 light, 5 heavy
3 values for attribute Weight, so we need 3 entropy calculations:
● average: 3 no, 3 yes. E(average) = 1 (evenly distributed subset)
● light: 3 no, 1 yes. E(light) = -(3/4)log2(3/4) - (1/4)log2(1/4) ≈ 0.81
● heavy: 4 no, 1 yes. E(heavy) = -(4/5)log2(4/5) - (1/5)log2(1/5) ≈ 0.72
IG(Weight) = E(S) - [(6/15)*E(average) + (4/15)*E(light) + (5/15)*E(heavy)]
IG(Weight) = 0.918 - 0.856 = 0.062

Example II (Cont.)
Information Gain over Fuel Eco
Fuel Economy: 2 good, 3 average, 10 bad
3 values for attribute Fuel Eco, so we need 3 entropy calculations:
● good: 0 yes, 2 no. E(good) = 0 (no variability)
● average: 0 yes, 3 no. E(average) = 0 (no variability)
● bad: 5 yes, 5 no. E(bad) = 1 (evenly distributed subset)
We can omit the terms for good and average, since those cars always end up not fast and their entropy is 0.
IG(Fuel Eco) = E(S) - [(10/15)*E(bad)]
IG(Fuel Eco) = 0.918 - 0.667 = 0.251

Example II (Cont.)

Results:
Attribute    Information Gain (IG)
Engine       0.068
SC/Turbo     0.032
Weight       0.062
Fuel Eco     0.251   (root of the tree)

Example II (Cont.)

Since we selected the Fuel Eco attribute for our Root Node, it is removed from the table for future calculations.

Entropy of the remaining set (Fuel Eco = bad: 5 yes, 5 no) = 1 (evenly distributed set)

Example II (Cont.)
Information Gain over Engine
Engine: 1 small, 5 medium, 4 large
3 values for attribute Engine, so we need 3 entropy calculations:
● small: 1 yes, 0 no. E(small) = 0 (no variability)
● medium: 2 yes, 3 no. E(medium) = -(2/5)log2(2/5) - (3/5)log2(3/5) ≈ 0.97
● large: 2 no, 2 yes. E(large) = 1 (evenly distributed subset)
IG(Engine) = E(S_FuelEco) - [(5/10)*E(medium) + (4/10)*E(large)]
IG(Engine) = 1 - 0.885 = 0.115

Example II (Cont.)
Information Gain over SC/Turbo
SC/Turbo: 3 yes, 7 no
2 values for attribute SC/Turbo, so we need 2 entropy calculations:
● SC/Turbo = yes: 2 yes, 1 no. E(turbo) = -(2/3)log2(2/3) - (1/3)log2(1/3) ≈ 0.918
● SC/Turbo = no: 3 yes, 4 no. E(no turbo) = -(3/7)log2(3/7) - (4/7)log2(4/7) ≈ 0.985
IG(SC/Turbo) = E(S_FuelEco) - [(3/10)*E(turbo) + (7/10)*E(no turbo)]
IG(SC/Turbo) = 1 - 0.965 = 0.035

Example II (Cont.)
Information Gain over Weight
Weight: 3 average, 5 heavy, 2 light
3 values for attribute Weight, so we need 3 entropy calculations:
● average: 3 yes, 0 no. E(average) = 0 (no variability)
● heavy: 1 yes, 4 no. E(heavy) = -(1/5)log2(1/5) - (4/5)log2(4/5) ≈ 0.72
● light: 1 yes, 1 no. E(light) = 1 (evenly distributed subset)
IG(Weight) = E(S_FuelEco) - [(5/10)*E(heavy) + (2/10)*E(light)]
IG(Weight) = 1 - 0.561 = 0.439

Example II (Cont.)
Results:
Attribute    Information Gain (IG)
Engine       0.115
SC/Turbo     0.035
Weight       0.439

Weight has the highest gain, and is thus the best choice.
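Both rounds of Example II can be checked from the class counts listed on the slides, without the full data table. The sketch below is an illustration, not the author's code; the helper names and the `(yes, no)` tuple layout are assumptions. Because it does not round intermediate values, the last digit can differ slightly from the slide figures.

```python
import math


def entropy(counts):
    """Entropy (base 2) of a class distribution given as a list of counts."""
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)


def gain(subsets):
    """Information gain of a split, given (yes, no) class counts per attribute value."""
    total = sum(yes + no for yes, no in subsets.values())
    before = entropy([sum(yes for yes, _ in subsets.values()),
                      sum(no for _, no in subsets.values())])
    after = sum((yes + no) / total * entropy([yes, no]) for yes, no in subsets.values())
    return before - after


# First round: all 15 cars, (yes, no) counts per attribute value.
round1 = {
    "Engine": {"small": (1, 5), "medium": (2, 3), "large": (2, 2)},
    "SC/Turbo": {"yes": (2, 2), "no": (3, 8)},
    "Weight": {"average": (3, 3), "light": (1, 3), "heavy": (1, 4)},
    "Fuel Eco": {"good": (0, 2), "average": (0, 3), "bad": (5, 5)},
}
# Second round: only the 10 cars with Fuel Eco = bad.
round2 = {
    "Engine": {"small": (1, 0), "medium": (2, 3), "large": (2, 2)},
    "SC/Turbo": {"yes": (2, 1), "no": (3, 4)},
    "Weight": {"average": (3, 0), "heavy": (1, 4), "light": (1, 1)},
}

gains1 = {attr: gain(values) for attr, values in round1.items()}
gains2 = {attr: gain(values) for attr, values in round2.items()}
root = max(gains1, key=gains1.get)
second = max(gains2, key=gains2.get)
print(root, second)
```

Working from counts keeps the sketch independent of the individual car records, which appear on the slides only as an image.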

Example II (Cont.)
Since there are only two items for SC/Turbo where Weight = Light, and the result is consistent, we can simplify the weight = Light path.

Example II (Cont.)
Updated Table: (Weight = Heavy)

All cars with large engines in this table are not fast. Due to inconsistent patterns in the data, there is no way to proceed since medium size engines may lead to either fast or not fast.

ROC Analysis
Receiver Operating Characteristic
● The limitations of diagnostic accuracy as a measure of decision performance require introduction of the concepts of the sensitivity and specificity of a diagnostic test.
● These measures and the related indices, true positive rate and false positive rate, are more meaningful than accuracy.
● The ROC curve is shown to be a complete description of this decision threshold effect, indicating all possible combinations of the relative frequencies of the various kinds of correct and incorrect decisions.

ROC Analysis (Cont.)

Combinations of correct and incorrect decisions:

Actual Value   Prediction Outcome   Description
p              p                    True Positive (TP)
p              n                    False Negative (FN)
n              p                    False Positive (FP)
n              n                    True Negative (TN)

TPR is equivalent to sensitivity. FPR is equivalent to 1 - specificity. The best possible prediction would have 100% sensitivity and 100% specificity (which means FPR = 0%).
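A small sketch of how the four combinations are counted and how sensitivity, specificity and FPR follow from them. The two label lists are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical labels: "p" = positive, "n" = negative.
actual = ["p", "p", "p", "n", "n", "n", "n", "p"]
predicted = ["p", "n", "p", "n", "p", "n", "n", "p"]

# Map each (actual, predicted) pair to the name of its combination.
NAMES = {("p", "p"): "TP", ("p", "n"): "FN", ("n", "p"): "FP", ("n", "n"): "TN"}

counts = Counter(NAMES[pair] for pair in zip(actual, predicted))
sensitivity = counts["TP"] / (counts["TP"] + counts["FN"])  # true positive rate
specificity = counts["TN"] / (counts["TN"] + counts["FP"])
fpr = counts["FP"] / (counts["FP"] + counts["TN"])          # equals 1 - specificity
print(dict(counts), sensitivity, specificity, fpr)
```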

ROC Space
A ROC space is defined by FPR and TPR as the x and y axes respectively, and depicts the relative trade-offs between true positives (benefits) and false positives (costs). Since TPR is equivalent to sensitivity and FPR is equal to 1 - specificity, the ROC graph is sometimes called the sensitivity vs (1 - specificity) plot. Each prediction result represents one point in the ROC space.

Calculations
● Sensitivity: TPR = TP / P = TP / (TP + FN)
● 1 - Specificity: FPR = FP / N = FP / (FP + TN)
● Accuracy: ACC = (TP + TN) / (P + N)

A ROC Space Example

Let A, B, C and D be predictions over 100 positive and 100 negative instances:

Prediction   TP   FP   FN   TN   TPR    FPR    ACC
A            63   28   37   72   0.63   0.28   0.68
B            77   77   23   23   0.77   0.77   0.50
C            24   88   76   12   0.24   0.88   0.18
D            76   12   24   88   0.76   0.12   0.82
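The TPR, FPR and ACC columns follow directly from the four counts. A minimal sketch that recomputes them (the function name `roc_point` is an illustrative choice):

```python
# (TP, FP, FN, TN) for each prediction, copied from the table above.
predictions = {
    "A": (63, 28, 37, 72),
    "B": (77, 77, 23, 23),
    "C": (24, 88, 76, 12),
    "D": (76, 12, 24, 88),
}


def roc_point(tp, fp, fn, tn):
    """Return (FPR, TPR, ACC) for one set of counts."""
    tpr = tp / (tp + fn)                    # sensitivity, the y coordinate
    fpr = fp / (fp + tn)                    # 1 - specificity, the x coordinate
    acc = (tp + tn) / (tp + fp + fn + tn)
    return fpr, tpr, acc


points = {name: roc_point(*cm) for name, cm in predictions.items()}
for name, (fpr, tpr, acc) in points.items():
    print(f"{name}: FPR={fpr:.2f} TPR={tpr:.2f} ACC={acc:.2f}")
most_accurate = max(points, key=lambda name: points[name][2])
```

Each `(FPR, TPR)` pair is the position of that prediction in the ROC space; of the four, D lies closest to the ideal corner (FPR = 0, TPR = 1) and also has the highest accuracy.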

A ROC Space Example (Cont.)

