Decision Tree
Information gain does not require splits to be binary, thereby allowing multiway splits (i.e., two or more branches to be grown from a node).
Information Gain (ID3/C4.5)
Select the attribute with the highest information gain
Let $p_i$ be the probability that an arbitrary tuple in $D$ belongs to class $C_i$, estimated by $|C_{i,D}|/|D|$.
Expected information (entropy) needed to classify a tuple in $D$:

$$Info(D) = E(S) = -\sum_{i=1}^{m} p_i \log_2(p_i)$$
Information needed (after using $A$ to split $D$ into $v$ partitions) to classify $D$:

$$Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times Info(D_j)$$
Information gained by branching on attribute $A$:

$$Gain(A) = Info(D) - Info_A(D)$$
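The three formulas above can be sketched directly in code. The following is a minimal illustration on a hypothetical toy dataset (the attribute values and labels are invented for the example, not taken from the slides): `entropy` computes $Info(D)$, and `info_gain` partitions $D$ by an attribute's values to get $Info_A(D)$ and returns $Gain(A)$.

```python
import math
from collections import Counter

def entropy(labels):
    """Info(D): expected bits needed to classify a tuple in D."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr_index):
    """Gain(A) = Info(D) - Info_A(D) for the attribute at attr_index."""
    n = len(labels)
    # Partition D into D_1 .. D_v by the attribute's distinct values
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    # Info_A(D) = sum over partitions of (|D_j| / |D|) * Info(D_j)
    info_a = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - info_a

# Hypothetical toy data: one attribute per row, binary class labels
rows = [("sunny",), ("sunny",), ("overcast",), ("rain",)]
labels = ["no", "no", "yes", "yes"]
print(info_gain(rows, labels, 0))  # prints 1.0: the split separates classes perfectly
```

The attribute selected for the split is simply the one maximizing `info_gain` over all attribute indices.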
• The reduction in impurity that would be incurred by a binary split on a discrete- or continuous-valued attribute $A$ is

$$\Delta Gini(A) = Gini(D) - Gini_A(D)$$