ID3 A LGORITHMS IN D ECISION T REES
Soumya Dasgupta, Akankshya Nayak
National Institute of Science Education and Research
February 13, 2023
C ONTENTS
1 Understanding Decision Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Introduction to ID3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
3 ID3 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4 Inductive Bias in ID3 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
5 Major steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
6 Advantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
7 Disadvantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1 / 11
Part I
ID3 A LGORITHM
2 / 11
U NDERSTANDING D ECISION T REES
▶ A decision tree is a tree in which a decision is taken at every node. The leaf nodes of the tree
generally indicate the final decision of the tree. The set of questions that are asked to take a
decision are known as features. Through the answers to these features, the decision tree reaches
at a conclusion usually termed as the label.
▶ There are multiple algorithms to create decision trees. One such algorithm is ID3.
3 / 11
I NTRODUCTION TO ID3
▶ ID3 stands for Iterative Dichotomiser 3 which was first invented by Ross Quinlan
▶ Iteratively (repeatedly) Dichotomizes (divides) the features into groups
▶ Top-down (builds the tree from the top), greedy (at each step, selects the current best feature to
create a node) approach to build decision trees
▶ Classification of nominal data
4 / 11
ID3 A LGORITHM
▶ Finds the best feature on the basis of Information Gain or Gain
▶ Information Gain tries to minimize the entropy in the data set i.e. the measure of disorder in the
target feature. Entropy of a dataset S is denoted as:
Entropy(S) = −Σpi ∗ log2 (pi ); i = 1 to n
Where, n: Total no. of classes in target column, pi : Probability of class i in the target column
5 / 11
ID3 A LGORITHM
▶ Then, Information Gain of a particular feature column A of the dataset Sis calculated as:
IG(S, A) = Entropy(S) − Σ((|SV |/|S|) ∗ Entropy(SV ))
Where, SV : Set of rows in S for which the feature column A has value V, |SV |: Number of rows
in SV , |S|: Number of rows in S r2
6 / 11
I NDUCTIVE B IAS IN ID3 A LGORITHM
▶ Inductive bias is based on the ordering of hypotheses by search strategy
▶ Approximate inductive bias of ID3: Shorter trees are preferred over Larger trees.
▶ Trees that place high information gain attributes close to the root are preferred over those that
do not.
7 / 11
M AJOR STEPS
▶ Calculation of Information Gain(IG)
▶ If all rows don’t belong to the same class, find the feature with maximum Information Gain and
split the dataset into subsets on the basis of this feature
▶ Make a decision tree node using this feature
▶ If all the rows belong to the same class, the current node is assigned a leaf node with a label of
that class
▶ Repeat for every feature
▶ Terminate when all the features are over or the decision tree contains only leaf nodes
▶ The following website can be visited to understand ID3 with an example: https:
//sefiks.com/2017/11/20/a-step-by-step-id3-decision-tree-example/
8 / 11
A DVANTAGES
▶ Prediction rules are created from the training data and are easily understandable
▶ Creates a short tree in relatively less time
▶ It only needs to test enough attributes until all data is classified
▶ Finding leaf nodes enables test data to be pruned, reducing the number of tests
9 / 11
D ISADVANTAGES
▶ Data may be over-fitted or over-classified, if a small sample is tested.
▶ Only one attribute at a time is tested for making a decision.
10 / 11
R EFERENCES
▶ Machine learning/inductive inference/decision trees/construction. Available at:
www.cs.ucdavis.edu/~vemuri/classes/ecs271/Decision%
20Trees-Construction.htm#:~:text=Approximate%20inductive%20bias%20of%
20ID3,over%20those%20that%20do%20not.https://
▶ Sakkaf, Y. (2020) Decision trees for classification: Id3 Algorithm explained, Medium. Towards
Data Science. Available at: https://towardsdatascience.com/
decision-trees-for-classification-id3-algorithm-explained-89df76e72df1
▶ Mantri, N. (2021) Using ID3 algorithm to build a decision tree to predict the weather,
OpenGenus IQ: Computing Expertise amp; Legacy. Available at:
https://iq.opengenus.org/id3-algorithm/
▶ Serengil, S. (2021) A step by step id3 decision tree example, Sefik Ilkin Serengil. Available at:
https://docs.rapidminer.com/latest/studio/operators/modeling/
predictive/trees/id3.html
▶ GmbH, R.M. Id3 (RapidMiner Studio Core), ID3 - RapidMiner Documentation. Available at:
https:
//sefiks.com/2017/11/20/a-step-by-step-id3-decision-tree-example/
11 / 11