Decision Tree Induction Algorithm
A decision tree is a machine learning model that organizes decisions and their possible
consequences into a tree-like structure. In classification, a decision tree assigns each input
record to one of several possible classes. The main steps of decision tree induction in data
mining are as follows:
2. The find_best_split() function determines which attribute should be selected as the test
condition for splitting the training records. The choice of test condition depends on which
impurity measure is used to evaluate the goodness of a split. Popular measures include
entropy and the Gini index.
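As an illustrative sketch (not part of the source text), both impurity measures can be computed directly from the class labels of the records at a node; the function names and list-of-labels input format here are assumptions:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a set of class labels: -sum over classes of p * log2(p)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index of a set of class labels: 1 - sum over classes of p^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

# A perfectly mixed 50/50 node is maximally impure under both measures:
print(entropy(["a", "a", "b", "b"]))  # 1.0
print(gini(["a", "a", "b", "b"]))     # 0.5
```

A pure node (all records in one class) scores 0 under both measures, which is why splits that produce purer children are preferred.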
3. The Classify() function determines the class label to be assigned to a leaf node. For
each leaf node t, let p(i|t) denote the fraction of training records from class i associated with
the node t. The leaf node is assigned to the class that has the majority of training
records:

leaf.label = argmax_i p(i|t),

where the argmax operator returns the class i that maximizes p(i|t).

Algorithm 3.1 A skeleton decision tree induction algorithm.

TreeGrowth(E, F):
1: if stopping_cond(E, F) = true then
2:   leaf = createNode()
3:   leaf.label = Classify(E)                 # determine the class label assigned to the leaf node
4:   return leaf
5: else
6:   root = createNode()
7:   root.test_cond = find_best_split(E, F)   # recursively select the best attribute to split the data
8:   let V = {v | v is a possible outcome of root.test_cond}
9:   for each v ∈ V do
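The Classify() step above is just a majority vote over the records reaching a leaf. A minimal sketch, assuming each training record is a dict with a "label" key (a hypothetical format, not from the source):

```python
from collections import Counter

def classify(records):
    """Return the majority class among the records at a leaf node t,
    i.e. argmax_i p(i|t)."""
    labels = [r["label"] for r in records]  # assumed record format
    return Counter(labels).most_common(1)[0][0]

leaf_records = [{"label": "yes"}, {"label": "yes"}, {"label": "no"}]
print(classify(leaf_records))  # yes
```

Since the counts are proportional to the fractions p(i|t), taking the most common label is equivalent to maximizing p(i|t).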
5. The stopping_cond() function is used to terminate the tree-growing process by testing
whether all the records have the same class label or the same attribute values. After building the
decision tree, a tree-pruning step can be performed to reduce the size of the decision tree.
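The pieces above can be combined into a small end-to-end sketch of the skeleton algorithm. This is an illustrative implementation under assumed conventions (records as (features_dict, label) pairs, Gini as the impurity measure, dicts as tree nodes), not the definitive algorithm from the source:

```python
from collections import Counter

def gini(records):
    """Gini impurity of a set of (features, label) records."""
    n = len(records)
    counts = Counter(label for _, label in records)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def find_best_split(records, attrs):
    """Pick the attribute whose partition has the lowest weighted Gini."""
    n = len(records)
    best_attr, best_score = None, float("inf")
    for a in attrs:
        parts = {}
        for feats, label in records:
            parts.setdefault(feats[a], []).append((feats, label))
        score = sum(len(p) / n * gini(p) for p in parts.values())
        if score < best_score:
            best_attr, best_score = a, score
    return best_attr

def classify(records):
    """Majority class label at a leaf: argmax_i p(i|t)."""
    return Counter(label for _, label in records).most_common(1)[0][0]

def stopping_cond(records, attrs):
    """Stop when the records are pure or no attributes remain."""
    return not attrs or len({label for _, label in records}) == 1

def tree_growth(records, attrs):
    """Recursive induction following the Algorithm 3.1 skeleton."""
    if stopping_cond(records, attrs):
        return {"label": classify(records)}          # leaf node
    a = find_best_split(records, attrs)
    root = {"test": a, "children": {}}
    groups = {}
    for feats, label in records:                     # one child per outcome v
        groups.setdefault(feats[a], []).append((feats, label))
    for v, subset in groups.items():
        root["children"][v] = tree_growth(subset, [x for x in attrs if x != a])
    return root

# Tiny made-up dataset for demonstration:
data = [({"outlook": "sunny", "windy": False}, "no"),
        ({"outlook": "sunny", "windy": True},  "no"),
        ({"outlook": "rainy", "windy": False}, "yes"),
        ({"outlook": "rainy", "windy": True},  "no")]
tree = tree_growth(data, ["outlook", "windy"])
print(tree["test"])  # outlook
```

A production implementation would add the pruning step mentioned above and handle continuous attributes; this sketch only shows the recursive growth phase.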