
Supervised Learning Algorithm
Decision Tree Algorithm
Supervised Learning
• Supervised learning is a type of machine learning in which machines are
trained using well-"labelled" training data, and on the basis of that data,
the machines predict the output.
• Supervised learning is the process of providing input data as well as the
correct output data to the machine learning model. The aim of a
supervised learning algorithm is to find a mapping function that maps
the input variable (x) to the output variable (y).
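
As a minimal sketch of this idea (assuming scikit-learn is available; the toy data and the choice of LogisticRegression are illustrative, not from the slides), training on labelled pairs and predicting outputs looks like this:

from sklearn.linear_model import LogisticRegression

# Labelled training data: inputs X paired with the correct outputs y.
X = [[1], [2], [3], [10], [11], [12]]  # input variable (x)
y = [0, 0, 0, 1, 1, 1]                 # output variable (y)

model = LogisticRegression()
model.fit(X, y)                    # learn the mapping function x -> y
print(model.predict([[2], [11]]))  # expected: [0 1]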
Working of Supervised Learning
Types of Supervised Machine Learning Algorithms:
Classification
• Classification algorithms are used when the output variable is
categorical, i.e., the output falls into discrete classes such as Yes/No,
Male/Female, or True/False.
Common classification algorithms include:
• Random Forest
• Decision Trees
• Logistic Regression
• Support Vector Machines
Decision Trees

• Decision Tree is a supervised learning technique that can be used for
both classification and regression problems, but it is mostly preferred
for solving classification problems.
• A decision tree contains two types of nodes: decision nodes and
leaf nodes.
• Decision nodes are used to make decisions and have multiple
branches, whereas leaf nodes are the outputs of those decisions and
do not contain any further branches.
How Does the Decision Tree Work?

• Root node: The base of the decision tree.
• Splitting: The process of dividing a node into multiple sub-nodes.
• Decision node: A sub-node that splits further into additional sub-nodes.
• Leaf node: A sub-node that does not split any further; it represents a
possible outcome.
• Pruning: The process of removing sub-nodes from a decision tree.
• Branch: A subsection of the decision tree consisting of multiple nodes.
A short code sketch after this list shows these parts on a trained tree.
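
The following Python snippet (assuming scikit-learn is installed; the iris dataset and the max_depth setting are illustrative choices, not from the slides) trains a small tree and prints it so the root node, decision nodes, and leaf nodes are visible:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
# Limiting depth is a simple form of pruning: it caps how far nodes split.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# export_text prints the splits (root and decision nodes) and the
# class outputs at the end of each branch (leaf nodes).
print(export_text(tree, feature_names=list(iris.feature_names)))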
Decision Tree Working Structure
Why use Decision Trees?

There are various algorithms in machine learning, so choosing the best
algorithm for the given dataset and problem is the main point to
remember while creating a machine learning model. Below are two
reasons for using a decision tree:
• Decision trees usually mimic human thinking while making a
decision, so they are easy to understand.
• The logic behind a decision tree is easy to follow because it has a
tree-like structure.
Attribute Selection Measures

While implementing a decision tree, the main issue is how to select the
best attribute for the root node and for the sub-nodes. To solve this
problem, we use a technique called an attribute selection measure, or ASM.
There are two popular ASM techniques:
• Information Gain
• Gini Index
Information Gain

• Information gain is the measurement of the change in entropy after a
dataset is segmented on an attribute.
• It calculates how much information a feature provides about a class.
• According to the value of information gain, we split the node and
build the decision tree.
• A decision tree algorithm always tries to maximize information gain,
and the node/attribute with the highest information gain is split first.
• Entropy: Entropy is a metric that measures the impurity of a given
attribute. It specifies the randomness in the data. Entropy can be
calculated as:

Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no)

Where:
• S = the set of samples
• P(yes) = probability of yes
• P(no) = probability of no
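
As a hedged sketch of the formulas above (the function names and the example split are illustrative, not from the slides), entropy and information gain can be computed in plain Python like this:

from math import log2

def entropy(labels):
    # Entropy(S) = -sum over classes of P(class) * log2(P(class))
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * log2(p) for p in probs if p > 0)

def information_gain(parent, children):
    # Entropy of the parent set minus the weighted entropy of its splits.
    n = len(parent)
    weighted = sum(len(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted

samples = ["yes"] * 9 + ["no"] * 5      # 9 yes / 5 no: entropy ~ 0.940
left, right = samples[:8], samples[8:]  # one hypothetical candidate split
print(entropy(samples))                                  # ~0.940
print(information_gain(samples, [left, right]))          # ~0.661

The algorithm would compute this gain for every candidate attribute and split on the one with the highest value.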
Gini index
• The Gini index is a measure of impurity or purity used while creating a
decision tree in the CART (Classification and Regression Tree)
algorithm.
• An attribute with a low Gini index should be preferred over one with
a high Gini index.
• CART creates only binary splits, and it uses the Gini index to
choose them.
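
For comparison with the entropy sketch above, here is a minimal sketch of the Gini index, Gini(S) = 1 - sum of squared class probabilities, which is the impurity measure CART uses (the helper names and toy data are illustrative):

def gini(labels):
    # Gini(S) = 1 - sum over classes of P(class)^2
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def weighted_gini(left, right):
    # CART scores a candidate binary split by the size-weighted Gini of
    # the two child nodes and keeps the split with the lowest score.
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

pure = ["yes"] * 6                # Gini = 0.0: perfectly pure node
mixed = ["yes"] * 3 + ["no"] * 3  # Gini = 0.5: maximally impure for two classes
print(gini(pure), gini(mixed))
print(weighted_gini(pure, mixed))  # lower is better when choosing a split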
Advantages of the Decision Tree

• It is simple to understand because it follows the same process a
human follows when making a decision in real life.
• It can be very useful for solving decision-related problems.
• It helps to think through all the possible outcomes of a problem.
• It requires less data cleaning than other algorithms.
Disadvantages of the Decision Tree

• A decision tree can contain many layers, which makes it complex.
• It may overfit, an issue that can be addressed with the Random
Forest algorithm.
• With more class labels, the computational complexity of the decision
tree may increase.
