Jadeja Satyarajsinh
MACHINE
LEARNING
--------
ALGORITHMS
--------
MACHINE LEARNING (ML) IS A SUBSET OF
ARTIFICIAL INTELLIGENCE (AI) THAT
FOCUSES ON DEVELOPING ALGORITHMS AND
STATISTICAL MODELS THAT ENABLE
MACHINES TO LEARN FROM AND MAKE
DECISIONS OR PREDICTIONS BASED ON
DATA WITHOUT BEING EXPLICITLY
PROGRAMMED FOR SPECIFIC TASKS.
WHAT IS SUPERVISED MACHINE LEARNING?
The textbook definition for supervised
learning is that it is a machine learning
technique where the model is trained using
labelled data. In this context, labelled data
refers to a dataset where each input is
associated with a known output. The model
learns to map the inputs to the correct
outputs, enabling it to make predictions
when exposed to new, unseen data.
There are two types of supervised learning:
1) Classification
2) Regression
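As a minimal sketch of what "labelled data" means (the values below are made up), each input is paired with a known output; a continuous target makes the task regression, a discrete target makes it classification:

```python
# Hypothetical labelled dataset: each input paired with a known output.
X = [[2.0], [4.0], [6.0]]        # inputs (features)
y_regression = [1.1, 2.0, 2.9]   # continuous outputs -> regression
y_classes = [0, 0, 1]            # discrete class labels -> classification

# A supervised model learns the mapping from X to y from these pairs,
# then predicts y for new, unseen inputs.
for features, label in zip(X, y_classes):
    print(features, "->", label)
```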
1. Linear Regression
Linear Regression is one of the simplest regression algorithms. It is used to predict a continuous value based on the features (independent variables) in the training dataset.
In a linear regression plot:
X-axis = independent variable
Y-axis = output / dependent variable
Line of regression = best fit line for the model
A line is plotted through the given data points so that it fits them as closely as possible. Hence, it is called the ‘best fit line.’ The goal of the linear regression algorithm is to find this best fit line.
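A minimal sketch of fitting a best fit line, assuming scikit-learn is available (the data below is made up and roughly follows y = 2x):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: one independent variable, one continuous output.
X = np.array([[1], [2], [3], [4], [5]])    # independent variable
y = np.array([2.0, 4.1, 5.9, 8.2, 10.0])   # dependent variable

model = LinearRegression().fit(X, y)  # finds the best-fit line

slope, intercept = model.coef_[0], model.intercept_
print(slope, intercept)          # parameters of the best-fit line
print(model.predict([[6]])[0])   # prediction for an unseen input
```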
2. Logistic Regression
Logistic regression is one of the most popular
Machine Learning algorithms. It accomplishes
binary classification tasks by predicting the
probability of an outcome, event, or
observation. The model delivers a binary or
dichotomous outcome limited to two possible
outcomes: yes/no, 0/1, or true/false.
The logistic (sigmoid) function gets its characteristic ‘S’ shape because its output is bounded between 0 and 1.
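A small sketch with scikit-learn (assumed installed), using made-up data where hours studied predict a binary pass/fail outcome:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical binary data: hours studied -> fail (0) or pass (1)
X = np.array([[0.5], [1.0], [1.5], [2.0], [3.0], [3.5], [4.0], [5.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression().fit(X, y)

# The sigmoid output is a probability strictly between 0 and 1
p = clf.predict_proba([[2.5]])[0, 1]
print(p)
print(clf.predict([[0.6], [4.5]]))  # hard 0/1 decisions
```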
3. Decision Tree
The decision tree algorithm can be used for
solving both regression and classification
problems. In decision trees, to predict a
class label for a record we start from the
root of the tree. We compare the value of the
root attribute with the record’s attribute
and, on the basis of the comparison, follow
the branch corresponding to that value and
jump to the next node.
Each node in the tree acts as a test case for
some attribute, and each edge descending from
the node corresponds to the possible answers
to the test case. This process is recursive in
nature and is repeated for every subtree
rooted at the new node.
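The root-to-leaf walk described above can be sketched with scikit-learn (assumed available); the records and attribute names here are invented for illustration:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical records: [outlook (0=sunny, 1=rainy), temperature]
X = [[0, 30], [0, 22], [1, 25], [1, 18]]
y = ["play", "play", "stay", "stay"]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# export_text prints the attribute tests at each node and the leaf labels
print(export_text(tree, feature_names=["outlook", "temperature"]))
print(tree.predict([[0, 28]])[0])  # walks the tree from root to a leaf
```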
4. Support Vector Machine
The objective of the SVM algorithm is to find a
hyperplane that, to the best degree possible,
separates data points of one class from those of
another class. ‘Best’ is defined as the hyperplane
with the largest margin between the two classes,
represented by plus versus minus classes.
Margin means the maximal width of the slab,
parallel to the hyperplane, that has no
interior data points. The algorithm can find
such a hyperplane only for linearly separable
problems; for most practical problems, it
maximizes the soft margin, allowing a small
number of misclassifications.
This algorithm is used for many classification and
regression problems, including medical
applications, natural language processing, and
speech and image classification.
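A minimal linearly separable example with scikit-learn (assumed installed; the two point clusters are made up):

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters: "minus" (-1) vs "plus" (+1) classes
X = np.array([[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]])
y = np.array([-1, -1, -1, 1, 1, 1])

# kernel="linear" looks for a separating hyperplane; C controls the
# soft margin (smaller C tolerates more misclassifications)
clf = SVC(kernel="linear", C=1.0).fit(X, y)

print(clf.support_vectors_)            # the points that define the margin
preds = clf.predict([[2, 2], [6, 6]])  # classify new points
print(preds)
```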
5. Random Forest
Random Forest is an extremely popular supervised machine
learning algorithm used for classification and regression
problems. Just as a forest comprises numerous trees, and
more trees make it more robust, the greater the
number of trees in a Random Forest, the
higher its accuracy and problem-solving ability.
One of the most important features of the
Random Forest Algorithm is that it can handle
the data set containing continuous variables, as
in the case of regression, and categorical
variables, as in the case of classification.
Moreover, the algorithm’s strength lies in its
ability to handle complex datasets and mitigate
overfitting.
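A short sketch with scikit-learn (assumed installed), using the bundled Iris dataset, whose features are continuous and whose target is categorical:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # continuous features, class labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# n_estimators is the number of trees; more trees give a more
# robust majority vote at the cost of training time
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))  # accuracy on held-out data
```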
6. Naive Bayes
Naive Bayes is used mostly for classification
and is based on Bayes’ theorem. It is mainly
used in text classification, which involves high-
dimensional training datasets.
There are three types of Naive Bayes:
1. Gaussian Naive Bayes: Used for continuous data
2. Bernoulli Naive Bayes: Designed for binary feature
data
3. Multinomial Naive Bayes: Suitable for discrete data,
commonly for text data.
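A tiny text-classification sketch with scikit-learn (assumed installed); the documents and labels below are invented. Multinomial Naive Bayes is used here because word counts are discrete:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up documents with known labels
docs = ["free prize money", "win money now",
        "meeting at noon", "lunch at noon"]
labels = ["spam", "spam", "ham", "ham"]

vec = CountVectorizer()
X = vec.fit_transform(docs)           # high-dimensional word-count features
clf = MultinomialNB().fit(X, labels)  # Multinomial NB suits discrete counts

spam_pred = clf.predict(vec.transform(["win free money"]))[0]
ham_pred = clf.predict(vec.transform(["meeting at lunch"]))[0]
print(spam_pred, ham_pred)
```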
7. Gradient Boosting
Gradient boosting is a machine learning
ensemble technique that sequentially
combines the predictions of multiple weak
learners, typically decision trees. It aims to
improve overall predictive performance by
optimizing the model’s weights based on the
errors of previous iterations, gradually
reducing prediction errors and enhancing
the model’s accuracy. This technique is
widely used for both regression and
classification tasks.
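The idea of sequentially fitting trees to the residual errors can be sketched with scikit-learn (assumed installed); the nonlinear data below is generated for illustration:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical nonlinear regression data: y = sin(x)
rng = np.random.RandomState(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X.ravel())

# Each new tree is fitted to the residual errors of the ensemble so far;
# learning_rate shrinks each tree's contribution to curb overfitting
gbr = GradientBoostingRegressor(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0)
gbr.fit(X, y)
print(gbr.score(X, y))  # R^2 on the training data
```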