7 Types of Classification Algorithms

The document discusses 7 types of classification algorithms: Logistic Regression, Naive Bayes, Stochastic Gradient Descent, K-Nearest Neighbors, Decision Tree, Random Forest, and Support Vector Machine. It provides a brief definition and the advantages and disadvantages of each algorithm type. It also notes that the purpose is to code examples of the 7 algorithms in Python. The dataset used is from the US Census Bureau and contains salaries, with the goal being to classify incomes as either over $50k or less than $50k.

Uploaded by

pritinigam

5/13/22, 6:54 PM 7 Types of Classification Algorithms

PUBLISHED ON
JANUARY 19, 2018
IN DEVELOPERS CORNER (HTTPS://ANALYTICSINDIAMAG.COM/CATEGORY/DEVELOPERS_CORNER/)

7 Types of Classification Algorithms


BY ROHIT GARG(HTTPS://ANALYTICSINDIAMAG.COM/AUTHOR/F2005636GMAIL-COM/)

https://analyticsindiamag.com/7-types-classification-algorithms/ 1/21

The purpose of this research is to bring together the 7 most common types of classification algorithms, along with Python code for each:
Logistic Regression (https://analyticsindiamag.com/understanding-logistic-regression-in-r-with-machinehacks-predict-the-data-scientists-salary-in-india-hackathon/),
Naïve Bayes (https://analyticsindiamag.com/a-hands-on-introduction-to-naive-bayes-classification-in-python/),
Stochastic Gradient Descent (https://analyticsindiamag.com/how-stochastic-gradient-descent-is-solving-optimisation-problems-in-deep-learning/),
K-Nearest Neighbours (https://analyticsindiamag.com/a-complete-guide-for-beginning-with-k-nearest-neighbours-algorithm-in-python/),
Decision Tree (https://analyticsindiamag.com/hands-on-tutorial-how-to-use-decision-tree-regression-to-solve-machinehacks-new-data-science-hackathon/),
Random Forest (https://analyticsindiamag.com/solving-the-titanic-ml-survival-problem-using-random-forest-vs-neural-networks-on-tensorflow-which-one-is-better/),
and Support Vector Machine (https://analyticsindiamag.com/this-is-how-support-vector-machines-are-helping-assess-personality-types-with-iris-classification/).

1 Introduction
1.1 Structured Data Classification
Classification can be performed on structured or unstructured data. It is a technique in which we
categorize data into a given number of classes. The main goal of a classification problem is to identify the
category/class to which new data will belong.

Some of the terms encountered in machine learning classification:

Classifier: An algorithm that maps the input data to a specific category.
Classification model (https://analyticsindiamag.com/benchmark-analysis-of-popular-image-classification-models/): A classification model tries to draw conclusions from the input values given for training. It then predicts the class labels/categories for new data.
Feature: An individual measurable property of the phenomenon being observed.
Binary classification (https://analyticsindiamag.com/correcting-class-imbalanced-data-for-binary-classification-problems-demonstrations-using-animated-videos/): A classification task with two possible outcomes. E.g. gender classification (male / female).
Multi-class classification (https://analyticsindiamag.com/step-by-step-guide-to-implement-multi-class-classification-with-bert-tensorflow/): Classification with more than two classes. In multi-class classification, each sample is assigned to one and only one target label. E.g. an animal can be a cat or a dog, but not both at the same time.
Multi-label classification (https://analyticsindiamag.com/multi-label-image-classification-with-tensorflow-keras/): A classification task where each sample is mapped to a set of target labels (more than one class). E.g. a news article can be about sports, a person, and a location at the same time.
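
To make the difference between multi-class and multi-label targets concrete, here is a minimal sketch. The sample labels are hypothetical and chosen to mirror the examples above; `MultiLabelBinarizer` is one scikit-learn helper for turning label sets into an indicator matrix, not something the article itself uses.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Multi-class: exactly one label per sample (hypothetical animals).
multiclass_targets = ["cat", "dog", "cat"]

# Multi-label: a set of labels per sample (hypothetical news articles).
multilabel_targets = [
    {"sports", "person", "location"},
    {"sports"},
]

# Convert each label set into a row of 0/1 indicators, one column per label.
binarizer = MultiLabelBinarizer()
indicator = binarizer.fit_transform(multilabel_targets)

print(list(binarizer.classes_))  # column order of the indicator matrix
print(indicator.tolist())        # one row per article
```

Each row of the indicator matrix may contain several 1s, whereas a multi-class target always holds a single label per sample.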
The following are the steps involved in building a classification model:

Initialize the classifier to be used.
Train the classifier: All classifiers in scikit-learn (https://analyticsindiamag.com/a-beginners-guide-to-scikit-learns-mlpclassifier/) use a fit(X, y) method to fit (train) the model on the given training data X and training labels y.
Predict the target: Given an unlabeled observation X, predict(X) returns the predicted label y.
Evaluate the classifier model.
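
The four steps above can be sketched end to end as follows. This is an illustrative outline only: the data is synthetic (generated with `make_classification` as a stand-in for the census data), and the choice of `LogisticRegression` and of a 75/25 split are assumptions, not the article's settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 500 rows, 7 features, 2 classes.
X, y = make_classification(n_samples=500, n_features=7, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 1. Initialize the classifier to be used.
clf = LogisticRegression(max_iter=1000)

# 2. Train the classifier on the training data and labels.
clf.fit(X_train, y_train)

# 3. Predict the target for unseen observations.
y_pred = clf.predict(X_test)

# 4. Evaluate the classifier model.
accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy: {accuracy:.3f}")
```

Any other scikit-learn classifier can be swapped in at step 1 without changing steps 2 to 4.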

1.2 Dataset Source and Contents


The dataset contains salaries. The following is a description of our dataset:

Number of classes: 2 (‘>50K’ and ‘<=50K’)
Number of attributes (columns): 7
Number of instances (rows): 48,842

This data was extracted from the census bureau database found at:

http://www.census.gov/ftp/pub/DES/www/welcome.html
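
Before training, the two class labels need a numeric encoding, and it helps to check how unevenly they are distributed. The snippet below is a small sketch on a hypothetical handful of labels; the values are invented for illustration and are not taken from the real 48,842-row dataset.

```python
from collections import Counter

# Hypothetical sample of the target column (not real census rows).
labels = [">50K", "<=50K", "<=50K", ">50K", "<=50K", "<=50K"]

# Encode the positive class (">50K") as 1 and the other class as 0.
y = [1 if label == ">50K" else 0 for label in labels]

# Count each class and compute the share of the positive class.
counts = Counter(labels)
positive_share = sum(y) / len(y)

print(counts)
print(f"share of '>50K': {positive_share:.2f}")
```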

1.3 Exploratory Data Analysis

(https://analyticsindiamag.com/7-types-classification-algorithms/7-types-of-classification-algorithms/)

2 Types of Classification Algorithms (Python)


2.1 Logistic Regression (https://analyticsindiamag.com/a-beginners-guide-to-regression-techniques/)
Definition: Logistic regression is a machine learning algorithm for classification. In this algorithm, the
probabilities describing the possible outcomes of a single trial are modelled using a logistic function.

Advantages: Logistic regression is designed for this purpose (classification), and is most useful for
understanding the influence of several independent variables on a single outcome variable.

Disadvantages: In its basic form it handles only a binary predicted variable (multinomial extensions exist),
assumes the predictors are not highly correlated with each other, and assumes the data is free of missing values.
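
As a rough illustration of how the logistic function turns a linear score into a probability, the sketch below fits scikit-learn's `LogisticRegression` on synthetic data (an assumption, standing in for the census data) and recomputes the positive-class probability by hand. The helper `logistic` is a name introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def logistic(z):
    """Logistic (sigmoid) function: maps any real score into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Synthetic binary data (assumption): 300 rows, 7 features.
X, y = make_classification(n_samples=300, n_features=7, random_state=1)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Linear score per sample, then squashed through the logistic function.
scores = X @ model.coef_.ravel() + model.intercept_[0]
manual_proba = logistic(scores)

# Probability of the positive class as reported by the library.
library_proba = model.predict_proba(X)[:, 1]

print(np.allclose(manual_proba, library_proba))
```

The fitted coefficients in `model.coef_` are what make the model useful for reading off the influence of each independent variable on the outcome.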

(https://analyticsindiamag.com/7-types-classification-algorithms/screen-shot-2018-01-19-at-10-52-28-am/)

2.2 Naïve Bayes (https://analyticsindiamag.com/what-is-a-naive-bayes-classifier-and-what-significance-does-it-have-for-ml/)

Definition: The Naive Bayes algorithm is based on Bayes’ theorem with the assumption of conditional independence
between every pair of features, given the class. Naive Bayes classifiers work well in many real-world situations
such as document classification and spam filtering.

Advantages: This algorithm requires only a small amount of training data to estimate the necessary parameters.
Naive Bayes classifiers are extremely fast compared to more sophisticated methods.

Disadvantages: Naive Bayes is known to be a poor estimator, so its predicted probabilities should not be taken too literally.
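
A minimal sketch of a Naive Bayes classifier follows. It assumes continuous features and therefore uses scikit-learn's `GaussianNB`; the data is synthetic, and the article's own variant and settings may differ.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in data (assumption): 400 rows, 7 features, 2 classes.
X, y = make_classification(n_samples=400, n_features=7, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

# Fitting only estimates a mean and variance per feature and per class,
# which is why training is fast and needs little data.
nb = GaussianNB().fit(X_train, y_train)

print(nb.theta_.shape)  # per-class feature means: (n_classes, n_features)
test_accuracy = nb.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```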

(https://analyticsindiamag.com/7-types-classification-algorithms/screen-shot-2018-01-19-at-10-52-55-am/)

2.3 Stochastic Gradient Descent


Definition: Stochastic gradient descent (https://analyticsindiamag.com/a-lowdown-on-alternatives-to-gradient-
descent-optimization-algorithms/) is a simple and very efficient approach to fit linear models. It is particularly
useful when the number of samples is very large. It supports different loss functions and penalties for
classification.

Advantages: Efficiency and ease of implementation.

Disadvantages: Requires a number of hyper-parameters and it is sensitive to feature scaling.
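
Because the method is sensitive to feature scaling, a common pattern is to standardise the features before the classifier. The sketch below assumes synthetic data and a hinge loss with an L2 penalty (which amounts to a linear SVM); these hyper-parameter values are illustrative, not the article's.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 1000 rows, 7 features.
X, y = make_classification(n_samples=1000, n_features=7, random_state=3)

# Scale features first, then fit a linear model with SGD.
# loss and penalty are two of the hyper-parameters that must be chosen.
sgd = make_pipeline(
    StandardScaler(),
    SGDClassifier(loss="hinge", penalty="l2", alpha=1e-4,
                  max_iter=1000, random_state=3),
)
sgd.fit(X, y)

train_accuracy = sgd.score(X, y)
print(f"training accuracy: {train_accuracy:.3f}")
```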

(https://analyticsindiamag.com/7-types-classification-algorithms/screen-shot-2018-01-19-at-10-53-58-am/)

2.4 K-Nearest Neighbours


Definition: Neighbours-based classification is a type of lazy learning, as it does not attempt to construct a
general internal model but simply stores instances of the training data. Classification is computed from a
simple majority vote of the k nearest neighbours of each point.

Advantages: This algorithm is simple to implement, robust to noisy training data, and effective if training data
is large.

Disadvantages: The value of K must be chosen, and the computation cost is high because the distance from each
new instance to every training sample must be computed.
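
The majority vote described above can be written out in a few lines. This is a from-scratch sketch with Euclidean distance on a tiny hypothetical dataset; the function name `knn_predict` and the "low"/"high" labels are made up for illustration.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by a simple majority vote of its k nearest neighbours."""
    # Distance from the new point to every stored training sample.
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest training samples.
    nearest = np.argsort(distances)[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: two well-separated clusters.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array(["low", "low", "low", "high", "high", "high"])

print(knn_predict(X_train, y_train, np.array([0.5, 0.5])))  # low
print(knn_predict(X_train, y_train, np.array([5.5, 5.5])))  # high
```

Note that nothing is "trained": every prediction scans all stored samples, which is the source of the computation cost mentioned above.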

(https://analyticsindiamag.com/7-types-classification-algorithms/screen-shot-2018-01-19-at-10-58-19-am/)

2.5 Decision Tree


Definition: Given data described by attributes together with its classes, a decision tree produces a sequence of
rules that can be used to classify the data.

Advantages: A decision tree (https://analyticsindiamag.com/hands-on-tutorial-how-to-use-decision-tree-regression-to-solve-machinehacks-new-data-science-hackathon/) is simple to understand and visualise, requires
little data preparation, and can handle both numerical and categorical data.

Disadvantages: Decision trees can become complex and fail to generalise well, and they can be
unstable because small variations in the data might result in a completely different tree being generated.
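
To see the "sequence of rules" a tree produces, the sketch below fits a shallow tree on synthetic data and prints it as text. The depth limit of 2 and the feature names `feature_0` to `feature_6` are assumptions made for readability.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data (assumption): 300 rows, 7 features.
X, y = make_classification(n_samples=300, n_features=7, random_state=4)

# Limiting the depth keeps the tree small, which also curbs over-fitting.
tree = DecisionTreeClassifier(max_depth=2, random_state=4).fit(X, y)

# Render the learned if/else rules as indented text.
feature_names = [f"feature_{i}" for i in range(7)]  # hypothetical names
rules = export_text(tree, feature_names=feature_names)
print(rules)
```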

(https://analyticsindiamag.com/7-types-classification-algorithms/screen-shot-2018-01-19-at-10-59-33-am/)

2.6 Random Forest


Definition: A random forest (https://analyticsindiamag.com/step-by-step-guide-to-reviews-classification-using-svc-naive-bayes-random-forest/) classifier is a meta-estimator that fits a number of decision trees on various
sub-samples of the dataset and uses averaging to improve the predictive accuracy of the model and control
over-fitting. In scikit-learn's default configuration, each sub-sample is the same size as the original input
sample, but the samples are drawn with replacement.

Advantages: Reduces over-fitting, and a random forest classifier is more accurate than a single decision tree in
most cases.

Disadvantages: Slow real-time prediction, difficult to implement, and a complex algorithm.
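
A short sketch of a random forest on synthetic data follows. The number of trees (50) is an arbitrary illustrative choice, and `bootstrap=True` is spelled out only to make the sampling-with-replacement behaviour visible.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 500 rows, 7 features.
X, y = make_classification(n_samples=500, n_features=7, random_state=5)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=5)

# Each of the 50 trees is fitted on a bootstrap sample (drawn with
# replacement) of the training data; predictions are then averaged.
forest = RandomForestClassifier(n_estimators=50, bootstrap=True,
                                random_state=5)
forest.fit(X_train, y_train)

print(len(forest.estimators_))  # number of fitted trees
test_accuracy = forest.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```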

(https://analyticsindiamag.com/7-types-classification-algorithms/screen-shot-2018-01-19-at-11-00-06-am/)

2.7 Support Vector Machine


Definition: Support vector machine (https://analyticsindiamag.com/understanding-the-basics-of-svm-with-
example-and-python-implementation/) is a representation of the training data as points in space separated into
categories by a clear gap that is as wide as possible. New examples are then mapped into that same space and
predicted to belong to a category based on which side of the gap they fall.

Advantages: Effective in high-dimensional spaces, and it uses only a subset of training points in the decision
function, so it is also memory efficient.

Disadvantages: The algorithm does not directly provide probability estimates; in scikit-learn, these are
calculated using an expensive five-fold cross-validation.
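
The sketch below shows both points on synthetic data: only some training points end up as support vectors, and without probability estimates the model still returns a signed distance to the separating gap. The RBF kernel and the scaling step are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 300 rows, 7 features.
X, y = make_classification(n_samples=300, n_features=7, random_state=6)

# probability=False (the default) skips the costly cross-validated
# probability calibration.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=False))
svm.fit(X, y)

# Only the support vectors are kept for the decision function.
svc = svm.named_steps["svc"]
n_support = svc.support_vectors_.shape[0]
print(f"{n_support} of {len(X)} training points are support vectors")

# Signed distance to the boundary; the sign decides the predicted class.
margins = svm.decision_function(X[:3])
print(margins)
```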

(https://analyticsindiamag.com/7-types-classification-algorithms/screen-shot-2018-01-19-at-11-00-44-am/)

3 Conclusion
3.1 Comparison Matrix
Accuracy: (True Positive + True Negative) / Total Population
Accuracy is the ratio of correctly predicted observations to the total number of observations. It is the
most intuitive performance measure.
True Positive: The number of correct predictions that the occurrence is positive
True Negative: The number of correct predictions that the occurrence is negative
F1-Score: (2 x Precision x Recall) / (Precision + Recall)
F1-Score is the harmonic mean of Precision and Recall and can be used with all types of classification
algorithms. It takes both false positives and false negatives into account, so it
is usually more useful than accuracy, especially if you have an uneven class distribution.
Precision: When a positive value is predicted, how often is the prediction correct?
Recall: When the actual value is positive, how often is the prediction correct?
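
The formulas above can be checked with a small worked example. The confusion counts below are hypothetical, chosen to show how accuracy can look good on an uneven class distribution while the F1-Score stays low; the function name `classification_metrics` is introduced here for illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, f1) from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # correct among predicted positives
    recall = tp / (tp + fn)      # correct among actual positives
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical counts for an imbalanced problem (130 positives in 1,000).
accuracy, precision, recall, f1 = classification_metrics(
    tp=30, tn=850, fp=20, fn=100
)

print(f"accuracy:  {accuracy:.3f}")   # 0.880
print(f"precision: {precision:.3f}")  # 0.600
print(f"recall:    {recall:.3f}")     # 0.231
print(f"f1-score:  {f1:.3f}")         # 0.333
```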

Classification Algorithms      Accuracy   F1-Score
Logistic Regression            84.60%     0.6337
Naïve Bayes                    80.11%     0.6005
Stochastic Gradient Descent    82.20%     0.5780
K-Nearest Neighbours           83.56%     0.5924
Decision Tree                  84.23%     0.6308
Random Forest                  84.33%     0.6275
Support Vector Machine         84.09%     0.6145

Code location: https://github.com/f2005636/Classification

3.2 Algorithm Selection

(https://analyticsindiamag.com/7-types-classification-algorithms/screen-shot-2018-01-19-at-11-01-28-am/)

(Types of Classification Algorithms)

Rohit Garg has close to 7 years of work experience in the field of data analytics and machine learning. He has worked
extensively in the areas of predictive modeling, time series analysis and segmentation techniques. Rohit holds a BE from
BITS Pilani and a PGDM from IIM Raipur.
