Diabetes Prediction Using Machine Learning

Prediction of diabetes using machine learning

Uploaded by

Abhishek

DIABETES PREDICTION USING MACHINE LEARNING

A PROJECT REPORT

Submitted in partial fulfilment of the requirement of the degree of

BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING

Supervised by: Submitted by:

Mr. Tejbir Rana Dimple Kumari
Assistant Professor 6319008

Department of Computer Science

GURU NANAK INSTITUTE OF TECHNOLOGY, MULLANA

AFFILIATED TO

KURUKSHETRA UNIVERSITY, KURUKSHETRA, HARYANA

CANDIDATE’S DECLARATION

I hereby certify that the work presented in this project entitled “DIABETES PREDICTION
USING MACHINE LEARNING”, in partial fulfilment of the requirements for the award of the
degree of B.Tech in Computer Science and Engineering, submitted to the Department of
Computer Science & Engineering at Guru Nanak Institute of Technology, Mullana, affiliated
to Kurukshetra University, Kurukshetra, is an authentic record of my own work carried out
under the supervision of Mr. Tejbir Rana, Assistant Professor, Department of CSE, GNIT,
Mullana, Ambala.

The matter presented here has not been submitted by me to any other
University/Institute for the award of any other degree.

DIMPLE KUMARI

6319008

This is to certify that the above statement made by the candidate is correct to the best of
my knowledge.

Asst. Prof. Tejbir Rana HOD – Sidharth Arora

Department of CSE

ACKNOWLEDGEMENT

I acknowledge the contribution of every individual who directly or indirectly helped me in
the development of this project entitled “DIABETES PREDICTION USING MACHINE LEARNING”.
Without their support it would have been a tough job for me to complete this project.

I express my sincere thanks to Mr. Sidharth Arora (HOD, CSE), who guided me through the
process of learning and implementing Python, the language used to develop this project.

I express my deep sense of gratitude to my colleagues, friends and parents for their
valuable moral support.

Abstract
Diabetes (diabetes mellitus) is a group of metabolic disorders that affects millions of
people. Detecting diabetes early is of great significance because of the serious
complications it can cause. Many research studies have addressed the diagnosis of diabetes,
and most of them are based on one particular data set, the Pima Indian diabetes data set.
This data set comes from a study, begun in 1965, of women of Pima Indian heritage, a Native
American population living near Phoenix, Arizona, in which the onset rate of diabetes is
relatively high. Most previous studies focused primarily on one or two specialized, complex
techniques for testing the data, while a comprehensive study of several general techniques
is missing. In this work, we explore the most popular machine learning techniques (e.g. the
KNN algorithm) used to identify diabetes, together with data pre-processing methods. We
evaluate these techniques by their cross-validation accuracy on the UCI ML repository data
set.

Keywords: Machine learning, Classification, KNN, Diabetes.

TABLE OF CONTENTS
Contents
ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
Chapter 1
1.1 Introduction
1.2 The contribution of this work
Chapter 2
2.1 Literature survey
Chapter 3
3.1 Methodology
3.2 Data Constraints
3.3 Train Dataset and Test Dataset
3.4 Pre-processing of data
3.5 Feature Extraction
3.6 ML Algorithm: KNN
3.7 Result
Chapter 4
4.1 Proposed Work
4.2 Dataset Description
4.3 Correlation Matrix
4.4 Histogram
4.5 Bar Plot For Outcome Class
4.6 K-Nearest Neighbours
4.7 Logistic Regression
4.8 Decision Tree
4.9 Feature importance in decision trees
4.10 Random Forest
4.11 Feature importance in random forest
4.12 Support Vector Machine
4.13 Accuracy Comparison
Chapter 5
5.1 Source Code and Output
Chapter 6
6.1 Conclusion
6.2 Future Scope
6.3 Bibliography

LIST OF FIGURES

Fig no Figure Description

1 Methodology
2 Data Pre-Processing
3 Proposed Model Diagram
4 Correlation Matrix
5 Histogram
6 Bar Plot For Outcome Class
7 K-Nearest Neighbours
8 Logistic Regression
9 Feature Importance In Decision Tree
10 Feature Importance In Random Forest
11 Support Vector Machine

CHAPTER-1
1.1 INTRODUCTION

Diabetes mellitus, or diabetes for short, is a chronic disease that occurs either when the
pancreas does not produce enough insulin or when the body cannot effectively use the
insulin it produces. Diabetes has two main types, type 1 and type 2. In type 1 diabetes
(also known as insulin-dependent or childhood-onset diabetes), insulin production in the
body is deficient, which requires daily administration of insulin, whereas in type 2
diabetes (formerly known as non-insulin-dependent or adult-onset diabetes), the body cannot
use insulin effectively. According to the World Health Organization (WHO), the number of
people with diabetes in 2014 was 422 million. Moreover, in 2016, diabetes was the direct
cause of 1.6 million deaths.

There are different causes of diabetes. For instance, type 1 diabetes mellitus (T1DM) can
develop due to an autoimmune reaction that destroys the cells in the pancreas that make
insulin, called beta cells, whereas the main risk factors for type 2 diabetes are age,
family history of diabetes, high blood pressure, high levels of triglycerides, heart
disease or stroke [3]. Early detection of diabetes can be of great benefit, especially
because the rate of progression from prediabetes to type 2 diabetes is quite high.
According to the CDC, diabetes can affect any part of the body over time, leading to
different types of complications. The most common complications are divided into micro- and
macrovascular disorders. The former are long-term complications that affect small blood
vessels, including retinopathy, nephropathy, and neuropathy. Macrovascular disorders,
however, include ischemic heart disease, peripheral vascular disease, and cerebrovascular
disease. Due to the high mortality and morbidity of diabetes and its possible
complications, it is very important to understand how to deal with diabetes and how to
prevent such complications.

To reduce the possibility of developing serious complications related to diabetes, machine
learning and data mining techniques can be applied to diabetes-related datasets. Machine
learning is a branch of artificial intelligence and computer science which focuses on the
use of data and algorithms to imitate the way that humans learn. Machine learning itself
can be divided into two main categories, namely supervised and unsupervised learning. The
main goal in both cases is to make use of a given dataset to enhance our understanding of
the data and discover useful knowledge. Supervised machine learning is characterized by the
use of labeled data to train its algorithms and can be utilized for classification or
regression tasks. The goal of classification is to assign each unknown instance to one of
the possible classes or categories for prediction or diagnosis purposes. The proposed work
implements several supervised machine learning techniques and algorithms to predict
different complications related to diabetes. Unlike typical diabetes datasets, the
complications set consists of various collections of complications such as metabolic
syndrome, dyslipidemia, neuropathy, nephropathy, diabetic foot, hypertension, obesity, and
retinopathy. Furthermore, logistic regression (LR), support vector machine (SVM), decision
tree (DT, CART), random forest (RF), AdaBoost, and XGBoost were utilized to build and
evaluate the resulting classifiers.

1.2 The contributions of this work are as follows:

• Implementation and evaluation of traditional and ensemble machine learning models to
predict eight complications in diabetic patients by utilizing a comprehensive UAE-based
dataset.

• Identification of the dominant characteristics that may lead to diabetic complications
using feature selection methods.

CHAPTER-2

2.1 LITERATURE SURVEY

Defusal Faruque, Asaduzzaman and Iqbal H. Sarker discussed that diabetes is one of the most
common disorders of the human body and that it is caused by a metabolic disorder. They used
several important ML algorithms, namely support vector machine (SVM), Naive Bayes (NB), KNN
and decision tree (DT), to predict diabetes.

Sidong Wei, Xuejiao Zhao and Chunyan Miao described diabetes as a disorder in which the
glucose level in the body is high. In their paper they use popular methods such as SVM and
deep neural networks to identify the disease, along with data pre-processing.

Lakshmi K.S and G. Santhosh Kumar observed that hospital databases serve as a rich source
of information for effective medical diagnosis. They used NLP tools combined with data
mining algorithms for the extraction of rules.

Jian-xun Chen, Shih-Li Su and Che-Ha Chang discussed an ontology that generates a primary
care plan for medical professionals. The results of the research paper show that the model
can provide personalized diabetes mellitus care planning efficiently.

MM Alotaib, RSH. Istepanian and A. Sungoor presented an intelligent mobile diabetes
management and education system for patients with diabetes. The system is able to store
clinical information about the patient, such as frequent blood sugar and blood pressure
(BP) measurements and hypoglycaemia events.

Berina Alic, Lejila Gurbea and Almir Badnjevic presented an overview of machine learning
techniques for the classification of diabetes and cardiovascular diseases using Bayesian
networks (BNs) and artificial neural networks (ANNs).

CHAPTER-3

3.1 METHODOLOGY

The proposed method uses the KNN algorithm for classification and prediction of diabetes
from training data. The proposed system also predicts the time of onset of diabetes.

Figure 1: Methodology

3.2 Data Constraints:

The data is a collection drawn from a global dataset. In this system, the Pima Indian data
set is used for training the model. The data set contains 21 parameters and around 1000
records. The dataset features/parameters are:

• Age

• Gender

• Relation

• DOB

• Sugar tested value

• Symptoms

• Family history etc.

This data is used to train the model for the prediction of diabetes.

3.3 Train Dataset and Test Dataset:

The training data is the initial set of data from which the model learns. The model must
first be trained on this data so that it can learn the relationship between the features
and the outcome; this data is available on the system. It is the data to which the learning
algorithm fits the model so that the model can later make predictions automatically.

The testing data is input that is held back from training and given to the trained model.
It shows how the model behaves on data it has not seen, and it is used only for testing.
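
The split described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the records, the 25% test fraction and the helper name `train_test_split` are made up for the example and are not taken from the project's source code.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical records: (glucose, bmi, outcome). The values are made up.
records = [(85, 26.6, 0), (183, 23.3, 1), (89, 28.1, 0), (137, 43.1, 1),
           (116, 25.6, 0), (78, 31.0, 1), (115, 35.3, 0), (197, 30.5, 1)]

train, test = train_test_split(records, test_fraction=0.25, seed=42)
print(len(train), "training rows,", len(test), "test rows")
```

The model is fitted only on `train`; `test` is kept aside and used once the model is trained.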

3.4 Pre-processing of data:

Data pre-processing is the process of converting raw data into a clean data set. It is the
step in which the data is transformed, or encoded, into a state that the machine can easily
parse. The major tasks of data pre-processing in the learning process are to remove
unwanted data and to fill in missing values, so that the model can be trained easily.

Figure 2: Data Pre-processing
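
As a minimal sketch of the "filling missing values" step, the snippet below replaces missing entries with the median of their column. The choice of the median, the use of `None` as the missing marker and the sample values are assumptions made for illustration; the report does not state which strategy was used.

```python
from statistics import median

def impute_with_median(rows, missing=None):
    """Replace missing entries in each column with that column's median."""
    n_cols = len(rows[0])
    medians = []
    for c in range(n_cols):
        present = [r[c] for r in rows if r[c] is not missing]
        medians.append(median(present))      # median of the observed values only
    return [[medians[c] if r[c] is missing else r[c] for c in range(n_cols)]
            for r in rows]

# Hypothetical rows of (glucose, bmi) with two missing readings.
raw = [[148, 33.6], [None, 26.6], [183, None], [89, 28.1]]
clean = impute_with_median(raw)
print(clean)
```

A column in which every value is missing would make `median` raise an error, so such columns would have to be dropped beforehand.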

3.5 Feature Extraction:

Feature extraction is the method used to transform the input data into features for the
outcome. These features are used to compute characteristics of the given patterns that help
to differentiate between the classes of input patterns. The method reduces the amount of
resources required to describe a large set of data. Feature extraction is an attribute
reduction process. It is also used to increase the speed and effectiveness of supervised
learning.

3.6 ML Algorithm: KNN

The k-nearest neighbours (KNN) algorithm is a non-parametric ML method, proposed by Thomas
Cover, that is used for regression and classification. In industry it is mainly used for
classification problems. KNN is a type of instance-based learning method. Because the
algorithm relies on distance for classification, normalizing the training data can improve
its accuracy dramatically. The neighbours are taken from a set of objects for which the
class or the object property value is known. This set can be thought of as the training set
for the algorithm, although no explicit training step is required.
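
The idea can be sketched as follows: features are first rescaled to a common range, then a query point receives the majority label of its k closest training points. This is a simplified illustration, not the project's implementation; min-max scaling, Euclidean distance, the helper names and the sample (glucose, age) values are all assumptions.

```python
from collections import Counter
import math

def min_max_scale(rows):
    """Rescale every column to [0, 1] so that no feature dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - lo) or 1.0 for c, lo in zip(cols, lows)]
    scaled = [[(v - lo) / s for v, lo, s in zip(r, lows, spans)] for r in rows]
    return scaled, lows, spans

def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical (glucose, age) rows with made-up outcome labels.
train_x_raw = [[85, 31], [89, 21], [116, 30], [183, 32], [137, 33], [197, 53]]
train_y = [0, 0, 0, 1, 1, 1]

scaled, lows, spans = min_max_scale(train_x_raw)
query_raw = [170, 40]
# The query must be scaled with the training set's minimum and span.
query = [(v - lo) / s for v, lo, s in zip(query_raw, lows, spans)]
prediction = knn_predict(scaled, train_y, query, k=3)
print("predicted class:", prediction)
```

There is no fitting step: "training" amounts to keeping `scaled` and `train_y` in memory, which matches the description above.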

3.7 Result:

After taking the input data, the system is able to predict the outcome by applying the ML
algorithm, and it provides the final output in a form that helps to detect diabetes
mellitus and to choose the most accurate treatment for it.

CHAPTER-4

4.1 PROPOSED WORK

The prediction or detection of diabetes is a major concern because of its severe
complications. The complications of diabetes are shown in the picture below. Detecting
diabetes mellitus in its early phase plays a significant role in treating it.

The detection of diabetes plays a very important role in human life because the disease can
lead to death. The proposed system is used for the early detection of diabetes and for time
prediction, where time prediction means predicting when the patient will develop diabetes,
which helps to improve the patient's habits. The proposed system concentrates mainly on the
development of a machine learning model, and it is also helpful in the medical sector for
identifying the disease. The proposed system automates the prediction of diabetes using
data from past patients.

4.2 Dataset Description:

The diabetes data set originates from https://www.kaggle.com/johndasilva/diabetes.

The dataset contains 2000 cases. The objective is to predict, based on the measurements,
whether the patient is diabetic or not.

Figure-3 : Proposed Model Diagram.

RESULT & DISCUSSION

4.3 Correlation Matrix:

Figure-4

It is easy to see that no single feature has a very high correlation with the outcome
value. Some of the features have a negative correlation with the outcome value and some
have a positive one.
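
Each cell of such a matrix is a Pearson correlation coefficient. The sketch below computes it for two feature columns against the outcome. The numbers are made up, and the `activity_hours` column is a hypothetical feature included only to show a negative coefficient; it is not a column of the dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up columns; "activity_hours" is hypothetical and not part of the dataset.
columns = {
    "glucose": [85, 183, 89, 137, 116, 197],
    "activity_hours": [5, 1, 6, 2, 4, 1],
}
outcome = [0, 1, 0, 1, 0, 1]

corr = {name: pearson(values, outcome) for name, values in columns.items()}
for name, r in corr.items():
    print(f"{name}: {r:+.2f}")
```

A coefficient near +1 or -1 indicates a strong linear relationship with the outcome, while values near 0 indicate a weak one.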

4.4 Histogram:

Figure-5

Let’s take a look at the plots. They show how each feature and the label are distributed
over different ranges, which further confirms the need for scaling. Next, wherever you see
discrete bars, it basically means that the variable is categorical. We will need to handle
these categorical variables before applying machine learning. Our outcome label has two
classes, 0 for no disease and 1 for disease.

4.5 Bar Plot For Outcome Class:

Figure-6

The above graph shows that the data is biased towards data points with an outcome value of 0,
which means that diabetes was not present. The number of non-diabetics is almost twice the
number of diabetic patients.
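
One practical consequence of this imbalance is that a model which always predicts the majority class already reaches a fairly high accuracy, so reported accuracies should be read against that baseline. The sketch below uses a made-up label list with roughly the same 2:1 ratio; it does not use the real class counts.

```python
from collections import Counter

# Made-up outcome labels with roughly a 2:1 ratio of class 0 to class 1.
outcomes = [0] * 13 + [1] * 7

counts = Counter(outcomes)
majority_label, majority_count = counts.most_common(1)[0]

# Accuracy of a trivial model that always predicts the majority class.
baseline_accuracy = majority_count / len(outcomes)
print(dict(counts), "baseline accuracy:", baseline_accuracy)
```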

4.6 k-Nearest Neighbours:

The k-NN algorithm is arguably the simplest machine learning algorithm. Building the model consists
only of storing the training data set. To make a prediction for a new data point, the algorithm finds
the closest data points in the training data set, its “nearest neighbours.”

First, let’s investigate whether we can confirm the connection between model complexity and
accuracy:

Figure-7

The above plot shows the training and test set accuracy on the y-axis against the setting of
n_neighbors on the x-axis. If we choose a single nearest neighbour, the prediction on the
training set is perfect. But when more neighbours are considered, the training accuracy
drops, indicating that using the single nearest neighbour leads to a model that is too
complex. The best performance is somewhere around 9 neighbours.
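
The experiment behind such a plot can be sketched as a loop over the number of neighbours that records training and test accuracy for each setting. The points below are made up and already scaled to [0, 1], and the helper names are hypothetical; the project's own numbers come from the real dataset.

```python
from collections import Counter
import math

def knn_predict(train, query, k):
    """train is a list of (features, label); majority vote among the k nearest."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of points in data whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)

# Made-up, already scaled points: ((feature_1, feature_2), label).
train = [((0.10, 0.20), 0), ((0.20, 0.10), 0), ((0.30, 0.30), 0), ((0.45, 0.55), 1),
         ((0.60, 0.40), 0), ((0.70, 0.80), 1), ((0.80, 0.70), 1), ((0.90, 0.90), 1)]
test = [((0.15, 0.15), 0), ((0.85, 0.80), 1), ((0.50, 0.50), 0), ((0.75, 0.75), 1)]

results = {k: (accuracy(train, train, k), accuracy(train, test, k))
           for k in (1, 3, 5)}
for k, (train_acc, test_acc) in results.items():
    print(f"k={k}: train accuracy={train_acc:.2f}, test accuracy={test_acc:.2f}")
```

With k=1 every training point is its own nearest neighbour, which is why the training accuracy is perfect whenever no two training points coincide.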

TABLE-1

4.7 Logistic regression:

Logistic Regression is one of the most common classification algorithms.

TABLE-2

➔ In the first row, the default value of C=1 gives 77% accuracy on the training set and 78%
accuracy on the test set.

➔ In the second row, using C=0.01 results in 78% accuracy on both the training and the test
sets.

➔ Using C=100 results in slightly lower accuracy on the training set and slightly higher
accuracy on the test set, suggesting that less regularization and a more complex model may
not generalize better than the default setting.

Therefore, we should choose the default value C=1.
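
To illustrate what C controls, the sketch below fits a small logistic regression by gradient descent with an L2 penalty whose strength is 1/C, so a small C means strong regularization and small weights. It is a simplified stand-in written for this explanation, not the library routine used in the project; the data, learning rate and epoch count are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, C=1.0, lr=0.05, epochs=2000):
    """Gradient descent on the log-loss plus an L2 penalty of ||w||^2 / (2C)."""
    n, d = len(xs), len(xs[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [wj / C for wj in w], 0.0   # gradient of the penalty
        for x, y in zip(xs, ys):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
            grad_w = [g + err * xj for g, xj in zip(grad_w, x)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Made-up, already scaled features with binary labels.
xs = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.4, 0.6],
      [0.6, 0.4], [0.7, 0.8], [0.8, 0.7], [0.9, 0.9]]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

norms = {}
for C in (0.01, 1.0, 100.0):
    w, b = fit_logistic(xs, ys, C=C)
    norms[C] = math.sqrt(sum(wj * wj for wj in w))
    print(f"C={C}: weight norm={norms[C]:.3f}")
```

Larger weights allow a sharper, more complex decision boundary, which is the trade-off the three rows above compare.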

Figure-8

4.8 Decision Tree:

This classifier builds a decision tree and uses it to assign a class value to each data
point. Here, we can vary the maximum number of features to be considered while building
the model.

TABLE-3

The accuracy on the training set is 100% and the test set accuracy is also good.

4.9 Feature Importance in Decision Trees

Feature importance rates how important each feature is for the decision a tree makes. It is
a number between 0 and 1 for each feature, where 0 means “not used at all” and 1 means
“perfectly predicts the target”.

Figure-9

Feature “Glucose” is by far the most important feature.
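
A common way to obtain such scores is to add up how much each feature's splits reduce the Gini impurity and then normalize the totals so they sum to 1. The sketch below computes that reduction for a single split per feature. The sample values and thresholds are made up, and a real tree sums the reduction over all of its splits.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def impurity_decrease(values, labels, threshold):
    """Reduction in Gini impurity when splitting at values <= threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

# Made-up sample: the glucose split separates the classes, the BMI split barely does.
glucose = [85, 89, 116, 137, 183, 197]
bmi = [26.6, 28.1, 25.6, 43.1, 23.3, 30.5]
labels = [0, 0, 0, 1, 1, 1]

gain_glucose = impurity_decrease(glucose, labels, threshold=120)
gain_bmi = impurity_decrease(bmi, labels, threshold=27)
print("glucose:", gain_glucose, "bmi:", round(gain_bmi, 4))
```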

4.10 Random Forest:

This classifier takes the concept of decision trees to the next level. It creates a forest of trees
where each tree is formed by a random selection of features from the total features.
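
The randomization described above can be sketched as two small helpers: one draws the rows and features a single tree would be trained on, the other combines the trees' votes. This follows the per-tree feature selection as the text describes it; many library implementations instead draw a fresh feature subset at every split. Feature names other than Glucose and BMI are assumptions, as are the helper names.

```python
from collections import Counter
import random

def draw_tree_inputs(n_rows, feature_names, n_features, rng):
    """Pick a bootstrap sample of row indices and a random subset of features."""
    row_idx = [rng.randrange(n_rows) for _ in range(n_rows)]   # with replacement
    features = rng.sample(feature_names, n_features)          # without replacement
    return row_idx, features

def majority_vote(votes):
    """Combine the per-tree predictions into the forest's prediction."""
    return Counter(votes).most_common(1)[0][0]

# "Age" and "BloodPressure" are assumed column names used for illustration.
feature_names = ["Glucose", "BMI", "Age", "BloodPressure"]
rng = random.Random(0)

row_idx, features = draw_tree_inputs(n_rows=10, feature_names=feature_names,
                                     n_features=2, rng=rng)
print("rows for this tree:", row_idx)
print("features for this tree:", features)
print("forest prediction:", majority_vote([1, 0, 1]))
```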

TABLE-4

4.11 Feature importance in Random Forest:

Figure-10

Similarly to the single decision tree, the random forest also gives a lot of importance to the
“Glucose” feature, but it also chooses “BMI” to be the 2nd most informative feature overall.

4.12 Support Vector Machine:

This classifier aims to form a hyperplane that separates the classes as well as possible by
adjusting the distance between the data points and the hyperplane. There are several
kernels on which the hyperplane can be based. I tried four kernels, namely linear, poly,
rbf and sigmoid.

Figure-11

As can be seen from the plot above, the linear kernel performed the best for this dataset
and achieved a score of 77%.
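
For reference, the four kernels can be written as plain functions of two feature vectors. The formulas are the standard textbook definitions; the parameter names `gamma`, `coef0` and `degree` and their default values are assumptions for this sketch and are not settings reported by the project.

```python
import math

def dot(x, z):
    return sum(a * b for a, b in zip(x, z))

def linear_kernel(x, z):
    return dot(x, z)

def poly_kernel(x, z, degree=3, gamma=1.0, coef0=0.0):
    return (gamma * dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)

x, z = [1.0, 2.0], [3.0, 4.0]
print("linear :", linear_kernel(x, z))
print("poly   :", poly_kernel(x, z, degree=2, coef0=1.0))
print("rbf    :", rbf_kernel(x, z, gamma=0.5))
print("sigmoid:", sigmoid_kernel(x, z, gamma=0.01))
```

The linear kernel keeps the decision boundary a flat hyperplane in the original feature space, while the other three let the classifier draw curved boundaries.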

4.13 Accuracy Comparison:

TABLE-5

Table-5 shows the accuracy values for all five machine learning algorithms. It shows that
the Decision Tree algorithm gives the best accuracy, with 98% training accuracy and 99%
testing accuracy.

CHAPTER-5

5.1 Source code/output

CHAPTER-6

6.1 Conclusion

The prediction of diabetes is of great importance in today's scenario because of its severe
complications. Diabetes is one of the leading causes of death worldwide. The system model
focuses mainly on identifying diabetes from a set of parameters. The system is useful to
physicians for predicting diabetes at an early stage, so that conventional treatments and
solutions can be given to patients. The system uses ML techniques for the prediction in
order to get more precise results. A great deal of research has been carried out on
diabetes prediction. Building a diabetes prediction system is useful for hospitals and
doctors. The system predicts the disease at an early stage, so doctors can treat patients
in a better way. The proposed model is a real-time application that is meant for multiple
hospitals and predicts the disease in less time. As we use machine learning algorithms for
disease prediction, we expect to get more accurate and efficient results.

6.2 Future Scope

The proposed system uses the KNN algorithm to detect diabetes. In data science there are
many algorithms for classification, such as Naive Bayes, SVM, Decision Tree, ID3, etc. In
future we can add more algorithms and compare their outputs to find the most efficient one.
We can add a visitor query module, where visitors can post queries to the administrator and
the administrator can reply to them. We can also add a treatment module, where doctors
upload treatment details for patients and patients can view those details.

6.3 BIBLIOGRAPHY
1. World Health Organization. Diabetes. 10 November 2020. Available online:
https://www.who.int/news-room/fact-sheets/detail/diabetes (accessed on 30 November 2020).

2. Centers for Disease Control and Prevention. What Is Type 1 Diabetes. Available online:
https://www.cdc.gov/diabetes/basics/what-is-type-1-diabetes.html (accessed on 22 November
2021).

3. MedlinePlus. How to Prevent Diabetes. 15 June 2020. Available online:
https://medlineplus.gov/howtopreventdiabetes.html (accessed on 30 November 2020).

4. Centers for Disease Control and Prevention. Diabetes. 2019. Available online:
https://www.cdc.gov/diabetes/managing/problems.html (accessed on 30 November 2020).

5. Cade, W.T. Diabetes-Related Microvascular and Macrovascular Diseases in the Physical Therapy
Setting. Phys. Ther. 2008, 88, 1322–1335. [CrossRef] [PubMed]

6. Tekieh, M.H.; Raahemi, B. Importance of Data Mining in Healthcare. In Proceedings of the 2015
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015—
ASONAM, Paris, France, 25–28 August 2015; pp. 1057–1062. [CrossRef]

7. Sharma, R.; Singh, S.N.; Khatri, S. Medical Data Mining Using Different Classification and
Clustering Techniques: A Critical Survey. In Proceedings of the 2016 Second International Conference
on Computational Intelligence & Communication Technology (CICT), Ghaziabad, India, 12–13
February 2016; pp. 687–691. [CrossRef]

8. Dominic, V.; Gupta, D.; Khare, S.; Aggarwal, A. Investigation of chronic disease correlation using
data mining techniques. In Proceedings of the 2015 2nd International Conference on Recent
Advances in Engineering & Computational Sciences (RAECS), Chandigarh, India, 21–22 December
2015; pp. 1–6. [CrossRef]

9. Hasan, M.K.; Alam, M.A.; Das, D.; Hossain, E.; Hasan, M. Diabetes Prediction Using Ensembling of
Different Machine Learning Classifiers. IEEE Access 2020, 8, 76516–76531. [CrossRef]

10. Sisodia, D.; Sisodia, D.S. Prediction of Diabetes using Classification Algorithms. Procedia Comput.
Sci. 2018, 132, 1578–1585. [CrossRef]

11. Meng, X.-H.; Huang, Y.-X.; Rao, D.-P.; Zhang, Q.; Liu, Q. Comparison of three data mining models
for predicting diabetes or prediabetes by risk factors. Kaohsiung J. Med. Sci. 2013, 29, 93–99.
[CrossRef] [PubMed]

12. Abdulhadi, N.; Al-Mousa, A. Diabetes Detection Using Machine Learning Classification Methods.
In Proceedings of the 2021 International Conference on Information Technology (ICIT), Amman,
Jordan, 14–15 July 2021; pp. 350–354. [CrossRef]

13. Kantawong, K.; Tongphet, S.; Bhrommalee, P.; Rachata, N.; Pravesjit, S. The Methodology for
Diabetes Complications Prediction Model. In Proceedings of the 2020 Joint International Conference
on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical,
Electronics, Computer and Telecommunications Engineering (ECTI DAMT & NCON), Pattaya,
Thailand, 11–14 March 2020; pp. 110–113. [CrossRef]

14. Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106. [CrossRef]

15. Islam, M.S.; Qaraqe, M.K.; Belhaouari, S.B. Early Prediction of Hemoglobin A1c: A novel
Framework for better Diabetes Management. In Proceedings of the 2020 IEEE Symposium Series on
Computational Intelligence (SSCI), Canberra, Australia, 1–4 December 2020; pp. 542–547. [CrossRef]

16. Dagliati, A.; Marini, S.; Sacchi, L.; Cogni, G.; Teliti, M.; Tibollo, V.; De Cata, P.; Chiovato, L.;
Bellazzi, R. Machine Learning Methods to Predict Diabetes Complications. J. Diabetes Sci. Technol.
2018, 12, 295–302. [CrossRef]

3 pages
Catch Up Friday Research
No ratings yet
Catch Up Friday Research
1 page
Trainng report-BSNL: Monday, July 2, 2007
No ratings yet
Trainng report-BSNL: Monday, July 2, 2007
40 pages
Big Data in Healthcare Systems and Research
No ratings yet
Big Data in Healthcare Systems and Research
4 pages
Kruse
No ratings yet
Kruse
25 pages
Final Project Report MRI Reconstruction
No ratings yet
Final Project Report MRI Reconstruction
19 pages
Cloud Seeding
No ratings yet
Cloud Seeding
23 pages
Eyal Lederman - Process Approach in PT
100% (1)
Eyal Lederman - Process Approach in PT
72 pages
Definition: The Ability To Use Strength Quickly To Produce An Explosive Effort
No ratings yet
Definition: The Ability To Use Strength Quickly To Produce An Explosive Effort
41 pages
API 653 Notes
No ratings yet
API 653 Notes
3 pages
Practice Exam For Final Exam Acct301 With Answers
No ratings yet
Practice Exam For Final Exam Acct301 With Answers
9 pages
Mitochondrial Disorders Biochemical and Molecular Analysis Methods in Molecular Biology Vol 837 2012th Edition Lee-Jun C. Wong (Editor) Download PDF
100% (2)
Mitochondrial Disorders Biochemical and Molecular Analysis Methods in Molecular Biology Vol 837 2012th Edition Lee-Jun C. Wong (Editor) Download PDF
84 pages
The Nexus Between Visioning and Planning
No ratings yet
The Nexus Between Visioning and Planning
2 pages
Sneha SVMCM SC 2023-2024
No ratings yet
Sneha SVMCM SC 2023-2024
2 pages



