
Python Report 2

This document outlines a project on developing a Diabetes Prediction model using Streamlit, a Python library for creating web applications. It details the process of data collection, exploration, training, and deployment of a machine learning model to predict diabetes based on various health metrics. The project aims to enhance early detection and intervention for diabetes through improved predictive accuracy using machine learning techniques.

Uploaded by

Akshatha M

Diabetes Prediction model using streamlit

VISVESVARAYA TECHNOLOGICAL UNIVERSITY


"Jnana Sangama", Belagavi – 590018

INTER/INTRA INSTITUTIONAL INTERNSHIP

Submitted in partial fulfillment of the requirement for the award of the degree

Bachelor of Engineering

In
ELECTRONICS AND COMMUNICATION

Submitted by
DEEPTHI A : 4GL21EC015

Under the guidance of


Prof. Pavithra T

M.Tech

HOD(Group A)

Dept. E&C Engineering

GEC Kushalnagar -571 234

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

GOVERNMENT ENGINEERING COLLEGE, KUSHALNAGAR.

Government Engineering College, Kushalnagar-571234


Chapter 1
Abstract

Python is a popular, interpreted, object-oriented programming language. It is used for
software development, game development, scripting and mathematics. It was created by
Guido van Rossum and first released in 1991. Python 1.0 was released in 1994 with
features such as lambda, map and filter. Python 2.0 followed in 2000 with features such as
list comprehensions and cycle-detecting garbage collection. Python 3.0 was released in 2008
as a backward-incompatible revision of the language. The Jupyter Notebook app is a
server-client application that allows editing and running notebook documents via a web
browser. It can be run on a local desktop with no internet access, or installed on a remote
server and accessed through the internet. In addition to displaying, editing and running
notebook documents, the Jupyter Notebook app has a dashboard: a control panel that shows
local files and allows opening notebook documents or shutting down their kernels. As an
alternative to this tool, Colaboratory, or Colab for short, is a product from Google Research.
Colab allows anybody to write and execute arbitrary Python code through the browser and is
especially well suited to machine learning, data analysis and education. More technically,
Colab is a hosted Jupyter notebook service that requires no setup to use, while providing
access free of charge to computing resources including GPUs.

Streamlit is a fast, open-source Python library that makes it easy to build custom web apps
for machine learning and data science. It allows you to create highly interactive dashboards
with only some knowledge of Python. Applications built with Streamlit make a good
impression on the end user because they have a clean user interface and support many
user-friendly widgets. Creating apps in Streamlit is also easy. We will create an application
using Streamlit that predicts whether a user has diabetes or not. The dataset contains 8
predictor variables and 1 target variable. Let us look at the different attributes in the
dataset. The target variable is named Outcome and is encoded as 0 or 1, where 1 represents
diabetic.


Chapter 2

Introduction
According to the National Institutes of Health (NIH), “Diabetes is a disease that occurs when
your blood glucose, also called blood sugar, is too high.” Most of the food we eat is broken
down into a sugar called glucose, and insulin is the hormone that enables glucose to get into
our cells. Diabetes is caused by the body’s inability to produce enough insulin or to properly
utilize the insulin it produces, resulting in excess glucose in the blood and leading to
significant health issues. Although there is no cure for diabetes, you can take steps to manage
it and preserve your health. There are three major types of diabetes: type 1, type 2 and
gestational diabetes.
(1) Type 1 diabetes: Your body does not produce insulin. Your immune system targets and
kills the insulin-producing cells in your pancreas. Type 1 diabetes is most commonly
diagnosed in children and young adults, although it can affect anybody. If you have type 1
diabetes you need to take insulin every day.
(2) Type 2 diabetes: Your body does not generate or utilize insulin well. It is important to get
tested if you are at risk, because symptoms may not appear. Type 2 diabetes can be delayed
or prevented by leading a healthy lifestyle.
(3) Gestational diabetes: This happens to certain women during pregnancy. This form of
diabetes usually goes away once the baby is born. If you have experienced gestational
diabetes, you are more likely to develop type 2 diabetes later, and your baby is more likely to
suffer from health issues, including type 2 diabetes.
We are going to build a project on diabetes prediction using machine learning. Machine
learning is very useful in the medical field for detecting many diseases at an early stage.
Diabetes prediction is one such machine learning application, which helps to detect diabetes
in humans. We will also see how to deploy a machine learning model using Streamlit.


Chapter 3

Company profile

3.1 Formation of company


Aqmenz Automation Private Limited is a private company incorporated on 15th October 2018.
It is classified as a non-government company and is registered at the Registrar of Companies,
Bangalore.

Brief history of company


Aqmenz Automation Pvt Ltd (AAPL) is situated in RT Nagar, in the northern part of Bangalore,
Karnataka. AAPL provides mechanical design and automation solutions to its client
companies. AAPL is also involved in open-source robotics and has developed different
varieties of robots.
AAPL also started INDOSKILL, a separate platform for students to get training and work
on various real-time industrial projects. Indoskill offers skill-oriented hands-on training
through an online platform.

Field of Expertise: Open-source Robotics, Industrial Automation, Product Design, Python &
Deep Learning and Embedded Systems

Objectives
• AAPL believes in the Skill India mission and vision; hence our utmost priority is to add
skills to the young generation and make them profitable and productive for the nation.
• We aim to provide industrial automation training skill-module kits to institutions,
universities and college lab facilities at the lowest possible price for the benefit of
technical students.
• Identifying young entrepreneurs, and motivating and training them to establish startups
that create employment as well as prosperity for the nation.
• Consulting on, sourcing and supplying highly skilled manpower to industry for better
efficiency and productivity.
• Providing low-cost and precise industrial automation solutions. We are eager to find
solutions to the most complex industrial problems in a modest way.


3.2 Vision and mission


Our motto and vision are to create awareness among the young generation and train them for
current and future job demands, while helping students and employees meet the essential
requirements of future human resources and skill demands. We are in the 4th industrial
revolution. Technological change is more disruptive than ever before; hence continuous
awareness of the need for upskilling is essential. Aqmenz Automation Pvt. Ltd. is working to
enhance the potential of students and employees so that future human resources will be
beneficial, purposeful and profitable to the nation.

Major Milestones
We have undertaken many industrial projects. Our major clients are BIAL (Bangalore
International Airport Limited), GE (General Electric) and Amics Technologies.

About the company:


Organization structure
The organization has three departments: design, software, and sales and marketing.

AAPL
  Design | Software | Sales & Marketing
Services offered
• Design and automation solutions.
• All types of automation projects for companies using PLCs, SCADA and embedded systems.
• Robots and robotic solutions for small and medium-scale companies.
• Embedded solutions for companies like GE.
• Technical, skill-oriented training programs for engineering colleges.
• Robotics and automation lab equipment for colleges.

Number of people working in company and their responsibilities:


There are 20 people in this company, including:

• Shamanna Mohan, Chief Executive Officer (CEO)
• Mohammed Azhar Hussain, Chief Technology Officer (CTO)

3.3 Ongoing projects


• Automation-related projects
• CNC machines
• Open-source custom robots
• Garment industry slider project

Chapter 4


System analysis
4.1 Existing system:
• The current methods used for diabetes prediction, such as manual diagnosis or basic
statistical analysis, are not very accurate or efficient.
• The existing system has limitations and shortcomings, such as low accuracy, a
time-consuming process and a lack of automation.
• In the existing system, a single data mining technique is used to diagnose the disease.
• There is no previous research that identifies which data mining technique provides
more reliable accuracy.
• Existing approaches use only internal measures, such as fasting plasma glucose, to
predict type 2 diabetes.
• Diabetes prediction using algorithms such as k-nearest neighbours, k-means and branch
and bound has been proposed. A basic diabetes dataset was chosen for carrying out
the comparative analysis, and the importance of feature analysis for predicting diabetes
with machine learning techniques was discussed.

4.2 Proposed system:


• The proposed system treats classification of the Pima Indians Diabetes dataset as a
binary classification problem.
• This is to be achieved through machine learning and deep learning classification
algorithms.
• For machine learning, the SVM algorithm is proposed, whereas for deep learning a neural
network is used.
• The proposed system aims to improve the accuracy of prediction through deep learning
techniques.
• The system offers early detection, personalized predictions, scalability and
automation. By analyzing various data sources and variables, the model can
potentially identify patterns and indicators of diabetes at an early stage, allowing for
timely intervention and treatment.
• The proposed system can handle a large volume of data and can be easily scaled to
accommodate a growing number of users.

Chapter 5


Project working and flow

The work flow to build the Machine learning project to predict diabetes is as follows:

1. Collection of data
2. Exploring the data
3. Splitting the data
4. Training the model
5. Evaluating the model
6. Deploying the model

5.1 Collection of data:

The very first step is to choose the dataset for our model. We can get a lot of different
datasets from Kaggle. You just need to sign in to Kaggle and search for any dataset you
need for the project. The objective is to predict whether a patient has diabetes based on
diagnostic measurements. Several constraints were placed on the selection of these
instances from a larger database.

The data contains 9 columns which are as follows :

• Pregnancies: Number of times pregnant
• Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
• Blood Pressure: Diastolic blood pressure (mm Hg)
• Skin Thickness: Triceps skin fold thickness (mm)
• Insulin: 2-hour serum insulin (mu U/ml)
• BMI: Body mass index (weight in kg/(height in m)^2)
• Diabetes Pedigree Function: Diabetes pedigree function
• Age: Age (years)
• Outcome: Class variable (0 or 1)
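
As a quick sanity check on the schema listed above, the header row of the file can be compared against the expected column names. This is a minimal sketch using only the standard library. The in-memory sample stands in for diabetes.csv, its two rows reuse the sample inputs given later in this report, and the helper name check_columns is hypothetical. It also assumes the CSV header uses the identifiers without spaces (BloodPressure, SkinThickness, DiabetesPedigreeFunction) that appear in the app code.

```python
import csv
import io

EXPECTED_COLUMNS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome",
]

def check_columns(csv_file):
    """Return the expected columns that are missing from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    return [name for name in EXPECTED_COLUMNS if name not in header]

# Stand-in for open('diabetes.csv'); rows reuse the report's sample inputs.
sample = io.StringIO(
    "Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,"
    "DiabetesPedigreeFunction,Age,Outcome\n"
    "1,85,66,29,0,26.6,0.351,31,0\n"
    "6,148,72,35,0,33.6,0.627,50,1\n"
)
missing = check_columns(sample)
print("Missing columns:", missing)
```

An empty list means every expected column is present; any name in the list points to a header that needs renaming before the later steps will run.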

5.2 Exploring the Data

Now we have to set up the development environment to build our project. For this project,
we are going to build the diabetes prediction model in Google Colab. You can also use
Jupyter Notebook.

8
Diabetes Prediction model using streamlit

After downloading the dataset, import the necessary libraries to build the model.

# Import the required libraries


import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score
import pickle

Load the data using the read_csv function in the pandas library. Then the head() method of
the DataFrame is used to show the first rows, up to the limit we specify. The default number
of rows is five.
# Load the diabetes dataset to a pandas DataFrame
diabetes_dataset = pd.read_csv('diabetes.csv')

# Print the first 5 rows of the dataset


diabetes_dataset.head()
Output:

# To get the number of rows and columns in the dataset


diabetes_dataset.shape
#prints (768, 9)

# To get the statistical measures of the data


diabetes_dataset.describe()
Output:


And, it is clear that the Outcome column is the output variable. So let us explore more details
about that column.

# To get details of the outcome column


diabetes_dataset['Outcome'].value_counts()
Output:

In the output, the value 1 means the person has diabetes, and 0 means the person does not
have diabetes. We can see the total count of people with and without diabetes.
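
Class counts matter because they set a baseline: a model that always predicts the majority class already reaches that class's share as its accuracy. The sketch below shows absolute counts next to proportions. The six labels are a hypothetical toy series, not the real dataset.

```python
import pandas as pd

# Hypothetical toy labels, not the real dataset: 0 = not diabetic, 1 = diabetic.
outcome = pd.Series([0, 1, 0, 0, 1, 0], name="Outcome")

counts = outcome.value_counts()                 # absolute counts per class
shares = outcome.value_counts(normalize=True)   # fraction of rows per class

print(counts.to_dict())
print(shares.round(3).to_dict())
```

On the real data, the same two calls on diabetes_dataset['Outcome'] give the counts shown above and the corresponding proportions.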

5.3 Splitting the data

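The next step in building the machine learning model is splitting the data into training
and testing sets. A 3:1 ratio is a common choice; the code below uses test_size = 0.2, which
holds out 20% of the rows for testing (a 4:1 ratio). The stratify=Y argument keeps the
proportion of diabetic and non-diabetic cases the same in both sets.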

# separating the data and labels


X = diabetes_dataset.drop(columns='Outcome')
Y = diabetes_dataset['Outcome']

# To print the independent variables
print(X)
Output:

# To print the outcome variable
print(Y)
Output:
Output:

#Split the data into train and test


X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size = 0.2, stratify=Y,
random_state=2)
print(X.shape, X_train.shape, X_test.shape)
Output:
(768, 8) (614, 8) (154, 8)
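
The shapes printed above follow directly from the split fraction. The arithmetic can be sketched as follows, assuming that a fractional test size is rounded up to a whole number of rows, which is consistent with the output shown.

```python
import math

n_samples = 768      # rows in the dataset, from diabetes_dataset.shape
test_size = 0.2      # fraction passed to train_test_split

# Assumption: a fractional test size is rounded up to a whole number of rows.
n_test = math.ceil(n_samples * test_size)
n_train = n_samples - n_test

print(n_train, n_test)   # 614 154
```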

5.4 Training the model

The next step is to build and train our model. We are going to use a Support vector classifier
algorithm to build our model.

# Build the model


classifier = svm.SVC(kernel='linear')

# Train the support vector Machine Classifier


classifier.fit(X_train, Y_train)

After training, the model is used to predict outcomes for both the training data and the test
data. Comparing these predictions with the true labels gives the accuracy score of the model
on each set.

# Accuracy score on the training data
X_train_prediction = classifier.predict(X_train)
# accuracy_score expects the true labels first, then the predictions
training_data_accuracy = accuracy_score(Y_train, X_train_prediction)
print('Accuracy score of the training data : ', training_data_accuracy)

# Accuracy score on the test data
X_test_prediction = classifier.predict(X_test)
test_data_accuracy = accuracy_score(Y_test, X_test_prediction)
print('Accuracy score of the test data : ', test_data_accuracy)
Output:
Accuracy score of the training data: 0.7833876221498371
Accuracy score of the test data: 0.7727272727272727
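
Accuracy is simply the fraction of predictions that match the true labels. The plain-Python sketch below illustrates the calculation; the helper name accuracy and the five toy labels are hypothetical. The last line is arithmetic on the reported figures: a test accuracy of 0.7727 on 154 test rows corresponds to 119 correct predictions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical toy labels, not taken from the dataset.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
print(accuracy(y_true, y_pred))   # 0.8

# The reported test accuracy corresponds to 119 correct out of 154 test rows.
print(119 / 154)
```

The training and test scores are close to each other, which suggests the linear SVM is not overfitting badly on this split.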

5.5 Evaluating the model

To check the model on a single case, we pass one set of measurements and read the prediction.

input_data = (5,166,72,19,175,25.8,0.587,51)

# Change the input_data to numpy array
input_data_as_numpy_array = np.asarray(input_data)

# Reshape the array for one instance
input_data_reshaped = input_data_as_numpy_array.reshape(1,-1)

prediction = classifier.predict(input_data_reshaped)
print(prediction)

if prediction[0] == 0:
    print('The person is not diabetic')
else:
    print('The person is diabetic')
Output:
The person is diabetic
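
The reshape step is needed because the classifier expects a two-dimensional array with one row per sample, while a single tuple of measurements converts to a flat vector. A short sketch of the shapes involved, using the same sample input:

```python
import numpy as np

# The sample input used above: eight feature values for one person.
input_data = (5, 166, 72, 19, 175, 25.8, 0.587, 51)

features = np.asarray(input_data, dtype=float)
print(features.shape)    # (8,)  -> a flat vector

single_row = features.reshape(1, -1)   # -1 lets NumPy infer the column count
print(single_row.shape)  # (1, 8) -> one sample with eight features
```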
Saving the file
# Save the trained model
filename = 'trained_model.sav'
with open(filename, 'wb') as f:
    pickle.dump(classifier, f)

# Load the saved model
with open('trained_model.sav', 'rb') as f:
    loaded_model = pickle.load(f)

Once you run this code a new file named trained_model.sav will be saved in the project folder.
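
The save-and-load round trip can be tried in isolation before involving the real model. In this minimal sketch a small dictionary stands in for the trained classifier, and the file is written to a temporary directory rather than the project folder; both are assumptions made so the example runs on its own.

```python
import os
import pickle
import tempfile

# Stand-in object; in the report this would be the trained classifier.
model_stand_in = {"kernel": "linear", "n_features": 8}

path = os.path.join(tempfile.mkdtemp(), "trained_model.sav")

# 'with' closes the file even if pickling raises an exception.
with open(path, "wb") as f:
    pickle.dump(model_stand_in, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == model_stand_in)   # True
```

Pickle files can execute code when loaded, so only load model files that come from a source you trust.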

5.6 Deploying the model

One of the most important and final steps in building a Machine Learning project is Model
deployment. There are many frameworks available for deploying the Machine learning model
on the web. Some of the most used Python frameworks are Django and Flask. But these
frameworks require a little knowledge of languages such as HTML, CSS, and JavaScript.

Streamlit was introduced to deploy machine learning models without requiring knowledge of
front-end languages. It is quite easy to deploy using Streamlit, so we will use the Streamlit
framework to deploy our model. Although Streamlit has many advantages over the other
frameworks, many more features are still under development. If you are getting started in
machine learning, this framework is a good way to deploy your first model on the web.

Python Code to Deploy ML model using Streamlit

To install Streamlit run the following command in the command prompt or terminal.

pip install streamlit


5.7 Source code with Explanation

Open a new Python file and put the following code.

App.py

import numpy as np
import pickle
import streamlit as st

# Load the saved model
with open('C:/Users/ELCOT/Downloads/trained_model.sav', 'rb') as f:
    loaded_model = pickle.load(f)

# Create a function for Prediction
def diabetes_prediction(input_data):
    # Text inputs arrive as strings, so convert them to a float array
    input_data_as_numpy_array = np.asarray(input_data, dtype=float)
    # Reshape the array as we are predicting for one instance
    input_data_reshaped = input_data_as_numpy_array.reshape(1, -1)

    prediction = loaded_model.predict(input_data_reshaped)
    print(prediction)
    if prediction[0] == 0:
        return 'The person is not diabetic'
    else:
        return 'The person is diabetic'

def main():

    # Give a title
    st.title('Diabetes Prediction Web App')

    # To get the input data from the user
    Pregnancies = st.text_input('Number of Pregnancies')
    Glucose = st.text_input('Glucose Level')
    BloodPressure = st.text_input('Blood Pressure value')
    SkinThickness = st.text_input('Skin Thickness value')
    Insulin = st.text_input('Insulin Level')
    BMI = st.text_input('BMI value')
    DiabetesPedigreeFunction = st.text_input('Diabetes Pedigree Function value')
    Age = st.text_input('Age of the Person')

    # Code for Prediction
    diagnosis = ''

    # Create a button for Prediction
    if st.button('Diabetes Test Result'):
        diagnosis = diabetes_prediction([Pregnancies, Glucose, BloodPressure, SkinThickness,
                                         Insulin, BMI, DiabetesPedigreeFunction, Age])

    st.success(diagnosis)

if __name__ == '__main__':
    main()

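
Because st.text_input returns strings, and an untouched box returns an empty string, it can help to validate the values before they reach the model. The following is one possible sketch, not part of the original app: parse_inputs and FIELD_NAMES are hypothetical names, and the helper uses only the standard library so it can be tested without Streamlit.

```python
FIELD_NAMES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]

def parse_inputs(raw_values):
    """Convert raw text-box strings to floats, or raise ValueError naming the bad field."""
    if len(raw_values) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} values, got {len(raw_values)}")
    parsed = []
    for name, raw in zip(FIELD_NAMES, raw_values):
        try:
            parsed.append(float(raw.strip()))
        except ValueError:
            raise ValueError(f"{name}: {raw!r} is not a number") from None
    return parsed

# The diabetic sample input from this report, as the strings a user would type.
print(parse_inputs(["6", "148", "72", "35", "0", "33.6", "0.627", "50"]))
```

In the app, the call to diabetes_prediction could be wrapped in a try/except that catches ValueError and shows the message with st.error instead of crashing.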


Save the file after typing the code. Then, to deploy using Streamlit, go to the command
prompt and run the following command.

streamlit run App.py
(or)
streamlit run filename.py

After running the command, the web app will open in your browser, served from a local web
server. If it does not open automatically, go to your browser and type localhost:8501. The
following output will be shown.
Output:

Sample input data for a person who does not have diabetes is {1, 85, 66, 29, 0, 26.6, 0.351, 31}.
This input will generate the following output in the web app.

Sample input data for a person who has diabetes is {6, 148, 72, 35, 0, 33.6, 0.627, 50}. This
input will generate the following output in the web app.

Chapter 6

Results


• Early intervention: With the help of a diabetes prediction model, healthcare providers
can intervene early and provide targeted interventions to help individuals manage
their health and prevent complications.

• Resource optimization: By accurately predicting the likelihood of diabetes, healthcare
resources can be allocated more efficiently, ensuring that those who need it most
receive the necessary care and attention.

• Personalized predictions: Machine learning algorithms can analyze large amounts of
data and provide personalized risk assessments based on individual characteristics, such
as age, lifestyle and medical history.

• Improved accuracy: By leveraging complex patterns in data, machine learning models
can potentially achieve higher accuracy in predicting diabetes compared to traditional
methods.

• Cost-effective: Implementing a machine learning model for diabetes prediction can
potentially reduce healthcare costs by focusing resources on high-risk individuals and
preventive measures.

• Research opportunities: Developing a diabetes prediction model using machine
learning opens up avenues for further research and understanding of the disease,
leading to advancements in treatment and prevention strategies.

Chapter 7

Advantages and Disadvantages


7.1 Advantages:

• Early detection: Machine learning models can help identify individuals at risk of
developing diabetes at an early stage, allowing for timely intervention and prevention.

• Personalized predictions: Machine learning algorithms can analyze large amounts of
data and provide personalized risk assessments based on individual characteristics, such
as age, lifestyle and medical history.

• Improved accuracy: By leveraging complex patterns in data, machine learning models
can potentially achieve higher accuracy in predicting diabetes compared to traditional
methods.

• Cost-effective: Implementing a machine learning model for diabetes prediction can
potentially reduce healthcare costs by focusing resources on high-risk individuals and
preventive measures.

• Resource optimization: By accurately predicting the likelihood of diabetes, healthcare
resources can be allocated more efficiently, ensuring that those who need it most
receive the necessary care and attention.

• Research opportunities: Developing a diabetes prediction model using machine
learning opens up avenues for further research and understanding of the disease,
leading to advancements in treatment and prevention strategies.

7.2 Disadvantages:

• Data quality: The accuracy of machine learning models relies heavily on the quality
and representativeness of the data used for training. If the data is incomplete, biased
or of poor quality, it can affect the reliability of the predictions.

• Interpretability: Some machine learning models, such as deep learning algorithms, can
be complex and difficult to interpret. Understanding how the model arrives at its
predictions may pose challenges for healthcare professionals.

• Ethical considerations: There are ethical concerns related to the use of personal health
data for predictive modelling. Ensuring privacy, consent and fair use of data is crucial
in developing responsible and ethical machine learning models.

• Limited generalizability: Machine learning models trained on specific populations
may not generalize well to diverse populations or different regions due to variations in
lifestyle, genetics and healthcare systems.

• User acceptance: Some individuals may be hesitant to embrace predictive models due
to concerns about privacy, data security and the potential for discrimination based on
predicted health outcomes.

• Expertise and infrastructure requirements: Developing and implementing a diabetes
prediction model requires expertise in machine learning, access to quality data and
the necessary computational resources.

Chapter 8

CONCLUSION

Machine learning is a quickly growing field in computer science. It has applications in nearly
every other field of study and has already been implemented commercially, because machine
learning can solve problems that are too difficult or too time-consuming for humans to solve.
We have given a simple overview of some techniques and algorithms in machine learning.
There are many more techniques that apply machine learning as a solution. In the future,
machine learning will play an important role in our daily life.

Python has a simple syntax that reads much like English. Its syntax allows developers to
write programs in fewer lines than some other programming languages. Python runs on an
interpreter system, meaning that code can be executed as soon as it is written, so
prototyping can be very quick.

In this internship we mainly learnt the basics of machine learning using Python.

