The Prediction of Disease Using Machine Learning: December 2021
1 author:
C K Gomathy
Sri Chandrasekharendra Saraswathi Viswa Mahavidyalaya University
All content following this page was uploaded by C K Gomathy on 31 December 2021.
Our system predicts most chronic diseases. It accepts structured data as input to the machine learning
model. The system is intended for end users, i.e. patients or any other user. The user enters all the
symptoms he or she is suffering from, and these symptoms are then given to the machine learning model to
predict the disease. Several algorithms are applied, and the one that gives the best accuracy is used; the
system then predicts the disease on the basis of the symptoms. The Naïve Bayes algorithm is used to predict
the disease from the symptoms, the KNN algorithm is used for classification, logistic regression is used to
extract the features that have the most impact, and the decision tree is used to divide the big dataset into
smaller parts. The final output of the system is the disease predicted by the model.
IV. METHODOLOGY
To evaluate performance in the experiment, we first denote by TP, TN, FP and FN the true positives (the
number of results correctly predicted as required), true negatives (the number of results correctly predicted
as not required), false positives (the number of results incorrectly predicted as required) and false
negatives (the number of results incorrectly predicted as not required), respectively. From these counts we
can obtain four measurements: recall, precision, accuracy, and the F1 measure.
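The four measurements can be computed directly from the counts. The sketch below uses the standard textbook definitions (recall = TP / (TP + FN), precision = TP / (TP + FP), accuracy = (TP + TN) / total, F1 = harmonic mean of precision and recall). The function name and the example counts are assumptions made for illustration and are not taken from the experiment.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return recall, precision, accuracy and F1 from confusion-matrix counts."""
    # Recall: share of the actually required results that were predicted as required.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of the results predicted as required that really were required.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Accuracy: share of all results that were predicted correctly.
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"recall": recall, "precision": precision, "accuracy": accuracy, "f1": f1}


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

The guards against zero denominators are a design choice for the sketch; a real evaluation might prefer to report such cases as undefined.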
V. ALGORITHM TECHNIQUES
KNN
K Nearest Neighbour (KNN) is a very easy, simple to understand, and versatile algorithm, and one of the
most widely used machine learning algorithms. In the healthcare system, the user predicts the disease, i.e.
whether or not a disease will be detected. The proposed system classifies diseases into classes that indicate
which disease is likely on the basis of the symptoms. KNN can be used for both classification and regression
problems. The algorithm is based on a feature similarity approach and is a good choice for some
classification tasks. The K-nearest neighbor classifier predicts the target label of a new instance from the
classes of its nearest neighbors. The closest neighbors are identified using a distance measure such as
Euclidean distance. If K = 1, the case is simply assigned to the category of its nearest neighbor.
The value of ‘k’ has to be specified by the user, and the best choice depends on the data. A larger value
of ‘k’ reduces the effect of noise on the classification. When a new instance (in our case, a set of
symptoms) has to be classified, its distance to the stored instances is calculated and the class of the
nearest instances is selected. For categorical variables, the Hamming distance must be used. This also
raises the issue of standardizing the numerical variables to between zero and one when the dataset contains
a mixture of numerical and categorical variables.
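The procedure described above can be sketched in a few lines. This is a minimal illustration, assuming symptoms are encoded as 0/1 flags (so Hamming distance applies) and that the k nearest neighbors vote by simple majority. The symptom columns, disease labels, and function names are hypothetical and are not taken from the system's dataset.

```python
from collections import Counter


def hamming_distance(a, b):
    """Count the positions at which two equal-length symptom vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_rows, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest rows."""
    # Sort training rows by distance to the query (ties broken by label name).
    distances = sorted(
        (hamming_distance(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: columns are fever, cough, rash, headache (1 = present).
train_rows = [
    (1, 1, 0, 0),
    (1, 1, 0, 1),
    (0, 0, 1, 0),
    (0, 0, 1, 1),
    (1, 0, 0, 1),
]
train_labels = ["flu", "flu", "measles", "measles", "migraine"]

prediction_k1 = knn_predict(train_rows, train_labels, (1, 1, 0, 0), k=1)
prediction_k3 = knn_predict(train_rows, train_labels, (1, 1, 0, 0), k=3)
```

With K = 1 the query simply takes the label of its single closest row, as noted in the text; larger values of k let more neighbors vote, which dampens the effect of a single noisy record.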
NAIVE BAYES:
Naive Bayes is a simple but surprisingly powerful algorithm for predictive modeling. The independence
assumption, which allows the joint likelihood to be decomposed into a product of marginal likelihoods, is
what makes it 'naive'; the resulting simplified Bayesian classifier is called naive Bayes. The Naive Bayes
classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any
other feature. It is very easy to build and useful for large datasets. Naive Bayes is a supervised learning
model. Bayes' theorem provides a way of calculating the posterior probability P(b|a) from P(b), P(a) and
P(a|b), using the equation:
P(b|a) = P(a|b) P(b) / P(a)
Above,
P(b|a) is the posterior probability of the class (b, target) given the predictor (a, attributes).
P(b) is the prior probability of the class.
P(a|b) is the likelihood, i.e. the probability of the predictor given the class.
P(a) is the prior probability of the predictor.
In our system, Naïve Bayes decides which symptoms are put into the classifier and which are not.
LOGISTIC REGRESSION
Logistic regression is a supervised learning classification algorithm used to predict the probability of a
target variable, which here is the disease.
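As a rough illustration of how logistic regression turns symptoms into a probability, the sketch below applies the standard logistic (sigmoid) function to a weighted sum of 0/1 symptom flags. The weights, bias, and symptom columns are hypothetical values invented for the example; in practice they would be learned from the training data.

```python
import math


def predict_probability(weights, bias, features):
    """Return the probability of the positive class for one feature vector."""
    # Linear score: bias plus the weighted sum of the features.
    z = bias + sum(w * x for w, x in zip(weights, features))
    # The logistic (sigmoid) function maps the score into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights for three symptoms (fever, cough, rash); not learned values.
weights = [1.2, 0.8, -0.5]
bias = -1.0

p_with_symptoms = predict_probability(weights, bias, [1, 1, 0])
p_no_symptoms = predict_probability(weights, bias, [0, 0, 0])
```

When the features are on a comparable scale, as with 0/1 symptom flags, the magnitude of each learned weight gives a rough indication of how strongly that symptom influences the prediction.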
DECISION TREE
A decision tree is a structure that can be used to divide a large collection of records into successively
smaller sets of records by applying a sequence of simple decision rules. With each successive division, the
members of the resulting sets become more and more similar to each other. A decision tree model consists of
a set of rules for dividing a large heterogeneous population into smaller, more homogeneous (mutually
exclusive) groups with respect to a particular target. The target variable is usually categorical, and the
decision tree is used either to:
Calculate the probability that a given record belongs to each of the categories, or
Classify the record by assigning it to the most likely class (or category).
In this disease prediction system, the decision tree divides the symptoms by category and reduces the
complexity of the dataset.
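The idea that each division makes the resulting groups more homogeneous can be made concrete with an impurity measure. The sketch below uses Gini impurity, which is an assumption made for illustration since the text does not name a splitting criterion. The records, labels, and function names are likewise hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a group: 0.0 means every record has the same label."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_on_symptom(rows, labels, index):
    """Divide the labels into two groups by whether symptom `index` is present."""
    present = [lab for row, lab in zip(rows, labels) if row[index] == 1]
    absent = [lab for row, lab in zip(rows, labels) if row[index] == 0]
    return present, absent


def weighted_gini(present, absent):
    """Average impurity of the two groups, weighted by group size."""
    n = len(present) + len(absent)
    if n == 0:
        return 0.0
    return (len(present) * gini(present) + len(absent) * gini(absent)) / n


# Hypothetical records: columns are fever, rash (1 = present).
rows = [(1, 0), (1, 1), (0, 1), (0, 0)]
labels = ["flu", "flu", "measles", "measles"]

impurity_before = gini(labels)
impurity_split_fever = weighted_gini(*split_on_symptom(rows, labels, 0))
impurity_split_rash = weighted_gini(*split_on_symptom(rows, labels, 1))
```

In this toy data, splitting on the first symptom separates the two labels completely, while splitting on the second leaves both groups as mixed as before, so a tree builder using this criterion would choose the first symptom for the split.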
VII. CONCLUSION
The main aim of this disease prediction system is to predict the disease on the basis of the symptoms. The
system takes as input the symptoms from which the user suffers and outputs a predicted disease. An average
prediction accuracy of 100% is obtained. The Disease Predictor was successfully implemented using the Grails
framework. The system provides a user-friendly, easy-to-use environment.
As the system is a web application, the user can use it from anywhere and at any time. In conclusion, for
disease risk modeling, the accuracy of risk prediction depends on the diversity of features in the hospital
data.
This systematic review aims to determine the performance, limitations, and future use of software in health
care. The findings may help inform future developers of disease prediction software and promote
personalized patient care. The program predicts patients' diseases from the symptoms entered by the user.
In this system, the Decision Tree, Random Forest, and Naïve Bayes algorithms are used to predict diseases.
The system applies these machine learning algorithms to process the data held in the database. System
accuracy reaches 98.3%. Machine learning techniques are designed to successfully predict outbreaks.
Author’s Profile:
1. A. Rohith Naidu, Student, B.E. Computer Science and Engineering, Sri Chandrasekharendra
Saraswathi Viswa Mahavidyalaya (deemed to be university), Enathur, Kanchipuram, India. His areas of
interest are the Internet of Things and big data analytics.