https://doi.org/10.22214/ijraset.2023.50056
April 2023
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 11 Issue IV Apr 2023- Available at www.ijraset.com
Speech Based Parkinson's Disease Detection Using Machine Learning
J. Spandana1, T. Rakesh2, K. Pranay3, Ch. Vijaya Bhaskar4, Sunil Bhutada5
1, 2, 3, 4, 5 Sreenidhi Institute of Science and Technology, Yamnampet, Hyderabad, Telangana-501301
Abstract: The diagnosis of Parkinson's disease (PD) is usually made after careful observation and evaluation of clinical indicators,
such as the description of various motor symptoms. Traditional methods of diagnosis may be prone to error, since they depend on
the subjective judgment of movements that can be difficult for the human eye to categorise. Moreover, nonmotor symptoms of PD in
its early stages may be mild and may be attributable to a wide variety of other conditions. Because these symptoms are often
disregarded, an early diagnosis of PD is difficult. These challenges have prompted the use of machine learning approaches to
distinguish PD patients from healthy controls or from patients with comparable clinical presentations (e.g., movement disorders
or other Parkinsonian syndromes) as a means of improving diagnostic and evaluation processes for PD.
PD has been diagnosed using a wide variety of data types and machine learning techniques; one goal of this article is to present a
synopsis of these approaches. In this study, we investigate PD recognition from speech using a convolutional neural network (CNN),
an artificial neural network (ANN), and XGBoost (XGB). The CNN was fed stacked 2D input maps consisting of spectrograms and other
short-term characteristics. The effectiveness of PD detection on the individual segments of a voice recording was compared with
the results obtained by fusing all of the segments at the decision level.
Keywords: Parkinson's disease, machine learning, biomarkers, clinical, decision making, diagnosis, assessment.
I. INTRODUCTION
Millions of individuals across the world suffer from Parkinson's disease (PD), a neurological disorder characterized by a broad range
of symptoms including tremor, cognitive impairment, hallucinations, dementia, and sleep disturbances. In Bangladesh, 1600
individuals lose their lives annually to PD, and there is currently no cure. Loss of smell, problems with REM sleep, cramped
handwriting, an inability to walk, and other mobility issues are all symptoms of Parkinson's disease [1]. The symptoms often
manifest on one side of the body and progress to the other; however, this is not always the case. Early symptoms are often subtle
and easily missed, and they also vary from person to person. The loss of dopamine-producing neurons in the brain lies at the root of
Parkinson's disease: the resulting shortage of the neurotransmitter dopamine causes abnormal brain activity, which in turn produces
the symptoms of the disease. PD is thought to affect between 7 and 10 million people around the world. Only about 4% of patients
are diagnosed before the age of 50. While PD cannot be prevented or cured, its symptoms may be managed [2].
Speech production is a complex task, since it requires the coordinated and precise regulation of a wide variety of processes and
systems. The lungs supply the energy for speech: they force enough air through the glottis to make the vocal folds vibrate and
allow sound to be produced. This vibration turns the pressure wave expelled from the lungs into an excitation (source) signal.
The source signal then passes through the vocal tract, which acts as a filter and imposes its spectral envelope on the source
to produce the speech signal [3, 4].
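The source-filter view described above can be illustrated with a minimal sketch. It is not part of the original study: the impulse-train source, the two-pole resonators, and all numeric values (pitch, formant centres, bandwidths, sample rate) are assumptions chosen only for illustration.

```python
import math

def glottal_source(f0_hz, duration_s, sample_rate):
    """Impulse train at the pitch period: a crude stand-in for vocal fold pulses."""
    n_samples = int(round(duration_s * sample_rate))
    period = int(round(sample_rate / f0_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def resonator(signal, centre_hz, bandwidth_hz, sample_rate):
    """Two-pole filter modelling one vocal tract resonance (formant)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)   # pole radius, < 1 so stable
    theta = 2.0 * math.pi * centre_hz / sample_rate       # pole angle
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesise_vowel(f0_hz=120.0, formants=((700.0, 130.0), (1200.0, 70.0)),
                     duration_s=0.05, sample_rate=8000):
    """Pass the source through a cascade of resonators (hypothetical formant values)."""
    speech = glottal_source(f0_hz, duration_s, sample_rate)
    for centre_hz, bandwidth_hz in formants:
        speech = resonator(speech, centre_hz, bandwidth_hz, sample_rate)
    return speech
```

Changing `f0_hz` alters only the source, while changing `formants` alters only the filter, which is the separation the paragraph above relies on.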
A. Motivation
Parkinson's disease affects a large number of people and is hard to diagnose, so many patients are identified only once the
disease has reached an advanced stage. We chose this topic because of a friend's grandfather who was affected by the disease.
At first his family ignored the symptoms, attributing them to age, but the disease affected him very severely and he lost his
voice. Had it been detected early, there would have been some way to control it. Inspired by that incident, we opted for this
project to implement machine learning algorithms that detect the disease.
B. Objectives
1) Although the clinical picture of PD will almost certainly change throughout the course of extended dopaminergic medication,
close monitoring of the patient is essential to the success of correcting the primary clinical signs of PD.
© IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 |
2) The accelerometers included in most new cell phones should make it possible to track a patient's movements and provide a
quantitative measure of their daily activity (e.g., walking or sitting). Smartphones are of sufficient quality to support
enhanced medical diagnostics and status monitoring.
Identifying speech alterations in Parkinson's patients before the development of debilitating physical symptoms could have a
considerable influence on healthcare costs, patient longevity, and quality of life. Parkinson's disease (PD) is often diagnosed
by a combination of medical history, physical examination, and the detection of specific motor symptoms. Yet traditional
diagnostic approaches may be prone to subjectivity, since they depend on the evaluation of movements that are sometimes too
subtle for the human eye to characterize. Early nonmotor signs of Parkinson's disease, meanwhile, may be mild and can be caused
by a wide range of other conditions. Because of this, these symptoms are often ignored and early PD diagnosis is challenging [5].
Researchers have for some time considered ML techniques a possible game-changer in the diagnosis of this illness. Contactless
methods of gait analysis might be widely used at home [6]. Very little effort, however, has focused on incorporating ML
approaches into the process so that it is fully self-sufficient and usable even without an active network connection.
Early-stage patients may also have speech issues [7] such as dysphonia, echolalia, and hypophonia. The use of the human voice
by computers for information retrieval and analysis is a potential future development [8].
Section II reviews the related work that formed the basis of the study. Section III describes the framework and approach
employed to reach the desired goal. Section IV presents the implementation analysis, including the materials, techniques, and
observations. Section V concludes the work, and Section VI discusses potential enhancements.
II. RELATED WORK
The naive Bayes classifier is a well-established ML algorithm for efficient learning and classification. In accordance with Bayes'
theorem, it calculates the probability of an outcome given a set of observed conditions. Here, the data used to calculate the
probability of a health problem consist of many measures of variation in voice signals. Classification is carried out with
the simple Gaussian naive Bayes algorithm, which supplies the classifier module [9].
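As an illustration of how a Gaussian naive Bayes classifier applies Bayes' theorem, the following is a minimal sketch and not the implementation used in [9]. The two features in the usage note (a jitter-like and a shimmer-like voice measure) and their values are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels):
    """Estimate a class prior plus a per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(samples), means, variances)
    return model

def log_joint(model, features):
    """log P(class) + sum of log Gaussian likelihoods (the 'naive' independence step)."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(features, means, variances):
            score += -0.5 * math.log(2.0 * math.pi * var) - (v - m) ** 2 / (2.0 * var)
        scores[label] = score
    return scores

def predict(model, features):
    """Pick the class with the largest posterior (equivalently, largest log joint)."""
    scores = log_joint(model, features)
    return max(scores, key=scores.get)
```

With made-up training rows such as `[[0.3, 2.0], [0.4, 2.2], [0.35, 1.9]]` labelled "healthy" and `[[0.9, 5.0], [1.1, 5.5], [1.0, 4.8]]` labelled "PD", `predict` assigns a new row to whichever class makes its feature values most probable.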
Particle swarm optimization is used together with a recent classification method called RMDL to obtain the best possible outcomes
in both areas. The method is applicable to a wide variety of data types and file formats, and the reported findings demonstrated
an increase in the reliability and efficiency of the models. This method shows promise for aiding PD diagnosis since no filter
configuration is required to produce the structured co-incidence matrix [10].
Deep learning methods have also been used to diagnose PD. PD was detected using a variety of data mining methods, including the
Naive Bayes algorithm, support vector machines, multilayer perceptron neural networks, and decision trees. Multilayer perceptron
and logistic regression (MLPR) models were applied to speech input from acoustic devices in order to predict PD. Patients'
geographical origins and linguistic characteristics were studied for their ability to predict the development of PD [11].
Machine learning techniques have been used for the digital analysis of Parkinson's disease by classifying the many aspects of
deep brain surgery images [12]. Random Multimodal Deep Learning (RMDL), a novel ensemble deep learning technique for
classification, accepts data in many forms, including text, video, pictures, and symbolic representations. Across a wide variety
of data types and classification problems, the resulting solutions yield consistently better performance than baseline
techniques. The purpose of that research is to improve the reliability of machine learning approaches for the classification of
deep brain surgery images.
An effective diagnostic system based on the fuzzy k-nearest neighbor (FKNN) algorithm has also been proposed. The FKNN model is
built on an optimal feature set derived by principal component analysis and is then compared with SVM-based methods; the
FKNN-based systems were shown to be superior to the SVM-based ones [13]. In [14], large amounts of data were gathered from both
healthy people and people with Parkinson's disease in order to train the algorithms. XGBoost, Naive Bayes, and Decision Tree
were used for classification, with Decision Tree achieving an accuracy of 87%. Sixty percent of the data was used for training
and forty percent for testing.
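A 60/40 split of this kind can be sketched as follows. The shuffling and the fixed seed are assumptions made for reproducibility; the cited work does not state how its records were assigned to the two sets.

```python
import random

def split_60_40(records, seed=0):
    """Shuffle a copy of the records and return (training 60%, testing 40%)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    cut = int(round(0.6 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

Every record lands in exactly one of the two sets, so the test portion is never seen during training.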
A wide variety of deep learning and machine learning methods have been investigated for identifying people who have Parkinson's
disease (PD). The primary focus of that investigation is the feasibility of using speech signal analysis as diagnostic evidence
for PD. Because speech processing is a mature field that can be applied in a wide range of situations, it holds considerable
potential for the classification and diagnosis of PD. The purpose of that research is to examine the similarities and
differences between several popular classification schemes [15].
III. FRAMEWORK
Proposed PD framework block diagram
Gradient boosted trees are widely used, and XGBoost is a popular and effective open-source implementation of the algorithm.
Gradient boosting is a supervised learning method that improves the accuracy of its predictions by combining the predictions
of many simpler, weaker models.
For regression with gradient boosting, the weak learners are regression trees, each of which maps an input data point to a leaf
that stores a continuous score. XGBoost minimizes a regularized (L1 and L2) objective function that consists of a convex loss
function (based on the difference between the predicted and target outputs) plus a penalty term for model complexity (that is,
for the complexity of the regression tree functions). Training proceeds iteratively: new trees are added to predict the
residuals, or errors, of the earlier trees, and are then combined with the earlier trees to form the final prediction. Gradient
boosting is so named because it uses a gradient descent approach to reduce the loss as new models are added.
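The residual-fitting loop described above can be sketched in a few lines. This is a simplified illustration and not the XGBoost library itself: it assumes a single input feature, squared-error loss, depth-one trees (stumps), and an L2 penalty only; the names `fit_stump`, `fit_boosted`, and the default parameter values are hypothetical.

```python
def fit_stump(xs, residuals, l2=1.0):
    """Fit a one-split regression tree; each leaf value is sum(residuals) / (count + l2)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        w_left = sum(left) / (len(left) + l2)      # minimises squared loss + l2 * w^2
        w_right = sum(right) / (len(right) + l2)
        loss = (sum((r - w_left) ** 2 for r in left)
                + sum((r - w_right) ** 2 for r in right)
                + l2 * (w_left ** 2 + w_right ** 2))
        if best is None or loss < best[0]:
            best = (loss, threshold, w_left, w_right)
    if best is None:
        raise ValueError("need at least two distinct input values")
    return best[1:]

def predict_stump(stump, x):
    threshold, w_left, w_right = stump
    return w_left if x <= threshold else w_right

def fit_boosted(xs, ys, n_trees=50, learning_rate=0.3, l2=1.0):
    """Add stumps one at a time, each trained on the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    trees = []
    for _ in range(n_trees):
        # For squared loss the negative gradient is exactly the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals, l2)
        trees.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, trees

def predict_boosted(model, x):
    base, learning_rate, trees = model
    return base + learning_rate * sum(predict_stump(t, x) for t in trees)
```

The `l2` term shrinks every leaf value towards zero, which is the role the complexity penalty plays in the regularized objective; the learning rate further damps each new tree so that later trees can correct earlier ones.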
IV. IMPLEMENTATION ANALYSIS
The Parkinson's disease speech dataset from the UCI Machine Learning Repository serves as training data. In addition, our
suggested approach combines the spiral drawings of healthy and Parkinson's-afflicted patients as a second input to produce
reliable outcomes. We suggest a hybrid approach, both efficient and accurate, that evaluates the patient's speech together with
their spiral drawing data. By comparing the two sets of data, the doctor can determine whether the patient is healthy and, if
not, what medication to give depending on the severity of the condition.
Pre-processing: speech signals have to be broken down into their component parts. The silent portions have less energy than
the spoken portions because their amplitude is lower, so short-term energy can be used to separate speech from silence.
In this study, we isolate the steady part of each speech signal before carrying out the segmentation process. As these signals
tend to be most stable midway through their duration, cutting off the beginning and end portions leaves data that are consistent
throughout. To eliminate issues at the beginning and end of the phonations, 2 s segments were selected from the
intermediate, steady section of the speech signals for the following acoustic analysis.
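The two pre-processing steps above, energy-based separation of speech from silence and selection of a central 2 s segment, can be sketched as follows. The frame length, the relative threshold of 10% of the maximum frame energy, and the function names are assumptions for illustration; the study does not report the values it used.

```python
def frame_energies(signal, frame_len):
    """Mean squared amplitude of consecutive, non-overlapping frames."""
    return [sum(s * s for s in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def voiced_frames(signal, frame_len, rel_threshold=0.1):
    """Flag frames whose energy exceeds a fraction of the loudest frame's energy."""
    energies = frame_energies(signal, frame_len)
    cutoff = rel_threshold * max(energies)
    return [energy > cutoff for energy in energies]

def middle_segment(signal, sample_rate, seconds=2.0):
    """Return the central `seconds` of the signal, dropping the onset and offset."""
    wanted = int(round(seconds * sample_rate))
    if len(signal) <= wanted:
        return list(signal)
    start = (len(signal) - wanted) // 2
    return list(signal[start:start + wanted])
```

For a recording sampled at 44.1 kHz, for example, `middle_segment(signal, 44100)` would keep the 88,200 samples centred on the midpoint of the phonation.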
Distinct sound units from healthy speakers and Parkinson's patients were analyzed and contrasted. Waveform, spectrogram,
intensity, and formant frequency representations are used in the acoustic analysis. For both patients with Parkinson's disease
(PD) and healthy controls, a 20-second piece was extracted from a longer audio recording of the participant. Pitch
(shown in blue) and volume (shown in red) are two examples of acoustic parameters that may be extracted from audio recordings of
both Parkinson's disease patients and healthy speakers, as shown in the figure below (in yellow). As can be seen in Fig 1(a),
some of the peaks in the acoustic waveform of the sound unit "The North Wind and the Sun were arguing who was the stronger"
spoken by a PD patient are flattened compared with that of a healthy speaker (see Fig 1(b)). The effect resembles the muffled
output of a throat microphone. In PD patients there is a noticeable break after every few words over that 20-second period,
whereas in healthy individuals the sentences flow smoothly together, as seen in the spectrogram. The darkness of a point on a
spectrogram represents the magnitude of the corresponding frequency component at that time: very dark points have large
amplitudes, while very bright points have small amplitudes.
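The magnitudes that a spectrogram displays can be computed with a short-time Fourier transform. The following is a minimal sketch using a direct DFT and a Hann window; the frame length and hop size are left to the caller, and none of these choices are taken from the study, which does not describe its spectrogram settings.

```python
import cmath
import math

def stft_magnitudes(signal, frame_len, hop):
    """Return one magnitude spectrum (bins 0..frame_len/2) per overlapping frame."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]                     # Hann window
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        spectrogram.append(spectrum)
    return spectrogram
```

Each inner list corresponds to one column of the spectrogram image: a large value at bin `k` would be drawn as a dark point at frequency `k * sample_rate / frame_len`. A direct DFT is slow for long recordings; it is used here only to keep the sketch free of external libraries.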
Table 1: The data set of voice based Parkinson’s disease
Fig 1: (a) The waveform of a PD patient's sound unit
(b) The waveform of a healthy speaker's sound unit
The intensity of sound units from people with Parkinson's disease and from those in good health differs, as seen in Fig 2.
We observe that PD patients speak words with reduced intensity, i.e., below 58.1 dB, so most of the syllables were not heard
clearly, whereas healthy persons say them with considerably greater intensity. The formant frequencies of PD patients and
healthy persons are shown in Fig 3. Here, we can see that the clear formants of a healthy person's speech occupy a well-defined
frequency range, whereas the muffled formants of a PD patient's voice indicate a lower frequency range. These observations
motivate the use of characteristics such as intensity and formant features. The following section elaborates on the suggested
architecture for the system.
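An intensity value in decibels of the kind quoted above can be derived from the root-mean-square amplitude of a frame. The sketch below assumes the samples are calibrated sound pressures in pascals and uses the conventional 20 µPa reference; uncalibrated digital recordings would need a different reference, and the study does not state how its values were obtained.

```python
import math

def intensity_db(samples, reference=2e-5):
    """Intensity in dB relative to `reference` (20 micropascals by default)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-12) / reference)   # floor avoids log(0)

def low_intensity_frames(signal, frame_len, threshold_db=58.1, reference=2e-5):
    """Flag non-overlapping frames whose intensity falls below the threshold."""
    return [intensity_db(signal[i:i + frame_len], reference) < threshold_db
            for i in range(0, len(signal) - frame_len + 1, frame_len)]
```

The default threshold of 58.1 dB is the figure reported in the text; the proportion of flagged frames could then serve as a simple intensity feature.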
Fig 2: (a) Intensity plot of PD audio sample
(b) Intensity plot of HC audio sample
Fig 3: (a) Formant speckles plot of PD audio
(b) Formant speckles plot of HC audio
V. CONCLUSION
In this work, we stress the significance of early diagnosis and prognosis of Parkinson's disease so that sufferers may get therapy
and care as soon as feasible. This study investigates PD recognition from a speech signal using CNN, ANN, and XGB. The CNN was
fed stacked 2D input maps consisting of spectrograms and other short-term characteristics. Decision-level fusion of all segments
in a voice recording was compared with the influence of each individual segment on PD detection efficiency. Although deep
learning does beat conventional machine learning models, establishing it as the better method remains challenging, because our
audio dataset was too small to train the deep learning approaches properly. ANN-based recognition outperforms HMM-based
recognition when compared to CNN. The results of this research might therefore be seen as a first attempt to use cutting-edge
techniques for diagnosing the disease at an early stage. In addition to the forthcoming work on the speech dataset, additional
symptoms of Parkinson's patients may be gathered to enable early detection of the disease; for PD diagnosis, it would be
possible to gather and evaluate a dataset of motor and nonmotor symptoms. Compared with the other algorithms considered for
making diagnoses, the XGBoost classifier achieves the best results in this scenario. As our findings show, the Extreme Gradient
Boosting algorithm is the best option for the prediction of Parkinson's disease when the model is trained on the data with this
boosting method, with effective accuracy.
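Decision-level fusion, as referred to in the conclusion above, combines the per-segment classifier outputs of one recording into a single decision. The study does not specify which fusion rule it used, so the two common rules below (majority vote and mean probability) are assumptions, with label 1 standing for PD and 0 for healthy.

```python
def fuse_by_majority(segment_labels):
    """Return 1 (PD) if strictly more than half of the segment labels are 1, else 0."""
    return 1 if sum(segment_labels) * 2 > len(segment_labels) else 0

def fuse_by_mean_probability(segment_probs, threshold=0.5):
    """Average the per-segment PD probabilities and compare with a threshold."""
    mean_prob = sum(segment_probs) / len(segment_probs)
    return (1 if mean_prob >= threshold else 0), mean_prob
```

A recording whose segments are classified as `[1, 0, 1]` is therefore labelled PD by the majority rule even though one segment disagrees, which is how fusion can smooth over individual segment errors.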
VI. FUTURE ENHANCEMENT
Future research might explore other methods for making Parkinson's disease forecasts from a variety of sources. In this study, we
categorise patients into two groups based on a single binary attribute: those with illness and those without. Patients with Parkinson's
disease will be categorized into their stages using a variety of features in the future.
REFERENCES
[1] K. Mamun, "Deep brain stimulation: a light of hope for Parkinson’s patients", theindependentbd.com, 2018.
[2] Anitha. R et al., “Early Detection of Parkinson’s Disease Using Machine Learning”, IJARIIE-ISSN(O)-2395-4396, Vol-6 Issue-2 2020.
[3] Hariharan M, et al., 2006 Speech emotion recognition using stationary wavelet transform and timbral texture features ARPN Journal of Engineering and Applied Sciences 9 1316-22.
[4] Khan T, et al., 2014 Classification of speech intelligibility in Parkinson's disease Biocybernetics and Biomedical Engineering 34 35-45.
[5] Shikha Singh, Nikita Shingade, Priti Sarote, Deepti Yelale and Nihar Ranjan. “Parkinson’s disease detection using machine learning”, International Journal of Development Research, 12, (04), 55117-55119.
[6] Lakany, H. Extracting a diagnostic gait signature. Pattern Recognition. 2008, 41, 1627–1637.
[7] Hazan, H.; et al., Early diagnosis of Parkinson’s disease via machine learning on speech data. In Proceedings of the 2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel, Eilat, Israel, 14–17 November 2012; pp. 1–4.
[8] Frid, A.; et al., Computational diagnosis of Parkinson’s Disease directly from natural speech using machine learning techniques. In Proceedings of the 2014 IEEE International Conference on Software Science, Technology and Engineering, Washington, DC, USA, 11–12 June 2014; pp. 50–53.
[9] Bhatia, A.; Sulekh, R. Predictive Model for Parkinson’s disease through Naïve Bayes Classification. Int. J. Comput. Sci. Commun. 2017, 9, 194–202.
[10] V. Kakulapati et al., "RMDL: Classification of Parkinson's disease by nature-inspired algorithm", International Journal of Pharmaceutical Research, Volume 12, issue 3, July - Sept, 2020, 10.31838/ijpr/2020.12.03.001.
[11] V. Kakulapati et al., "RMDL: Classification of Parkinson's disease by nature-inspired algorithm", International Journal of Pharmaceutical Research, Volume 12, issue 3, July - Sept, 2020, 10.31838/ijpr/2020.12.03.001.
[12] Kakulapati, V., et al., (2020). Metaheuristic Approach of RMDL Classification of Parkinson’s Disease. In: Oliva, D., Hinojosa, S. (eds) Applications of Hybrid Metaheuristic Algorithms for Image Processing. Studies in Computational Intelligence, vol 890. Springer, Cham. https://doi.org/10.1007/978-3-030-40977-7_17.
[13] Chen, H. L., Huang, C. C., Yu, X. G., Xu, X., Sun, X., Wang, G., and Wang, S. J. (2013). An efficient diagnosis system for detection of Parkinson’s disease using fuzzy k-nearest neighbor approach. Expert Systems with Applications, 40(1), 263-271.
[14] C K Gomathy, B. Dheeraj Kumar Reddy, Ms. B. Varsha and B. Varshini, “The Parkinson’s disease detection using machine learning techniques”, International Research Journal of Engineering and Technology (IRJET), 8(10), pp 440-444, 2021.
[15] Senjuti Rahman et al., “Classification of Parkinson’s Disease using Speech Signal with Machine Learning and Deep Learning Approaches”, EJECE, European Journal of Electrical Engineering and Computer Science, ISSN: 2736-5751, Vol 7, Issue 2, March 2023, DOI: http://dx.doi.org/10.24018/ejece.2023.7.2.488