VISVESVARAYA TECHNOLOGICAL UNIVERSITY
BELAGAVI-590014, KARNATAKA.

A PROJECT REPORT
On
"A PERSONALIZED MEDICAL ASSISTANT CHATBOT"

2019-2020

SAPTHAGIRI COLLEGE OF ENGINEERING
14/5, Chikkasandra, Hesaraghatta Main Road, Bengaluru – 560057.
Certificate
Certified that the Project Work entitled “A PERSONALIZED MEDICAL ASSISTANT
CHATBOT” carried out by SATVIK RANJAN (1SG15CS094), TATHAGAT ANKIT (1SG15CS114),
VIVEK KUMAR (1SG15CS125), bonafide students of Sapthagiri College of Engineering, in partial
fulfillment for the award of Bachelor of Engineering in Computer Science and Engineering of
Visvesvaraya Technological University, Belagavi during the academic year 2019-2020. It is certified that
all corrections/suggestions indicated for Internal Assessment have been incorporated in the report deposited
in the department library. The project report has been approved as it satisfies the academic requirements in
respect of Project Work (15CSP85) prescribed for the said degree.
EXTERNAL EXAMINATION:
1.___________________________ _________________________
2.___________________________ _________________________
ACKNOWLEDGEMENT
Any achievement does not depend solely on individual efforts but also on the guidance, encouragement and co-operation of intellectuals, elders and friends. A number of personalities, in their own capacities, have helped us in carrying out this project work. We would like to take this opportunity to thank them all.
Also, we would like to express our immense gratitude to Dr. K L SHIVABASAPPA, Principal, Sapthagiri College of Engineering, Bangalore, for the help and inspiration during the tenure of the course.
We also extend our sincere thanks to Dr. Kamalakshi Naganna, Professor and Head (IC),
Department of Computer Science and Engineering, Sapthagiri College of Engineering, for their constant
support.
We would like to express our heartfelt gratitude to Sneha G, Asst. Professor, Department of Computer Science and Engineering, Sapthagiri College of Engineering, for the timely advice on the project and regular assistance throughout the work.
We also extend our sincere thanks to all the faculty members and supporting staff of the Department of Computer Science and Engineering, Sapthagiri College of Engineering, for their constant support and encouragement.
Finally, we thank our parents and friends for their moral support.
1SG15CS094 SATVIK RANJAN
1SG15CS114 TATHAGAT ANKIT
1SG15CS125 VIVEK KUMAR
ABSTRACT
We are living in a world driven by technology, and chatbots will play a key role in the coming years. With the help of new-age emerging technologies like Machine Learning, a lot of common diseases can be predicted just by providing symptoms. But no human can possibly know about all such diseases. So, the problem is that there isn't any proper system where anyone can get the details of a disease just by entering the symptoms. The proposed idea is to create a system that can predict common diseases when symptoms are provided. The system can also suggest some treatments. Hence it can be beneficial for people, as early detection of a disease can help in better treatment and recovery. Moreover, this system will help people keep track of their health regularly and properly without going anywhere.
SAPTHAGIRI COLLEGE OF ENGINEERING
14/5, Chikkasandra, Hesaraghatta Main Road, Bengaluru-560057.
DECLARATION
1) Satvik Ranjan
2) Tathagat Ankit
3) Vivek Kumar
INDEX
Chapter 1: Introduction
1.1 Background
1.2 Overview of the Present Work
1.3 Problem Statement
1.4 Objectives
1.5 Organization of the Project Report
Chapter 2: Literature Review
Chapter 3: System Requirements
Chapter 4: System Design
4.1 Architecture
4.2 Major Algorithms
Chapter 5: Implementation
Chapter 6: Results and Discussion
6.1 Testing
6.2 Results
Chapter 7: Conclusion and Future Work
References
LIST OF FIGURES
Figure No.   Description                                      Page No.
6.5          Chatbot giving symptom suggestion Screenshot     52

LIST OF TABLES
Sl. No.      Description                                      Page No.
CHAPTER 1
INTRODUCTION
1.1 Background
Chatbots are programs that mimic human conversation using Machine Learning algorithms. A chatbot is designed to be the ultimate virtual assistant, helping one to complete tasks ranging from answering questions and getting driving directions to playing one's favourite tunes. Chatbots have become popular in business right now as they reduce customer service costs and handle multiple users at a time. However, to accomplish many more tasks, there is a need to make chatbots efficient in the medical field as well.
The term "Chatterbot" was originally coined by Michael Mauldin, creator of the first Verbot, Julia, in 1994 to describe these conversational programs. In 1950, Alan Turing asked the question "Can machines think?" Turing conceptualized the problem as an "imitation game", now called the Turing Test, in which an "interrogator" asks questions to human and machine subjects with the goal of identifying the human. If the human and machine are indistinguishable, we say the machine can think. In 1966, Joseph Weizenbaum at MIT created the first chatbot that, arguably, came close to imitating a human: ELIZA. Given an input sentence, ELIZA would identify keywords and pattern-match those keywords against a set of pre-programmed rules to generate appropriate responses.
Since ELIZA, there has been progress in the development of increasingly intelligent
chatbots. In 1972, Kenneth Colby at Stanford created PARRY, a bot that impersonated a
paranoid schizophrenic. In 1995, Richard Wallace created A.L.I.C.E., a significantly more
complex bot that generated responses by pattern matching inputs against <pattern> (input)
<template> (output) pairs stored in documents in a knowledge base. These documents were
written in Artificial Intelligence Markup Language (AIML), an extension of XML, which is
still in use today. ALICE is a three-time winner of the Loebner prize, a competition held each
year which attempts to run the Turing Test, and awards the most intelligent chatbot.
Modern chatbots include: Amazon’s Echo and Alexa, Apple’s Siri, and Microsoft’s
Cortana. The architectures and retrieval processes of these bots take advantage of advances in
machine learning to provide advanced “information retrieval” processes, in which responses
are generated based on analysis of the results of web searches. Others have adopted "generative" models to respond; they use statistical machine translation (SMT) techniques to "translate" input phrases into output responses. Seq2Seq, an SMT algorithm that uses recurrent neural networks (RNNs) to encode and decode inputs into responses, is a current best practice.
Chatbots have become more popular in business right now as they reduce customer service costs and handle multiple users at a time. They are being used in almost every domain, from virtual assistants like Siri on mobile, to customer support in the tech industry, to e-commerce websites. Chatbots are currently one of the trending technologies available, and certainly one of the most advanced and time-saving as well. However, to accomplish many more tasks, there is a need to make chatbots efficient in the medical field as well. To address this problem, this project provides a platform where humans can interact with the chatbot.
These chatbots are trained on datasets using Machine Learning algorithms. It is like having a child and teaching it about the world right from birth. Machine Learning algorithms are largely data-driven rather than purely logical; even without knowing the exact logic behind an algorithm, one can make use of it and benefit from it. In Machine Learning, the biggest challenge is the choice of algorithm to use for our dataset. There are a number of models available for machine learning. There is a saying in the Machine Learning community that all models are wrong but some are useful.
A Personalized Medical Assistant Chatbot not only predicts diseases, it also has a feature through which it can be used for general conversation, as a simple chatbot talking friend that gives responses as per the input queries given to the machine (chatbot). The chatbot is trained in such a way that it works with minimal effort. In the fast-running world, people don't even have enough time to attend to their medical needs in a proper manner. Some people also hesitate to see a doctor due to the increasing medical cost, even for the simplest of issues like cold, fever, etc. It is also not possible to take a doctor's appointment and get immediate assistance because the doctor-to-people ratio in a particular area is very small, and due to this, people experience delays at hospitals/clinics or prefer not to attend to their medical needs, which eventually leads to various health issues.
In addition, people are also afraid or feel shy to share their mental health issues with a doctor. While the medical costs and issues are increasing, we are in an era which has seen a heavy increase in the use of technology. Every other person is equipped with at least one smart device and has access to everything at just the click of a button. So, there is a serious need to bring medical issues into the technological environment, which will make everything accessible to a person in just a click.
With the growing busy schedule of individuals and busy work timings, people usually ignore their common health issues and neglect the need to consult a doctor, and more often people do not have an immediate option when they suffer from a particular disease. So, this system can be helpful to people.
Modern technology has increased the standard of living for humans. The implementation of this system facilitates the user with medical assistance. This system provides a well-defined, comprehensive interface to interact with the bot. The system reduces the hectic and time-consuming appointment queues and the long waits for simple medical issues, and thus optimizes time. This system provides a simple interface for the conversation between the user and the machine. The design and implementation of this system is to provide service to those people who have no time due to their day-to-day work and seek medical advice for some common symptoms. To achieve this objective manually is very difficult, as it is not possible to take a doctor's appointment and get immediate assistance because the doctor-to-people ratio in a particular area is very small, and due to this, people experience delays at hospitals/clinics or prefer not to attend to their medical needs, which eventually leads to various health issues.
The system can provide a solution to the healthcare sector in the form of a chatbot
that can improve the way patients interact with doctors or any healthcare organization.
Patients get a quicker solution to their health-related questions and can thus act promptly
during critical conditions. A chatbot that is created for healthcare and patient care can easily perform certain functions on a patient's behalf, thus making interaction smoother on both ends.
A Personalized Medical Assistant chatbot helps patients or users monitor their health symptoms and act accordingly, instead of rushing to a doctor for a common sign of illness. The users provide the system with the respective symptoms they are experiencing, such as a change in body temperature, sore throat, cold shivering, etc. In the personalized medical assistant chatbot, the user can give input in two ways: either in the form of text or in the form of speech. The system receives the input through an interactive interface where the user enters the symptoms they are facing; these symptoms are analyzed by the trained model, the brain of the machine, and the system responds through the same interface in the form of a textual representation.
1.3 Problem Statement
"To design and develop a voice enabled Chatbot for medical assistance that can predict the disease on the basis of the symptoms provided by the user, and that can also be used for general conversation."
1.4 Objectives
● To combine disease prediction system and general conversation chatbot into one
single unit.
● To make the system simple to use.
● To make the user interface simple and the user experience engaging.
● To manage the symptoms given by user and predict the disease a person is suffering
from.
● To help the user interact with the chatbot by text as well as speech/voice.
● To provide most appropriate first aid information for common medical conditions.
● To build a system which can be easily integrated and updated.
● To save the time and effort of visiting a doctor for common health illnesses.
1.5 Organization of the Project Report
● Chapter One- Introduction: This chapter tells about the background of the project, the problem statement, the objectives of the project, as well as the proposed system with a theoretical outline.
● Chapter Two- Literature Review: It gives a brief description of the existing techniques, the problems identified in the existing system, and the proposed system with all its features. It summarizes the prior works and the outcome of the review of the problems identified. This chapter also reveals the proposed work.
● Chapter Three- System Requirements: This chapter discusses the requirements of the system. It reveals the functional and non-functional requirements. It also discusses the software and the hardware used.
● Chapter Four- System Design/Methodology: This chapter discusses the overall
system methodology process, which includes the complete architecture and the
algorithm implemented. Besides that, this chapter covers basic terms and theories of
the system design consideration.
● Chapter Five- Implementation: Gives a brief description of how the project is implemented. The various modules used in the project along with their functionality are explained in brief.
● Chapter Six- Results and Discussion: This chapter gives a brief description of how testing is done on the modules, to see if they are successfully implemented and work as required. The results consist of various cases depicted as snapshots.
● Chapter Seven- Conclusion and Future Work: This chapter discusses the conclusion
of this project and recommendation on further works and upgrades of the system.
CHAPTER 2
LITERATURE REVIEW
"A Neural-network based Chat Bot" [1], ICCES (2017), Milla T Mutiwokuziva, Melody W Chanda, Prudence Kadebu, Addlight Mukwazvure, Tatenda T Gotora. The authors show how the data used for training the model bot affects the quality of its prediction output. Furthermore, they demonstrate the reasoning and generative capabilities of an RNN-based chatbot.
"Chatbot for University Related FAQs" [2], IEEE transaction (2017), Bhavika R. Ranoliya, Nidhi Raghuwanshi and Sanjay Singh. Chatbots are programs that mimic human conversation using Artificial Intelligence (AI). A chatbot is designed to be the ultimate virtual assistant and entertainment tool, helping one to complete tasks ranging from answering questions, getting driving directions and turning up the thermostat in a smart home, to playing one's favorite tunes. Chatbots have become more popular in business right now as they can reduce customer service costs and handle multiple users at a time. But to accomplish many more tasks, there is a need to make chatbots as efficient as possible. To address this problem, this paper provides the design of a chatbot which gives an efficient and accurate answer for any query based on a dataset of FAQs, using Artificial Intelligence Markup Language (AIML) and Latent Semantic Analysis (LSA). Template-based and general questions like welcome/greetings are answered using AIML, and other service-based questions use LSA to provide responses at any time, which serves user satisfaction. This chatbot can be used by any university to answer FAQs for curious students in an interactive fashion.
In 2014, the sequence-to-sequence model being used for translation opened the possibility of phrasing dialogue as a translation problem: translating from an utterance to its response. The systems built using this principle, while conversing fairly fluently, are not very convincing because of their lack of personality and inconsistent persona.
The model was trained end-to-end without any hand-crafted rules. The bots talk reasonably fluently, have distinct personalities, and seem to have learned certain aspects of their identity. The results of standard automated translation model evaluations yielded very low scores. However, the authors designed an evaluation metric with a human judgment element, for which the chatbots performed well. They are able to show that for a bot's response, a human is more than 50% likely to believe that the response actually came from the real character.
● The existing system was based on text as input; it explored the avenues of teaching computers to process natural language text by developing a chatbot, taking an experiential approach from a beginner level of understanding in trying to appreciate the processes, techniques, power and possibilities of natural language processing using recurrent neural networks (RNN).
● The previous system did not have features such as text-to-speech or speech-to-text conversion.
● The previous system uses templates to match queries for each response with respect to the input. It is not possible to write a template for every query.
● This is a major hindrance to the growth of the chatbot industry. The very fact that chatbots are scripted and handmade sometimes makes them feel like IVR with brains. Even though AI is the brain of chatbots, the range of queries they can solve is still limited.
The proposed system is to develop a chatbot which can carry out normal human conversation and also detect the disease of the user based on the symptoms provided, either in the form of text or speech. The generated response can also be in the form of written text or voice. The proposed system consists of three basic modules which are interlinked together with a user interface.
The architecture has a user interface through which the users communicate with the chatbot, which comes up with a reply to the query given by the user. The chatbot has to decide whether the given input is for normal human conversation or the user wants a prediction of disease, and it comes up with the response according to the query given by the user.
CHAPTER 3
SYSTEM REQUIREMENTS
3.4.1 Python
Python was conceived in the late 1980s as a successor to the ABC language. Python 2.0, released in 2000, introduced features like list comprehensions and a garbage collection system capable of collecting reference cycles. Python 3.0, released in 2008, was a major
revision of the language that is not completely backward-compatible, and much Python 2
code does not run unmodified on Python 3. Due to concern about the amount of code written
for Python 2, support for Python 2.7 (the last release in the 2.x series) was extended to 2020.
Language developer Guido van Rossum shouldered sole responsibility for the project until
July 2018 but now shares his leadership as a member of a five-person steering council.
Python interpreters are available for many operating systems. A global community of
programmers develops and maintains CPython, an open source reference implementation. A
non-profit organization, the Python Software Foundation, manages Python and CPython.
3.4.2 Machine Learning
Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to effectively perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to perform the task. Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is infeasible to develop an algorithm of specific instructions for performing the task. Machine learning is closely related to computational statistics, which focuses on making predictions using computers. The study of mathematical optimization delivers methods, theory and application domains to the field of machine learning. Data mining is a field of study within machine learning that focuses on exploratory data analysis through unsupervised learning. In its application across business problems, machine learning is also referred to as predictive analytics.
1. Supervised Learning: "The outcome or output for the given input is known beforehand", and the machine must be able to map or assign the given input to that output. For example, multiple labelled images of a cat, dog, orange, apple, etc. are fed into the machine for training, and the machine must identify them. Just as a human child, once shown a cat and told so, still identifies a completely different cat among others as a cat, the same method is employed here (a minimal sketch follows).
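As a minimal sketch of supervised learning on labelled data (the tiny symptom dataset and the classifier choice below are hypothetical, purely for illustration, and not the project's actual code):

from sklearn.tree import DecisionTreeClassifier

# Hypothetical labelled data: each row marks the presence (1) or absence (0)
# of [headache, high fever, vomiting, dark urine].
X = [[1, 1, 1, 0],   # labelled "Malaria"
     [0, 1, 1, 1],   # labelled "Jaundice"
     [1, 1, 0, 0]]   # labelled "Common Cold"
y = ["Malaria", "Jaundice", "Common Cold"]

# Training: the machine learns to map each known input to its known output.
model = DecisionTreeClassifier()
model.fit(X, y)

# Prediction: a new input is assigned to one of the learnt classes.
print(model.predict([[1, 1, 1, 0]]))   # -> ['Malaria']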
3.4.3 HTML
Hypertext Markup Language (HTML) is the standard markup language for creating web
pages and web applications. With Cascading Style Sheets (CSS) and JavaScript, it forms a
triad of cornerstone technologies for the World Wide Web.[4]
Web browsers receive HTML documents from a web server or from local storage and
render the documents into multimedia web pages. HTML describes the structure of a web
page semantically and originally included cues for the appearance of the document.
HTML elements are the building blocks of HTML pages. With HTML constructs,
images and other objects such as interactive forms may be embedded into the rendered page.
HTML provides a means to create structured documents by denoting structural semantics for
text such as headings, paragraphs, lists, links, quotes and other items. HTML elements are
delineated by tags, written using angle brackets. Tags such as <img /> and <input /> directly
introduce content into the page. Other tags such as <p> surround and provide information
about document text and may include other tags as sub-elements. Browsers do not display the
HTML tags, but use them to interpret the content of the page.
HTML can embed programs written in a scripting language such as JavaScript, which
affects the behavior and content of web pages. Inclusion of CSS defines the look and layout
of content. The World Wide Web Consortium (W3C), maintainer of both the HTML and the
CSS standards, has encouraged the use of CSS over explicit presentational HTML since
1997.
3.4.4 JavaScript
This section is dedicated to the JavaScript language itself, and not the parts that are specific to Web pages or other host environments. For information about APIs specific to Web pages, please see Web APIs and DOM.
The standard for JavaScript is ECMAScript. As of 2012, all modern browsers fully
support ECMAScript 5.1. Older browsers support at least ECMAScript 3. On June 17, 2015,
ECMA International published the sixth major version of ECMAScript, which is officially
called ECMAScript 2015, and was initially referred to as ECMAScript 6 or ES6. Since then,
ECMAScript standards are on yearly release cycles. This documentation refers to the latest
draft version, which is currently ECMAScript 2020.
Do not confuse JavaScript with the Java programming language. Both "Java" and
"JavaScript" are trademarks or registered trademarks of Oracle in the U.S. and other
countries. However, the two programming languages have very different syntax, semantics,
and uses.
3.4.5 Anaconda
The Anaconda distribution comes with more than 1,400 packages as well as the Conda package and virtual environment manager and a desktop graphical user interface called Anaconda Navigator, so it eliminates the need to learn to install each library independently.
The open source packages can be individually installed from the Anaconda repository
with the conda install command or using the pip install command that is installed with
Anaconda. Pip packages provide many of the features of conda packages and in most cases
they can work together.
Custom packages can be made using the conda build command, and can be shared
with others by uploading them to Anaconda Cloud, PyPI or other repositories.
The default installation of Anaconda2 includes Python 2.7 and Anaconda3 includes
Python 3.7.
3.4.6 Spyder
Initially created and developed by Pierre Raybaut in 2009, since 2012 Spyder has
been maintained and continuously improved by a team of scientific Python developers and
the community.
Spyder is extensible with first- and third-party plugins, includes support for
interactive tools for data inspection and embeds Python-specific code quality assurance and
introspection instruments, such as Pyflakes, Pylint and Rope. It is available cross-platform
through Anaconda, on Windows, on macOS through MacPorts, and on major Linux
distributions such as Arch Linux, Debian, Fedora, Gentoo Linux, openSUSE and Ubuntu.
Spyder uses Qt for its GUI, and is designed to use either of the PyQt or PySide
Python bindings. QtPy, a thin abstraction layer developed by the Spyder project and later
adopted by multiple other packages, provides the flexibility to use either backend.
CHAPTER 4
SYSTEM DESIGN
4.1 Architecture
The figure below shows a general block diagram describing the chatbot's interaction with user activities and operations, along with the several layers used in implementing and supporting the communication and prediction system.
Medical Assistant. Based on the response from the user, the chatbot decides whether it has to continue a general conversation or help the user with self-diagnosis.
C. Response from chatbot
The response is generated based on the decision making in the previous stage. If the user needs help related to health, the chatbot enters into a questionnaire. It starts asking about different symptoms and then finally predicts a disease. If the user does not need any help, then the chatbot continues with the normal conversation. All the responses from the chatbot are voice based.
● The input to the chatbot is given through a microphone; it is passed to the speech-to-text conversion API, the input is processed, and a representation of the input is provided to the system.
● The text is split for tagging with part-of-speech labels according to the words' positions and neighbours in the sentence.
● Individual tagged words are chunked to form phrases using different grammar rules (a short sketch of this step is shown after this list).
● The user chooses whether he wants a normal conversation or medical assistance; based on the input, the system makes the decision and comes up with the response.
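The report does not name the toolkit used for tagging and chunking; as a hedged sketch, NLTK (assumed here, a common choice) can perform both steps:

import nltk

# One-time downloads of the tokenizer and tagger models (assumes NLTK is available).
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

sentence = "I have a high fever and a sore throat"

# Split the text into words and tag each word with its part of speech.
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)          # e.g. [('I', 'PRP'), ('have', 'VBP'), ...]

# Chunk the tagged words into noun phrases using a simple grammar rule.
grammar = "NP: {<DT>?<JJ>*<NN.*>+}"    # optional determiner, adjectives, then nouns
chunker = nltk.RegexpParser(grammar)
tree = chunker.parse(tagged)
print(tree)                             # phrases such as (NP a/DT high/JJ fever/NN)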
4.1.1 Workflow
As we can see from Fig. 4.1.5, first, if the given input from the user is in the form of voice/speech, it gets converted into text; the conversion is done through an API. Further, the converted text is broken down into individual word fragments, and these word fragments are converted into numbers, as the machine only understands numeric input. After conversion, the training model is built; the model then acts as the input for the creation of the chatbot. The chatbot is created using machine learning concepts and is trained with the help of a dataset. Once the chatbot is built, the given input of the user is fed to the machine, and the machine takes the input and uses the model for making the response.
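As a minimal sketch of the text-to-numbers step described above (the vocabulary and sentences are hypothetical, purely for illustration), each word fragment is mapped to an integer index before it is fed to the model:

# Build a word-to-index vocabulary from a (hypothetical) training corpus.
corpus = ["how are you", "i am fine", "i have a headache"]
vocab = {}
for sentence in corpus:
    for word in sentence.split():
        vocab.setdefault(word, len(vocab) + 1)   # 0 is reserved for padding

def encode(sentence):
    """Convert a sentence into the numeric form the model understands."""
    return [vocab.get(word, 0) for word in sentence.split()]

print(encode("i have a headache"))   # e.g. [4, 7, 8, 9]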
High level design contains the functional and non-functional requirements of the software. High level design includes the design considerations.
There are several design consideration issues that need to be addressed or resolved before getting down to designing a complete solution for the system. The main assumptions and dependencies identified are as follows:
• Sequence-to-Sequence Model
• Naïve Bayes
4.2.1 Sequence-to-Sequence Model
The encoder is a recurrent network that reads one symbol of the input sequence at each timestep. Its objective is to convert a sequence of symbols into a fixed-size feature vector that encodes only the important information in the sequence while losing the unnecessary information. You can visualize the data flow in the encoder along the time axis as the flow of local information from one end of the sequence to the other.
Each hidden state influences the next hidden state and the final hidden state can be seen as
the summary of the sequence. This state is called the context or thought vector, as it
represents the intention of the sequence. From the context, the decoder generates another
sequence, one symbol (word) at a time. Here, at each time step, the decoder is influenced by
the context and the previously generated symbols.
4.2.2 Naïve Bayes
The Naive Bayes classifier is a straightforward and powerful algorithm for the classification task. Even if we are working on a dataset with millions of records with some attributes, it is suggested to try the Naive Bayes approach. The Naive Bayes classifier gives great results when we use it for textual data analysis, such as in Natural Language Processing.
According to Bayes' Theorem,
P(H | E) = ( P(E | H) * P(H) ) / P(E)
Where,
• P(H) is the probability of hypothesis H being true. This is known as the prior
probability.
• P(E) is the probability of the evidence (regardless of the hypothesis).
• P(E|H) is the probability of the evidence given that hypothesis is true.
• P(H|E) is the probability of the hypothesis given that the evidence is there.
Naive Bayes Classifier
Naive Bayes is a kind of classifier which uses the Bayes Theorem. It predicts membership
probabilities for each class such as the probability that given record or data point belongs to a
particular class. The class with the highest probability is considered as the most likely class.
This is also known as Maximum A Posteriori (MAP).
The MAP for a hypothesis is:
MAP(H) = max( P(H|E) )
=max( (P(E|H)*P(H))/P(E))
= max(P(E|H)*P(H))
P(E) is the evidence probability, and it is used to normalize the result. It remains the same, so removing it won't affect the comparison. The Naive Bayes classifier assumes that all the features are unrelated to each other: the presence or absence of a feature does not influence the presence or absence of any other feature. We can use the Wikipedia example to explain the logic. In real datasets, we test a hypothesis given multiple pieces of evidence (features), so the calculations become complicated. To simplify the work, the feature-independence approach is used to 'uncouple' the multiple pieces of evidence and treat each one as independent.
• Gaussian Naive Bayes: When attribute values are continuous, an assumption is made that the values associated with each class are distributed according to a Gaussian, i.e., Normal distribution. If, in our data, an attribute say "x" contains continuous data, we first segment the data by class and then compute the mean μ_y and variance σ²_y of each class.
• Multinomial Naive Bayes: Multinomial Naive Bayes is preferred on data that is multinomially distributed. It is one of the standard classic algorithms, used in text categorization (classification). Each event in text classification represents the occurrence of a word in a document.
• Bernoulli Naive Bayes: Bernoulli Naive Bayes is used on data that is distributed according to multivariate Bernoulli distributions; multiple features can be there, but each one is assumed to be a binary-valued (Bernoulli, Boolean) variable. So, it requires features to be binary valued.
4.2.2.2 Advantages and Disadvantages of the Naive Bayes classifier
Advantages
• Naive Bayes can be used for binary and multiclass classification. It provides different types of Naive Bayes algorithms like GaussianNB, MultinomialNB and BernoulliNB (a usage sketch follows this section).
• It is a great choice for text classification problems, and a popular choice for spam email classification.
Disadvantages
• It considers all the features to be unrelated, so it cannot learn the relationship between features. E.g., let's say Remo is going to a party. While selecting clothes for the party, Remo is looking at his cupboard. Remo likes to wear a white shirt. In jeans, he likes to wear brown jeans, but Remo doesn't like wearing a white shirt with brown jeans. Naive Bayes can learn the importance of individual features but can't determine the relationship among features.
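As a hedged usage sketch of these scikit-learn variants (the symptom records below are hypothetical, not the project's dataset):

import numpy as np
from sklearn.naive_bayes import BernoulliNB   # GaussianNB / MultinomialNB are used the same way

# Hypothetical records: 4 binary symptom features each, with a disease label.
X = np.array([[1, 1, 1, 0],
              [0, 1, 1, 1],
              [1, 1, 0, 0],
              [1, 1, 1, 0]])
y = np.array(["Malaria", "Jaundice", "Common Cold", "Malaria"])

# BernoulliNB suits binary (present/absent) features such as these;
# GaussianNB targets continuous attributes, MultinomialNB word/event counts.
clf = BernoulliNB()
clf.fit(X, y)
print(clf.predict([[0, 1, 0, 1]]))   # most likely class for a new record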
CHAPTER 5
IMPLEMENTATION
5.1.1 Padding
Before training, we work on the dataset to convert the variable length sequences into
fixed length sequences, by padding. We use a few special symbols to fill in the sequence.
1. EOS : End of sentence
2. PAD : Filler
3. GO : Start decoding
Assuming that we would like our sentences (queries and responses) to be of a fixed length of 10, the pair Q: "How are you?", A: "I am fine." will be converted to:
Q : [ PAD, PAD, PAD, PAD, PAD, PAD, "?", "you", "are", "How" ]
A : [ GO, "I", "am", "fine", ".", EOS, PAD, PAD, PAD, PAD ]
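A minimal sketch of this padding step (the helper below is hypothetical, not the project's actual code), reproducing the example above:

PAD, GO, EOS = "PAD", "GO", "EOS"

def pad_pair(query_tokens, answer_tokens, q_len=10, a_len=10):
    """Pad a (query, answer) token pair to fixed lengths.

    The query is reversed and left-padded; the answer is wrapped in
    GO ... EOS and right-padded, as in the example above.
    """
    q = [PAD] * (q_len - len(query_tokens)) + list(reversed(query_tokens))
    a = [GO] + list(answer_tokens) + [EOS]
    a = a + [PAD] * (a_len - len(a))
    return q, a

q, a = pad_pair(["How", "are", "you", "?"], ["I", "am", "fine", "."])
print(q)   # ['PAD', 'PAD', 'PAD', 'PAD', 'PAD', 'PAD', '?', 'you', 'are', 'How']
print(a)   # ['GO', 'I', 'am', 'fine', '.', 'EOS', 'PAD', 'PAD', 'PAD', 'PAD']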
5.1.2 Bucketing
Introduction of padding did solve the problem of variable length sequences, but
consider the case of large sentences. If the largest sentence in our dataset is of length 100, we
need to encode all our sentences to be of length 100, in order to not lose any words. Now,
what happens to “How are you?” ? There will be 97 PAD symbols in the encoded version of
the sentence. This will overshadow the actual information in the sentence.
Bucketing kind of solves this problem, by putting sentences into buckets of different
sizes. Consider this list of buckets: [ (5,10), (10,15), (20,25), (40,50) ]. If the length of a
query is 4 and the length of its response is 4 (as in our previous example), we put this
sentence in the bucket (5,10). The query will be padded to length 5 and the response will be
padded to length 10. While running the model (training or predicting), we use a different
model for each bucket, compatible with the lengths of query and response. All these models,
share the same parameters and hence function exactly the same way.
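A small sketch of bucket selection under the same assumptions (a hypothetical helper; pad_pair refers to the padding sketch above):

BUCKETS = [(5, 10), (10, 15), (20, 25), (40, 50)]

def pick_bucket(query_tokens, answer_tokens):
    """Return the smallest bucket that fits both the query and the answer."""
    for q_len, a_len in BUCKETS:
        # +2 leaves room for the GO and EOS symbols around the answer.
        if len(query_tokens) <= q_len and len(answer_tokens) + 2 <= a_len:
            return q_len, a_len
    return BUCKETS[-1]

q_len, a_len = pick_bucket(["How", "are", "you", "?"], ["I", "am", "fine", "."])
print(q_len, a_len)   # 5 10 -> the query is padded to 5, the response to 10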
The vector difference between Paris and France captures the concept of capital city.
Word embedding is typically done in the first layer of the network: the embedding layer, which maps a word (an index into the vocabulary) to a dense vector of a given size. In the seq2seq model, the weights of the embedding layer are jointly trained with the other parameters of the model.
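A hedged sketch of such an embedding layer (assuming TensorFlow/Keras; the vocabulary and embedding sizes are hypothetical):

import tensorflow as tf

vocab_size = 8000    # assumed vocabulary size
embed_size = 128     # assumed embedding dimension

# Maps each word index to a dense 128-dimensional vector; the weights are
# learnt jointly with the rest of the seq2seq model.
embedding = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_size)

word_ids = tf.constant([[12, 45, 7, 0, 0]])   # one padded sentence of word indices
vectors = embedding(word_ids)
print(vectors.shape)                           # (1, 5, 128)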
Once the data is gathered, we spend the time that is required and discover the data preparation procedures, the algorithm to use, and how to configure it. The final model is the pinnacle of this process; at the end, the model will start actually making predictions.
The problem with applied machine learning is that we are trying to model the unknown. On a given predictive modeling problem, the ideal model is one that performs the best when making predictions on new data. We don't have new data, so we have to pretend with statistical tricks. The train-test split and k-fold cross-validation are called resampling methods. Resampling methods are statistical procedures for sampling a dataset and estimating an unknown quantity. In the case of applied machine learning, we are interested in estimating the skill of a machine learning procedure on unseen data; more specifically, the skill of the predictions made by a machine learning procedure. Once we have the estimated skill, we are finished with the resampling method. If you are using a train-test split, that means you can discard the split datasets and the trained model. If you are using k-fold cross-validation, that means you can throw away all of the trained models.
5.3.2 Split Dataset:
The dataset is divided into three parts, namely Training dataset, Validation
dataset and Testing Dataset.
5.3.2.1 Training Dataset
Training Dataset: The sample of data used to fit the model.
The actual dataset that we use to train the model (weights and biases in the case of
Neural Network). The model sees and learns from this data.
5.3.2.2 Validation Dataset
Validation Dataset: The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters. The evaluation becomes more biased as skill on the validation dataset is incorporated into the model configuration.
The validation set is used to evaluate a given model, but this is for frequent evaluation.
We as machine learning engineers use this data to fine-tune the model hyperparameters.
Hence the model occasionally sees this data, but never does it “Learn” from this. The result of
validation set is taken into consideration to tune our hyperparameters. So the validation set in
a way affects a model, but indirectly.
5.3.2.3 Test Dataset
Test Dataset: The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset.
The test dataset provides the gold standard used to evaluate the model. It is only used once a model is completely trained (using the train and validation sets). The test set is generally what is used to evaluate competing models. Many times the validation set is used as the test set, but it is not good practice. The test set is generally well curated. It contains carefully sampled data that spans the various classes that the model would face when used in the real world.
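A minimal sketch of such a three-way split (assuming scikit-learn and a hypothetical 70/15/15 ratio on placeholder data):

import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 records with 4 features each, and 100 labels.
X = np.random.randint(0, 2, size=(100, 4))
y = np.random.randint(0, 3, size=100)

# First carve out 15% as the held-back test set.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.15, random_state=42)

# Then split the remainder into training and validation sets
# (0.176 of the remaining 85% is roughly 15% of the whole).
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.176, random_state=42)

print(len(X_train), len(X_val), len(X_test))   # roughly 70 / 15 / 15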
Model parameters are learned from the data during training (the weights and biases of a neural network, for example).
Hyperparameters are decided before fitting the model because they can't be learned from the data.
5.3.2.4 Cross-validation
1. Split the dataset into 10 equal folds.
2. Train the model on 9 of the folds.
3. Evaluate it on the remaining hold-out fold.
4. Perform steps (2) and (3) 10 times, each time holding out a different fold.
The average performance across the 10 hold-out folds is your final performance estimate, also called your cross-validated score. Because you created 10 mini train/test splits, this score is usually pretty reliable.
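A short sketch of 10-fold cross-validation with scikit-learn (reusing a hypothetical dataset, not the project's data):

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB

# Hypothetical data: 100 binary symptom vectors and 3 possible disease labels.
X = np.random.randint(0, 2, size=(100, 4))
y = np.random.randint(0, 3, size=100)

# cv=10 performs the split/train/evaluate cycle 10 times automatically.
scores = cross_val_score(BernoulliNB(), X, y, cv=10)
print(scores.mean())   # the cross-validated score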
The Web Speech API, introduced at the end of 2012, allows web developers to
provide speech input and text-to-speech output features in a web browser. Typically, these
features aren’t available when using standard speech recognition or screen reader software.
This API takes care of the privacy of the users. Before allowing the website to access the
voice via microphone, the user must explicitly grant permission.
The speech synthesis utterance used for text-to-speech output exposes the following properties:
• .pitch - Gets and sets the pitch at which the utterance will be spoken.
• .rate - Gets and sets the speed at which the utterance will be spoken.
• .text - Gets and sets the text that will be synthesized when the utterance is spoken.
• .voice - Gets and sets the voice that will be used to speak the utterance.
• .volume - Gets and sets the volume at which the utterance will be spoken.
Once the chatbot understands the user's message, the next step is to generate a response. One way is to generate a simple static response. Another way is to get a template based on the intent and put in some variables. The chatbot development company chooses the method for generating the response depending on the purpose for which the chatbot is employed. For example, a weather forecast chatbot that uses an API to get a weather forecast for the given location can either say "it will most probably rain today" or "it's a rainy day" or "probability of rain is 80%, so put your umbrellas to use today." The style of response varies from user to user. In that case, the bot can study and analyze previous chats and their associated metrics to tailor customized responses for the user.
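A tiny sketch of the template-based approach described above (the intents and templates are hypothetical, purely for illustration):

# Hypothetical intent -> template mapping for generating responses.
templates = {
    "greeting": "Hello! How can I help you today?",
    "weather":  "Probability of rain is {rain_chance}%, so put your umbrellas to use today.",
    "disease":  "Based on your symptoms, you may be suffering from {disease}.",
}

def respond(intent, **variables):
    """Pick the template for the detected intent and fill in its variables."""
    return templates[intent].format(**variables)

print(respond("weather", rain_chance=80))
print(respond("disease", disease="Common Cold"))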
Seq2seq was first introduced for machine translation by Google. Before that, translation worked in a very naïve way: each word that you typed was converted to its equivalent in the target language, with no regard to grammar and sentence structure. Seq2seq revolutionized the process of translation by making use of deep learning. It takes not only the current word/input into account while translating but also its neighborhood.
Seq2seq Working:
Encoder
An encoder reads in "source data", e.g. a sequence of words or an image, and
produces a feature representation in continuous space. For example, a Recurrent Neural
Network encoder may take as input a sequence of words and produce a fixed-length vector
that roughly corresponds to the meaning of the text. An encoder based on a Convolutional
Neural Network may take as input an image and generate a new volume that contains higher-
level features of the image. The idea is that the representation produced by the encoder can
be used by the Decoder to generate new data, e.g. a sentence in another language, or the
description of the image.
Decoder
A decoder is a generative model that is conditioned on the representation created by the encoder. For example, a Recurrent Neural Network decoder may learn to generate the translation for an encoded sentence in another language.
Model
A model defines how to put together an encoder and decoder, and how to calculate
and minimize the loss functions. It also handles the necessary preprocessing of data read
from an input pipeline. Under the hood, each model is implemented as a model_fn passed to
a tf.contrib.learn Estimator.
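A hedged sketch of how the encoder, decoder and model fit together (assuming TensorFlow/Keras rather than the tf.contrib.learn interface mentioned above; the vocabulary and layer sizes are hypothetical and this is not the project's actual code):

import tensorflow as tf

vocab_size = 8000    # assumed vocabulary size
embed_dim = 128      # assumed embedding dimension
latent_dim = 256     # assumed hidden-state size

# Encoder: reads the padded input sequence and keeps only its final states
# (the context or "thought vector").
encoder_inputs = tf.keras.Input(shape=(None,))
enc_emb = tf.keras.layers.Embedding(vocab_size, embed_dim)(encoder_inputs)
_, state_h, state_c = tf.keras.layers.LSTM(latent_dim, return_state=True)(enc_emb)

# Decoder: generates the response one token at a time, conditioned on the
# encoder states and the previously generated tokens.
decoder_inputs = tf.keras.Input(shape=(None,))
dec_emb = tf.keras.layers.Embedding(vocab_size, embed_dim)(decoder_inputs)
dec_out, _, _ = tf.keras.layers.LSTM(
    latent_dim, return_sequences=True, return_state=True
)(dec_emb, initial_state=[state_h, state_c])
outputs = tf.keras.layers.Dense(vocab_size, activation="softmax")(dec_out)

# Model: wires encoder and decoder together and defines the loss to minimize.
model = tf.keras.Model([encoder_inputs, decoder_inputs], outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()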
Apart from these three, many optimizations have led to other components of seq2seq:
• Attention: The input to the decoder is a single vector which has to store all the information about the context. This becomes a problem with large sequences. Hence the attention mechanism is applied, which allows the decoder to look at the input sequence selectively.
• Beam Search: The highest-probability word is selected as the output by the decoder. But this does not always yield the best results, because of the basic problem of greedy algorithms. Hence beam search is applied, which suggests possible translations at each step. This is done by building a tree of the top-k results.
The disease predictor module works on the Naive Bayes Classifier. Naïve Bayes is a
multi-class classifier that is based on the Bayes Theorem.
Naive Bayes is a kind of classifier which uses the Bayes Theorem. It predicts
membership probabilities for each class such as the probability that given record or data point
belongs to a particular class. The class with the highest probability is considered as the most
likely class. This is also known as Maximum A Posteriori (MAP).
Let us consider a small set of 3 classes (diseases) and 4 variables (symptoms) for the explanation. The classes are:
• Malaria
• Jaundice
• Common Cold
The variables are:
• Headache
• High Fever
• Vomiting
• Dark urine
The frequency table of our training data is shown below:
Disease        Headache   High Fever   Vomiting   Dark urine
Malaria        7/10       10/10        9/10       1/10
Jaundice       0/10       9/10         6/10       9/10
Common Cold    8/10       9/10         0/10       0/10
In our training data:
• Malaria has a 7/10 (70%) value for headache, i.e. out of 10 Malaria patients, 7 suffer from headache. Similarly, high fever has a 10/10 (100%) value, i.e. 100% of Malaria patients suffer from high fever. Vomiting has a 9/10 (90%) value, i.e. 90% of Malaria patients suffer from vomiting. Dark urine has a 1/10 (10%) value, i.e. 10% of Malaria patients suffer from dark urine.
• Jaundice has a 0/10 (0%) value for headache, i.e. out of 10 Jaundice patients, 0 suffer from headache. Similarly, high fever has a 9/10 (90%) value, i.e. 90% of Jaundice patients suffer from high fever. Vomiting has a 6/10 (60%) value, i.e. 60% of Jaundice patients suffer from vomiting. Dark urine has a 9/10 (90%) value, i.e. 90% of Jaundice patients suffer from dark urine.
• Common Cold has an 8/10 (80%) value for headache, i.e. out of 10 Common Cold patients, 8 suffer from headache. Similarly, high fever has a 9/10 (90%) value, i.e. 90% of Common Cold patients suffer from high fever. Vomiting has a 0/10 (0%) value, i.e. 0% of Common Cold patients suffer from vomiting. Dark urine has a 0/10 (0%) value, i.e. 0% of Common Cold patients suffer from dark urine.
Now, it's time to predict classes using the Naive Bayes model. We have taken 2 records that have values in their feature set, but whose target variable needs to be predicted.
We have to predict the disease using the feature values, i.e. whether the disease is Malaria, Jaundice or Common Cold.
The evidence here is Headache, Vomiting and Dark urine. The hypothesis can be a disease among Malaria, Jaundice and Common Cold.
Using Naive Bayes, we can predict that the class of this record is Malaria.
The Evidence here is High Fever, Dark urine. The Hypothesis can be a disease among
Malaria, Jaundice, Common Cold.
P(Jaundice | High Fever, Dark urine) = P(High Fever | Jaundice) * P(Dark urine | Jaundice) * P(Jaundice) / P(High Fever, Dark urine)
= 0.9 * 0.90 * 0.33 / P(High Fever, Dark urine)
= 0.26
P(Malaria | High Fever, Dark urine) = P(High Fever | Malaria) * P(Dark urine | Malaria) * P(Malaria) / P(High Fever, Dark urine)
= 1.0 * 0.10 * 0.33 / P(High Fever, Dark urine)
= 0.03
P(Common Cold | High Fever, Dark urine) = P(High Fever | Common Cold) * P(Dark urine | Common Cold) * P(Common Cold) / P(High Fever, Dark urine)
= 0.9 * 0.0 * 0.33 / P(High Fever, Dark urine)
= 0.0
The denominator of all the above calculations is the same, i.e. P(High Fever, Dark urine), so ignoring it does not change the comparison. The value of P(Jaundice | High Fever, Dark urine) is greater than P(Malaria | High Fever, Dark urine) and P(Common Cold | High Fever, Dark urine).
Using Naive Bayes, we can predict that the class of this record is Jaundice.
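A minimal sketch that reproduces the hand calculation above in Python (the likelihood table is taken from the frequency table in this section; the helper itself is hypothetical and not the project's actual code):

# Likelihoods P(symptom | disease) from the frequency table, and equal priors.
likelihood = {
    "Malaria":     {"Headache": 0.7, "High Fever": 1.0, "Vomiting": 0.9, "Dark urine": 0.1},
    "Jaundice":    {"Headache": 0.0, "High Fever": 0.9, "Vomiting": 0.6, "Dark urine": 0.9},
    "Common Cold": {"Headache": 0.8, "High Fever": 0.9, "Vomiting": 0.0, "Dark urine": 0.0},
}
prior = {"Malaria": 1 / 3, "Jaundice": 1 / 3, "Common Cold": 1 / 3}

def predict(evidence):
    """Return the MAP class: argmax over P(E1 | H) * P(E2 | H) * ... * P(H)."""
    scores = {}
    for disease in likelihood:
        score = prior[disease]
        for symptom in evidence:
            score *= likelihood[disease][symptom]
        scores[disease] = score
    return max(scores, key=scores.get), scores

print(predict(["High Fever", "Dark urine"]))   # -> Jaundice, as computed above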
CHAPTER 6
RESULTS AND DISCUSSION
6.1 Testing
Test Case                                                     Expected Output            Actual Output              Result
Low frequency male voice input through external mic           Text of the speech input   Text of the speech input   Pass
Low frequency female voice input through external mic         Text of the speech input   Text of the speech input   Pass
High frequency male voice input through inbuilt device mic    Text of the speech input   Text of the speech input   Pass
High frequency male voice input through inbuilt mic           Text of the speech input   Text of the speech input   Pass
Description: Users have the option of interacting with the system either by typing or through speech. There are different ways in which one can interact with the system.
6.2 Results
The figure below shows the Graphical User Interface of the system through which the users interact with the system. The GUI has two check buttons to turn ON/OFF Speech to Text and Text to Speech respectively. While giving the input through typing, the user can either directly press the Enter key or click the Say button.
The system has two interaction modes. Here the users interact with the system by
having general conversation.
In the figure shown below the user starts the conversation with a greeting message and the system replies to the user.
Here the users can interact with the system in its second mode, the disease predictor mode. As soon as the user asks for help with disease prediction, the system changes its mode from general conversation to disease predictor mode.
In this mode the system starts asking the user to input symptoms.
Here, once the user gives a symptom, a list of symptoms is shown to help the user in choosing the symptoms.
In this way the user need not worry about the complete spelling of the symptoms, and they can select the symptoms from that list.
The figure shown below shows a case where the user enters a symptom that is not present in the dataset; then "no match found" is shown.
This happens only when the system fails to convert the voice into text correctly.
Here in this module, the user starts giving the symptoms as input. The system asks for
four symptoms from the user. Once the fourth symptom is provided the system responds by
predicting one of the diseases based on the highest probability of the symptoms.
There are three examples shown in the figures below. All three figures correspond to three different sets of symptoms.
Here the user interacts with the Chatbot only with text input. The output is also
received in the text form.
Here the user interacts with the Chatbot through his voice. The output received is in
the text form.
Here the user interacts with the Chatbot through voice. The output is received in both
text and voice form.
CHAPTER 7
CONCLUSION AND FUTURE WORK
For future enhancements we can use an offline Application Programming Interface (API) for Speech-to-Text conversion and Text-to-Speech conversion. Also, the symptom-to-disease mapping can be done more accurately if a more reliable dataset is made available. Along with this, with the increasing use of smart wearables, more accurate body measurements like heart rate, Body Mass Index (BMI), etc. can be given as input to the system, and the disease prediction algorithm can be improved for more accurate prediction of diseases.
References
[1] Milla T Mutiwokuziva, Melody W Chanda, Prudence Kadebu, Addlight Mukwazvure, Tatenda T Gotora, "A Neural-network based Chat Bot", ICCES (2017).
[2] Bhavika R. Ranoliya, Nidhi Raghuwanshi and Sanjay Singh, "Chatbot for University Related FAQs", IEEE transaction (2017).
[3] Sameera A. Abdul-Kader, Dr. John Woods, "Survey on Chatbot Design Techniques in Speech Conversation Systems", International Journal of Advanced Computer Science and Applications (2015).
[5] Balbir Singh Bani, Ajay Pratap Singh, "College Enquiry Chatbot Using A.L.I.C.E", International Journal of New Technology and Research (2017).
[6] Huyen Nguyen, David Morales, Tessera Chin, "A Neural Chatbot with Personality", Stanford University Report.
[7] Md. Shahriare Satu, Tajim Md. Niamat Ullah Akhund, Mahammad Abu Yousuf, "Online Shopping Management System with Customer Multi-Language Supported Query Handling AIML Chatbot" (Feb. 2017).