Speech Mentor For Visually Impaired
Mini project report submitted in partial fulfillment of the requirements for the award of the Degree of
BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE AND ENGINEERING
By
V.BHAVANA (178W1A05H7)
K.SHUSHMA SRI (178W1A05G4)
P.G.L. SAHITHI(178W1A05F2)
K.JHANSI (188W5A0528)
CERTIFICATE
This is to certify that the MINI PROJECT Report entitled “SPEECH MENTOR FOR VISUALLY
IMPAIRED PEOPLE” being submitted by V.BHAVANA (178W1A05H7), K.SHUSHMA
SRI (178W1A05G4), P.G.L. SAHITHI (178W1A05F2) and K.JHANSI (188W5A0528) in partial fulfillment for the
award of the Degree of Bachelor of Technology in Computer Science and Engineering to the Jawaharlal Nehru
Technological University, Kakinada is a record of bona fide work carried out under my guidance and
supervision.
We hereby declare that the MINI PROJECT Report entitled “SPEECH MENTOR FOR
VISUALLY IMPAIRED PEOPLE” submitted for the B.Tech Degree is our original work and the
dissertation has not formed the basis for the award of any degree, associateship, fellowship or any
other similar titles.
Place: Vijayawada
Date:
V.BHAVANA (178W1A05H7)
K.SHUSHMA SRI (178W1A05G4)
P.G.L. SAHITHI (178W1A05F2)
K.JHANSI (188W5A0528)
ACKNOWLEDGEMENT
Behind every achievement lies an unfathomable sea of gratitude to those who made it possible, without
whom it would never have come into existence. To them we offer these words of gratitude.
We would like to thank our respected Principal, Dr. A.V. Ratna Prasad and Dr. D. Rajeswara Rao,
Head of the Department, Computer Science and Engineering for their support throughout our Project.
It is our sincere obligation to thank our guide, Ch. Raga Madhuri, Asst. Professor, Department of
Computer Science and Engineering, for her timely and valuable guidance and suggestions for this Project.
We owe our acknowledgements to an equally long list of people who helped us in this Project.
Finally, we wish to thank all the supporting staff who provided us with lab facilities for the completion of this
Project.
Place: Vijayawada
Date:
V.BHAVANA (178W1A05H7)
K.SHUSHMA SRI (178W1A05G4)
P.G.L. SAHITHI (178W1A05F2)
K.JHANSI (188W5A0528)
ABSTRACT
Book reading is an interesting habit, but it is difficult for the blind and the
visually impaired. Braille machines help them to some extent but are not
accessible to everyone. The current application aims to help such people by
making their work simple and handy. The system has an in-built camera fixed to
the spectacles that reads the content and gives audio as output to the user.
A Raspberry Pi accommodates the Pi camera and the audio output. The system
uses Optical Character Recognition (OCR) to extract text from the image,
converts the text to speech, and sends audio signals as output. In addition,
the system will be capable of extracting text from currency notes and text
labels on product packaging.
MOTIVATION
Problem Statement:
Visually impaired people often find it difficult to read books, to read name
plates, to identify currency notes, to read text on posters, etc. Braille
machines help them to some extent but are not accessible to everyone. We
propose a visual aid for completely blind individuals, with an integrated
reading assistant. The setup is mounted on a pair of eyeglasses and can
provide real-time auditory feedback.
Scope:
The scope of this application is limited to book reading, text extraction from
objects, etc. The background from which the text is extracted needs to be
black and white to obtain better results. Documents in cursive handwriting
cannot be detected effectively. The language set is limited to a few Indian
languages.
Objective:
The main objective of this application is to aid visually challenged people by
scanning each page of a book through their spectacles and by extracting text
from product labels, currency notes and posters. The system should give them
the audio version of the content in the respective languages.
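The page-to-audio pipeline described above can be sketched in a few lines. This is a minimal illustration, not the report's implementation: pytesseract (a wrapper over the Tesseract OCR engine) and pyttsx3 (an offline text-to-speech engine) are assumed library choices, and `image_to_speech` is a hypothetical helper name.

```python
# Sketch of the capture -> OCR -> speech pipeline, assuming pytesseract
# and pyttsx3 as the OCR and TTS components (illustrative choices only).
import re

def clean_text(raw: str) -> str:
    """Collapse OCR line breaks and repeated whitespace for smoother speech."""
    return re.sub(r"\s+", " ", raw).strip()

def image_to_speech(image_path: str, language: str = "eng") -> str:
    """Extract text from a captured page image and read it aloud.

    Returns the spoken text. Heavy dependencies are imported lazily so
    the text-cleaning helper above stays independently usable.
    """
    from PIL import Image      # pip install pillow
    import pytesseract         # pip install pytesseract (needs the Tesseract binary)
    import pyttsx3             # pip install pyttsx3

    raw = pytesseract.image_to_string(Image.open(image_path), lang=language)
    text = clean_text(raw)
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
    return text
```

On a Raspberry Pi, `image_path` would come from a Pi camera capture, and Tesseract's `lang` parameter selects the trained language data (e.g., an Indian-language pack) installed on the device.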
CHAPTER 2: LITERATURE SURVEY
ROI detection and text localisation algorithms are used for identifying text on
objects in [1]. A camera is used to capture the objects, and audio is given as
output to the blind user. The main advantage of this prototype is that it can
identify moving objects and can also read text labels. Its demerits are a lack
of accuracy in identifying objects and the time the process takes.
Region of interest pooling (also known as RoI pooling) is an operation widely
used in object detection tasks with convolutional neural networks, for example
to detect multiple cars and pedestrians in a single image. Its purpose is to
perform max pooling on inputs of non-uniform sizes to obtain fixed-size
feature maps (e.g., 7×7).
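The idea can be made concrete with a minimal NumPy sketch: one rectangular region of a 2-D feature map is divided into a fixed grid of bins, and the maximum of each bin forms the fixed-size output. The function name and the end-exclusive RoI convention are illustrative assumptions, not a standard API.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one rectangular RoI of a 2-D feature map to a fixed size.

    feature_map : 2-D array of shape (H, W)
    roi         : (y0, x0, y1, x1) in feature-map coordinates, end-exclusive
    output_size : (out_h, out_w) of the pooled map, regardless of RoI size
    """
    y0, x0, y1, x1 = roi
    region = feature_map[y0:y1, x0:x1]
    out_h, out_w = output_size
    # Split the region into an out_h x out_w grid of (possibly unequal) bins
    # and keep the maximum inside each bin.
    y_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    x_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.empty(output_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            pooled[i, j] = region[y_edges[i]:y_edges[i + 1],
                                  x_edges[j]:x_edges[j + 1]].max()
    return pooled
```

Because the grid edges are recomputed per RoI, regions of different sizes all pool down to the same `output_size`, which is exactly why detectors can feed them into fixed-size fully connected layers.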
A camera is used to recognize the colour and pattern of clothing in [2]. The
CCNY clothing pattern dataset and the UIUC texture pattern dataset are used.
The Radon Signature is used to characterize the directionality feature of
clothing patterns. The Radon Signature (RadonSig) is based on the Radon
transform, which is commonly used to detect the principal orientation of an
image; the image is then rotated according to this dominant direction to
achieve rotation invariance. This prototype identifies the cloth through the
camera and provides an audio description of its colour and pattern. The model
is used only for colour and pattern identification.
Images are trained using CNN in [4] and text is converted to speech. It uses and
ultrasonic sensor and CNN to identify the potholes on the road. This prototype
identifies the potholes in all the directions and assists the user all together. As
the distance increases the error rate also increases i.e., it is applicable only for
short range and the obstacles that are at knee level only can be identified and
those are at head level cannot be identified.
Sensors for obstacle avoidance and image processing algorithms are used for
object detection in [7]. The system includes a reading assistant in the form of
image to text converter and gives audio as an output. It can also detect moving
objects and allows the user to read the text from any document to inform the
user about his or her distance from the object. It fails to read text containing
tables and pictures and the accuracy of detection is very low.
In [8] the machine will accept the page form and also provides online correction
facility. The output medium is a digital magnetic tape which can also be
checked by an editor. The output is an audio. The prototype provides us with a
facility to hear the audio again and again and speaking range is adjustable
within wide limits. But the machine fails to detect the objects and has smaller
limited fonts.
CHAPTER 3: SOFTWARE REQUIREMENT ANALYSIS
Functional Requirements
Non Functional Requirements
Software Requirements
Chapter 4: PROPOSED SYSTEM