
SPEECH MENTOR FOR VISUALLY IMPAIRED PEOPLE

Mini project report submitted in partial fulfillment of the Requirements for the Award of the Degree of
BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE AND ENGINEERING
By
V.BHAVANA (178W1A05H7)
K.SHUSHMA SRI (178W1A05G4)
P.G.L. SAHITHI(178W1A05F2)
K.JHANSI (188W5A0528)

Under the Guidance of


Ch. Raga Madhuri
Asst. Professor
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
V.R SIDDHARTHA ENGINEERING COLLEGE
Autonomous and Approved by AICTE
Affiliated to Jawaharlal Nehru Technological University, Kakinada
Vijayawada 520007
2020
V.R.SIDDHARTHA ENGINEERING COLLEGE

(AUTONOMOUS)

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

CERTIFICATE

This is to certify that the MINI PROJECT Report entitled “SPEECH MENTOR FOR VISUALLY
IMPAIRED PEOPLE” being submitted by V.BHAVANA (178W1A05H7), K.SHUSHMA SRI (178W1A05G4),
P.G.L. SAHITHI (178W1A05F2) and K.JHANSI (188W5A0528) in partial fulfillment for the
award of the Degree of Bachelor of Technology in Computer Science and Engineering to the Jawaharlal Nehru
Technological University, Kakinada, is a record of bonafide work carried out under my guidance and
supervision.

Ch. Raga Madhuri, Asst. Professor          Dr. D. Rajeswara Rao

Guide                                      Professor & HOD


DECLARATION

We hereby declare that the MINI PROJECT Report entitled “SPEECH MENTOR FOR
VISUALLY IMPAIRED PEOPLE” submitted for the B.Tech Degree is our original work and the
dissertation has not formed the basis for the award of any degree, associateship, fellowship or any
other similar titles.

Place: Vijayawada

Date:
V.BHAVANA (178W1A05H7)
K.SHUSHMA SRI (178W1A05G4)
P.G.L. SAHITHI (178W1A05F2)
K.JHANSI (188W5A0528)

ACKNOWLEDGEMENT

Behind every achievement lies an unfathomable sea of gratitude to those who enabled it, without
whom it would never have come into existence. To them we offer these words of gratitude.

We would like to thank our respected Principal, Dr. A.V. Ratna Prasad and Dr. D. Rajeswara Rao,
Head of the Department, Computer Science and Engineering for their support throughout our Project.

It is our sincere obligation to thank our guide, Ch. Raga Madhuri, Asst. Professor, Department of
Computer Science and Engineering, for her timely and valuable guidance and suggestions for this Project.

We owe our acknowledgements to an equally long list of people who helped us in this Project.
Finally, we wish to thank all the supporting staff who provided us with the lab facilities needed
for the completion of this Project.

Place: Vijayawada

Date:

V.BHAVANA (178W1A05H7)
K.SHUSHMA SRI (178W1A05G4)
P.G.L. SAHITHI (178W1A05F2)
K.JHANSI (188W5A0528)
ABSTRACT
Book reading is an enjoyable habit, but it is difficult for blind and visually
impaired people. Braille machines help them to some extent but are not
accessible to everyone. The present application aims to help such people by
making their work simple and handy. The system has a built-in camera fixed to
a pair of spectacles that reads the content and gives audio output to the user.
A Raspberry Pi accommodates the Pi camera and the audio output. The system
uses Optical Character Recognition (OCR) to extract text from the captured
image, converts the text to speech, and sends audio signals as output. In
addition, the system is capable of extracting text from currency notes, text
labels on product packaging, etc.

Keywords: Optical Character Recognition, text-to-speech synthesis, Raspberry Pi, Pi camera.
INTRODUCTION
In the real world, books and documents are the sources of knowledge. But this
knowledge is accessible only to people with clear vision. Our society includes a
group of people who do not have clear vision or who are blind. For this group,
the world is like a black illusion.
As stated by the WHO, visually impaired people around the world number about
285 million, all struggling in day-to-day life. In India there are about 8.7
million people who are visually impaired. Our project focuses on providing a
better life to this large number of visually impaired people. With the help of
this project, a visually impaired person can do almost anything that a sighted
person does while reading. This device reads the text in front of the person,
in any language and in any font, so that they can know their surroundings and
gather knowledge. It gives them the freedom to study any book, which is
considered a vital part of anyone’s life. Among this large number of visually
impaired people, very few know the braille system that they use to learn.

In our world, information is generally available in the form of books and
documents. It is fully usable by sighted people, but it is a major problem for
a blind or visually impaired (BVI) person to interact with the world and share
knowledge. For them, information has to be in a special tactile language or in
voice format. About 20 million people in the USA live with visual impairments.
They are affected in every aspect of their daily life. Nowadays technology helps
them to overcome this difficulty to some extent. Many hardware and software
tools have been invented to help them. Some people lose their eyesight later in
life, and some have been blind since childhood. According to a World Health
Organization survey in 2015, some 246 million individuals worldwide are
visually impaired and 39 million are blind. Many types of investigations have
been carried out to solve these problems. Louis Braille, a French educator,
invented Braille. This is a tactile script, and users must learn it before they
can understand it. Today, several technologies are being developed for the
blind based on the latest advances, including aids based on OCR, obstacle
detection and much more. However, these techniques are not enough to overcome
all the problems of the blind. People with visual disability cannot get by in
their daily life without assistance. In many previous developments and designs
we have noticed drawbacks in aids such as mobile applications, blind sticks and
hand-wearable devices: they are not effective. The most difficult task for BVI
people is reading text from books or documents; acquiring information from the
world is very difficult for them. One feasible way to perform that task is for
someone to read the content aloud to them. That is why we introduce a new
technology, built on previous work, that is more efficient and low cost. This
electronic device will provide excellent assistance for impaired people. With a
camera, it can capture an image, extract the text using OCR, and convert it to
audio. Nowadays smart technology is adopted to help the blind: an application
is developed that reads aloud the content of a document.

MOTIVATION

The main motivation for this project is our interest in solving problems in
different domains, which led us to take up this work. We read news in the
Deccan Chronicle newspaper about a blind person who was hit by a train, and
about children stating “can’t read, so use new tech to let books speak”. These
reports prompted us to look for a new technology. We searched many IEEE papers
and websites for present technologies and found that there are several
technologies with minimal equipment. Existing projects have drawbacks: they can
be handled only by people who know present technology, such as smartphones. The
main goal is the planning and execution of smart devices for the blind. This
project encourages visually impaired individuals to feel comfortable in this
world. Here we present a system for obstacle avoidance and for reading
textbooks, posters and other texts. With a camera, the system can capture an
image, extract the text using OCR, and convert it to audio.

Problem Statement:

Visually impaired people often find it difficult to read books, to read name
plates, to identify currency notes, to read text on posters, etc. Braille
machines help them to some extent but are not accessible to everyone. We
propose a visual aid for completely blind individuals, with an integrated
reading assistant. The setup is mounted on a pair of eyeglasses and can provide
real-time auditory feedback.

Scope:

The scope of this application is limited to book reading, text extraction from
objects, etc. The background from which the text is extracted needs to be plain
black and white to obtain better results. Documents in cursive handwriting
cannot be detected effectively. The language set is limited to a few Indian
languages.
Objective:

The main objective of this application is to aid visually challenged people by
scanning each page of a book through their spectacles, to extract text from
product labels, currency notes and posters, and to give them an audio version
of the content in the respective language.
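The capture, text-extraction and speech steps above can be sketched as a small pipeline. This is a minimal sketch only: `extract_text` and `speak` are hypothetical placeholders that, on the actual device, would wrap an OCR engine (for example Tesseract via pytesseract) and a TTS engine (for example eSpeak or gTTS); they are injected as callables here so the glue logic can run without camera hardware.

```python
# Minimal sketch of the capture -> OCR -> text-to-speech pipeline.
# `extract_text` and `speak` are injected placeholders for real OCR
# and TTS engines; the names are illustrative, not from any library.
def read_aloud(image, extract_text, speak, lang="en"):
    """Run OCR on a captured image and speak the recognised text."""
    text = extract_text(image)          # e.g. pytesseract.image_to_string
    if not text.strip():
        speak("No text detected", lang)  # fallback audio message
        return ""
    speak(text, lang)                    # e.g. gTTS / eSpeak playback
    return text
```

On the Raspberry Pi, the same function would simply be called with the real OCR and TTS wrappers in place of the test doubles.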

Chapter 2: LITERATURE SURVEY

ROI detection and text localisation algorithms are used for identifying text on
objects in [1]. A camera is used to capture the objects, and audio is produced
as output for the blind user. The main advantage of this prototype is that it
can identify moving objects and can also read text labels. Some of its demerits
are its lack of accuracy in identifying objects and the fact that it is a
time-consuming process.
Region of interest pooling (also known as RoI pooling) is an operation widely
used in object detection tasks using convolutional neural networks.
For example, to detect multiple cars and pedestrians in a single image. Its
purpose is to perform max pooling on inputs of nonuniform sizes to obtain
fixed-size feature maps (e.g. 7×7).
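As an illustration, RoI max pooling can be sketched in NumPy. This is a simplified version assuming the RoI is given in integer feature-map coordinates; real detectors (e.g. Fast R-CNN) additionally handle the scale mapping from image to feature map and fractional bin boundaries.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(7, 7)):
    """Max-pool an arbitrary-size region of interest to a fixed grid.

    feature_map: 2-D array (H, W); roi: (x0, y0, x1, y1) integer coords.
    """
    x0, y0, x1, y1 = roi
    region = feature_map[y0:y1, x0:x1]
    out_h, out_w = output_size
    # Split the region into an out_h x out_w grid of (possibly uneven)
    # bins and take the maximum inside each bin.
    h_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    w_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.zeros(output_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            bin_ = region[h_edges[i]:h_edges[i + 1],
                          w_edges[j]:w_edges[j + 1]]
            if bin_.size:
                pooled[i, j] = bin_.max()
    return pooled
```

Whatever the size of the proposed region, the output is always `output_size`, which is what lets fully connected layers follow the pooling stage.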

A camera is used to recognize the color and pattern of cloth in [2]. The CCNY
clothing pattern dataset and the UIUC texture pattern dataset are used. The
Radon Signature is used to characterize the directionality feature of clothing
patterns. The Radon Signature (RadonSig) is based on the Radon transform, which
is commonly used to detect the principal orientation of an image. The image is
then rotated according to this dominant direction to achieve rotation
invariance. This prototype identifies the cloth through the camera, and an
audio description of the color and pattern of the cloth is provided. This model
is used only for color and pattern identification.
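A simplified sketch of the Radon Signature idea, assuming SciPy is available: project the image at a set of angles and take the variance of each projection as the signature; the angle of maximum variance approximates the principal orientation. This illustrates the concept only and is not the exact RadonSig formulation used in [2].

```python
import numpy as np
from scipy.ndimage import rotate

def radon_signature(image, angles=np.arange(0, 180, 10)):
    """Variance of the Radon projection at each angle."""
    signature = []
    for theta in angles:
        # Rotating then summing columns gives the projection at angle theta.
        rotated = rotate(image, theta, reshape=False, order=1)
        projection = rotated.sum(axis=0)
        signature.append(projection.var())
    return np.array(signature)

def dominant_orientation(image, angles=np.arange(0, 180, 10)):
    """Angle whose projection varies most: the principal orientation."""
    sig = radon_signature(image, angles)
    return int(angles[int(np.argmax(sig))])
```

For a striped fabric, the projection taken along the stripe direction alternates strongly between stripe and background, so its variance peaks there; rotating the image by the detected angle then gives the rotation invariance described above.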

In [3] a sight-to-sound human-machine interface is proposed that employs a
camera to perform scene analysis and generate feedback. Here the machine
captures images, identifies the key objects and then builds a map that the
blind user can follow. The output is audio, and objects are detected through a
smart device camera. The demerit of the prototype is that it takes a lot of
time to identify all the possible ways to reach the destination by detecting
all the obstacles. It is applicable indoors only.

Images are trained using a CNN in [4] and text is converted to speech. It uses
an ultrasonic sensor and a CNN to identify potholes on the road. This prototype
identifies potholes in all directions and assists the user accordingly. As the
distance increases the error rate also increases, i.e., it is applicable only
at short range, and only obstacles at knee level can be identified; those at
head level cannot be identified.

In [5] an infrared transceiver sensor module and the triangulation method are
used for distance detection of aerial objects. It uses a stick and glasses as a
medium to capture images and gives audio as output. This model can read text
from aerial objects, and pothole identification is also possible, but the
identification of potholes is not always accurate.

An Optical Character Recognition module is used to detect text in images,
hardcopies or documents with the help of a camera in [6]. It converts the text
from the captured image into voice using a TTS module. It can detect an object
within a fine range and its accuracy is about 98.3%. The demerit of the
prototype is that detection of frontal aerial and ground objects is not
possible.

Sensors for obstacle avoidance and image processing algorithms are used for
object detection in [7]. The system includes a reading assistant in the form of
an image-to-text converter and gives audio as output. It can also detect moving
objects, inform the user of his or her distance from an object, and allow the
user to read text from any document. It fails to read text containing tables
and pictures, and its detection accuracy is very low.

In [8] the machine accepts input in page form and also provides an online
correction facility. The output medium is a digital magnetic tape, which can
also be checked by an editor. The final output is audio. The prototype provides
the facility to hear the audio again and again, and the speaking rate is
adjustable within wide limits. But the machine fails to detect objects and
supports only a limited set of fonts.
CHAPTER 3: SOFTWARE REQUIREMENT ANALYSIS

Functional Requirements
Non Functional Requirements
Software Requirements
Chapter 4: PROPOSED SYSTEM
