
Fourth International Conference on Electronics, Communication and Aerospace Technology (ICECA-2020)

IEEE Xplore Part Number: CFP20J88-ART; ISBN: 978-1-7281-6387-1

Design and Implementation of Voice Assisted Smart Glasses for Visually Impaired People Using Google Vision API

P. Selvi Rajendran, Prof., CSE, Hindustan Institute of Technology and Science, Chennai, India, selvir@hindustanuniv.ac.in
Padmaveni Krishnan, Assoc. Prof., CSE, Hindustan Institute of Technology and Science, Chennai, India, kpadmaveni@hindustanuniv.ac.in
D. John Aravindhar, Prof., CSE, Hindustan Institute of Technology and Science, Chennai, India, jaravindhar@hindustanuniv.ac.in

Abstract—Generally, visually challenged people tend to have difficulties in traveling and in managing many kinds of challenges in their routine life. Mostly, wooden sticks are used to sense barriers and obstacles around them. As a result, visually impaired people cannot know exactly what kind of obstacle they face and must rely entirely on lead sticks and training to navigate safely and in the right direction. This research work focuses on the development of a guidance system in which a smart glass, paired with a sensor, continually captures images of the environment around the wearer. The smart glass is equipped with a processor that processes the captured images and detects objects, informing the user about the results so that the user gains a much more comprehensive view of the surroundings. This system informs visually impaired people not only about the traveling route and the distance to an obstacle, but also about what the obstacle is. The smart glass can sense the distance to the obstacle and produce a warning to alert the user in advance. The application provides a speech-based interface: the user records a voice message naming the destination, and it is played back when the user is about to reach that destination. Thus, instead of an alarm signal, the blind user hears the location in the voice recorded by the user.

Keywords—Assistive device, Image Recognition, Arduino, Visually Impaired People, Smart Glass

I. INTRODUCTION

India is a country where vision disability is a major concern, arising both from pervasive violence and from water and food poisoning that cause such defects in newborn babies. Technical developments are therefore required to help people living in hard surroundings. On the whole, visually challenged people are gradually becoming capable of performing everyday activities in their own way: they can travel on the highways and move inside their homes independently. It is known, however, that people with disabilities need the help of others to recognize items. Recent research has presented a variety of approaches to overcome the problems of visually impaired persons (VIPs) and to give them more mobility, but these methods have not sufficiently addressed safety when VIPs travel on their own. Moreover, many alternative systems are available, but no method helps disabled people stay in constant touch with their family and relatives, and such systems are also typically complicated and costly.

The main challenges faced by blind people lie in managing indoor and outdoor conditions comprising numerous barriers, and in being aware of the individual in front of them. It is difficult to distinguish items or individuals with only perceptive and audio knowledge. Smart, lightweight, inexpensive, self-contained guidance and facial recognition systems are therefore widely sought after by people who are blind. The system proposed here helps blind people navigate indoor and outdoor environments with the support of a smartphone; it is equipped with ultrasonic sensors and a GPS facility to provide location information [1,2]. The proposed device has many features that give visually challenged people a means of secure and autonomous movement, and it provides constant communication with family members and caregivers, who can monitor the location where the user is stuck. The device depends on cameras that collect photos to identify items at a distance from the user, warn the family and caregivers when the user comes to a standstill, and notify the family, caregivers, and others when the user wants assistance. This research proposes a simple mobile guidance system to resolve direction-finding problems for visually challenged people and to reduce the hurdles they face, helping visually impaired people travel effectively from a starting location to a destination with a better understanding of their environment.

II. RELATED WORK

Hsieh-Chang [3] carried out obstacle-recognition research based on applying knowledge to enable visually impaired people to avoid obstacles while traveling through an unknown environment. The machine consists of three parts: the identification of situations, the identification of barriers, and the voice announcement. They reduced the over-segmentation problem by adding the ground plane. This approach solves the segmentation issue by eliminating the edge and original seed-location issues of the region-growth process through the Connected Component Method (CCM).

Van-Nam [2] increased the precision of the procedure by sensing obstacles, localizing them, and then transmitting information about the obstacles to visually disabled people through various modalities such as speech, tactile feedback, and vibration. They introduced a visually driven assistance device based on an electrode matrix and a handheld Kinect. The method consists of two key components: the processing and analysis of the context, and the representation of information. The first component analyzes the world using the mobile Kinect to recognize predefined obstacles for visually challenged persons, while the second presents the obstacle data in the form of an electrode matrix.

978-1-7281-6387-1/20/$31.00 ©2020 IEEE 1221


Muhammad Asad [1] implemented a methodology to find the direction among several straight paths in various real-world environments. Each straight path carries a corresponding description; when an image is captured, the paths appear to converge to a predetermined location, the vanishing point. The proper isolation and practical simulation of the captured frame contribute to the analysis of these parallel edges. The vanishing point is then measured, and a decision-making mechanism is created which shows the visually challenged person his or her divergence from the straight line.

This proposed device has many features that provide visually challenged people with a means of secure and autonomous movement and constant communication with their families and caregivers, who can monitor their whereabouts. The proposed smart glass depends entirely on a camera embedded in the front and center portion of the glasses. The camera collects photos of the environment in front of the blind person to identify obstacles on the path. The control unit is responsible for sending a notification to family members and caregivers when the user fails to detect an obstacle and runs into trouble. Information is also communicated to the family, caregivers, and others when the user wants assistance, by sending a prerecorded voice message.

A. Maintaining the Integrity of the Specifications

1) SmartCane: Smart Cane [4,16,19] is an interactive navigation assistance device that works like a white-cane handle. Since a white cane can only detect obstacles up to a certain height, SmartCane complements its versatility by detecting obstacles from the knee up to the height of the head. Electrical signals are used to detect obstacles that may create a harmful effect, and the position of an obstacle is conveyed to the user as a vibration pattern. All the components are connected to a battery, like a cell phone, and the device can be used in both indoor and outdoor navigation modes. It was developed to accommodate the various grips that are widely used by visually disabled individuals.

2) TapTapSee: TapTapSee [4,16,20] is a handheld camera technology developed exclusively for blind and visually disabled people, powered by the CloudSight Image Recognition API. TapTapSee uses the camera and VoiceOver features of a smartphone to record a picture or video of something and recognize it clearly for a blind person. The user double-taps the top portion of the screen to take a photo, or double-taps the left side of the screen to take a recording. TapTapSee will reliably visualize and describe any two- or three-dimensional object in any direction within seconds. The VoiceOver unit then speaks the identification aloud.

3) Smart shoes: Smart shoes [4,16,21] are coordinated with the person's device, and software that piggybacks on Google Maps enables the shoes to keep a record of where the wearer is headed. Once a destination is entered and a route is picked, the user can put the mobile phone away and run or walk, guided by a buzz in the left or right shoe.

III. TALKING SMART GLASS EMBEDDED WITH COMPUTER VISION

The proposed wearable system includes a pair of glasses, an obstacle detection module mounted on the processing unit, and an output panel. The output panel element is a beeper that warns the user of an obstruction, together with the power supply. The obstacle detection module and the output panel assembly are attached to the processor unit. The power supply provides electricity for the central processing unit.

An ultrasonic sensor is used to implement the control module. The control unit drives the ultrasonic sensors, which gather information on the obstacle in front of the person; it evaluates this information and sends feedback through the buzzer. The resulting ultrasonic smart glasses for blind people are a portable, easy-to-use, lightweight, user-friendly, and low-cost device that can easily guide and assist the blind.

Figure 1. Architecture diagram of the voice-assisted smart glasses

A. Training Data

The training data used for the model is taken from the Common Objects in Context (COCO) dataset. You Only Look Once (YOLO) v3 is used for detection, and the weight values are obtained from a pretrained weights file of over 200 MB [7,8,9,10].

B. Model

The model uses the You Only Look Once (YOLO) algorithm, which operates on an extremely complicated convolutional neural network architecture called Darknet. The Python cv2 module is used to set up Darknet from the yolov3.cfg configuration file.

C. Input Data

The sensor mounted in the smart glass transmits images to the model. The transfer speed is thirty frames per second, and the system can be programmed to handle only every other frame to improve performance.
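The frame-thinning behaviour described under "C. Input Data" (a 30 fps stream of which only every other frame is forwarded to the detector) can be sketched as below. This is a minimal sketch, not the authors' code: `detect` is a hypothetical stand-in for the YOLO model, and the function names are illustrative.

```python
# Sketch of the input stage: a 30 fps stream, of which only every
# `step`-th frame is forwarded to the detector (step=2 reproduces the
# "every other frame" behaviour described in the paper).

def sample_frames(frames, step=2):
    """Yield every `step`-th frame of the incoming stream."""
    for i, frame in enumerate(frames):
        if i % step == 0:
            yield frame

def run_pipeline(frames, detect, step=2):
    """Run the (hypothetical) detector on the thinned-out stream."""
    return [detect(frame) for frame in sample_frames(frames, step)]

# Usage: for one second of video (30 frames), only 15 frames reach
# the detector, halving the per-second inference load.
one_second = list(range(30))
detections = run_pipeline(one_second, detect=lambda f: ("objects", f))
```

Halving the detector's workload this way trades a small amount of latency for throughput, which matters on constrained hardware such as a Raspberry Pi.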


D. API

The class description of the objects observed in each frame forms a sequence over the images, e.g. "car." The position of the object in the image (top, left, right, center, bottom) is received along with the image and added to the class prediction, e.g. "auto." The resulting text can then be sent to the Google Text-to-Speech API using the gTTS package [7,8,9,10] to transform it into speech and alert the blind person.

E. Output

As output, the bounding-box coordinates of every object are captured within the frames; the boxes are associated with the detected objects, and the stream of frames is returned as a video replay. Every second, a voice response is generated for the first frame of that second (instead of all 30 fps), e.g. "lower left cat", meaning a cat is located at the bottom left of the camera view.

F. YOLO algorithm

You Only Look Once (YOLO) is a real-time object detection algorithm [12,17], among the most effective object detection algorithms, combining some of the most innovative ideas from the computer vision and machine learning communities. Obstacle detection is also a core feature of autonomous vehicle technologies. YOLO is a convolutional neural network (CNN) for real-time object identification. The algorithm passes the whole image through the layers of a neural network, divides the image into several small regions, and computes bounding boxes and class probabilities for each region. These bounding boxes are then weighted according to the predicted probabilities.

G. Darknet

There are several different versions of the YOLO algorithm on the internet. Darknet is an open-source neural network framework. Darknet was developed in the C language with CUDA technology, which makes it very fast and allows GPU computing, important for real-time predictions. Installation is easy and takes only three lines of code (it is important to change the settings in the Makefile after cloning the repository in order to use the GPU). After installation, one can use a pre-trained model or train a new one from scratch.

H. Working Principle

The computer-vision glass consists of two major hardware components, a Raspberry Pi and a compact HD camera. The camera serves as the input sensor, collecting the live video stream from the surroundings and feeding it to the brain, i.e. the Raspberry Pi, which monitors the entire system and detects and recognizes objects present in the neighboring area. The most important component, sensing and identifying objects, was built around the YOLO algorithm so that it can run on resource-constrained devices such as a Raspberry Pi or a smartphone.

IV. IMPLEMENTATION

The proposed system is implemented by combining the technology stacks discussed below. Android Studio is used for application development because it is the approved Integrated Development Environment (IDE) primarily developed for Android applications [7]. The OpenCV library is used for image recognition since it offers support for real-time applications [8]. The Python scripting language has been used to build the machine learning model [9]. TensorFlow is used for the machine learning components; it has a scalable architecture that makes it easy to spread computation efficiently across a range of platforms [10]. The YOLO (You Only Look Once) algorithm has been used to analyze the video streams in real time. It uses a particular convolutional neural network that is stronger than the R-CNN (Regional Convolutional Neural Network) and predicts targets in the form of bounding boxes [11]. Google Cloud is used to train the machine learning models, as it provides high computing power at low cost; it also allows a huge quantity of data to be processed in the cloud, which can later be used to construct a more reliable model [12].

The system uses two modules, an object announcer and a text reader, both operating in real time. The text reader reads data in real time using the same method, which lets users easily interpret menu cards in hotels, hotel room numbers, or even find their belongings.

A. Dataset

For testing purposes, the Common Objects in Context dataset is used to train the You Only Look Once model and the text reader. The model is capable of recognizing eighty distinct classes.

V. RESULTS AND DISCUSSION

The application can recognize various classes of hazards that may be encountered while traveling: ordinary objects and machinery, various types of vehicles, food, etc., as can be seen in Table 1.

Using this method, an obstacle can be found in a specific area. After the obstacle has been found, the user is navigated toward the destination. The rationale is to establish two borders, at the left and right of the region of interest (ROI), so that the ROI is divided into three parts. If the obstacle is in the right part, the user is told to move to the left; if the obstacle is in the left part, the user is told to move to the right.
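The positional announcement described under "D. API" and "E. Output" (class label prefixed with a grid position, e.g. "lower left cat") can be sketched as below. This is an illustrative reconstruction, not the authors' code: the function name and the 3x3 grid granularity are assumptions.

```python
# Sketch of the positional announcement: the bounding-box centre is
# mapped onto a 3x3 grid of the frame, and the resulting grid cell
# is prefixed to the predicted class label.

def position_phrase(box, frame_w, frame_h, label):
    """box = (x, y, w, h) in pixels; returns e.g. 'lower left cat'."""
    cx = box[0] + box[2] / 2.0            # bounding-box centre
    cy = box[1] + box[3] / 2.0
    col = ("left", "center", "right")[min(int(3 * cx / frame_w), 2)]
    row = ("upper", "middle", "lower")[min(int(3 * cy / frame_h), 2)]
    if (row, col) == ("middle", "center"):
        return "center " + label          # dead-centre special case
    return f"{row} {col} {label}"
```

A phrase produced this way could then be handed to the gTTS package mentioned above, e.g. `gTTS(text=phrase).save("alert.mp3")`, to generate the spoken alert.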

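The three-zone ROI rule from the Results section, together with the distance-to-vibration schedule reported in Table 1, can be sketched as below. The helper names are illustrative, and the behaviour for a dead-ahead obstacle is an assumption, since the paper only specifies the left and right cases.

```python
# Sketch of the navigation decision (ROI split into three vertical
# zones) and the distance-to-vibration feedback of Table 1.

def steer_advice(obstacle_cx, roi_width):
    """Advise a direction from the obstacle's x-centre within the ROI."""
    third = roi_width / 3.0
    if obstacle_cx > 2 * third:       # obstacle in the right zone
        return "move left"
    if obstacle_cx < third:           # obstacle in the left zone
        return "move right"
    return "stop"                     # centre zone: assumed behaviour

def vibration_pattern(distance_m):
    """Distance (m) -> (count, duration_s, type), following Table 1."""
    if distance_m <= 4:
        return (1, 2, "continuous")   # 1-4 m: one 2 s continuous buzz
    if distance_m <= 5:
        return (2, 1, "short")        # 5 m: two 1 s short buzzes
    if distance_m <= 10:
        return (3, 1, "short")        # 10 m: three 1 s short buzzes
    return (4, 1, "short")            # 15 m: four 1 s short buzzes
```

The schedule mirrors the table's trend: as the obstacle gets closer, the feedback shifts from short pulses to a continuous vibration.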

Figure 2. Obstacle Detection

The relation between the distance and the frequency of vibration is given in Table 1, together with the type of vibration provided as feedback to the user. As the table shows, the vibration intensifies as the distance from the obstacle to the user decreases, and users are ultimately alerted with continuous vibration of the device.

Table 1. Experimentation of Obstacle Detection

Distance   Time    Vibration (count)   Vibration (sec)   Vibration (Type)
1 m        1 sec   1 time              2 sec             continuous
2 m        1 sec   1 time              2 sec             continuous
3 m        1 sec   1 time              2 sec             continuous
4 m        1 sec   1 time              2 sec             continuous
5 m        1 sec   2 times             1 sec             short
10 m       2 sec   3 times             1 sec             short
15 m       2 sec   4 times             1 sec             short

Table 1 shows obstacle detection with vibration indication. The experiment was performed for distances ranging from one meter to fifteen meters. As an optional feature, the user can tell the system to stop vibrating with a voice command such as "Stop Vibration" until the user makes contact with the object or person. The vibration also varies from the farthest distance to the nearest point: as the distance between the obstacle and the user decreases, the vibration type changes from short to continuous mode, giving a continuous alert about the unsafe situation. There is also the option of providing continuous feedback to the user on the state of the system.

VI. CONCLUSION

The device presented here is a smart glass that incorporates machine vision with obstacle detection and recognition. It can be conveniently marketed and made accessible to the visually disabled population, and it would help to prevent future injuries. The device can be carried comfortably, and its camera can be used to track objects in the surrounding environment and announce them in audio format, thus allowing visually disabled individuals to 'see without the eye'.

REFERENCES

[1] Muhammad Asad, Waseem Ikram, "Smartphone based guidance system for visually impaired person", 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA), 15-18 Oct. 2012.
[2] Hsieh-Chang Huang, Ching-Tang Hsieh and Cheng-Hsiang Yeh, "An Indoor Obstacle Detection System Using Depth Information and Region Growth", Sensors (Basel), Vol. 15, No. 10, pp. 27116-27141, Oct. 2015.
[3] Ales Berger, Andrea Vokalova, Petra Poulová, "Google Glass Used as Assistive Technology: Its Utilization for Blind and Visually Impaired People", book chapter, Springer International Publishing.
[4] Aditya Raj, Manish Kannaujiya, Ajeet Bharti, Rahul Prasad, Namrata Singh, Ishan Bhardwaj, "Model for Object Detection using Computer Vision and Machine Learning for Decision Making", International Journal of Computer Applications (0975-8887), Vol. 181, No. 43, March 2019.
[5] Selman Tosun, Enis Karaarslan, "Real-Time Object Detection Application for Visually Impaired People: Third Eye", IEEE Conferences, 2018.
[6] Android Developer, https://developer.android.com/
[7] OpenCV, https://opencv.org/
[8] Python programming language, https://www.python.org/
[9] TensorFlow, https://www.tensorflow.org/
[10] Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi, "You Only Look Once: Unified, Real-Time Object Detection", Cornell University, Jun 2015.
[11] Google, "Google Cloud Services", https://cloud.google.com/
[12] P. Selvi Rajendran, "Virtual Bulletin Board using Man-Machine Interface (MMI) for Authorized Users", Indian Journal of Science and Technology, Vol. 12, No. 34, September 2019.
[13] P. Selvi Rajendran, "AREDAI: Augmented Reality based Educational Artificial Intelligence System", International Journal of Recent Technology and Engineering (IJRTE), Vol. 8, No. 1, May 2019.
[14] P. Selvi Rajendran, "Virtual Information Kiosk Using Augmented Reality for Easy Shopping", International Journal of Pure and Applied Mathematics (IJPAM), Vol. 118, No. 20, pp. 985-994, 2018.
[15] Suresh A., Arora C., Laha D., Gaba D., Bhambri S., "Intelligent Smart Glass for Visually Impaired Using Deep Learning Machine Vision Techniques and Robot Operating System (ROS)", in: Kim J.H. et al. (eds) Robot Intelligence Technology and Applications 5, RiTA 2017, Advances in Intelligent Systems and Computing, Vol. 751, Springer, 2019.
[16] Redmon, Joseph, Santosh Divvala, Ross Girshick, and Ali Farhadi, "You Only Look Once: Unified, Real-Time Object Detection", in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779-788, 2016.
[17] Liu, Wei, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg, "SSD: Single Shot MultiBox Detector", in European Conference on Computer Vision, pp. 21-37, Springer, Cham, 2016.
[18] Mohamed Dhiaeddine Messaoudi, Bob-Antoine J. Menelas and Hamid Mcheick, "Autonomous Smart White Cane Navigation System for Indoor Usage", Technologies, Vol. 8, 37, 2020.
[19] Deepti Patole, Sunayana Jadhava, Saket Gupta, Khyatee Thakkar, "Touch Tap See: A Navigation Assistant for the Visually Impaired Person", SSRN Electronic Journal, January 2019.
[20] Moaiad Khder, "Smart Shoes for Visually Impaired/Blind People", International Conference on Sustainable Futures (ICSF 2017), Bahrain, November 2017.

