
FaceReminder Mobile Application for Prosopagnosia Patients

Lamis Kamal Omara, Suhaila Ahmed Bekhet, Mohammed Zakaria Hussien, Nada Nasr Ali Elmaghraby, Shimaa Mamdouh Ahmed Aboudeif
Systems and Biomedical Department, Faculty of Engineering, Cairo University
lamiskamal665@gmail.com, suhailaahmedbk@gmail.com, mohamed.sedik01@eng-st.cu.edu.eg, nadamaghraby18@gmail.com, shimaa.aboudief98@eng-st.cu.edu.eg

Abstract— This paper presents a mobile assistive technology for prosopagnosia (face blindness) patients, who are characterized by an inability to recognize familiar faces. We present a mobile assistive application named "FaceReminder" that performs the task of face recognition on their behalf. The application can also be connected to any Bluetooth camera to facilitate taking a picture and detecting a human face. The proposed method is based on the FaceNet convolutional neural network (CNN).

Keywords— Face recognition, Assistive mobile application development, Databases, Prosopagnosia, Face blindness

I. INTRODUCTION

Prosopagnosia is a neurological condition in which the patient's brain loses its ability to recognize familiar faces, as well as their facial expressions, despite normal vision. It has two types: developmental, or acquired due to brain damage, such as following a stroke, head injury, inflammation of the brain (encephalitis), or Alzheimer's disease. Although studies have shown that 1 in 50 people worldwide suffer from developmental prosopagnosia, there is still no treatment for the condition [1].

Statistics show that there are currently over six billion smartphone subscriptions worldwide, a number predicted to increase by several hundred million over the next few years, so smartphones are widely accessible. At the same time, artificial intelligence models in the field of facial recognition have improved to the point where their performance is comparable with that of humans. Our mobile application assists these patients by performing the task of face recognition for them, employing computer vision techniques [2].

II. LITERATURE REVIEW

In 2010, a wearable device named OrCam MyEye [3] was introduced. Designed for visually impaired and blind users, it attaches to the patient's glasses to help them recognize faces and read text, and is activated by voice and/or a click. The device only recognizes faces that are registered by the user through a complicated process, and it costs around $4,250 USD. In 2015, the Social Recall mobile application [4] was introduced for events: attendees consent to their information being used during the event, after which it is deleted. The same idea is currently being extended to prosopagnosia patients.

III. METHODOLOGY

To assist prosopagnosia patients by simplifying the task of remembering people's faces, we propose a solution built around a mobile app that uses computer vision and facial recognition technology. The application is divided into three main parts: face recognition techniques, a back-end server, and a front-end.

A. Techniques used in face recognition

Various techniques are used for facial recognition, including traditional feature-based methods, deep learning-based methods, Principal Component Analysis (PCA), local feature descriptors, 3D facial recognition, face detection, and face alignment. Our FaceReminder mobile app incorporates RetinaFace for face detection and alignment, along with a deep learning-based method for recognition using the FaceNet model. RetinaFace is a robust algorithm capable of accurately detecting faces under diverse environmental conditions, orientations, and lighting. The FaceNet model, developed by researchers at Google Inc., is a dependable and resilient solution that fulfills our requirements. We employ the lightweight Inception-ResNet architecture in the process.

• RetinaFace face detection and alignment

RetinaFace is a state-of-the-art face detection and alignment algorithm widely used in facial recognition applications [5]. It incorporates a feature pyramid for detecting small faces, a context module for enhanced performance, and a multi-task loss function for efficient computation. It detects multiple faces in an image, but our objective is to identify a single face for the recognition task. To achieve this, we iterate over the detected faces, select the largest one, and input it into the model, as sketched below.

Fig. 1. Sample output of RetinaFace
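The paper does not reproduce this selection code, so the following is a minimal sketch of the largest-face step, assuming the open-source retina-face package (the RetinaFace.detect_faces call and its return format are that package's API, not necessarily the authors' exact code):

```python
# Sketch of the largest-face selection step, using the open-source
# retina-face package (pip install retina-face opencv-python).
import cv2
from retinaface import RetinaFace


def detect_largest_face(image_path: str):
    """Detect all faces in an image and return the crop of the largest one."""
    faces = RetinaFace.detect_faces(image_path)  # {"face_1": {...}, "face_2": {...}, ...}
    if not isinstance(faces, dict) or not faces:
        return None  # no face detected

    def box_area(face: dict) -> int:
        x1, y1, x2, y2 = face["facial_area"]
        return (x2 - x1) * (y2 - y1)

    # Keep only the detection with the largest bounding box.
    x1, y1, x2, y2 = max(faces.values(), key=box_area)["facial_area"]
    image = cv2.imread(image_path)
    return image[y1:y2, x1:x2]  # the single face crop passed on to recognition
```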
• FaceNet

The FaceNet system was developed by researchers at Google Inc. [6] and is composed of a deep CNN model, L2 normalization, embeddings, and a triplet loss function.

Fig. 2. System architecture

The primary distinction between FaceNet and other methods lies in its ability to learn a mapping directly from images and generate embeddings: 128-dimensional vectors representing the essential facial features. In our implementation, we use the Inception model in the GoogLeNet style. This architecture is advantageous due to its lower parameter count, ranging from approximately 6.6 million to 7.5 million, in contrast to the Zeiler & Fergus architecture with 140 million parameters. As a result, the FaceNet model we employ is more lightweight and efficient.
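As an illustration of how such an embedding is obtained in practice, the snippet below uses the DeepFace framework discussed in the next subsection; the keyword arguments reflect recent deepface releases and may differ from the authors' exact setup:

```python
# Hedged sketch: computing a 128-dimensional FaceNet embedding via DeepFace.
from deepface import DeepFace

results = DeepFace.represent(
    img_path="person.jpg",
    model_name="Facenet",           # FaceNet backbone, 128-d embeddings
    detector_backend="retinaface",  # detection + alignment before embedding
)
embedding = results[0]["embedding"]  # one result per detected face
print(len(embedding))                # -> 128
```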
• DeepFace Library

DeepFace is a lightweight face recognition and facial attribute analysis framework for Python, licensed under the MIT License [7].

The framework offers a diverse range of models, including VGG-Face, FaceNet, FaceNet512, OpenFace, DeepFace, ArcFace, and SFace, as well as various face detectors such as MediaPipe, SSD, Dlib, MTCNN, and RetinaFace. Among these detectors, MTCNN and RetinaFace are particularly robust in handling variations in environmental conditions, facial expressions, and face orientations; this matters because, in our case, we have limited control over those factors. After comparing the two against the different recognition models in the framework, as shown in Fig. 3, it turned out that RetinaFace is two times faster than MTCNN, which is an important criterion for the success of our application.

Fig. 3. Processing time of MTCNN and RetinaFace

The framework also grants us access to its source code, enabling us to modify and overwrite its functions to tailor them to our specific needs; an example of a recognition query through the framework is sketched below.
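The folder layout here ("db/" with one sub-folder of images per person) is a hypothetical example, and the return type follows recent deepface releases:

```python
# Illustrative recognition query combining the RetinaFace detector with the
# FaceNet recognition model, as described above.
from deepface import DeepFace

matches = DeepFace.find(
    img_path="query.jpg",           # picture taken by the app
    db_path="db",                   # hypothetical folder: one sub-folder per person
    model_name="Facenet",
    detector_backend="retinaface",
)

# Recent deepface versions return a list of pandas DataFrames, one per face
# found in the query image, each sorted by embedding distance.
if matches and not matches[0].empty:
    print("Best match:", matches[0].iloc[0]["identity"])
else:
    print("Person not found in the database")
```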
• Data Collection and Preparation

For data collection, we used two distinct sources. The first was the Face Recognition Dataset from Kaggle, which contains face data belonging to 31 distinct classes representing celebrities [8]. The second was a custom Google Form through which individuals from our surroundings were asked to provide their face data, resulting in a collection of 129 different classes. Both datasets were then combined into a unified dataset, which was used for the validation of our models.

Before the collected data is introduced to the model, a preprocessing step is performed: all gathered images are scaled to 512x512 pixels and converted to grayscale. Resizing and grayscale conversion standardize the data into a consistent format that can be used effectively by the model.
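A minimal sketch of this preprocessing step with OpenCV (the paper does not name the library used; the file path is a placeholder):

```python
# Resize every image to 512x512 pixels and convert it to grayscale.
import cv2


def preprocess(image_path: str, size: int = 512):
    image = cv2.imread(image_path)                   # load as a BGR array
    image = cv2.resize(image, (size, size))          # standardize spatial size
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)   # drop the color channels
```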
B. The proposed mobile application

• Architecture

Fig. 4 shows the overall system architecture of the mobile application. It interfaces with the AI face recognition model in order to recognize faces around the user.

Fig. 4. The system architecture of the mobile application

After either registering a new patient account or logging into an existing one, the user is asked to choose a face detection method from three options: mobile camera, local image, or external camera. The picture is then passed to the face recognition model, which decides whether the person exists. If so, the application displays information about that person; otherwise, it asks whether the user wants to add the person to his/her network along with some related information.

Fig. 5. The system database

SQL is used to create the database of the system, shown in Fig. 5; a hypothetical sketch of such a schema follows.
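Fig. 5 itself is not reproduced here, so the schema below is only a hypothetical minimal version consistent with the entities the paper describes (patient accounts and their saved connections); sqlite3 is used for a self-contained illustration, whereas the deployed system uses MySQL:

```python
# Hypothetical minimal schema for illustration; the real schema is in Fig. 5.
import sqlite3

conn = sqlite3.connect("facereminder.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS users (
    id       INTEGER PRIMARY KEY,
    username TEXT UNIQUE NOT NULL,
    password TEXT NOT NULL          -- stored hashed in any real deployment
);
CREATE TABLE IF NOT EXISTS connections (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    name    TEXT NOT NULL,
    info    TEXT,                   -- free-text notes about the person
    photo   BLOB                    -- reference image used for recognition
);
""")
conn.commit()
conn.close()
```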
• Deployment

To deploy the models, an AWS EC2 instance is used, owing to its scalable computing power within the AWS cloud environment. By leveraging EC2, the models can be efficiently deployed and managed. The database chosen for this deployment is MySQL, which stores and manages the relevant data.

To address the challenge of transferring large deep learning files to a remote repository, we used Git LFS (Large File Storage). This let us manage and store these sizable files effectively, overcoming the difficulties posed by their size.

To overcome the challenge of loading large packages and libraries such as deepface and TensorFlow, we used Docker, creating a Dockerfile and a Docker Compose configuration. These lightweight executable containers package all the components needed to run the application, enabling seamless deployment and execution while keeping every essential dependency in a self-contained environment.
• User Interface

The primary interface of the application offers users three main methods to recognize faces: capturing photos with the device's camera, uploading images from local storage, or using an external camera connected via WiFi. The app also includes an authentication component to protect user privacy and connections. After carefully comparing different mobile application platforms, we decided to use React Native due to its extensive collection of open-source libraries and its compatibility with Django [9].

To manage user data and ensure seamless navigation across screens, we employ Redux Toolkit [10] to create user and token slices that can be accessed throughout the application. A separate slice is created for user connections, allowing easy retrieval of recognized-person data. Axios is used to fetch data from the backend server, making use of async/await functions for efficient handling of asynchronous operations.

Fig. 6. Application flow diagram

For accessing the device's camera, we use react-native-vision-camera, which offers a wide range of features such as device filtering, format filtering, night mode, and flash control for capturing photos [11]. Additionally, the application supports an external camera, specifically the ESP32-CAM OV2640 Camera V2.0, which allows users to capture images by pressing a button and activating the flash. The captured image is then sent to the backend server via an HTTP request, as sketched below. To access the user's local storage, the react-native-image-picker library is used, with the necessary permissions, to open and select images from the device's storage.
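The hand-off to the backend can be pictured as the multipart HTTP POST below; the endpoint path, field name, and auth header are hypothetical, since the paper only states that the image travels over an HTTP request:

```python
# Hedged sketch of sending a captured frame to the backend server.
import requests

with open("capture.jpg", "rb") as f:
    response = requests.post(
        "http://backend.example.com/api/recognize",        # hypothetical endpoint
        files={"image": ("capture.jpg", f, "image/jpeg")}, # multipart upload
        headers={"Authorization": "Bearer <token>"},       # hypothetical auth header
    )
print(response.json())  # person info, or a "not found" reply
```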
Below are examples of samples obtained through the methods above during the development of the FaceReminder mobile app.

Fig. 7. Samples of output from the FaceReminder mobile app

In cases where the individual is not found in the database, the user is promptly notified and prompted to provide additional information about that person.

Fig. 8. Sample screen from the app

IV. RESULTS

At first, the processing time was 7 minutes, which is unacceptable in our case. We noticed that resizing the images and changing their depth significantly reduced the processing time: after preprocessing, it dropped to 3 minutes. The images are all resized to 512x512 and converted to grayscale.

Fig. 9. Runtime in seconds of models vs. detectors without preprocessing

The efficiencies of the models were then validated after preprocessing, as follows:

Fig. 10. Efficiencies of models after preprocessing

Afterward, we examined the data we were using and observed that the inclusion of the celebrities dataset significantly hindered accuracy. Consequently, we decided to use only the data gathered through the Google Form. The efficiencies were then recalculated:

Fig. 11. Efficiencies of models on data from the Google Form only

After exploring the efficiencies, we noticed that the model processes the entire dataset each time. We modified the source code of the FaceNet model to decrease the computational time by updating the stored representation instead of deleting it and creating a new one each time a person is added; a sketch of this change follows.
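The paper does not show the patch itself; the sketch below illustrates the idea under the assumption that cached embeddings live in a pickle file of (identity, embedding) records, as in older deepface releases ("representations_facenet.pkl"); names and layout may differ in the authors' modified copy:

```python
# Hedged sketch: append the new person's embedding to the cached
# representations instead of rebuilding the whole cache.
import os
import pickle

from deepface import DeepFace


def add_person(db_path: str, image_path: str, identity: str) -> None:
    rep = DeepFace.represent(img_path=image_path, model_name="Facenet",
                             detector_backend="retinaface")
    embedding = rep[0]["embedding"]

    cache = os.path.join(db_path, "representations_facenet.pkl")  # assumed name
    records = []
    if os.path.exists(cache):
        with open(cache, "rb") as f:
            records = pickle.load(f)

    records.append([identity, embedding])  # update in place, don't regenerate
    with open(cache, "wb") as f:
        pickle.dump(records, f)
```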
• Sample results of the accepted and rejected data:

Fig. 12. Sample of accepted data

Fig. 13. Sample of rejected data

V. CONCLUSION

In this paper, we have introduced the 'FaceReminder' mobile application, which aims at assisting patients with prosopagnosia. The application provides information about the user's connections, entered previously, to be retrieved later when the camera detects a face or when specifically requested by the patient. The engine of the FaceReminder application consists of face detection, analysis, and face recognition, employing the FaceNet face recognition model. Picture acquisition is done in three ways: the user selects a picture from the mobile gallery, the user takes a picture through the FaceReminder application, or the user connects the application to an external camera via Bluetooth, where a picture is sent to the application each time a face is detected.

VI. FUTURE WORK

Our aim is to introduce innovative features into the application, one of which is voice control tailored to the needs of visually impaired individuals. The face recognition task will be activated automatically when an individual is positioned directly in front of the patient. We are also working towards real-time recognition capabilities; this advancement will allow instantaneous identification of individuals as they are encountered, providing users with timely and accurate information. We further plan to incorporate a feature that groups individuals by specific events, times, and locations, giving users a more seamless experience in recalling and remembering people in various contexts. Additionally, it can serve as a valuable tool for event management and keeping track of class attendance.

REFERENCES

[1] "Prosopagnosia," National Institute of Neurological Disorders and Stroke (nih.gov).
[2] L. Ceci, "Topic: Mobile app usage," Statista. [Online]. Available: https://www.statista.com/topics/1002/mobile-app-usage/. [Released: 12-Nov-2022].
[3] OrCam MyEye 2.0 - For People Who Are Blind or Visually Impaired.
[4] "Social Recall Personal," AppAdvice. [Online]. Available: https://appadvice.com/app/social-recall-personal/1437463847. [Released: 25-Oct-2018].
[5] J. Deng, J. Guo, E. Ververas, I. Kotsia and S. Zafeiriou, "RetinaFace: Single-Shot Multi-Level Face Localisation in the Wild," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 5202-5211, doi: 10.1109/CVPR42600.2020.00525.
[6] F. Schroff, D. Kalenichenko and J. Philbin, "FaceNet: A unified embedding for face recognition and clustering," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 2015, pp. 815-823, doi: 10.1109/CVPR.2015.7298682.
[7] S. I. Serengil and A. Ozpinar, "LightFace: A Hybrid Deep Face Recognition Framework," 2020 Innovations in Intelligent Systems and Applications Conference (ASYU), Istanbul, Turkey, 2020, pp. 1-5, doi: 10.1109/ASYU50717.2020.9259802.
[8] V. Patel, "Face Recognition Dataset," Kaggle. [Online]. Available: https://www.kaggle.com/datasets/vasukipatel/face-recognition-dataset?select=Original+Images. [Updated: Jul-2020].
[9] React Native: Learn once, write anywhere.
[10] Redux Toolkit (redux-toolkit.js.org).
[11] "Lifecycle," VisionCamera (mrousavy.com).
