FACIAL EMOTION ANALYSIS
MINI PROJECT – I
Submitted by
V.MAHALAKSHMI (421122106030)
V.SWETHA (421122106054)
in
ARTIFICIAL INTELLIGENCE AND DATA SCIENCE
IFET COLLEGE OF ENGINEERING
(An Autonomous Institution)
BONAFIDE CERTIFICATE
SIGNATURE SIGNATURE
Mrs. P.MANJUBALA, M.Tech. Mr. A.RANJEETH, M.E.
HEAD OF THE DEPARTMENT SUPERVISOR
Associate Professor, Associate Professor,
Department of AI & DS, Department of AI & DS,
IFET College of Engineering, IFET College of Engineering,
Villupuram – 605108 Villupuram – 605108
This project report was submitted for the viva voce held on ……………
ACKNOWLEDGEMENT
We thank the Almighty for the blessings that have been showered upon us to bring
forth the success of this project. We would like to express our sincere gratitude to
our Chairman Mr.K.V.Raja, Secretary Mr.K.Shivram Alva and our Treasurer
Mr.R.Vimal for providing us with an excellent infrastructure and the necessary
resources to carry out this project, and we extend our gratitude to our Principal
Dr.G.Mahendran for his constant support of our work. We also take this
opportunity to express our sincere thanks to our Vice-Principal and Dean of
Academics Dr.S.Matilda, who provided all the needful help in executing the
project successfully.
We wish to express our thanks to our Head of the Department, Mrs.P.Manjubala,
for her persistent encouragement and support in completing this project.
ABSTRACT
TABLE OF CONTENTS
CHAPTER NO TITLE
ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS
1 INTRODUCTION
1.1 General
1.2 Domain overview
2 LITERATURE SURVEY
3 EXISTING SYSTEM
3.1 Existing system
3.2 Disadvantages of existing system
4 PROPOSED SYSTEM
4.1 Proposed system
4.2 Advantages of proposed system
5 MODULE DESCRIPTION
6 SYSTEM REQUIREMENTS
6.1 Software requirements
6.2 Hardware requirements
7 SYSTEM DESIGN
7.1 System architecture
7.2 Packages
8 CONCLUSION AND FUTURE WORK
APPENDIX – I
APPENDIX – II
REFERENCE
LIST OF FIGURES
FIGURE NO TITLE
4.2 Aspect ratio
9.2 Output
LIST OF ABBREVIATIONS
CHAPTER 1
INTRODUCTION
1.1.GENERAL
analyzed emotions.
Real-time Display:
Presents the processed video stream in real-time, enabling users to observe the live
analysis of facial expressions and the associated predicted emotions.
Continuous Analysis Loop:
Engages in an ongoing loop, analyzing subsequent frames until user intervention,
potentially through an interactive command to pause or terminate the analysis.
1.2.DOMAIN OVERVIEW:
CHAPTER 2
LITERATURE SURVEY
2) [2021] L. Sun, L. He, and R. Wang, "An Ensemble Deep Learning
Approach for Emotion Recognition Using Multiple Features," IEEE Access,
vol. 9, pp. 83401-83415, Jun. 2021. (Deep learning with multiple features for
emotion recognition)
The advancements in neural networks and the on-demand need for accurate and near
real-time Speech Emotion Recognition (SER) in human–computer interactions make
it mandatory to compare available methods and databases in SER to achieve feasible
solutions and a firmer understanding of this open-ended problem. The current study
reviews deep learning approaches for SER with available datasets, followed by
conventional machine learning techniques for speech emotion recognition. Ultimately,
we present a multi-aspect comparison between practical neural network approaches in
speech emotion recognition. The goal of this study is to provide a survey of the field
of discrete speech emotion recognition.
the generalization ability of the whole system. The results of simulation experiments
on Berlin, eNTERFACE, and CASIA speech corpora show that the proposed
algorithm can achieve excellent recognition results, and it is competitive with most of
the state-of-the-art algorithms.
approach outperformed other comparative methods, achieving Has0-F1 scores of
76.43%, 80.15%, and 81.93%, respectively. Our approach exhibited better
performance compared to multiple baselines.
CHAPTER 3
EXISTING SYSTEM
Robustness concerns:
Inconsistent performance in varying environmental conditions, lighting, or facial
occlusions affects real-time adaptability and robustness.
Limited Emotion Understanding:
Primarily focuses on recognising basic emotions, potentially overlooking subtle or
complex emotional states. May exhibit biases due to inadequate representation in
training data, leading to inaccuracies and limited inclusivity across diverse
demographics and cultures.
Interpretability Issues:
Lack interpretability, making it challenging to explain emotion predictions, potentially
impacting user trust and acceptance.
CHAPTER 4
PROPOSED SYSTEM
4.1. Proposed System:
The proposed system for facial emotion recognition aims to address several
limitations of existing systems by leveraging advancements in deep learning
techniques, multimodal data integration, and improved model robustness. In
comparison to existing systems, the proposed system emphasizes several key
advantages.
The proposed system integrates multimodal data sources, combining facial
expressions with contextual cues such as voice tone or body language, enhancing
the depth and accuracy of emotion recognition. By fusing multiple modalities, it
aims to capture a more comprehensive understanding of emotions, surpassing the
limitations of relying solely on facial features. Additionally, the proposed system
focuses on mitigating biases by curating diverse and representative datasets,
ensuring inclusivity across demographics, cultures, and expressions.
Utilizing state-of-the-art deep learning architectures, the proposed system
emphasizes model interpretability and explainability, addressing concerns related
to the 'black-box' nature of deep neural networks in existing systems. It
incorporates mechanisms for interpreting model predictions, offering insights into
why specific emotions are identified, thereby enhancing user trust and system
transparency.
The facial emotion analysis system is developed and implemented through the
following steps:
STEP 1: Load the Model
• Load the pre-trained deep learning model for facial emotion recognition. This
model has been previously trained on annotated datasets to recognize emotions
from facial expressions.
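A minimal sketch of this step with Keras, using the model file name from Appendix I:

from keras.models import load_model

# Load the pre-trained facial emotion recognition model from disk
model = load_model("best_model.h5")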
STEP 2: Initialize Video Capture
• Utilize OpenCV to access the webcam (or video feed) for capturing live frames.
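This step reduces to a single OpenCV call; a minimal sketch:

import cv2

# Device index 0 selects the default webcam; a file path can be
# passed instead to analyze a recorded video feed
cap = cv2.VideoCapture(0)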
STEP 3: Facial Detection
• Use a facial detection algorithm (like Haar Cascade Classifier) to detect faces
within each captured frame.
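A minimal sketch of this step, continuing from STEP 2 and using the Haar cascade file from Appendix I; the scaleFactor and minNeighbors values are illustrative defaults, not tuned settings:

ret, frame = cap.read()  # one frame captured as in STEP 2
face_haar_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # Haar cascades operate on grayscale images
# Returns one (x, y, w, h) bounding box per detected face
faces = face_haar_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)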
STEP 4: Data Preprocessing
• Convert the detected facial regions to grayscale for standardization.
• Resize the regions to fit the model's input dimensions (e.g., 224x224 pixels).
• Normalize pixel values and prepare the cropped facial image for model input.
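A minimal sketch of the preprocessing for a single detected face; the 224x224 size follows the example above, and the single grayscale channel is an assumption that must match the trained model's input shape:

import numpy as np

x, y, w, h = faces[0]                       # bounding box from STEP 3
roi = gray[y:y + h, x:x + w]                # crop the facial region (already grayscale)
roi = cv2.resize(roi, (224, 224))           # resize to the model's input dimensions
img_pixels = roi.astype('float32') / 255.0  # normalize pixel values to [0, 1]
img_pixels = np.expand_dims(img_pixels, axis=0)   # add the batch dimension
img_pixels = np.expand_dims(img_pixels, axis=-1)  # add the channel dimension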
STEP 5: Predict the Emotion
• Feed the preprocessed facial image into the loaded deep learning model.
• Utilize the model to predict the prevailing emotion from the facial expression
within the cropped region.
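A minimal sketch of the prediction; the emotion label list here is an assumption and must match the classes the model was trained on:

predictions = model.predict(img_pixels)     # one probability per emotion class
max_index = int(np.argmax(predictions[0]))  # index of the most likely class
emotions = ('angry', 'disgust', 'fear', 'happy', 'sad', 'surprise', 'neutral')  # assumed labels
predicted_emotion = emotions[max_index]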
STEP 6: Visualization
• Overlay the predicted emotion label onto the video frame using OpenCV's
text-rendering capabilities.
• Display the processed video frame in real time, showcasing detected faces with
their corresponding predicted emotions.
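A minimal sketch of the overlay and display, reusing the bounding box and label from the previous steps:

cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)  # box around the face
cv2.putText(frame, predicted_emotion, (x, y - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 0, 255), 2)    # label above the box
cv2.imshow('Facial Emotion Analysis', frame)                  # real-time display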
STEP 7: Looping Process
• Continuously capture frames from the webcam or video feed.
• Implement a user interaction mechanism (e.g., key press) to enable termination
of the program.
• Release the webcam/video feed and close any open windows upon program
termination.
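A minimal sketch of the overall loop and clean shutdown; the bodies of STEPS 3-6 above are assumed to run inside the loop:

while True:
    ret, frame = cap.read()  # STEP 2: grab the next frame
    if not ret:
        continue
    # STEPS 3-6: detect faces, preprocess, predict, and overlay labels here
    cv2.imshow('Facial Emotion Analysis', frame)
    if cv2.waitKey(10) & 0xFF == ord('q'):  # press 'q' to terminate
        break
cap.release()            # free the webcam
cv2.destroyAllWindows()  # close any open windows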
4.2. Advantages of Proposed System:
Multimodal Integration:
Integrates multiple modalities (facial expressions, voice tone, body
language), providing a more comprehensive understanding of emotions beyond
facial features alone.
Bias Mitigation and Inclusivity:
Aims to mitigate biases by curating diverse datasets, fostering equitable and
accurate emotion recognition across various demographics and cultures.
Interpretability and Transparency:
Prioritizes interpretability, enabling users to understand emotion
predictions, enhancing trust and system transparency.
Robustness and Real-time Adaptability:
Focuses on optimizing models for consistent performance across
varying conditions, enhancing real-time adaptability and robustness.
Comprehensive Emotion Understanding:
Strives for a more comprehensive understanding of emotions by
integrating multiple modalities, aiming to interpret subtle nuances and complex
emotional states.
IMAGE BLOCK:
ASPECT RATIO OF FACE:
The aspect ratio of a face plays a crucial role in accurately identifying and
interpreting facial expressions. The aspect ratio refers to the proportional
relationship between the width and height of the face detected within an image or
video frame. Maintaining a consistent aspect ratio helps algorithms better
recognize facial landmarks and features essential for emotion analysis.
Distortions in the aspect ratio might lead to misinterpretations or inaccuracies in
detecting emotions, as changes in proportions can affect the perception of
expressions like joy, sadness, anger, or surprise.
Fig. 4.2: Aspect ratio
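A small illustrative check of a detected face's aspect ratio, assuming the (x, y, w, h) bounding box returned by the Haar detector; the acceptable range shown is an assumption, not a standard threshold:

aspect_ratio = w / float(h)  # width-to-height ratio of the detected face box
# Frontal faces are roughly as wide as they are tall; a strongly skewed
# ratio may indicate a distorted crop that can degrade emotion predictions
if not 0.75 <= aspect_ratio <= 1.33:  # illustrative bounds
    print("Warning: unusual face aspect ratio:", round(aspect_ratio, 2))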
RECOGNISING PROCESS:
ACCURACY:
The program output in real-time facial emotion analysis largely depends on the
quality of the pre-trained model, dataset diversity, and environmental factors.
Achieving approximately 60-65% accuracy is common for facial emotion
recognition in real-time scenarios, with variations based on lighting conditions,
facial expressions, and model robustness. Continuous refinement and model
optimization can lead to gradual accuracy improvements over time.
CHAPTER 5
MODULE DESCRIPTION
understanding by visually displaying the identified emotions on the respective detected
faces.
Control Flow Module:
Manages the flow of the program, encompassing a continuous loop that captures
webcam frames, analyzes the emotions exhibited on detected faces using the
aforementioned modules, and displays the interpreted emotions in real-time. This loop
continues until the user interrupts it by pressing a designated key ('q' in this case).
CHAPTER 6
SYSTEM REQUIREMENTS
6.1.Software Requirements:
6.1.1.Package Description:
OpenCV (cv2):
Used for image processing, video capture, and face detection.
Keras with TensorFlow backend (keras, tensorflow):
Keras provides a high-level neural networks API, while TensorFlow serves as the
backend for deep learning tasks.
Matplotlib (matplotlib):
Utilized for displaying images and plots.
Numpy (numpy):
Essential for numerical computations and array manipulation.
6.2.HARDWARE REQUIREMENTS
Processor (CPU):
A multi-core processor (modern CPUs like Intel i5 or above, or
AMD equivalent) would be suitable.
Memory (RAM):
At least 4GB of RAM is recommended for smooth performance.
Camera:
A webcam connected to the system to capture video frames.
Graphics Processing Unit (GPU) - Optional:
If utilizing deep learning models, having a dedicated GPU
(NVIDIA CUDA-enabled GPU) can significantly speed up
computations.
CHAPTER 7
SYSTEM DESIGN
PACKAGES DESCRIPTION:
OpenCV (cv2):
OpenCV (Open Source Computer Vision Library) is a powerful library used for
real-time computer vision. It provides a wide range of functionalities for image and
video processing, including reading and writing images, video capture, object
detection, face detection, and various image manipulation techniques.
Key Features:
➢ Image and video manipulation.
➢ Face detection and recognition.
➢ Object tracking.
➢ Feature detection and extraction.
➢ Camera calibration and 3D reconstruction.
Usage in Facial Emotion Analysis:
In the provided code, OpenCV (cv2) is primarily used for capturing video frames
from the webcam, converting color spaces, and detecting faces within the captured
frames.
Keras with TensorFlow Backend (keras, tensorflow):
Keras is a high-level neural networks API written in Python, which acts as an
interface for TensorFlow, Theano, and Microsoft Cognitive Toolkit (CNTK). In the
provided code, TensorFlow serves as the backend for Keras.
Key Features:
➢ Simplified API for building and training neural networks.
➢ Supports both convolutional and recurrent neural networks.
➢ Easily allows for model training, evaluation, and inference.
➢ Flexibility to run on CPUs and GPUs.
Usage in Facial Emotion Analysis:
Keras is employed for loading and utilizing a pre-trained deep learning model for
facial emotion recognition. It assists in loading the model, preprocessing images,
making predictions, and extracting emotions from detected faces.
Matplotlib (matplotlib):
Matplotlib is a comprehensive library for creating static, animated, and interactive
visualizations in Python. It provides functionalities for generating plots, histograms,
bar charts, scatterplots, and more.
Key Features:
➢ Creation of publication-quality plots.
➢ Support for various plot types and customizations.
➢ Integration with Jupyter Notebook for interactive visualizations.
Usage in Facial Emotion Analysis:
Matplotlib can be used in the code to display captured frames and inspect the
analysis results, for example by plotting individual video frames with the
detected faces and predicted emotions drawn on them.
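A minimal sketch of inspecting one annotated frame with Matplotlib, assuming the frame and predicted_emotion variables from the earlier sketches; OpenCV stores images in BGR order, so a conversion to RGB is needed first:

import cv2
import matplotlib.pyplot as plt

rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # Matplotlib expects RGB, OpenCV yields BGR
plt.imshow(rgb)               # show the annotated frame
plt.title(predicted_emotion)  # predicted emotion as the title
plt.axis('off')               # hide axis ticks for a cleaner view
plt.show()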
Numpy (numpy):
Numpy is a fundamental package for scientific computing in Python. It provides
support for large, multi-dimensional arrays and matrices, along with a wide range of
mathematical functions to operate on these arrays.
Key Features:
➢ Efficient handling of large arrays and matrices.
➢ Mathematical operations like linear algebra, Fourier analysis, random number
generation, etc.
➢ Integration capabilities with C/C++ and Fortran code.
Usage in Facial Emotion Analysis:
Numpy is utilized for array manipulation, resizing images, and normalization of
pixel values within the facial image region.
CHAPTER 8
CONCLUSION AND FUTURE WORK
✴ Hybrid Models: Explore hybrid models integrating CNNs and attention
mechanisms for better emotion feature extraction.
APPENDIX 1
SOURCE CODE:
import cv2
import numpy as np
import warnings
warnings.filterwarnings("ignore")
from keras.models import load_model

# Load the pre-trained facial emotion recognition model
model = load_model("best_model.h5")
face_haar_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
# NOTE: the label list and the 224x224 grayscale input layout are assumptions
# and must match the classes and input shape the model was trained with
emotions = ('angry', 'disgust', 'fear', 'happy', 'sad', 'surprise', 'neutral')

cap = cv2.VideoCapture(0)
while True:
    ret, test_img = cap.read()  # captures frame and returns boolean value and captured image
    if not ret:
        continue
    gray_img = cv2.cvtColor(test_img, cv2.COLOR_BGR2GRAY)  # Haar cascade needs grayscale
    faces = face_haar_cascade.detectMultiScale(gray_img, 1.3, 5)
    for (x, y, w, h) in faces:
        roi = cv2.resize(gray_img[y:y + h, x:x + w], (224, 224))  # crop and resize the face
        img_pixels = roi.astype('float32') / 255.0                # normalize pixel values
        img_pixels = np.expand_dims(img_pixels, axis=0)           # batch dimension
        img_pixels = np.expand_dims(img_pixels, axis=-1)          # channel dimension
        predictions = model.predict(img_pixels)
        max_index = np.argmax(predictions[0])
        predicted_emotion = emotions[max_index]
        cv2.rectangle(test_img, (x, y), (x + w, y + h), (255, 0, 0), 2)
        cv2.putText(test_img, predicted_emotion, (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 0, 255), 2)
    cv2.imshow('Facial Emotion Analysis', test_img)
    if cv2.waitKey(10) & 0xFF == ord('q'):  # press 'q' to quit
        break
cap.release()
cv2.destroyAllWindows()
APPENDIX 2
STEP 1: Open a familiar programming environment such as the command prompt
or Jupyter Notebook and navigate to the path of the program.
STEP 2: When the program is run, the webcam opens using OpenCV and the
emotion-detection output is displayed on the live video feed.
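For example, assuming the Appendix I code is saved as emotion_analysis.py (the file name is illustrative), the program can be started from the command prompt with:

python emotion_analysis.py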
Fig. 9.2: Output
REFERENCE
1) [2020] J. Li, Y. Zhu, and F. Meng, "A Survey on Convolutional Neural
Network Based Facial Expression Recognition," IEEE Access, vol. 8, pp. 19215-
19237, May 2020. (Survey on deep learning for facial expression recognition)
7) [2022] S. Li et al., "Physiological Signal Based Emotion Recognition
Using Deep Feature Representation and Multiple Classifiers," IEEE Transactions
on Affective Computing, vol. 13, no. 4, pp. 841-853, Dec. 2022. (Physiological
signal-based emotion recognition with deep features and multiple classifiers)