MCA - IA-1 Report-3-1
Submitted by
Shobitha A S(R22DE138)
Sriya S(R22DE153)
February 2024
TABLE OF CONTENTS
1 Abstract
2 Introduction
2.1 Motivation
3 Software and Hardware Specification
4 Architecture Diagram
5 Literature Review
6 Conclusion
1. ABSTRACT
Selecting songs to listen to from the extensive selection available on the internet can be a
daunting task. While music genre plays a large role in building and displaying social identity, the
emotion a song expresses, and even more importantly its emotional impression on the listener, is
often underestimated in the domain of music preferences. Only a few decades ago, choosing music
by genre and/or artist was effectively the only option. With the advent of personalized playlists and
recommendations on digital music platforms, this has drastically changed. People tend to listen to
music based on their mood and interests, and it is widely known that humans use facial
expressions to express themselves. Many people find that, at a certain point, their song library
grows so large that they are unable to decide what to play. A recommendation system that detects
the user's mood and suggests matching songs could therefore greatly reduce the time spent
searching for songs.
2. INTRODUCTION
Music plays an important role in our daily life and has long been known to alter a person's mood.
Capturing and recognizing the emotion shown by a person, and recommending suitable songs
matching that mood, can help calm the user's mind. People often have a tough time creating
playlists manually when they have many songs, and it is difficult to keep track of them all.
Sometimes songs that are added are never heard, wasting space on the device and forcing the user
to find and delete them manually. This project therefore aims to capture the user's emotion and
give them custom curated playlists.
2.1 MOTIVATION
It is crucial to remember that, although this method can shed light on how people feel in response
to music, the precision of emotion detection may vary based on a number of factors, including the
quality of the training data, the robustness of the algorithms, and individual variation in facial
expressions. Deploying such systems also requires careful consideration of ethical issues
pertaining to user privacy and consent.
The objective is to develop a cross-platform music player that recommends music based on the
real-time mood of the user, captured through a web camera, using machine learning algorithms.
3. SOFTWARE AND HARDWARE SPECIFICATION
Hardware Requirements
The following hardware interfaces are used:
1. Processor - Intel Pentium 4 or equivalent
2. RAM - 4 GB or higher
3. HDD - 100 GB or higher
4. Architecture - 32-bit or 64-bit
5. Monitor - 15" or 17" color monitor
6. Web camera

Software Requirements
1. Operating System - Windows 10 or 11
2. Programming Language - Python 3.10.9
3. Microsoft C++ 14.0 Build Tools
4. MediaPipe
5. TensorFlow
6. Streamlit WebRTC
7. Streamlit 1.2.0
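As a convenience, the Python packages above can be installed with pip. The command below is only a sketch; the Streamlit pin follows the version listed above, and the other packages are left unpinned, so versions may need adjusting for the target machine:

pip install mediapipe tensorflow streamlit-webrtc streamlit==1.2.0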
4. ARCHITECTURE DIAGRAM
5. LITERATURE REVIEW
Proposed: A study of the changes in the curvatures of the face and the intensities of the
corresponding pixels. The author used Artificial Neural Networks to classify the emotions and
explored various approaches for generating a playlist.
Proposed: Interactions between the user and the music player, in which the system learned the
preferences, emotions and activities of a user and returned songs as a result. The device recorded
the user's various facial expressions to determine their emotion and predict the genre of
the music.
Proposed: An emotion-based music player using image processing. This showed how the
algorithms and techniques suggested by different authors in their research could be used to
connect a music player with human emotions. It reduced the user's effort in creating and
managing playlists and provided an excellent experience to music listeners by bringing them the
most suitable song according to their expression.
Author: A. Habibzad
Proposed: A new algorithm to recognize facial emotion in three stages: pre-processing, feature
extraction and classification. The first stage covers the image-processing steps, including
preprocessing and filtering, used to extract various facial features. The second stage optimizes the
eye and lip ellipse characteristics, and in the third stage the optimal eye and lip parameters are
used to classify the emotions.
Proposed: An "Emotional Detection and Music Recommendation System based on User Facial
Expression", in which the system detects the facial expressions of the user, extracts the facial
landmarks, and classifies them to obtain the user's particular emotion. Once the emotion has been
classified, songs matching it are shown to the user. This can assist a user in deciding which song to
listen to, help reduce stress levels, and save the time otherwise spent searching for songs.
USER INTERFACE IMPLEMENTATION
The user interface is built with the Streamlit framework. As soon as the page loads, a small
window pops up to capture an image of the user. The captured image goes through a Haar cascade
classifier to detect whether a face is present. If no face is present, an error is displayed in the UI;
otherwise the image is sent to the trained model to detect the emotion. Once detected, the emotion
is displayed on the screen, and the Spotify module simultaneously searches the Spotify catalogue
for songs matching the user's mood, which are then displayed on the screen. The tracks are
embedded so that the user can listen to a song in the web app itself or navigate to the Spotify
application by clicking on the particular track.
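The flow above can be sketched as follows. This is a minimal illustration, not the exact project code: it assumes OpenCV (opencv-python) for the Haar cascade, a trained Keras model saved under a hypothetical name emotion_model.h5, the spotipy client for the Spotify search, and an illustrative label order in EMOTIONS. It also uses st.camera_input for the capture window, which requires a newer Streamlit release than the 1.2.0 pinned earlier; streamlit-webrtc, as listed in the requirements, is the live-stream alternative.

# Minimal sketch of the capture -> detect -> classify -> recommend flow.
# Assumed, not from the report: the model file name, the EMOTIONS label
# order, the Spotify credentials, and st.camera_input for the capture UI.
import cv2
import numpy as np
import streamlit as st
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
from tensorflow.keras.models import load_model

EMOTIONS = ["angry", "happy", "neutral", "sad"]  # illustrative label order

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
model = load_model("emotion_model.h5")  # hypothetical trained model file
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(
    client_id="...", client_secret="..."))  # real credentials required

img_file = st.camera_input("Capture your mood")  # small capture window
if img_file is not None:
    frame = cv2.imdecode(np.frombuffer(img_file.getvalue(), np.uint8),
                         cv2.IMREAD_COLOR)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3,
                                          minNeighbors=5)
    if len(faces) == 0:
        st.error("No face detected, please try again.")  # error path in the UI
    else:
        x, y, w, h = faces[0]
        roi = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
        probs = model.predict(roi.reshape(1, 48, 48, 1))
        emotion = EMOTIONS[int(np.argmax(probs))]
        st.write("Detected emotion:", emotion)
        # Search Spotify for tracks matching the detected mood and list
        # them with links that open in the Spotify application.
        results = sp.search(q=emotion + " mood", type="track", limit=5)
        for track in results["tracks"]["items"]:
            st.write(track["name"], "-", track["external_urls"]["spotify"])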
With a 4-layer CNN model, the training accuracy reached 97% and the validation accuracy
reached 82%.
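The report does not give the exact layer configuration, so the following is one plausible layout with four convolutional layers, assuming 48x48 grayscale face crops and seven emotion classes:

# A plausible 4-convolutional-layer CNN for 48x48 grayscale faces; the
# exact architecture is not given in the report, so sizes are assumptions.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(48, 48, 1)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(7, activation="softmax"),  # 7 emotion classes assumed
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])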
Performance Measure
Number of instances: 35,888
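Given image tensors X of shape (35888, 48, 48, 1) and one-hot labels y (names assumed for illustration), the reported figures correspond to the training and validation accuracies tracked by model.fit; a sketch:

# Sketch: hold out 20% of the 35,888 instances for validation (split
# ratio assumed) and read the accuracies tracked during training.
history = model.fit(X, y, epochs=50, batch_size=64, validation_split=0.2)
print("train accuracy:", history.history["accuracy"][-1])
print("validation accuracy:", history.history["val_accuracy"][-1])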
6. CONCLUSION
We integrate computer vision and machine learning techniques to connect facial emotion with
music recommendation. The approach uses Deep Neural Networks (DNNs) to learn the most
appropriate feature abstractions; DNNs have recently been successful in visual object recognition,
human pose estimation, facial verification and many other tasks, and Convolutional Neural
Networks (CNNs) have proven very effective in areas such as image recognition and
classification. The proposed system detects the user's facial expressions using a CNN model, and
once the emotion has been classified, songs matching the user's emotion are played. In this
project, a main web page is designed where an image or video of the user is recorded. The
image/video is then sent to the server to predict the user's emotion. Once the emotion is detected,
the next phase is to play songs, at which point the client side requests tracks from Spotify via an
API call.
Signature of the guide with date