Contriver Project (Manu C)
SUBMITTED BY
MANU C
(RBSD2017)
M/S CONTRIVER®
Building 609, Sinchana Clinic,
Panchamantra Road, Kuvempu Nagara,
Mysore - 570023,
Karnataka, India.
2023 - 2024
CONTRIVER®
#609, Panchamantra Road, Kuvempu Nagar, Mysore - 570023.
Department of Programming and Development
TRAINING CERTIFICATE
This is to certify that SRI. MANU C (RBSD2017), a bonafide student of Sarada Vilas
Institution, has completed the training in partial fulfillment for the award of the "Training
Certificate" in the Department of Programming and Development of CONTRIVER, Mysore,
during the year 2023-2024. It is certified that he has undergone internship during the
period from 16/08/2023 to 30/09/2023 on all working days. Corrections/suggestions
indicated for internal validation have been incorporated in the report deposited with the
guide and trainer. The training report has been approved as it satisfies the organizational
requirements in respect of the internship training prescribed for the said qualification.
I would like to express my sincere gratitude to everyone who contributed to making my
project internship a fulfilling and educational experience.
First and foremost, I would like to thank my internship supervisor sincerely for all of the
excellent advice, mentoring, and steadfast support provided throughout this assignment.
I have learned a great deal about "Face Melody Using Intelligent System" thanks to your
knowledge, patience, and support.
I am also grateful that the entire team gave me the chance to be a part of this exciting and
creative atmosphere. Being exposed to real-world problems and working in a team
environment has been truly enlightening.
I would also like to thank my coworkers and fellow interns for their support, friendship, and
the opportunities to grow together. Your ideas and encouragement have been invaluable.
Date: 30/09/2023
Place: Mysore
MANU C
RESUME
B.Sc(honours)DS & AI
CONTACT INFORMATION
ADDRESS:
S/O Cheluvanayaka,
Agathuru village
Sagare post, Kendalike
Hobli , H D Kote Taluk
Mysore-571121
EMAIL ID:
manuachu0611@gmail.com
CONTACT NO: 8951042342
OBJECTIVE
To secure a challenging position in a reputable organization that expands my learning, knowledge, and
skills, and to pursue a responsible career opportunity where I can fully utilize my skills while making a
significant contribution to the success of the company.
ACADEMIC INFORMATION
EDUCATION QUALIFICATIONS:
PROJECT DETAILS
MINI PROJECT: Face Melody Using Intelligent System
Abstract: In the realm of music composition and production, the creation and manipulation of melodies hold a
pivotal role. Melodies, being the core essence of musical expression, often require innovative approaches to
harness their full potential. This abstract delves into the application of intelligent systems to transform and
enhance melodies.
PERSONAL STRENGTH
Negotiation, adaptability, responsibility, and self-confidence.
PERSONAL PROFILE
Name : Manu C
DOB : 10-01-2003
Nationality : Indian
DECLARATION
I hereby declare that all the information given above is correct and true to the best of my knowledge and belief.
(Manu C)
TAKEAWAY TOPICS FROM TRAINING
HTML:
HTML, which stands for Hyper Text Markup Language, is the standard markup language used to create web
pages and structure their content on the World Wide Web. It forms the backbone of most web pages and is
an essential technology for anyone involved in web development or web design.
Here's a brief overview of HTML:
Markup Language: HTML is a markup language, not a programming language. It consists of a set
of elements or tags that you use to structure content on a web page.
Elements: HTML documents are composed of HTML elements, which are enclosed in angle
brackets (< >). An element typically consists of a start tag, content, and an end tag. For example, `<p>`
is a paragraph element, and `<a>` is an anchor (link) element.
Attributes: Elements can have attributes that provide additional information about the element.
Attributes are typically specified in the start tag and help define an element's behavior or appearance.
For instance, the `<img>` element has an `src` attribute to specify the image source.
Nesting: HTML elements can be nested within one another, creating a hierarchical structure. This
nesting determines the order and relationship of elements on a web page.
Semantic HTML: HTML5 introduced semantic elements like `<header>`, `<nav>`, `<article>`,
`<section>`, and `<footer>`. These elements provide more meaningful information about the
structure of a web page and improve accessibility and search engine optimization.
Hyperlinks: HTML is crucial for creating hyperlinks, allowing users to navigate between different
web pages and resources on the internet. The `<a>` element is used to create links.
Multimedia and Forms: HTML supports embedding multimedia elements like images, audio,
video, and interactive content through elements like `<img>`, `<audio>`, `<video>`, and
`<iframe>`. It also provides form elements (e.g., `<form>`, `<input>`, `<textarea>`) for creating
interactive forms for user input and data submission.
CSS:
CSS, which stands for Cascading Style Sheets, is a crucial technology used in web development to control
the presentation and styling of HTML documents. It allows web designers and developers to define how
web content should appear, specifying details such as layout, colors, fonts, and spacing.
Here's a brief overview of CSS:
Selectors and Declarations: CSS works by applying rules to HTML elements using selectors.
Selectors target specific elements in the HTML document, and declarations within those rules
define the styles to apply, such as colors or font sizes.
Cascading Styles: The term "cascading" in CSS refers to the order in which styles are applied to
elements. Styles can be inherited from parent elements, overridden by more specific selectors, or
modified by external stylesheets. This cascade allows for flexibility and control over how styles are
applied.
External Stylesheets: CSS can be included in an HTML document internally (within a `<style>`
element in the document's `<head>`) or externally (in a separate .css file linked to the HTML
document). External stylesheets are preferred for larger projects because they promote consistency
and easier maintenance.
Box Model: CSS defines how elements are displayed in a box model, which includes content,
padding, borders, and margins. This model governs the layout of elements on the web page.
Transitions and Animations: CSS allows for the creation of smooth transitions and animations,
enhancing user experiences. Transitions let property values change gradually over a specified
duration, while keyframes and animations offer more complex effects.
Accessibility: CSS can contribute to web accessibility by defining styles that are more readable and
usable for people with disabilities, such as those who use screen readers or have visual impairments.
WORDPRESS:
WordPress is a popular and widely-used content management system (CMS) and website creation platform.
It provides a user-friendly interface for building, managing, and maintaining websites and blogs.
Here's a brief overview of WordPress:
Content Management System (CMS): WordPress is primarily known as a CMS, which means it
enables users to easily create, organize, and manage various types of content on a website. It is
particularly well-suited for blogs, news sites, e-commerce stores, portfolios, and small to medium-
sized business websites.
Open Source: WordPress is open-source software, which means it is freely available for anyone to
use, modify, and distribute. This open nature has led to a large and active community of developers
and users who contribute to its growth and development.
Themes: WordPress allows users to choose from thousands of themes (both free and premium) that
control the design and layout of their websites. Themes can be customized to match specific
branding or design preferences.
Plugins: WordPress's functionality can be extended through plugins, which are small pieces of
software that add new features and functionality to a website. There are thousands of plugins
available for various purposes, including SEO optimization, e-commerce, social media integration,
and more.
Blogging: WordPress initially gained popularity as a blogging platform. It offers powerful blogging
tools, including categories, tags, and a comment system, making it a favorite among bloggers.
SEO-Friendly: WordPress is known for being SEO-friendly out of the box. It generates clean and
structured HTML code, offers SEO plugins, and provides features like customizable permalinks,
making it easier for websites to rank well in search engines.
Community and Support: The WordPress community is vast and active, providing ample support
through forums, documentation, and tutorials. Users can find answers to their questions and
troubleshoot issues easily.
JAVASCRIPT:
JavaScript, often abbreviated as JS, is a versatile and essential programming language for web development.
It's primarily used to enhance the interactivity and functionality of websites. JavaScript allows developers to
create dynamic web content, manipulate the Document Object Model (DOM) to change webpage elements
in real-time, and interact with users through forms and user interfaces. With the rise of modern web
applications, JavaScript has become a core technology for building responsive, interactive, and user-friendly
websites. It's supported by all major web browsers and is commonly used alongside HTML and CSS to
create engaging online experiences, ranging from simple animations to complex web applications.
JavaScript's widespread use and continuous evolution make it a fundamental skill for web developers.
PYTHON:
Python is a high-level programming language that is commonly used in human speech emotion recognition
projects. It is a versatile language that offers a wide range of libraries and frameworks for speech
processing, machine learning, and data analysis.
Created by Guido van Rossum and first released in 1991, Python has gained immense popularity across
various domains, including web development, data analysis, machine learning, and automation.
Its clean and concise syntax, which emphasizes code readability through indentation, makes it an ideal
choice for both beginners and experienced developers. Python boasts a rich standard library and a vast
ecosystem of third-party packages and frameworks, contributing to its flexibility and efficiency in solving a
wide range of programming tasks. Its community-driven development model ensures regular updates and
support, making Python a top choice for anyone looking to develop software, analyze data, or explore the
world of artificial intelligence and machine learning.
MACHINE LEARNING:
Machine learning is a subfield of artificial intelligence (AI) that focuses on developing algorithms and
models that enable computers to learn from and make predictions or decisions based on data, without being
explicitly programmed.
Supervised Learning: In this type, the algorithm learns from a labeled dataset, where the input data
is paired with the corresponding correct output or target values. It is used for tasks like classification
and regression.
Unsupervised Learning: Unsupervised learning deals with unlabeled data and aims to find patterns
or structures within the data, often through clustering or dimensionality reduction techniques.
Semi-Supervised Learning and Self-Supervised Learning: These are hybrid approaches that use
both labeled and unlabeled data or generate their own labels from the data.
TOPICS FROM GUEST LECTURER
WordPress is a popular content management system (CMS) widely used for building websites and blogs.
Here's a brief overview of web development using WordPress, which can be helpful for your internship:
Introduction to WordPress:
WordPress is an open-source platform known for its user-friendliness and versatility. It allows you to create
websites without extensive coding knowledge.
Installation and Setup:
Start by installing WordPress on a web server. You can use a local development environment or a web
hosting service. Once installed, configure the basic settings.
Themes:
WordPress offers a wide range of themes, both free and premium, to change the look and layout of your
website. You can also create custom themes to suit specific project requirements.
Plugins:
Plugins extend WordPress's functionality. There are thousands of plugins available for various purposes,
such as SEO optimization, e-commerce, and security. Choose and install plugins based on your project needs.
Content Management:
WordPress provides a user-friendly dashboard for managing content. You can create and edit pages, posts,
and multimedia elements easily. Utilize categories and tags to organize content effectively.
Customization:
WordPress allows for extensive customization. You can modify themes and plugins using HTML, CSS, and
PHP to meet unique design and functionality requirements.
SEO Optimization:
Optimize your website for search engines using SEO plugins and best practices. Focus on keyword
research, meta tags, and content quality to improve search engine rankings.
Performance Optimization:
Ensure your website loads quickly by optimizing images, using caching plugins, and implementing content
delivery networks (CDNs) if needed.
Security:
WordPress is a common target for hackers, so implement security measures. Regularly update themes,
plugins, and WordPress core. Use security plugins and strong passwords.
Testing and Debugging:
Test your website across different browsers and devices to ensure compatibility. Debug any issues that
arise during development.
Backup and Maintenance:
Set up regular backups to safeguard your website's data. Perform routine maintenance tasks, including
updates and security checks.
Performance Monitoring:
Use tools like Google Analytics to monitor website traffic and user behavior. Adjust your strategy based
on insights gained from analytics.
Responsive Design:
Ensure your website is responsive, meaning it adapts to various screen sizes, including mobile devices, to
provide a seamless user experience.
Collaboration and Communication:
Effective communication with team members and clients is crucial. Use project management tools and
keep stakeholders informed about progress.
Documentation:
Maintain detailed documentation of your work, including changes made, custom code snippets, and
configuration settings. This aids troubleshooting and future development.
Continuous Learning:
The field of web development, including WordPress, is continually evolving. Stay updated with the latest
trends, techniques, and technologies.
FEEDBACK/OPINION OF THE INTERNSHIP
Innovative topics/Methods:
Web development is the process of creating and maintaining websites and web applications for the internet
or an intranet. It encompasses a range of tasks, from designing the user interface and user experience to
writing the code that makes a website functional. Front-end developers focus on the visible aspects of a
website that users interact with directly. This includes designing the layout, creating responsive designs for
different devices and screen sizes, and implementing the user interface using technologies like HTML
(Hypertext Markup Language), CSS (Cascading Style Sheets), and JavaScript. Front-end developers strive
to create visually appealing and user-friendly websites.
Content management systems (CMS) like WordPress, Drupal, and Joomla provide pre-built solutions for
website creation, while custom web development allows for tailored solutions to meet specific
requirements.The field of web development is constantly evolving, with new technologies and frameworks
emerging regularly. Additionally, web developers must consider factors like web accessibility, search
engine optimization (SEO), and user experience (UX) to create websites that are both functional and user-
friendly. Web development plays a critical role in shaping the digital presence of businesses, organizations,
and individuals on the internet.
Python is an excellent choice for an internship focused on ML due to its popularity, versatility, and extensive
ecosystem of libraries and tools for ML and data science. Ultimately, the quality of an internship
experience depends on various factors, including the specific organization, team, and project one is
assigned to.
ABSTRACT
Music plays a very important role in our daily lives. Everyone wants to listen to music of their individual
taste, mostly based on their mood, and users always face the task of manually browsing music to create
a playlist that suits their current mood.
The proposed project efficiently generates a music playlist based on the current mood of the user.
Facial expressions are the best way of expressing the ongoing mood of a person. The objective of this
project is to suggest songs to users based on their mood by capturing facial expressions. Facial expressions
are captured through a webcam and fed into a learning algorithm, which outputs the most probable
emotion. Once the emotion is recognized, the system suggests a playlist for that emotion, thus saving a
lot of the user's time.
Once the emotion is detected by the CNN, it is passed to the Spotify API, which generates a playlist
according to the emotion of the user.
CONTENTS
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
1.1 Machine Learning
1.2 Neural Network
1.3 Deep Learning
1.4 Motivation Work
1.5 Problem Statement
CHAPTER 2
LITERATURE SURVEY
2.1 A Face Expression Recognition Using CNN and LBP
2.2 Emotion-Based Music Player
2.3 A Machine Learning Based Music Player by Detecting Emotions
2.4 Automatic Facial Expression Recognition
2.5 Existing System
CHAPTER 3
METHODOLOGY
3.1 Proposed System
3.2 Face Detection
3.2.1 Haar-Like Features
3.2.2 Integral Image
3.2.3 Adaptive Boosting
3.2.4 The Cascade Classifier
3.3 Facial Feature Extraction
3.3.1 Convolutional Neural Network
3.3.2 Convolution Layer
3.3.3 Pooling Layer
3.3.4 Classification - Fully Connected Layer
CHAPTER 4
DESIGN
4.1 Structure Chart
4.2 UML Diagrams
4.2.1 Use Case Diagram
4.2.2 Sequence Diagram
4.2.3 Activity Diagram
4.2.4 Collaboration Diagram
4.2.5 Flow Chart
4.2.6 Component Diagram
CHAPTER 5
DATASET DETAILS
CHAPTER 6
EXPERIMENTAL ANALYSIS AND RESULTS
6.1 Software Configuration
6.1.1 Software Requirements
6.1.2 Hardware Requirements
6.2 Sample Code
6.3 Sample Inputs and Outputs
6.4 Performance Measure
6.5 Testing
CHAPTER 7
CONCLUSION AND FUTURE WORK
CHAPTER 8
IMPLEMENTATION
CHAPTER 9
APPENDIX
CHAPTER 10
REFERENCES
1. INTRODUCTION
Music plays an important role in our daily life, yet users face the task of manually browsing their
music collections.
Computer vision is a field of study which focuses on how computers see and understand digital
images and videos.
Computer vision involves seeing or sensing a visual stimulus, making sense of what has been seen, and
extracting complex information that can be used for other machine learning activities. We will
implement our use case using the Haar cascade classifier, an effective object detection approach
proposed by Paul Viola and Michael Jones in their 2001 paper, "Rapid Object Detection using a
Boosted Cascade of Simple Features".
This project recognizes the facial expressions of the user and plays songs according to the emotion.
Facial expressions are the best way of expressing the mood of a person. The facial expressions are
captured using a webcam, and face detection is done with the Haar cascade classifier. The captured
image is input to a CNN, which learns features that are analyzed to determine the current emotion of
the user; the music is then played according to that emotion. In this project, five emotions are
considered for classification: happy, sad, anger, surprise, and neutral. The project consists of four
modules: face detection, feature extraction, emotion detection, and song classification. Face detection
is done by the Haar cascade classifier; feature extraction and emotion detection are done by the CNN.
Finally, songs are played according to the recognized emotion. A minimal sketch of the
capture-and-detect step is shown below.
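The capture-and-detect step can be sketched with OpenCV as follows. This is an illustrative sketch, not the project's actual code: it assumes the opencv-python package, OpenCV's bundled haarcascade_frontalface_default.xml file, and a webcam at index 0.

import cv2

# Load OpenCV's pre-trained frontal-face Haar cascade.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cam = cv2.VideoCapture(0)              # default webcam

ret, frame = cam.read()                # capture one frame
if ret:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # Viola-Jones works on grayscale
    faces = cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    for (x, y, w, h) in faces:
        face = cv2.resize(gray[y:y+h, x:x+w], (48, 48))  # crop for the CNN stage
cam.release()

The 48x48 crop size matches the FER-2013 images described in the dataset chapter.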
A Convolutional Neural Network (CNN) is a specific type of artificial neural network widely used
for image classification.
A CNN is a deep learning model for processing data that has a grid pattern, such as images. It is
inspired by the organization of the animal visual cortex and designed to automatically and adaptively
learn spatial hierarchies of features, from low- to high-level patterns. A CNN is a mathematical
construct typically composed of three types of layers (or building blocks): convolution, pooling, and
fully connected layers. The first two, convolution and pooling layers, perform feature extraction,
whereas the third, the fully connected layer, maps the extracted features to a final output such as a
class label.
Applications of CNN:
1. Decoding Facial Recognition.
2. Analyzing Documents.
3. Historic and Environmental Collections.
4. Understanding Climate.
5. Advertising.
Advantages of CNN:
1. Processing speed.
2. Flexibility.
3. Versatility.
4. Dynamic behaviour.
5. Robustness.
Disadvantages of CNN:
1. CNNs do not encode the position and orientation of objects.
TYPES OF LEARNINGS:
Machine Learning Algorithms can be classified into 3 types as follows:
1. Supervised learning
2. Unsupervised Learning
3. Reinforcement Learning
SUPERVISED LEARNING:
Supervised learning is the most popular paradigm for machine learning. It is the easiest to understand
and the simplest to implement. It is the task of learning a function that maps an input to an output based
on example input-output pairs. It infers a function from labelled training data consisting of a set of
training examples. In supervised learning, each example is a pair consisting of an input object
(typically a vector) and a desired output value (also called the supervisory signal). A supervised
learning algorithm analyses the training data and produces an inferred function, which can be used for
mapping new examples. Supervised Learning is very similar to teaching a child with the given data and
that data is in the form of examples with labels, we can feed a learning algorithm with these example-
label pairs one by one, allowing the algorithm to predict the right answer or not. Over time, the
algorithm will learn to approximate the exact nature of the relationship between examples and their
The goal is to approximate the mapping function so well that when you have new input data (x) that
you can predict the output variables (Y) for the data. It is called supervised learning because the process
of an algorithm learning from the training dataset can be thought of as a teacher supervising the
learning process. Supervised learning is often described as task oriented. It is highly focused on a
singular task, feeding more and more examples to the algorithm until it can accurately perform on that
task. This is the learning type that you will most likely encounter, as it is exhibited in many of the
common applications like Advertisement Popularity, Spam Classification, face recognition.
Two types of Supervised Learning are:
1. Regression:
Regression models a target prediction value based on independent variables. It is mostly used for
finding the relationship between variables and for forecasting. Regression can be used to estimate or
predict continuous values (real-valued output). For example, given a picture of a person, we have to
predict their age on the basis of the picture.
2. Classification:
Classification means grouping the output into a class. If the data is discrete or categorical, it is a
classification problem. For example, given data about the sizes of houses in the real estate market, we
can make our output about whether a house "sells for more or less than the asking price", i.e., classify
houses into two discrete categories (a minimal sketch follows this list).
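As an illustration of the classification example above, here is a minimal sketch using scikit-learn (assumed to be installed); the house data and the choice of logistic regression are made up for illustration.

from sklearn.linear_model import LogisticRegression

# Labeled examples: (size in sq. ft, price ratio) -> sold above (1) or below (0) asking price
X = [[1200, 0.9], [1500, 1.1], [900, 0.8], [2000, 1.3]]
y = [0, 1, 0, 1]

model = LogisticRegression().fit(X, y)   # learn the mapping from example-label pairs
print(model.predict([[1400, 1.0]]))      # classify a new, unseen input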
UNSUPERVISED LEARNING
Unsupervised Learning is a machine learning technique, where you do not need to supervise the
model. Instead, you need to allow the model to work on its own to discover information. It mainly
deals with the unlabeled data and looks for previously undetected patterns in a data set with no pre-
existing labels and with a minimum of human supervision. In contrast to supervised learning that
usually makes use of human labelled data, unsupervised learning, also known as self-organization,
allows for modelling of probability densities over inputs.
Unsupervised machine learning algorithms infer patterns from a dataset without reference to known, or
labelled outcomes. It is the training of machine using information that is neither classified nor labelled
and allowing the algorithm to act on that information without guidance. Here the task of machine is to
group unsorted information according to similarities, patterns, and differences without any prior
training of data. Unlike supervised learning, no teacher is provided that means no training will be
given to the machine. Therefore, machine is restricted to find the hidden structure in unlabeled data by
our-self. For example, if we provide some pictures of dogs and cats to the machine to categorized, then
initially the machine has no idea about the features of dogs and cats so it categorizes them according to
their similarities, patterns and differences. The Unsupervised Learning algorithms allows you to
perform more complex processing tasks compared to supervised learning. Although, unsupervised
learning can be more unpredictable compared with other natural learning methods.
• Clustering: A clustering problem is where you want to discover the inherent groupings in the data,
such as grouping customers by purchasing behaviour (a minimal sketch follows this list).
• Association: An association rule learning problem is where you want to discover rules that describe
large portions of your data, such as people that buy X also tend to buy Y.
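A minimal clustering sketch with scikit-learn's KMeans, echoing the customer-grouping example above; the data points and cluster count are made up for illustration.

from sklearn.cluster import KMeans

# Unlabeled data: (visits per month, average spend) for six customers
X = [[2, 10], [3, 12], [2, 9], [20, 90], [22, 95], [21, 88]]

kmeans = KMeans(n_clusters=2, n_init=10).fit(X)  # discover groupings, no labels given
print(kmeans.labels_)                            # cluster assignment for each customer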
REINFORCEMENT LEARNING
Reinforcement Learning (RL) is a type of machine learning technique that enables an agent to learn in
an interactive environment by trial and error using feedback from its own actions and experiences.
Machine mainly learns from past experiences and tries to perform best possible solution to a certain
problem. It is the training of machine learning models to make a sequence of decisions. Though both
supervised and reinforcement learning use mapping between input and output, unlike supervised
learning where the feedback provided to the agent is correct set of actions for performing a task,
reinforcement learning uses rewards and punishments as signals for positive and negative behaviour.
Reinforcement learning is currently the most effective way to hint machine’s creativity
Artificial neural networks can best be viewed as weighted directed graphs, where the nodes are
formed by artificial neurons and the connections between neuron outputs and neuron inputs are
represented by directed edges with weights. The ANN receives an input signal from the external
world in the form of a pattern or image as a vector. These inputs are mathematically designated by
the notation x(n) for every n inputs. Each input is then multiplied by its corresponding weight (these
weights are the details the artificial neural network uses to solve a problem). The weights typically
represent the strength of the interconnections among the neurons inside the network.
All the weighted inputs are summed inside the computing unit (the artificial neuron). If the weighted
sum is zero, a bias is added to make the output non-zero, or otherwise to scale up the system's
response; the bias has its own weight, and its input is always equal to 1. The sum of weighted inputs
can range from 0 to positive infinity, so to keep the response within the desired limits, a threshold
value is set and the sum of weighted inputs is passed through an activation function. The activation
function is a transfer function used to obtain the desired output. There are various flavours of
activation function, broadly linear or non-linear; some of the most commonly used are the binary
step, sigmoid, and hyperbolic tangent (tanh) activation functions.
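The weighted sum, bias, and activation described above can be written in a few lines of NumPy. This is an illustrative sketch; the input values and weights are arbitrary.

import numpy as np

def neuron(x, w, b):
    s = np.dot(w, x) + b                 # sum of weighted inputs plus the bias
    return 1.0 / (1.0 + np.exp(-s))      # sigmoid activation squashes s into (0, 1)

x = np.array([0.5, -1.2, 3.0])           # input pattern as a vector
w = np.array([0.4, 0.7, -0.2])           # connection weights
print(neuron(x, w, b=0.1))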
1. Input Layer: The input layer contains the artificial neurons (termed units) that receive input
from the outside world. This is where the network receives the data it will learn from or otherwise
process.
2. Hidden Layer: The hidden layers sit between the input and the output layers. The job of a
hidden layer is to transform the input into something meaningful that the output layer/unit can use
in some way. Most artificial neural networks are fully interconnected, meaning each hidden unit is
individually connected to the neurons in its input layer and also to those in its output layer, leaving
nothing to hang in the air. This makes a complete learning process possible, and learning occurs
best when the weights inside the artificial neural network are updated after each iteration.
3. Output Layer: The output layer contains units that produce the network's response to the
information fed into the system, reflecting what it has learned.
The network is typically trained as follows (a toy sketch follows this list):
1. Initialize the network parameters (weights and biases), usually with small random values.
2. Take a set of examples of input data and pass them through the network to obtain their predictions.
3. Compare the predictions obtained with the values of the expected labels and calculate the loss.
4. Perform backpropagation to propagate this loss to each and every one of the parameters that make
up the model of the neural network.
5. Use this propagated information to update the parameters of the neural network with gradient
descent, in a way that reduces the total loss and obtains a better model.
6. Continue iterating over the previous steps until we consider that we have a good model.
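A toy NumPy version of this loop, assuming a single linear neuron and a made-up dataset, just to make the six steps concrete:

import numpy as np

X = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])              # target relationship: y = 2x
w = np.random.randn()                      # step 1: random initial parameter

for epoch in range(100):                   # step 6: keep iterating
    pred = X * w                           # step 2: forward pass
    loss = np.mean((pred - y) ** 2)        # step 3: compare with expected labels
    grad = np.mean(2 * (pred - y) * X)     # step 4: gradient of the loss w.r.t. w
    w -= 0.05 * grad                       # step 5: gradient-descent update
print(w, loss)                             # w approaches 2.0 as the loss shrinks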
Deep learning is a subset of machine learning based on artificial neural networks. It is an artificial
intelligence function that imitates the workings of the human brain in processing data and creating
patterns for use in decision making. Deep learning networks are capable of learning, unsupervised,
from data that is unstructured or unlabeled; such a network has a greater number of hidden layers and
is also known as a deep neural network.
Deep learning has evolved hand-in-hand with the digital era, which has brought about an explosion
of data in all forms and from every region of the world. This data, known simply as big data, is
drawn from sources like social media, internet search engines, e-commerce platforms, and online
cinemas, among others. This enormous amount of data is readily accessible and can be shared
through applications like cloud computing. However, the data, which normally is unstructured, is so
vast that it could take decades for humans to comprehend it and extract the relevant information.
Companies realize the incredible potential that can result from unravelling this wealth of information
and are increasingly adopting AI systems for automated support.
Deep learning learns from vast amounts of unstructured data that would normally take humans
decades to understand and process. It utilizes a hierarchy of artificial neural networks to carry out the
process of machine learning. The artificial neural networks are built like the human brain, with
neuron nodes connected like a web. While traditional programs build analyses with data in a linear
way, the hierarchical function of deep learning systems enables machines to process data with a
nonlinear approach.
Music plays an important part in our life: it gives us relief and reduces stress. The goal of this
project is to generate a music playlist based on the facial expressions of the user. In the following
chapters, we discuss how a convolutional neural network (CNN) is used to generate a music playlist
according to the facial expressions of the user.
2. LITERATURE SURVEY
A literature survey is the most important step in the software development process. Before developing
the tool, it is necessary to determine the time factor, economy, and company strength. Once these
things are satisfied, the next step is to determine which operating system and language can be used
for developing the tool. Once programmers start building the tool, they need a lot of external
support, which can be obtained from senior programmers, from books, or from websites. Before
building the system, the above considerations are taken into account for developing the proposed
system.
2.1 A Face Expression Recognition Using CNN and LBP:
Even a simple change in facial expression signifies happiness, sorrow, surprise, or anxiety, and the
facial expressions of a person vary across contexts such as lighting, posture, and background. All
these factors remain an issue when recognizing facial expressions. This paper aims to bring out a fair
comparison between two of the most commonly used facial expression recognition (FER) techniques
and to shed some light on their precision. The methods used are Local Binary Patterns (LBP) and
Convolutional Neural Networks (CNN). LBP is a method only for extracting features, so a Support
Vector Machine (SVM) classifier is used for classifying the features extracted by LBP. The datasets
used for training and testing in this paper are CK+, JAFFE, and Yale Face.
2.4 Automatic facial expression recognition using features of salient facial patches:
They propose a system in which an image from the database is passed to a facial landmark detection
stage, where noise is removed by applying a Gaussian filter or mask. Here they use the Viola-Jones
technique of Haar-like features with AdaBoost learning for face detection. The feature detection stage
consists of an eyebrow-corner detector, an eye detector, a nose detector, and a lip-corner detector.
After the active facial patches are extracted, classification of the features is done by an SVM (Support
Vector Machine). During testing, hundreds of images are taken from the database, their features are
extracted, and they are classified accordingly. They used the CK+ (Cohn-Kanade) and JAFFE
datasets for training and testing; the training database consists of 329 images in total.
3. METHODOLOGY
3.1 Proposed System
The convolutional neural network algorithm is a multilayer perceptron specially designed for the
identification of two-dimensional image information. It has four kinds of layers: an input layer,
convolution layers, sampling layers, and an output layer; in a deep network architecture there may be
multiple convolution and sampling layers. A CNN is not as unrestricted as a fully connected network,
which requires every neuron in a layer to be connected to all neurons in the adjacent layers: in a
convolutional neural network, each neuron does not need to see the global image, only a local region
of it. In addition, the parameters of the neurons in a feature map are set to be the same, namely weight
sharing: each neuron applies the same convolution kernel across the image. The key ideas of CNN are
local receptive fields, weight sharing, and subsampling in time or space, with the purpose of extracting
features and reducing the size of the training parameters. The advantage of the CNN algorithm is that
it avoids explicit feature extraction and instead learns implicitly from the training data. Because
neuron weights are shared across a feature map, the network can learn in parallel and the complexity
of the network is reduced; adopting a sub-sampling structure provides robustness to scale and
displacement deformation. Input information and network topology can be matched very well, which
gives CNNs unique advantages in image processing.
The Viola-Jones algorithm, developed in 2001 by Paul Viola and Michael Jones, is an
object-recognition framework that allows the detection of image features in real time.
Viola-Jones is quite powerful, and its application has proven exceptionally notable in real-time
face detection; the framework is still a leading player in face detection alongside many of its CNN
counterparts. The Viola-Jones object detection framework combines the concepts of Haar-like
features, integral images, the AdaBoost algorithm, and the cascade classifier to create a system
for object detection that is fast and accurate.
Viola-Jones was designed for frontal faces, so it detects frontal faces best, rather than faces
looking sideways, upwards, or downwards. Before detecting a face, the image is converted into
grayscale, since grayscale is easier to work with and there is less data to process. The Viola-Jones
algorithm first detects the face on the grayscale image and then finds the location on the colored image.
Viola-Jones outlines a box and searches for a face within it, essentially looking for the Haar-like
features explained below. The box moves a step to the right after going through every tile in the
picture. The box size and step size can be changed according to your needs; with smaller steps, a
number of boxes detect face-like features, and the data of all of those boxes put together helps the
algorithm determine where the face is.
There are three types of Haar-like features that Viola and Jones identified in their research:
• Edge features
• Line features
• Four-sided features
These features help the machine understand what the image contains. Imagine what the edge of a table
would look like in a black-and-white image: one side is lighter than the other, creating an edge-like
feature. Of the two features most important for face detection, the horizontal and the vertical features
describe what eyebrows and the nose, respectively, look like to the machine.
Additionally, when the images are inspected, each feature has a value of its own, which is quite easy
to calculate: subtract the sum of pixels under the white area from the sum under the black area.
In reality, these calculations can be very intensive, since the number of pixels within a large feature is
much greater. The integral image allows us to perform these intensive calculations quickly, so we can
tell whether a feature, or a number of features, fits the criteria. The value of a single cell in the
integral image is the sum of all the pixels above it and to its left, inclusive.
Haar-like features are rectangular, and the integral image process allows us to find a feature within an
image very easily, because we already know the sum value of any particular rectangle: to find the sum
of a rectangle in the regular image, we only need to combine a few values in the integral image. So
even if you had 1000 x 1000 pixels in your grid, the integral image method makes the calculations far
less intensive and can save a lot of time for any face detection model.
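A NumPy sketch of building an integral image and using it to read off a rectangle sum with at most four lookups; the 4x4 toy image is made up.

import numpy as np

img = np.arange(16).reshape(4, 4)          # toy 4x4 grayscale image
ii = img.cumsum(axis=0).cumsum(axis=1)     # integral image: sum above and to the left, inclusive

def rect_sum(ii, r1, c1, r2, c2):
    # Sum of img[r1:r2+1, c1:c2+1] from at most four integral-image lookups.
    total = ii[r2, c2]
    if r1 > 0: total -= ii[r1 - 1, c2]
    if c1 > 0: total -= ii[r2, c1 - 1]
    if r1 > 0 and c1 > 0: total += ii[r1 - 1, c1 - 1]
    return total

print(rect_sum(ii, 1, 1, 2, 2), img[1:3, 1:3].sum())   # both print 30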
The AdaBoost (Adaptive Boosting) algorithm is a machine learning algorithm for selecting the best
subset of features among all available features. The output of the algorithm is a classifier (prediction
function, hypothesis function) called a "strong classifier". A strong classifier is made up of a linear
combination of "weak classifiers" (best features). From a high level, to find these weak classifiers
the algorithm runs for T iterations, where T is the number of weak classifiers to find and is set by
you. In each iteration, the algorithm finds the error rate for all features and then chooses the feature
with the lowest error rate for that iteration. The algorithm learns from the images we supply it and is
able to determine the false positives and true negatives in the data, allowing it to be more accurate.
We would get a highly accurate model once we have looked at all possible positions and combinations
of those features; training can be very extensive because of all the different possibilities and
combinations that must be checked for every single frame or image.
Suppose the success rate is determined by an equation of the form F(x) = a1*f1(x) + a2*f2(x) +
a3*f3(x) + ..., with f1, f2, f3 as the features and a1, a2, a3 as the respective weights of the features.
Each of the features is known as a weak classifier, and the left side of the equation, F(x), is called a
strong classifier. Since one weak classifier may not be good on its own, we get a strong classifier by
combining two or three weak classifiers, and it gets stronger as we keep adding more. This is called an
ensemble.
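The same ensemble idea can be demonstrated with scikit-learn's AdaBoostClassifier, whose default weak classifier is a depth-1 decision tree (a "stump"). The toy data is made up; this is a sketch of boosting in general, not of the Viola-Jones feature selection itself.

from sklearn.ensemble import AdaBoostClassifier

X = [[0, 0], [1, 1], [0, 1], [1, 0], [2, 2], [2, 0]]   # toy feature vectors
y = [0, 1, 0, 0, 1, 1]                                 # toy labels

strong = AdaBoostClassifier(n_estimators=5)            # T = 5 weak classifiers
strong.fit(X, y)                                       # each round picks the lowest-error stump
print(strong.predict([[2, 1]]))                        # the weighted vote F(x)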
A cascade classifier is a multi-stage classifier that can perform detection quickly and accurately. Each
stage consists of a strong classifier produced by the AdaBoost algorithm, and from one stage to the
next, the number of weak classifiers in the strong classifier increases. An input is evaluated on a
sequential (stage-by-stage) basis: if a classifier for a specific stage outputs a negative result, the input
is discarded immediately; if the output is positive, the input is forwarded to the next stage. According
to Viola and Jones (2001), this multi-stage approach allows for the construction of simpler classifiers
which can then be used to reject most negative (non-face) input quickly, while spending more time on
positive (face) input.
It is another sort of "hack" to boost the speed and accuracy of the model. We start by taking a
sub-window, and within this sub-window we check whether our most important (best) feature is
present. If it is not, we discard the sub-window without looking at it further. If it is present, we look
at the second feature in the sub-window; if that is absent, we reject the sub-window. We continue for
however many features we have, rejecting the sub-windows that lack a feature. Each evaluation may
take a split second, but since it has to be done for each feature, the total could take a lot of time;
cascading speeds up this process considerably, and the machine is able to deliver results much faster.
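Conceptually, cascade evaluation is just an early-exit loop. A hypothetical sketch (the stage functions here are toy stand-ins, not real Haar stages):

def cascade_predict(stages, window):
    # stages: list of (strong_classifier, threshold) pairs, cheapest stages first.
    for classifier, threshold in stages:
        if classifier(window) < threshold:
            return False        # rejected immediately; later stages never run
    return True                 # survived every stage: likely a face

stages = [(sum, 1.0), (max, 0.5)]            # toy stand-in classifiers
print(cascade_predict(stages, [0.4, 0.8]))   # True: passes both stages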
The convolutional neural network (CNN) is an efficient recognition algorithm widely used in
pattern recognition and image processing. It has many desirable features, such as a simple structure,
few training parameters, and adaptability.
CNNs are a class of deep learning neural networks and represent a huge breakthrough in image
recognition. They are most commonly used to analyze visual imagery and frequently work behind
the scenes in image classification. They can be found at the core of everything from Facebook's
photo tagging to self-driving cars, and they work hard behind the scenes in everything from
healthcare to security. Image classification is the process of taking an input (like a picture) and
outputting a class, or a probability that the input belongs to a particular class.
CNNs can be thought of as automatic feature extractors from the image. A CNN effectively uses
adjacent-pixel information to down-sample the image. Specifically, a CNN has one or more layers
of convolution units; a convolution unit receives its input from multiple units in the previous layer
which together create a proximity. Therefore, the input units (which form a small neighborhood)
share their weights.
The role of the ConvNet is to reduce the images into a form which is easier to process, without losing
features which are critical for getting a good prediction. This is important when we want to design an
architecture which is not only good at learning features but is also scalable to massive datasets. A
CNN typically has three kinds of layers: a convolutional layer, a pooling layer, and a fully connected
layer.
The element that carries out the convolution operation in the first part of a convolutional layer is
called the kernel or filter. The objective of the convolution operation is to extract features such as
edges from the input image, and ConvNets need not be limited to only one convolutional layer.
Conventionally, the first ConvLayer is responsible for capturing low-level features such as edges,
color, and gradient orientation; with added layers, the architecture adapts to the high-level features
as well.
Convolution is a mathematical operation used to merge two sets of information. In our case, the
convolution is applied to the input data using a convolution filter to produce a feature map.
We perform the convolution operation by sliding this filter over the input. At every location, we do
element-wise matrix multiplication and sum the result; this sum goes into the feature map. The area
of the input where the convolution operation takes place is called the receptive field; for a 3x3 filter,
the receptive field is also 3x3.
The ReLU (Rectified Linear Unit) activation function is applied after the convolution operation. It is
used to bring non-linearity to the model: it simply converts the negative values present in the feature
map to 0. A NumPy sketch of convolution followed by ReLU is shown below.
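This is an illustrative sketch of the sliding-window convolution and the ReLU step, with an arbitrary 5x5 input and a hand-written vertical-edge filter:

import numpy as np

def conv2d(image, kernel):
    h, w = kernel.shape
    out = np.zeros((image.shape[0] - h + 1, image.shape[1] - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # element-wise multiply the receptive field by the filter, then sum
            out[i, j] = np.sum(image[i:i+h, j:j+w] * kernel)
    return out

image = np.random.randn(5, 5)
kernel = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]])  # vertical-edge filter
fmap = np.maximum(conv2d(image, kernel), 0)              # ReLU: negatives become 0
print(fmap)                                              # 3x3 feature map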
Similar to the convolutional layer, the pooling layer is responsible for reducing the spatial size of the
convolved feature. This decreases the computational power required to process the data through
dimensionality reduction. Furthermore, it is useful for extracting dominant features which are
rotationally and positionally invariant, thus keeping the training of the model effective. There are
two types of pooling: max pooling and average pooling. Max pooling returns the maximum value
from the portion of the image covered by the kernel, while average pooling returns the average of
all the values from that portion. A NumPy sketch follows.
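A sketch of 2x2 max pooling with stride 2; swapping .max() for .mean() gives average pooling.

import numpy as np

def max_pool(fmap, size=2):
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            # keep only the largest value in each size x size window
            out[i, j] = fmap[i*size:(i+1)*size, j*size:(j+1)*size].max()
    return out

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool(fmap))        # 4x4 feature map reduced to 2x2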
Neurons in this layer have full connectivity with all neurons in the preceding and succeeding layers,
as in a regular feed-forward neural network; the fully connected layer is also called a dense layer. It
learns features from all the combinations of the features of the previous layer, and it helps map the
representation between the input and the output.
The flattened output is fed to a feed-forward neural network, and backpropagation is applied in every
iteration of training. Over a series of epochs, the model becomes able to distinguish between
dominating and certain low-level features in images and to classify them using the softmax
classification technique.
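A minimal Keras sketch of the conv -> pool -> flatten -> dense -> softmax stack described above, sized for 48x48 grayscale faces and the five emotions this project classifies; the layer widths are illustrative, not the project's actual architecture.

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(48, 48, 1)),                 # 48x48 grayscale face
    layers.Conv2D(32, (3, 3), activation="relu"),    # feature extraction
    layers.MaxPooling2D((2, 2)),                     # spatial down-sampling
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                                # 3-D volume -> 1-D vector
    layers.Dense(128, activation="relu"),            # fully connected (dense) layer
    layers.Dense(5, activation="softmax"),           # one probability per emotion
])
model.summary()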
WORKING:
1. The input 3-D volume is passed to this layer. Its dimension is H*W*C, where H, W, and C
represent height, width, and the number of channels respectively.
2. There can be K filters, where K becomes the depth of the output volume. The dimension of
each of the K filters is the same, f*f*C, where f is the filter size and C is the number of channels
the input image has.
3. If padding is configured, it is added to the input volume. With "same" padding, rows and
columns of zeros are added around the input. Padding is applied only along the height and width
of the input and is applied to each channel.
4. After padding, the computation begins. The filter slides starting from the top-left corner; the
corresponding values of the filter and the input volume are multiplied and then summed. The
filter then slides horizontally, taking a stride number of steps per slide (so if the stride is 2, it
slides 2 columns), and the same process is repeated vertically until the whole image is covered.
5. All the values obtained from the filter computation are passed through the ReLU activation,
max(0, x): any negative values are replaced by zero, as negative values have no significance for
a pixel.
6. Steps 4 and 5 generate just one layer of the output volume; the 3-D input volume is transformed
into a 2-D slice.
7. Steps 4 and 5 are then repeated for all K filters, and the output of each filter is stacked above
the others, so the depth of the output volume is K.
8. To calculate the dimensions of the output volume, we need all the hyperparameters of the
convolutional layer (see the helper after this list). All the filters used at this layer need to be
trained and are initialized with small random numbers.
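The output dimensions implied by steps 3, 4, and 7 follow the usual formula out = (in - f + 2p) / s + 1 along height and width, with depth K. A small helper (hypothetical names) makes this concrete:

def conv_output_shape(h, w, f, k, stride=1, pad=0):
    # Height/width shrink by the filter size, grow with padding, step by stride.
    out_h = (h - f + 2 * pad) // stride + 1
    out_w = (w - f + 2 * pad) // stride + 1
    return out_h, out_w, k              # output depth = number of filters K

print(conv_output_shape(48, 48, f=3, k=32))   # -> (46, 46, 32)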
4. DESIGN
4.1 Structure Chart:
A structure chart (SC) in software engineering and organizational theory is a chart which shows the
breakdown of a system to its lowest manageable levels. Structure charts are used in structured
programming to arrange program modules into a tree; each module is represented by a box containing
the module's name.
A UML diagram is a partial graphical representation (view) of a model of a system under design,
implementation, or already in existence. UML diagram contains graphical elements (symbols) - UML
nodes connected with edges (also known as paths or flows) - that represent elements in the UML model
of the designed system. The UML model of the system might also contain other documentation such as
use cases written as templated texts.
The kind of diagram is defined by the primary graphical symbols shown on it. For example, a
diagram where the primary symbols in the contents area are classes is a class diagram; a diagram
which shows use cases and actors is a use case diagram; and a sequence diagram shows a sequence
of message exchanges between lifelines.
UML specification does not preclude mixing of different kinds of diagrams, e.g. to combine structural
and behavioral elements to show a state machine nested inside a use case. Consequently, the
boundaries between the various kinds of diagrams are not strictly enforced. At the same time, some
UML Tools do restrict set of available graphical elements which could be used when working on
specific type of diagram.
UML specification defines two major kinds of UML diagram: structure diagrams and behavior
diagrams.
Structure diagrams show the static structure of the system and its parts on different abstraction and
implementation levels and how they are related to each other. The elements in a structure diagram
represent the meaningful concepts of a system, and may include abstract, real world and
implementation concepts.
Behavior diagrams show the dynamic behavior of the objects in a system, which can be described as a
series of changes to the system over time.
In the Unified Modelling Language (UML), a use case diagram can summarize the details of your
system's users (also known as actors) and their interactions with the system. To build one, you'll use a
set of specialized symbols and connectors. An effective use case diagram can help your team discuss
and represent:
• Scenarios in which your system or application interacts with people, organizations, or external
systems.
• Goals that your system or application helps those entities (known as actors) achieve.
• The scope of your system.
In the Unified Modelling Language (UML), a sequence diagram is a type of interaction diagram
because it describes how, and in what order, a group of objects works together. These diagrams are
used by software developers and business professionals to understand requirements for a new system
or to document an existing process. Sequence diagrams are sometimes known as event diagrams or
event scenarios.
Sequence diagrams can be useful references for businesses and other organizations. Try drawing a
sequence diagram to:
• Represent the details of a UML use case.
• Model the logic of a sophisticated procedure, function, or operation.
• See how objects and components interact with each other to complete a process.
• Plan and understand the detailed functionality of an existing or future scenario.
Collaboration diagrams are used to show how objects interact to perform the behavior of a particular
use case, or a part of a use case. Along with sequence diagrams, collaboration diagrams are used by
designers to define and clarify the roles of the objects that perform a particular flow of events of a use
case. They are the primary source of information used in determining class responsibilities and
interfaces.
Collaboration diagrams are used when it is essential to depict the relationship between objects. Both
sequence and collaboration diagrams represent the same information, but the way of portraying it is
quite different; collaboration diagrams are best suited for analyzing use cases.
A flowchart is a type of diagram that represents a workflow or process. A flowchart can also be defined
as a diagrammatic representation of an algorithm, a step-by-step approach to solving a task. The
flowchart shows the steps as boxes of various kinds, and their order by connecting the boxes with
arrows.
A component diagram is a special kind of diagram in UML, and its purpose is different from all the other
diagrams discussed so far. It does not describe the functionality of the system; it describes the
components used to realize those functionalities. Component diagrams are used in modeling the physical
aspects of object-oriented systems: for visualizing, specifying, and documenting component-based
systems, and also for constructing executable systems through forward and reverse engineering.
Component diagrams are essentially class diagrams that focus on a system's components and are often
used to model the static implementation view of a system.
5. DATASET DETAILS
The FER-2013 dataset was prepared by Pierre-Luc Carrier and Aaron Courville as part of an ongoing
research project. They graciously provided the workshop organizers with a preliminary version of
their dataset to use for this contest.
The data consists of 48x48-pixel grayscale images of faces. The faces have been automatically
registered so that each face is more or less centered and occupies about the same amount of space in each
image. The task is to categorize each face, based on the emotion shown in the facial expression, into
one of seven categories (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral).
train.csv contains two columns, "emotion" and "pixels". The "emotion" column contains a numeric
code ranging from 0 to 6, inclusive, for the emotion present in the image. The "pixels" column
contains a quoted string for each image; the contents of this string are space-separated
pixel values in row-major order. test.csv contains only the "pixels" column, and the task is to predict
the emotion column.
This dataset consists of 35,887 grayscale images. The training set consists of 28,709 examples. The
public test set consists of 3,589 examples.
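To make the format concrete, here is a small sketch of decoding one "pixels" string into a 48x48 array (the pixel row shown is an illustrative stand-in, not a real dataset row):

import numpy as np

# One "pixels" cell is a space-separated string of 48*48 = 2304 gray values
# in row-major order; this stand-in row uses a constant value for brevity.
pixels = " ".join(["128"] * (48 * 48))
img = np.array(pixels.split(" "), dtype="float32").reshape(48, 48)
print(img.shape)   # (48, 48)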
Keras: Keras is an open-source neural network library written in Python that runs on top of Theano
or TensorFlow. It is designed to be modular, fast, and easy to use. It was developed by François
Chollet, a Google engineer. Keras doesn't handle low-level computation; instead, it delegates it to
another library, called the "backend". Keras is a high-level API wrapper for the low-level API, capable
of running on top of TensorFlow, CNTK, or Theano. The Keras high-level API handles the way we
make models, define layers, or set up multiple input-output models. At this level, Keras also
compiles our model with loss and optimizer functions and runs the training process with the fit function.
Keras in Python doesn't handle low-level API tasks such as building the computational graph or making
tensors and other variables, because these are handled by the "backend" engine.
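As a rough illustration of this high-level workflow (a minimal sketch on random dummy data, not the project's actual model):

import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense

# Dummy data: 100 samples with 10 features each, spread over 7 classes.
x = np.random.rand(100, 10)
y = keras.utils.to_categorical(np.random.randint(0, 7, 100), 7)

# Define layers, compile with a loss and an optimizer, then train with fit():
# the three high-level steps described above.
model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(10,)))
model.add(Dense(7, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
model.fit(x, y, epochs=3, verbose=0)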
PyWebIO: PyWebIO is a Python library that allows you to build simple web applications without
knowledge of HTML and JavaScript. PyWebIO can also be easily integrated into existing web
services such as Flask or Django. It also provides support for click events, layout, etc. PyWebIO aims
to let you interact with the user with as little code as possible while still providing a good user
experience.
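For context, a minimal PyWebIO application might look like the sketch below (an illustration, not code from the project; the port simply mirrors the main program shown later in this report):

from pywebio import start_server
from pywebio.input import input
from pywebio.output import put_text

def app():
    # Ask for a name in the browser and echo a greeting back, all from Python.
    name = input("What is your name?")
    put_text("Hello, %s!" % name)

if __name__ == '__main__':
    start_server(app, port=8080)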
1. Capturing the user's image using OpenCV:
import cv2

def screenshot():
    cam = cv2.VideoCapture(0)        # open the default webcam
    cv2.namedWindow("test")
    #img_counter = 0
    while True:
        ret, frame = cam.read()      # read one frame (this call is missing in the source)
        if not ret:
            break
        cv2.imshow("test", frame)
        k = cv2.waitKey(1)
        break                        # keep only the first frame
    img_name = "test.jpg"
    cv2.imwrite(img_name, frame)     # save the captured frame
    print("Picture Captured!")
    cam.release()
    cv2.destroyAllWindows()

if __name__ == '__main__':
    screenshot()
2. Loading the FER-2013 dataset:
import numpy as np
import keras
import tensorflow as tf

#variables
num_classes = 7  #angry, disgust, fear, happy, sad, surprise, neutral
#------------------------------
with open("fer2013/fer2013.csv") as f:
    content = f.readlines()
lines = np.array(content)
num_of_instances = lines.size

# Split each CSV row into its label, pixel string and usage flag.
# (the parsing lines below are reconstructed; the source extraction dropped them)
x_train, y_train, x_test, y_test = [], [], [], []
for i in range(1, num_of_instances):
    try:
        emotion, img, usage = lines[i].split(",")
        pixels = np.array(img.split(" "), 'float32')
        emotion = keras.utils.to_categorical(emotion, num_classes)
        if 'Training' in usage:
            y_train.append(emotion)
            x_train.append(pixels)
        else:
            y_test.append(emotion)
            x_test.append(pixels)
    except:
        print("", end="")

x_train = np.array(x_train, 'float32')
y_train = np.array(y_train, 'float32')
x_test = np.array(x_test, 'float32')
y_test = np.array(y_test, 'float32')
x_train /= 255
x_test /= 255
CNN Structure:
#construct CNN structure
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.preprocessing.image import ImageDataGenerator

model = Sequential()
# (the convolutional feature-extraction layers are elided in the source;
#  only the fully connected classifier head survives in the extract)
model.add(Flatten())
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))  # output layer (reconstructed)

gen = ImageDataGenerator()   # data-augmentation generator
#------------------------------
model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
Evaluation:
#Evaluation
train_score = model.evaluate(x_train, y_train, verbose=0)
print('Train accuracy:',100*train_score[1])
3. Predicting the emotion from the captured face:
import cv2
import numpy as np
import matplotlib.pyplot as plt
from keras.models import load_model
from keras.preprocessing import image
import spot as sp

model = load_model('model100.h5')   # the CNN trained and saved above

def emotion_analysis(emotions):
    objects = ('angry', 'disgust', 'fear', 'happy', 'sad', 'surprise', 'neutral')
    y_pos = np.arange(len(objects))
    plt.bar(y_pos, emotions, align='center')   # bar chart of class scores (reconstructed)
    plt.xticks(y_pos, objects)
    plt.ylabel('percentage')
    plt.title('emotion')
    res = max(emotions)
    j = 0
    for i in emotions:
        if i == res:
            break
        else:
            j = j + 1
    Emotion = str(objects[j])
    print('Emotion: ' + Emotion + ' ' + str(res * 100))
    plt.show()
    return Emotion
def facecrop(image):
    facedata = "haarcascade_frontalface_default.xml"
    cascade = cv2.CascadeClassifier(facedata)
    img = cv2.imread(image)
    try:
        minisize = (img.shape[1], img.shape[0])
        miniframe = cv2.resize(img, minisize)   # (this line is missing in the source)
        faces = cascade.detectMultiScale(miniframe)
        for f in faces:
            x, y, w, h = [v for v in f]
            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
            sub_face = img[y:y + h, x:x + w]
            cv2.imwrite('capture.jpg', sub_face)   # save the cropped face
            #print ("Writing: " + image)
    except Exception as e:
        print(e)
    #cv2.imshow(image, img)
if __name__ == '__main__':
    #Testing a file.
    facecrop('test.jpg')
    file = 'capture.jpg'
    true_image = image.load_img(file)
    img = image.load_img(file, grayscale=True, target_size=(48, 48))  # (missing in the source; mirrors the main code later)
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x /= 255
    custom = model.predict(x)
    final_Emotion = emotion_analysis(custom[0])
    x = np.array(x, 'float32')
    x = x.reshape([48, 48])
    plt.gray()
    plt.imshow(true_image)
    plt.show()
    print('\n------------------------------------------------------------------\n')
    print('Playlists Generated By Using The Emotion : ' + final_Emotion)
    print('\n------------------------------------------------------------------\n')
    final_list = sp.songs_by_emotion(final_Emotion)
    sp.print_songs(final_list)
4. Using the Spotify API to generate a playlist according to the user's emotion:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(
    client_id="4e7b220ab55e4887a6f275a639cd08a7",
    client_secret="a181a5078b0143b5a43d7f7e2883497f"))

playlist_limit = 5
song_limit_per_playlist = 20

def songs_by_emotion(emotion):
    # Search Spotify for playlists whose names match the detected emotion.
    results = sp.search(q=emotion, type='playlist', limit=playlist_limit)
    gs = []
    for el in results['playlists']['items']:
        temp = {}
        temp['playlist_name'] = el['name']
        temp['playlist_href'] = el['href']
        temp['playlist_id'] = el['id']
        temp['playlist_spotify_link'] = el['external_urls']['spotify']
        gs.append(temp)
    fnl_playlist_songs = gs
    for i in range(0, len(gs)):
        res = sp.playlist(gs[i]['playlist_id'])   # fetch the playlist's tracks (this call is missing in the source)
        srn = res['tracks']['items'][0:song_limit_per_playlist]
        tlist = []
        for el in srn:
            tlist.append(el['track']['name'])
        fnl_playlist_songs[i]['playlist_songs'] = tlist
    return fnl_playlist_songs

def print_songs(fnl_playlist_songs):
    for el in fnl_playlist_songs:
        print('playlist_songs : ')
        for i in range(0, len(el['playlist_songs'])):
            print(el['playlist_songs'][i])        # (the song print itself is missing in the source)
        print('-----------------------------------------------')
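A hedged usage sketch, assuming the module above is saved as spot.py (which the imports elsewhere in this report suggest):

import spot as sp

# Fetch up to 5 playlists matching the detected emotion and print their songs.
playlists = sp.songs_by_emotion('happy')
sp.print_songs(playlists)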
5. Main Code:
import io
import cv2
import numpy as np
import matplotlib
matplotlib.use('Agg')            # render plots to files, not windows (assumption for server use)
import matplotlib.pyplot as plt
from keras.models import load_model
from keras.preprocessing import image
from pywebio import start_server
from pywebio.output import put_html   # (assumption; the source's output helper is elided)
import spot as sp

model = load_model('model100.h5')

def emotion_analysis(emotions):
    objects = ('Angry', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral')
    y_pos = np.arange(len(objects))
    plt.xticks(y_pos, objects)
    plt.title('emotion')
    plt.savefig('graph.png')
    res = max(emotions)
    j = 0
    for i in emotions:
        if i == res:
            break
        else:
            j = j + 1
    Emotion = str(objects[j])
    accuracy = str(res * 100)
    return Emotion, accuracy     # (the return is elided in the source)

def facecrop(image_path):        # (the def line and image load are elided in the source)
    facedata = "haarcascade_frontalface_default.xml"
    cascade = cv2.CascadeClassifier(facedata)
    img = cv2.imread(image_path)
    try:
        minisize = (img.shape[1], img.shape[0])
        miniframe = cv2.resize(img, minisize)
        faces = cascade.detectMultiScale(miniframe)
        print(faces)
        for f in faces:
            x, y, w, h = [v for v in f]
            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
            sub_face = img[y:y + h, x:x + w]
            cv2.imwrite('capture.jpg', sub_face)
            #print ("Writing: " + image_path)
    except Exception as e:
        print(e)

def main():
    # How 'data' (the browser camera snapshot) arrives is elided in the source;
    # the commented-out lines suggest it was a JSON-encoded data URL.
    data = data['dataurl']
    # img = json.loads(info)['dataurl']
    # webout.put_image(data)
    facecrop(data)
    file = 'capture.jpg'
    true_image = image.load_img(file)
    img = image.load_img(file, grayscale=True, target_size=(48, 48))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x /= 255
    custom = model.predict(x)
    final_Emotion, final_Accuracy = emotion_analysis(custom[0])
    html = ""  # webout.put_image(open('graph.png', 'rb').read())
    blob_img = None
    # print(custom[0])
    emotion_html = ("<center style='font-size:20px;font-weight:bold'>Emotion Detected: "
                    + final_Emotion + "<br />Accuracy : " + final_Accuracy + "</center><br />")
    html += emotion_html
    final_list = sp.songs_by_emotion(final_Emotion)
    playlist_html = ""
    playlist_html += '<center><br/>---------------------------------------------------------------------------------------------------<br />'
    playlist_html += "<h2 style='font-weight:bold'>PLAYLISTS GENERATED BY USING THE EMOTION</h2>"  # (string completed; truncated in the source)
    playlist_html += '---------------------------------------------------------------------------------------------------<br /><center>'
    currentPlaylist = 0
    for el in final_list:
        currentPlaylist += 1
        playlist_html += "<h2 style='text-align:center;'>PLAYLIST - " + str(currentPlaylist) + ":</h2>"
        playlist_html += "</tr>"
        for i in range(0, len(el['playlist_songs'])):
            playlist_html += "<tr>"
            playlist_html += ("<td style='width:8%;text-align:center;border:1px solid black;"
                              "border-collapse: collapse;padding: 10px;'>" + str(i + 1) + '. '
                              + el['playlist_songs'][i] + "</td>")   # (song name added; the cell is truncated in the source)
    html += playlist_html
    put_html(html)               # render the assembled HTML in the browser

if __name__ == '__main__':
    start_server(main, debug=True, port=8080, cdn=False)
[Sample runs (screenshots in the original report): two input face images and the
playlists generated as output. Readings shown include "Emotion Detected: Sad,
Accuracy: 99.99" and an accuracy measure of 84.611.]
An epoch is one complete pass of all the training images through the network, forward and backward.
Usually we feed a neural network the training data for more than one epoch, in different orders, so that
it generalizes better when given unseen input data. With a large but finite training dataset, multiple
epochs give the network a chance to revisit earlier data and readjust the model parameters, so the model
is not biased towards the last few data points seen during training.
The loss is simply the prediction error of the neural network, and the method used to calculate it is
called the loss function. A loss function is used to optimize the machine learning algorithm. The loss is
calculated on the training and validation sets, and its interpretation is based on how well the model is
doing on these two sets: it is the sum of the errors made for each example in the training or validation
set. The loss value indicates how poorly or how well a model behaves after each iteration of
optimization.
An accuracy metric is used to measure the algorithm's performance in an interpretable way. The
accuracy of a model is usually determined after the model parameters are learned and is expressed as a
percentage. It is the measure of how closely the model's predictions match the true data, i.e., the
training data.
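To make these three terms concrete, here is a minimal sketch, assuming the model and training arrays built in the code sections above, of how per-epoch loss and accuracy are read back from Keras:

# Train for several epochs; Keras records per-epoch loss and accuracy.
history = model.fit(x_train, y_train, epochs=5, verbose=0)

# (the metric key may be 'acc' instead of 'accuracy' in older Keras versions)
for epoch, (loss, acc) in enumerate(zip(history.history['loss'],
                                        history.history['accuracy']), start=1):
    print('epoch %d: loss=%.4f accuracy=%.2f%%' % (epoch, loss, acc * 100))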
6.5 TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the functionality of
components, subassemblies, assemblies, and/or a finished product. It is the process of exercising
software with the intent of ensuring that the software system meets its requirements and user
expectations and does not fail in an unacceptable manner. There are various types of tests, and each
test type addresses a specific requirement.
TYPES OF TESTING
1. Unit Testing
Unit testing involves the design of test cases that validate that the internal program logic is functioning
properly and that program inputs produce valid outputs. All decision branches and internal code
flow should be validated. It is the testing of individual software units of the application, and it is done
after the completion of an individual unit before integration. This is structural testing that relies on
knowledge of the unit's construction and is invasive. Unit tests perform basic tests at the component
level and test a specific business, application, and/or system configuration. Unit tests ensure that each
unique path of a business process performs accurately to the documented specifications and contains
clearly defined inputs and expected results.
2. Integration Testing
Integration tests are designed to test integrated software components to determine whether they actually
run as one program. Testing is event driven and is more concerned with the basic outcome of screens or
fields. Integration tests demonstrate that although the components were individually satisfactory, as
shown by successful unit testing, the combination of components is correct and consistent.
Integration testing is specifically aimed at exposing the problems that arise from the combination of
components.
3. Functional Testing
Functional tests provide systematic demonstrations that the functions tested are available as specified
by the business and technical requirements, system documentation, and user manuals.
Organization and preparation of functional tests is focused on requirements, key functions, or special
test cases. In addition, systematic coverage pertaining to identified business process flows, data fields,
predefined processes, and successive processes must be considered for testing. Before functional testing
is complete, additional tests are identified and the effective value of the current tests is determined.
4. System Test
System testing ensures that the entire integrated software system meets requirements. It tests a
configuration to ensure known and predictable results. An example of system testing is the
configuration-oriented system integration test.
5. White Box Testing
White box testing is testing in which the software tester has knowledge of the inner workings, structure,
and language of the software, or at least its purpose. It is used to test areas that cannot be reached from
a black-box level.
6. Black Box Testing
Black box testing is testing the software without any knowledge of the inner workings, structure, or
language of the module being tested. Black box tests, like most other kinds of tests, must be written
from a definitive source document, such as a specification or requirements document. It is testing in
which the software under test is treated as a black box: you cannot see into it. The test provides inputs
and responds to outputs without considering how the software works.
Test plan:
A document describing the scope, approach, resources, and schedule of intended test activities. It
identifies, amongst other things, the test items, the features to be tested, the testing tasks, who will do
each task, the degree of tester independence, the test environment, the test design techniques, the entry
and exit criteria to be used and the rationale for their choice, and any risks requiring contingency
planning. It
is a record of the test planning process. Follow the steps below to create a test plan as per IEEE 829:
Analyze the system: A system/product can be analyzed only when the tester has information
about it, i.e., how the system works, who the end users are, what software/hardware the system uses,
and what the system is for.
Design the Test Strategy: Design a test strategy for all the different types of functionality and hardware
by determining the effort and cost incurred to achieve the objectives of the system.
Define the Test Objectives: A test objective is the overall goal and achievement of the test execution.
Objectives are defined in such a way that the system is bug-free and ready to be used
by the end-users. Test objectives can be defined by identifying the software features that need to be
tested and the goal each of these features needs to achieve for the test to be counted as successful.
Define Test Criteria: Test criteria are standards or rules on which a test procedure or test judgment
can be based. There are two such test criteria: suspension criteria, where if a specific number of test
cases fail the tester should suspend all active test cycles until the criteria are resolved, and exit
criteria, which specify what denotes the successful completion of a test phase.
Resource Planning: A resource plan is a detailed summary of all types of resources required to
complete the project tasks. Resources could be the people, equipment, and materials needed to complete
a project.
Plan Test Environment: A testing environment is a setup of software and hardware on which the
testing team is going to execute test cases.
Schedule & Estimation: Preparing a schedule for the different testing stages and estimating the time
and manpower needed to test the system is mandatory to mitigate the risk of not completing the project
within the deadline. It includes creating the test specification, test execution, test report, and test delivery.
CONCLUSION:
In this project we generate a playlist according to the emotion of the user. We developed an
application that predicts the user's emotion using convolutional neural networks, and we used the
Spotify API to generate the playlist. We applied it to various images and achieved an accuracy
above 80%.
FUTURE WORK:
We want to extend our work by creating a real-time music player which generates playlists and plays
songs according to the user's detected emotion.
8. APPENDIX
Python
Python is an interpreted, high-level, general-purpose programming language created by Guido van
Rossum and first released in 1991. Python's design philosophy emphasizes code readability with its
notable use of significant whitespace. Its language constructs and object-oriented approach aim to help
programmers write clear, logical code for small and large-scale projects. Python is dynamically typed
and garbage collected. It supports multiple programming paradigms, including procedural, object-
oriented, and functional programming.
OpenCV
OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-
time computer vision. Originally developed by Intel, it was later supported by Willow Garage and then
Itseez (which was later acquired by Intel). The library is cross-platform and free for use under the open-
source BSD license. OpenCV supports some models from deep learning frameworks like TensorFlow,
Torch, PyTorch (after converting to an ONNX model), and Caffe, according to a defined list of
supported layers. It promotes Open Vision Capsules, which is a portable format compatible with all
other formats.
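As a brief illustration of the OpenCV calls this project relies on (a minimal sketch; the image path and cascade file path are assumptions):

import cv2

# Load an image, convert it to grayscale, and detect faces with a Haar cascade,
# the same pattern used by facecrop() earlier in this report.
img = cv2.imread('test.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
faces = cascade.detectMultiScale(gray)
print('faces found:', len(faces))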
Keras
Keras is an open-source neural network library written in Python that runs on top of Theano or
TensorFlow. It is designed to be modular, fast, and easy to use. It was developed by François Chollet, a
Google engineer. Keras doesn't handle low-level computation; instead, it delegates it to another library,
called the "backend". Keras is a high-level API wrapper for the low-level API, capable of running on top
of TensorFlow, CNTK, or Theano. The Keras high-level API handles the way we make models, define
layers, or set up multiple input-output models. At this level, Keras also compiles our model with loss
and optimizer functions and runs the training process with the fit function. Keras in Python doesn't
handle low-level API tasks such as building the computational graph or making tensors and other
variables, because these are handled by the "backend" engine.
NumPy
NumPy is a library for the Python programming language, adding support for large, multi-dimensional
arrays and matrices, along with a large collection of high-level mathematical functions to operate on
these arrays. The ancestor of NumPy, Numeric, was originally created by Jim Hugunin with
contributions from several other developers. In 2005, Travis Oliphant created NumPy by incorporating
features of the competing Numarray into Numeric, with extensive modifications.
NumPy is open-source software and has many contributors.
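A minimal sketch of the kind of array handling NumPy provides in this project (the shape mirrors the 48x48 FER-2013 images; the values are illustrative):

import numpy as np

# A 48x48 "image" of pixel values, normalized to [0, 1] just as the training
# code above normalizes FER-2013 pixels.
pixels = np.arange(48 * 48, dtype='float32').reshape(48, 48)
pixels /= pixels.max()
print(pixels.shape, pixels.mean())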
PyWebIO
PyWebIO is a Python library that allows you to build simple web applications without knowledge
of HTML and JavaScript. PyWebIO can also be easily integrated into existing web services such as
Flask or Django. It also provides support for click events, layout, etc. PyWebIO aims to let you
interact with the user with as little code as possible while still providing a good user experience.
9. REFERENCES
1. S. L. Happy and A. Routray, "Automatic facial expression recognition using features of salient
facial patches," IEEE Transactions on Affective Computing, vol. 6, no. 1, pp. 1-12, Jan.-March 2015.
2. Rahul Ravi, S. V. Yadhukrishna, Rajalakshmi, Prithviraj, "A Face Expression Recognition Using
CNN & LBP," 2020 IEEE.
3. S. Deebika, K. A. Indira, Jesline, "A Machine Learning Based Music Player by Detecting
Emotions," 2019 IEEE.