Data Science CBSE Notes


SCOTLE High School

Grade 10
Data Science - notes
Data Sciences
Introduction
As we have discussed earlier in Class 9, Artificial Intelligence is a technology which completely depends on data. It is the data fed into the machine which makes it intelligent. Depending upon the type of data we have, AI can be classified into three broad domains:

• Data Sciences (Data): working around numeric and alpha-numeric data.
• Computer Vision (CV): working around image and visual data.
• Natural Language Processing (NLP): working around textual and speech-based data.

Each domain has its own type of data which gets fed into the machine and hence has its own way of
working around it. Talking about Data Sciences, it is a concept to unify statistics, data analysis, machine
learning and their related methods in order to understand and analyse actual phenomena with data.
It employs techniques and theories drawn from many fields within the context of Mathematics,
Statistics, Computer Science, and Information Science.
Now before we get into the concepts of Data Sciences, let us experience this domain with the help of
the following game:

* Rock, Paper & Scissors: https://www.afiniti.com/corporate/rock-paper-scissors

Go to this link and try to play the game of Rock, Paper Scissors against an AI model. The challenge here
is to win 20 games against AI before AI wins them against you.
Did you manage to win?

__________________________________________________________________________________
__________________________________________________________________________________
What was the strategy that you applied to win this game against the AI machine?

__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Was it different playing Rock, Paper & Scissors with an AI machine as compared to a human?

__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
What approach was the machine following while playing against you?

__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________

Applications of Data Sciences


Data Science is not a new field. Data Sciences majorly work around analysing data, and when it comes to AI, this analysis helps in making the machine intelligent enough to perform tasks by itself. There exist various applications of Data Science in today's world. Some of them are:

Fraud and Risk Detection*: The earliest applications of data science were in finance. Companies were fed up with bad debts and losses every year. However, they had a lot of data which used to get collected during the initial paperwork while sanctioning loans. They decided to bring in data scientists in order to rescue them from losses.
Over the years, banking companies learned to divide and conquer data via customer profiling, past expenditures, and other essential variables to analyse the probabilities of risk and default. Moreover, it also helped them to push their banking products based on the customer's purchasing power.

Genetics & Genomics*: Data Science applications also enable an advanced level of treatment personalization through research in genetics and genomics. The goal is to understand the impact of DNA on our health and find individual biological connections between genetics, diseases, and drug response.
Data science techniques allow integration of different kinds of
data with genomic data in disease research, which provides a
deeper understanding of genetic issues in reactions to particular
drugs and diseases. As soon as we acquire reliable personal
genome data, we will achieve a deeper understanding of the
human DNA. The advanced genetic risk prediction will be a major step towards more individual care.

* Images shown here are the property of individual organisations and are used here for reference purpose only.
Internet Search*: When we talk about search engines, we think 'Google'. Right? But there are many other search engines like Yahoo, Bing, Ask, AOL, and so on. All these search engines (including Google) make use of data science algorithms to deliver the best result for our searched query in a fraction of a second. Considering the fact that Google processes more than 20 petabytes of data every day, had there been no data science, Google wouldn't have been the 'Google' we know today.

Targeted Advertising*: If you thought Search would have been the biggest of all data science applications, here is a challenger – the entire digital marketing spectrum. Starting from the display banners on various websites to the digital billboards at the airports – almost all of them are decided by using data science algorithms. This is the reason why digital ads have been able to get a much higher CTR (Click-Through Rate) than traditional advertisements. They can be targeted based on a user's past behaviour.

Website Recommendations*: Aren't we all used to the suggestions about similar products on Amazon? They not only help us find relevant products from the billions available but also add a lot to the user experience. A lot of companies have fervently used this engine to promote their products in accordance with the user's interest and relevance of information. Internet giants like Amazon, Twitter, Google Play, Netflix, LinkedIn, IMDb and many more use this system to improve the user experience. The recommendations are made based on previous search results for a user.

Airline Route Planning*: The airline industry across the world is known to bear heavy losses. Except for a few airline service providers, companies are struggling to maintain their occupancy ratio and operating profits. With the high rise in air-fuel prices and the need to offer heavy discounts to customers, the situation has got worse. It wasn't long before airline companies started using Data Science to identify the strategic areas of improvement. Now, using Data Science, the airline companies can:

• Predict flight delays
• Decide which class of airplanes to buy
• Decide whether to directly land at the destination or take a halt in between (for example, a flight can have a direct route from New Delhi to New York, or it can choose to halt in some country along the way)
• Effectively drive customer loyalty programs

Getting Started
Data Sciences is a combination of Python and Mathematical concepts like Statistics, Data Analysis,
probability, etc. Concepts of Data Science can be used in developing applications around AI as it
gives a strong base for data analysis in Python.
Revisiting AI Project Cycle
But, before we get deeper into data analysis, let us recall how Data Sciences can be leveraged to solve
some of the pressing problems around us. For this, let us understand the AI project cycle framework
around Data Sciences with the help of an example.
Do you remember the AI Project Cycle?

Fill in all the stages of the cycle here:


The Scenario*

Humans are social animals. We tend to organise and/or participate in various kinds of social gatherings
all the time. We love eating out with friends and family because of which we can find restaurants
almost everywhere and out of these, many of the restaurants arrange for buffets to offer a variety of
food items to their customers. Be it small shops or big outlets, every restaurant prepares food in bulk
as they expect a good crowd to come and enjoy their food. But in most cases, after the day ends, a lot
of food is left which becomes unusable for the restaurant as they do not wish to serve stale food to
their customers the next day. So, every day, they prepare food in large quantities keeping in mind the
probable number of customers walking into their outlet. But if the expectations are not met, a good
amount of food gets wasted which eventually becomes a loss for the restaurant as they either have
to dump it or give it to hungry people for free. And if this daily loss is taken into account for a year, it
becomes quite a big amount.
Problem Scoping

Now that we have understood the scenario well, let us take a deeper look into the problem to find out
more about various factors around it. Let us fill up the 4Ws problem canvas to find out.
Who Canvas – Who is having the problem?

Who are the stakeholders?
o Restaurants offering buffets
o Restaurant chefs

What do we know about them?
o Restaurants cook food in bulk every day for their buffets to meet their customers' needs.
o They estimate the number of customers that would walk into their restaurant every day.

What Canvas – What is the nature of their problem?

What is the problem?
o Quite a large amount of food is left over unconsumed every day at the restaurant, which is either thrown away or given for free to needy people.
o Restaurants have to bear everyday losses for the unconsumed food.

How do you know it is a problem?
o Restaurant surveys have shown that restaurants face this problem of food waste.

Where Canvas – Where does the problem arise?

What is the context/situation in which the stakeholders experience this problem?
o Restaurants which serve buffet food
o At the end of the day, when no further food consumption is possible

Why? – Why do you think it is a problem worth solving?

What would be of key value to the stakeholders?
o If the restaurant has a proper estimate of the quantity of food to be prepared every day, food wastage can be reduced.

How would it improve their situation?
o Less or no food would be left unconsumed.
o Losses due to unconsumed food would reduce considerably.

Now that we have noted down all the factors around our problem, let us fill up the problem statement
template.

Our (Who?) – Restaurant owners
Have a problem of (What?) – Losses due to food wastage
While (Where?) – The food is left unconsumed due to improper estimation
An ideal solution would (Why?) – Be to be able to predict the amount of food to be prepared for every day's consumption

The Problem statement template leads us towards the goal of our project which can now be stated
as:

“To be able to predict the quantity of food dishes to be prepared for everyday consumption in restaurant buffets.”
Data Acquisition
After finalising the goal of our project, let us now move towards looking at various data features which
affect the problem in some way or the other. Since any AI-based project requires data for testing and
training, we need to understand what kind of data is to be collected to work towards the goal. In our
scenario, various factors that would affect the quantity of food to be prepared for the next day
consumption in buffets would be:

• Quantity of dish prepared per day
• Total number of customers
• Dish consumption
• Unconsumed dish quantity per day
• Price of dish
• Quantity of dish for the next day
Now let us understand how these factors are related to our problem statement. For this, we can use
the System Maps tool to figure out the relationship of elements with the project’s goal. Here is the
System map for our problem statement.
In this system map, you can see how the relationship of each element is defined with the goal of our
project. Recall that the positive arrows determine a direct relationship of elements while the negative
ones show an inverse relationship of elements.
After looking at the factors affecting our problem statement, now it’s time to take a look at the data
which is to be acquired for the goal. For this problem, a dataset covering all the elements mentioned
above is made for each dish prepared by the restaurant over a period of 30 days. This data is collected
offline in the form of a regular survey since this is a personalised dataset created just for one
restaurant’s needs. Specifically, the data collected comes under the following categories: Name of the
dish, Price of the
dish, Quantity of dish produced per day, Quantity of dish left unconsumed per day, Total number of
customers per day, Fixed customers per day, etc.
Data Exploration

After creating the database, we now need to look at the data collected and understand what is
required out of it. In this case, since the goal of our project is to be able to predict the quantity
of food to be prepared for the next day, we need to have the following data:

• Name of dish
• Quantity of that dish prepared per day
• Quantity of unconsumed portion of the dish per day

Thus, we extract the required information from the curated dataset and clean it up in such a way that
there exist no errors or missing elements in it.
Modelling
Once the dataset is ready, we train our model on it. In this case, a regression model is chosen in which
the dataset is fed as a dataframe and is trained accordingly. Regression is a Supervised Learning model
which takes in continuous values of data over a period of time. Since in our case the data which we
have is a continuous data of 30 days, we can use the regression model so that it predicts the next
values to it in a similar manilr. In this case, the dataset of 30 days is divided in a ratio of 2:1 for training
and testing respectively. In this case, the model is first trained on the 20-day data and then gets
evaluated for the rest of the 10 days.
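To see how this could look in Python code, here is a minimal sketch of the modelling step. The file name, the column names and the use of scikit-learn's LinearRegression are assumptions made only for illustration; they are not prescribed by the project described above.

# A minimal sketch of the modelling step, assuming a hypothetical CSV file
# "buffet_data.csv" with the columns named below.
import pandas as pd
from sklearn.linear_model import LinearRegression

# Load the 30-day dataset as a dataframe
data = pd.read_csv("buffet_data.csv")

# Input features: quantity prepared and quantity left unconsumed per day
X = data[["quantity_prepared", "quantity_unconsumed"]]
# Target: quantity to be prepared for the next day
y = data["quantity_next_day"]

# 2:1 split: first 20 days for training, last 10 days for testing
X_train, X_test = X[:20], X[20:]
y_train, y_test = y[:20], y[20:]

# Train a simple regression model on the training data
model = LinearRegression()
model.fit(X_train, y_train)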
Evaluation
Once the model has been trained on the training dataset of 20 days, it is now time to see if the model
is working properly or not. Let us see how the model works and how is it tested.
Step 1: The trained model is fed data regarding the name of the dish and the quantity produced for the same.
Step 2: It is then fed data regarding the quantity of food left unconsumed for the same dish on previous occasions.
Step 3: The model then works upon the entries according to the training it got at the modelling stage.
Step 4: The model predicts the quantity of food to be prepared for the next day.
Step 5: The prediction is compared to the testing dataset value. From the testing dataset, ideally, we can say that the quantity of food to be produced for the next day's consumption should be the total quantity minus the unconsumed quantity.
Step 6: The model is tested on the 10 days of testing data kept aside while training.
Step 7: The prediction values for the testing dataset are compared to the actual values.
Step 8: If the prediction values are the same as or close to the actual values, the model is said to be accurate. Otherwise, either the model selection is changed or the model is trained on more data for better accuracy.
Once the model is able to achieve optimum efficiency, it is ready to be deployed in the restaurant for real-time usage.
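Continuing the illustrative sketch from the Modelling section (same assumed column names), the comparison in Steps 5-8 could be expressed as follows:

# Predict the quantity to be prepared for each of the 10 testing days
predictions = model.predict(X_test)

# Mean absolute error: how far, on average, the predicted quantity is
# from the actual quantity in the testing data
from sklearn.metrics import mean_absolute_error
error = mean_absolute_error(y_test, predictions)
print("Average prediction error (in dish units):", error)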
Data Collection

Data collection is nothing new; it has been a part of our society for ages.
Even when people did not have fair knowledge of calculations, records were still maintained in some
way or the other to keep an account of relevant things. Data collection is an exercise which does not
require even a tiny bit of technological knowledge. But when it comes to analysing the data, it
becomes a tedious process for humans as it is all about numbers and alpha-numerical data. That is
where Data Science comes into the picture. It not only gives us a clearer idea around the dataset, but
also adds value to it by providing deeper and clearer analyses around it. And as AI gets incorporated
in the process, predictions and suggestions by the machine become possible on the same.
Now that we have gone through an example of a Data Science based project, we have a bit of clarity
regarding the type of data that can be used to develop a Data Science related project. For the data
domain-based projects, majorly the type of data used is in numerical or alpha-numerical format and
such datasets are curated in the form of tables. Such databases are very commonly found in any
institution for record maintenance and other purposes. Some examples of datasets which you must
already be aware of are:
Banks: Databases of loans issued, account holders, locker owners, employee registrations, bank visitors, etc.
ATM Machines: Usage details per day, cash denomination transaction details, visitor details, etc.
Movie Theatres: Movie details, tickets sold offline, tickets sold online, refreshment purchases, etc.

Now look around you and find out what are the different types of databases which are maintained in
the places mentioned below. Try surveying people who are responsible for the designated places to
get a better idea.

Your classroom Your school Your city


As you can see, all the type of data which has been mentioned above is in the form of tables. Tables
which contain numeric or alpha-numeric data. But this leads to a very critical dilemma: are these
datasets accessible to all? Should these databases be accessible to all? What are the various sources
of data from which we can gather such databases? Let’s find out!
Sources of Data
There exist various sources of data from where we can collect any type of data required and the data
collection process can be categorised in two ways: Offline and Online.

Offline Data Collection:
• Sensors
• Surveys
• Interviews
• Observations

Online Data Collection:
• Open-sourced Government Portals
• Reliable Websites (Kaggle)
• World Organisations' open-sourced statistical websites

While accessing data from any of the data sources, following points should be kept in mind:
1. Data which is available for public usage only should be taken up.
2. Personal datasets should only be used with the consent of the owner.
3. One should never breach someone’s privacy to collect data.
4. Data should only be taken from reliable sources, as the data collected from random sources
can be wrong or unusable.
5. Reliable sources of data ensure the authenticity of data which helps in proper training of the
AI model.

Types of Data
For Data Science, usually the data is collected in the form of tables. These tabular datasets can be
stored in different formats. Some of the commonly used formats are:
1. CSV: CSV stands for Comma Separated Values. It is a simple file format used to store tabular data. Each line of this file is a data record, and each record consists of one or more fields which are separated by commas. Since the values of records are separated by commas, they are known as CSV files.
2. Spreadsheet: A spreadsheet is a piece of paper or a computer program which is used for accounting and recording data using rows and columns into which information can be entered. Microsoft Excel is a program which helps in creating spreadsheets.
3. SQL: SQL, or Structured Query Language, is a domain-specific programming language designed for managing data held in different kinds of DBMS (Database Management Systems). It is particularly useful in handling structured data.
A lot of other formats of databases also exist, you can explore them online!
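To make the CSV format concrete, here is a small sketch. The file name and the dish records are made up for this illustration; the csv module shown is part of Python's standard library.

# What a CSV file from our restaurant scenario might look like
# (hypothetical file "dishes.csv"):
#
#   Name of dish,Price of dish,Quantity prepared,Quantity unconsumed
#   Paneer Tikka,250,30,5
#   Dal Makhani,180,40,8
#
# Reading the file with Python's built-in csv module:
import csv

with open("dishes.csv") as file:
    for record in csv.reader(file):   # each record is a list of field values
        print(record)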
Data Access

After collecting the data, to be able to use it for programming purposes, we should know how to
access the same in a Python code. To make our lives easier, there exist various Python packages which
help us in accessing structured data (in tabular form) inside the code. Let us take a look at some of
these packages:
NumPy
NumPy, which stands for Numerical Python, is the fundamental package for Mathematical and logical
operations on arrays in Python. It is a commonly used package when it comes to working around
numbers. NumPy gives a wide range of arithmetic operations around numbers giving us an easier
approach in working with them. NumPy also works with arrays, which are nothing but homogenous collections of data.
An array is nothing but a set of multiple values which are of same datatype. They can be numbers,
characters, booleans, etc. but only one datatype can be accessed through an array. In NumPy, the
arrays used are known as ND-arrays (N-Dimensional Arrays) as NumPy comes with a feature of
creating n-dimensional arrays in Python.
An array can easily be compared to a list. Let us take a look at how they are different:

NumPy Arrays:
1. Homogenous collection of data.
2. Can contain only one type of data, hence not flexible with datatypes.
3. Cannot be directly initialised; can be operated upon only with the NumPy package.
4. Direct numerical operations can be done. For example, dividing the whole array by 3 divides every element by 3.
5. Widely used for arithmetic operations.
6. Arrays take less memory space.
7. Functions like concatenation, appending, reshaping, etc. are not trivially possible with arrays.
8. Example: To create a NumPy array 'A':
   import numpy
   A = numpy.array([1,2,3,4,5,6,7,8,9,0])

Lists:
1. Heterogenous collection of data.
2. Can contain multiple types of data, hence flexible with datatypes.
3. Can be directly initialised as they are a part of Python syntax.
4. Direct numerical operations are not possible. For example, the whole list cannot be divided by 3.
5. Widely used for data management.
6. Lists acquire more memory space.
7. Functions like concatenation, appending, reshaping, etc. are trivially possible with lists.
8. Example: To create a list 'A':
   A = [1,2,3,4,5,6,7,8,9,0]
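The difference in point 4 of the comparison above can be seen with a small sketch (the numbers here are arbitrary):

# Direct numerical operations work element-wise on NumPy arrays but not on lists
import numpy

A_array = numpy.array([3, 6, 9, 12])
A_list = [3, 6, 9, 12]

print(A_array / 3)      # array([1., 2., 3., 4.]): every element divided by 3
# print(A_list / 3)     # would raise a TypeError: a list cannot be divided directly
print(A_list * 2)       # [3, 6, 9, 12, 3, 6, 9, 12]: repetition, not arithmetic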

Pandas
Pandas is a software library written for the Python programming language for data manipulation and
analysis. In particular, it offers data structures and operations for manipulating numerical tables and
time series. The name is derived from the term "panel data", an econometrics term for data sets that
include observations over multiple time periods for the same individuals.
Pandas is well suited for many different kinds of data:

• Tabular data with heterogeneously-typed columns, as in an SQL table or Excel spreadsheet


• Ordered and unordered (not necessarily fixed-frequency) time series data.
• Arbitrary matrix data (homogeneously typed or heterogeneous) with row and column labels
• Any other form of observational / statistical data sets. The data actually need not be labelled
at all to be placed into a Pandas data structure
The two primary data structures of Pandas, Series (1-dimensional) and DataFrame (2-dimensional),
handle the vast majority of typical use cases in finance, statistics, social science, and many areas of
engineering. Pandas is built on top of NumPy and is intended to integrate well within a scientific
computing environment with many other 3rd party libraries.
Here are just a few of the things that pandas does well:

• Easy handling of missing data (represented as NaN) in floating point as well as non-floating
point data
• Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional
objects
• Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or
the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the
data for you in computations
• Intelligent label-based slicing, fancy indexing, and subsetting of large data sets
• Intuitive merging and joining data sets
• Flexible reshaping and pivoting of data sets
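Here is a minimal sketch of the two primary data structures mentioned above, using made-up restaurant data purely for illustration:

import pandas as pd

# A Series is a 1-dimensional labelled array
prices = pd.Series([250, 180, 120], index=["Paneer Tikka", "Dal Makhani", "Naan"])

# A DataFrame is a 2-dimensional labelled table
data = pd.DataFrame({
    "Dish": ["Paneer Tikka", "Dal Makhani", "Naan"],
    "Quantity prepared": [30, 40, 100],
    "Quantity unconsumed": [5, 8, 12],
})

print(data.head())                        # look at the first rows of the table
print(data["Quantity unconsumed"].sum())  # total unconsumed quantity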

Matplotlib*
Matplotlib is an amazing visualization library in Python for 2D plots of arrays. Matplotlib is a multi-
platform data visualization library built on NumPy arrays. One of the greatest benefits of
visualization is that it allows us visual access to huge amounts of data in easily digestible visuals.
Matplotlib comes with a wide variety of plots. Plots help us to understand trends and patterns, and to make correlations. They're typically instruments for reasoning about quantitative information. Some types of graphs that we can make with this package are listed below:

Not just plotting, but you can also modify your plots the way you wish. You can stylise them and make
them more descriptive and communicable.
These packages help us in accessing the datasets we have and also in exploring them to develop a
better understanding of them.

Basic Statistics with Python
We have already understood that Data Sciences works around analysing data and performing tasks on it. For analysing the numeric & alpha-numeric data used in this domain, mathematics comes to our rescue. Basic statistical methods used in mathematics come in quite handy in Python too for analysing and working with such datasets. Statistical tools widely used in Python are:

Do you remember using these formulas in your class? Let us recall all of them here:

1. What is Mean? How is it calculated?


__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
2. What is Median? How is it calculated?

__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
3. What is Mode? How is it calculated?

__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
4. What is Standard Deviation? How is it calculated?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
5. What is Variance? How is it calculated?

__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
The advantage of using Python packages is that we do not need to write our own formula or equation to find out the results. There exist a lot of pre-defined functions in packages like NumPy which reduce this trouble for us. All we need to do is call that function and pass the data to it. It's that simple!
Let us take a look at various Python syntaxes that can help us with the statistical work in data analysis. Head to the Jupyter Notebook of Basic statistics with Python and start exploring! You may find the Jupyter notebook here: http://bit.ly/data_notebook
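As an illustration of such pre-defined functions, here is a minimal sketch using NumPy and Python's built-in statistics module on a made-up list of values (the notebook linked above may use different functions):

import numpy
import statistics

consumption = [20, 25, 25, 30, 35, 40, 25]   # made-up daily dish consumption values

print(numpy.mean(consumption))        # Mean: the average value
print(numpy.median(consumption))      # Median: the middle value when sorted
print(statistics.mode(consumption))   # Mode: the most frequently occurring value
print(numpy.std(consumption))         # Standard deviation: spread around the mean
print(numpy.var(consumption))         # Variance: square of the standard deviation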

Data Visualisation
While collecting data, it is possible that the data might come with some errors. Let us first take a look
at the types of issues we can face with data:
1. Erroneous Data: There are two ways in which the data can be erroneous:

• Incorrect values: The values in the dataset (at random places) are incorrect. For example, in
the column of phone number, there is a decimal value or in the marks column, there is a name
mentioned, etc. These are incorrect values that do not resemble the kind of data expected in
that position.
• Invalid or Null values: At some places, the values get corrupted and hence they become
invalid. Many times you will find NaN values in the dataset. These are null values which do not
hold any meaning and are not processible. That is why, these values (as and when
encountered) are removed from the database.
2. Missing Data: In some datasets, some cells remain empty. The values of these cells are missing and
hence the cells remain empty. Missing data cannot be interpreted as an error as the values here are
not erroneous or might not be missing because of any error.
3. Outliers: Data which does not fall in the range of a certain element is referred to as an outlier. To understand this better, let us take the example of the marks of students in a class. Let us assume that a student was absent for the exams and hence got 0 marks. If these marks are taken into account, the whole class's average would go down. To prevent this, the average is taken for the range of marks from highest to lowest, keeping this particular result separate. This makes sure that the average marks of the class are true to the data.
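As an illustration, here is a small sketch of how such issues might be handled with Pandas; the dataframe and its column names are made up for this example:

import pandas as pd
import numpy as np

data = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena", "John"],
    "Marks": [78, np.nan, 82, 0],   # NaN = invalid/missing value, 0 = possible outlier
})

# Drop records containing NaN (invalid or missing) values
cleaned = data.dropna()

# Keep the outlier (the absent student) aside while computing the average
average = cleaned[cleaned["Marks"] > 0]["Marks"].mean()
print(average)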
Analysing the data collected can be difficult as it is all about tables and numbers. While machines work
efficiently on numbers, humans need visual aid to understand and comprehend the information
passed. Hence, data visualisation is used to interpret the data collected and identify patterns and
trends out of it.
In Python, Matplotlib package helps in visualising the data and making some sense out of it. As we
have already discussed before, with the help of this package, we can plot various kinds of graphs.
Let us discuss some of them here:

Scatter Plot: Scatter plots are used to plot discontinuous data; that is, data which does not have any continuity in flow is termed as discontinuous. There exist gaps in the data which introduce discontinuity. A 2D scatter plot can display information for a maximum of up to 4 parameters.

In this scatter plot, the 2 axes (X and Y) are two different parameters. The colour of the circles and their size each represent another parameter. Thus, just through one coordinate on the graph, one can visualise 4 different parameters all at once.
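A minimal Matplotlib sketch of such a 4-parameter scatter plot, with made-up values, could look like this:

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]                  # parameter 1
y = [10, 24, 18, 30, 22]             # parameter 2
sizes = [40, 100, 60, 200, 80]       # parameter 3: size of each circle
colours = [0.1, 0.4, 0.6, 0.8, 1.0]  # parameter 4: colour of each circle

plt.scatter(x, y, s=sizes, c=colours)
plt.xlabel("Parameter 1")
plt.ylabel("Parameter 2")
plt.show()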

Bar Chart: It is one of the most commonly used graphical methods. From students to scientists, everyone uses bar charts in some way or the other. It is a very easy to draw yet informative graphical representation. Various versions of bar charts exist, like the single bar chart, double bar chart, etc.

This is an example of a double bar chart. The 2 axes depict two different parameters while bars of different colours work with different entities (in this case, women and men). A bar chart also works on discontinuous data and is made at uniform intervals.
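A minimal Matplotlib sketch of a double bar chart, with made-up values for the two entities, could look like this:

import matplotlib.pyplot as plt
import numpy as np

categories = ["2019", "2020", "2021"]
women = [30, 45, 50]
men = [25, 40, 55]

x = np.arange(len(categories))   # positions of the category groups
width = 0.35                     # width of each bar

plt.bar(x - width / 2, women, width, label="Women")
plt.bar(x + width / 2, men, width, label="Men")
plt.xticks(x, categories)
plt.legend()
plt.show()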

Histogram: Histograms are an accurate representation of continuous data. When it comes to plotting the variation in just one entity over a period of time, histograms come into the picture. They represent the frequency of the variable at different points of time with the help of bins.

In the given example, the histogram is showing the variation in frequency of the entity plotted with
the help of XY plane. Here, at the left, the frequency of the element has been plotted and it is a
frequency map for the same. The colours show the transition from low to high and vice versa. Whereas
on the right, a continuous dataset has been plotted which might not be talking about the frequency
of occurrence of the element.
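A minimal Matplotlib sketch of a histogram, using made-up daily customer counts, could look like this:

import matplotlib.pyplot as plt

customers_per_day = [52, 60, 58, 75, 80, 62, 55, 90, 85, 70,
                     65, 72, 68, 77, 83, 59, 61, 74, 88, 66]

plt.hist(customers_per_day, bins=5)   # group the values into 5 bins
plt.xlabel("Customers per day")
plt.ylabel("Frequency")
plt.show()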

Box Plot: When the data is split according to its percentile throughout the range, box plots come in handy. Box plots, also known as box and whiskers plots, conveniently display the distribution of data throughout the range with the help of 4 quartiles.

Here, as we can see, the plot contains a box, and the two lines at its left and right are termed whiskers. The plot has 5 different parts to it:
Quartile 1: From 0 percentile to 25th percentile – Here data lying between the 0 and 25th percentile is plotted. If the data is close together, say the 0 to 25th percentile data is covered in just a 20-30 marks range, then the whisker is small as the range is small. But if the range is large, say 0-30 marks, then the whisker gets elongated as the range is longer.
Quartile 2: From 25th percentile to 50th percentile – The 50th percentile is the median of the whole distribution, and since the data falling in the range of the 25th to 75th percentile has minimum deviation from it, it is plotted inside the box.
Quartile 3: From 50th percentile to 75th percentile – This range is again plotted in the box as its deviation from the median is less. Quartiles 2 & 3 (from the 25th percentile to the 75th percentile) together constitute the Inter Quartile Range (IQR). Also, depending upon the spread of the distribution, just like the whiskers, the length of the box also varies.
Quartile 4: From 75th percentile to 100th percentile – It is the whisker plot for the top 25 percentile of the data.

Outliers: The advantage of box plots is that they clearly show the outliers in a data distribution. Points
which do not lie in the range are plotted outside the graph as dots or circles and are termed as outliers
as they do not belong to the range of data. Since being out of range is not an error, that is why they
are still plotted on the graph for visualisation.
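A minimal Matplotlib sketch of a box plot, using made-up marks data that contains one outlier (the absent student with 0 marks), could look like this:

import matplotlib.pyplot as plt

marks = [0, 45, 52, 56, 58, 60, 61, 63, 65, 70, 72, 78, 85]

plt.boxplot(marks)   # the 0 will show up as an outlier point outside the whiskers
plt.ylabel("Marks")
plt.show()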
Let us now move ahead and experience data visualisation using Jupyter notebook. Matplotlib library
will help us in plotting all sorts of graphs while Numpy and Pandas will help us in analysing the data.
Data Sciences: Classification Model
In this section, we would be looking at one of the classification models used in Data Sciences. But
before we look into the technicalities of the code, let us play a game.
Personality Prediction
Step 1: Here is a map. Take a good look at it. In this map you can see the arrows determine a quality.
The qualities mentioned are:
1. Positive X-axis – People focussed: You focus more on people and try to deliver the best
experience to them.
2. Negative X-axis – Task focussed: You focus more on the task which is to be accomplished and
try to do your best to achieve that.
3. Positive Y-axis – Passive: You focus more on listening to people and understanding everything
that they say without interruption.
4. Negative Y-axis – Active: You actively participate in the discussions and make sure that you
make your point in-front of the crowd.
Think for a minute and understand which of these qualities you have in you. Now, take a chit and write your name on it. Place this chit at a point on the map which best describes you. It can be placed anywhere on the graph. Be honest about yourself and put it on the graph.
Step 2: Now that you have all put up your chits on the graph, it's time to take a quick quiz. Go to this link and finish the quiz on it individually: https://tinyurl.com/discanimal
On this link, you will find a personality prediction quiz. Take this quiz individually and try to answer all
the questions honestly. Do not take anyone’s help in it and do not discuss about it with anyone. Once
the quiz is finished, remember the animal which has been predicted for you. Write it somewhere and
do not show it to anyone. Keep it as your little secret.
Once everyone has gone through the quiz, go back to the board remove your chit, and draw the
symbol which corresponds to your animal in place of your chit. Here are the symbols:


Lion, Otter, Golden Retriever, Beaver (each animal has its own symbol)
Place these symbols at the locations where you had put up your names. Ask 4 students not to do so
and tell them to keep their animals a secret. Let their name chits be on the graph so that we can
predict their animals with the help of this map.
Now, we will try to use the nearest neighbour algorithm here and try to predict the possible animal(s) for these 4 unknowns. Look at these 4 chits one by one. Which animal occurs the most in their vicinity? Do you think that if the lion symbol occurs the most near a chit, then there is a good probability that that student's animal would also be a lion? Now let us try to guess the animal for all 4 of them according to their nearest neighbours respectively. After guessing the animals, ask these 4 students if the guess is right or not.
K-Nearest Neighbour: Explained

The k-nearest neighbours (KNN) algorithm is a simple, easy-to-implement supervised machine learning
algorithm that can be used to solve both classification and regression problems. The KNN algorithm
assumes that similar things exist in close proximity. In other words, similar things are near to each
other as the saying goes “Birds of a feather flock together”. Some features of KNN are:

• The KNN prediction model relies on the surrounding points or neighbours to determine its
class or group
• Utilises the properties of the majority of the nearest points to decide how to classify unknown
points
• Based on the concept that similar data points should be close to each other
The personality prediction activity was a brief introduction to KNN. As you recall, in that activity, we
tried to predict the animal for 4 students according to the animals which were the nearest to their
points. This is how in a lay-man’s language KNN works. Here, K is a variable which tells us about the
number of neighbours which are taken into account during prediction. It can be any integer value
starting from 1.
Let us look at another example to demystify this algorithm. Let us assume that we need to predict the
sweetness of a fruit according to the data which we have for the same type of fruit. So here we have
three maps to predict the same:
Here, X is the value which is to be predicted. The green dots depict sweet values and the blue ones
denote not sweet.
Let us try it out by ourselves first. Look at the map closely and decide whether X should be sweet or
not sweet?
Now, let us look at each graph one by one:

1: Here, we can see that K is taken as 1, which means that we are taking only 1 nearest neighbour into consideration. The nearest value to X is a blue one, hence the 1-nearest neighbour algorithm predicts that the fruit is not sweet.

2: In the 2nd graph, the value of K is 2. Taking the 2 nearest nodes to X into consideration, we see that one is sweet while the other one is not sweet. This makes it difficult for the machine to make any predictions based on the nearest neighbours, and hence the machine is not able to give any prediction.

3: In the 3rd graph, the value of K becomes 3. Here, the 3 nearest nodes to X are chosen, out of which 2 are green and 1 is blue. On the basis of this, the model is able to predict that the fruit is sweet.

On the basis of this example, let us understand KNN better:

KNN tries to predict an unknown value on the basis of the known values. The model simply calculates the distance between all the known points and the unknown point (by distance we mean the difference between two values) and takes up the K points whose distance is minimum. According to these, the predictions are made.
Let us understand the significance of the number of neighbours:

1. As we decrease the value of K to 1, our predictions become less stable. Just think for a minute,
imagine K=1 and we have X surrounded by several greens and one blue, but the blue is the
single nearest neighbour. Reasonably, we would think X is most likely green, but because K=1,
KNN incorrectly predicts that it is blue.

2. Inversely, as we increase the value of K, our predictions become more stable due to majority voting / averaging, and thus more likely to be accurate (up to a certain point). Eventually, we begin to witness an increasing number of errors. It is at this point we know we have pushed the value of K too far.
3. In cases where we are taking a majority vote (e.g. picking the mode in a classification problem) among labels, we usually make K an odd number to have a tiebreaker.
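As an illustration of KNN in code, here is a minimal sketch using scikit-learn's KNeighborsClassifier; the points, labels and the choice of K = 3 are made up to mirror the fruit-sweetness example above (the notes themselves do not prescribe a particular library):

from sklearn.neighbors import KNeighborsClassifier

# Known fruits: each point is [size, colour score]; label 1 = sweet, 0 = not sweet
known_points = [[1, 2], [2, 3], [3, 1], [6, 5], [7, 7], [8, 6]]
labels = [0, 0, 0, 1, 1, 1]

# Unknown fruit X whose sweetness we want to predict
X = [[5, 4]]

model = KNeighborsClassifier(n_neighbors=3)   # K = 3 nearest neighbours
model.fit(known_points, labels)
print(model.predict(X))   # predicted label for X based on its 3 nearest neighbours

Note that an odd K such as 3 avoids ties in the majority vote, as described in point 3 above.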
Computer Vision
Introduction
In the previous chapter, you studied the concepts of Artificial Intelligence for Data Sciences. It is a
concept to unify statistics, data analysis, machine learning and their related methods in order to
understand and analyse actual phenomena with data.
As we all know, artificial intelligence is a technique that enables computers to mimic human intelligence. As humans, we can see things, analyse them and then take the required action on the basis of what we see.
But can machines do the same? Can machines have the eyes that humans have? If you answered Yes,
then you are absolutely right. The Computer Vision domain of Artificial Intelligence, enables machines
to see through images or visual data, process and analyse them on the basis of algorithms and
methods in order to analyse actual phenomena with images.
Now before we get into the concepts of Computer Vision, let us experience this domain with the help
of the following game:

* Emoji Scavenger Hunt: https://emojiscavengerhunt.withgoogle.com/

Go to the link and try to play the game of Emoji Scavenger Hunt. The challenge here is to find 8 items
within the time limit to pass.
Did you manage to win?

__________________________________________________________________________________
__________________________________________________________________________________
What was the strategy that you applied to win this game?

__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Was the computer able to identify all the items you brought in front of it?

__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________

Did the lighting of the room affect the identifying of items by the machine?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________

Applications of Computer Vision


The concept of computer vision was first introduced in the 1970s. Its applications excited everyone, but only today has computer vision technology advanced enough to make those applications easily available to everyone. In recent years, the world has witnessed a significant leap in technology that has put computer vision on the priority list of many industries. Let us look at some of them:

Facial Recognition*
: With the advent of smart cities and smart homes,
Computer Vision plays a vital role in making the home smarter. Security
being the most important application involves use of Computer Vision
for facial recognition. It can be either guest recognition or log
maintenance of the visitors.
It also finds its application in schools for an attendance system based on
facial recognition of students.

Face Filters*: Modern-day apps like Instagram and Snapchat have a lot of features based on the usage of computer vision. The application of face filters is one among them. Through the camera, the machine or the algorithm is able to identify the facial dynamics of the person and applies the selected facial filter.

Google's Search by Image*: The maximum amount of searching for data on Google's search engine comes from textual data, but at the same time it has an interesting feature of getting search results through an image. This uses Computer Vision as it compares different features of the input image to a database of images and gives us the search result, while at the same time analysing various features of the image.

Computer Vision in Retail*: The retail field has been one of the fastest growing fields, and at the same time it is using Computer Vision to make the user experience more fruitful. Retailers can use Computer Vision techniques to track customers' movements through stores, analyse navigational routes and detect walking patterns. Inventory management is another such application. Through security camera image analysis, a Computer Vision algorithm can generate a very accurate estimate of the items available in the store. Also, it can analyse the use of shelf space to identify suboptimal configurations and suggest better item placement.
Self-Driving Cars: Computer Vision is the fundamental
technology behind developing autonomous vehicles.
Most leading car manufacturers in the world are
reaping the benefits of investing in artificial intelligence
for developing on-road versions of hands-free
technology.
This involves the process of identifying the objects,
getting navigational routes and also at the same time
environment monitoring.

Medical Imaging*: For the last few decades, computer-supported medical imaging applications have been a trustworthy help for physicians. They not only create and analyse images, but also become an assistant and help doctors with their interpretation. Such applications are used to read and convert 2D scan images into interactive 3D models that enable medical professionals to gain a detailed understanding of a patient's health condition.

Google Translate App*: All you need to do to read signs in a foreign language is point your phone's camera at the words and let the Google Translate app tell you what they mean in your preferred language, almost instantly. By using optical character recognition to see the image and augmented reality to overlay an accurate translation, this is a convenient tool that uses Computer Vision.

Computer Vision: Getting Started
Computer Vision is a domain of Artificial Intelligence that deals with images. It involves the concepts of image processing and machine learning models to build a Computer Vision based application.

Computer Vision Tasks


The various applications of Computer Vision are based on a certain number of tasks which are performed to get certain information from the input image, which can be directly used for prediction or forms the base for further analysis. The tasks used in a computer vision application are:

For Single Objects: Classification; Classification + Localisation
For Multiple Objects: Object Detection; Instance Segmentation

Classification
Image Classification problem is the task of assigning an input image one label from a fixed set of
categories. This is one of the core problems in CV that, despite its simplicity, has a large variety of
practical applications.
Classification + Localisation
This is the task which involves both processes of identifying what object is present in the image and
at the same time identifying at what location that object is present in that image. It is used only for
single objects.
Object Detection
Object detection is the process of finding instances of real-world objects such as faces, bicycles, and
buildings in images or videos. Object detection algorithms typically use extracted features and
learning algorithms to recognize instances of an object category. It is commonly used in applications
such as image retrieval and automated vehicle parking systems.
Instance Segmentation
Instance Segmentation is the process of detecting instances of the objects, giving them a category and
then giving each pixel a label on the basis of that. A segmentation algorithm takes an image as input
and outputs a collection of regions (or segments).
Basics of Images
We all see a lot of images around us and use them daily, either through our mobile phones or computer systems. But do we ever ask ourselves some basic questions while using them on such a regular basis?

Don't know the answer yet? Don't worry, in this section we will study the basics of an image:
Basics of Pixels
The word “pixel” means a picture element. Every photograph, in digital form, is made up of pixels.
They are the smallest unit of information that make up a picture. Usually round or square, they are
typically arranged in a 2-dimensional grid.
In the image below, one portion has been magnified many times over so that you can see its individual
composition in pixels. As you can see, the pixels approximate the actual image. The more pixels you
have, the more closely the image resembles the original.

Resolution
The number of pixels in an image is sometimes called the resolution. When the term is used to
describe pixel count, one convention is to express resolution as the width by the height, for example a
monitor resolution of 1280×1024. This means there are 1280 pixels from one side to the other, and
1024 from top to bottom.
Another convention is to express the number of pixels as a single number, like a 5 mega pixel camera
(a megapixel is a million pixels). This means the pixels along the width multiplied by the pixels along
the height of the image taken by the camera equals 5 million pixels. In the case of our 1280×1024
monitors, it could also be expressed as 1280 x 1024 = 1,310,720, or 1.31 megapixels.
Pixel value
Each of the pixels that represents an image stored inside a computer has a pixel value which describes
how bright that pixel is, and/or what colour it should be. The most common pixel format is the byte
image, where this number is stored as an 8-bit integer giving a range of possible values from 0 to 255.
Typically, zero is to be taken as no colour or black and 255 is taken to be full colour or white.
Why do we have a value of 255? In computer systems, data is in the form of ones and zeros, which we call the binary system. Each bit in a computer system can have either a zero or a one. Each pixel uses 1 byte of an image, which is equivalent to 8 bits of data. Since each bit can have two possible values, an 8-bit number can have 2^8 = 256 possible values, which start from 0 and end at 255.
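A quick check of this arithmetic in Python:

bits_per_pixel = 8
possible_values = 2 ** bits_per_pixel
print(possible_values)                   # 256 possible values in total
print(list(range(possible_values))[0])   # smallest pixel value: 0
print(list(range(possible_values))[-1])  # largest pixel value: 255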

Grayscale Images
Grayscale images are images which have a range of shades of gray without apparent colour. The darkest possible shade is black, which is the total absence of colour, or a pixel value of zero. The lightest possible shade is white, which is the total presence of colour, or a pixel value of 255. Intermediate shades of gray are represented by equal brightness levels of the three primary colours.
A grayscale image has each pixel of size 1 byte, arranged in a single plane as a 2D array of pixels. The size of a grayscale image is defined as the Height x Width of that image.
Let us look at an image to understand grayscale images.

Here is an example of a grayscale image. As you can check, the values of the pixels are within the range of 0-255. Computers store the images we see in the form of these numbers.
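As an illustration, a tiny grayscale image could be represented as a 2D NumPy array of pixel values (the numbers here are made up):

import numpy as np

gray_image = np.array([
    [  0,  64, 128, 255],
    [ 32,  96, 160, 224],
    [ 64, 128, 192, 255],
    [  0,  50, 100, 150],
], dtype=np.uint8)        # uint8 means 8-bit values, i.e. 0 to 255

print(gray_image.shape)   # (4, 4), i.e. Height x Width of the image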
RGB Images
All the images that we see around are coloured images. These images are made up of three primary
colours Red, Green and Blue. All the colours that are present can be made by combining different
intensities of red, green and blue.

Let us experience!
Go to this online link https://www.w3schools.com/colors/colors_rgb.asp. On the basis of this online
tool, try and answer all the below mentioned questions.

1) What is the output colour when you put R=G=B=255 ?

___________________________________________________________________________

2) What is the output colour when you put R=G=B=0 ?

___________________________________________________________________________

3) How does the colour vary when you put either of the three as 0 and then keep on varying
the other two?

___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________

4) How does the output colour change when all the three colours are varied in same
proportion ?

___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________

5) What is the RGB value of your favourite colour from the colour palette?

___________________________________________________________________________

Were you able to answer all the questions? If yes, then you would have understood how every colour
we see around is made.
Now the question arises, how do computers store RGB images? Every RGB image is stored in the form
of three different channels called the R channel, G channel and the B channel.
Each plane separately has a number of pixels with each pixel value varying from 0 to 255. All the three
planes when combined together form a colour image. This means that in a RGB image, each pixel has
a set of three different values which together give colour to that particular pixel.
For Example,

As you can see, each colour image is stored in the form of three different channels, each having
different intensity. All three channels combine together to form a colour we see.

In the above given image, if we split the image into three different channels, namely Red (R), Green (G) and Blue (B), the individual layers will have the intensities of the individual pixels for that colour. These individual layers, when stored in memory, look like the image on the extreme right. The individual layers look like grayscale images because each pixel has an intensity value of 0 to 255, and as studied earlier, 0 is considered black or no presence of colour and 255 means white or full presence of colour. These three individual RGB values, when combined together, form the colour of each pixel. Therefore, each pixel in an RGB image has three values that together form its complete colour.
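As an illustration, here is a minimal sketch of how a tiny RGB image and its three channels could be represented with NumPy (the pixel values are made up):

import numpy as np

# Shape (height, width, 3): the last axis holds the R, G and B values of each pixel
rgb_image = np.array([
    [[255,   0,   0], [  0, 255,   0]],   # a red pixel and a green pixel
    [[  0,   0, 255], [255, 255, 255]],   # a blue pixel and a white pixel
], dtype=np.uint8)

# Splitting into the three individual channels (each one is a 2D plane)
r_channel = rgb_image[:, :, 0]
g_channel = rgb_image[:, :, 1]
b_channel = rgb_image[:, :, 2]

print(r_channel)   # the R plane on its own looks like a grayscale image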

Task :

Go to the following link www.piskelapp.com and create your own pixel art. Try and make a GIF using
the online app for your own pixel art.

Image Features
In computer vision and image processing, a feature is a piece of information which is relevant for
solving the computational task related to a certain application. Features may be specific structures
in the image such as points, edges or objects. For example: Imagine that your security camera is
capturing an image. At the top of the image we are given six small patches of images. Our task is to
find the exact location of those image patches in the image. Take a pencil and mark the exact
location of those patches in the image.

Were you able to find the exact location of all the patches?
__________________________________________________________________________________

Which one was the most difficult to find?

__________________________________________________________________________________
__________________________________________________________________________________

Which one was the easiest to find?

__________________________________________________________________________________
__________________________________________________________________________________

Let’s Reflect:
Let us take the patches one pair at a time and check where exactly each of them can be found.
For Patches A and B: Patches A and B are flat surfaces and are spread over a lot of area. They could be
present at many locations in a given area of the image.
For Patches C and D: Patches C and D are simpler than A and B. They are edges of a building, so we can
find an approximate location for them, but the exact location is still difficult because the pattern is the
same everywhere along the edge.
For Patches E and F: Patches E and F are the easiest to find in the image, because they are corners of
the building: wherever we move a corner patch, it looks different.

Conclusion
In image processing, we can extract a lot of features from an image: a feature can be a blob, an edge
or a corner. These features help us perform various tasks and carry out the analysis required by the
application. The question that now arises is: which of these are good features to use?
As you saw in the previous activity, features containing corners are easy to find because they occur at
only one particular location in the image, whereas edges, which are spread along a line, look the same
all along. This tells us that corners are always good features to extract from an image, followed by
edges. Let us look at another example to understand this. Consider the images given below and apply
the concept of good features to them.

In the above image how would we determine the exact location of each patch?
The blue patch is a flat area and difficult to find and track. Wherever you move the blue patch it looks
the same. The black patch has an edge. Moved along the edge (parallel to edge), it looks the same.
The red patch is a corner. Wherever you move the patch, it looks different, therefore it is unique.
Hence, corners are considered to be good features in an image.
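OpenCV, the library introduced in the next section, provides ready-made corner detectors based on exactly this idea. The sketch below is a minimal, illustrative example: the file name 'building.jpg' is only a placeholder, and the parameter values are arbitrary choices, not prescribed ones.

import cv2

# 'building.jpg' is a placeholder; use any image available on your computer
img = cv2.imread('building.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# find up to 25 strong corners (Shi-Tomasi corner detector)
corners = cv2.goodFeaturesToTrack(gray, maxCorners=25, qualityLevel=0.01, minDistance=10)

for corner in corners:
    x, y = corner.ravel()
    cv2.circle(img, (int(x), int(y)), 4, (0, 0, 255), -1)   # mark each corner with a red dot

cv2.imwrite('corners.jpg', img)                             # save the marked image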

Introduction to OpenCV
Now that we have learnt about image features and their importance in image processing, we will learn
about a tool we can use to extract these features from our images for further processing. OpenCV, or
the Open Source Computer Vision Library, is a tool that helps a computer extract these features from
images. It is used for all kinds of image and video processing and analysis. It is capable of processing
images and videos to identify objects, faces, or even handwriting.

In this chapter, we will use OpenCV for basic image processing operations on images, such as resizing,
cropping and many more.
To install the OpenCV library, open the Anaconda Prompt and run the following command:

pip install opencv-python

Now let us take a deep dive into the various functions of OpenCV to understand different image
processing techniques. Head to the introductory Jupyter Notebook on OpenCV given at this link:
http://bit.ly/cv_notebook
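Before heading to the notebook, here is a minimal sketch of the kind of basic operations mentioned above: reading, resizing, cropping and saving an image. The file names are placeholders and are not files supplied with these notes.

import cv2

# 'input.jpg' is a placeholder; replace it with any image on your computer
img = cv2.imread('input.jpg')

small = cv2.resize(img, (200, 200))            # resize the image to 200 x 200 pixels
crop = img[50:150, 100:300]                    # crop rows 50-149 and columns 100-299
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # convert the image to grayscale

cv2.imwrite('resized.jpg', small)
cv2.imwrite('cropped.jpg', crop)
cv2.imwrite('grayscale.jpg', gray)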

Convolution
We have learnt that computers store images as numbers, and that the pixels are arranged in a
particular manner to create the picture we can recognise. These pixels have values varying from 0 to
255, and the value of a pixel determines the colour of that pixel.
But what if we edit these numbers? Will it bring a change to the image? The answer is yes: as we change
the values of these pixels, the image changes. This process of changing pixel values is the basis of image
editing. We all use image editing software like Photoshop, and apps like Instagram and Snapchat, which
apply filters to an image to enhance its quality.

As you can see, different filters applied to an image change the pixel values uniformly throughout the
image. How does this happen? It is done through the process of convolution, using the convolution
operator, which is commonly used to create these effects.
Before we understand how the convolution operation works, let us try to build a theory about the
convolution operator by experiencing it through an online application.
Task
Go to the link http://matlabtricks.com/post-5/3x3-convolution-kernels-with-online-demo and, at the
bottom of the page, click on “Click to Load Application”.

Once the application is loaded, try the different filters and apply them to the image. Observe how the
values of the kernel change for different filters. Try these steps:
1) Change all to positive values
2) Change all to negative values
3) Have a mixture of negative and positive values

Now let us follow a few steps to understand how the convolution operator works. Try experimenting
with the following values to come up with a theory:


1) Make 4 numbers negative. Keep the rest as 0.
2) Now make one of them positive.
3) Observe what happens.
4) Now make a second one positive.

What theory do you propose for convolution on the basis of the observation?

__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
It is time to test the theory. Change the location of the four numbers and follow the above-mentioned
steps. Does your theory hold true?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
If yes, change the picture and check whether the theory still holds true. If it does not, modify your
theory and keep trying until it satisfies all the conditions.
Let’s Discuss
What effect did you apply?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
How did different kernels affect the image?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Why do you think we apply these effects?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
How do you think the convolution operator works?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Convolution: Explained

Convolution is a simple mathematical operation which is fundamental to many common image
processing operators. Convolution provides a way of 'multiplying together' two arrays of numbers,
generally of different sizes but of the same dimensionality, to produce a third array of numbers of the
same dimensionality.
An (image) convolution is simply an element-wise multiplication of the image array with another array,
called the kernel, followed by a sum.
As you can see here, I = Image Array, K = Kernel Array, and I * K = the resulting array after performing
the convolution operation. Note: the kernel is passed over the whole image to get the resulting array
after convolution.
What is a Kernel?
A kernel is a matrix which is slid across the image and multiplied with the input such that the output
is enhanced in a certain desirable manner. Each kernel has different values for the different kinds of
effects that we want to apply to an image.
In image processing, we use the convolution operation to extract features from images, which can be
used later for further processing, especially in the Convolutional Neural Network (CNN), which we will
study later in this chapter. In this process, we overlap the centre of the kernel with the pixels of the
image to obtain the convolution output. While doing so, the output image becomes smaller, because
the kernel cannot fully overlap the edge rows and columns of the image. What if we want the output
image to be exactly the same size as the input image? How can we achieve this?
To achieve this, we extend the original image by one row and one column of pixels on every edge while
overlapping the centres and performing the convolution. This keeps the input and output images the
same size. The pixel values in the extended border are taken as zero.
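To make this multiply-and-sum idea concrete, here is a minimal NumPy sketch of the operation described above. The small 3 x 3 image is made up for illustration, and the kernel is the same symmetric one used later in this chapter (for a symmetric kernel, sliding it as-is gives the same result as flipping it first, which is the strict mathematical definition of convolution).

import numpy as np

def convolve(image, kernel):
    # pad the image with zeros so the output keeps the same size as the input
    pad = kernel.shape[0] // 2
    padded = np.pad(image, pad, mode='constant')
    output = np.zeros_like(image, dtype=float)
    # slide the kernel over every pixel: multiply element-wise, then sum
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            region = padded[i:i + kernel.shape[0], j:j + kernel.shape[1]]
            output[i, j] = np.sum(region * kernel)
    return output

image = np.array([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]], dtype=float)

kernel = np.array([[-1,  0, -1],
                   [ 0, -1,  0],
                   [-1,  0, -1]], dtype=float)

print(convolve(image, kernel))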

Let’s try
In this section, we will try performing the convolution operation on paper to understand how it works.
Fill in the blank places of the output image by performing the convolution operation on the image
below with the given kernel.

Image (8 x 8):

150   0 255 240 190  25  89 255
100 179  25   0 200 255  67 100
155 146  13  20   0  12  45   0
100 175   0  25  25  15   0   0
120 156 255   0  78  56  23   0
115 113  25  90   0  80  56 155
135 190 115 116 178   0 145 165
123 255 255   0 255 255 255   0

Kernel (3 x 3):

-1  0 -1
 0 -1  0
-1  0 -1

Write Your Output Here :
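Once you have worked out the answer by hand, you can check it with a short program. This is a minimal sketch assuming the SciPy library is installed: mode='same' keeps the 8 x 8 output size and boundary='fill' applies the zero padding described earlier. Because the kernel is symmetric, the flipped-kernel convolution computed by SciPy matches the simple slide-multiply-sum done on paper.

import numpy as np
from scipy.signal import convolve2d

image = np.array([[150,   0, 255, 240, 190,  25,  89, 255],
                  [100, 179,  25,   0, 200, 255,  67, 100],
                  [155, 146,  13,  20,   0,  12,  45,   0],
                  [100, 175,   0,  25,  25,  15,   0,   0],
                  [120, 156, 255,   0,  78,  56,  23,   0],
                  [115, 113,  25,  90,   0,  80,  56, 155],
                  [135, 190, 115, 116, 178,   0, 145, 165],
                  [123, 255, 255,   0, 255, 255, 255,   0]])

kernel = np.array([[-1,  0, -1],
                   [ 0, -1,  0],
                   [-1,  0, -1]])

output = convolve2d(image, kernel, mode='same', boundary='fill', fillvalue=0)
print(output)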

Summary
1. Convolution is a common tool used for image editing.
2. It is an element-wise multiplication of an image and a kernel to get the desired output.
3. In computer vision applications, it is used in the Convolutional Neural Network (CNN) to extract
image features.

Convolutional Neural Networks (CNN)


Introduction
In class 9, you studied the concepts of Neural Networks. You played a neural network game to
understand how a neural network works.
Let’s recall

What is a Neural Network?

__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Fill in the names of different layers of Neural Network.

Did you get the answers right? In this section, we are going to study one such neural network, the
Convolutional Neural Network (CNN). Many current computer vision applications use this powerful
type of neural network.

What is a Convolutional Neural Network?

A Convolutional Neural Network (CNN) is a Deep Learning algorithm which can take in an input image,
assign importance (learnable weights and biases) to various aspects/objects in the image and be able
to differentiate one from the other.

The process of deploying a CNN is as follows:

In the above diagram, an input image is processed through a CNN, which then gives a prediction based
on the labels in the given dataset.

A Convolutional Neural Network (CNN) consists of the following layers:

1) Convolution Layer
2) Rectified linear Unit (ReLU)
3) Pooling Layer
4) Fully Connected Layer

Convolution Layer

It is the first layer of a CNN. The objective of the convolution operation is to extract features, such as
edges, from the input image. A CNN need not be limited to only one convolution layer. Conventionally,
the first convolution layer is responsible for capturing low-level features such as edges, colour and
gradient orientation. With added layers, the architecture adapts to the high-level features as well,
giving us a network which has a wholesome understanding of the images in the dataset.

This layer applies the convolution operation to the image. In the convolution layer, several kernels are
used to produce several features. The output of this layer is called the feature map. A feature map is
also called the activation map; we can use these terms interchangeably.
There are several uses we derive from the feature map:
• We reduce the image size so that it can be processed more efficiently.
• We only focus on the features of the image that can help us in processing the image further.
For example, you might only need to recognize someone’s eyes, nose and mouth to recognize the
person. You might not need to see the whole face.

Rectified Linear Unit Function

The next layer in the Convolutional Neural Network is the Rectified Linear Unit function, or the ReLU
layer. After we get the feature map, it is passed on to the ReLU layer. This layer simply replaces all the
negative numbers in the feature map with zero and lets the positive numbers stay as they are.

The process of passing it to the ReLU layer introduces non – linearity in the feature map. Let us see it
through a graph.

If we see the two graphs side by side, the one on the left is a linear graph. This graph, when passed
through the ReLU layer, gives the one on the right. The ReLU graph starts as a horizontal straight line
for negative inputs and then increases linearly once the input becomes positive.

Now the question arises: why do we pass the feature map to the ReLU layer? It is to make the colour
changes more obvious and more abrupt.

As shown in the above convolved image, there is a smooth grey gradient change from black to white.
After applying the ReLU function, we see a more abrupt change in colour, which makes the edges more
obvious. This acts as a better feature for the further layers in the CNN, as it enhances the activation
map.
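In code, ReLU is nothing more than clamping negative values to zero. Here is a minimal NumPy sketch, using a small made-up feature map:

import numpy as np

# a small feature map with some negative values, made up for illustration
feature_map = np.array([[-3,  5, -1],
                        [ 2, -7,  4],
                        [-2,  6,  0]])

relu_output = np.maximum(feature_map, 0)   # negatives become 0, positives stay unchanged
print(relu_output)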

Pooling Layer

Similar to the Convolutional Layer, the Pooling layer is responsible for reducing the spatial size of the
Convolved Feature while still retaining the important features.

There are two types of pooling which can be performed on an image.

1) Max Pooling: Max Pooling returns the maximum value from the portion of the image covered
by the kernel.
2) Average Pooling: Average Pooling returns the average of all the values from the portion of the
image covered by the kernel.

The pooling layer is an important layer in the CNN, as it performs the following tasks:

1) It makes the image smaller and more manageable.
2) It makes the image more resistant to small transformations, distortions and translations in the
input image.

A small difference in the input image will create a very similar pooled image.
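The sketch below shows both kinds of pooling on a small made-up 4 x 4 feature map, using non-overlapping 2 x 2 windows; the window size is an illustrative choice.

import numpy as np

feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 2],
                        [7, 2, 9, 1],
                        [3, 4, 6, 8]], dtype=float)

def pool(fmap, size=2, mode='max'):
    # split the feature map into non-overlapping size x size blocks
    h, w = fmap.shape
    blocks = fmap.reshape(h // size, size, w // size, size)
    if mode == 'max':
        return blocks.max(axis=(1, 3))    # maximum of each block
    return blocks.mean(axis=(1, 3))       # average of each block

print(pool(feature_map, mode='max'))      # max pooling: keeps the strongest response per window
print(pool(feature_map, mode='average'))  # average pooling: keeps the mean response per window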

Fully Connected Layer

The final layer in the CNN is the Fully Connected Layer (FCL). The objective of a fully connected layer
is to take the results of the convolution/pooling process and use them to classify the image into a label
(in a simple classification example).

The output of convolution/pooling is flattened into a single vector of values, each representing a
probability that a certain feature belongs to a label. For example, if the image is of a cat, features
representing things like whiskers or fur should have high probabilities for the label “cat”.
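To see how the four layers fit together in practice, here is a minimal sketch that stacks them with the Keras API (assuming the TensorFlow library is installed). The input shape of 28 x 28 grayscale images, the number of kernels and the 10 output labels are illustrative choices, not values taken from these notes.

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)),  # convolution layer + ReLU
    layers.MaxPooling2D((2, 2)),                                            # pooling layer
    layers.Flatten(),                                                       # flatten the feature maps into one vector
    layers.Dense(10, activation='softmax'),                                 # fully connected layer: one score per label
])

model.summary()   # prints the layers and how the image size shrinks through the network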

Let’s Summarize:

Write the whole process of how a CNN works on the basis of the above diagram.
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________

Let’s Experience
Now let us see how this comes into practice. To see that, go to the link
http://scs.ryerson.ca/~aharley/vis/conv/flat.html

This is an online application for classifying handwritten digits. Analyse the different layers in the
application on the basis of the CNN layers that we studied in the previous section.
