Unit 4


Types of Machine Learning

Machine learning is a subset of AI that enables machines to automatically learn from data, improve performance based on past experience, and make predictions. Machine learning comprises a set of algorithms that work on large amounts of data. Data is fed to these algorithms to train them, and on the basis of that training, they build a model and perform a specific task.

These ML algorithms help to solve different business problems such as regression, classification, forecasting, clustering, and association.

Based on the methods and way of learning, machine learning is divided into mainly
four types, which are:

1. Supervised Machine Learning
2. Unsupervised Machine Learning
3. Semi-Supervised Machine Learning
4. Reinforcement Learning
In this topic, we will provide a detailed description of the types of Machine Learning
along with their respective algorithms:

1. Supervised Machine Learning


As its name suggests, supervised machine learning is based on supervision. In the supervised learning technique, we train machines using a "labelled" dataset, and based on that training, the machine predicts the output. Here, labelled data means that some of the inputs are already mapped to the corresponding outputs. More precisely, we first train the machine with inputs and their corresponding outputs, and then we ask the machine to predict the output for a test dataset.

Let's understand supervised learning with an example. Suppose we have an input


dataset of cat and dog images. First, we train the machine to recognize the images, using features such as the shape and size of the tail, the shape of the eyes, colour, and height (dogs are taller, cats are smaller). After training, we input a picture of a cat and ask the machine to identify the object and predict the output. Since the machine is now well trained, it will check all the features of the object, such as height, shape, colour, eyes, ears, and tail, and conclude that it is a cat, placing it in the Cat category. This is how the machine identifies objects in supervised learning.

The main goal of the supervised learning technique is to map the input
variable(x) with the output variable(y). Some real-world applications of supervised
learning are Risk Assessment, Fraud Detection, Spam filtering, etc.

Categories of Supervised Machine Learning


Supervised machine learning can be classified into two types of problems, which are
given below:

o Classification
o Regression

a) Classification

Classification algorithms are used to solve classification problems, in which the output variable is categorical, such as "Yes" or "No", "Male" or "Female", "Red" or "Blue", etc. Classification algorithms predict the categories present in the dataset. Some real-world examples of classification are spam detection, email filtering, etc.

Some popular classification algorithms are given below:

o Random Forest Algorithm
o Decision Tree Algorithm
o Logistic Regression Algorithm
o Support Vector Machine Algorithm


b) Regression

Regression algorithms are used to solve regression problems, in which there is a relationship between the input and output variables. They are used to predict continuous output variables, such as market trends, weather, etc.

Some popular Regression algorithms are given below:

o Simple Linear Regression Algorithm
o Multivariate Regression Algorithm
o Decision Tree Algorithm
o Lasso Regression
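
As a brief illustration of the difference between the two problem types, here is a minimal sketch using scikit-learn. The toy feature and label arrays are hypothetical, chosen purely for demonstration:

# Minimal sketch: classification vs. regression with scikit-learn.
# The toy arrays below are hypothetical, purely for illustration.
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: categorical output (0 = "No", 1 = "Yes")
X_cls = [[25], [35], [45], [20], [50]]        # e.g. age of a customer
y_cls = [0, 1, 1, 0, 1]                       # e.g. purchased or not
clf = LogisticRegression().fit(X_cls, y_cls)
print(clf.predict([[40]]))                    # predicts a category

# Regression: continuous output
X_reg = [[1], [2], [3], [4], [5]]             # e.g. years of experience
y_reg = [30.0, 35.5, 41.0, 46.5, 52.0]        # e.g. salary in thousands
reg = LinearRegression().fit(X_reg, y_reg)
print(reg.predict([[6]]))                     # predicts a continuous value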

Advantages and Disadvantages of Supervised Learning

Advantages:

o Since supervised learning works with a labelled dataset, we can have an exact idea about the classes of objects.
o These algorithms are helpful in predicting the output on the basis of prior experience.

Disadvantages:

o These algorithms are not able to solve complex tasks.
o It may predict the wrong output if the test data is different from the training
data.
o It requires lots of computational time to train the algorithm.

Applications of Supervised Learning


Some common applications of Supervised Learning are given below:

o Image Segmentation:
Supervised Learning algorithms are used in image segmentation. In this
process, image classification is performed on different image data with pre-
defined labels.
o Medical Diagnosis:
Supervised algorithms are also used in the medical field for diagnosis purposes, using medical images and past data labelled with disease conditions. With such a process, the machine can identify a disease for new patients.
o Fraud Detection - Supervised learning classification algorithms are used for identifying fraudulent transactions, fraudulent customers, etc. This is done by using historical data to identify patterns that can indicate possible fraud.
o Spam detection - In spam detection & filtering, classification algorithms are
used. These algorithms classify an email as spam or not spam. The spam
emails are sent to the spam folder.
o Speech Recognition - Supervised learning algorithms are also used in speech
recognition. The algorithm is trained with voice data, and various
identifications can be done using the same, such as voice-activated
passwords, voice commands, etc.

2. Unsupervised Machine Learning


Unsupervised learning is different from the Supervised learning technique; as its
name suggests, there is no need for supervision. It means, in unsupervised machine
learning, the machine is trained using the unlabeled dataset, and the machine
predicts the output without any supervision.

In unsupervised learning, the models are trained with the data that is neither
classified nor labelled, and the model acts on that data without any supervision.


The main aim of the unsupervised learning algorithm is to group or categorize the unsorted dataset according to similarities, patterns, and differences. Machines are instructed to find the hidden patterns in the input dataset.

Let's take an example to understand it more precisely. Suppose there is a basket of fruit images, which we input into the machine learning model. The images are totally unknown to the model, and the task of the machine is to find the patterns and categories of the objects.

So the machine will discover its own patterns and differences, such as differences in colour and shape, and predict the output when it is tested with the test dataset.

Categories of Unsupervised Machine Learning


Unsupervised Learning can be further classified into two types, which are given
below:

o Clustering
o Association

1) Clustering

The clustering technique is used when we want to find the inherent groups from the
data. It is a way to group the objects into a cluster such that the objects with the
most similarities remain in one group and have fewer or no similarities with the
objects of other groups. An example of the clustering algorithm is grouping the
customers by their purchasing behaviour.


Some of the popular clustering algorithms are given below:

o K-Means Clustering algorithm
o Mean-shift algorithm
o DBSCAN Algorithm
o Principal Component Analysis
o Independent Component Analysis
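
As a brief illustration, here is a minimal K-Means sketch using scikit-learn on hypothetical two-dimensional points; the data and the choice of two clusters are assumptions for demonstration only:

# Minimal K-Means sketch (hypothetical toy data).
import numpy as np
from sklearn.cluster import KMeans

# Two loose groups of 2-d points, unlabeled.
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.2],
              [8.0, 8.0], [8.5, 7.8], [7.9, 8.3]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)            # cluster assignment for each point
print(kmeans.cluster_centers_)   # learned cluster centres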

2) Association

Association rule learning is an unsupervised learning technique that finds interesting relations among variables within a large dataset. The main aim of this learning algorithm is to find the dependency of one data item on another data item and map the variables accordingly so that maximum profit can be generated. This algorithm is mainly applied in market basket analysis, web usage mining, continuous production, etc.

Some popular algorithms of association rule learning are the Apriori algorithm, Eclat, and the FP-growth algorithm.
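
A simplified illustration of the counting step at the heart of these algorithms is sketched below: finding item pairs that co-occur in many transactions. The transactions are hypothetical, and real Apriori implementations prune candidates far more cleverly than this:

# Simplified illustration of the counting step behind association rule
# mining: find item pairs that co-occur in many transactions.
from itertools import combinations
from collections import Counter

transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]

pair_counts = Counter()
for t in transactions:
    for pair in combinations(sorted(t), 2):
        pair_counts[pair] += 1

min_support = 3  # must appear in at least 3 of 5 transactions
frequent = {p: c for p, c in pair_counts.items() if c >= min_support}
print(frequent)  # {('bread', 'butter'): 3}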

Advantages and Disadvantages of Unsupervised Learning

Advantages:

o These algorithms can be used for more complicated tasks than supervised ones, because they work on unlabeled data.
o Unsupervised algorithms are preferable for many tasks, because obtaining an unlabeled dataset is easier than obtaining a labelled one.

Disadvantages:

o The output of an unsupervised algorithm can be less accurate, as the dataset is not labelled and the algorithms are not trained with the exact output in advance.
o Working with unsupervised learning is more difficult, as it works with unlabelled data that does not map to a known output.

Applications of Unsupervised Learning

o Network Analysis: Unsupervised learning is used in document network analysis of text data, for example to identify plagiarism and copyright issues in scholarly articles.
o Recommendation Systems: Recommendation systems widely use
unsupervised learning techniques for building recommendation applications
for different web applications and e-commerce websites.
o Anomaly Detection: Anomaly detection is a popular application of
unsupervised learning, which can identify unusual data points within the
dataset. It is used to discover fraudulent transactions.
o Singular Value Decomposition: Singular Value Decomposition (SVD) is used to extract particular information from a database, for example, extracting information about each user located in a particular region.

3. Semi-Supervised Learning
Semi-Supervised learning is a type of Machine Learning algorithm that lies
between Supervised and Unsupervised machine learning. It represents the
intermediate ground between Supervised (With Labelled training data) and
Unsupervised learning (with no labelled training data) algorithms and uses the
combination of labelled and unlabeled datasets during the training period.

Although semi-supervised learning is the middle ground between supervised and unsupervised learning and operates on data that contains a few labels, it mostly consists of unlabeled data. Labels are costly to obtain, so in practice an organization may have only a few of them. This setting differs from both supervised and unsupervised learning, which are defined by the presence or absence of labels respectively.

The concept of semi-supervised learning was introduced to overcome the drawbacks of supervised and unsupervised learning algorithms. The main aim of semi-supervised learning is to make effective use of all the available data, rather than only the labelled data as in supervised learning. Typically, similar data points are first clustered with an unsupervised learning algorithm, and the clusters then help to assign labels to the unlabeled data. This matters because labelled data is considerably more expensive to acquire than unlabeled data.

We can understand these algorithms with an example. Supervised learning is like a student who is under the supervision of an instructor at home and at college. If that student then analyses the same concept on their own without any help from the instructor, it comes under unsupervised learning. Under semi-supervised learning, the student has to revise the concept on their own after analysing it under the guidance of an instructor at college.
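
A minimal sketch of this idea is shown below, using scikit-learn's SelfTrainingClassifier. The tiny dataset is hypothetical, and unlabeled points are marked with -1, as the library expects:

# Minimal semi-supervised sketch: a few labeled points plus many
# unlabeled ones (label -1), using self-training. Toy data is hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X = np.array([[1.0], [1.2], [0.9], [8.0], [8.2], [7.9], [1.1], [8.1]])
y = np.array([0,     -1,    -1,    1,     -1,    -1,    -1,    -1])  # -1 = unlabeled

# The base classifier is trained on the labeled points, then confident
# predictions on unlabeled points are added as pseudo-labels, iteratively.
model = SelfTrainingClassifier(LogisticRegression()).fit(X, y)
print(model.predict([[1.05], [7.95]]))  # expected: [0 1]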

Advantages and Disadvantages of Semi-supervised Learning

Advantages:

o The algorithm is simple and easy to understand.
o It is highly efficient.
o It addresses drawbacks of both supervised and unsupervised learning algorithms.

Disadvantages:

o Iteration results may not be stable.
o These algorithms cannot be applied to network-level data.
o Accuracy may be low.


4. Reinforcement Learning
Reinforcement learning works on a feedback-based process, in which an AI agent (a software component) automatically explores its surroundings through trial and error: taking actions, learning from experience, and improving its performance. The agent gets rewarded for each good action and punished for each bad action; hence the goal of a reinforcement learning agent is to maximize the rewards.

In reinforcement learning, there is no labelled data as in supervised learning; agents learn from their experiences only.

The reinforcement learning process is similar to how a human learns; for example, a child learns various things through experience in day-to-day life. Playing a game is an example of reinforcement learning, where the game is the environment, the agent's moves at each step define states, and the goal of the agent is to get a high score. The agent receives feedback in the form of rewards and punishments.

Due to its way of working, reinforcement learning is employed in different fields such as game theory, operations research, information theory, and multi-agent systems.

A reinforcement learning problem can be formalized using a Markov Decision Process (MDP). In an MDP, the agent constantly interacts with the environment and performs actions; after each action, the environment responds and generates a new state.
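
To make this loop concrete, below is a minimal tabular Q-learning sketch on a tiny hypothetical chain environment (states 0 to 4, with a reward only for reaching the last state); the environment and the hyperparameter values are illustrative assumptions, not part of any standard benchmark:

# Minimal tabular Q-learning sketch on a hypothetical 5-state chain:
# the agent moves left (0) or right (1); reaching state 4 gives reward 1.
import random

n_states, n_actions = 5, 2
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1   # learning rate, discount, exploration

def step(state, action):
    next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
    reward = 1.0 if next_state == n_states - 1 else 0.0
    return next_state, reward

for episode in range(200):
    state = 0
    while state != n_states - 1:
        # epsilon-greedy action selection
        if random.random() < epsilon:
            action = random.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: Q[state][a])
        next_state, reward = step(state, action)
        # Q-learning update: move the estimate toward
        # reward + discounted best future value
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

print([max(q) for q in Q])  # learned state values; higher closer to the goal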

Categories of Reinforcement Learning


Reinforcement learning is categorized mainly into two types of methods/algorithms:

o Positive Reinforcement Learning: Positive reinforcement learning increases the tendency that the required behaviour will occur again by adding something desirable. It strengthens the behaviour of the agent and affects it positively.
o Negative Reinforcement Learning: Negative reinforcement learning works
exactly opposite to the positive RL. It increases the tendency that the specific
behaviour would occur again by avoiding the negative condition.

Real-world Use cases of Reinforcement Learning

o Video Games:
RL algorithms are very popular in gaming applications and are used to achieve super-human performance. Some popular game-playing systems that use RL algorithms are AlphaGo and AlphaGo Zero.
o Resource Management:
The paper "Resource Management with Deep Reinforcement Learning" showed how RL can be used to automatically learn to schedule computing resources across waiting jobs in order to minimize average job slowdown.
o Robotics:
RL is widely used in robotics applications. Robots are used in industrial and manufacturing settings, and they are made more capable with reinforcement learning. Various industries have a vision of building intelligent robots using AI and machine learning technology.
o Text Mining:
Text mining, one of the important applications of NLP, is being implemented with the help of reinforcement learning by Salesforce.

Advantages and Disadvantages of Reinforcement Learning

Advantages

o It helps in solving complex real-world problems that are difficult to solve with general techniques.
o The learning model of RL is similar to human learning; hence it can produce highly accurate results.
o It helps in achieving long-term results.

Disadvantage

o RL algorithms are not preferred for simple problems.
o RL algorithms require huge amounts of data and computation.
o RL algorithms require huge data and computations.
o Too much reinforcement learning can lead to an overload of states which can
weaken the results.

The curse of dimensionality limits reinforcement learning for real physical systems.

Supervised Machine Learning


Supervised learning is the type of machine learning in which machines are trained using well-"labelled" training data, and on the basis of that data, machines predict the output. Labelled data means that some input data is already tagged with the correct output.

In supervised learning, the training data provided to the machines works as a supervisor that teaches the machines to predict the output correctly. It applies the same concept as a student learning under the supervision of a teacher.

Supervised learning is a process of providing input data as well as correct output


data to the machine learning model. The aim of a supervised learning algorithm is
to find a mapping function to map the input variable(x) with the output
variable(y).

In the real-world, supervised learning can be used for Risk Assessment, Image
classification, Fraud Detection, spam filtering, etc.

How does Supervised Learning Work?

In supervised learning, models are trained using a labelled dataset, where the model learns about each type of data. Once the training process is completed, the model is tested on test data (data held out from the original dataset), and then it predicts the output.
The working of supervised learning can be easily understood with the following example:

Suppose we have a dataset of different types of shapes, which includes squares, rectangles, triangles, and polygons. The first step is to train the model for each shape:

o If the given shape has four sides, and all the sides are equal, then it will be labelled as a Square.
o If the given shape has three sides, then it will be labelled as a Triangle.
o If the given shape has six equal sides, then it will be labelled as a Hexagon.

Now, after training, we test our model using the test set, and the task of the model is to identify the shape.

The machine is already trained on all types of shapes, and when it encounters a new shape, it classifies the shape on the basis of its number of sides and predicts the output. A toy sketch of this rule-based idea is shown below.
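
The shape rules written out as code (hand-written rules for illustration only; a trained model would learn such rules from labelled examples rather than having them coded by hand):

# Toy rule-based version of the shape example.
def classify_shape(num_sides, all_sides_equal):
    if num_sides == 4 and all_sides_equal:
        return "Square"
    if num_sides == 3:
        return "Triangle"
    if num_sides == 6 and all_sides_equal:
        return "Hexagon"
    return "Unknown"

print(classify_shape(4, True))   # Square
print(classify_shape(3, False))  # Triangle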

Steps Involved in Supervised Learning:


o First, determine the type of training dataset.
o Collect/gather the labelled training data.
o Split the dataset into a training set, a test set, and a validation set.
o Determine the input features of the training dataset, which should carry enough information for the model to accurately predict the output.
o Determine a suitable algorithm for the model, such as a support vector machine, decision tree, etc.
o Execute the algorithm on the training dataset. Sometimes we need a validation set to tune control parameters; it is a subset of the training data.
o Evaluate the accuracy of the model on the test set. If the model predicts the correct outputs, the model is accurate.


Types of Supervised Machine Learning Algorithms:

Supervised learning can be further divided into two types of problems:

1. Regression

Regression algorithms are used if there is a relationship between the input variable
and the output variable. It is used for the prediction of continuous variables, such as
Weather forecasting, Market Trends, etc. Below are some popular Regression
algorithms which come under supervised learning:

o Linear Regression
o Regression Trees
o Non-Linear Regression
o Bayesian Linear Regression
o Polynomial Regression

2. Classification

Classification algorithms are used when the output variable is categorical, meaning there are two or more classes such as Yes-No, Male-Female, True-False, etc. A typical example is spam filtering. Below are some popular classification algorithms which come under supervised learning:

o Random Forest
o Decision Trees
o Logistic Regression
o Support Vector Machines

Note: We will discuss these algorithms in detail in later chapters.



Advantages of Supervised learning:


o With the help of supervised learning, the model can predict the output on the basis of prior experience.
o In supervised learning, we can have an exact idea about the classes of objects.
o Supervised learning models help us solve various real-world problems such as fraud detection, spam filtering, etc.

Disadvantages of supervised learning:


o Supervised learning models are not suitable for handling complex tasks.
o Supervised learning cannot predict the correct output if the test data differs from the training dataset.
o Training requires a lot of computation time.
o In supervised learning, we need sufficient knowledge about the classes of objects.

Decision Tree in Machine Learning


A decision tree in machine learning is a versatile, interpretable algorithm used for
predictive modelling. It structures decisions based on input data, making it suitable
for both classification and regression tasks. This article delves into the
components, terminologies, construction, and advantages of decision trees,
exploring their applications and learning algorithms.
Decision Tree in Machine Learning
A decision tree is a type of supervised learning algorithm that is commonly used in machine learning to model and predict outcomes based on input data. It is a tree-like structure where each internal node tests an attribute, each branch corresponds to an attribute value, and each leaf node represents the final decision or prediction. The decision tree algorithm falls under the category of supervised learning and can be used to solve both regression and classification problems.
Decision Tree Terminologies
There are specialized terms associated with decision trees that denote various components and facets of the tree structure and the decision-making procedure:
 Root Node: The topmost node of a decision tree, representing the original choice or feature from which the tree branches.
 Internal Nodes (Decision Nodes): Nodes whose decisions are determined by the values of particular attributes. These nodes have branches leading to other nodes.
 Leaf Nodes (Terminal Nodes): The ends of branches, where choices or predictions are made. Leaf nodes have no further branches.
 Branches (Edges): Links between nodes that show how decisions are
made in response to particular circumstances.
 Splitting: The process of dividing a node into two or more sub-nodes
based on a decision criterion. It involves selecting a feature and a
threshold to create subsets of data.
 Parent Node: A node that is split into child nodes. The original node
from which a split originates.
 Child Node: Nodes created as a result of a split from a parent node.
 Decision Criterion: The rule or condition used to determine how the
data should be split at a decision node. It involves comparing feature
values against a threshold.
 Pruning: The process of removing branches or nodes from a decision
tree to improve its generalisation and prevent overfitting.
Understanding these terminologies is crucial for interpreting and working with
decision trees in machine learning applications.
How is a Decision Tree formed?
The process of forming a decision tree involves recursively partitioning the data based on the values of different attributes. The algorithm selects the best attribute to split the data at each internal node, based on criteria such as information gain or Gini impurity. This splitting process continues until a stopping criterion is met, such as reaching a maximum depth or having a minimum number of instances in a leaf node.
Why Decision Tree?
Decision trees are widely used in machine learning for a number of reasons:
 Decision trees are versatile in simulating intricate decision-making processes because of their interpretability and flexibility.
 Their hierarchical structure makes it possible to portray complex choice scenarios that take into account a variety of causes and outcomes.
 Because they provide comprehensible insights into the decision logic, decision trees are especially helpful for tasks involving classification and regression.
 They are proficient with both numerical and categorical data, and they can easily adapt to a variety of datasets thanks to their built-in feature selection capability.
 Decision trees also allow simple visualization, which helps to understand and explain the underlying decision process of a model.
Decision Tree Approach
A decision tree uses a tree representation to solve the problem: each leaf node corresponds to a class label, and attributes are represented at the internal nodes of the tree. We can represent any boolean function on discrete attributes using a decision tree.

Below are some assumptions that we make while using a decision tree:
 At the beginning, we consider the whole training set as the root.
 Feature values are preferred to be categorical. If the values are continuous, they are discretized prior to building the model.
 On the basis of attribute values, records are distributed recursively.
 We use statistical methods for ordering attributes as the root or an internal node.

The decision tree works in Sum of Product form, also known as Disjunctive Normal Form: each root-to-leaf path is a conjunction (product) of attribute tests, and a class is the disjunction (sum) of such paths. In a decision tree, the major challenge is the identification of the attribute for the root node at each level. This process is known as attribute selection. We have two popular attribute selection measures:
1. Information Gain
2. Gini Index
1. Information Gain:
When we use a node in a decision tree to partition the training instances into smaller subsets, the entropy changes. Information gain is a measure of this change in entropy. Suppose S is a set of instances, A is an attribute, Sv is the subset of S for which attribute A has value v, and Values(A) is the set of all possible values of A; then

Gain(S, A) = Entropy(S) − Σ v∈Values(A) (|Sv| / |S|) × Entropy(Sv)

Entropy is the measure of uncertainty of a random variable; it characterizes the impurity of an arbitrary collection of examples. The higher the entropy, the higher the information content. For a set S whose instances belong to classes with proportions p1, ..., pc,

Entropy(S) = − Σ i pi × log2(pi)

Example:
For the set X = {a,a,a,b,b,b,b,b}
Total instances: 8
Instances of b: 5
Instances of a: 3
Entropy(X) = −(3/8) × log2(3/8) − (5/8) × log2(5/8) ≈ 0.531 + 0.424 ≈ 0.954

Building a Decision Tree using Information Gain — the essentials:

 Start with all training instances associated with the root node.
 Use information gain to choose which attribute to label each node with.
 Note: no root-to-leaf path should contain the same discrete attribute twice.
 Recursively construct each subtree on the subset of training instances that would be classified down that path in the tree.
 If all remaining training instances are positive or all negative, label that node "yes" or "no" accordingly.
 If no attributes remain, label with a majority vote of the training instances left at that node.
 If no instances remain, label with a majority vote of the parent's training instances.
Example: Now, let us build a decision tree for the following data using information gain. Training set: 3 features and 2 classes.

X  Y  Z  Class
1  1  1  I
1  1  0  I
0  0  1  II
1  0  0  II

Here, we have 3 features and 2 output classes. To build a decision tree using information gain, we take each feature and calculate the information gain it provides. The gain is maximum when we split on feature Y, so the best-suited feature for the root node is Y. After splitting the dataset by feature Y, each child contains a pure subset of the target variable, so no further splits are needed: the final tree simply tests Y at the root, with leaves I (Y = 1) and II (Y = 0). A sketch of this computation is given below.
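
A minimal sketch reproducing these numbers by hand (plain Python, no libraries), using the entropy and gain formulas above:

# Minimal sketch: entropy and information gain for the 4-row table above.
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

# Rows: (X, Y, Z, class)
data = [(1, 1, 1, "I"), (1, 1, 0, "I"), (0, 0, 1, "II"), (1, 0, 0, "II")]
classes = [row[3] for row in data]

def info_gain(feature_index):
    gain = entropy(classes)
    for v in {row[feature_index] for row in data}:
        subset = [row[3] for row in data if row[feature_index] == v]
        gain -= (len(subset) / len(data)) * entropy(subset)
    return gain

for name, idx in [("X", 0), ("Y", 1), ("Z", 2)]:
    print(name, round(info_gain(idx), 3))
# Y yields the maximum gain (1.0): splitting on Y gives pure subsets.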

2. Gini Index
The Gini Index is a metric that measures how often a randomly chosen element would be incorrectly classified.
 An attribute with a lower Gini index should be preferred.
 Sklearn supports the "gini" criterion for the Gini Index, and it is the default value.
 For a node whose instances belong to classes with proportions p1, ..., pc, the formula for the Gini Index is:

Gini = 1 − Σ i pi²
Gini Impurity

The Gini Index is a measure of the inequality or impurity of a distribution, commonly used in decision trees and other machine learning algorithms. For a two-class problem it ranges from 0 to 0.5, where 0 indicates a pure set (all instances belong to the same class) and 0.5 indicates a maximally impure set (instances are evenly distributed across the classes).
Some additional features and characteristics of the Gini Index are:
 It is calculated by summing the squared probabilities of each outcome in
a distribution and subtracting the result from 1.
 A lower Gini Index indicates a more homogeneous or pure distribution,
while a higher Gini Index indicates a more heterogeneous or impure
distribution.
 In decision trees, the Gini Index is used to evaluate the quality of a split
by measuring the difference between the impurity of the parent node and
the weighted impurity of the child nodes.
 Compared to other impurity measures like entropy, the Gini Index is faster to compute because it avoids logarithms.
 One disadvantage of the Gini Index is that it tends to favour splits that
create equally sized child nodes, even if they are not optimal for
classification accuracy.
 In practice, the choice between using the Gini Index or other impurity
measures depends on the specific problem and dataset, and often requires
experimentation and tuning.
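
As a small illustration of the formula Gini = 1 − Σ pi², the following sketch computes the Gini impurity of a list of class labels (the labels are toy data):

# Minimal sketch: Gini impurity of a set of class labels (toy data).
def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

print(gini(["yes"] * 8))               # 0.0   -> pure node
print(gini(["yes"] * 4 + ["no"] * 4))  # 0.5   -> maximally impure (2 classes)
print(gini(["yes"] * 6 + ["no"] * 2))  # 0.375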
Example of a Decision Tree Algorithm
Forecasting activities using weather information:
 Root node: whole dataset.
 Attribute: "Outlook" (sunny, overcast, rainy).
 Subsets: Sunny, Overcast, and Rainy.
 Recursive splitting: divide the Sunny subset further, for example according to humidity.
 Leaf nodes: activities such as "swimming," "hiking," and "staying inside."
Advantages of Decision Tree
 Easy to understand and interpret, making them accessible to non-experts.
 Handle both numerical and categorical data without requiring extensive
preprocessing.
 Provides insights into feature importance for decision-making.
 Handle missing values and outliers without significant impact.
 Applicable to both classification and regression tasks.
Disadvantages of Decision Tree
 Potential for overfitting.
 Sensitivity to small changes in the data, and limited generalization if the training data is not representative.
 Potential bias in the presence of imbalanced data.
Conclusion
Decision trees, a key tool in machine learning, model and predict outcomes based
on input data through a tree-like structure. They offer interpretability, versatility,
and simple visualization, making them valuable for both categorization and
regression tasks. While decision trees have advantages like ease of understanding,
they may face challenges such as overfitting. Understanding their terminologies
and formation process is essential for effective application in diverse scenarios.

Support Vector Machine Algorithm


Support Vector Machine or SVM is one of the most popular Supervised Learning
algorithms, which is used for Classification as well as Regression problems. However,
primarily, it is used for Classification problems in Machine Learning.

The goal of the SVM algorithm is to create the best line or decision boundary that
can segregate n-dimensional space into classes so that we can easily put the new
data point in the correct category in the future. This best decision boundary is called
a hyperplane.

SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed a Support Vector Machine. Consider a scenario in which two different categories are separated by a decision boundary or hyperplane.

Example: SVM can be understood with the example that we used for the KNN classifier. Suppose we see a strange cat that also has some features of dogs. If we want a model that can accurately identify whether it is a cat or a dog, such a model can be created using the SVM algorithm. We first train our model with lots of images of cats and dogs so that it can learn their different features, and then we test it with this strange creature. The SVM creates a decision boundary between the two classes (cat and dog) and chooses the extreme cases (support vectors) of each class. On the basis of the support vectors, it will classify the creature as a cat.
The SVM algorithm can be used for face detection, image classification, text categorization, etc.

Types of SVM
SVM can be of two types:

o Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be classified into two classes by a single straight line, it is termed linearly separable data, and the classifier used is called a Linear SVM classifier.
o Non-linear SVM: Non-linear SVM is used for non-linearly separable data. If a dataset cannot be classified by a straight line, it is termed non-linear data, and the classifier used is called a Non-linear SVM classifier.

Hyperplane and Support Vectors in the SVM Algorithm:

Hyperplane: There can be multiple lines/decision boundaries to segregate the classes in n-dimensional space, but we need to find the best decision boundary that helps to classify the data points. This best boundary is known as the hyperplane of the SVM.
The dimensions of the hyperplane depend on the number of features present in the dataset: if there are 2 features, the hyperplane is a straight line, and if there are 3 features, the hyperplane is a two-dimensional plane.

We always create a hyperplane that has a maximum margin, i.e., the maximum distance between the hyperplane and the nearest data points of each class.

Support Vectors:

The data points or vectors that are closest to the hyperplane and which affect the position of the hyperplane are termed support vectors. Since these vectors support the hyperplane, they are called support vectors.

How does SVM work?

Linear SVM:

The working of the SVM algorithm can be understood with an example. Suppose we have a dataset with two tags (green and blue), and the dataset has two features, x1 and x2. We want a classifier that can classify each pair (x1, x2) of coordinates as either green or blue.

As this is a 2-d space, we can separate these two classes with a straight line. But there can be multiple lines that separate the classes. The SVM algorithm helps to find the best line or decision boundary; this best boundary or region is called a hyperplane. The SVM algorithm finds the points of each class closest to the boundary; these points are called support vectors. The distance between the support vectors and the hyperplane is called the margin, and the goal of SVM is to maximize this margin. The hyperplane with maximum margin is called the optimal hyperplane.
Non-Linear SVM:

If data is linearly arranged, we can separate it with a straight line, but for non-linear data, we cannot draw a single straight line.

To separate such data points, we need to add one more dimension. For linear data we used the two dimensions x and y, so for non-linear data we add a third dimension z. It can be calculated as:

z = x² + y²

By adding the third dimension, the sample space becomes three-dimensional, and SVM can divide the dataset into classes with a plane. Since we are in 3-d space, the decision boundary looks like a plane parallel to the x-axis. If we convert it back to 2-d space with z = 1, the boundary becomes the circle:

x² + y² = 1

Hence we get a circumference of radius 1 in the case of non-linear data.
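
This z = x² + y² mapping can be demonstrated directly. The sketch below uses hypothetical ring-shaped toy data and shows that adding the third dimension makes the classes separable by a plane, which is what a non-linear (e.g. RBF) kernel achieves implicitly:

# Sketch of the z = x^2 + y^2 feature map on hypothetical toy data:
# an inner cluster (class 0) surrounded by an outer ring (class 1).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 100)
inner = np.c_[0.5 * np.cos(angles[:50]), 0.5 * np.sin(angles[:50])]
outer = np.c_[2.0 * np.cos(angles[50:]), 2.0 * np.sin(angles[50:])]
X = np.vstack([inner, outer])
y = np.array([0] * 50 + [1] * 50)

# Explicit third dimension z = x^2 + y^2: a plane now separates the classes.
z = (X ** 2).sum(axis=1, keepdims=True)
X3 = np.hstack([X, z])
print(SVC(kernel='linear').fit(X3, y).score(X3, y))  # expect 1.0

# A non-linear kernel does this implicitly, without constructing z.
print(SVC(kernel='rbf').fit(X, y).score(X, y))       # expect 1.0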



Python Implementation of Support Vector Machine

Now we will implement the SVM algorithm using Python. Here we will use the same
dataset user_data, which we have used in Logistic regression and KNN classification.

o Data Pre-processing step


Till the Data pre-processing step, the code will remain the same. Below is the code:

#Data Pre-processing Step
# importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

#importing datasets
data_set= pd.read_csv('user_data.csv')

#Extracting Independent and dependent Variable
x= data_set.iloc[:, [2,3]].values
y= data_set.iloc[:, 4].values

# Splitting the dataset into training and test set.
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 0.25, random_state=0)

#feature Scaling
from sklearn.preprocessing import StandardScaler
st_x= StandardScaler()
x_train= st_x.fit_transform(x_train)
x_test= st_x.transform(x_test)

After executing the above code, the data will be pre-processed and ready for training.
Fitting the SVM classifier to the training set:

Now the training set will be fitted to the SVM classifier. To create the SVM classifier, we will import the SVC class from the sklearn.svm library. Below is the code for it:

from sklearn.svm import SVC # "Support vector classifier"
classifier = SVC(kernel='linear', random_state=0)
classifier.fit(x_train, y_train)

In the above code, we have used kernel='linear', as here we are creating an SVM for linearly separable data; however, we can change it for non-linear data. We then fitted the classifier to the training dataset (x_train, y_train).

Output:

Out[8]:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
kernel='linear', max_iter=-1, probability=False, random_state=0,
shrinking=True, tol=0.001, verbose=False)

The model performance can be altered by changing the value of C (the regularization factor), gamma, and the kernel.

o Predicting the test set result:


Now, we will predict the output for the test set. For this, we will create a new vector y_pred. Below is the code for it:

#Predicting the test set result
y_pred= classifier.predict(x_test)

After getting the y_pred vector, we can compare the result of y_pred and y_test to
check the difference between the actual value and predicted value.


o Creating the confusion matrix:


Now we will check the performance of the SVM classifier: how many incorrect predictions it makes compared to the Logistic regression classifier. To create the confusion matrix, we need to import the confusion_matrix function of the sklearn library. After importing the function, we call it and store the result in a new variable cm. The function takes two parameters, mainly y_true (the actual values) and y_pred (the values returned by the classifier). Below is the code for it:

#Creating the Confusion matrix
from sklearn.metrics import confusion_matrix
cm= confusion_matrix(y_test, y_pred)

Output:

In the resulting confusion matrix there are 66+24 = 90 correct predictions and 8+2 = 10 incorrect predictions. Therefore, we can say that our SVM model improved compared to the Logistic regression model.

o Visualizing the training set result:


Now we will visualize the training set result, below is the code for it:

from matplotlib.colors import ListedColormap
x_set, y_set = x_train, y_train
x1, x2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
                     nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
mtp.contourf(x1, x2, classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
mtp.xlim(x1.min(), x1.max())
mtp.ylim(x2.min(), x2.max())
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
mtp.title('SVM classifier (Training set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

Output:

By executing the above code, we get a plot similar to the Logistic regression output. In the plot, the hyperplane appears as a straight line because we used a linear kernel in the classifier; as discussed above, for a 2-d space the SVM hyperplane is a straight line.

o Visualizing the test set result:

#Visualizing the test set result
from matplotlib.colors import ListedColormap
x_set, y_set = x_test, y_test
x1, x2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
                     nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
mtp.contourf(x1, x2, classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
mtp.xlim(x1.min(), x1.max())
mtp.ylim(x2.min(), x2.max())
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
mtp.title('SVM classifier (Test set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

Output:

By executing the above code, we get the test-set plot. The SVM classifier has divided the users into two regions (Purchased or Not purchased). Users who purchased the SUV are in the red region with red scatter points, and users who did not purchase the SUV are in the green region with green scatter points. The hyperplane has divided the users into the Purchased and Not purchased categories.

Unsupervised Machine Learning


In the previous topic, we learned about supervised machine learning, in which models are trained using labeled data. But there may be many cases in which we do not have labeled data and need to find hidden patterns in the given dataset. To solve such cases in machine learning, we need unsupervised learning techniques.

What is Unsupervised Learning?


As the name suggests, unsupervised learning is a machine learning technique in which models are not supervised using a training dataset. Instead, the models themselves find hidden patterns and insights in the given data. It can be compared to the learning that takes place in the human brain when learning new things. It can be defined as:

Unsupervised learning is a type of machine learning in which models are trained using unlabeled
dataset and are allowed to act on that data without any supervision.

Unsupervised learning cannot be directly applied to a regression or classification problem because, unlike supervised learning, we have input data but no corresponding output data. The goal of unsupervised learning is to find the underlying structure of a dataset, group the data according to similarities, and represent the dataset in a compressed format.

Example: Suppose the unsupervised learning algorithm is given an input dataset


containing images of different types of cats and dogs. The algorithm is never trained
upon the given dataset, which means it does not have any idea about the features of
the dataset. The task of the unsupervised learning algorithm is to identify the image
features on their own. Unsupervised learning algorithm will perform this task by
clustering the image dataset into the groups according to similarities between
images.
Why use Unsupervised Learning?
Below are some main reasons which describe the importance of Unsupervised
Learning:

o Unsupervised learning is helpful for finding useful insights in data.
o Unsupervised learning is very similar to how a human learns to think through their own experiences, which makes it closer to real AI.
o Unsupervised learning works on unlabeled and uncategorized data, which makes it all the more important.
o In the real world, we do not always have input data with corresponding outputs, and unsupervised learning is needed to solve such cases.

Working of Unsupervised Learning


The working of unsupervised learning can be understood as follows: we take unlabeled input data, meaning it is not categorized and no corresponding outputs are given. This unlabeled input data is fed to the machine learning model in order to train it. First, the model interprets the raw data to find hidden patterns, and then a suitable algorithm, such as k-means clustering, is applied.

Once the suitable algorithm is applied, it divides the data objects into groups according to the similarities and differences between the objects.

Types of Unsupervised Learning Algorithm:
The unsupervised learning algorithm can be further categorized into two types of
problems:

o Clustering: Clustering is a method of grouping objects into clusters such that the objects with the most similarities remain in one group and have few or no similarities with the objects of other groups. Cluster analysis finds the commonalities between data objects and categorizes them according to the presence and absence of those commonalities.
o Association: An association rule is an unsupervised learning method used for finding relationships between variables in a large database. It determines the sets of items that occur together in the dataset. Association rules make marketing strategy more effective; for example, people who buy item X (say, bread) also tend to purchase item Y (butter or jam). A typical example of an association rule application is Market Basket Analysis.

Note: We will learn these algorithms in later chapters.

Unsupervised Learning algorithms:


Below is the list of some popular unsupervised learning algorithms:

o K-means clustering
o KNN (k-nearest neighbours)
o Hierarchical clustering
o Anomaly detection
o Neural Networks
o Principal Component Analysis
o Independent Component Analysis
o Apriori algorithm
o Singular value decomposition

Advantages of Unsupervised Learning


o Unsupervised learning can be used for more complex tasks than supervised learning because, in unsupervised learning, we do not need labeled input data.
o Unsupervised learning is preferable for many tasks, as it is easier to obtain unlabeled data than labeled data.


Disadvantages of Unsupervised Learning


o Unsupervised learning is intrinsically more difficult than supervised learning, as it does not have corresponding outputs to learn from.
o The result of an unsupervised learning algorithm might be less accurate, as the input data is not labeled and the algorithm does not know the exact output in advance.

Artificial Neural Network Tutorial

This Artificial Neural Network tutorial provides basic and advanced concepts of ANNs. It is developed for beginners as well as professionals.
The term "artificial neural network" refers to a biologically inspired sub-field of artificial intelligence modeled after the brain. An artificial neural network is a computational network based on the biological neural networks that make up the structure of the human brain. Just as the human brain has neurons interconnected with each other, artificial neural networks also have neurons linked to each other in various layers of the network. These neurons are known as nodes.
This tutorial covers all the aspects related to artificial neural networks. We will discuss ANNs, adaptive resonance theory, Kohonen self-organizing maps, building blocks, unsupervised learning, genetic algorithms, etc.
What is an Artificial Neural Network?

The term "Artificial Neural Network" is derived from the biological neural networks that develop the structure of the human brain. Just as the human brain has neurons interconnected with one another, artificial neural networks have neurons interconnected with one another in the various layers of the network. These neurons are known as nodes.

[Figures: a typical biological neural network and a typical artificial neural network.]

Dendrites from a biological neural network represent inputs in artificial neural networks, the cell nucleus represents nodes, synapses represent weights, and the axon represents the output.
Relationship between biological and artificial neural networks:

Biological Neural Network    Artificial Neural Network
Dendrites                    Inputs
Cell nucleus                 Nodes
Synapse                      Weights
Axon                         Output

An artificial neural network is a model in the field of artificial intelligence that attempts to mimic the network of neurons that makes up the human brain, so that computers can understand things and make decisions in a human-like manner. An artificial neural network is built by programming computers to behave like interconnected brain cells.
There are around 100 billion neurons in the human brain. Each neuron has connection points somewhere in the range of 1,000 to 100,000. In the human brain, data is stored in a distributed manner, and we can retrieve more than one piece of this data from memory in parallel when necessary. We can say that the human brain is made up of incredibly powerful parallel processors.
We can understand the artificial neural network with the example of a digital logic gate that takes an input and gives an output. Consider an "OR" gate, which takes two inputs: if one or both inputs are "On," the output is "On"; if both inputs are "Off," the output is "Off." Here the output depends only on the input. Our brain does not perform the same task: the output-to-input relationship keeps changing because the neurons in our brain are "learning."
The architecture of an artificial neural network:

To understand the architecture of an artificial neural network, we have to understand what a neural network consists of. A neural network consists of a large number of artificial neurons, termed units, arranged in a sequence of layers. Let us look at the various types of layers available in an artificial neural network.
An artificial neural network primarily consists of three layers:

Input Layer:
As the name suggests, it accepts inputs in several different formats provided by the programmer.

Hidden Layer:
The hidden layer sits between the input and output layers. It performs all the calculations needed to find hidden features and patterns.

Output Layer:
The input goes through a series of transformations via the hidden layers, finally resulting in the output that is conveyed through this layer.

The artificial neural network takes the inputs, computes the weighted sum of the inputs, and adds a bias. This computation is represented in the form of a transfer function.

The weighted total is then passed as input to an activation function, which produces the output. Activation functions decide whether a node should fire or not; only the nodes that fire contribute to the output layer. There are distinctive activation functions available, chosen according to the sort of task we are performing. A small sketch of this forward computation is given below.
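
A minimal numerical sketch of one neuron, weighted sum plus bias passed through an activation function (all values are hypothetical):

# Minimal sketch of one artificial neuron: weighted sum + bias,
# passed through a sigmoid activation. All values are hypothetical.
import math

inputs  = [0.5, 0.3, 0.2]      # x1, x2, x3
weights = [0.4, 0.7, -0.2]     # one weight per input
bias    = 0.1

weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias

def sigmoid(v):                # squashes the sum into (0, 1)
    return 1.0 / (1.0 + math.exp(-v))

output = sigmoid(weighted_sum)
print(round(weighted_sum, 3), round(output, 3))  # 0.47, ~0.615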
Advantages of Artificial Neural Networks (ANN)

Parallel processing capability:
Artificial neural networks can perform more than one task simultaneously.

Storing data on the entire network:
Information is stored across the whole network, not in a single database. The disappearance of a few pieces of data in one place does not prevent the network from working.

Capability to work with incomplete knowledge:
After training, an ANN may produce output even with incomplete data. The loss of performance depends on the importance of the missing data.

Having a distributed memory:
For an ANN to be able to adapt, it is important to determine suitable examples and to train the network towards the desired output by showing it these examples. The success of the network is directly proportional to the chosen instances; if the phenomenon is not presented to the network in all its aspects, the network can produce false output.

Having fault tolerance:
Corruption of one or more cells of an ANN does not prevent it from generating output, and this feature makes the network fault-tolerant.

Disadvantages of Artificial Neural Networks:

Assurance of proper network structure:
There is no particular guideline for determining the structure of an artificial neural network. An appropriate network structure is arrived at through experience and trial and error.

Unrecognized behavior of the network:
This is the most significant issue of ANNs. When an ANN produces a solution, it does not provide insight as to why and how, which decreases trust in the network.

Hardware dependence:
Artificial neural networks need processors with parallel processing power, in accordance with their structure. The realization of the equipment is therefore hardware-dependent.

Difficulty of presenting the problem to the network:
ANNs can work only with numerical data. Problems must be converted into numerical values before being introduced to the ANN. The representation mechanism chosen here will directly impact the performance of the network, and it relies on the user's abilities.

The duration of the network is unknown:
The network is trained down to a specific value of the error, and this value does not guarantee optimal results.
Artificial neural networks, which entered the scientific world in the mid-20th century, are developing exponentially. We have examined the pros of artificial neural networks and the issues encountered in the course of their use. It should not be overlooked that the cons of this flourishing branch of science are being eliminated one by one, while its pros increase day by day, meaning artificial neural networks will progressively become an irreplaceable part of our lives.
How do artificial neural networks work?

An artificial neural network can best be represented as a weighted directed graph, where the artificial neurons form the nodes and the associations between neuron outputs and neuron inputs are directed edges with weights. The artificial neural network receives input signals from an external source in the form of a pattern or image as a vector. These inputs are then mathematically denoted by the notation x(n) for each of the n inputs.
Afterwards, each input is multiplied by its corresponding weight (these weights are the details the artificial neural network uses to solve a specific problem). In general terms, these weights represent the strength of the interconnections between neurons inside the artificial neural network. All the weighted inputs are summed inside the computing unit.
If the weighted sum is zero, a bias is added to make the output non-zero, or to otherwise scale up the system's response; the bias has a fixed input of 1 with its own weight. The total of the weighted inputs can lie in the range 0 to positive infinity. To keep the response within the limits of the desired value, a certain maximum value is set as a benchmark, and the total of the weighted inputs is passed through an activation function.
The activation function refers to the set of transfer functions used to achieve the desired output. There are different kinds of activation functions, primarily either linear or non-linear sets of functions. Some of the commonly used activation functions are the binary, linear, and tan hyperbolic sigmoidal activation functions. Let us take a look at each of them in detail:

Binary:
In a binary activation function, the output is either a one or a zero. To accomplish this, a threshold value is set up: if the net weighted input of the neuron exceeds the threshold, the final output of the activation function is returned as one, otherwise the output is returned as zero.

Sigmoidal Hyperbolic:
The sigmoidal hyperbola function is generally seen as an "S"-shaped curve. Here the tan hyperbolic function is used to approximate the output from the actual net input. The function is defined as:

F(x) = 1 / (1 + exp(-βx))

where β is the steepness parameter.
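
These two activation functions can be written out directly; a small sketch is below, where the threshold and steepness values are illustrative choices:

# Sketch of the binary (threshold) and sigmoidal activation functions.
# Threshold and steepness values are illustrative.
import math

def binary_step(x, threshold=1.0):
    # returns 1 if the net weighted input exceeds the threshold, else 0
    return 1 if x > threshold else 0

def sigmoid(x, beta=1.0):
    # F(x) = 1 / (1 + exp(-beta * x)); beta is the steepness parameter
    return 1.0 / (1.0 + math.exp(-beta * x))

print(binary_step(1.5), binary_step(0.4))              # 1 0
print(round(sigmoid(0.0), 3), round(sigmoid(2.0), 3))  # 0.5 0.881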
Types of Artificial Neural Networks:

There are various types of artificial neural networks (ANN), which, depending on the neuron and network functions of the human brain, perform tasks in a similar way. Most artificial neural networks have some similarities with their more complex biological counterpart and are very effective at their intended tasks, for example segmentation or classification.

Feedback ANN:
In this type of ANN, the output is fed back into the network to internally arrive at the best-evolved results. According to the University of Massachusetts Lowell Centre for Atmospheric Research, feedback networks feed information back into themselves and are well suited to solving optimization problems. Internal system error corrections utilize feedback ANNs.

Feed-Forward ANN:
A feed-forward network is a basic neural network comprising an input layer, an output layer, and at least one layer of neurons. By assessing the network's output against its input, the intensity of the network can be observed based on the group behaviour of the associated neurons, and the output is decided. The primary advantage of this network is that it learns to evaluate and recognize input patterns.

Introduction to Natural Language Processing
The essence of Natural Language Processing lies in making computers
understand natural language. That is not an easy task, though. Computers can
understand the structured form of data, such as spreadsheets and tables in a
database, but human languages, texts, and voices form an unstructured
category of data that is difficult for a computer to understand; hence the
need for Natural Language Processing.
Natural Language Processing (NLP) is a critical area of artificial intelligence (AI)
that focuses on enabling computers to understand, interpret, and generate human
language in a way that is both meaningful and useful. NLP draws from various
fields including computer science, linguistics, and machine learning to bridge the
gap between human communication and computer understanding.

At its core, NLP aims to equip machines with the ability to comprehend and
generate natural language text or speech, similar to how humans do. This involves
a range of tasks, from simple tasks like language translation and sentiment analysis
to more complex tasks like language generation and understanding context.

Key components of NLP include:

1. **Tokenization**: Breaking down text into smaller units such as words or phrases (tokens) for analysis.

2. **Part-of-Speech (POS) Tagging**: Assigning grammatical categories (like noun, verb, adjective) to words in a sentence.

3. **Named Entity Recognition (NER)**: Identifying and classifying named entities such as names of people, organizations, or locations in text.

4. **Syntax and Parsing**: Analyzing sentence structure to understand relationships between words.

5. **Semantics**: Extracting meaning from text beyond just syntax, understanding the intended meaning or context.

6. **Sentiment Analysis**: Determining the sentiment or emotional tone of text, whether it's positive, negative, or neutral.

7. **Language Translation**: Translating text from one language to another while preserving the meaning.

8. **Language Generation**: Creating coherent and contextually relevant text, such as in chatbots or content summarization.
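As a hedged illustration of the first few components, here is a short sketch using the spaCy library, assuming spaCy and its small English model (en_core_web_sm) are installed; any other NLP toolkit would work similarly.

```python
import spacy

# Assumes: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("Apple is opening a new office in London next year.")

# 1. Tokenization: the doc is already split into tokens
print([token.text for token in doc])

# 2. Part-of-Speech tagging: grammatical category per token
print([(token.text, token.pos_) for token in doc])

# 3. Named Entity Recognition: people, organizations, locations, etc.
print([(ent.text, ent.label_) for ent in doc.ents])

# 4. Syntax/parsing: each token's dependency label and syntactic head
print([(token.text, token.dep_, token.head.text) for token in doc])
```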

NLP algorithms and models are often based on machine learning techniques like
deep learning, which involve training large neural networks on vast amounts of
textual data. These models learn to recognize patterns and make predictions about
language, enabling tasks like language translation, text summarization, question
answering, and more.

Applications of NLP are widespread across industries such as healthcare (for
clinical documentation), customer service (chatbots), finance (sentiment
analysis of market news), and entertainment (language-based gaming
experiences). As NLP continues to advance, it plays a crucial role in making
human-computer interaction more natural and intuitive.

Robotics and Artificial Intelligence


Robotics is a distinct branch of Artificial Intelligence concerned with the
design and creation of intelligent robots or machines. Robotics combines
electrical engineering, mechanical engineering, and computer science &
engineering, since robots have a mechanical construction and electrical
components, and are programmed using a programming language. Although
Robotics and Artificial Intelligence have different objectives and
applications, most people treat robotics as a subset of Artificial
Intelligence (AI). Robot machines can look very similar to humans, and they
can also perform like humans if enabled with AI.

In earlier days, robotic applications were very limited, but they have now
become smarter and more efficient through combination with Artificial
Intelligence. AI has played a crucial role in the industrial sector by
replacing humans in terms of productivity and quality. In this article,
'Robotics and Artificial Intelligence', we will discuss robots and Artificial
Intelligence and their various applications, advantages, differences, etc.
Let's start with the definitions of Artificial Intelligence (AI) and robots.

What is Artificial Intelligence?


Artificial Intelligence is defined as the branch of Computer Science &
Engineering that deals with creating intelligent machines that perform like
humans. Artificial Intelligence enables machines to sense, comprehend, act,
and learn human-like activities. There are mainly four types of Artificial
Intelligence: reactive machines, limited memory, theory of mind, and
self-awareness.

What is a robot?
A robot is a machine that looks like a human and is capable of performing
out-of-the-box actions and replicating certain human movements automatically
by means of commands given to it through programming. Examples: drug
compounding robots, automotive industry robots, order-picking robots,
industrial floor scrubbers, and Sage Automation gantry robots.

Components of Robot
A robot is constructed from several components, which are as follows:

o Actuators: Actuators are the devices responsible for moving and
controlling a system or machine. They achieve physical movement by
converting energy such as electrical, hydraulic, or pneumatic energy, and
they can create linear as well as rotary motion.
o Power Supply: This is an electrical device that supplies electrical power
to an electrical load. The primary function of the power supply is to
convert electric current into the power required by the load.
o Electric Motors: These are the devices that convert electrical energy into
mechanical energy and are required for the rotational motion of the machines.
o Pneumatic Air Muscles: Air Muscles are soft pneumatic devices that are
ideally best fitted for robotics. They can contract and extend and operate by
pressurized air filling a pneumatic bladder. Whenever air is introduced, it can
contract up to 40%.
o Muscle wires: These are made of a nickel-titanium alloy called Nitinol and
are very thin. A muscle wire extends and contracts when a specific amount of
heat or electric current is supplied to it, and it can be formed and bent
into different shapes when in its martensitic form. Muscle wires can
contract by about 5% when an electric current passes through them.
o Piezo Motors and Ultrasonic Motors: Piezoelectric motors or Piezo motors
are the electrical devices that receive an electric signal and apply a directional
force to an opposing ceramic plate. It helps a robot to move in the desired
direction. These are the best suited electrical motors for industrial robots.
o Sensors: Sensors give robots human-like abilities to see, hear, touch, and
move. They are devices that detect events or changes in the environment and
send the data to a computer processor, and they are usually combined with
other electronic devices. Just as sense organs do for humans, electrical
sensors play a crucial role in Artificial Intelligence and robotics: AI
algorithms control robots by sensing the environment, and the sensors
provide real-time information to the computer processors.

Applications of Robotics
Robotics has many application areas. Some of the important application
domains of robotics are as follows:

o Robotics in defence sectors: The defence sector is undoubtedly one of the
main parts of any country, and every country wants its defence system to be
strong. Robots help reach inaccessible and dangerous zones during war. DRDO
has developed a robot named Daksh to destroy life-threatening objects
safely. Robots help soldiers remain safe and are deployed by the military in
combat scenarios. Besides combat support, robots are also deployed in
anti-submarine operations, fire support, battle damage management, strike
missions, and laying machines.
o Robotics in Medical sectors: Robots also help in various medical fields such
as laparoscopy, neurosurgery, orthopaedic surgery, disinfecting rooms,
dispensing medication, and various other medical domains.
o Robotics in Industrial Sector: Robots are used in various manufacturing
processes such as cutting, welding, assembly, disassembly, pick and place
for printed circuit boards, packaging & labelling, palletizing, product
inspection & testing, colour coating, drilling, polishing, and material
handling.
Moreover, robotics technology increases productivity and profitability and
reduces human effort, resulting in lower physical strain and injury.
Industrial robots have some important advantages, which are as follows:
o Accuracy
o Flexibility
o Reduced labour charges
o Low-noise operation
o Fewer production damages
o Increased productivity rate
o Robotics in Entertainment: Over the last decade, the use of robots in
entertainment has increased continuously. Robots are being employed in the
entertainment sector, in areas such as movies, animation, games, and
cartoons. Robots are very helpful wherever repetitive actions are required:
a camera-wielding robot can help shoot a movie scene as many times as needed
without getting tired or frustrated. Disney, a big name in the industry, has
launched hundreds of robots for the film industry.
o Robots in the mining industry: Robotics is very helpful in various mining
applications such as robotic dozing, excavation and haulage, robotic mapping
& surveying, robotic drilling, and explosive handling. A mining robot can
navigate flooded passages on its own and use cameras and other sensors to
detect valuable minerals. Robots also help in excavation by detecting gases
and other materials, keeping humans safe from harm and injury. Robot rock
climbers are used for space exploration, and underwater drones are used for
ocean exploration.
AI technology used in Robotics

Computer Vision
Robots can also see, and this is made possible by one of the popular
Artificial Intelligence technologies: Computer Vision. Computer Vision plays
a crucial role in industries such as healthcare, entertainment, military,
and mining. It is an important domain of Artificial Intelligence that helps
in extracting meaningful information from images, videos, and other visual
inputs, and in taking action accordingly.

Natural Language Processing


NLP (Natural Language Processing) can be used to give voice commands to AI
robots, creating strong human-robot interaction. NLP is a specific area of
Artificial Intelligence that enables communication between humans and
robots. Through NLP techniques, a robot can understand and reproduce human
language; some robots are equipped with NLP so well that we can hardly
differentiate between humans and robots.

Similarly, in the healthcare sector, robots powered by Natural Language
Processing may help physicians observe disease details and automatically
fill in EHRs. Besides recognizing human language, such systems can learn
common usage patterns, such as picking up an accent and predicting how
humans speak.

Edge Computing
Edge computing in robots is defined as a service that provides robot
integration, testing, design, and simulation. Edge computing in robotics
provides better data management, lower connectivity costs, better security
practices, and a more reliable, uninterrupted connection.

Complex Event Processing


Complex event processing (CEP) is a concept that helps us understand the
processing of multiple events in real time. An event is described as a
change of state, and one or more events combine to define a complex event.
Complex event processing is widely used in various industries such as
healthcare, finance, security, and marketing. It is primarily used in credit
card fraud detection and in the stock market field.

For example, the deployment of an airbag in a car is a complex event based
on real-time data from multiple sensors. This idea is used in robotics, for
example in event processing for autonomous robot programming.
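Below is a minimal sketch of that airbag idea: several simple sensor events are combined into one complex event. The sensor names and thresholds are illustrative assumptions, not taken from any real system.

```python
from dataclasses import dataclass

@dataclass
class Event:
    source: str   # which sensor raised the event
    value: float  # the reading that represents the change of state

def detect_crash(window):
    """Combine simple events into the complex event 'crash detected'.
    The airbag deploys only if, within the same time window, the
    accelerometer reports extreme deceleration AND the pressure sensor
    reports a front impact (thresholds are illustrative)."""
    decel = any(e.source == "accelerometer" and e.value > 40 for e in window)
    impact = any(e.source == "pressure" and e.value > 90 for e in window)
    return decel and impact

window = [Event("accelerometer", 45.2), Event("pressure", 97.0)]
if detect_crash(window):
    print("Complex event: crash detected -> deploy airbag")
```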

Transfer Learning and AI


Transfer learning is a technique for solving a problem with the help of
another problem that is already solved. In the transfer learning technique,
knowledge gained from solving one problem is applied to solve a related
problem. For example, a model used to identify a circle shape can also be
used to identify a square shape.

Transfer learning reuses a pre-trained model for a related problem; only the
last layer of the model is trained, which is comparatively less time
consuming and cheaper. In robotics, transfer learning can be used to train
one machine with the help of others.
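As an illustration, here is a minimal transfer-learning sketch using TensorFlow/Keras: a model pre-trained on ImageNet is frozen and only a new final layer is trained. The input size, class count, and dataset are illustrative assumptions.

```python
import tensorflow as tf

# Reuse a model pre-trained on ImageNet, without its original top layer
# (downloads the pre-trained weights on first use)
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pre-trained layers

# Only this new head (the "last layer") will be trained
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),  # e.g. circle vs square
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5)  # your labelled data here
```

Because the frozen base already encodes general visual features, only the small new head needs training, which is why this approach is faster and cheaper than training from scratch.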

Reinforcement Learning
Reinforcement learning is a feedback-based learning method in machine
learning that enables an AI agent to learn and explore the environment,
perform actions, and learn automatically from the experience or feedback
received for each action. Furthermore, the agent autonomously learns to
behave optimally through hit-and-trial actions while interacting with the
environment. It is primarily used to develop a sequence of decisions and
achieve goals in uncertain and potentially complex environments. In
robotics, robots explore the environment and learn about it through hit and
trial; for each action, the robot receives a reward (positive or negative).
Reinforcement learning provides robotics with a framework to design and
simulate sophisticated and hard-to-engineer behaviours.
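Below is a minimal tabular Q-learning sketch of this hit-and-trial idea: the agent tries actions in a toy corridor environment, receives positive or negative rewards, and updates its value estimates from that feedback. The environment and hyperparameters are illustrative assumptions.

```python
import random

# Toy corridor: states 0..4, goal at state 4; actions: 0=left, 1=right
N_STATES, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-value table
alpha, gamma, epsilon = 0.1, 0.9, 0.2      # learning rate, discount, exploration

def step(state, action):
    """Environment: returns (next_state, reward) for the chosen action."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else -0.01)  # feedback for each action

for episode in range(500):
    s = 0
    while s != GOAL:
        # Hit and trial: sometimes explore a random action
        a = random.randrange(2) if random.random() < epsilon else Q[s].index(max(Q[s]))
        s2, r = step(s, a)
        # Update the value estimate from the reward received
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

print(Q)  # learned action values: moving right should dominate
```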

Affective computing
Affective computing is a field of study that deals with developing systems
that can identify, interpret, process, and simulate human emotions.
Affective computing aims to endow robots with emotional intelligence, in the
hope that robots can acquire human-like capabilities of observation,
interpretation, and emotional expression.
Mixed Reality
Mixed Reality is also an emerging domain. It is mainly used in the field of
programming by demonstration (PbD). PbD creates a prototyping mechanism for
algorithms using a combination of physical and virtual objects.

What are Artificially Intelligent Robots?


Artificially intelligent robots connect AI with robotics. AI robots are
controlled by AI programs and use different AI technologies, such as machine
learning, computer vision, and reinforcement learning. Most robots, however,
are not AI robots: they are programmed to perform repetitive series of
movements and do not need any AI to perform their tasks, which limits their
functionality. AI algorithms are necessary when you want to allow the robot
to perform more complex tasks.

A warehousing robot might use a path-finding algorithm to navigate around
the warehouse. A drone might use autonomous navigation to return home when
it is about to run out of battery. A self-driving car might use a
combination of AI algorithms to detect and avoid potential hazards on the
road. All of these are examples of artificially intelligent robots.
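As one concrete sketch of the warehouse case, the following uses breadth-first search to find a shortest route on a small grid map; the map layout is an illustrative assumption (real robots typically use richer planners such as A*).

```python
from collections import deque

# 0 = free floor, 1 = shelf/obstacle (illustrative warehouse map)
grid = [
    [0, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]

def bfs_path(start, goal):
    """Breadth-first search: returns a shortest path as a list of cells."""
    queue = deque([start])
    came_from = {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:      # walk back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < len(grid) and 0 <= nc < len(grid[0])
                    and grid[nr][nc] == 0 and (nr, nc) not in came_from):
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # no route exists

print(bfs_path((0, 0), (2, 3)))
```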

What are the advantages of integrating Artificial Intelligence into robotics?

o A major advantage of artificially intelligent robots is social care: they
can guide people, and especially come to the aid of older people, with
chatbot-like social skills and advanced processors.
o Robotics also helps the agricultural industry through the development of
AI-based robots. These robots reduce the farmer's workload.
o In the military industry, military bots can spy through speech and vision
detectors, and save lives by replacing infantry.
o Robots are also employed in volcanoes, deep oceans, extremely cold places,
and even in space, where humans normally cannot survive.
o Robotics is also used in the medical and healthcare industry, as robots
can perform complex surgeries that carry a higher risk of human error, using
a pre-set of instructions and added intelligence. AI-integrated robotics
could greatly reduce the number of casualties.
Difference between Robot Systems and AI Programs

Here is the difference between Artificial Intelligence programs and robots:
1. AI Programs
Usually, AI programs operate in computer-simulated worlds.

Generally, input is given in the form of symbols and rules.

To operate them, we need general-purpose or special-purpose computers.

2. Robots
Generally, robots operate in the real, physical world.

Inputs are given in the form of analogue signals, such as speech waveforms.

To operate them, special hardware with sensors and effectors is needed.

What is an Expert System?


An expert system is a computer program that is designed to solve complex problems
and to provide decision-making ability like a human expert. It performs this by
extracting knowledge from its knowledge base using the reasoning and inference
rules according to the user queries.

The expert system is a part of AI, and the first ES was developed in 1970;
it was among the first successful applications of artificial intelligence.
It solves the most complex issues as an expert would, by extracting the
knowledge stored in its knowledge base. The system helps in decision-making
for complex problems using both facts and heuristics, like a human expert.
It is called an expert system because it contains the expert knowledge of a
specific domain and can solve any complex problem of that particular domain.
These systems are designed for a specific domain, such as medicine, science,
etc.

The performance of an expert system is based on the expert's knowledge
stored in its knowledge base. The more knowledge stored in the KB, the more
the system improves its performance. One common example of an ES is the
suggestion of spelling corrections while typing in the Google search box.

The working of an expert system can be represented as a block diagram
connecting its main components, which are described below.
Note: It is important to remember that an expert system is not used to replace the
human experts; instead, it is used to assist the human in making a complex
decision. These systems do not have human capabilities of thinking and work on
the basis of the knowledge base of the particular domain.

Below are some popular examples of the Expert System:

o DENDRAL: It was an artificial intelligence project built as a chemical
analysis expert system. It was used in organic chemistry to detect unknown
organic molecules with the help of their mass spectra and a knowledge base
of chemistry.
o MYCIN: It was one of the earliest backward chaining expert systems that was
designed to find the bacteria causing infections like bacteraemia and meningitis. It
was also used for the recommendation of antibiotics and the diagnosis of blood
clotting diseases.
o PXDES: It is an expert system that is used to determine the type and level of lung
cancer. To determine the disease, it takes a picture from the upper body, which looks
like the shadow. This shadow identifies the type and degree of harm.
o CaDeT: The CaDet expert system is a diagnostic support system that can detect
cancer at early stages.

Characteristics of Expert System

o High Performance: The expert system provides high performance for solving
any type of complex problem of a specific domain with high efficiency and
accuracy.
o Understandable: It responds in a way that can be easily understood by the
user. It can take input in human language and provides output in the same
way.
o Reliable: It is highly reliable in generating efficient and accurate
output.
o Highly responsive: An ES provides the result for any complex query within
a very short period of time.
Components of Expert System


An expert system mainly consists of three components:

o User Interface
o Inference Engine
o Knowledge Base

1. User Interface

With the help of a user interface, the expert system interacts with the
user, takes queries as input in a readable format, and passes them to the
inference engine. After getting the response from the inference engine, it
displays the output to the user. In other words, it is an interface that
helps a non-expert user communicate with the expert system to find a
solution.

2. Inference Engine (Rule Engine)


o The inference engine is known as the brain of the expert system as it is the main
processing unit of the system. It applies inference rules to the knowledge base to
derive a conclusion or deduce new information. It helps in deriving an error-free
solution of queries asked by the user.
o With the help of an inference engine, the system extracts the knowledge from the
knowledge base.
o There are two types of inference engine:
o Deterministic Inference engine: The conclusions drawn from this type of inference
engine are assumed to be true. It is based on facts and rules.
o Probabilistic Inference engine: This type of inference engine contains
uncertainty in its conclusions and is based on probability.

Inference engine uses the below modes to derive the solutions:

o Forward Chaining: It starts from the known facts and rules, and applies
the inference rules to add their conclusions to the known facts.
o Backward Chaining: It is a backward reasoning method that starts from the
goal and works backward to establish the facts that support it.
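A minimal sketch of forward chaining, with illustrative facts and rules: the engine repeatedly fires any rule whose conditions are satisfied and adds its conclusion to the known facts until nothing new can be derived.

```python
# Each rule: (set of required facts, fact to conclude) -- illustrative domain
rules = [
    ({"has_fever", "has_rash"}, "suspect_measles"),
    ({"suspect_measles"}, "recommend_blood_test"),
]

def forward_chain(facts, rules):
    """Start from known facts; fire any rule whose conditions are met
    and add its conclusion, until no rule adds a new fact."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(forward_chain({"has_fever", "has_rash"}, rules))
# -> includes 'suspect_measles' and then 'recommend_blood_test'
```

Backward chaining would instead start from a goal such as "recommend_blood_test" and search the rules for conditions that would establish it.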

3. Knowledge Base

o The knowledge base is a type of storage that stores knowledge acquired
from different experts of the particular domain. It is considered a large
store of knowledge. The bigger the knowledge base, the more precise the
expert system will be.
o It is similar to a database that contains information and rules of a
particular domain or subject.
o One can also view the knowledge base as a collection of objects and their
attributes. For example, a lion is an object, and its attributes are that it
is a mammal, it is not a domestic animal, etc.
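The lion example can be sketched as a tiny object-attribute store; the structure and attribute names are illustrative assumptions.

```python
# Knowledge base as a mapping from objects to their attributes
knowledge_base = {
    "Lion": {"is_mammal": True, "is_domestic": False, "diet": "carnivore"},
    "Cow":  {"is_mammal": True, "is_domestic": True,  "diet": "herbivore"},
}

def query(obj, attribute):
    """Look up one attribute of one object in the knowledge base."""
    return knowledge_base.get(obj, {}).get(attribute)

print(query("Lion", "is_domestic"))  # False -> not a domestic animal
```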

Components of Knowledge Base

o Factual Knowledge: Knowledge that is based on facts and accepted by
knowledge engineers comes under factual knowledge.
o Heuristic Knowledge: This knowledge is based on practice, the ability to
guess, evaluation, and experience.

Knowledge Representation: It is used to formalize the knowledge stored in
the knowledge base using if-else rules.

Knowledge Acquisition: It is the process of extracting, organizing, and
structuring the domain knowledge, specifying the rules for acquiring
knowledge from various experts, and storing that knowledge in the knowledge
base.

Development of Expert System

Here, we will explain the working of an expert system by taking the example
of the MYCIN ES. Below are the steps to build MYCIN:

o Firstly, the ES should be fed with expert knowledge. In the case of MYCIN,
human experts specialized in the medical field of bacterial infections
provide information about the causes, symptoms, and other knowledge of that
domain.
o Once the KB of MYCIN has been updated, it needs to be tested. The doctor
provides a new problem to it: identifying the presence of bacteria by
inputting the details of a patient, including the symptoms, current
condition, and medical history.
o The ES will need a questionnaire to be filled by the patient to know the general
information about the patient, such as gender, age, etc.
o Now the system has collected all the information, so it will find the solution for the
problem by applying if-then rules using the inference engine and using the facts
stored within the KB.
o In the end, it will provide a response to the patient by using the user interface.

Participants in the development of Expert System

There are three primary participants in the building of Expert System:

1. Expert: The success of an ES depends greatly on the knowledge provided by
human experts. These experts are persons specialized in that specific
domain.
2. Knowledge Engineer: Knowledge engineer is the person who gathers the
knowledge from the domain experts and then codifies that knowledge to the system
according to the formalism.
3. End-User: This is a particular person or a group of people, not
necessarily experts, who work with the expert system and need its solution
or advice for their complex queries.
Why Expert System?

Before using any technology, we must have an idea of why to use it, and the
same holds for the ES. Given that we have human experts in every field, why
develop a computer-based system at all? The points below describe the need
for the ES:

1. No memory limitations: It can store as much data as required and recall
it at the time of application, whereas human experts are limited in what
they can memorize at any time.
2. High efficiency: If the knowledge base is updated with correct knowledge,
it provides a highly efficient output, which may not be possible for a
human.
3. Expertise in a domain: There are many human experts in each domain, all
with different skills and different experiences, so it is not easy to get a
single, final answer to a query. But if we put the knowledge gained from
human experts into an expert system, it provides an efficient output by
combining all the facts and knowledge.
4. Not affected by emotions: These systems are not affected by human
emotions such as fatigue, anger, depression, or anxiety, so their
performance remains constant.
5. High security: These systems provide high security when resolving any
query.
6. Considers all the facts: To respond to a query, it checks and considers
all the available facts and provides the result accordingly, whereas a human
expert may overlook some facts for one reason or another.
7. Regular updates improve the performance: If there is an issue in the result
provided by the expert systems, we can improve the performance of the system by
updating the knowledge base.

Capabilities of the Expert System


Below are some capabilities of an Expert System:
o Advising: It is capable of advising a human being on queries in the domain
of the particular ES.
o Provide decision-making capabilities: It provides decision-making
capabilities in any domain, such as financial decisions or decisions in
medical science.
o Demonstrate a device: It is capable of demonstrating a new product, such
as its features, specifications, and how to use it.
o Problem-solving: It has problem-solving capabilities.
o Explaining a problem: It is also capable of providing a detailed description of an
input problem.
o Interpreting the input: It is capable of interpreting the input given by the user.
o Predicting results: It can be used for the prediction of a result.
o Diagnosis: An ES designed for the medical field is capable of diagnosing a disease
without using multiple components as it already contains various inbuilt medical
tools.

Advantages of Expert System

o These systems are highly reproducible.


o They can be used for risky places where the human presence is not safe.
o Error possibilities are less if the KB contains correct knowledge.
o The performance of these systems remains steady as it is not affected by emotions,
tension, or fatigue.
o They respond to any particular query at very high speed.
Limitations of Expert System

o The response of the expert system may be wrong if the knowledge base
contains incorrect information.
o Unlike a human being, it cannot produce creative output for different
scenarios.
o Its maintenance and development costs are very high.
o Knowledge acquisition for its design is very difficult.
o We require a specific ES for each domain, which is a big limitation.
o It cannot learn by itself and hence requires manual updates.

Applications of Expert System

o In the design and manufacturing domain:
It can be broadly used for designing and manufacturing physical devices such
as camera lenses and automobiles.
o In the knowledge domain:
These systems are primarily used for publishing relevant knowledge to users.
Two popular ES used in this domain are an advisor and a tax advisor.
o In the finance domain:
In the finance industry, it is used to detect any type of possible fraud or
suspicious activity and to advise bankers on whether they should provide
loans to a business.
o In the diagnosis and troubleshooting of devices:
The ES system is used in medical diagnosis, which was the first area where
these systems were applied.
o Planning and scheduling:
Expert systems can also be used to plan and schedule particular tasks for
achieving the goals of those tasks.

An Expert System is a type of artificial intelligence (AI) program designed to mimic and replicate the
decision-making abilities of a human expert in a specific domain or field. It is a computer-based
system that uses knowledge, reasoning, and decision-making methodologies to solve problems that
normally require human expertise.

Key components and characteristics of an Expert System include:


1. **Knowledge Base**: This is a repository of domain-specific knowledge, facts, rules, and heuristics
that the system uses to draw conclusions and make decisions. The knowledge base is typically
created by experts in the domain and represents their expertise in a structured format that the
computer can understand.

2. **Inference Engine**: The inference engine is the core component of the expert system that
processes the information stored in the knowledge base. It uses various reasoning techniques (such
as forward chaining or backward chaining) to derive new facts or make decisions based on the
provided input and the rules encoded in the knowledge base.

3. **User Interface**: Expert systems include a user interface that allows users to interact with the
system, provide input, ask questions, and receive recommendations or solutions. This interface can
be text-based, graphical, or even voice-driven depending on the application.

4. **Explanation Facility**: Many expert systems are equipped with an explanation facility that can
explain how a particular conclusion or recommendation was reached. This is important for users to
understand the reasoning behind the system's output and build trust in its recommendations.

5. **Integration with External Systems**: In some applications, expert systems may need to
integrate with external data sources or systems to gather additional information or to act upon the
decisions made by the system.

Expert systems are used in various fields such as medicine, finance, engineering, customer support,
and more, where they can assist in complex decision-making, problem-solving, diagnosis,
troubleshooting, and knowledge sharing. They excel in situations where access to human experts is
limited, where consistency and accuracy are critical, and where the domain knowledge can be
explicitly defined and codified. However, expert systems also have limitations, particularly in handling
uncertainties, adapting to new situations, and dealing with unstructured data beyond their defined
knowledge base.
