AI Notes: Unit 4

UNIT IV
2 MARKS
1. What is the Purpose of Machine Learning?
Machine learning can be seen as a branch of AI (Artificial Intelligence), since the ability to
turn experience into expertise, or to detect patterns in complex data, is a mark of human or
animal intelligence.
As a field of science, machine learning shares common concepts with other disciplines
such as statistics, information theory, game theory, and optimization.
2. What is an Agent?
An agent is anything that perceives its environment through sensors and acts upon that
environment through actuators. An agent runs in a cycle of perceiving, thinking, and acting.
An agent can be:
o Human Agent: A human agent has eyes, ears, and other organs that serve as sensors,
and hands, legs, and a vocal tract that serve as actuators.
o Robotic Agent: A robotic agent can have cameras, an infrared range finder, and NLP as
sensors, and various motors as actuators.
o Software Agent: A software agent can take keystrokes and file contents as sensory input,
act on those inputs, and display output on the screen.
3. What is an Intelligent Agent?
An intelligent agent is an autonomous entity that acts upon an environment using sensors
and actuators to achieve its goals. An intelligent agent may learn from the environment
to achieve those goals. A thermostat is an example of an intelligent agent.
The four main rules for an AI agent are:
o Rule 1: An AI agent must have the ability to perceive the environment.
o Rule 2: The observation must be used to make decisions.
o Rule 3: The decision should result in an action.
o Rule 4: The action taken by an AI agent must be a rational action.
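The perceive, decide, act cycle and the four rules can be sketched with the thermostat example. This is only an illustration: the class name, the tolerance, and the one-degree effect of each action are assumptions made for the sketch, not part of the notes.

```python
# Hypothetical thermostat agent; names and numbers are illustrative assumptions.
class ThermostatAgent:
    def __init__(self, target, tolerance=0.5):
        self.target = target
        self.tolerance = tolerance

    def perceive(self, environment):
        # Rule 1: sense the environment (read the room temperature).
        return environment["temperature"]

    def decide(self, temperature):
        # Rule 2: use the observation to make a decision.
        if temperature < self.target - self.tolerance:
            return "heat"
        if temperature > self.target + self.tolerance:
            return "cool"
        return "idle"

    def act(self, environment, action):
        # Rule 3: the decision results in an action on the environment.
        if action == "heat":
            environment["temperature"] += 1.0
        elif action == "cool":
            environment["temperature"] -= 1.0


def run(agent, environment, steps):
    """Run the perceive -> decide -> act cycle and record the actions taken."""
    actions = []
    for _ in range(steps):
        action = agent.decide(agent.perceive(environment))
        agent.act(environment, action)
        actions.append(action)
    return actions


room = {"temperature": 18.0}
history = run(ThermostatAgent(target=21.0), room, steps=5)
print(history)  # heats until the target is reached, then stays idle
```

Rule 4 (rationality) shows up here only in a simple form: the agent always picks the action that moves the temperature toward its goal.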
4. What is a Rational Agent?
A rational agent is an agent that has clear preferences, models uncertainty, and acts in a
way that maximizes its performance measure over all possible actions.
A rational agent is said to do the right thing. AI is about creating rational agents that
apply game theory and decision theory to various real-world scenarios.
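One common way to make "maximize its performance measure" concrete is to pick the action with the highest expected utility. The sketch below assumes that interpretation; the action names, probabilities, and utility values are invented for illustration.

```python
def expected_utility(outcomes):
    """Sum of probability * utility over the possible outcomes of one action."""
    return sum(p * u for p, u in outcomes)


def rational_choice(actions):
    """Return the name of the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical decision under uncertainty: 30% chance of rain.
# Each outcome is (probability, utility).
actions = {
    "take_umbrella": [(0.3, 8), (0.7, 6)],
    "no_umbrella": [(0.3, -10), (0.7, 10)],
}
choice = rational_choice(actions)
print(choice)
```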
5. What are the types of Association Rule Learning?
The types of association rule learning are as follows −
Apriori Algorithm − This algorithm uses frequent itemsets to produce association rules. It is
designed to work on databases that contain transactions. It uses a breadth-first search and
a hash tree to compute the itemsets efficiently.
Eclat Algorithm − Eclat stands for Equivalence Class Transformation. This algorithm uses a
depth-first search to discover frequent itemsets in a transaction database. It executes
faster than the Apriori Algorithm.
F-P Growth Algorithm − F-P stands for Frequent Pattern. It is an enhanced version of the
Apriori Algorithm. It represents the database as a tree structure known as a frequent
pattern tree. The purpose of this tree is to extract the most frequent patterns.
6. List out the steps in K-means clustering
The K-means clustering algorithm uses the following steps −
(a) Select K initial cluster centroids c1, c2, c3, ..., ck.
(b) Assign each instance x in the data set S to the cluster whose centroid is nearest to x.
(c) For each cluster, recompute its centroid based on the elements contained in that cluster.
(d) Go to (b) and repeat until convergence.
In summary:
• It separates the objects (data points) into K clusters.
• The cluster center (centroid) is the average of all the data points in the cluster.
• Each point is assigned to the cluster whose centroid is nearest (using a distance function).
7. How does Machine Learning work?
8. Why is machine learning important?

Machine learning is important because it gives enterprises a view of trends in customer
behavior and business operational patterns, as well as supports the development of new products.
Many of today's leading companies, such as Facebook, Google and Uber, make machine
learning a central part of their operations. Machine learning has become a significant
competitive differentiator for many companies.
9. What is the Need for Machine Learning?
• The need for machine learning is increasing day by day. The reason is that machine
learning can do tasks that are too complex for a person to implement directly.
• The importance of machine learning can be easily understood from its use cases.
Currently, machine learning is used in self-driving cars, cyber fraud detection, face
recognition, friend suggestions by Facebook, etc.
10. What are the different types of machine learning?
• Classical machine learning is often categorized by how an algorithm learns to become
more accurate in its predictions. There are four basic approaches:
• supervised learning,
• unsupervised learning,
• semi-supervised learning
• reinforcement learning
11. How does supervised machine learning work?
• Supervised machine learning requires the data scientist to train the algorithm with both
labeled inputs and desired outputs. Supervised learning algorithms are good for the
following tasks:
• Binary classification: Dividing data into two categories.
• Multi-class classification: Choosing between more than two types of answers.
• Regression modeling: Predicting continuous values.
• Ensembling: Combining the predictions of multiple machine learning models to produce
an accurate prediction.
12. How does unsupervised machine learning work?
• Unsupervised machine learning algorithms do not require data to be labeled. They sift
through unlabeled data to look for patterns that can be used to group data points into
subsets. Some deep learning models, such as autoencoders, are trained in this way.
Unsupervised learning algorithms are good for the following tasks:
• Clustering: Splitting the dataset into groups based on similarity.
• Anomaly detection: Identifying unusual data points in a data set.
• Association mining: Identifying sets of items in a data set that frequently occur together.
• Dimensionality reduction: Reducing the number of variables in a data set.
13. How does semi-supervised learning work?
• Semi-supervised learning works by data scientists feeding a small amount of labeled
training data to an algorithm. From this, the algorithm learns the dimensions of the data
set, which it can then apply to new, unlabeled data. Semi-supervised learning is used in
areas such as:
• Machine translation: Teaching algorithms to translate language based on less than a full
dictionary of words.
• Fraud detection: Identifying cases of fraud when you only have a few positive
examples.
• Labelling data: Algorithms trained on small data sets can learn to apply data labels to
larger sets automatically.
14. How does reinforcement learning work?
• Reinforcement learning works by programming an algorithm with a distinct goal and a
prescribed set of rules for accomplishing that goal. It is often used in areas such as:
• Robotics: Robots can learn to perform tasks in the physical world using this technique.
• Video gameplay: Reinforcement learning has been used to teach bots to play a number
of video games.
• Resource management: Given finite resources and a defined goal, reinforcement
learning can help enterprises plan out how to allocate resources.
15. Draw the Machine learning process
16. How does Association Rule Learning work?
• Association rule learning works on the concept of an if-then statement, such as "if A then
B".
• Here the "if" element is called the antecedent, and the "then" statement is called the
consequent.
17. Define the Apriori algorithm
• Apriori is a straightforward algorithm that performs the following sequence of
calculations:
1. Calculate support for item sets of size 1.
2. Apply the minimum support threshold and prune item sets that do not meet the threshold.
3. Move on to item sets of size 2 and repeat steps one and two.
4. Continue the same process until no additional item sets satisfying the minimum threshold
can be found.
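The four steps above can be sketched in plain Python. This is a minimal illustration, not an optimized implementation: it recounts support by scanning every transaction and does not use a hash tree. The function name, the basket data, and the 0.5 threshold are assumptions for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Step 1: start with all item sets of size 1.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:  # Step 4: stop when no candidates remain.
        # Step 2: keep only candidates that meet the minimum support.
        level = {c: support(c) for c in candidates}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Step 3: join surviving sets into candidates one item larger.
        k += 1
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Drop candidates that contain an infrequent subset of size k - 1.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk", "eggs"},
]
result = apriori(baskets, min_support=0.5)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With these baskets, "eggs" is pruned at size 1, and {butter, milk} is pruned at size 2, so no item set of size 3 survives.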
18. What are the Applications of Association Rule Learning?
• It has various applications in machine learning and data mining. Below are some popular
applications of association rule learning:
• Market Basket Analysis: It is one of the most popular examples and applications of
association rule mining. This technique is commonly used by big retailers to determine
the association between items.
• Medical Diagnosis: Association rules can support diagnosis, as they help in identifying
the probability of illness for a particular disease.
• Protein Sequence: Association rules help in determining the synthesis of artificial
proteins.
• It is also used for Catalog Design, Loss-leader Analysis, and many other
applications.
19. What is Fuzzy Logic?
• Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. This
approach is similar to how humans make decisions, and it covers all
intermediate possibilities between YES and NO.
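The "intermediate possibilities between YES and NO" are usually expressed as a degree of membership between 0 and 1. The sketch below assumes a triangular membership function, which is one common choice; the "warm" set and its temperature bounds are invented for illustration.

```python
def triangular(x, a, b, c):
    """Membership degree in a triangular fuzzy set with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0  # completely outside the set (a clear NO)
    if x <= b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)  # falling edge


# Hypothetical fuzzy set "warm": starts at 15 °C, peaks at 25 °C, ends at 35 °C.
warm = triangular(22.0, 15.0, 25.0, 35.0)
print(warm)  # a partial degree of membership, neither 0 nor 1
```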
20. Why do we use Fuzzy Logic?
• Generally, we use fuzzy logic systems for both commercial and practical purposes,
such as:
• It controls machines and consumer products.
• Where it cannot give accurate reasoning, it at least provides acceptable reasoning.
• It helps in dealing with uncertainty in engineering.
21. Draw the Fuzzy Logic Systems Architecture
22. What is Defuzzification?
The defuzzification process converts fuzzy sets into a crisp value. Different
techniques are available, and you need to select the one best suited to the expert system.
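One widely used defuzzification technique is the centroid (centre of gravity) method. The sketch below assumes that method applied to a fuzzy set sampled at a few discrete points; the fan-speed values and membership degrees are made up for the example.

```python
def centroid(values, memberships):
    """Centroid defuzzification over a discretely sampled fuzzy set."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("fuzzy set is empty: all membership degrees are zero")
    # Membership-weighted average of the sampled values.
    return sum(v * m for v, m in zip(values, memberships)) / total


# Hypothetical output set "fan speed in percent", sampled at five points.
speeds = [0, 25, 50, 75, 100]
degrees = [0.0, 0.2, 0.8, 0.2, 0.0]
crisp = centroid(speeds, degrees)
print(crisp)  # a single crisp value near the middle of the range
```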
23. What is the Inference Engine?
It determines the degree of match between the fuzzy input and the rules. Based on the
input, it decides which rules are to be fired. The fired rules are then combined to form
the control actions.
24. What are the Applications of Fuzzy Logic?
• Fuzzy logic is used in various fields such as automotive systems, domestic goods,
environment control, etc. Some of the common applications are:
• It is used in the aerospace field for altitude control of spacecraft and satellites.
• It controls speed and traffic in automotive systems.
• It is used for decision-making support systems and personal evaluation in large
companies.
• It also controls pH, drying, and chemical distillation processes in the chemical industry.
25. What are the Advantages & Disadvantages of Fuzzy Logic?
• The structure of Fuzzy Logic Systems is easy and understandable
• Fuzzy logic is widely used for commercial and practical purposes
• It helps you to control machines and consumer products
• It helps you to deal with the uncertainty in engineering
Example of a Fuzzy Logic System
• This system adjusts the temperature of an air conditioner by comparing the room
temperature with the target temperature value.
26. Define Set
A set is a collection of unordered or ordered elements. The following are
examples of sets:
• The set of all natural numbers.
• The set of students in a class.
• The set of all cities in a state.
• The set of upper-case letters of the alphabet.
27. What are the Operations on Classical Set?
• Union Operation
• Intersection Operation
• Difference Operation
• Complement Operation
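The four operations map directly onto Python's built-in `set` type. In this sketch the sets `a` and `b` and the universal set are arbitrary examples; the complement is taken relative to that assumed universe.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}
universe = set(range(1, 7))  # assumed universal set {1, ..., 6}

union = a | b             # elements in a or b
intersection = a & b      # elements in both a and b
difference = a - b        # elements in a but not in b
complement = universe - a # elements of the universe not in a

print(union, intersection, difference, complement)
```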
28. What are the Applications of Association Rule Learning?
• Market Basket Analysis: It is one of the most popular examples and applications of
association rule mining. This technique is commonly used by big retailers to determine
the association between items.
• Medical Diagnosis: Association rules can support diagnosis, as they help in identifying
the probability of illness for a particular disease.
• Protein Sequence: Association rules help in determining the synthesis of artificial
proteins.
• It is also used for Catalog Design, Loss-leader Analysis, and many other
applications.
5 MARKS
1. Explain and brief about the Machine Learning and its types?
Types of Machine Learning
Machine learning is a subset of AI that enables a machine to automatically learn from
data, improve its performance from past experience, and make predictions.
Machine learning contains a set of algorithms that work on a huge amount of data. Data is fed to
these algorithms to train them, and on the basis of this training, they build a model and perform a
specific task.
These ML algorithms help to solve different business problems such as Regression, Classification,
Forecasting, Clustering, and Association.
Based on the methods and way of learning, machine learning is divided mainly into four types,
which are:
1. Supervised Machine Learning
2. Unsupervised Machine Learning
3. Semi-Supervised Machine Learning
4. Reinforcement Learning
The types of machine learning are described below, along with their respective algorithms:
1. Supervised Machine Learning
The main goal of the supervised learning technique is to map the input variable(x) with the
output variable(y). Some real-world applications of supervised learning are Risk Assessment,
Fraud Detection, Spam filtering, etc.
Categories of Supervised Machine Learning
Supervised machine learning can be classified into two types of problems, which are given
below:
o Classification
o Regression
a) Classification
Classification algorithms are used to solve classification problems, in which the output
variable is categorical, such as "Yes" or "No", Male or Female, Red or Blue, etc. The
classification algorithms predict the categories present in the dataset. Some real-world examples
of classification are Spam Detection, Email Filtering, etc.
Some popular classification algorithms are given below:
o Random Forest Algorithm
o Decision Tree Algorithm
o Logistic Regression Algorithm
o Support Vector Machine Algorithm
b) Regression
Regression algorithms are used to solve regression problems, in which the output variable
is continuous and depends on the input variables (in the simplest case, through a linear
relationship). They are used to predict continuous output variables, such as market trends,
weather, etc.
Some popular Regression algorithms are given below:
o Simple Linear Regression Algorithm
o Multivariate Regression Algorithm
o Decision Tree Algorithm
o Lasso Regression
2. Unsupervised Machine Learning
The main aim of the unsupervised learning algorithm is to group or categorize the unsorted
dataset according to its similarities, patterns, and differences. Machines are instructed to
find the hidden patterns in the input dataset.
Categories of Unsupervised Machine Learning
Unsupervised Learning can be further classified into two types, which are given below:
o Clustering
o Association
1) Clustering
The clustering technique is used when we want to find the inherent groups in the data. It is a
way to group objects into clusters such that the objects with the most similarities remain in
one group and have few or no similarities with the objects of other groups. An example of
clustering is grouping customers by their purchasing behaviour.
Some of the popular clustering algorithms are given below:
o K-Means Clustering algorithm
o Mean-shift algorithm
o DBSCAN Algorithm
Two related unsupervised techniques are often listed alongside these, although they perform
dimensionality reduction rather than clustering:
o Principal Component Analysis
o Independent Component Analysis
2) Association
Association rule learning is an unsupervised learning technique, which finds interesting relations
among variables within a large dataset. The main aim of this learning algorithm is to find the
dependency of one data item on another data item and map those variables accordingly so that it
can generate maximum profit. This algorithm is mainly applied in Market Basket analysis,
Web usage mining, continuous production, etc.
Some popular algorithms of Association rule learning are Apriori Algorithm, Eclat, FP-growth
algorithm.
3. Semi-Supervised Learning
Semi-Supervised learning is a type of Machine Learning algorithm that lies between
Supervised and Unsupervised machine learning. It represents the intermediate ground
between Supervised (with labelled training data) and Unsupervised (with no labelled
training data) algorithms and uses a combination of labelled and unlabelled datasets during the
training period.
4. Reinforcement Learning
Reinforcement learning works on a feedback-based process, in which an AI agent (a
software component) automatically explores its surroundings by trial and error: taking
actions, learning from experience, and improving its performance. The agent is rewarded for
each good action and punished for each bad action; hence the goal of a reinforcement learning
agent is to maximize the rewards.
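The reward-and-punishment loop can be sketched with a very small agent that keeps a running average of the reward seen for each action. The epsilon-greedy exploration rule, the action names, and the fixed rewards of +1 and -1 are assumptions chosen to keep the example short; real environments have states and noisy rewards.

```python
import random


def train(rewards, episodes=50, epsilon=0.1, seed=0):
    """Learn an average reward per action by trial and error (epsilon-greedy)."""
    rng = random.Random(seed)
    actions = list(rewards)
    value = {a: 0.0 for a in actions}  # estimated reward of each action
    count = {a: 0 for a in actions}    # how often each action was tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)          # explore: try something at random
        else:
            action = max(actions, key=value.get)  # exploit: best action so far
        reward = rewards[action]                  # feedback from the environment
        count[action] += 1
        # Incremental update of the running average reward for this action.
        value[action] += (reward - value[action]) / count[action]
    return value


# Hypothetical environment: one rewarded action and one punished action.
values = train({"wrong_move": -1.0, "right_move": 1.0})
best = max(values, key=values.get)
print(best, values)
```

After the first trial the rewarded action always has the higher estimate, so the agent settles on it and only occasionally explores the other one.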
Categories of Reinforcement Learning
Reinforcement learning is categorized mainly into two types of methods/algorithms:
o Positive Reinforcement Learning: Positive reinforcement learning increases the
tendency that the required behaviour will occur again by adding something. It
strengthens the behaviour of the agent and positively impacts it.
o Negative Reinforcement Learning: Negative reinforcement learning works in exactly the
opposite way to positive RL. It increases the tendency that the specific behaviour will
occur again by avoiding the negative condition.
2. What are the Advantages and Disadvantages of Supervised Learning?
Advantages:
o Since supervised learning works with a labelled dataset, we can have an exact idea
about the classes of objects.
o These algorithms are helpful in predicting the output on the basis of prior experience.
Disadvantages:
o These algorithms are not able to solve complex tasks.
o They may predict the wrong output if the test data is different from the training data.
o They require a lot of computational time to train.
3. What are the Applications of Supervised Learning?
Some common applications of Supervised Learning are given below:
o Image Segmentation - Supervised Learning algorithms are used in image segmentation. In
this process, image classification is performed on different image data with pre-defined
labels.
o Medical Diagnosis - Supervised algorithms are also used in the medical field for diagnosis
purposes. It is done by using medical images and past data labelled with disease
conditions. With such a process, the machine can identify a disease for new patients.
o Fraud Detection - Supervised Learning classification algorithms are used for identifying
fraud transactions, fraud customers, etc. It is done by using historic data to identify the
patterns that can lead to possible fraud.
o Spam detection - In spam detection & filtering, classification algorithms are used. These
algorithms classify an email as spam or not spam. The spam emails are sent to the spam
folder.
o Speech Recognition - Supervised learning algorithms are also used in speech
recognition. The algorithm is trained with voice data, and various identifications can be
done using the same, such as voice-activated passwords, voice commands, etc.
4. What are the Advantages and Disadvantages of Unsupervised Learning Algorithm?
Advantages:
o These algorithms can be used for more complicated tasks than supervised ones
because they work on unlabelled datasets.
o Unsupervised algorithms are preferable for various tasks, as getting an unlabelled dataset
is easier than getting a labelled dataset.
Disadvantages:
o The output of an unsupervised algorithm can be less accurate, as the dataset is not
labelled and the algorithms are not trained with the exact output in advance.
o Working with unsupervised learning is more difficult, as it works with an unlabelled
dataset that does not map to an output.
5. What are the Applications of Unsupervised Learning?
o Network Analysis: Unsupervised learning is used for identifying plagiarism and
copyright issues in document network analysis of text data for scholarly articles.
o Recommendation Systems: Recommendation systems widely use unsupervised learning
techniques for building recommendation applications for different web applications and
e-commerce websites.
o Anomaly Detection: Anomaly detection is a popular application of unsupervised
learning, which can identify unusual data points within the dataset. It is used to discover
fraudulent transactions.
o Singular Value Decomposition: Singular Value Decomposition or SVD is used to
extract particular information from the database. For example, extracting information of
each user located at a particular location.
6. What are the Advantages and disadvantages of Semi-supervised Learning?
Advantages:
o It is simple and easy to understand the algorithm.
o It is highly efficient.
o It is used to solve drawbacks of Supervised and Unsupervised Learning algorithms.
Disadvantages:
o Iteration results may not be stable.
o We cannot apply these algorithms to network-level data.
o Accuracy is low.
10 MARKS
1. What is Machine Learning and explain it?
In the real world, we are surrounded by humans who can learn everything from their
experiences with their learning capability, and we have computers or machines which work on
our instructions. But can a machine also learn from experiences or past data like a human does?
This is where Machine Learning comes in.
Machine Learning is a subset of artificial intelligence that is mainly concerned
with the development of algorithms which allow a computer to learn from data and past
experiences on its own. The term machine learning was first introduced by Arthur
Samuel in 1959. We can define it in a summarized way as:
Machine learning enables a machine to automatically learn from data, improve performance
from experiences, and predict things without being explicitly programmed.
With the help of sample historical data, which is known as training data, machine
learning algorithms build a mathematical model that helps in making predictions or decisions
without being explicitly programmed. Machine learning brings computer science and statistics
together for creating predictive models. Machine learning constructs or uses algorithms that
learn from historical data. The more information we provide, the better the performance
will be.
A machine has the ability to learn if it can improve its performance by gaining more data.
How does Machine Learning work
A Machine Learning system learns from historical data, builds prediction models, and
whenever it receives new data, predicts the output for it. The accuracy of the predicted output
depends upon the amount of data, as a huge amount of data helps to build a better model which
predicts the output more accurately.
Suppose we have a complex problem where we need to make some predictions. Instead of
writing code for it, we just need to feed the data to generic algorithms, and with the help of
these algorithms, the machine builds the logic from the data and predicts the output. Machine
learning has changed our way of thinking about the problem. The block diagram below explains
the working of a Machine Learning algorithm:
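As a rough illustration of this learn-then-predict cycle, the sketch below fits a straight line to a few historical data points with ordinary least squares and then predicts the output for a new input. The study-hours data and the function names are invented for the example; real systems use richer models and far more data.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from historical data by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to new, unseen input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: hours studied -> exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

model = fit_line(hours, scores)  # training: build the model from past data
predicted = predict(model, 6)    # prediction: new input the model has not seen
print(model, predicted)
```

No rule such as "each hour adds two points" is written by hand; the relationship is recovered from the data.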
Features of Machine Learning:
o Machine learning uses data to detect various patterns in a given dataset.
o It can learn from past data and improve automatically.
o It is a data-driven technology.
o Machine learning is very similar to data mining, as it also deals with huge amounts of
data.
Need for Machine Learning
The need for machine learning is increasing day by day. The reason is that machine
learning can do tasks that are too complex for a person to
implement directly. As humans, we have limitations: we cannot process huge amounts
of data manually, so we need computer systems, and this is where machine
learning makes things easier for us.
We can train machine learning algorithms by providing them with a huge amount of data
and letting them explore the data, construct the models, and predict the required output
automatically. The performance of a machine learning algorithm depends on the amount of
data, and it can be measured by the cost function. With the help of machine learning, we can
save both time and money.
The importance of machine learning can be easily understood from its use cases.
Currently, machine learning is used in self-driving cars, cyber fraud detection, face
recognition, friend suggestions by Facebook, etc. Various top companies such as Netflix
and Amazon have built machine learning models that use vast amounts of data to analyze
user interest and recommend products accordingly.
The following key points show the importance of Machine Learning:
o Rapid increase in the production of data
o Solving complex problems that are difficult for a human
o Decision making in various sectors, including finance
o Finding hidden patterns and extracting useful information from data.
Classification of Machine Learning
At a broad level, machine learning can be classified into three types:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
1) Supervised Learning
Supervised learning is a type of machine learning method in which we provide sample labeled
data to the machine learning system in order to train it, and on that basis, it predicts the output.
The system creates a model using labeled data to understand the dataset and learn about each
data item. Once the training and processing are done, we test the model by providing sample
data to check whether it predicts the correct output.
The goal of supervised learning is to map input data to output data. Supervised
learning is based on supervision, in the same way that a student learns things under the
supervision of a teacher. An example of supervised learning is spam filtering.
Supervised learning can be grouped further in two categories of algorithms:
o Classification
o Regression
2) Unsupervised Learning
Unsupervised learning is a learning method in which a machine learns without any supervision.
The machine is trained with a set of data that has not been labeled, classified,
or categorized, and the algorithm needs to act on that data without any supervision. The goal of
unsupervised learning is to restructure the input data into new features or groups of objects with
similar patterns.
In unsupervised learning, we don't have a predetermined result. The machine tries to find useful
insights from a huge amount of data. It can be further classified into two categories of
algorithms:
o Clustering
o Association
3) Reinforcement Learning
Reinforcement learning is a feedback-based learning method, in which a learning agent gets a
reward for each right action and a penalty for each wrong action. The agent learns
automatically from this feedback and improves its performance. In reinforcement learning, the
agent interacts with the environment and explores it. The goal of the agent is to collect the most
reward points, and hence it improves its performance.
A robotic dog that automatically learns the movement of its limbs is an example of
reinforcement learning.
2. What is Association rule learning and explain it?
Association Rule Learning
Association rule learning is a type of unsupervised learning technique that checks for the
dependency of one data item on another data item and maps them accordingly so that the result
can be more profitable. It tries to find interesting relations or associations among the
variables of a dataset, using different rules to discover these relations in the
database.
Association rule learning is one of the most important concepts of machine learning,
and it is employed in Market Basket analysis, Web usage mining, continuous production,
etc. Market basket analysis is a technique used by various big retailers to discover the
associations between items. We can understand it by taking the example of a supermarket, where
products that are frequently purchased together are placed together.
For example, if a customer buys bread, he will most likely also buy butter, eggs, or milk, so these
products are stored on the same shelf or nearby. Consider the below diagram:
Association rule learning can be divided into three types of algorithms:
1. Apriori
2. Eclat
3. F-P Growth Algorithm
We will understand these algorithms in later chapters.
How does Association Rule Learning work?
Association rule learning works on the concept of an if-then statement, such as "if A then B".
Here the "if" element is called the antecedent, and the "then" statement is called the consequent.
A relationship in which we find an association between two single items is
known as single cardinality. It is all about creating rules, and if the number of items increases,
then the cardinality also increases accordingly. So, to measure the associations between thousands of
data items, there are several metrics. These metrics are given below:
o Support
o Confidence
o Lift
Let's understand each of them:
Support
Support is the frequency of an itemset, or how frequently it appears in the dataset. It is defined
as the fraction of the transactions T that contain the itemset X. For an itemset X and a total of
T transactions, it can be written as:
Support(X) = Freq(X) / T
Confidence
Confidence indicates how often the rule has been found to be true, that is, how often the items X
and Y occur together in the dataset when the occurrence of X is already given. It is the ratio of
the number of transactions that contain both X and Y to the number of transactions that contain X:
Confidence(X → Y) = Freq(X, Y) / Freq(X)
Lift
It is the strength of a rule, which can be defined by the formula below:
Lift(X → Y) = Support(X, Y) / (Support(X) × Support(Y))
It is the ratio of the observed support to the support expected if X and Y were independent
of each other. There are three possible cases:
o Lift = 1: The occurrences of the antecedent and the consequent are independent of
each other.
o Lift > 1: The two itemsets are positively dependent on each other; the larger the lift, the
stronger the dependence.
o Lift < 1: One item is a substitute for the other, which means one item has a
negative effect on the other.
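The three metrics can be computed directly from a list of transactions. The sketch below uses the standard definitions (support as a fraction of transactions, confidence as a ratio of supports, lift as confidence divided by the support of the consequent); the basket data and function names are assumptions for illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, x, y):
    """How often Y appears in transactions that already contain X."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)


def lift(transactions, x, y):
    """Observed co-occurrence relative to what independence would predict."""
    return confidence(transactions, x, y) / support(transactions, y)


# Hypothetical supermarket baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]
rule_support = support(baskets, {"bread", "butter"})
rule_confidence = confidence(baskets, {"bread"}, {"butter"})
rule_lift = lift(baskets, {"bread"}, {"butter"})
print(rule_support, rule_confidence, rule_lift)
```

For the rule bread → butter the lift comes out above 1, which matches the reading that the two items tend to be bought together.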
Types of Association Rule Learning
Association rule learning can be divided into three algorithms:
Apriori Algorithm
This algorithm uses frequent itemsets to generate association rules. It is designed to work on
databases that contain transactions. It uses a breadth-first search and a hash tree to
calculate the itemsets efficiently.
It is mainly used for market basket analysis and helps to understand which products can be
bought together. It can also be used in the healthcare field to find drug reactions for patients.
Eclat Algorithm
Eclat stands for Equivalence Class Transformation. This algorithm uses a depth-first
search technique to find frequent itemsets in a transaction database. It executes faster
than the Apriori Algorithm.
F-P Growth Algorithm
F-P stands for Frequent Pattern, and the F-P growth algorithm is an improved version of the
Apriori Algorithm. It represents the database in the form of a tree structure known as a
frequent pattern tree. The purpose of this tree is to extract the most frequent patterns.
Applications of Association Rule Learning
It has various applications in machine learning and data mining. Below are some popular
applications of association rule learning:
o Market Basket Analysis: It is one of the most popular examples and applications of
association rule mining. This technique is commonly used by big retailers to determine
the association between items.
o Medical Diagnosis: Association rules can support diagnosis, as they help in identifying
the probability of illness for a particular disease.
o Protein Sequence: Association rules help in determining the synthesis of artificial
proteins.
o It is also used for Catalog Design, Loss-leader Analysis, and many other
applications.
3. What is K-Means Clustering Algorithm and explain it?
K-Means Clustering is an unsupervised learning algorithm that is used to solve clustering
problems in machine learning or data science. This answer covers what the K-means
clustering algorithm is and how the algorithm works.
What is K-Means Algorithm?
K-Means Clustering is an Unsupervised Learning algorithm which groups the unlabeled dataset
into different clusters. Here K defines the number of pre-defined clusters that need to be created
in the process: if K=2, there will be two clusters, for K=3, there will be three clusters, and so on.
It is an iterative algorithm that divides the unlabeled dataset into K different clusters in such a
way that each data point belongs to only one group of points with similar properties.
It allows us to cluster the data into different groups and is a convenient way to discover the
categories of groups in the unlabeled dataset on its own, without the need for any training.
It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim
of this algorithm is to minimize the sum of distances between the data points and the centroids of
their corresponding clusters.
The algorithm takes the unlabeled dataset as input, divides the dataset into K clusters, and
repeats the process until the clusters stop changing. The value of K should be predetermined in
this algorithm.
The k-means clustering algorithm mainly performs two tasks:
o Determines the best positions for the K center points or centroids by an iterative process.
o Assigns each data point to its closest k-center. The data points which are nearest to a
particular k-center form a cluster.
Hence each cluster has data points with some commonalities, and it is away from other clusters.
The below diagram explains the working of the K-means Clustering Algorithm:
How does the K-Means Algorithm Work?
The working of the K-Means algorithm is explained in the below steps:
Step-1: Select the number K to decide the number of clusters.
Step-2: Select K random points as the initial centroids. (They need not be points from the input
dataset.)
Step-3: Assign each data point to its closest centroid, which will form the predefined K
clusters.
Step-4: Calculate the mean of the points in each cluster and place the new centroid of that
cluster there.
Step-5: Repeat the third step, which means reassign each data point to the new closest centroid.
Step-6: If any reassignment occurs, then go to step-4, else go to FINISH.
Step-7: The model is ready.
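The steps above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions rather than a definitive implementation: the starting centroids are drawn from the data points themselves, squared Euclidean distance is used, and the names `k_means`, `squared_distance`, `mean_point`, `max_iter`, and `seed`, as well as the sample points, are hypothetical.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean (center of gravity) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step-2: pick K starting centroids (assumption: K distinct data points).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step-3 / Step-5: assign every point to its closest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step-6: finish when no point has been reassigned.
        if new_labels == labels:
            break
        labels = new_labels
        # Step-4: move each centroid to the mean of its cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_point(members)
    return centroids, labels

# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
```

On this small dataset the first three points end up in one cluster and the last three in the other; which of the two clusters receives label 0 depends on the random starting centroids.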
Let's understand the above steps by considering the visual plots:
Suppose we have two variables M1 and M2. The x-y axis scatter plot of these two variables is
given below:
o Let's take the number of clusters K=2, which means we will try to group the data points
into two different clusters.
o We need to choose K random points or centroids to form the clusters. These points
can be either points from the dataset or any other points. So, here we are selecting the
below two points as K points, which are not part of our dataset. Consider the below
image:
o Now we will assign each data point of the scatter plot to its closest K-point or centroid.
We compute this with the usual formula for the distance between two points. For K=2 this
is equivalent to drawing the perpendicular bisector (the "median" line) of the segment that
joins the two centroids. Consider the below image:
From the above image, it is clear that the points on the left side of the line are nearer to the K1
or blue centroid, and the points on the right of the line are nearer to the yellow centroid. Let's
color them blue and yellow for clear visualization.
o As we need to find the closest cluster, we will repeat the process by choosing new
centroids. To choose the new centroids, we will compute the center of gravity (mean) of the
data points in each cluster, and will find new centroids as below:
o Next, we will reassign each data point to its new closest centroid. For this, we will repeat
the same process of finding a median line. The median will be like the below image:
From the above image, we can see that one yellow point is on the left side of the line, and two
blue points are to the right of the line. So, these three points will be reassigned to the other
centroids.
As reassignment has taken place, we will again go to step-4, which is finding new
centroids or K-points.
o We will repeat the process by finding the center of gravity of the data points in each
cluster, so the new centroids will be as shown in the below image:
o As we have new centroids, we will again draw the median line and reassign the data
points. So, the image will be:
o We can see in the above image that there are no dissimilar data points on either side of the
line, which means our model is formed. Consider the below image:
As our model is ready, we can now remove the assumed centroids, and the two final clusters
will be as shown in the below image:
How to choose the value of "K number of clusters" in K-means Clustering?
The performance of the K-means clustering algorithm depends on the quality of the clusters that
it forms, and choosing the optimal number of clusters is a big task. There are different ways to
find the optimal number of clusters, but here we discuss one of the most commonly used
methods for finding the number of clusters, or value of K. The method is given below:
Elbow Method
The Elbow method is one of the most popular ways to find the optimal number of clusters. This
method uses the concept of the WCSS value. WCSS stands for Within Cluster Sum of Squares,
which defines the total variation within the clusters. The formula to calculate the value of WCSS
(for 3 clusters) is given below:
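$$\mathrm{WCSS} = \sum_{P_i \in \mathrm{Cluster}_1} \mathrm{distance}(P_i, C_1)^2 + \sum_{P_i \in \mathrm{Cluster}_2} \mathrm{distance}(P_i, C_2)^2 + \sum_{P_i \in \mathrm{Cluster}_3} \mathrm{distance}(P_i, C_3)^2$$
In the above formula of WCSS, the first term,
$\sum_{P_i \in \mathrm{Cluster}_1} \mathrm{distance}(P_i, C_1)^2$, is the sum of the squares of the distances between
each data point in Cluster 1 and its centroid $C_1$; the other two terms are the same for Clusters 2 and 3.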
To measure the distance between data points and centroid, we can use any method such as
Euclidean distance or Manhattan distance.
To find the optimal number of clusters, the elbow method follows the below steps:
o It executes K-means clustering on the given dataset for different K values (for example,
from 1 to 10).
o For each value of K, it calculates the WCSS value.
o It plots a curve of the calculated WCSS values against the number of clusters K.
o The point where the curve bends sharply, like the elbow of an arm, is
considered the best value of K.
Since the graph shows a sharp bend that looks like an elbow, it is known as the elbow
method. The graph for the elbow method looks like the below image:
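The WCSS values behind such a curve can be computed with a short sketch like the one below. It is illustrative only: it assumes squared Euclidean distance, uses a compact K-means with random starting centroids taken from the data, and keeps the lowest WCSS over a few random restarts for each K. The names `k_means`, `wcss`, `sq_dist`, `wcss_by_k`, and the sample points are hypothetical.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, seed=0, max_iter=100):
    """Compact K-means; returns the centroids and the list of clusters."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster; an empty cluster keeps its old one.
        updated = [
            tuple(sum(x) / len(members) for x in zip(*members)) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters

def wcss(centroids, clusters):
    """Within Cluster Sum of Squares: squared distances of points to their centroid."""
    return sum(sq_dist(p, c) for c, members in zip(centroids, clusters) for p in members)

# Hypothetical data: three small groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 1), (9, 1), (9, 2), (5, 8), (5, 9), (6, 8)]

wcss_by_k = {}
for k in range(1, 7):
    # Keep the best of several random starts, since K-means can hit a poor local optimum.
    wcss_by_k[k] = min(wcss(*k_means(points, k, seed=s)) for s in range(10))

for k, value in wcss_by_k.items():
    print(k, round(value, 2))
```

Plotting `wcss_by_k` against K gives the elbow curve; the value of K after which the WCSS stops dropping sharply is the candidate for the number of clusters.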
3. Explain about Agents in Artificial Intelligence
Artificial intelligence can be defined as the study of rational agents and their environments. The
agents sense the environment through sensors and act on their environment through actuators. An
AI agent can have mental properties such as knowledge, belief, intention, etc.
What is an Agent?
An agent can be anything that perceives its environment through sensors and acts upon that
environment through actuators. An Agent runs in the cycle of perceiving, thinking, and acting.
An agent can be:
o Human-Agent: A human agent has eyes, ears, and other organs which work as sensors,
and hands, legs, and the vocal tract which work as actuators.
o Robotic Agent: A robotic agent can have cameras, an infrared range finder, and NLP as
sensors, and various motors as actuators.
o Software Agent: A software agent can have keystrokes and file contents as sensory input,
act on those inputs, and display output on the screen.
Hence the world around us is full of agents such as thermostats, cellphones, and cameras, and
even we are agents.
Before moving forward, we should first know about sensors, effectors, and actuators.
Sensor: A sensor is a device which detects changes in the environment and sends the
information to other electronic devices. An agent observes its environment through sensors.
Actuators: Actuators are the components of machines that convert energy into motion. The
actuators are responsible for moving and controlling a system. An actuator can be an
electric motor, gears, rails, etc.
Effectors: Effectors are the devices which affect the environment. Effectors can be legs, wheels,
arms, fingers, wings, fins, and display screens.
Intelligent Agents:
An intelligent agent is an autonomous entity which acts upon an environment using sensors and
actuators to achieve goals. An intelligent agent may learn from the environment to achieve
its goals. A thermostat is an example of an intelligent agent.
Following are the main four rules for an AI agent:
o Rule 1: An AI agent must have the ability to perceive the environment.
o Rule 2: The observation must be used to make decisions.
o Rule 3: The decision should result in an action.
o Rule 4: The action taken by an AI agent must be a rational action.
Rational Agent:
A rational agent is an agent which has clear preferences, models uncertainty, and acts in a way
that maximizes its performance measure over all possible actions.
A rational agent is said to do the right thing. AI is about creating rational agents, using
game theory and decision theory, for various real-world scenarios.
For an AI agent, rational action is most important because in a reinforcement learning
algorithm the agent gets a positive reward for each best possible action and a negative reward
for each wrong action.
Rationality:
The rationality of an agent is measured by its performance measure. Rationality can be judged on
the basis of the following points:
o The performance measure, which defines the success criterion.
o The agent's prior knowledge of its environment.
o The best possible actions that the agent can perform.
o The sequence of percepts.
Structure of an AI Agent
The task of AI is to design an agent program which implements the agent function. The structure
of an intelligent agent is a combination of architecture and agent program. It can be viewed as:
Agent = Architecture + Agent program
Following are the main three terms involved in the structure of an AI agent:
Architecture: Architecture is the machinery that an AI agent executes on.
Agent Function: The agent function maps a percept sequence to an action:
f: P* → A
Agent program: The agent program is an implementation of the agent function. An agent program
executes on the physical architecture to produce the function f.
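The relationship between the three terms can be illustrated with a small thermostat sketch. Everything in it is a hypothetical example rather than a standard API: the 20.0 degree threshold, the action names "heat_on" and "heat_off", and the names `thermostat_function`, `AgentProgram`, and `run` are assumptions made only for illustration.

```python
from typing import Callable, List, Sequence

Percept = float  # assumption: a temperature reading in degrees Celsius
Action = str     # assumption: "heat_on" or "heat_off"

def thermostat_function(percepts: Sequence[Percept]) -> Action:
    """Agent function f: P* -> A, defined on the whole percept sequence."""
    latest = percepts[-1]
    return "heat_on" if latest < 20.0 else "heat_off"

class AgentProgram:
    """Agent program: remembers the percept sequence and applies f to it."""

    def __init__(self, agent_function: Callable[[Sequence[Percept]], Action]):
        self.agent_function = agent_function
        self.percepts: List[Percept] = []

    def step(self, percept: Percept) -> Action:
        self.percepts.append(percept)
        return self.agent_function(self.percepts)

def run(program: AgentProgram, readings: Sequence[Percept]) -> List[Action]:
    """Stand-in for the architecture: feeds sensor readings in, collects actions."""
    return [program.step(reading) for reading in readings]

actions = run(AgentProgram(thermostat_function), [18.5, 19.9, 20.0, 22.3])
```

Here `run` plays the role of the architecture, `AgentProgram` is the agent program, and `thermostat_function` is the agent function that the program implements.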
PEAS Representation
PEAS is a model for describing the task environment in which an AI agent works. When we define
an AI agent or rational agent, we can group its properties under the PEAS representation model.
It is made up of four words:
o P: Performance measure
o E: Environment
o A: Actuators
o S: Sensors
Here the performance measure is the objective that defines the success of an agent's behavior.
PEAS for self-driving cars:
For a self-driving car, the PEAS representation will be:
Performance: Safety, time, legal drive, comfort
Environment: Roads, other vehicles, road signs, pedestrians
Actuators: Steering, accelerator, brake, signal, horn
Sensors: Camera, GPS, speedometer, odometer, accelerometer, sonar.
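One way to record a PEAS description in code is a simple record type, sketched below. The class name `PEAS` and its field names are hypothetical choices; only the listed values come from the self-driving car example above.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class PEAS:
    """Groups the four parts of a PEAS description of an agent."""
    performance: Tuple[str, ...]
    environment: Tuple[str, ...]
    actuators: Tuple[str, ...]
    sensors: Tuple[str, ...]

self_driving_car = PEAS(
    performance=("safety", "time", "legal drive", "comfort"),
    environment=("roads", "other vehicles", "road signs", "pedestrians"),
    actuators=("steering", "accelerator", "brake", "signal", "horn"),
    sensors=("camera", "GPS", "speedometer", "odometer", "accelerometer", "sonar"),
)
```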
4. Briefly explain about the Fuzzy Clustering
Fuzzy clustering is a type of soft clustering method in which a data object may belong to more
than one group or cluster. Each data object has a set of membership coefficients, which give its
degree of membership in each cluster. The Fuzzy C-means algorithm is an example of this type of
clustering; it is sometimes also known as the Fuzzy k-means algorithm.
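The membership coefficients can be illustrated with the membership update commonly used in fuzzy C-means, in which the membership of a point in cluster i is 1 / Σ_k (d_i / d_k)^(2/(m-1)), where d_i is the distance to center i. The sketch below shows only this one step, not the full algorithm; the fuzzifier value m = 2, the function name `memberships`, and the sample centers are assumptions.

```python
import math

def memberships(point, centers, m=2.0):
    """Degree of membership of one point in each cluster; the degrees sum to 1."""
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with a center belongs fully to that cluster.
    if 0.0 in dists:
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]

# Hypothetical cluster centers and a point that lies closer to the first one.
centers = [(0.0, 0.0), (4.0, 0.0)]
u = memberships((1.0, 0.0), centers)
```

For this point the memberships come out at roughly 0.9 for the first cluster and 0.1 for the second, so the point belongs to both clusters, but to different degrees.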
Clustering Algorithms
Clustering algorithms can be divided based on their models, which are explained above. Many
different clustering algorithms have been published, but only a few are commonly used. The
choice of clustering algorithm depends on the kind of data that we are using. For example, some
algorithms need the number of clusters in the given dataset to be guessed in advance, whereas
others need to find the minimum distance between the observations of the dataset.
Here we discuss the popular clustering algorithms that are widely used in machine
learning:
1. K-Means algorithm: The k-means algorithm is one of the most popular clustering
algorithms. It classifies the dataset by dividing the samples into different clusters of equal
variances. The number of clusters must be specified in this algorithm. It is fast, with few
computations required: each iteration has linear complexity O(n) in the number of samples.
2. Mean-shift algorithm: The mean-shift algorithm tries to find the dense areas in a smooth
density of data points. It is an example of a centroid-based model that works by updating
the candidates for centroids to be the center of the points within a given region.
3. DBSCAN Algorithm: It stands for Density-Based Spatial Clustering of Applications
with Noise. It is an example of a density-based model similar to mean-shift, but with
some remarkable advantages. In this algorithm, the areas of high density are separated by
the areas of low density. Because of this, clusters of any arbitrary shape can be found.
4. Expectation-Maximization Clustering using GMM: This algorithm can be used as an
alternative to the k-means algorithm, or for those cases where K-means may fail. In
GMM, it is assumed that the data points are Gaussian distributed.
5. Agglomerative Hierarchical algorithm: The Agglomerative hierarchical algorithm
performs bottom-up hierarchical clustering. In this, each data point is treated as a
single cluster at the outset, and the clusters are then successively merged. The cluster
hierarchy can be represented as a tree structure.
6. Affinity Propagation: It is different from other clustering algorithms as it does not
require the number of clusters to be specified. In this, messages are sent between pairs of
data points until convergence. It has O(N²T) time complexity, where N is the number of
data points and T is the number of iterations, which is the main drawback of this algorithm.
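As an illustration of the bottom-up merging used by the agglomerative hierarchical algorithm in the list above, the sketch below starts with every point as its own cluster and repeatedly merges the two closest clusters. It is a minimal sketch under assumptions the notes do not specify: single linkage (the distance between two clusters is the distance between their closest members) is used, merging stops at a requested number of clusters, and the name `agglomerative` and the sample points are hypothetical.

```python
import math

def agglomerative(points, n_clusters):
    """Bottom-up clustering with single linkage (assumed); returns a list of clusters."""
    # Each data point starts as a single cluster.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters whose closest members are nearest to each other.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(p, q) for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i.
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters

# Hypothetical data: two close pairs and one isolated point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
clusters = agglomerative(points, n_clusters=3)
```

Recording the order of the merges, instead of stopping at a fixed number of clusters, would give the tree structure mentioned in the list.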
Applications of Clustering
Below are some commonly known applications of the clustering technique in Machine Learning:
o In Identification of Cancer Cells: Clustering algorithms are widely used for the
identification of cancerous cells. They divide the cancerous and non-cancerous data sets into
different groups.
o In Search Engines: Search engines also work on the clustering technique. The search
results appear based on the closest objects to the search query. This is done by grouping
similar data objects in one group that is far from the other dissimilar objects. The accuracy
of the result of a query depends on the quality of the clustering algorithm used.
o Customer Segmentation: It is used in market research to segment customers based
on their choices and preferences.
o In Biology: It is used in biology to classify different species of plants and
animals using image recognition techniques.
o In Land Use: The clustering technique is used to identify areas of similar land use
in a GIS database. This can be very useful for finding the purpose for which a particular
piece of land is most suitable.
