Ai Notes Unit 4 (1) (2) (1)
2 MARKS
An agent is anything that perceives its environment through sensors and acts upon that
environment through actuators. An agent runs in a cycle of perceiving, thinking, and acting.
An agent can be:
o Human agent: A human agent has eyes, ears, and other organs that serve as sensors,
and hands, legs, and a vocal tract that serve as actuators.
o Robotic agent: A robotic agent can have cameras, infrared range finders, and NLP for
sensors, and various motors for actuators.
o Software agent: A software agent receives keystrokes and file contents as sensory input,
acts on those inputs, and displays output on the screen.
A rational agent is one that does the right thing. AI is about creating rational agents,
drawing on game theory and decision theory, for various real-world scenarios.
• Robotics: Robots can learn to perform tasks in the physical world using this technique.
• Video gameplay: Reinforcement learning has been used to teach bots to play a number
of video games.
• Resource management: Given finite resources and a defined goal, reinforcement
learning can help enterprises plan out how to allocate resources.
• Association rule learning works on the concept of an if-then statement, such as if A then
B.
• Here the "if" element is called the antecedent, and the "then" statement is called the consequent.
17. Define the Apriori algorithm
2. Apply the minimum support threshold and prune item sets that do not meet the threshold.
3. Move on to item sets of size 2 and repeat steps one and two.
4. Continue the same process until no additional item sets satisfying the minimum threshold
can be found.
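The steps above can be sketched in Python as a level-wise search. This is a minimal illustration, not a reference implementation: the function name `apriori`, the sample transactions, and the 0.5 threshold are assumptions made for the example, and the first pass over single items is assumed from step 3, since the numbered list in these notes starts at step 2. Candidates are pruned only by counting support, without the subset check that a full Apriori implementation would add.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Assumed first pass: start with item sets of size 1.
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Apply the minimum support threshold and prune.
        level = {c: support(c) for c in candidates}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Move on to item sets one item larger, built from the survivors.
        k += 1
        survivors = list(level)
        candidates = list(
            {a | b for a, b in combinations(survivors, 2) if len(a | b) == k}
        )
    return frequent


# Hypothetical market-basket data, for illustration only.
baskets = [
    {"bread", "butter"},
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"milk"},
]
frequent_itemsets = apriori(baskets, min_support=0.5)
for itemset, s in sorted(frequent_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

The loop stops as soon as a level produces no surviving item sets, which matches step 4.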
• It has various applications in machine learning and data mining. Below are some popular
applications of association rule learning:
• Medical Diagnosis: Association rules can support diagnosis, as they help in
identifying the probability of illness for a particular disease.
• Protein Sequence: Association rules help in determining the synthesis of artificial
proteins.
• It is also used for catalog design, loss-leader analysis, and many other
applications.
• Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. This
approach is similar to how humans make decisions, and it involves all
intermediate possibilities between YES and NO.
• Generally, we use the fuzzy logic system for both commercial and practical purposes
such as:
• With the purpose of fuzzy logic established, the next topic is the architecture of a
fuzzy logic system.
The defuzzification process converts the fuzzy sets into a crisp value. Several
techniques are available, and the one best suited to the expert system should be selected.
The inference engine determines the degree of match between the fuzzy input and the rules.
Based on the input field, it decides which rules are to be fired. The fired rules are then
combined to form the control actions.
• Fuzzy logic is used in various fields such as automotive systems, domestic goods,
environment control, etc. Some of the common applications are:
• It is used in the aerospace field for altitude control of spacecraft and satellites.
• It controls speed and traffic in automotive systems.
• It is used for decision-making support systems and personnel evaluation in
large companies.
• It also controls pH, drying, and chemical distillation processes in the chemical industry.
• It adjusts the temperature of an air conditioner by comparing the room
temperature with the target temperature value.
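The air-conditioner example can be sketched as a tiny fuzzy controller that fuzzifies the temperature difference, fires three rules, and defuzzifies the result. Everything specific here is an assumption chosen for illustration: the function names, the triangular membership breakpoints, the three output levels (0, 50, 100 percent cooling), and the weighted-average defuzzification, which is only one of several possible techniques.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at or outside a and c, rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def cooling_power(room_temp, target_temp):
    """Return a crisp cooling level in percent (hypothetical scale)."""
    diff = room_temp - target_temp

    # Fuzzification: degree to which the difference matches each fuzzy set.
    memberships = {
        "near_target": triangular(diff, -2.0, 0.0, 2.0),
        "warm": triangular(diff, 0.0, 3.0, 6.0),
        "hot": triangular(diff, 3.0, 8.0, 13.0),
    }

    # Rule base: each fuzzy set fires one rule with a fixed output level.
    outputs = {"near_target": 0.0, "warm": 50.0, "hot": 100.0}

    total = sum(memberships.values())
    if total == 0.0:
        # Outside every fuzzy set: saturate (an assumed fallback).
        return 100.0 if diff > 0 else 0.0

    # Defuzzification: weighted average of the fired rule outputs.
    return sum(memberships[name] * outputs[name] for name in memberships) / total


for room in (24.0, 25.0, 27.0, 40.0):
    print(room, cooling_power(room, target_temp=24.0))
```

Because neighbouring fuzzy sets overlap, a room slightly above the target gets a cooling level between 0 and 50 rather than an abrupt on/off switch.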
A set is a term, which is a collection of unordered or ordered elements. Following are the
various examples of a set:
• Union Operation
• Intersection Operation
• Difference Operation
• Complement Operation
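The four operations listed above can be tried directly on Python's built-in sets. The sets `a`, `b`, and `universe` are made-up values for illustration; the complement is taken relative to the assumed universal set.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}
universe = {1, 2, 3, 4, 5, 6}  # assumed universal set for the complement

union = a | b            # elements in a or b
intersection = a & b     # elements in both a and b
difference = a - b       # elements in a but not in b
complement = universe - a  # elements of the universe not in a

print(union, intersection, difference, complement)
```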
5 MARKS
1. Briefly explain Machine Learning and its types.
Based on the methods and way of learning, machine learning is divided into mainly four types,
which are:
The main goal of the supervised learning technique is to map the input variable(x) with the
output variable(y). Some real-world applications of supervised learning are Risk Assessment,
Fraud Detection, Spam filtering, etc.
Supervised machine learning can be classified into two types of problems, which are given
below:
o Classification
o Regression
a) Classification
Classification algorithms are used to solve classification problems in which the output
variable is categorical, such as Yes or No, Male or Female, Red or Blue, etc. The
classification algorithms predict the categories present in the dataset. Some real-world examples
of classification algorithms are Spam Detection, Email filtering, etc.
b) Regression
Regression algorithms are used to solve regression problems, in which the output variable is
continuous and is modelled as a function of the input variables (a linear relationship in the
simplest case). They are used to predict continuous outputs, such as market trends, weather, etc.
The main aim of an unsupervised learning algorithm is to group or categorize the unsorted
dataset according to similarities, patterns, and differences. Machines are instructed to
find the hidden patterns in the input dataset.
Unsupervised Learning can be further classified into two types, which are given below:
o Clustering
o Association
1) Clustering
The clustering technique is used when we want to find the inherent groups from the data. It is a
way to group the objects into a cluster such that the objects with the most similarities remain in
one group and have fewer or no similarities with the objects of other groups. An example of the
clustering algorithm is grouping the customers by their purchasing behaviour.
2) Association
Association rule learning is an unsupervised learning technique, which finds interesting relations
among variables within a large dataset. The main aim of this learning algorithm is to find the
dependency of one data item on another data item and map those variables accordingly so that it
can generate maximum profit. This algorithm is mainly applied in Market Basket analysis,
Web usage mining, continuous production, etc.
Some popular algorithms of Association rule learning are Apriori Algorithm, Eclat, FP-growth
algorithm.
3. Semi-Supervised Learning
4. Reinforcement Learning
Advantages:
o Since supervised learning works with a labelled dataset, we can have an exact idea
about the classes of objects.
o These algorithms are helpful in predicting the output on the basis of prior experience.
Disadvantages:
o Image Segmentation:
Supervised learning algorithms are used in image segmentation. In this process, image
classification is performed on different image data with pre-defined labels.
o Medical Diagnosis:
Supervised algorithms are also used in the medical field for diagnosis purposes. This is done
by using medical images and past data labelled with disease conditions. With
such a process, the machine can identify a disease for new patients.
o Fraud Detection - Supervised Learning classification algorithms are used for identifying
fraud transactions, fraud customers, etc. It is done by using historic data to identify the
patterns that can lead to possible fraud.
o Spam detection - In spam detection & filtering, classification algorithms are used. These
algorithms classify an email as spam or not spam. The spam emails are sent to the spam
folder.
o Speech Recognition - Supervised learning algorithms are also used in speech
recognition. The algorithm is trained with voice data, and various identifications can be
done using the same, such as voice-activated passwords, voice commands, etc.
Advantages:
o These algorithms can be used for complicated tasks compared to the supervised ones
because these algorithms work on the unlabeled dataset.
o Unsupervised algorithms are preferable for various tasks as getting the unlabeled dataset
is easier as compared to the labelled dataset.
Disadvantages:
o The output of an unsupervised algorithm can be less accurate as the dataset is not
labelled, and the algorithms are not trained with the exact output in advance.
o Working with Unsupervised learning is more difficult as it works with the unlabelled
dataset that does not map with the output.
Advantages:
o It is simple and easy to understand the algorithm.
o It is highly efficient.
o It is used to solve drawbacks of Supervised and Unsupervised Learning algorithms.
Disadvantages:
10 MARKS
In the real world, we are surrounded by humans who can learn everything from their
experiences with their learning capability, and we have computers or machines which work on
our instructions. But can a machine also learn from experiences or past data like a human does?
So here comes the role of Machine Learning.
Machine learning enables a machine to automatically learn from data, improve performance
from experiences, and predict things without being explicitly programmed.
With the help of sample historical data, which is known as training data, machine
learning algorithms build a mathematical model that helps in making predictions or decisions
without being explicitly programmed. Machine learning brings computer science and statistics
together for creating predictive models. Machine learning constructs or uses algorithms that
learn from historical data. The more information we provide, the better the
performance.
A machine has the ability to learn if it can improve its performance by gaining more data.
A Machine Learning system learns from historical data, builds prediction models, and
whenever it receives new data, predicts the output for it. The accuracy of the predicted output
depends upon the amount of data, as a large amount of data helps to build a better model, which
predicts the output more accurately.
Suppose we have a complex problem where we need to make some predictions. Instead of
writing code for it, we just need to feed the data to generic algorithms, and with the help of
these algorithms, the machine builds the logic as per the data and predicts the output. Machine
learning has changed our way of thinking about the problem. The below block diagram explains
the working of a Machine Learning algorithm:
The need for machine learning is increasing day by day. The reason behind the need for
machine learning is that it is capable of doing tasks that are too complex for a person to
implement directly. As humans, we have limitations: we cannot process a huge amount
of data manually, so we need computer systems, and this is where machine
learning makes things easy for us.
We can train machine learning algorithms by providing them the huge amount of data
and let them explore the data, construct the models, and predict the required output
automatically. The performance of the machine learning algorithm depends on the amount of
data, and it can be determined by the cost function. With the help of machine learning, we can
save both time and money.
The importance of machine learning can be easily understood from its use cases.
Currently, machine learning is used in self-driving cars, cyber fraud detection, face
recognition, friend suggestion by Facebook, etc. Various top companies such as Netflix
and Amazon have built machine learning models that use a vast amount of data to analyze
user interest and recommend products accordingly.
Following are some key points which show the importance of Machine Learning:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
1) Supervised Learning
Supervised learning is a type of machine learning method in which we provide sample labeled
data to the machine learning system in order to train it, and on that basis, it predicts the output.
The system creates a model using labeled data to understand the dataset and learn about each
data item. Once the training and processing are done, we test the model by providing sample
data to check whether it predicts the exact output or not.
The goal of supervised learning is to map input data with the output data. The supervised
learning is based on supervision, and it is the same as when a student learns things in the
supervision of the teacher. The example of supervised learning is spam filtering.
o Classification
o Regression
2) Unsupervised Learning
Unsupervised learning is a learning method in which a machine learns without any supervision.
The training is provided to the machine with the set of data that has not been labeled, classified,
or categorized, and the algorithm needs to act on that data without any supervision. The goal of
unsupervised learning is to restructure the input data into new features or a group of objects with
similar patterns.
In unsupervised learning, we don't have a predetermined result. The machine tries to find useful
insights from the huge amount of data. It can be further classified into two categories of
algorithms:
o Clustering
o Association
3) Reinforcement Learning
A robotic dog that automatically learns the movement of its limbs is an example of
Reinforcement learning.
Association rule learning is a type of unsupervised learning technique that checks for the
dependency of one data item on another data item and maps accordingly so that it can be more
profitable. It tries to find some interesting relations or associations among the variables of
dataset. It is based on different rules to discover the interesting relations between variables in the
database.
Association rule learning is one of the very important concepts of machine learning,
and it is employed in Market Basket analysis, Web usage mining, continuous production,
etc. Market basket analysis is a technique used by various big retailers to discover the
associations between items. We can understand it by taking the example of a supermarket, where
products that are often purchased together are placed together.
For example, if a customer buys bread, they are also likely to buy butter, eggs, or milk, so these
products are stored on the same shelf or nearby. Consider the below diagram:
Association rule learning can be divided into three types of algorithms:
1. Apriori
2. Eclat
3. F-P Growth Algorithm
Association rule learning works on the concept of an if-then statement, such as if A then B.
Here the "if" element is called the antecedent, and the "then" statement is called the consequent. A
relationship in which we find an association between two items is
known as single cardinality. It is all about creating rules, and if the number of items increases,
then cardinality also increases accordingly. So, to measure the associations between thousands of
data items, there are several metrics. These metrics are given below:
o Support
o Confidence
o Lift
Support
Support is how frequently an itemset appears in the dataset. It is defined as
the fraction of the transactions T that contain the itemset X. For an itemset X and a set of
transactions T, it can be written as:
Support(X) = (number of transactions containing X) / (total number of transactions in T)
Confidence
Confidence indicates how often the rule has been found to be true, that is, how often the items X and
Y occur together in the dataset given that X occurs. It is the ratio of the number of
transactions that contain both X and Y to the number of transactions that contain X.
Lift
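It is the ratio of the observed support to the support expected if X and Y were independent
of each other. It has three possible cases: a lift equal to 1 means X and Y are independent,
a lift greater than 1 means they occur together more often than expected, and a lift less
than 1 means they occur together less often than expected.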
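The three metrics can be computed directly from their definitions. The sketch below is illustrative only: the function names and the four sample transactions are assumptions, and lift is computed as confidence(X → Y) divided by support(Y), which is the same ratio of observed to expected support described above.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, x, y):
    """Support of X and Y together, divided by the support of X."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)


def lift(transactions, x, y):
    """Observed support of X and Y over the support expected under independence."""
    return confidence(transactions, x, y) / support(transactions, y)


# Hypothetical market-basket data, for illustration only.
baskets = [
    {"bread", "butter"},
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"milk"},
]
print(support(baskets, {"bread"}))
print(confidence(baskets, {"bread"}, {"butter"}))
print(lift(baskets, {"bread"}, {"butter"}))
```

For the rule bread → butter in this sample, the lift comes out above 1, so the two items appear together more often than independence would predict.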
Apriori Algorithm
This algorithm uses frequent itemsets to generate association rules. It is designed to work on
databases that contain transactions. This algorithm uses a breadth-first search and a hash tree to
count itemsets efficiently.
It is mainly used for market basket analysis and helps to understand the products that can be
bought together. It can also be used in the healthcare field to find drug reactions for patients.
Eclat Algorithm
Eclat stands for Equivalence Class Transformation. This algorithm uses a depth-first
search technique to find frequent itemsets in a transaction database. It generally executes
faster than the Apriori algorithm.
In the F-P growth algorithm, F-P stands for Frequent Pattern; it is an improved version of the
Apriori algorithm. It represents the database in the form of a tree structure known as a
frequent pattern tree (FP-tree). The purpose of this tree is to extract the most frequent patterns.
It has various applications in machine learning and data mining. Below are some popular
applications of association rule learning:
K-Means Clustering is an unsupervised learning algorithm that is used to solve the clustering
problems in machine learning or data science. In this topic, we will learn what is K-means
clustering algorithm, how the algorithm works, along with the Python implementation of k-
means clustering.
K-Means clustering is an unsupervised learning algorithm which groups the unlabeled dataset
into different clusters. Here K defines the number of pre-defined clusters that need to be
created in the process: if K=2, there will be two clusters, for K=3, there will be three
clusters, and so on.
It is an iterative algorithm that divides the unlabeled dataset into k different clusters in such a
way that each data point belongs to only one group whose members have similar properties.
It allows us to cluster the data into different groups and is a convenient way to discover the
categories of groups in the unlabeled dataset on its own without the need for any training.
It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim
of this algorithm is to minimize the sum of distances between the data points and their
corresponding cluster centroids.
The algorithm takes the unlabeled dataset as input, divides the dataset into k clusters,
and repeats the process until it finds the best clusters. The value of k should be
predetermined in this algorithm.
o Determines the best value for K center points or centroids by an iterative process.
o Assigns each data point to its closest k-center. Those data points which are near to the
particular k-center, create a cluster.
Hence each cluster has data points with some commonalities and is separated from the other clusters.
The below diagram explains the working of the K-means Clustering Algorithm:
How does the K-Means Algorithm Work?
Step-2: Select K random points as centroids. (They need not be points from the input dataset.)
Step-3: Assign each data point to its closest centroid, which will form the predefined K
clusters.
Step-4: Calculate the variance and place a new centroid for each cluster.
Step-5: Repeat the third step, which means reassigning each data point to the new closest centroid
of each cluster.
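The steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a production implementation: the names `sq_dist` and `kmeans`, the `max_iter` and `seed` parameters, and the six sample points are all hypothetical; initial centroids are drawn from the data points; the new centroid is taken to be the mean of its cluster; and the loop stops when no assignment changes.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Return (centroids, assignment) where assignment[i] is the cluster of points[i]."""
    rng = random.Random(seed)
    # Step-2: pick K random points as the initial centroids.
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Step-3: assign each data point to its closest centroid.
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the clusters are stable
        assignment = new_assignment
        # Step-4: move each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        # Step-5: the next loop iteration repeats the assignment step.
    return centroids, assignment


# Hypothetical data: two visibly separate groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, labels = kmeans(data, k=2)
print(centers)
print(labels)
```

Which cluster receives index 0 depends on the random initial centroids, so only the grouping of points, not the label numbers, is meaningful.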
Suppose we have two variables, M1 and M2. The x-y scatter plot of these two variables is
given below:
o Let's take the number of clusters K=2, so that the data points are grouped into two
different clusters.
o We need to choose K random points or centroids to form the clusters. These points
can be either points from the dataset or any other points. Here we select the
two points below as centroids, which are not part of our dataset. Consider the below
image:
o Now we assign each data point of the scatter plot to its closest K-point or centroid.
We compute this with the usual formula for the distance between two points, and
draw a median line between the two centroids.
Consider the below image:
From the above image, it is clear that points on the left side of the line are nearer to the K1 or blue
centroid, and points to the right of the line are closer to the yellow centroid. Let's color them
blue and yellow for clear visualization.
o As we need to find the closest cluster, we will repeat the process by choosing new
centroids. To choose the new centroids, we will compute the center of gravity of the points in each cluster.
From the above image, we can see that one yellow point is on the left side of the line, and two blue
points are to the right of the line. So, these three points will be assigned to new centroids.
As reassignment has taken place, we again go to Step-4, which is finding new
centroids or K-points.
o We repeat the process by finding the center of gravity of each cluster, so the new
centroids will be as shown in the below image:
o As we have new centroids, we again draw the median line and reassign the data
points. So, the image will be:
o We can see in the above image that there are no dissimilar data points on either side of the
line, which means our model is formed. Consider the below image:
As our model is ready, we can now remove the assumed centroids, and the two final clusters
will be as shown in the below image:
The performance of the K-means clustering algorithm depends upon highly efficient clusters that
it forms. But choosing the optimal number of clusters is a big task. There are some different
ways to find the optimal number of clusters, but here we are discussing the most appropriate
method to find the number of clusters or value of K. The method is given below:
Elbow Method
The Elbow method is one of the most popular ways to find the optimal number of clusters. This
method uses the concept of WCSS value. WCSS stands for Within Cluster Sum of Squares,
which defines the total variations within a cluster. The formula to calculate the value of WCSS
(for 3 clusters) is given below:
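$$\mathrm{WCSS} = \sum_{P_i \in \text{Cluster}_1} \text{distance}(P_i, C_1)^2 + \sum_{P_i \in \text{Cluster}_2} \text{distance}(P_i, C_2)^2 + \sum_{P_i \in \text{Cluster}_3} \text{distance}(P_i, C_3)^2$$
$\sum_{P_i \in \text{Cluster}_1} \text{distance}(P_i, C_1)^2$: It is the sum of the squares of the distances between each data point
in cluster 1 and its centroid $C_1$; the other two terms are defined in the same way.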
To measure the distance between data points and centroid, we can use any method such as
Euclidean distance or Manhattan distance.
To find the optimal value of clusters, the elbow method follows the below steps:
o It executes the K-means clustering on a given dataset for different K values (ranges from
1-10).
o For each value of K, calculates the WCSS value.
o Plots a curve between calculated WCSS values and the number of clusters K.
o If the plot has a sharp point of bend, where the curve looks like an arm, that point is
considered the best value of K.
Since the graph shows a sharp bend that looks like an elbow, it is known as the elbow
method. The graph for the elbow method looks like the below image:
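The elbow steps can be sketched as follows. This is an illustrative sketch under assumptions: `wcss` and `elbow_curve` are hypothetical names, distances are squared Euclidean, and a compact K-means with data-point initialisation is included only so that the block runs on its own. Plotting is left out; the returned list is what would be plotted against K.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Compact K-means: returns (centroids, assignment)."""
    centroids = random.Random(seed).sample(points, k)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignment


def wcss(points, centroids, assignment):
    """Within Cluster Sum of Squares: squared distance of each point to its centroid."""
    return sum(sq_dist(p, centroids[a]) for p, a in zip(points, assignment))


def elbow_curve(points, max_k):
    """WCSS for K = 1 .. max_k; the bend in this curve suggests the value of K."""
    curve = []
    for k in range(1, max_k + 1):
        centroids, assignment = kmeans(points, k)
        curve.append(wcss(points, centroids, assignment))
    return curve


# Hypothetical data: two groups on a line.
data = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
print(elbow_curve(data, max_k=3))
```

On this sample the WCSS drops sharply from K=1 to K=2 and only slightly afterwards, which is the bend the method looks for.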
3. Explain Agents in Artificial Intelligence
An AI system can be defined as the study of the rational agent and its environment. The agents
sense the environment through sensors and act on their environment through actuators. An AI
agent can have mental properties such as knowledge, belief, intention, etc.
What is an Agent?
An agent is anything that perceives its environment through sensors and acts upon that
environment through actuators. An agent runs in a cycle of perceiving, thinking, and acting.
An agent can be:
o Human agent: A human agent has eyes, ears, and other organs that serve as sensors,
and hands, legs, and a vocal tract that serve as actuators.
o Robotic agent: A robotic agent can have cameras, infrared range finders, and NLP for
sensors, and various motors for actuators.
o Software agent: A software agent receives keystrokes and file contents as sensory input,
acts on those inputs, and displays output on the screen.
Hence the world around us is full of agents such as thermostats, cellphones, and cameras, and
we ourselves are agents too.
Before moving forward, we should first know about sensors, effectors, and actuators.
Sensor: A sensor is a device which detects changes in the environment and sends the
information to other electronic devices. An agent observes its environment through sensors.
Actuators: Actuators are the components of machines that convert energy into motion.
Actuators are responsible only for moving and controlling a system. An actuator can be an
electric motor, gears, rails, etc.
Effectors: Effectors are the devices which affect the environment. Effectors can be legs, wheels,
arms, fingers, wings, fins, and display screens.
Intelligent Agents:
An intelligent agent is an autonomous entity which acts upon an environment using sensors and
actuators to achieve goals. An intelligent agent may learn from the environment to achieve
its goals. A thermostat is an example of an intelligent agent.
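The thermostat example can be written as a small perceive-think-act loop. The class below is a hypothetical sketch: the name `ThermostatAgent`, the dictionary used as the environment, the 0.5 degree tolerance, and the one-degree effect of each action are all assumptions made for illustration.

```python
class ThermostatAgent:
    """A simple agent that keeps a room near a target temperature."""

    def __init__(self, target, tolerance=0.5):
        self.target = target
        self.tolerance = tolerance

    def perceive(self, environment):
        # Sensor: read the current temperature from the environment.
        return environment["temperature"]

    def think(self, temperature):
        # Decide on an action by comparing the percept with the goal.
        if temperature < self.target - self.tolerance:
            return "heat"
        if temperature > self.target + self.tolerance:
            return "cool"
        return "idle"

    def act(self, environment, action):
        # Actuator: the chosen action changes the environment (assumed 1 degree per step).
        change = {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
        environment["temperature"] += change

    def step(self, environment):
        # One full cycle of perceiving, thinking, and acting.
        action = self.think(self.perceive(environment))
        self.act(environment, action)
        return action


room = {"temperature": 18.0}
agent = ThermostatAgent(target=21.0)
for _ in range(4):
    print(agent.step(room), room["temperature"])
```

This agent does not learn; it only reacts to the current percept, which is the simplest kind of agent behaviour.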
Rational Agent:
A rational agent is an agent which has clear preferences, models uncertainty, and acts in a way that
maximizes its performance measure over all possible actions.
A rational agent is one that does the right thing. AI is about creating rational agents, drawing on
game theory and decision theory, for various real-world scenarios.
For an AI agent, rational action is most important because, in a reinforcement learning
algorithm, the agent gets a positive reward for each best possible action and a negative
reward for each wrong action.
Rationality:
The rationality of an agent is measured by its performance measure. Rationality can be judged on
the basis of following points:
Structure of an AI Agent
The task of AI is to design an agent program which implements the agent function. The structure
of an intelligent agent is a combination of architecture and agent program. It can be viewed as:
Following are the main three terms involved in the structure of an AI agent:
1. f:P* → A
PEAS Representation
PEAS is a type of model on which an AI agent works. When we define an AI agent or
rational agent, we can group its properties under the PEAS representation model. It is made up
of four words:
o P: Performance measure
o E: Environment
o A: Actuators
o S: Sensors
Here performance measure is the objective for the success of an agent's behavior.
Fuzzy clustering is a type of soft method in which a data object may belong to more than one
group or cluster. Each data point has a set of membership coefficients, which give its degree
of membership in each cluster. The Fuzzy C-means algorithm is an example of this type of
clustering; it is sometimes also known as the Fuzzy k-means algorithm.
Clustering Algorithms
Clustering algorithms can be divided based on their models, as explained above. Many
different clustering algorithms have been published, but only a few are commonly used. The
choice of clustering algorithm depends on the kind of data that we are using. For example, some
algorithms need the number of clusters in the given dataset to be guessed, whereas others need to
find the minimum distance between the observations of the dataset.
Here we are discussing mainly popular Clustering algorithms that are widely used in machine
learning:
1. K-Means algorithm: The k-means algorithm is one of the most popular clustering
algorithms. It classifies the dataset by dividing the samples into different clusters of equal
variances. The number of clusters must be specified in this algorithm. It is fast with fewer
computations required, with the linear complexity of O(n).
2. Mean-shift algorithm: Mean-shift algorithm tries to find the dense areas in the smooth
density of data points. It is an example of a centroid-based model, that works on updating
the candidates for centroid to be the center of the points within a given region.
3. DBSCAN Algorithm: It stands for Density-Based Spatial Clustering of Applications
with Noise. It is an example of a density-based model similar to the mean-shift, but with
some remarkable advantages. In this algorithm, the areas of high density are separated by
the areas of low density. Because of this, the clusters can be found in any arbitrary shape.
4. Expectation-Maximization Clustering using GMM: This algorithm can be used as an
alternative to the k-means algorithm or for those cases where K-means may fail. In
GMM, it is assumed that the data points are Gaussian distributed.
5. Agglomerative Hierarchical algorithm: The Agglomerative hierarchical algorithm
performs the bottom-up hierarchical clustering. In this, each data point is treated as a
single cluster at the outset and then successively merged. The cluster hierarchy can be
represented as a tree-structure.
6. Affinity Propagation: It is different from other clustering algorithms as it does not
require the number of clusters to be specified. In this algorithm, messages are exchanged
between pairs of data points until convergence. It has O(N²T) time complexity, which
is the main drawback of this algorithm.
Applications of Clustering
Below are some commonly known applications of clustering technique in Machine Learning:
o In Identification of Cancer Cells: The clustering algorithms are widely used for the
identification of cancerous cells. It divides the cancerous and non-cancerous data sets into
different groups.
o In Search Engines: Search engines also work on the clustering technique. The search
result appears based on the closest object to the search query. It does it by grouping
similar data objects in one group that is far from the other dissimilar objects. The accurate
result of a query depends on the quality of the clustering algorithm used.
o Customer Segmentation: It is used in market research to segment the customers based
on their choice and preferences.
o In Biology: It is used in the biology stream to classify different species of plants and
animals using the image recognition technique.
o In Land Use: The clustering technique is used to identify areas of similar land use
in a GIS database. This can be very useful for finding the purpose for which a particular
piece of land is most suitable.