AI all notes
BTCOC701
Submitted by: Prof. S. B. Mehta (Asst. Prof., NCER, BATU University, Lonere)
Department of Computer Science & Engineering
Nutan College of Engineering & Research, Talegaon Dabhade, Pune-410507
Artificial Intelligence
Unit 1: Introduction of AI
Artificial Intelligence:
In today's world, technology is growing very fast, and we come in touch with new technologies day by day. One of the booming technologies of computer science is Artificial Intelligence, which is ready to create a new revolution in the world by making intelligent machines. Artificial Intelligence is now all around us. It is currently working across a variety of subfields, ranging from general to specific, such as self-driving cars, playing chess, proving theorems, playing music, painting, etc.
One of the greatest innovators in the field was John McCarthy, widely recognized as the "Father of Artificial Intelligence".
In the mid-1950s, McCarthy coined the term "Artificial Intelligence" and defined it as "the science of making intelligent machines".
According to the father of Artificial Intelligence, John McCarthy, it is “The science and
engineering of making intelligent machines, especially intelligent computer programs”.
Artificial Intelligence is a way of making a computer, a computer-controlled robot, or a piece of software think intelligently, in a manner similar to how intelligent humans think.
AI is accomplished by studying how the human brain thinks, and how humans learn, decide, and work while trying to solve a problem, and then using the outcomes of this study as a basis for developing intelligent software and systems.
Artificial Intelligence suggests that machines can mimic humans in:
Talking
Thinking
Learning
Planning
Understanding
Artificial Intelligence is composed of two words, Artificial and Intelligence, where Artificial means "man-made" and Intelligence means "thinking power"; hence AI means "a man-made thinking power."
It is a branch of computer science by which we can create intelligent machines which can behave like humans, think like humans, and are able to make decisions.
With the help of AI, you can create software or devices which can solve real-world problems easily and accurately, such as health issues, marketing, traffic issues, etc.
With the help of AI, you can create your own personal virtual assistant, such as Cortana, Google Assistant, Siri, etc.
With the help of AI, you can build robots which can work in environments where human survival is at risk.
AI opens a path for other new technologies, new devices, and new opportunities.
Types of Artificial Intelligence (based on functionality):
1. Reactive Machines
Purely reactive machines are the most basic type of Artificial Intelligence.
This type of machine reacts based on the present moment only.
Such AI systems do not store memories or past experiences for future actions.
These machines focus only on current scenarios and react to them with the best possible action.
IBM's Deep Blue system is an example of reactive machines.
Google's AlphaGo is also an example of reactive machines
Example: a chess-playing machine. This type of machine is great at playing games and has the ability to recognize which move would be best.
2. Limited Memory
• Limited memory machines can store past experiences or some data for a short period of time.
• These machines can use stored data for a limited time period only.
• Self-driving cars are one of the best examples of Limited Memory systems. These cars can store the recent speed of nearby cars, the distance of other cars, speed limits, and other information to navigate the road. Other examples: traffic lights, chatbots.
3. Theory of Mind
Theory of Mind AI should understand human emotions, people, and beliefs, and be able to interact socially like humans.
This type of AI machine has not been developed yet, but researchers are making lots of effort and progress toward developing such machines.
This type of AI machine is more intelligent, as it reacts according to its understanding of people's thoughts and emotions; such machines can adjust to the people around them, build social interactions, predict how people expect to be treated, and behave according to these expectations.
Example: the robot Sophia. Cameras in Sophia's eyes, combined with computer algorithms, allow her to see, sustain eye contact, recognize individuals, and follow faces.
4. Self-Awareness
Self-aware AI is the future of Artificial Intelligence. These machines will be super intelligent and will have their own consciousness, sentiments, and self-awareness.
These machines will be smarter than the human mind, and more aware of their own needs and internal states than human beings.
Self-aware AI does not exist in reality yet; it is a hypothetical concept. Such systems would understand their internal traits, states, and conditions and perceive human emotions.
This type of AI will not only be able to understand and evoke emotions in those it interacts with, but will also have emotions, needs, and beliefs of its own.
History of Artificial Intelligence
Maturation of Artificial Intelligence (1943-1952)
o Year 1943: The first work which is now recognized as AI was done by Warren McCulloch
and Walter Pitts in 1943. They proposed a model of artificial neurons.
o Year 1949: Donald Hebb demonstrated an updating rule for modifying the connection
strength between neurons. His rule is now called Hebbian learning.
o Year 1950: Alan Turing, an English mathematician who pioneered machine learning, published "Computing Machinery and Intelligence" in 1950, in which he proposed a test. The test can check a machine's ability to exhibit intelligent behavior equivalent to human intelligence, and is now called the Turing test.
o Year 1955: Allen Newell and Herbert A. Simon created the "first artificial intelligence program", named the "Logic Theorist". This program proved 38 of 52 mathematical theorems, and found new and more elegant proofs for some of them.
o Year 1956: The term "Artificial Intelligence" was first adopted by American computer scientist John McCarthy at the Dartmouth Conference. For the first time, AI was coined as an academic field.
Around that time, high-level computer languages such as FORTRAN, LISP, and COBOL were invented, and enthusiasm for AI was very high.
o Year 1966: Researchers emphasized developing algorithms which can solve mathematical problems. Joseph Weizenbaum created the first chatbot in 1966, which was named ELIZA.
o Year 1972: The first intelligent humanoid robot was built in Japan, named WABOT-1.
o The duration between 1974 and 1980 was the first AI winter. AI winter refers to a period when computer scientists dealt with a severe shortage of government funding for AI research.
o During AI winters, public interest in artificial intelligence decreased.
A boom of AI (1980-1987)
o Year 1980: After the AI winter, AI came back with "Expert Systems". Expert systems were programs that emulate the decision-making ability of a human expert.
o In the year 1980, the first national conference of the American Association of Artificial Intelligence was held at Stanford University.
o The duration between the years 1987 and 1993 was the second AI winter.
o Again, investors and governments stopped funding AI research due to high costs and inefficient results. Expert systems such as XCON had initially been very cost-effective, but became expensive to maintain.
o Year 1997: In the year 1997, IBM's Deep Blue beat world chess champion Garry Kasparov, becoming the first computer to beat a reigning world chess champion.
o Year 2002: for the first time, AI entered the home in the form of Roomba, a vacuum
cleaner.
o Year 2006: By the year 2006, AI had entered the business world. Companies like Facebook, Twitter, and Netflix also started using AI.
o Year 2011: In the year 2011, IBM's Watson won Jeopardy!, a quiz show where it had to solve complex questions as well as riddles. Watson proved that it could understand natural language and solve tricky questions quickly.
o Year 2012: Google launched an Android app feature, "Google Now", which was able to provide information to the user as predictions.
o Year 2014: In the year 2014, the chatbot "Eugene Goostman" won a competition in the famous "Turing test."
o Year 2018: The "Project Debater" from IBM debated complex topics with two master debaters and performed extremely well.
o Google demonstrated an AI program, "Duplex", a virtual assistant which booked a hairdresser appointment over a phone call, and the lady on the other side didn't notice that she was talking with a machine.
Application of AI
Artificial Intelligence has various applications in today's society. It is becoming essential for our times because it can solve complex problems in an efficient way across multiple industries, such as healthcare, entertainment, finance, education, etc. AI is making our daily life more comfortable and fast. Following are some sectors which have applications of Artificial Intelligence:
1. AI in Astronomy
Artificial Intelligence can be very useful for solving complex problems about the universe. AI technology can be helpful for understanding the universe, such as how it works, its origin, etc.
2. AI in Healthcare
In the last five to ten years, AI has become more advantageous for the healthcare industry and is going to have a significant impact on it.
Healthcare industries are applying AI to make better and faster diagnoses than humans. AI can help doctors with diagnoses and can warn when a patient is worsening, so that medical help can reach the patient before hospitalization.
3. AI in Gaming
AI can be used for gaming purposes. AI machines can play strategic games like chess, where the machine needs to think about a large number of possible positions.
4. AI in Finance
AI and the finance industry are the best match for each other. The finance industry is implementing automation, chatbots, adaptive intelligence, algorithmic trading, and machine learning into financial processes.
5. AI in Data Security
The security of data is crucial for every company, and cyber-attacks are growing rapidly in the digital world. AI can be used to make your data more safe and secure. Some examples, such as the AEG bot and the AI2 platform, are used to detect software bugs and cyber-attacks in a better way.
6. AI in Social Media
Social Media sites such as Facebook, Twitter, and Snapchat contain billions of user profiles, which need to
be stored and managed in a very efficient way. AI can organize and manage massive amounts of data. AI
can analyze lots of data to identify the latest trends, hashtag, and requirement of different users.
7. AI in Travel & Transport
AI is in high demand in the travel industry. AI is capable of doing various travel-related work, from making travel arrangements to suggesting hotels, flights, and the best routes to customers. Travel industries are using AI-powered chatbots which can carry out human-like interactions with customers for better and faster responses.
8. AI in Automotive Industry
Some automotive industries are using AI to provide virtual assistants to their users for better performance. For example, Tesla has introduced TeslaBot, an intelligent virtual assistant.
Various industries are currently working on developing self-driving cars which can make your journey safer and more secure.
9. AI in Robotics:
Artificial Intelligence has a remarkable role in robotics. Usually, general robots are programmed to perform some repetitive task, but with the help of AI, we can create intelligent robots which can perform tasks based on their own experience without being pre-programmed.
Humanoid robots are the best examples of AI in robotics; recently, the intelligent humanoid robots named Erica and Sophia have been developed, which can talk and behave like humans.
10. AI in Entertainment
We are currently using some AI based applications in our daily life with some entertainment services such
as Netflix or Amazon. With the help of ML/AI algorithms, these services show the recommendations for
programs or shows.
11. AI in Agriculture
Agriculture is an area which requires various resources, labor, money, and time for best results. Nowadays agriculture is becoming digital, and AI is emerging in this field. Agriculture is applying AI for agricultural robotics, soil and crop monitoring, and predictive analysis. AI in agriculture can be very helpful for farmers.
12. AI in E-commerce
AI is providing a competitive edge to the e-commerce industry, and it is becoming more demanding in the
e-commerce business. AI is helping shoppers to discover associated products with recommended size,
color, or even brand.
13. AI in education:
AI can automate grading so that the tutor can have more time to teach. An AI chatbot can communicate with students as a teaching assistant.
In the future, AI could work as a personal virtual tutor for students, easily accessible at any time and any place.
AI Examples
Artificial Intelligence Samples:
Self Driving Cars
E-Payment
Google Maps
Text Autocorrect
Automated Translation
Chatbots
Social Media
Face Detection
Search Algorithms
Robots
Automated Investment
NLP - Natural Language Processing
Flying Drones
Dr. Watson
Apple Siri
Microsoft Cortana
Amazon Alexa
Advantages of Artificial Intelligence
High accuracy with fewer errors: AI machines or systems are prone to fewer errors and higher accuracy, as they take decisions based on prior experience or information.
High speed: AI systems can be very fast in decision-making; because of this, AI systems can beat a chess champion in the game of chess.
High reliability: AI machines are highly reliable and can perform the same action multiple times with high accuracy.
Useful for risky areas: AI machines can be helpful in situations such as defusing a bomb or exploring the ocean floor, where employing a human can be risky.
Digital assistance: AI can be very useful for providing digital assistance to users; for example, AI technology is currently used by various e-commerce websites to show products as per customer requirements.
Useful as a public utility: AI can be very useful for public utilities, such as self-driving cars which can make our journeys safer and hassle-free, facial recognition for security purposes, natural language processing to communicate with humans in human language, etc.
What is an Agent?
An agent can be anything that perceive its environment through sensors and act upon that
environment through actuators. An Agent runs in the cycle of perceiving, thinking, and acting. An
agent can be: An agent is anything that can perceive its environment through sensors and acts upon
that environment through effectors.
Human-Agent: A human agent has eyes, ears, and other organs which work as sensors, and hands, legs, and the vocal tract which work as actuators.
Robotic Agent: A robotic agent can have cameras, infrared range finders, and NLP for sensors, and various motors as actuators.
Software Agent: A software agent can have keystrokes and file contents as sensory input, act on those inputs, and display output on the screen. A software agent has encoded bit strings as its programs and actions.
Hence the world around us is full of agents, such as thermostats, cellphones, and cameras; even we ourselves are agents.
Before moving forward, we should first know about sensors, effectors, and actuators.
Sensor: Sensor is a device which detects the change in the environment and sends the
information to other electronic devices. An agent observes its environment through sensors.
Actuators: Actuators are the component of machines that converts energy into motion. The
actuators are only responsible for moving and controlling a system. An actuator can be an electric
motor, gears, rails, etc.
Effectors: Effectors are the devices which affect the environment. Effectors can be legs, wheels,
arms, fingers, wings, fins, and display screen.
Agent Terminology
Performance Measure of Agent − It is the criteria, which determines how successful an agent is.
Behavior of Agent − It is the action that agent performs after any given sequence of percepts.
Percept Sequence − It is the history of all that an agent has perceived till date.
Intelligent Agents:
An intelligent agent is an autonomous entity which acts upon an environment using sensors and actuators to achieve goals. An intelligent agent may learn from the environment to achieve its goals. A thermostat is an example of an intelligent agent.
Following are the main four rules for an AI agent:
Rule 1: An AI agent must have the ability to perceive the environment.
Rule 2: The observation must be used to make decisions.
Rule 3: Decision should result in an action.
Rule 4: The action taken by an AI agent must be a rational action.
Rational Agent:
A rational agent is an agent which has clear preferences, models uncertainty, and acts in a way that maximizes its performance measure over all possible actions.
A rational agent is said to perform the right things. AI is about creating rational agents, using game theory and decision theory for various real-world scenarios.
For an AI agent, rational action is most important because in AI reinforcement learning algorithms, the agent gets a positive reward for each best possible action and a negative reward for each wrong action.
An ideal rational agent is the one, which is capable of doing expected actions to maximize its
performance measure, on the basis of −
Its percept sequence
Its built-in knowledge base
A rational agent always performs the right action, where the right action means the action that causes the agent to be most successful for the given percept sequence. The problem the agent solves is characterized by its Performance measure, Environment, Actuators, and Sensors (PEAS).
Agent Environment in AI
An environment is everything in the world which surrounds the agent, but it is not a part of an agent
itself. An environment can be described as a situation in which an agent is present.
The environment is where the agent lives and operates, and it provides the agent with something to sense and act upon. An environment is mostly said to be non-deterministic.
Type (Features) of Environment
An environment in artificial intelligence is the surrounding of the agent. The agent takes input from the environment through sensors and delivers output to the environment through actuators. There are several types of environments:
1. Fully Observable vs Partially Observable:
If an agent's sensors can access the complete state of the environment at each point in time, then it is a fully observable environment; otherwise it is partially observable.
Examples:
Chess – the board is fully observable, so are the opponent’s moves.
Driving – the environment is partially observable because what’s around the corner is not known.
2. Deterministic vs Stochastic:
If an agent's current state and selected action can completely determine the next state of the environment, then such an environment is called a deterministic environment.
In a deterministic, fully observable environment, the agent does not need to worry about uncertainty.
A stochastic environment is random in nature; the next state is not unique and cannot be completely determined by the agent.
Examples:
Chess – there are only a few possible moves for a piece in the current state, and these moves can be determined.
Self-driving cars – the actions of a self-driving car are not unique; they vary from time to time.
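The distinction can be sketched with two toy transition functions; the states, actions, and the noise model below are invented purely for illustration:

```python
import random

# Deterministic: the next state is fully determined by (state, action).
def deterministic_step(state, action):
    return state + action               # e.g. position after a fixed move

# Stochastic: the same (state, action) pair can yield different next states,
# because the environment adds randomness the agent cannot predict.
def stochastic_step(state, action):
    noise = random.choice([-1, 0, 1])   # environment randomness (assumed model)
    return state + action + noise

assert deterministic_step(5, 2) == 7            # always the same outcome
assert stochastic_step(5, 2) in {6, 7, 8}       # one of several possible outcomes
```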
3. Episodic vs Sequential:
In an episodic environment, there is a series of one-shot actions, and only the current percept is required for
the action.
In an episodic task environment, each of the agent's actions is divided into atomic incidents or episodes. There is no dependency between the current and previous incidents. In each incident, the agent receives input from the environment and then performs the corresponding action.
Example: Consider a pick-and-place robot, which is used to detect defective parts on a conveyor belt. Every time, the robot (agent) makes a decision based on the current part, i.e. there is no dependency between the current and previous decisions.
However, in Sequential environment, an agent requires memory of past actions to determine the next best
actions.
In a sequential environment, a previous decision can affect all future decisions. The next action of the agent depends on what actions it has taken previously and what actions it is supposed to take in the future.
Example:
Checkers- Where previous move can affect all the following moves.
4. Single-agent vs Multi-agent
If only one agent is involved in an environment, and operating by itself then such an environment is called
single agent environment.
However, if multiple agents are operating in an environment, then such an environment is called a multi-agent
environment.
The agent design problems in a multi-agent environment are different from those in a single-agent environment.
A person left alone in a maze is an example of the single-agent system.
An environment involving more than one agent is a multi-agent environment.
The game of football is multi-agent as it involves 11 players in each team.
5. Static vs Dynamic:
If the environment can change itself while an agent is deliberating then such environment is called a dynamic
environment else it is called a static environment.
Static environments are easy to deal with, because an agent does not need to keep looking at the world while deciding on an action.
However, in a dynamic environment, agents need to keep looking at the world before each action.
Taxi driving is an example of a dynamic environment whereas Crossword puzzles are an example of a
static environment.
An environment that keeps constantly changing itself when the agent is up with some action is said to be
dynamic.
A roller coaster ride is dynamic as it is set in motion and the environment keeps changing every instant.
An idle environment with no change in its state is called a static environment.
An empty house is static as there’s no change in the surroundings when an agent enters.
6. Discrete vs Continuous:
If in an environment there are a finite number of percepts and actions that can be performed within it, then
such an environment is called a discrete environment else it is called continuous environment.
A chess game comes under discrete environment as there is a finite number of moves that can be performed.
A self-driving car is an example of a continuous environment.
If an environment consists of a finite number of actions that can be deliberated in the environment to obtain
the output, it is said to be a discrete environment.
The game of chess is discrete as it has only a finite number of moves. The number of moves might vary with
every game, but still, it’s finite.
An environment in which the actions performed cannot be numbered, i.e. is not discrete, is said to be continuous.
Self-driving cars are an example of continuous environments as their actions are driving, parking, etc.
which cannot be numbered
7. Known vs Unknown
Known and unknown are not actually a feature of an environment, but it is an agent's state of knowledge
to perform an action.
In a known environment, the results of all actions are known to the agent, while in an unknown environment, the agent needs to learn how it works in order to perform an action.
It is quite possible for a known environment to be partially observable, and for an unknown environment to be fully observable.
8. Accessible vs Inaccessible
If an agent can obtain complete and accurate information about the environment's state, then such an environment is called an accessible environment; otherwise it is called inaccessible.
An empty room whose state can be defined by its temperature is an example of an accessible environment.
Information about an event on earth is an example of Inaccessible environment.
The task of AI is to design an agent program which implements the agent function. The structure of an intelligent agent is a combination of architecture and agent program. It can be viewed as:
Agent = Architecture + Agent Program
Following are the main three terms involved in the structure of an AI agent:
Architecture: It is the machinery (the hardware with sensors and actuators) on which the agent executes.
Agent function: It is a map from the percept sequence to an action: f : P* → A.
Agent program: The agent program is an implementation of the agent function. An agent program executes on the physical architecture to produce the function f.
Types of AI Agents
Agents can be grouped into five classes based on their degree of perceived intelligence and capability. All these agents can improve their performance and generate better actions over time.
These are given below:
1. Simple Reflex Agents
These agents select an action on the basis of the current percept only, ignoring the rest of the percept history, using condition-action rules.
2. Model-based Reflex Agents
These agents maintain an internal state (a model of the world) to keep track of the part of the environment they cannot see at the moment.
3. Goal-based agents
The knowledge of the current state of the environment is not always sufficient for an agent to decide what to do.
The agent needs to know its goal which describes desirable situations.
Goal-based agents expand the capabilities of the model-based agent by having the
"goal" information.
They choose an action, so that they can achieve the goal.
These agents may have to consider a long sequence of possible actions before deciding whether the goal is achieved or not. Such consideration of different scenarios is called searching and planning, which makes an agent proactive.
4. Utility-based agents
These agents are similar to the goal-based agent but provide an extra component of
utility measurement which makes them different by providing a measure of success at a
given state.
Utility-based agents act based not only on goals but also on the best way to achieve the goal.
The Utility-based agent is useful when there are multiple possible alternatives, and an
agent has to choose in order to perform the best action.
The utility function maps each state to a real number to check how efficiently each
action achieves the goals.
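The utility function described above can be sketched as follows; the states, actions, and utility values here are invented purely for illustration:

```python
# Utility-based choice: the utility function maps each resulting state to
# a real number, and the agent picks the action whose outcome scores highest.

def utility(state):
    """Map a state to a real number (higher = better). Values are assumed."""
    return {"slow_route": 0.4, "fast_route": 0.9, "toll_route": 0.6}[state]

def result(action):
    """Hypothetical transition model: each action leads to one state."""
    return {"walk": "slow_route", "highway": "fast_route", "bridge": "toll_route"}[action]

def choose_action(actions):
    # pick the action whose resulting state has the highest utility
    return max(actions, key=lambda a: utility(result(a)))

best = choose_action(["walk", "highway", "bridge"])
```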
5. Learning Agents
A learning agent in AI is a type of agent which can learn from its past experiences; it has learning capabilities.
It starts by acting with basic knowledge and is then able to act and adapt automatically through learning.
A learning agent has mainly four conceptual components, which are:
o Learning element: It is responsible for making improvements by learning from the environment.
o Critic: The learning element takes feedback from the critic, which describes how well the agent is doing with respect to a fixed performance standard.
o Performance element: It is responsible for selecting external actions.
o Problem generator: This component is responsible for suggesting actions that will lead to new and informative experiences.
Hence, learning agents are able to learn, analyse performance, and look for new ways to
improve the performance.
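The four components can be sketched together in one toy agent; the learning task (adjusting a decision threshold) and all numbers below are assumptions made for illustration, not a standard algorithm:

```python
class LearningAgent:
    """Toy learning agent: learns a decision threshold (invented task)."""

    def __init__(self):
        self.threshold = 0.0            # knowledge updated by the learning element

    def performance_element(self, x):
        """Select the external action from current knowledge."""
        return "accept" if x >= self.threshold else "reject"

    def critic(self, x, correct_action):
        """Judge behaviour against a fixed performance standard."""
        return self.performance_element(x) == correct_action

    def learning_element(self, x, correct_action):
        """Improve using the critic's feedback."""
        if not self.critic(x, correct_action):
            # move the threshold so this example is now handled correctly
            self.threshold = x if correct_action == "accept" else x + 1.0

    def problem_generator(self):
        """Suggest a new, informative experience near the boundary."""
        return self.threshold + 0.5

agent = LearningAgent()
agent.learning_element(2.0, "reject")   # critic flags a mistake; threshold rises
```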
(Subject In-charge)
(Prof. S. B. Mehta)
NUTAN MAHARASHTRA VIDYA PRASARAK MANDAL’S
NUTAN COLLEGE OF ENGINEERING & RESEARCH (NCER)
Department of Computer Science & Engineering
--------------------------------------------------------------------------------------------------------------------------------------
BTCOE603(B)
Lecture 2: Problem-Solving Agents
Lecture 3: Example Problems
Submitted by: Prof. S. B. Mehta
Artificial Intelligence
Problem Solving :
• In computer science, problem-solving refers to artificial intelligence techniques, including various
techniques such as forming efficient algorithms, heuristics, and performing root cause analysis to
find desirable solutions.
• In artificial intelligence, problems can be solved by using searching algorithms, evolutionary
computations, knowledge representations, etc.
• The process of problem-solving using searching consists of the following steps.
1. Define the problem
2. Analyze the problem
3. Identification of possible solutions
4. Choosing the optimal solution
5. Implementation
In today’s fast-paced digitized world, artificial intelligence techniques are widely used to automate systems so that they use resources and time efficiently. Some of the well-known problems experienced in everyday life are games and puzzles:
• Travelling Salesman Problem
• Tower of Hanoi Problem
• Water-Jug Problem
• N-Queen Problem
• Chess
• Sudoku
• Crypt-arithmetic Problems
• Magic Squares
• Logical Puzzles and so on.
1. Goal Formulation: It is the first and simplest step in problem-solving. It organizes the steps/sequence required to formulate a target/goal which requires some action to achieve it. Today, the formulation of the goal is based on AI agents.
2. Problem Formulation: It is the most important step of problem-solving, which decides what actions should be taken to achieve the formulated goal. There are five components involved in problem formulation: the initial state, actions, the transition model, the goal test, and the path cost.
Components to formulate the associated problem:
• Initial State: It is the state from which the AI agent starts towards the specified goal.
3. Search: It identifies the best possible sequences of actions to reach the goal state from the current state. It takes a problem as input and returns a solution as output.
4. Solution: It finds the best algorithm among various algorithms, which may be proven to give the optimal solution.
5. Execution: It executes the best optimal solution found by the searching algorithm, to reach the goal state from the current state.
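The components of problem formulation can be sketched as a small class; the example problem here (counting from 0 up to a goal number by steps of +1 or +2) is invented purely to keep the sketch concrete:

```python
class CountingProblem:
    """Toy problem formulation: initial state, actions, transition model,
    goal test, and path cost, as listed above."""

    def __init__(self, goal):
        self.initial_state = 0          # where the agent starts
        self.goal = goal

    def actions(self, state):
        return [1, 2]                   # actions available in any state

    def transition(self, state, action):
        return state + action           # transition model: result of an action

    def goal_test(self, state):
        return state == self.goal       # has the goal been reached?

    def path_cost(self, path):
        return len(path) - 1            # one unit of cost per step taken

problem = CountingProblem(goal=5)
```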
Example Problems
Basically, there are two types of problem approaches:
Toy Problem: It is a concise and exact description of the problem which is used by the researchers to
compare the performance of algorithms.
Real-world Problem: These are real-world-based problems which require solutions. Unlike a toy problem, they do not depend on a concise description, but we can have a general formulation of the problem.
Note: The 8-puzzle problem is a type of sliding-block problem which is used for testing new search
algorithms in artificial intelligence.
8-queens problem: The aim of this problem is to place eight queens on a chessboard in an arrangement where no queen attacks another. A queen can attack other queens diagonally or in the same row or column.
From the following figure, we can understand the problem as well as its correct solution.
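The attack rule can be sketched as a helper function; representing queens as (row, column) pairs is an assumption made here for illustration:

```python
def attacks(q1, q2):
    """True if two queens attack each other: same row, column, or diagonal."""
    r1, c1 = q1
    r2, c2 = q2
    return (r1 == r2 or                     # same row
            c1 == c2 or                     # same column
            abs(r1 - r2) == abs(c1 - c2))   # same diagonal

def is_valid(queens):
    """True if no pair of placed queens attacks another."""
    return all(not attacks(queens[i], queens[j])
               for i in range(len(queens))
               for j in range(i + 1, len(queens)))

# One known solution, written as one column per row:
solution = [(row, col) for row, col in enumerate([0, 4, 7, 5, 2, 6, 1, 3])]
```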
o Search: Searching is a step-by-step procedure to solve a search problem in a given search space. A search problem can have three main factors:
a. Search Space: Search space represents a set of possible solutions, which a system may have.
b. Start State: It is a state from where agent begins the search.
c. Goal test: It is a function which observes the current state and returns whether the goal state is achieved or not.
o Search tree: A tree representation of a search problem is called a search tree. The root of the search tree is the root node, which corresponds to the initial state.
o Actions: It gives a description of all the actions available to the agent.
o Transition model: A description of what each action does; it can be represented as a transition model.
o Path Cost: It is a function which assigns a numeric cost to each path.
Based on the search problems we can classify the search algorithms into uninformed (Blind search) search
and informed search (Heuristic search) algorithms.
Depth-first Search
● Depth-first search is a recursive algorithm for traversing a tree or graph data structure.
● It is called the depth-first search because it starts from the root node and follows each path to its
greatest depth node before moving to the next path.
Note: Backtracking is an algorithm technique for finding all possible solutions using recursion.
Advantages:
● DFS requires very little memory, as it only needs to store a stack of the nodes on the path from the root node to the current node.
● It takes less time to reach the goal node than the BFS algorithm (if it traverses the right path).
Disadvantages:
● There is the possibility that many states keep re-occurring, and there is no guarantee of finding a solution.
● The DFS algorithm goes for deep-down searching, and it may sometimes go into an infinite loop.
Example:
In the below search tree, we have shown the flow of depth-first search, and it will follow this order:
It will start searching from root node S and traverse A, then B, then D and E. After traversing E, it will backtrack up the tree, as E has no other successors and the goal node has still not been found. After backtracking, it will traverse node C and then G, where it will terminate, as it has found the goal node.
Completeness: DFS is complete within a finite state space, as it will expand every node within a finite
search tree.
Time Complexity: The time complexity of DFS is equivalent to the number of nodes traversed by the
algorithm. It is given by:
T(n) = 1 + b + b^2 + ... + b^m = O(b^m)
where m = maximum depth of any node, and this can be much larger than d (the depth of the shallowest
solution).
Space Complexity: DFS needs to store only a single path from the root node, hence its space complexity
is O(bm).
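The DFS behaviour described above can be sketched in a few lines of Python. This is a minimal illustrative sketch, not taken from the source: the graph is assumed to be an adjacency dictionary, and the stack holds whole paths so the found path can be returned.

```python
def dfs(graph, start, goal):
    """Iterative depth-first search using an explicit stack.

    `graph` is an adjacency mapping {node: [successors]}.
    Returns the path from `start` to `goal`, or None if no path exists.
    """
    stack = [[start]]          # stack of paths, not just nodes (LIFO)
    visited = set()
    while stack:
        path = stack.pop()     # always extend the deepest (most recent) path
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        # Push successors in reverse so the leftmost child is explored first.
        for succ in reversed(graph.get(node, [])):
            stack.append(path + [succ])
    return None

# The tree from the example: S -> A, C; A -> B; B -> D, E; C -> G
tree = {'S': ['A', 'C'], 'A': ['B'], 'B': ['D', 'E'], 'C': ['G']}
print(dfs(tree, 'S', 'G'))  # ['S', 'C', 'G']
```

Tracing the calls reproduces the visiting order from the example (S, A, B, D, E, then backtrack to C, G); the returned path is the branch on which the goal was finally found.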
Breadth-first Search:
● Breadth-first search is the most common search strategy for traversing a tree or graph. Because this
algorithm searches breadthwise, it is called breadth-first search.
● The BFS algorithm starts searching from the root node of the tree and expands all successor nodes at
the current level before moving to nodes of the next level.
● It starts from the root node, explores the neighbouring nodes first, and then moves towards the
next-level neighbours. It can be implemented using a FIFO queue data structure. This method provides
the shortest path (in number of steps) to the solution.
Advantages:
● If there is more than one solution for a given problem, then BFS will provide the minimal solution,
i.e. the one requiring the least number of steps.
Disadvantages:
● It requires lots of memory since each level of the tree must be saved into memory to expand the next
level.
● BFS needs lots of time if the solution is far away from the root node.
Example:
In the below tree structure, we show the traversal of the tree using the BFS algorithm from root node
S to goal node K. BFS traverses in layers, so it will follow the path shown by the dotted arrow, and the
traversed path will be:
1. S---> A--->B---->C--->D---->G--->H--->E---->F---->I---->K
Time Complexity: The time complexity of BFS is O(b^d), where b is the branching factor and d is the
depth of the shallowest solution.
Space Complexity: The space complexity of BFS is given by the memory size of the frontier, which is
O(b^d).
Completeness: BFS is complete, which means if the shallowest goal node is at some finite depth, then BFS
will find a solution.
Optimality: BFS is optimal if path cost is a non-decreasing function of the depth of the node.
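A minimal BFS sketch follows, assuming the same adjacency-dictionary representation as before (names are illustrative). The FIFO queue is what guarantees the shallowest path is found first, which is why BFS is optimal when all steps cost the same.

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search with a FIFO queue; returns the shallowest path."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()     # FIFO: expand the oldest (shallowest) path
        node = path[-1]
        if node == goal:
            return path
        for succ in graph.get(node, []):
            if succ not in visited:
                visited.add(succ)
                queue.append(path + [succ])
    return None

# A graph with a short and a long route to G: BFS finds the short one.
graph = {'S': ['A', 'G'], 'A': ['B'], 'B': ['G']}
print(bfs(graph, 'S', 'G'))  # ['S', 'G']
```

On this graph a depth-first exploration of the A branch would find the longer route S-A-B-G first, while BFS returns the two-node path.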
BFS vs DFS:
● Full form: BFS stands for Breadth-First Search; DFS stands for Depth-First Search.
● Data structure: A queue data structure is used for BFS traversal; a stack data structure is used for
DFS traversal.
● Backtracking: BFS does not use the backtracking concept; DFS uses backtracking to traverse all the
unvisited nodes.
● Number of edges: BFS finds the shortest path, having the minimum number of edges from the source
to the destination vertex; in DFS, a greater number of edges may be traversed from the source vertex
to the destination vertex.
● Optimality: BFS traversal is optimal for vertices which are close to the source vertex; DFS traversal
works well for graphs in which solutions are far away from the source vertex.
● Suitability for decision trees: BFS is not suitable for decision trees because it requires exploring
all the neighbouring nodes first; DFS is suitable for decision trees, since based on a decision it
explores a path fully and stops its traversal when the goal is found.
Heuristic function:
A heuristic is a function used in informed search to find the most promising path. It takes the current
state of the agent as input and produces an estimate of how close the agent is to the goal.
The heuristic method might not always give the best solution, but it is guaranteed to find a good
solution in reasonable time.
A heuristic function estimates how close a state is to the goal. It is represented by h(n), and it
estimates the cost of an optimal path from the state to the goal. The value of the heuristic function is
always non-negative. An admissible heuristic satisfies:
h(n) <= h*(n)
Here h(n) is the heuristic (estimated) cost, and h*(n) is the actual cost of the optimal path to the
goal. Hence the heuristic cost should be less than or equal to the actual cost.
In informed search we will discuss two main algorithms: greedy best-first search and A* search.
Greedy Best-first Search
Greedy best-first search always selects the path which appears best at that moment.
It combines aspects of the depth-first and breadth-first search algorithms: it uses a heuristic function
to guide the search, which allows it to take advantage of both algorithms.
With the help of best-first search, at each step we can choose the most promising node. In the best-first
search algorithm, we expand the node which is closest to the goal node, where closeness is estimated by
the heuristic function h(n).
Advantages:
o Best-first search can switch between BFS-like and DFS-like behaviour, gaining the advantages of both
algorithms.
o This algorithm can be more efficient than the BFS and DFS algorithms.
Disadvantages:
o In the worst case it can behave like an unguided depth-first search.
o It can get stuck in a loop, as DFS can.
o This algorithm is not optimal.
Example:
Consider the below search problem, which we will traverse using greedy best-first search. At each
iteration, the node to expand is chosen using the evaluation function f(n) = h(n), whose values are given
in the table below.
In this search example, we use two lists, OPEN and CLOSED. Following are the iterations for traversing
the above example.
Time Complexity: The worst-case time complexity of greedy best-first search is O(b^m).
Space Complexity: The worst-case space complexity of greedy best-first search is O(b^m), where m is the
maximum depth of the search space.
Complete: Greedy best-first search is incomplete, even if the given state space is finite.
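The OPEN/CLOSED bookkeeping above can be sketched with a priority queue ordered by h(n). The graph and heuristic values below are illustrative assumptions, not the example from the text.

```python
import heapq

def greedy_best_first(graph, h, start, goal):
    """Greedy best-first search: always expand the node with the smallest h(n).

    `graph` maps node -> list of successors; `h` maps node -> heuristic value.
    The heap plays the role of the OPEN list, `closed` of the CLOSED list.
    """
    open_list = [(h[start], start, [start])]
    closed = set()
    while open_list:
        _, node, path = heapq.heappop(open_list)   # most promising node first
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        for succ in graph.get(node, []):
            if succ not in closed:
                heapq.heappush(open_list, (h[succ], succ, path + [succ]))
    return None

# Hypothetical example: lower h looks closer to the goal G.
graph = {'S': ['A', 'B'], 'A': ['C'], 'B': ['D'], 'C': ['G'], 'D': ['G']}
h = {'S': 5, 'A': 3, 'B': 4, 'C': 1, 'D': 2, 'G': 0}
print(greedy_best_first(graph, h, 'S', 'G'))  # ['S', 'A', 'C', 'G']
```

Note that only h(n) guides the choice; the actual path cost g(n) is ignored, which is exactly why greedy best-first search is neither complete nor optimal in general.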
A* Search Algorithm:
A* search expands, at each point in the search space, only the node with the lowest value of the
evaluation function f(n) = g(n) + h(n), where g(n) is the cost to reach the node and h(n) is the
estimated cost from the node to the goal. The algorithm terminates when the goal node is found.
Advantages:
o A* search is complete and, with an admissible heuristic, optimal.
o It is one of the best general-purpose search algorithms and can solve very complex problems.
Disadvantages:
o It does not always produce the shortest path, as it relies heavily on heuristics and approximation.
o The A* search algorithm has some complexity issues.
o The main drawback of A* is its memory requirement: it keeps all generated nodes in memory, so it is
not practical for many large-scale problems.
Example:
In this example, we will traverse the given graph using the A* algorithm. The heuristic value of each
state is given in the table below, so we will calculate f(n) for each state using the formula
f(n) = g(n) + h(n), where g(n) is the cost to reach the node from the start state.
Iteration3: {(S--> A-->C--->G, 6), (S--> A-->C--->D, 11), (S--> A-->B, 7), (S-->G, 10)}
Iteration 4 gives the final result: S--->A--->C--->G, which is the optimal path, with cost 6.
Points to remember:
Admissible: The first condition required for optimality is that h(n) should be an admissible heuristic
for A* tree search. An admissible heuristic never overestimates the true cost, i.e. it is optimistic in
nature.
Consistency: The second condition, required only for A* graph search, is consistency.
If the heuristic function is admissible, then A* tree search will always find the least-cost path.
Time Complexity: The time complexity of the A* search algorithm depends on the heuristic function; in
the worst case the number of nodes expanded is exponential in the depth of the solution d, so the time
complexity is O(b^d), where b is the branching factor.
Space Complexity: The space complexity of the A* search algorithm is likewise O(b^d), since all
generated nodes are kept in memory.
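The f(n) = g(n) + h(n) bookkeeping can be sketched as follows. The edge costs and heuristic values are assumptions chosen to be consistent with the iterations shown in the text (optimal path S→A→C→G with cost 6); they are not guaranteed to be the exact figures from the original diagram.

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search: expand the node with the lowest f(n) = g(n) + h(n).

    `graph` maps node -> {successor: step_cost}; `h` is the heuristic table.
    Returns (path, cost) or (None, inf) if the goal is unreachable.
    """
    open_list = [(h[start], 0, start, [start])]   # (f, g, node, path)
    best_g = {start: 0}
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path, g
        for succ, cost in graph.get(node, {}).items():
            g2 = g + cost
            if g2 < best_g.get(succ, float('inf')):  # found a cheaper route
                best_g[succ] = g2
                heapq.heappush(open_list,
                               (g2 + h[succ], g2, succ, path + [succ]))
    return None, float('inf')

# Assumed costs/heuristics matching the worked iterations in the text.
graph = {'S': {'A': 1, 'G': 10}, 'A': {'B': 2, 'C': 1},
         'B': {'D': 5}, 'C': {'D': 3, 'G': 4}}
h = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'D': 6, 'G': 0}
print(a_star(graph, h, 'S', 'G'))  # (['S', 'A', 'C', 'G'], 6)
```

Unlike greedy best-first search, the g(n) term makes A* prefer the cheap route through A and C over the direct but expensive edge S→G.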
Informed Search vs Uninformed Search:
● Utilizing knowledge: Informed search uses knowledge during the process of searching; uninformed
search does not use any knowledge during the process of searching.
● Speed: Informed search finds the solution more quickly; uninformed search is comparatively much
slower.
● Cost incurred: With informed search the expenses are much lower; with uninformed search the expenses
are comparatively higher.
● Suggestion/Direction: In informed search, the AI gets suggestions regarding how and where to find a
solution; in uninformed search the AI gets no suggestions about what solution to find or where to find
it, and whatever knowledge it has comes only from the information provided in the problem.
● Examples: Informed search examples include graph search and greedy search; uninformed search
examples include breadth-first search (BFS) and depth-first search (DFS).
Constraint Satisfaction Problems (CSP)
Topics: Constraint Types, Constraint Propagation, Backtracking Search, Local Search
Assignment
Each state in a CSP is defined by an assignment of values to some of the variables, {Xi=vi, Xj=vj, …}.
An assignment of values to variables can be classified in three ways:
Consistent (Legal) Assignment: An assignment which does not violate any constraint or rule.
Complete Assignment: An assignment in which every variable is assigned a value and the solution to the
CSP remains consistent.
Partial Assignment: An assignment which assigns values to only some of the variables.
The variables may use the following types of domains:
Finite (Discrete) Domain: A domain with a finite set of values for each variable, e.g. map-coloring
problems and the 8-queens problem.
Infinite Domain: A domain with infinitely many values, e.g. the set of integers or strings (as in
scheduling with time limits). With infinite domains, a constraint language must be used to describe
constraints instead of enumerating all allowed combinations of values. Continuous domains (e.g. time)
are a common special case of infinite domains.
With respect to the variables, there are basically the following types of constraints:
Unary Constraints: The simplest type of constraint, which restricts the value of a single variable.
Binary Constraints: A constraint which relates two variables (e.g. SA≠NSW). A binary CSP is one with
only binary constraints; it can be represented as a constraint graph.
Global Constraints: A constraint which involves an arbitrary number of variables (it need not involve
all the variables in a problem). One of the most common global constraints is Alldiff, which says that
all of the variables involved in the constraint must have different values.
Linear Constraints: Constraints commonly used in linear programming, where each variable appears only
in linear form.
Non-linear Constraints: Constraints used in non-linear programming, where variables appear in
non-linear form.
To formulate the map-coloring problem as a CSP:
Define the variables to be the regions: X = {WA, NT, Q, NSW, V, SA, T}.
The domain of each variable is the set Di = {red, green, blue}.
The constraints are C = {SA≠WA, SA≠NT, SA≠Q, SA≠NSW, SA≠V, WA≠NT, NT≠Q, Q≠NSW,
NSW≠V}. ( SA≠WA is a shortcut for <(SA,WA),SA≠WA>. )
Constraint graph: The nodes of the graph correspond to the variables of the problem, and a link connects
any two variables that participate in a constraint.
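The CSP formulation above translates directly into code. This is an illustrative encoding (the variable and function names are my own), together with a consistency check for partial assignments as defined earlier.

```python
# Encoding of the Australia map-coloring CSP from the text.
variables = ['WA', 'NT', 'Q', 'NSW', 'V', 'SA', 'T']
domains = {v: ['red', 'green', 'blue'] for v in variables}
constraints = [('SA', 'WA'), ('SA', 'NT'), ('SA', 'Q'), ('SA', 'NSW'),
               ('SA', 'V'), ('WA', 'NT'), ('NT', 'Q'), ('Q', 'NSW'),
               ('NSW', 'V')]      # each pair must receive different colors

def consistent(assignment):
    """A (possibly partial) assignment is consistent if no constraint
    between two already-assigned variables is violated."""
    return all(assignment[x] != assignment[y]
               for x, y in constraints
               if x in assignment and y in assignment)

print(consistent({'SA': 'blue', 'WA': 'red'}))   # True  (legal assignment)
print(consistent({'SA': 'blue', 'WA': 'blue'}))  # False (SA≠WA violated)
```

A complete assignment would give every variable in `variables` a value; the same `consistent` check then decides whether it is a solution.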
Constraint propagation is a special type of inference which helps in reducing the number of legal values
for the variables. The idea behind constraint propagation is local consistency.
If we treat each variable as a node in a graph and each binary constraint as an arc, then the process of
enforcing local consistency in each part of the graph causes inconsistent values to be eliminated
throughout the graph.
The following local consistencies are discussed below:
Node Consistency: A single variable (a node in the CSP network) is node-consistent if all the values in
the variable's domain satisfy the variable's unary constraints.
Arc Consistency: A variable is arc-consistent if every value in its domain satisfies the variable's
binary constraints. Formally, Xi is arc-consistent with respect to another variable Xj if for every
value in the current domain Di there is some value in the domain Dj that satisfies the binary constraint
on the arc (Xi, Xj). Arc consistency tightens the domains (unary constraints) using the arcs (binary
constraints).
Path Consistency: A two-variable set {Xi, Xj} is path-consistent with respect to a third variable Xm if,
for every consistent assignment {Xi=a, Xj=b}, there is a value for Xm that satisfies the binary
constraints on (Xi, Xm) and (Xm, Xj). It generalizes arc consistency.
k-consistency: This type of consistency defines stronger forms of propagation: a CSP is k-consistent if,
for any set of k-1 variables and any consistent assignment to those variables, a consistent value can
always be assigned to any k-th variable.
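Arc consistency is usually enforced with the AC-3 algorithm. The sketch below is an illustrative implementation, assuming symmetric neighbor lists and a `constraint(xi, a, xj, b)` predicate; it is not from the source text.

```python
from collections import deque

def ac3(domains, neighbors, constraint):
    """AC-3: repeatedly remove values that have no support in a neighbor's
    domain. Mutates `domains`; returns False if some domain becomes empty."""
    queue = deque((xi, xj) for xi in domains for xj in neighbors[xi])
    while queue:
        xi, xj = queue.popleft()
        # Values of Xi with no supporting value b in Dj are inconsistent.
        removed = [a for a in domains[xi]
                   if not any(constraint(xi, a, xj, b) for b in domains[xj])]
        if removed:
            domains[xi] = [a for a in domains[xi] if a not in removed]
            if not domains[xi]:
                return False                 # domain wiped out: no solution
            for xk in neighbors[xi]:         # Xi shrank, so recheck its arcs
                if xk != xj:
                    queue.append((xk, xi))
    return True

# Tiny example: X != Y, and Y's domain is already fixed to {1}.
doms = {'X': [1, 2], 'Y': [1]}
nbrs = {'X': ['Y'], 'Y': ['X']}
ac3(doms, nbrs, lambda xi, a, xj, b: a != b)
print(doms['X'])  # [2] -- the unsupported value 1 was pruned
```

In the map-coloring CSP the same routine prunes colors: once SA is fixed to blue, arc consistency removes blue from every neighbor's domain.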
CSP Problems
Constraint satisfaction covers problems which contain constraints that must be respected while solving
the problem. CSPs include the following problems:
Graph Coloring: The constraint is that no two adjacent regions (vertices) can have the same color.
Sudoku: The constraint is that no number from 1-9 can be repeated in the same row, column, or 3x3 box.
N-queens problem: The constraint is that no two queens may attack each other, i.e. no two queens may
share a row, column, or diagonal.
Crossword: The constraint is that the words must be formed correctly and be meaningful.
Example,
Here,
Green is the start point, blue is the intermediate point, red are points with no feasible solution, dark green is
end solution.
Here, the algorithm proceeds to an end point and checks whether it is a solution; if it is, it returns
the solution, otherwise it backtracks to the point one step behind and tries the next path towards a
solution.
Algorithm
Step 1 − If current_position is the goal, return success.
Step 2 − Else, if current_position is an end point, return failure.
Step 3 − Else, explore from current_position and repeat the above steps.
Let’s use this backtracking problem to find the solution to N-Queen Problem.
In the N-Queens problem, we are given an NxN chessboard and we have to place N queens on the board in
such a way that no two queens attack each other. A queen attacks another queen if the other queen lies
on the same row, column, or diagonal. Here, we will solve the 4-Queens problem.
Here, the solution is the binary board below, with 1s marking the positions where queens are placed:
{0 , 1 , 0 , 0}
{0 , 0 , 0 , 1}
{1 , 0 , 0 , 0}
{0 , 0 , 1 , 0}
Algorithm
Step 1 − Start from the 1st row.
Step 2 − Place a queen in the row and check. Do,
Step 2.1 − After placing the queen, mark the position as part of the solution and then recursively check
whether this leads to a solution.
Step 2.2 − If placing the queen doesn't lead to a solution, backtrack, go to Step 2, and try the queen
in other positions.
Step 2.3 − If placing the queen leads to a solution, return TRUE.
Step 3 − If all queens are placed, return TRUE.
Step 4 − If all positions are tried and no solution is found, return FALSE.
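The steps above can be sketched as a recursive backtracking function. This is an illustrative implementation (one queen per row, trying columns left to right); with this ordering it reproduces the 4-Queens board shown earlier.

```python
def solve_n_queens(n):
    """Backtracking N-Queens: place one queen per row, backtrack on conflict.

    Returns a board of 0/1 rows like the one shown in the text, or None.
    """
    cols = []                      # cols[r] = column of the queen in row r

    def safe(row, col):
        for r, c in enumerate(cols[:row]):
            if c == col or abs(c - col) == abs(r - row):
                return False       # same column or same diagonal
        return True

    def place(row):
        if row == n:
            return True            # all queens placed: solution found
        for col in range(n):
            if safe(row, col):
                cols.append(col)   # tentative placement (part of solution)
                if place(row + 1):
                    return True
                cols.pop()         # backtrack: undo and try the next column
        return False

    if not place(0):
        return None
    return [[1 if c == col else 0 for c in range(n)] for col in cols]

for row in solve_n_queens(4):
    print(row)
# [0, 1, 0, 0]
# [0, 0, 0, 1]
# [1, 0, 0, 0]
# [0, 0, 1, 0]
```

Each failed `safe` check corresponds to a red "no feasible solution" point in the figure; `cols.pop()` is the backtracking step.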
Elevation: It is defined by the value of the objective function or heuristic cost function.
The local search algorithm explores the above landscape by finding one of the following two points:
Global Minimum: If elevation corresponds to cost, then the task is to find the lowest valley, which is
known as the global minimum.
Global Maximum: If elevation corresponds to an objective function, then the task is to find the highest
peak, which is called the global maximum.
Simulated Annealing
Note: Local search algorithms do not need to remember all the nodes in memory; they operate on
complete-state formulations.
Global Maximum: It is the highest point on the hill, which is the goal state.
Local Maximum: It is a peak higher than its neighbouring states but lower than the global maximum.
Flat local maximum: It is a flat area of the hill where the neighbouring states have the same value,
with no uphill or downhill move available.
Note: Both simple, as well as steepest-ascent hill climbing search, fails when there is no closer node.
2. If the CURRENT node=GOAL node, return GOAL and terminate the search.
Stochastic hill climbing does not examine all the neighbours. It selects one neighbour at random and
decides whether to move to it or to examine another.
The random-restart algorithm is based on a try-and-try strategy: it runs hill climbing repeatedly from
random starting points and keeps the best result until the goal is found. Success depends largely on the
shape of the landscape: if there are few plateaus, local maxima, and ridges, it becomes easy to reach
the destination.
Local Maxima: A peak that is higher than all its neighbouring states but lower than the global maximum.
It is not the goal peak, because there is another peak higher than it.
Ridges: A sequence of local maxima of similar height, which is challenging to navigate: every available
move appears to go downhill even though a higher point exists nearby, so the search gets stuck.
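Simple hill climbing can be sketched in a few lines. The landscape below is an invented 1-D example chosen to show both failure modes discussed above: started at the left, the search gets stuck on a local maximum; started further right, it reaches the global maximum.

```python
def hill_climbing(f, neighbors, start):
    """Steepest-ascent hill climbing: move to the best neighbor until no
    neighbor improves on the current state (a local or global maximum)."""
    current = start
    while True:
        best = max(neighbors(current), key=f, default=current)
        if f(best) <= f(current):
            return current          # no uphill move left: stop here
        current = best

# Toy elevation profile: local maximum at x=2, global maximum at x=8.
landscape = [0, 1, 3, 1, 0, 2, 4, 6, 9, 5]
f = lambda x: landscape[x]
nbrs = lambda x: [i for i in (x - 1, x + 1) if 0 <= i < len(landscape)]
print(hill_climbing(f, nbrs, 0))  # 2 (stuck on the local maximum)
print(hill_climbing(f, nbrs, 5))  # 8 (reaches the global maximum)
```

Random-restart hill climbing would simply call this function from several random `start` values and keep the best result.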
Steps:
2. Then choose the number of iterations. An ML model is then built and its predictive performance
(otherwise called the objective function) is calculated.
3. A small percentage of features is randomly included/excluded from the model, just to 'perturb' the
feature set. The predictive performance is then calculated once again for this new set of features.
1. Repeated annealing with a schedule is very slow, especially if the cost function is expensive to
compute.
2. For problems where the energy landscape is smooth, or there are few local minima, SA is overkill:
simpler, faster methods (e.g., gradient descent) will work better. But we usually do not know what the
energy landscape looks like in advance.
3. Heuristic methods, which are problem-specific or take advantage of extra information about the
system, will often be better than general methods, but SA is often comparable to such heuristics.
4. The method cannot tell whether it has found an optimal solution. Some other method (e.g. branch and
bound) is required to establish this.
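The core of simulated annealing is its acceptance rule: always accept improvements, accept worse moves with probability exp(-ΔE/T), and cool T over time so such escapes become rarer. The following is a minimal sketch with an invented toy objective; the schedule parameters are illustrative, not prescriptive.

```python
import math
import random

def simulated_annealing(f, neighbor, start, t0=1.0, cooling=0.995, steps=5000):
    """Simulated annealing (minimization). Accepts worse candidates with
    probability exp(-delta / T) and geometrically decays the temperature."""
    current, t = start, t0
    best = current
    for _ in range(steps):
        candidate = neighbor(current)
        delta = f(candidate) - f(current)
        if delta < 0 or random.random() < math.exp(-delta / t):
            current = candidate            # accept (always if improving)
        if f(current) < f(best):
            best = current                 # remember the best state seen
        t *= cooling                       # cooling schedule
    return best

# Toy example: minimize (x - 3)^2 over the integers by +/-1 moves.
random.seed(0)
f = lambda x: (x - 3) ** 2
neighbor = lambda x: x + random.choice((-1, 1))
print(simulated_annealing(f, neighbor, start=20))
```

As T approaches zero the acceptance rule degenerates into plain hill climbing, which is why the cooling schedule (point 1 above) dominates both the runtime and the solution quality.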
This search can suffer from a lack of diversity among the k states.
Note: A variant of Local Beam Search is Stochastic Beam Search which selects k successors at random
rather than choosing the best k successors.
Adversarial Search:
• Adversarial search examines the problems which arise when we try to plan ahead in a world where other
agents are planning against us.
• AI adversarial search: Adversarial search is a game-playing technique in which the agents are
surrounded by a competitive environment.
• Conflicting goals are given to the agents (multi-agent setting). These agents compete with one
another and try to defeat one another in order to win the game. Such conflicting goals give rise to
adversarial search. Here, game playing means games in which human intelligence and logic are the
deciding factors, excluding factors such as luck. Tic-tac-toe, chess, checkers, etc., are games where
no luck factor works; only the mind works.
• Mathematically, this search is based on the concept of game theory. According to game theory, a game
is played between two players; to complete the game, one has to win and the other automatically loses.
Games
We have studied search strategies that involve only a single agent aiming to find a solution, often
expressed in the form of a sequence of actions.
But there may be situations where more than one agent is searching for a solution in the same search
space; this situation usually occurs in game playing.
Searches in which two or more players with conflicting goals try to explore the same search space for a
solution are called adversarial searches, often known as games.
Games are modelled as a search problem together with a heuristic evaluation function; these are the two
main factors which help to model and solve games in AI.
o Perfect information: A game with perfect information is one in which agents can see the complete
board. Agents have all the information about the game and can also see each other's moves. Examples are
chess, checkers, Go, tic-tac-toe, etc.
o Imperfect information: If the agents do not have all the information about the game and are not aware
of everything that is going on, it is a game with imperfect information, such as Battleship, blind
tic-tac-toe, Bridge, etc.
o Deterministic games: Deterministic games are those games which follow a strict pattern and set of
rules for the games, and there is no randomness associated with them. Examples are chess, Checkers,
Go, tic-tac-toe, etc.
o Non-deterministic games: Non-deterministic games have various unpredictable events and a factor of
chance or luck, introduced by dice or cards. These are random, and each action's outcome is not fixed.
Such games are also called stochastic games.
Example: backgammon, Monopoly, poker, etc.
In a game, each player must consider:
● what to do;
● how to decide the move;
● the opponent as well, since the opponent is also deciding what to do.
Each player tries to find out the response of the opponent to their actions. This requires embedded
thinking or backward reasoning to solve game problems in AI.
A game can be defined as a type of search in AI which can be formalized with the following elements:
The following figure is showing part of the game-tree for tic-tac-toe game. Following are some key
points of the game:
PLAYER(s): There are two players, MAX and MIN. MAX begins the game by picking one best move and placing
an X in an empty square.
RESULT(s, a): The moves made by MIN and MAX decide the outcome of the game.
TERMINAL-TEST(s): When all the empty boxes are filled (or a player has three in a row), a terminal state
of the game has been reached.
UTILITY(s): At the end, we get to know who wins, MAX or MIN, and the corresponding payoff is assigned
to them.
Example Explanation:
From the initial state, MAX has 9 possible moves, as he starts first. MAX places x and MIN places o, and
the players move alternately until we reach a leaf node where one player has three in a row or all
squares are filled.
Both players compute, for each node, the minimax value: the best achievable utility against an optimal
adversary.
Suppose both the players are well aware of the tic-tac-toe and playing the best play. Each player is
doing his best to prevent another one from winning. MIN is acting against Max in the game.
So in the game tree we have a layer for MAX and a layer for MIN, and each layer is called a ply. MAX
places x, then MIN places o to prevent MAX from winning, and the game continues until a terminal node
is reached.
In this either MIN wins, MAX wins, or it's a draw. This game-tree is the whole search space of
possibilities that MIN and MAX are playing tic-tac-toe and taking turns alternately.
It aims to find the optimal strategy for MAX to win the game.
It follows the approach of depth-first search.
In the game tree, the optimal leaf node could appear at any depth of the tree.
Minimax values are propagated up the tree from the terminal nodes once they are discovered.
In a given game tree, the optimal strategy can be determined from the minimax value of each node, which
can be written as MINIMAX(n). MAX prefers to move to a state of maximum value and MIN prefers to move
to a state of minimum value, so:
MINIMAX(s) =
  UTILITY(s)                                   if TERMINAL-TEST(s) is true
  max over actions a of MINIMAX(RESULT(s, a))  if PLAYER(s) = MAX
  min over actions a of MINIMAX(RESULT(s, a))  if PLAYER(s) = MIN
There is always a need to choose algorithms which provide the best solution in a limited time, so we use
the following techniques to fulfil this requirement:
Pruning: A technique which allows us to ignore portions of the search tree that make no difference to
the final result.
Heuristic Evaluation Function: It allows us to approximate the value of nodes at each level of the
search tree before reaching the goal node.
o In this example there are two players, one called the Maximizer and the other the Minimizer.
o The Maximizer tries to get the maximum possible score, and the Minimizer tries to get the minimum
possible score.
o This algorithm applies DFS, so in this game tree we have to go all the way down to the leaves to reach
the terminal nodes.
o At the terminal nodes the terminal values are given, so we compare those values and back the results
up the tree until the initial state is reached.
Following are the main steps involved in solving the two-player game tree:
Step 1: In the first step, the algorithm generates the entire game tree and applies the utility function
to obtain the utility values for the terminal states. In the tree diagram below, let A be the initial
state of the tree. Suppose the maximizer takes the first turn, with worst-case initial value -∞, and the
minimizer takes the next turn, with worst-case initial value +∞.
Step 3: In the next step, it is the minimizer's turn, so it will compare all node values with +∞ and
determine the values of the 3rd-layer nodes.
That was the complete workflow of the minimax two player game.
o Complete: The minimax algorithm is complete; it will definitely find a solution (if one exists) in a
finite search tree.
o Time complexity: As it performs DFS over the game tree, the time complexity of the minimax algorithm
is O(b^m), where b is the branching factor of the game tree and m is the maximum depth of the tree.
o Space complexity: The space complexity of the minimax algorithm is similar to DFS, which is O(bm).
The main drawback of the minimax algorithm is that it becomes very slow for complex games such as chess
and Go. These games have huge branching factors, giving the player many choices to consider. This
limitation of the minimax algorithm can be mitigated by alpha-beta pruning, which we discuss in the next
topic.
Alpha-Beta Pruning
o Alpha-beta pruning is a modified version of the minimax algorithm. It is an optimization technique
for the minimax algorithm.
The main condition required for alpha-beta pruning is: α >= β (when this holds, the remaining successors
of the node can be pruned).
Key points about alpha-beta pruning:
Step 1: The MAX player starts with the first move from node A, where α = -∞ and β = +∞. These values of
alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same
values to its child D.
Step 3: The algorithm now backtracks to node B, where the value of β changes, as this is MIN's turn:
β = min(+∞, 3) = 3. Hence at node B, α = -∞ and β = 3.
In the next step, the algorithm traverses the next successor of node B, node E, and the values α = -∞
and β = 3 are passed down as well.
Step 5: The algorithm again backtracks, from node B to node A. At node A, alpha changes to the maximum
available value, 3, as max(-∞, 3) = 3, and β = +∞. These two values are now passed to the right
successor of A, node C.
At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Step 6: At node F, the value of α is compared with the left child, 0: max(3, 0) = 3. It is then compared
with the right child, 1: max(3, 1) = 3, so α remains 3, but the node value of F becomes 1.
Step 8: C returns the value 1 to A, where the best value for A is max(3, 1) = 3. The final game tree
shows which nodes were computed and which were never computed (pruned). The optimal value for the
maximizer in this example is 3.
o Worst ordering: In some cases, the alpha-beta pruning algorithm does not prune any leaves of the tree
and works exactly like the minimax algorithm. It then also consumes more time, because of the alpha-beta
bookkeeping; such an ordering is called worst ordering. It occurs when the best moves lie on the right
side of the tree. The time complexity for such an ordering is O(b^m).
o Ideal ordering: The ideal ordering for alpha-beta pruning occurs when a lot of pruning happens, with
the best moves on the left side of the tree. Since we apply DFS, the left side is searched first, and in
the ideal case alpha-beta can search roughly twice as deep as minimax in the same amount of time. The
complexity under ideal ordering is O(b^(m/2)).
o Order the nodes in the tree so that the best nodes are checked first.
o Use domain knowledge when ordering moves. E.g. in chess, try captures first, then threats, then
forward moves, then backward moves.
o We can bookkeep (cache) the states, as there is a possibility that states may repeat.
Knowledge-Based Agent:
o An intelligent agent needs knowledge about the real world for taking decisions and reasoning to
act efficiently.
o Knowledge-based agents are agents that can maintain an internal state of knowledge, reason over that
knowledge, update their knowledge after observations, and take actions. These agents can represent the
world with some formal representation and act intelligently.
o A knowledge-based agent has two main components:
o Knowledge base, and
o Inference system.
A knowledge-based agent can be viewed at different levels which are given below:
1. Knowledge level
The knowledge level is the first level of a knowledge-based agent. At this level, we specify what the
agent knows and what the agent's goals are; with these specifications, we can fix its behaviour. For
example, suppose an automated taxi agent needs to go from station A to station B, and it knows the way
from A to B; this belongs to the knowledge level.
2. Logical level:
At this level, we understand how the knowledge is represented and stored: sentences are encoded into
different logics, i.e. an encoding of knowledge into logical sentences occurs. At the logical level, we
can expect the automated taxi agent to reach destination B.
3. Implementation level:
This is the physical representation of logic and knowledge. At the implementation level, the agent
performs actions as per the logical and knowledge levels. At this level, the automated taxi agent
actually implements its knowledge and logic so that it can reach the destination.
o Knowledge representation and reasoning (KR, KRR) is the part of artificial intelligence concerned with
how AI agents think and how thinking contributes to their intelligent behaviour.
What to Represent:
Following are the kinds of knowledge which need to be represented in AI systems:
o Objects: All the facts about objects in our world domain. E.g., guitars contain strings; trumpets are
brass instruments.
o Events: Events are the actions which occur in our world.
o Performance: It describes behaviour which involves knowledge about how to do things.
o Meta-knowledge: It is knowledge about what we know.
o Facts: Facts are the truths about the real world and what we represent.
o Knowledge base: The central component of a knowledge-based agent is the knowledge base, represented as
KB. The knowledge base is a group of sentences (here, "sentence" is used as a technical term; it is not
identical to a sentence in the English language).
Knowledge: Knowledge is awareness or familiarity gained by experiences of facts, data, and situations.
Following are the types of knowledge in artificial intelligence:
1. Declarative Knowledge:
2. Procedural Knowledge
3. Meta-knowledge:
4. Heuristic knowledge:
5. Structural knowledge:
There are mainly four ways of knowledge representation which are given as follows:
1. Logical Representation
2. Semantic Network Representation
3. Frame Representation
4. Production Rules
1. Logical Representation
Logical representation is a language with some concrete rules which deals with propositions and has no
ambiguity in representation. Logical representation means drawing conclusions based on various
conditions. This representation lays down some important communication rules. It consists of precisely
defined syntax and semantics:
Syntax:
o Syntax rules decide how we can construct legal sentences in the logic.
o They determine which symbols we can use in knowledge representation, and how to write those symbols.
Semantics:
o Semantics are the rules by which we can interpret a sentence in the logic.
o Semantics also involve assigning a meaning to each sentence.
Logical representation can be categorized into mainly two logics:
a. Propositional logic
b. Predicate logic
Disadvantages of logical representation:
1. Logical representations have some restrictions and are challenging to work with.
2. The logical representation technique may not be very natural, and inference may not be very efficient.
2. Semantic Network Representation
Semantic networks are an alternative to predicate logic for knowledge representation. In semantic
networks, we can represent our knowledge in the form of graphical networks. Such a network consists of
nodes representing objects and arcs which describe the relationships between those objects. Semantic
networks can categorize objects in different forms and can also link those objects. Semantic networks are
easy to understand and can be easily extended.
Statements:
a. Jerry is a cat.
b. Jerry is a mammal.
c. Jerry is owned by Priya.
d. Jerry is brown colored.
e. All mammals are animals.
In the above diagram, we have represented the different types of knowledge in the form of nodes and arcs.
Each object is connected with another object by some relation.
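As a rough sketch, the Jerry statements can be stored as (subject, relation, object) triples in Python, with a tiny query that follows "is-a" arcs to inherit category membership. The relation names and the helper function are illustrative assumptions, not part of any standard library:

```python
# A minimal semantic-network sketch: knowledge as (subject, relation, object)
# triples, with a query that walks "is-a" arcs to inherit category membership.

TRIPLES = [
    ("Jerry", "is-a", "cat"),
    ("Jerry", "is-a", "mammal"),
    ("Jerry", "owned-by", "Priya"),
    ("Jerry", "has-color", "brown"),
    ("mammal", "is-a", "animal"),
]

def is_a(node, category):
    """Return True if `node` reaches `category` by following is-a arcs."""
    if node == category:
        return True
    return any(
        is_a(obj, category)
        for subj, rel, obj in TRIPLES
        if subj == node and rel == "is-a"
    )

print(is_a("Jerry", "animal"))  # True: Jerry is-a mammal, mammal is-a animal
```

Note that answering the query requires traversing the network, which illustrates the runtime cost discussed below.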
Drawbacks in semantic representation:
1. Semantic networks take more computational time at runtime, as we need to traverse the complete
network tree to answer a question. In the worst case, we may traverse the entire tree only to find
that the solution does not exist in the network.
2. Semantic networks try to model human-like memory (which has about 10^15 neurons and links) to
store information, but in practice it is not possible to build such a vast semantic network.
3. These types of representations are inadequate as they have no equivalent quantifiers, e.g.,
for all, for some, none, etc.
4. Semantic networks do not have any standard definition for link names.
5. These networks are not intelligent and depend on the creator of the system.
3. Frame Representation
A frame is a record-like structure which consists of a collection of attributes and their values to describe an
entity in the world. Frames are an AI data structure which divides knowledge into substructures by
representing stereotyped situations. A frame consists of a collection of slots and slot values. These slots may
be of any type and size. Slots have names and values, which are called facets.
Facets: The various aspects of a slot are known as facets. Facets are features of frames which enable us
to put constraints on frames. Example: IF-NEEDED facets are called when the data of a particular slot
is needed. A frame may consist of any number of slots, a slot may include any number of facets, and a
facet may have any number of values. A frame is also known as slot-filler knowledge representation in
artificial intelligence.
Frames are derived from semantic networks and later evolved into our modern-day classes and objects. A
single frame is not of much use. A frame system consists of a collection of connected frames. In a frame,
knowledge about an object or event can be stored together in the knowledge base. The frame is a type of
technology which is widely used in various applications, including natural language processing and
machine vision.
Example: 1
A frame describing a book:
Slots     Fillers
Year      1996
Page      1152
Example: 2
Let's suppose we are taking an entity, Peter. Peter is a doctor by profession, his age is 25, and he lives
in the city London, in the country England. The following is the frame representation for this:
Slots        Fillers
Name         Peter
Profession   Doctor
Age          25
Weight       78
City         London
Country      England
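As a loose sketch in Python, the Peter frame can be modelled as a dictionary of slots and fillers, with a default facet consulted when a slot is missing. The `DEFAULTS` facet and the helper name are assumptions for illustration; the City and Country slots come from the prose description of Peter:

```python
# A frame as a dictionary of slots and fillers, with a simple default facet.

peter = {
    "Name": "Peter",
    "Profession": "Doctor",
    "Age": 25,
    "Weight": 78,
    "City": "London",     # from the prose: Peter lives in London
    "Country": "England",
}

# Default facet: consulted when a frame does not fill a slot itself.
DEFAULTS = {"Species": "Human"}

def get_slot(frame, slot):
    """Look up a slot, falling back to a default filler if absent."""
    return frame.get(slot, DEFAULTS.get(slot))

print(get_slot(peter, "Profession"))  # Doctor
print(get_slot(peter, "Species"))     # Human (default value)
```

The default-lookup behaviour mirrors the point below that frames make it easy to include default data and search for missing values.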
Advantages of frame representation:
1. The frame knowledge representation makes programming easier by grouping related data.
2. The frame representation is comparably flexible and is used by many applications in AI.
3. It is very easy to add slots for new attributes and relations.
4. It is easy to include default data and to search for missing values.
5. Frame representation is easy to understand and visualize.
4. Production Rules
A production rule system consists of (condition, action) pairs, which mean "If condition then action". It has
mainly three parts:
o The set of production rules
o Working memory
o The recognize-act cycle
The working memory contains the description of the current state of problem-solving, and a rule can write
knowledge to the working memory. This knowledge may then match and fire other rules.
If a new situation (state) is generated, then multiple production rules may be triggered together; this set of
candidate rules is called the conflict set. In this situation, the agent needs to select one rule from the set,
and this selection is called conflict resolution.
Example:
o IF (at bus stop AND bus arrives) THEN action (get into the bus)
o IF (on the bus AND paid AND empty seat) THEN action (sit down).
o IF (on bus AND unpaid) THEN action (pay charges).
o IF (bus arrives at destination) THEN action (get down from the bus).
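The bus rules above can be sketched as a toy production system in Python. The "first matching rule fires" strategy used here is one simple conflict-resolution choice among many, and the rule encoding is an illustrative assumption:

```python
# A toy production-rule interpreter for the bus example.  Each rule is a
# (conditions, action) pair; the first rule whose conditions are all present
# in working memory fires (a simple first-match conflict-resolution policy).

RULES = [
    ({"at bus stop", "bus arrives"}, "get into the bus"),
    ({"on the bus", "paid", "empty seat"}, "sit down"),
    ({"on the bus", "unpaid"}, "pay charges"),
    ({"bus arrives at destination"}, "get down from the bus"),
]

def fire(working_memory):
    """Return the action of the first rule whose conditions all hold."""
    for conditions, action in RULES:
        if conditions <= working_memory:  # subset test: all conditions hold
            return action
    return None

print(fire({"on the bus", "unpaid"}))  # pay charges
```

Note that the interpreter stores no result for later reuse, which illustrates the lack of learning capability mentioned below.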
Disadvantages of production rules:
1. A production rule system does not exhibit any learning capability, as it does not store the result of
a problem for future use.
2. During the execution of the program many rules may be active; hence rule-based production
systems are inefficient.
Propositional Logic:
Propositional logic (PL) is the simplest form of logic, where all statements are made by propositions. A
proposition is a declarative statement which is either true or false.
Example:
"5 is a prime number."
In propositional logic, we use symbolic variables to represent the logic, and we can use any
symbol to represent a proposition, such as A, B, C, P, Q, R, etc.
The propositions and connectives are the basic elements of the propositional logic.
A proposition formula which is always true is called a tautology; it is also called a valid
sentence. A proposition formula which is always false is called a contradiction.
A proposition formula which has both true and false values is called a contingency.
Statements which are questions, commands, or opinions, such as "Where is
Rohini?", "How are you?", "What is your name?", are not propositions.
There are two types of propositions:
a. Atomic propositions
b. Compound propositions
o Atomic Proposition:
Atomic propositions are simple propositions consisting of a single proposition symbol.
These are sentences which must be either true or false.
Example: "2 + 2 is 4" is an atomic proposition, as it is a true fact.
o Compound proposition:
Compound propositions are constructed by combining simpler or atomic propositions, using
parentheses and logical connectives.
Example: "It is raining today, and the street is wet."
Logical Connectives:
Logical connectives are used to connect two simpler propositions or to represent a sentence logically.
We can create compound propositions with the help of logical connectives. There are mainly five
connectives, which are given as follows:
1. Negation: A sentence such as ¬P is called the negation of P. A literal can be either a positive literal
or a negative literal.
2. Conjunction: A sentence which has the ∧ connective, such as P ∧ Q, is called a conjunction.
3. Disjunction: A sentence which has the ∨ connective, such as P ∨ Q, is called a disjunction.
4. Implication: A sentence such as P → Q is called an implication (if P, then Q).
5. Biconditional: A sentence such as P ⇔ Q is called a biconditional (P if and only if Q).
Precedence of connectives:
Just like arithmetic operators, there is a precedence order for propositional connectives or logical
operators. This order should be followed while evaluating a propositional problem. The following is the
precedence order for the operators:
Precedence   Operators
First        Parentheses
Second       Negation (¬)
Third        Conjunction (AND)
Fourth       Disjunction (OR)
Fifth        Implication
Sixth        Biconditional
Logical equivalence:
Logical equivalence is one of the features of propositional logic. Two propositions are said to be
logically equivalent if and only if their columns in the truth table are identical.
Let's take two propositions A and B; for logical equivalence, we write A⇔B. In the truth table below we
can see that the columns for ¬A∨B and A→B are identical, hence the two formulas are equivalent:
A      B      ¬A     ¬A∨B   A→B
T      T      F      T      T
T      F      F      F      F
F      T      T      T      T
F      F      T      T      T
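Such an equivalence can also be verified mechanically by enumerating all truth assignments, as in this short Python check (De Morgan's law, listed among the properties below, is checked the same way):

```python
# Verify A→B ≡ ¬A ∨ B and De Morgan's law over all four truth assignments.
from itertools import product

# Truth table of implication, listed explicitly rather than derived.
IMPLIES = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}

for a, b in product([True, False], repeat=2):
    assert IMPLIES[(a, b)] == ((not a) or b)         # A→B ≡ ¬A ∨ B
    assert (not (a and b)) == ((not a) or (not b))   # De Morgan's law

print("equivalences verified")
```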
Properties of Operators:
o Commutativity:
o P∧ Q= Q ∧ P, or
o P ∨ Q = Q ∨ P.
o Associativity:
o (P ∧ Q) ∧ R= P ∧ (Q ∧ R),
o (P ∨ Q) ∨ R= P ∨ (Q ∨ R)
o Identity element:
o P ∧ True = P,
o P ∨ True= True.
o Distributive:
o P∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
o P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
o DE Morgan's Law:
o ¬ (P ∧ Q) = (¬P) ∨ (¬Q)
o ¬ (P ∨ Q) = (¬ P) ∧ (¬Q).
o Double-negation elimination:
o ¬ (¬P) = P.
o We cannot represent relations like all, some, or none with propositional logic. Example: "All the
girls are intelligent."
To represent such statements, propositional logic is not sufficient, so we require a more powerful logic,
such as first-order logic.
First-Order logic:
o First-order logic is another way of knowledge representation in artificial intelligence. It is an
extension to propositional logic.
o FOL is sufficiently expressive to represent the natural language statements in a concise way.
o First-order logic is also known as Predicate logic or First-order predicate logic. First-order
logic is a powerful language that expresses information about objects in a more natural way and
can also express the relationships between those objects.
o First-order logic (like natural language) does not only assume that the world contains facts like
propositional logic but also assumes the following things in the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ......
o Relations: It can be a unary relation such as: red, round, is adjacent, or an n-ary relation
such as: the sister of, brother of, has color, comes between
o Function: Father of, best friend, third inning of, end of, ......
o Like natural language, first-order logic also has two main parts:
a. Syntax
b. Semantics
The basic syntactic elements of first-order logic are symbols:
Constants    1, 2, A, John, Mumbai, cat, ...
Variables    x, y, z, a, b, ...
Predicates   Brother, Father, >, ...
Functions    sqrt, LeftLegOf, ...
Connectives  ∧, ∨, ¬, ⇒, ⇔
Equality     ==
Quantifiers  ∀, ∃
Atomic sentences:
o Atomic sentences are the most basic sentences of first-order logic. These sentences are formed
from a predicate symbol followed by a parenthesis with a sequence of terms.
o We can represent atomic sentences as Predicate (term1, term2, ......, term n).
Complex Sentences:
o Complex sentences are made by combining atomic sentences using connectives.
Universal Quantifier:
Universal quantifier is a symbol of logical representation, which specifies that the statement within
its range is true for everything or every instance of a particular thing.
The Universal quantifier is represented by a symbol ∀, which resembles an inverted A.
Note: In universal quantifier we use implication "→".
If x is a variable, then ∀x is read as:
o For all x
o For each x
o For every x.
Example:
All men drink coffee.
Let x be a variable which refers to a man; then the statement can be represented in the UOD as:
∀x man(x) → drink(x, coffee)
Existential Quantifier:
Existential quantifiers are the type of quantifiers, which express that the statement within its scope is
true for at least one instance of something.
It is denoted by the logical operator ∃, which resembles an inverted E. When it is used with a predicate
variable, it is called an existential quantifier.
Note: In Existential quantifier we always use AND or Conjunction symbol (∧).
If x is a variable, then existential quantifier will be ∃x or ∃(x). And it will be read as:
o There exists a 'x.'
o For some 'x.'
o For at least one 'x.'
Example:
Some boys are intelligent.
In FOL: ∃x boys(x) ∧ intelligent(x)
Points to remember:
The main connective for universal quantifier ∀ is implication →.
The main connective for existential quantifier ∃ is and ∧.
Properties of Quantifiers:
In universal quantifier, ∀x∀y is similar to ∀y∀x.
In Existential quantifier, ∃x∃y is similar to ∃y∃x.
∃x∀y is not similar to ∀y∃x.
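Over a finite universe of discourse, ∀ behaves like Python's all() and ∃ like any(). The small domain below is a made-up illustration; note the implication used with ∀ versus the conjunction used with ∃, exactly as in the points above:

```python
# A finite universe of discourse: a few people with made-up properties.
universe = [
    {"name": "Ajay", "man": True,  "drinks_coffee": True},
    {"name": "Ravi", "man": True,  "drinks_coffee": True},
    {"name": "Mina", "man": False, "drinks_coffee": False},
]

# ∀x man(x) → drink(x, coffee): implication, encoded as (not man) or coffee.
all_men_drink = all((not x["man"]) or x["drinks_coffee"] for x in universe)

# ∃x man(x) ∧ drink(x, coffee): conjunction with the existential quantifier.
some_man_drinks = any(x["man"] and x["drinks_coffee"] for x in universe)

print(all_men_drink, some_man_drinks)  # True True
```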
In this topic, we will understand the knowledge engineering process in an electronic circuit domain,
which is already familiar. This approach is mainly suitable for creating special-purpose knowledge
bases.
At the first level or highest level, we will examine the functionality of the circuit:
At the second level, we will examine the circuit structure details such as:
For gate input, we will use the function In(1, X1) for denoting the first input terminal of the gate, and for
output terminal we will use Out (1, X1).
The function Arity(c, i, j) is used to denote that circuit c has i inputs and j outputs.
The connectivity between gates can be represented by predicate Connect(Out(1, X1), In(1, X1)).
We use a unary predicate On (t), which is true if the signal at a terminal is on.
o If two terminals are connected, then they have the same signal; this can be represented as:
∀ t1, t2 Terminal (t1) ∧ Terminal (t2) ∧ Connect (t1, t2) → Signal (t1) = Signal (t2).
o The signal at every terminal will have either the value 0 or 1; this will be represented as:
∀ t Terminal (t) → Signal (t) = 1 ∨ Signal (t) = 0.
For the given circuit C1, we can encode the problem instance in atomic sentences as below:
1. Since the circuit contains two XOR gates, two AND gates, and one OR gate, the atomic sentences
for these gates are:
For XOR gates: Type(X1) = XOR, Type(X2) = XOR
2. For AND gates: Type(A1) = AND, Type(A2) = AND
3. For OR gate: Type(O1) = OR.
What should be the combination of input which would generate the first output of circuit C1, as 0 and a
second output to be 1?
1. ∃ i1, i2, i3 Signal (In(1, C1))=i1 ∧ Signal (In(2, C1))=i2 ∧ Signal (In(3, C1))= i3
2. ∧ Signal (Out(1, C1)) =0 ∧ Signal (Out(2, C1))=1
Substitution:
Substitution is a fundamental operation performed on terms and formulas. It occurs in all inference
systems in first-order logic. The substitution is complex in the presence of quantifiers in FOL. If we
write F[a/x], so it refers to substitute a constant "a" in place of variable "x".
Note: First-order logic is capable of expressing facts about some or all objects in the universe
Equality:
First-Order logic does not only use predicate and terms for making atomic sentences but also uses another
way, which is equality in FOL. For this, we can use equality symbols which specify that the two terms
refer to the same object.
Example: Brother (John) = Smith.
In the above example, the object referred to by Brother (John) is the same as the object referred to
by Smith. The equality symbol can also be used with negation to represent that two terms are not the
same object.
1. Universal Generalization:
o Universal generalization is a valid inference rule which states that if premise P(c) is true for any
arbitrary element c in the universe of discourse, then we can have a conclusion as ∀ x P(x).
Example: Let's represent, P(c): "A byte contains 8 bits", so for ∀ x P(x) "All bytes contain 8 bits.", it
will also be true.
2. Universal Instantiation:
o Universal instantiation, also called universal elimination or UI, is a valid inference rule. It can
be applied multiple times to add new sentences.
o The new KB is logically equivalent to the previous KB.
o As per UI, we can infer any sentence obtained by substituting a ground term for the variable.
o The UI rule states that we can infer any sentence P(c), obtained by substituting a ground term c (a
constant within the domain of x) into ∀ x P(x), for any object in the universe of discourse.
Example:
"All kings who are greedy are Evil." Let our knowledge base contain this detail in the form of FOL:
∀x King(x) ∧ Greedy(x) → Evil(x)
From this information, we can infer any of the following statements using Universal Instantiation:
King(John) ∧ Greedy(John) → Evil(John)
King(Richard) ∧ Greedy(Richard) → Evil(Richard)
3. Existential Instantiation:
o Existential instantiation is also called as Existential Elimination, which is a valid inference rule in
first-order logic.
o It can be applied only once to replace the existential sentence.
o The new KB is not logically equivalent to old KB, but it will be satisfiable if old KB was
satisfiable.
o This rule states that one can infer P(c) from the formula given in the form of ∃x P(x) for a new
constant symbol c.
o The restriction with this rule is that the c used in the rule must be a new term for which P(c) is true.
Example:
From ∃x Crown(x) ∧ OnHead(x, John),
we can infer: Crown(K) ∧ OnHead(K, John), as long as K does not appear elsewhere in the knowledge
base.
4. Existential Introduction:
o This rule, also known as existential generalization, states that if there is some element c in the
universe of discourse which has property P, then we can infer that there exists something in the
universe which has property P, i.e., from P(c) we can infer ∃x P(x).
Probabilistic reasoning:
Uncertainty:
Till now, we have learned knowledge representation using first-order logic and propositional logic with
certainty, which means we were sure about the predicates. With this knowledge representation we might
write A→B, which means if A is true then B is true; but consider a situation where we are not sure whether
A is true or not. Then we cannot express this statement, and this situation is called uncertainty.
So to represent uncertain knowledge, where we are not sure about the predicates, we need uncertain
reasoning or probabilistic reasoning.
Causes of uncertainty:
Uncertainty in the real world commonly arises from unreliable information sources, experimental errors,
equipment faults, and environmental variation.
Probabilistic reasoning:
Probabilistic reasoning is a way of knowledge representation where we apply the concept of probability
to indicate the uncertainty in knowledge. In probabilistic reasoning, we combine probability theory with
logic to handle the uncertainty.
We use probability in probabilistic reasoning because it provides a way to handle the uncertainty that is
the result of someone's laziness and ignorance.
In the real world there are many scenarios where the certainty of something is not confirmed, such as
"It will rain today", "the behavior of someone in some situation", or "a match between two teams or two
players". These are probable sentences, for which we can assume that the event will occur but are not
sure about it, so here we use probabilistic reasoning.
In probabilistic reasoning, there are two ways to solve problems with uncertain knowledge:
o Bayes' rule
o Bayesian Statistics
As probabilistic reasoning uses probability and related terms, so before understanding probabilistic
reasoning, let's understand some common terms:
Probability: Probability can be defined as a chance that an uncertain event will occur. It is the
numerical measure of the likelihood that an event will occur. The value of a probability always lies
between 0 and 1:
0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
We can find the probability of an uncertain event by using the formula:
P(A) = (Number of favourable outcomes) / (Total number of outcomes)
Sample space: The collection of all possible events is called sample space.
Random variables: Random variables are used to represent the events and objects in the real world.
Prior probability: The prior probability of an event is probability computed before observing new
information.
Posterior Probability: The probability that is calculated after all evidence or information has been taken
into account. It is a combination of prior probability and new information.
Conditional probability is the probability of an event occurring when another event has already happened.
Let's suppose we want to calculate the probability of event A when event B has already occurred, "the
probability of A under the condition B"; it can be written as:
P(A|B) = P(A⋀B) / P(B)
If the probability of A is given and we need to find the probability of B, then it will be given as:
P(B|A) = P(A⋀B) / P(A)
It can be explained using a Venn diagram: when B has occurred, the sample space is reduced to the set B,
and we can calculate event A given that event B has occurred by dividing the probability of P(A⋀B)
by P(B).
Example:
In a class, 70% of the students like English and 40% of the students like both English and mathematics.
What percentage of the students who like English also like mathematics?
Solution:
Let A be the event that a student likes mathematics and B the event that a student likes English. Then
P(A|B) = P(A⋀B) / P(B) = 0.4 / 0.7 = 57.14%
Hence, 57.14% of the students who like English also like mathematics.
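This conditional-probability arithmetic can be checked in a couple of lines of Python (the helper name `conditional` is just for illustration):

```python
# P(Maths | English) = P(English ∧ Maths) / P(English) = 0.40 / 0.70

def conditional(p_a_and_b, p_b):
    """Conditional probability P(A|B) = P(A ∧ B) / P(B)."""
    return p_a_and_b / p_b

p = conditional(0.40, 0.70)
print(f"{p:.2%}")  # 57.14%
```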
Bayes' theorem:
Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning, which determines the
probability of an event with uncertain knowledge.
In probability theory, it relates the conditional probability and marginal probabilities of two random
events.
Bayes' theorem was named after the British mathematician Thomas Bayes. The Bayesian inference is
an application of Bayes' theorem, which is fundamental to Bayesian statistics.
Bayes' theorem allows updating the probability prediction of an event by observing new information from
the real world.
Example: If cancer corresponds to one's age then by using Bayes' theorem, we can determine the
probability of cancer more accurately with the help of age.
Bayes' theorem can be derived using the product rule and the conditional probability of event A with known
event B:
From the product rule we can write: P(A⋀B) = P(A|B) P(B), and
similarly, with known event A: P(A⋀B) = P(B|A) P(A).
Equating the two right-hand sides and dividing by P(B), we get:
P(A|B) = P(B|A) P(A) / P(B) ........ (a)
The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most
modern AI systems for probabilistic inference.
It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is known as the posterior, which we need to calculate; it is read as the probability of
hypothesis A given that evidence B has occurred.
P(B|A) is called the likelihood: assuming the hypothesis is true, we calculate the probability of the
evidence.
P(A) is called the prior probability: the probability of the hypothesis before considering the evidence.
P(B) is called the marginal probability: the probability of the evidence.
In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai); hence Bayes' rule can be written
as:
P(Ai|B) = P(B|Ai) P(Ai) / Σk P(Ak) P(B|Ak)
where A1, A2, A3, ........, An is a set of mutually exclusive and exhaustive events.
Example-2:
Question: From a standard deck of playing cards, a single card is drawn. The probability that the
card is a king is 4/52. Calculate the posterior probability P(King|Face), i.e., the probability that the
drawn card is a king, given that it is a face card.
Solution:
P(King|Face) = P(Face|King) P(King) / P(Face)
P(Face|King) = 1, since every king is a face card.
P(King) = 4/52 = 1/13, and P(Face) = 12/52 = 3/13, since each of the four suits has three face cards
(J, Q, K).
Hence P(King|Face) = (1 × 1/13) / (3/13) = 1/3.
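The same computation can be done with exact fractions using Python's standard `fractions` module:

```python
from fractions import Fraction

# Bayes' rule on the card example:
# P(King|Face) = P(Face|King) · P(King) / P(Face)
p_king = Fraction(4, 52)         # 4 kings in a 52-card deck
p_face = Fraction(12, 52)        # 12 face cards: J, Q, K in each of 4 suits
p_face_given_king = Fraction(1)  # every king is a face card

p_king_given_face = p_face_given_king * p_king / p_face
print(p_king_given_face)  # 1/3
```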
Applications of Bayes' theorem in AI:
o It is used to calculate the next step of a robot when the already executed step is given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.
Bayesian Belief Network:
A Bayesian network is a probabilistic graphical model which represents a set of variables and their
conditional dependencies using a directed acyclic graph. It is also called a Bayes network, belief network,
decision network, or Bayesian model.
Bayesian networks are probabilistic, because these networks are built from a probability distribution,
and also use probability theory for prediction and anomaly detection.
Real world applications are probabilistic in nature, and to represent the relationship between multiple
events, we need a Bayesian network. It can also be used in various tasks including prediction, anomaly
detection, diagnostics, automated insight, reasoning, time series prediction, and decision making
under uncertainty.
A Bayesian network can be used for building models from data and experts' opinions, and it consists of two
parts:
o Directed Acyclic Graph (DAG)
o Table of conditional probabilities
The generalized form of Bayesian network that represents and solve decision problems under uncertain
knowledge is known as an Influence diagram.
A Bayesian network graph is made up of nodes and Arcs (directed links), where:
o Each node corresponds to the random variables, and a variable can be continuous or discrete.
o Arc or directed arrows represent the causal relationship or conditional probabilities between
random variables. These directed links or arrows connect the pair of nodes in the graph.
These links represent that one node directly influences the other node; if there is no directed
link, then the nodes are independent of each other.
o In the above diagram, A, B, C, and D are random variables represented by the nodes
of the network graph.
o If we are considering node B, which is connected with node A by a directed arrow,
then node A is called the parent of Node B.
o Node C is independent of node A.
The Bayesian network has mainly two components:
o Causal component
o Actual numbers
Each node in the Bayesian network has a conditional probability distribution P(Xi | Parents(Xi)), which
determines the effect of the parents on that node.
Bayesian network is based on Joint probability distribution and conditional probability. So let's first
understand the joint probability distribution:
If we have variables x1, x2, x3, ....., xn, then the probabilities of the different combinations of x1, x2,
x3, ..., xn are known as the joint probability distribution.
The joint probability P[x1, x2, x3, ....., xn] can be written as follows in terms of conditional probabilities:
P[x1, x2, ..., xn] = P[x1 | x2, ..., xn] P[x2 | x3, ..., xn] ... P[xn-1 | xn] P[xn]
In general, for each variable Xi in a Bayesian network, we can write the equation as:
P(X1, X2, ..., Xn) = ∏i P(Xi | Parents(Xi))
Example: Harry installed a new burglar alarm at his home to detect burglary. The alarm reliably responds
to a burglary, but it also responds to minor earthquakes. Harry has two neighbors, David and Sophia, who
have taken responsibility for informing Harry at work when they hear the alarm. David always
calls Harry when he hears the alarm, but sometimes he gets confused with the phone ringing and calls
then as well. On the other hand, Sophia likes to listen to loud music, so sometimes she misses the
alarm. Here we would like to compute the probability of the burglar alarm sounding.
Problem:
Calculate the probability that alarm has sounded, but there is neither a burglary, nor an earthquake
occurred, and David and Sophia both called the Harry.
Solution:
o Burglary (B)
o Earthquake(E)
o Alarm(A)
o David Calls(D)
o Sophia calls(S)
We can write the events of the problem statement in the form of probability as P[D, S, A, B, E], and
rewrite the above probability statement using the joint probability distribution:
P[D, S, A, ¬B, ¬E] = P[D|A] P[S|A] P[A|¬B, ¬E] P[¬B] P[¬E]
P(E = False) = 0.999, which is the probability that an earthquake did not occur.
The conditional probability that Sophia calls depends on its parent node "Alarm".
From the formula of the joint distribution, we can write the problem statement in the form of a probability
distribution:
P(S, D, A, ¬B, ¬E) = P(S|A) P(D|A) P(A|¬B ∧ ¬E) P(¬B) P(¬E)
= 0.75 × 0.91 × 0.001 × 0.998 × 0.999
= 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using Joint distribution.
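The joint-probability computation for this query can be reproduced in a few lines of Python. The conditional-probability values below are assumptions in the style of the usual textbook tables for this example, since the tables themselves are not reproduced here:

```python
# Joint probability P(D, S, A, ¬B, ¬E) for the alarm network; it factorises
# along the graph as P(D|A)·P(S|A)·P(A|¬B,¬E)·P(¬B)·P(¬E).
# The numeric values are assumed, textbook-style CPT entries.

p_d_given_a     = 0.91   # David calls, given the alarm sounds
p_s_given_a     = 0.75   # Sophia calls, given the alarm sounds
p_a_given_nb_ne = 0.001  # alarm sounds, given no burglary and no earthquake
p_not_b         = 0.998  # no burglary
p_not_e         = 0.999  # no earthquake

p = p_d_given_a * p_s_given_a * p_a_given_nb_ne * p_not_b * p_not_e
print(round(p, 8))  # 0.00068045
```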
There are two ways to understand the semantics of a Bayesian network, which are given below:
1. To understand the network as a representation of the joint probability distribution (helpful when
constructing the network).
2. To understand the network as an encoding of a collection of conditional independence statements
(helpful when designing inference procedures).
Fuzzy Logic:
The word 'fuzzy' means things that are not clear or are vague. Sometimes we cannot decide in real
life whether a given problem or statement is true or false. At that time, this concept provides many
values between true and false and gives the flexibility to find the best solution to the problem.
Fuzzy logic contains multiple logical values, and these values are the truth values of a variable or
problem between 0 and 1. The concept was introduced by Lotfi Zadeh in 1965, based on his Fuzzy Set
Theory. It provides possibilities which are not given by ordinary computers but are similar to the range
of possibilities generated by humans.
In the Boolean system, only two possibilities (0 and 1) exist, where 1 denotes the absolute truth value and
0 denotes the absolute false value. But in the fuzzy system, there are multiple possibilities present between
the 0 and 1, which are partially false and partially true.
Fuzzy logic can be implemented in systems such as micro-controllers, workstation-based systems, or large
network-based systems for achieving a definite output. It can also be implemented in both hardware and
software.
Characteristics of Fuzzy Logic
1. This concept is flexible, and we can easily understand and implement it.
2. It helps to minimize the logic created by humans.
3. It is the best method for finding solutions to problems which are suitable for approximate
or uncertain reasoning.
4. It always offers two values, which denote the two possible solutions for a problem or statement.
5. It allows users to build or create non-linear functions of arbitrary complexity.
6. In fuzzy logic, everything is a matter of degree.
7. In fuzzy logic, any logical system can be easily fuzzified.
8. It is based on natural language processing.
The architecture of a fuzzy logic system has four main parts:
1. Rule Base
2. Fuzzification
3. Inference Engine
4. Defuzzification
1. Rule Base
Rule Base is a component used for storing the set of rules and the If-Then conditions given by experts,
which are used for controlling the decision-making system. There have recently been many updates in
fuzzy theory which offer effective methods for designing and tuning fuzzy controllers. These
developments have decreased the number of fuzzy rules required.
2. Fuzzification
Fuzzification is a module or component for transforming the system inputs, i.e., it converts crisp
numbers into fuzzy sets. The crisp numbers are the inputs measured by sensors, which fuzzification
passes into the control system for further processing. This component divides the input signal into the
following five states in any fuzzy logic system:
o Large Positive (LP)
o Medium Positive (MP)
o Small (S)
o Medium Negative (MN)
o Large Negative (LN)
3. Inference Engine
This component is the main component of any Fuzzy Logic System (FLS), because all the information is
processed in the inference engine. It finds the matching degree between the current fuzzy input and the
rules. Based on the matching degree, the system determines which rules are to be fired for the given
input. When all rules have fired, they are combined to develop the control actions.
4. Defuzzification
Defuzzification is a module or component which takes the fuzzy set inputs generated by the inference
engine and transforms them into a crisp value. It is the last step in the process of a fuzzy logic
system. The crisp value is the type of value that is acceptable to the user. Various techniques exist
to do this, and the user has to select the one that best reduces the errors.
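As an illustrative sketch of the fuzzification step, a crisp sensor reading can be mapped to membership degrees in overlapping fuzzy sets using triangular membership functions. The set names and breakpoints below are made-up assumptions, not part of any standard:

```python
# Fuzzification sketch: map a crisp temperature reading to membership
# degrees in overlapping fuzzy sets via triangular membership functions.

def triangular(x, a, b, c):
    """Triangular membership function: 0 outside [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge

def fuzzify(temp_c):
    """Return membership degrees of a crisp temperature in each fuzzy set."""
    return {
        "cold": triangular(temp_c, -10, 0, 20),
        "warm": triangular(temp_c, 10, 25, 40),
        "hot":  triangular(temp_c, 30, 45, 60),
    }

print(fuzzify(15))  # 15 °C is partially cold and partially warm
```

A reading of 15 degrees is thus a matter of degree in both "cold" and "warm", which is exactly the partial-truth idea described above.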
Set
A set is a collection of unordered or ordered distinct elements, for example, the set of vowels
{a, e, i, o, u} or the set of even numbers.
Types of Set:
There are following various categories of set:
1. Finite
2. Empty
3. Infinite
4. Proper
5. Universal
6. Subset
Following are some limitations of fuzzy logic systems:
1. The run time of fuzzy logic systems is slow, and they take a long time to produce outputs.
2. Users can understand them easily only if they are simple.
3. The possibilities produced by a fuzzy logic system are not always accurate.
4. Many researchers give various ways of solving a given statement using this technique, which leads
to ambiguity.
5. Fuzzy logic is not suitable for problems that require high accuracy.
6. Fuzzy logic systems need a lot of testing for verification and validation.
Classical Set
It is a type of set which collects distinct objects into a group. Sets with crisp boundaries are
classical sets. In any set, each single entity is called an element or member of that set.
Any set can be easily denoted in the following two different ways:
1. Roster Form: This is also called tabular form. In this form, the set is represented by listing its
elements, enclosed within braces and separated by commas.
Following are two examples which describe a set in roster (tabular) form:
Example 1:
Example 2:
Set of Prime Numbers less than 50: X={2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47}.
2. Set Builder Form: In this form, the set is defined by stating a property p(x) that its members satisfy:
A = {x : p(x)}
For example, the set {2, 4, 6, 8, 10, 12, 14, 16, 18} is written as:
B = {x : 2 ≤ x < 20 and (x % 2) = 0}
Operations on Classical Set
Following are the various operations which are performed on the classical sets:
1. Union Operation
2. Intersection Operation
3. Difference Operation
4. Complement Operation
1. Union:
This operation is denoted by (A U B). A U B is the set of those elements which exist in set A or
in set B (or in both). This operation combines all the elements from both sets to make a new
set. It is also called a logical OR operation.
A ∪ B = { x | x ∈ A OR x ∈ B }.
Example:
Set A = {10, 11, 12, 13}, Set B = {11, 12, 13, 14, 15}, then A ∪ B = {10, 11, 12, 13, 14, 15}
2. Intersection
This operation is denoted by (A ∩ B). A ∩ B is the set of those elements which are common to
both set A and set B. It is also called a logical AND operation.
A ∩ B = { x | x ∈ A AND x ∈ B }.
Example:
Set A = {10, 11, 12, 13}, Set B = {11, 12, 14} then A ∩ B = {11, 12}
3. Difference: This operation is denoted by (A - B). A - B is the set of only those elements which exist
in set A but not in set B.
A - B = { x | x ∈ A AND x ∉ B }.
4. Complement: The complement of set A, denoted A′, is the set of all elements of the universal set
which are not in A.
A′ = { x | x ∉ A }.
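These classical set operations map directly onto Python's built-in set type. The sets A and B are taken from the union example above; the universal set U is an assumption needed for the complement:

```python
# Classical set operations with Python's built-in set type.
A = {10, 11, 12, 13}
B = {11, 12, 13, 14, 15}
U = set(range(10, 20))  # assumed universal set for the complement

print(sorted(A | B))  # union:        [10, 11, 12, 13, 14, 15]
print(sorted(A & B))  # intersection: [11, 12, 13]
print(sorted(A - B))  # difference:   [10]
print(sorted(U - A))  # complement:   [14, 15, 16, 17, 18, 19]
```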
Fuzzy Set
Classical set theory is a subset of fuzzy set theory. Fuzzy logic is based on fuzzy set theory,
which is a generalisation of the classical theory of sets (i.e., crisp sets) introduced by Zadeh in 1965.
A fuzzy set is a collection of elements whose degrees of membership lie between 0 and 1. Fuzzy sets are
denoted or represented by the tilde (~) character. Fuzzy set theory was introduced in 1965 by Lotfi
A. Zadeh and Dieter Klaua. In a fuzzy set, partial membership also exists. The theory was
released as an extension of classical set theory.
1. Fuzzy set is a set having degrees of membership between 1 and 0. Fuzzy sets are
represented with tilde character(~). For example, Number of cars following traffic signals at a
particular time out of all cars present will have membership value between [0,1].
2. Partial membership exists when member of one fuzzy set can also be a part of other fuzzy
sets in the same universe.
3. The degree of membership or truth is not the same as probability; fuzzy truth represents
membership in vaguely defined sets.
Mathematically, a fuzzy set (Ã) is a pair (U, M), where U is the Universe of discourse and M is the
membership function, which takes on values in the interval [0, 1]. The universe of discourse (U) is also
denoted by Ω or X.
Given à and B are the two fuzzy sets, and X be the universe of discourse with the following respective
member functions:
Example:
then,
For X1
μA∪B(X1) = 0.6
For X2
μA∪B(X2) = 0.8
μA∪B(X3) = 1
For X4
μA∪B(X4) = 0.9
2. Intersection Operation:
The intersection is defined by μA∩B(x) = min(μA(x), μB(x)).
Example:
Ã = {(X1, 0.3), (X2, 0.8), (X3, 0.5), (X4, 0.1)}
B̃ = {(X1, 0.8), (X2, 0.2), (X3, 0.4), (X4, 0.9)}
then,
For X1: μA∩B(X1) = min(0.3, 0.8) = 0.3
For X2: μA∩B(X2) = min(0.8, 0.2) = 0.2
For X3: μA∩B(X3) = min(0.5, 0.4) = 0.4
For X4: μA∩B(X4) = min(0.1, 0.9) = 0.1
3. Complement Operation:
The complement of a fuzzy set Ã is defined by μĀ(x) = 1 - μA(x).
Example:
Ã = {(X1, 0.3), (X2, 0.8), (X3, 0.5), (X4, 0.1)}
then,
For X1
μĀ(X1) = 1-μA(X1)
μĀ(X1) = 1 - 0.3
μĀ(X1) = 0.7
For X2
μĀ(X2) = 1-μA(X2)
μĀ(X2) = 1 - 0.8
μĀ(X2) = 0.2
For X3
μĀ(X3) = 1-μA(X3)
μĀ(X3) = 1 - 0.5
μĀ(X3) = 0.5
For X4
μĀ(X4) = 1-μA(X4)
μĀ(X4) = 1 - 0.1
μĀ(X4) = 0.9
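The three fuzzy operations above can be sketched in code. The values of Ã are taken from the complement example (0.3, 0.8, 0.5, 0.1); the values of B̃ are an assumption chosen so that the union matches the worked union example (the text's intersection example appears to use a different pair of sets, so only the union and complement results are checked against the text):

```python
# Standard fuzzy-set operations over a common universe of discourse:
#   union:        μ_{A∪B}(x) = max(μ_A(x), μ_B(x))
#   intersection: μ_{A∩B}(x) = min(μ_A(x), μ_B(x))
#   complement:   μ_{Ā}(x)   = 1 - μ_A(x)
# A's values come from the complement example; B's values are assumptions.
A = {"X1": 0.3, "X2": 0.8, "X3": 0.5, "X4": 0.1}
B = {"X1": 0.6, "X2": 0.2, "X3": 1.0, "X4": 0.9}

def fuzzy_union(a, b):
    return {x: max(a[x], b[x]) for x in a}

def fuzzy_intersection(a, b):
    return {x: min(a[x], b[x]) for x in a}

def fuzzy_complement(a):
    # round() guards against floating-point residue like 0.19999999...
    return {x: round(1 - a[x], 10) for x in a}

print(fuzzy_union(A, B))        # matches the union example: 0.6, 0.8, 1.0, 0.9
print(fuzzy_complement(A))      # matches the complement example: 0.7, 0.2, 0.5, 0.9
```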
Classical (Crisp) Set Theory                        Fuzzy Set Theory
1. This theory is a class of sets having            1. This theory is a class of sets having
   sharp boundaries.                                   un-sharp boundaries.
2. This set theory is defined by exact              2. This set theory is defined by ambiguous
   boundaries, only 0 and 1.                           boundaries.
4. This theory is widely used in the                4. It is mainly used for fuzzy controllers.
   design of digital systems.
Advantages of Fuzzy Logic
This system can work with any type of input, whether it is imprecise, distorted, or noisy.
The construction of Fuzzy Logic Systems is easy and understandable.
Fuzzy logic comes with mathematical concepts of set theory and the reasoning of that is quite
simple.
It provides a very efficient solution to complex problems in all fields of life as it resembles human
reasoning and decision-making.
The algorithms can be described with little data, so little memory is required.
Disadvantages of Fuzzy Logic
Many researchers have proposed different ways to solve a given problem through fuzzy logic, which leads to ambiguity; there is no systematic approach to solving a given problem through fuzzy logic.
Proof of its characteristics is difficult or impossible in most cases because every time we do not
get a mathematical description of our approach.
As fuzzy logic works on precise as well as imprecise data, accuracy is often compromised.
Applications of Fuzzy Logic
It is used in the aerospace field for altitude control of spacecraft and satellites.
It has been used in the automotive system for speed control, traffic control.
It is used for decision-making support systems and personal evaluation in the large company
business.
It has application in the chemical industry for controlling the pH, drying, chemical distillation
process.
Fuzzy logic is used in Natural language processing and various intensive applications in Artificial
Intelligence.
Fuzzy logic is extensively used in modern control systems such as expert systems.
Fuzzy Logic is used with Neural Networks as it mimics how a person makes decisions, only much faster. This is done by aggregating data and turning it into more meaningful information in the form of partial truths, i.e., fuzzy sets. Fuzzy logic is applied in areas such as:
Automotive
Business
Defense
Finance
Industrial Sector
Manufacturing
Marine
Medical
Securities
Transportation
In Pattern Recognition and Classification, fuzzy logic is used in the following areas −
Psychology
Planning
Artificial intelligence is an important technology of the future. Whether it is intelligent robots, self-driving cars, or smart cities, they will all use different aspects of artificial intelligence, and planning is essential to any such AI project.
Planning is an important part of Artificial Intelligence that deals with the tasks and domains of a particular problem. Everything we humans do has a definite goal in mind, and all our actions are oriented towards achieving that goal; similarly, planning is done for artificial intelligence systems. That is why planning is considered the logical side of acting. In other words, planning is about deciding the tasks to be performed by the artificial intelligence system and how the system should function under domain-independent conditions.
What is a Plan?
We require a domain description, a task specification, and a goal description for any planning system. A plan is a sequence of actions; each action has preconditions that must be satisfied before it can be executed, and effects that can be positive or negative.
So, we have Forward State Space Planning (FSSP) and Backward State Space Planning (BSSP) at the
basic level.
Execution of the plan is about choosing a sequence of tasks with a high probability of accomplishing a
specific task.
The start position and target position are shown in the following diagram.
o Choose the best rule to apply next, based on the best available guess.
o Apply the chosen rule to calculate the new problem condition.
o Find out when a solution has been found.
o Detect dead ends so they can be discarded and direct system effort in more useful directions.
o Find out when a near-perfect solution is found.
1. Start by pushing the original goal onto the stack. Repeat the following until the stack is empty: if the stack top is a compound goal, push its unsatisfied sub-goals onto the stack.
2. If the stack top is a single unsatisfied goal, replace it with an action and push the action's preconditions onto the stack.
Non-linear Planning
This planning method uses a goal stack whose search space includes all possible sub-goal orderings. It handles goal interactions by the interleaving method, and it takes a larger search space, since all possible goal orderings are considered.
Algorithm
We start at the goal state and we try fulfilling the preconditions required to achieve the initial state. These
preconditions in turn have their own set of preconditions, which are required to be satisfied first. We keep
solving these “goals” and “sub-goals” until we finally arrive at the Initial State. We make use of a stack
to hold these goals that need to be fulfilled, as well as the actions that we need to perform for them.
Apart from the “Initial State” and the “Goal State”, we maintain a “World State” configuration as well.
Goal Stack uses this world state to work its way from Goal State to Initial State. World State on the other
hand starts off as the Initial State and ends up being transformed into the Goal state.
At the end of this algorithm we are left with an empty stack and a sequence of actions which helps us navigate from the Initial State to the Goal State.
Step-1: Push the original goal on the stack. Repeat until the stack is empty.
Step-2: If the stack top is a compound goal, push its unsatisfied subgoals on the stack.
Step-3: If the stack top is a single unsatisfied goal, replace it with an action that satisfies it and push the action's preconditions on the stack.
Step-4: If stack top is an action, pop it from the stack, execute it and change the knowledge base by the
action’s effects.
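Steps 1 to 4 above can be sketched in code. This is an illustrative simplification, not a full goal-stack planner: states and goals are sets of predicate strings, operators are (name, preconditions, add-effects) triples, and delete-effects are omitted for brevity. The example operators and world state are assumptions:

```python
# Minimal goal-stack planner sketch following Steps 1-4 above.
# Simplification: operators carry only add-effects (no delete lists).
def goal_stack_plan(goal, world, operators):
    stack = [("GOAL", goal)]
    plan = []
    while stack:
        kind, item = stack.pop()
        if kind == "GOAL":
            unsatisfied = item - world
            if len(unsatisfied) > 1:            # Step-2: compound goal
                stack.append(("GOAL", item))    # re-check it after subgoals
                for sub in unsatisfied:
                    stack.append(("GOAL", {sub}))
            elif unsatisfied:                   # Step-3: single unsatisfied goal
                (pred,) = unsatisfied
                op = next(o for o in operators if pred in o[2])
                stack.append(("ACTION", op))    # the action that achieves it
                stack.append(("GOAL", op[1]))   # its preconditions go first
        else:                                   # Step-4: execute, apply effects
            name, _pre, add = item
            world |= add
            plan.append(name)
    return plan

# Illustrative blocks-world fragment (assumed operators and state).
world = {"ONTABLE(A)", "ONTABLE(B)", "CLEAR(A)", "CLEAR(B)", "ARMEMPTY"}
operators = [
    ("PICKUP(A)",  {"ONTABLE(A)", "CLEAR(A)", "ARMEMPTY"}, {"HOLDING(A)"}),
    ("STACK(A,B)", {"HOLDING(A)", "CLEAR(B)"},             {"ON(A,B)", "ARMEMPTY"}),
]
print(goal_stack_plan({"ON(A,B)"}, world, operators))
# ['PICKUP(A)', 'STACK(A,B)']
```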
Predicates can be thought of as a statement which helps us convey the information about a configuration
in Blocks World.
Given below are the list of predicates as well as their intended meaning
1. ON(A,B) : Block A is on B
2. ONTABLE(A) : A is on table
3. CLEAR(A) : Nothing is on top of A
4. HOLDING(A) : Arm is holding A.
5. ARMEMPTY : Arm is holding nothing
Using these predicates, we represent the Initial State and the Goal State in our example like this:
Initial State
Thus a configuration can be thought of as a list of predicates describing the current scenario.
All the four operations have certain preconditions which need to be satisfied to perform the same. These
preconditions are represented in the form of predicates.
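The predicate representation above can be sketched directly: a state is a set of predicate strings, and an action is applicable only when all of its precondition predicates hold. The particular block configuration and operator instances below are illustrative assumptions:

```python
# A blocks-world state as a set of predicates (assumed configuration:
# A on B, B and C on the table, arm empty).
initial_state = {"ON(A,B)", "ONTABLE(B)", "ONTABLE(C)",
                 "CLEAR(A)", "CLEAR(C)", "ARMEMPTY"}

# Preconditions of the classic operators, expressed as predicate sets.
preconditions = {
    "UNSTACK(A,B)": {"ON(A,B)", "CLEAR(A)", "ARMEMPTY"},
    "PICKUP(C)":    {"ONTABLE(C)", "CLEAR(C)", "ARMEMPTY"},
    "STACK(A,B)":   {"HOLDING(A)", "CLEAR(B)"},
    "PUTDOWN(A)":   {"HOLDING(A)"},
}

def applicable(action, state):
    """An action can be performed only if all its preconditions hold."""
    return preconditions[action] <= state   # subset test

print(applicable("UNSTACK(A,B)", initial_state))  # True
print(applicable("STACK(A,B)", initial_state))    # False — arm is not holding A
```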
Hierarchical Planning:
Hierarchical Planning is a method used in Artificial Intelligence (AI) that breaks down complex problems
into smaller, more manageable sub-problems. This approach allows AI systems to efficiently plan and
execute actions to achieve their goals.
At its core, Hierarchical Planning involves two main levels: the high-level planner and the low-level
planner. The high-level planner is responsible for setting the overall goal and breaking it down into smaller
sub-goals. It then passes these sub-goals to the low-level planner, which generates a sequence of actions
to achieve each sub-goal.
This process continues until all sub-goals have been accomplished, and the overall goal has been achieved.
The high-level planner may also monitor the progress of the low-level planner and make adjustments as
necessary.
For complicated projects with several subtasks, hierarchical planning is especially helpful since it enables
the AI system to divide the issue into more manageable chunks. It is also helpful in scenarios when the
activities required to accomplish a goal may alter depending on the setting or circumstances.
Overall, Hierarchical Planning is an essential technique in the field of AI, as it allows AI systems to
efficiently plan and execute actions to achieve their goals.
For example, in autonomous robots, hierarchical planning can be used to plan high-level goals, such as
navigating to a specific location, and decompose them into lower-level tasks, such as obstacle avoidance,
path planning, and motion control.
Hierarchical planning in artificial intelligence (AI) typically involves several key components, including:
High-level goals: The overall objectives or tasks that the AI system aims to achieve.
Task decomposition: Breaking down high-level goals into lower-level tasks or subgoals.
Planning hierarchy: The organization of tasks or subgoals into a hierarchical structure, such as
a tree or a directed acyclic graph (DAG).
Plan generation at different levels: Reasoning and planning at different levels of the hierarchy,
with plans generated for achieving subgoals or actions.
Plan synthesis: Combining the plans for achieving subgoals or actions into a cohesive plan for
execution.
Plan execution: Carrying out the actions or subgoals in the plan in the correct order.
Plan adaptation: Revising plans at different levels of abstraction to accommodate changes in the
environment or goals.
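The task-decomposition idea above can be sketched as a toy hierarchical planner: compound tasks are expanded by methods into subtasks until only primitive actions remain. The task names and decompositions below are illustrative assumptions, loosely following the delivery-robot example:

```python
# Toy hierarchical (HTN-style) decomposition. Each method maps a compound
# task to an ordered list of subtasks; anything without a method is primitive.
methods = {
    "deliver_package":     ["navigate_to_pickup", "grasp_package",
                            "navigate_to_dropoff", "release_package"],
    "navigate_to_pickup":  ["plan_path", "follow_path"],
    "navigate_to_dropoff": ["plan_path", "follow_path"],
}

def decompose(task):
    """Recursively expand a task into a flat sequence of primitive actions."""
    if task not in methods:          # primitive: no further decomposition
        return [task]
    plan = []
    for sub in methods[task]:
        plan.extend(decompose(sub))
    return plan

print(decompose("deliver_package"))
# ['plan_path', 'follow_path', 'grasp_package', 'plan_path', 'follow_path', 'release_package']
```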
Scalability: Hierarchical planning allows for reasoning and planning at different levels of
abstraction, enabling efficient handling of complex tasks and environments.
Flexibility: Hierarchical planning provides the flexibility to adapt plans to changes in the
environment or goals, making them more robust and adaptable.
Abstraction and reuse: The use of a hierarchy of tasks or subgoals allows for the abstraction and
reuse of plans, making planning more efficient and reducing the need for redundant planning.
Hierarchical planning in artificial intelligence (AI) involves the use of various techniques to effectively
decompose tasks, abstract them at different levels, allocate tasks to appropriate agents or resources, and
integrate execution plans. Here's a brief overview of these techniques:
Robotics: Hierarchical planning is commonly used in robotics to plan and execute complex tasks involving multiple levels of abstraction, from high-level goals such as navigating to a specific location down to lower-level tasks such as obstacle avoidance, path planning, and motion control. It is also used in robot task planning.
What is NLP?
NLP stands for Natural Language Processing, which is a part of Computer Science, Human language, and
Artificial Intelligence. It is the technology that is used by machines to understand, analyse, manipulate,
and interpret human languages. It helps developers to organize knowledge for performing tasks such as
translation, automatic summarization, Named Entity Recognition (NER), speech recognition, relationship
extraction, and topic segmentation.
Processing of natural language is required when you want an intelligent system, like a robot, to perform as per your instructions, or when you want to hear a decision from a dialogue-based clinical expert system, etc.
The field of NLP involves making computers perform useful tasks with the natural languages humans use. The input and output of an NLP system can be
Speech
Written Text
Natural Language Understanding (NLU) helps the machine to understand and analyse human language
by extracting the metadata from content such as concepts, entities, keywords, emotion, relations, and
semantic roles.
NLU is mainly used in business applications to understand the customer's problem in both spoken and written language.
First, the computer must comprehend the meaning of each word. It tries to figure out whether the word is
a noun or a verb, whether it’s in the past or present tense, and so on. This is called Part-of-Speech tagging
(POS).
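Part-of-Speech tagging can be illustrated with a toy lexicon-based tagger. Real NLP systems use statistical or neural taggers over far richer lexicons; the tiny lexicon and the default-to-NOUN fallback below are illustrative assumptions:

```python
# Toy POS tagger: look each word up in a small hand-built lexicon.
# The lexicon entries are illustrative assumptions, not a real resource.
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN", "park": "NOUN",
    "runs": "VERB", "sees": "VERB",
    "in": "PREP",
}

def pos_tag(sentence):
    """Tag each word; unknown words default to NOUN (a common fallback)."""
    return [(w, LEXICON.get(w.lower(), "NOUN")) for w in sentence.split()]

print(pos_tag("The dog runs in the park"))
# [('The', 'DET'), ('dog', 'NOUN'), ('runs', 'VERB'), ('in', 'PREP'), ('the', 'DET'), ('park', 'NOUN')]
```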
A lexicon (a vocabulary) and a set of grammatical rules are also built into NLP systems. The most difficult
part of NLP is understanding.
The machine should be able to grasp what you said by the conclusion of the process. There are several
challenges in accomplishing this when considering problems such as words having several meanings
(polysemy) or different words having similar meanings (synonymy), but developers encode rules into their
NLU systems and train them to learn to apply the rules correctly.
Natural Language Generation (NLG) acts as a translator that converts the computerized data into natural
language representation. It mainly involves Text planning, Sentence planning, and Text Realization.
NLG is much simpler to accomplish. NLG converts a computer’s machine-readable language into text and can also
convert that text into audible speech using text-to-speech technology.
First, the NLP system identifies what data should be converted to text. If you asked the computer a question
about the weather, it most likely did an online search to find your answer, and from there it decides that the
temperature, wind, and humidity are the factors that should be read aloud to you.
Then, it organizes the structure of how it's going to say it. This is similar to NLU, except in reverse. An NLG system can construct full sentences using a lexicon and a set of grammar rules.
Finally, text-to-speech takes over. The text-to-speech engine uses a prosody model to evaluate the text and
identify breaks, duration, and pitch. The engine then combines all the recorded phonemes into one cohesive
string of speech using a speech database.
NLU                                                NLG
NLU is the process of reading and                  NLG is the process of writing or generating
interpreting language.                             language.
Advantages of NLP
o NLP helps users to ask questions about any subject and get a direct response within seconds.
o NLP offers exact answers to the question means it does not offer unnecessary and unwanted information.
o NLP helps computers to communicate with humans in their languages.
o It is very time efficient.
o Most of the companies use NLP to improve the efficiency of documentation processes, accuracy of
documentation, and identify the information from large databases.
Disadvantages of NLP
ML engineer: Designing and deploying various machine learning models, including NLP.
Phases of NLP
1. Lexical Analysis
The first phase of NLP is Lexical Analysis. This phase scans the text as a stream of characters and converts it into meaningful lexemes. It divides the whole text into paragraphs, sentences, and words, and involves identifying and analysing the structure of words. The lexicon of a language is the collection of words and phrases in that language.
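The lexical-analysis step of dividing text into sentences and words can be sketched with the standard library's re module. This is a simplification of real tokenizers, which handle abbreviations, contractions, and punctuation far more carefully:

```python
# Sketch of lexical analysis: split raw text into sentences, then into words.
import re

def lexical_analysis(text):
    # Split on sentence-ending punctuation, then extract word tokens.
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    words = [re.findall(r"[A-Za-z']+", s) for s in sentences]
    return sentences, words

text = "NLP is fun. It has many phases!"
sentences, words = lexical_analysis(text)
print(sentences)  # ['NLP is fun', 'It has many phases']
print(words)      # [['NLP', 'is', 'fun'], ['It', 'has', 'many', 'phases']]
```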
2. Syntactic Analysis
Syntactic Analysis is used to check grammar and word arrangement, and shows the relationships among the words. It involves analysing the words in a sentence for grammar and arranging them in a manner that shows those relationships. A sentence such as “The school goes to boy” is rejected by an English syntactic analyser.
A sentence like “Agra goes to the Poonam” does not make any sense in the real world, so it is rejected by the syntactic analyser.
3. Semantic Analysis
Semantic analysis is concerned with the meaning representation. It mainly focuses on the literal meaning
of words, phrases, and sentences. It draws the exact meaning or the dictionary meaning from the text. The
text is checked for meaningfulness. It is done by mapping syntactic structures and objects in the task
domain. The semantic analyzer disregards sentence such as “hot ice-cream”.
4. Discourse Integration
Discourse Integration depends upon the sentences that precede a given sentence and also invokes the meaning of the sentences that follow it. The meaning of any sentence depends upon the meaning of the sentence just before it, and it also influences the meaning of the immediately succeeding sentence.
5. Pragmatic Analysis
Pragmatic Analysis is the fifth and last phase of NLP. It helps you discover the intended effect by applying a set of rules that characterize cooperative dialogues. During this phase, what was said is re-interpreted as what was actually meant. It involves deriving those aspects of language which require real-world knowledge.
Applications of NLP
There are the following applications of NLP -
1. Question Answering
Question Answering focuses on building systems that automatically answer the questions asked by
humans in a natural language.
2. Spam Detection
Spam detection is used to detect unwanted e-mails reaching a user's inbox.
3. Sentiment Analysis
Sentiment Analysis is also known as opinion mining. It is used on the web to analyse the attitude, behaviour, and emotional state of the sender. This application is implemented through a combination of NLP (Natural Language Processing) and statistics by assigning values to the text (positive, negative, or neutral) and identifying the mood of the context (happy, sad, angry, etc.).
4. Machine Translation
Machine translation is used to translate text or speech from one natural language to another natural
language.
5. Spelling Correction
Microsoft Corporation provides word-processor software such as MS Word and PowerPoint with spelling correction.
6. Speech Recognition
Speech recognition is used for converting spoken words into text. It is used in applications, such as mobile,
home automation, video recovery, dictating to Microsoft Word, voice biometrics, voice user interface,
and so on.
7. Chatbot
Implementing chatbots is one of the important applications of NLP. Chatbots are used by many companies to provide customer chat services.
8. Information extraction
Information extraction is one of the most important applications of NLP. It is used for extracting structured
information from unstructured or semi-structured machine-readable documents.
It converts a large set of text into more formal representations, such as first-order logic structures, that are easier for computer programs to manipulate.
Form of Learning
1. Supervised Learning:
In supervised learning, the algorithm is trained on a labeled dataset, which means that it learns
from input-output pairs. It tries to map inputs to correct outputs.
Common applications include classification and regression tasks, such as image recognition and
predicting house prices.
2. Unsupervised Learning:
Unsupervised learning involves training an algorithm on unlabeled data, and the goal is to find
patterns or structures within the data.
Clustering and dimensionality reduction are typical unsupervised learning tasks, as seen in
customer segmentation and feature extraction.
3. Semi-Supervised Learning:
Semi-supervised learning combines elements of both supervised and unsupervised learning. It uses
a small amount of labeled data and a large amount of unlabeled data for training.
It's useful when labeling data is expensive or time-consuming.
4. Reinforcement Learning:
Reinforcement learning involves an agent that learns to interact with an environment to maximize
a reward. It learns by trial and error, receiving feedback in the form of rewards or penalties.
Applications include game playing, robotics, and autonomous systems.
5. Self-Supervised Learning:
Self-supervised learning is a type of unsupervised learning where the algorithm creates its own
labels or tasks from the data.
It's often used for pre-training deep neural networks, where the model learns to predict parts of the
input data from other parts.
6. Transfer Learning:
Transfer learning leverages knowledge gained from one task to improve performance on another
related task.
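The supervised-learning idea of mapping labelled inputs to outputs can be sketched with a minimal 1-nearest-neighbour classifier. The tiny dataset, feature values, and labels below are illustrative assumptions:

```python
# Minimal supervised learning: a 1-nearest-neighbour classifier that
# "learns" from labelled (features, label) pairs and maps new inputs
# to the label of the closest training point.
def nearest_neighbour(train, x):
    """train: list of ((f1, f2), label); returns label of the closest point."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(train, key=lambda pair: sq_dist(pair[0], x))[1]

# Illustrative labelled dataset (two clusters).
train = [((1.0, 1.0), "small"), ((1.2, 0.8), "small"),
         ((5.0, 5.0), "large"), ((5.5, 4.5), "large")]

print(nearest_neighbour(train, (1.1, 0.9)))  # 'small'
print(nearest_neighbour(train, (5.2, 5.1)))  # 'large'
```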
Inductive Learning
Inductive Learning Algorithm (ILA) is an iterative and inductive machine learning algorithm that is used
for generating a set of classification rules, which produces rules of the form “IF-THEN”, for a set of
examples, producing rules at each iteration and appending to the set of rules.
There are basically two methods for knowledge extraction: from domain experts, and through machine learning. For very large amounts of data, domain experts are not very useful or reliable, so we move towards the machine learning approach. One method is to replicate the expert's logic in the form of algorithms, but this is tedious, time-consuming, and expensive. So we move towards inductive algorithms, which generate the strategy for performing a task without needing separate instruction at each step.
Inductive learning is a machine learning technique in which a model is trained to make predictions based on examples or observations. During inductive learning, the model picks up knowledge from particular examples or instances and generalizes it so that it can predict outcomes for brand-new data.
When using inductive learning, a rule or method is not explicitly programmed into the model. Instead, the
model is trained to spot trends and connections in the input data and then utilize this knowledge to predict
outcomes from fresh data. Making a model that can precisely anticipate the result of subsequent instances
is the aim of inductive learning.
In supervised learning situations, where the model is trained using labeled data, inductive learning is
frequently utilized. A series of samples with the proper output labels are used to train the model. The model
then creates a mapping between the input data and the output data using this training data. The output for
fresh instances may be predicted using the model after it has been trained.
Inductive learning is used by a number of well-known machine learning algorithms, such as decision trees,
k-nearest neighbors, and neural networks. Because it enables the development of models that can accurately
anticipate new data, even when the underlying patterns and relationships are complicated and poorly
understood, inductive learning is an essential method for machine learning.
Advantages
Disadvantages
May overfit to particular data − Inductive learning models that have overfit to specific training data, or that
have learned the noise in the data rather than the underlying patterns, may perform badly on fresh data.
May be computationally costly − The employment of inductive learning models in real-time applications may be constrained by their computationally costly nature, especially for complex datasets.
Limited interpretability − Inductive learning models may be difficult to understand, making it difficult to
understand how they arrive at their predictions, in applications where the decision-making process must be
transparent and explicable.
Inductive learning models are only as good as the data they are trained on, therefore if the data is inaccurate
or inadequate, the model may not perform effectively.
o Decision Trees usually mimic human thinking ability while making a decision, so it is easy to
understand.
o The logic behind the decision tree can be easily understood because it shows a tree-like structure.
For the next node, the algorithm again compares the attribute value with the other sub-nodes and moves further. It continues this process until it reaches a leaf node of the tree. The complete process can be better understood using the algorithm below:
o Step-1: Begin the tree with the root node, says S, which contains the complete dataset.
o Step-2: Find the best attribute in the dataset using Attribute Selection Measure (ASM).
o Step-3: Divide S into subsets that contain possible values for the best attributes.
o Step-4: Generate the decision tree node, which contains the best attribute.
o Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where you cannot further classify the nodes; the final node is called a leaf node.
Example: Suppose there is a candidate who has a job offer and wants to decide whether he should accept
the offer or Not. So, to solve this problem, the decision tree starts with the root node (Salary attribute by
ASM). The root node splits further into the next decision node (distance from the office) and one leaf
node based on the corresponding labels. The next decision node further gets split into one decision node
(Cab facility) and one leaf node. Finally, the decision node splits into two leaf nodes (Accepted offers and
Declined offer). Consider the below diagram:
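The job-offer tree from the example can be written as nested conditions. The decision nodes (salary, distance from the office, cab facility) follow the example; the numeric thresholds are illustrative assumptions, since the text does not give them:

```python
# The job-offer decision tree as nested if/else branches.
# Thresholds (50000, 10 km) are assumed for illustration.
def decide(salary, distance_km, cab_facility):
    if salary < 50000:               # root node: Salary attribute
        return "Declined offer"
    if distance_km <= 10:            # decision node: distance from office
        return "Accepted offer"
    if cab_facility:                 # decision node: cab facility
        return "Accepted offer"
    return "Declined offer"         # leaf node

print(decide(60000, 5, False))   # 'Accepted offer'
print(decide(60000, 25, False))  # 'Declined offer'
print(decide(40000, 5, True))    # 'Declined offer'
```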
Explanation-based learning can learn from just one training event. Rather than taking multiple examples, explanation-based learning emphasises learning from a single, specific example, such as the game of Ludo.
Explanation-based learning (EBL) is a form of learning in artificial intelligence (AI) that involves
understanding the underlying domain or problem space and generating explanations for specific instances.
Unlike other machine learning approaches that rely on numerous examples, EBL focuses on a deep
understanding of individual examples, generating explanations based on underlying principles, and
generalizing those explanations to cover other related instances.
Here’s a more detailed overview of EBL:
1. Analyzing the Problem: EBL starts by analyzing a particular instance of a problem, applying domain
knowledge to understand why a specific solution works.
2. Generating an Explanation: Based on the analysis, EBL generates an explanation of why the solution is
correct. This explanation leverages domain knowledge, logical reasoning, and underlying principles to
articulate why the solution makes sense.
3. Generalizing the Explanation: The explanation is then generalized to create a rule or a pattern that can
be applied to other similar instances. This generalization allows the system to learn from a single example
and apply the learning to many related cases.
4. Storing the Generalized Rule: The generalized rule is stored and used to solve similar problems in the
future. By having this rule, the system can quickly solve related problems without having to analyze each
one individually.
5. Integration with Other Learning Techniques: EBL can be integrated with other learning techniques,
such as inductive learning and analogical reasoning, to form more comprehensive and robust learning
models.
The advantage of EBL is that it can provide a deep understanding of specific problems, allowing for intelligent
generalization to related instances. However, it requires significant domain knowledge and logical reasoning
capabilities, making it a more complex approach than many other machine learning techniques.
Neural Network
The term "Artificial Neural Network" is derived from Biological neural networks that develop the
structure of a human brain. Similar to the human brain that has neurons interconnected to one another,
artificial neural networks also have neurons that are interconnected to one another in various layers of the
networks. These neurons are known as nodes.
To understand the architecture of an artificial neural network, we have to understand what a neural network consists of: a large number of artificial neurons, termed units, arranged in a sequence of layers. Let us look at the various types of layers available in an artificial neural network.
Input Layer:
As the name suggests, it accepts inputs in several different formats provided by the programmer.
Hidden Layer:
The hidden layer sits between the input and output layers. It performs all the calculations needed to find hidden features and patterns.
Output Layer:
The input goes through a series of transformations via the hidden layers, which finally results in the output.
The artificial neural network takes input and computes the weighted sum of the inputs and includes a bias.
This computation is represented in the form of a transfer function.
The weighted total is then passed as an input to an activation function to produce the output. Activation functions determine whether a node should fire or not; only those that fire make it to the output layer. There are distinctive activation functions available that can be applied depending on the sort of task we are performing.
Afterward, each input is multiplied by its corresponding weight (these weights are the details used by the artificial neural network to solve a specific problem). In general terms, these weights represent the strength of the interconnections between neurons inside the artificial neural network. All the weighted inputs are summed inside the computing unit.
If the weighted sum is zero, a bias is added to make the output non-zero, or to otherwise scale up the system's response. The bias has a fixed input of 1 with its own associated weight.
The activation function refers to the set of transfer functions used to achieve the desired output. There are different kinds of activation functions, primarily linear and non-linear sets of functions. Some of the commonly used activation functions are the binary, linear, and tan hyperbolic sigmoidal activation functions.
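The computation described above, a weighted sum of inputs plus a bias passed through an activation function, can be sketched for a single neuron. The sigmoid is used here as one common non-linear activation; the particular weights, bias, and inputs are illustrative assumptions:

```python
# A single artificial neuron: weighted sum of inputs plus bias (the
# transfer function), passed through a sigmoid activation.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    z = sum(x * w for x, w in zip(inputs, weights)) + bias  # transfer function
    return sigmoid(z)                                       # activation

# Illustrative inputs and parameters: z = 1.0*0.4 + 0.5*(-0.2) + 0.1 = 0.4
out = neuron([1.0, 0.5], [0.4, -0.2], bias=0.1)
print(round(out, 4))  # 0.5987
```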
There are various types of Artificial Neural Networks (ANN), modelled on the neurons and functions of the human brain, which perform tasks in a similar way. The majority of artificial neural networks have some similarities with their more complex biological counterpart and are very effective at their expected tasks, for example segmentation or classification.
Feedback ANN:
In this type of ANN, the output is fed back into the network to internally arrive at the best-evolved results. According to the University of Massachusetts Lowell Centre for Atmospheric Research, feedback networks feed information back into themselves and are well suited to solving optimization problems. Internal system error corrections utilize feedback ANNs.
Feed-Forward ANN:
A feed-forward network is a basic neural network comprising an input layer, an output layer, and at least one layer of neurons. By assessing its output against its input, the intensity of the network can be observed based on the group behaviour of the associated neurons, and the output is decided. The primary advantage of this network is that it learns to evaluate and recognize input patterns.
Genetic learning
Genetic Algorithms (GAs) are adaptive heuristic search algorithms that belong to the larger class of evolutionary algorithms. Genetic algorithms are based on the ideas of natural selection and genetics.
These are intelligent exploitation of random search provided with historical data to direct the search into
the region of better performance in solution space. They are commonly used to generate high-quality
solutions for optimization problems and search problems.
Genetic algorithms simulate the process of natural selection, which means those species that can adapt to changes in their environment survive, reproduce, and go on to the next generation. In simple words, they simulate “survival of the fittest” among the individuals of consecutive generations to solve a problem. Each generation consists of a population of individuals, and each individual represents a point in the search space and a possible solution.
Genetic Algorithms are being widely used in different real-world applications, for example, Designing
electronic circuits, code-breaking, image processing, and artificial creativity.
o Population: Population is the subset of all possible or probable solutions, which can solve the
given problem.
o Chromosomes: A chromosome is one of the solutions in the population for the given problem; a collection of genes generates a chromosome.
o Gene: A gene is an element of the chromosome; a chromosome is divided into different genes.
o Allele: An allele is the value given to a gene within a particular chromosome.
o Fitness Function: The fitness function is used to determine the individual's fitness level in the
population. It means the ability of an individual to compete with other individuals. In every
iteration, individuals are evaluated based on their fitness function.
o Genetic Operators: In a genetic algorithm, the best individuals mate to produce offspring better than their parents. Genetic operators change the genetic composition of the next generation; the main operators are selection, crossover, and mutation.
It basically involves five phases to solve the complex optimization problems, which are given as below:
o Initialization
o Fitness Assignment
o Selection
o Reproduction
o Termination
1. Initialization
The process of a genetic algorithm starts by generating a set of individuals, called the population. Each individual is a candidate solution to the given problem. An individual is characterized by a set of parameters called genes. Genes are combined into a string to form a chromosome, which represents a solution to the problem. One of the most popular initialization techniques is the use of random binary strings.
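The random-binary-string initialization described above can be sketched in Python; the function name and parameters are illustrative:

```python
import random

def init_population(pop_size, chromosome_length, seed=None):
    # Each individual (chromosome) is a random binary string;
    # each bit is one gene.
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(chromosome_length)]
            for _ in range(pop_size)]

# 6 individuals, each a chromosome of 8 genes (bits)
population = init_population(pop_size=6, chromosome_length=8, seed=42)
```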
2. Fitness Assignment
The fitness function evaluates each individual and assigns it a fitness score, which reflects how good a solution the individual represents. The fitness score determines the probability that an individual is selected for reproduction.
3. Selection
The selection phase involves selecting individuals for the reproduction of offspring. The selected individuals are arranged in pairs, and they pass their genes on to the next generation.
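One common way to implement this phase is fitness-proportionate (roulette-wheel) selection, sketched below; tournament selection is another popular choice. The helper names and the `sum` fitness function are illustrative assumptions:

```python
import random

def roulette_selection(population, fitness_fn, rng=None):
    # Roulette-wheel selection: the probability of being picked is
    # proportional to an individual's fitness score.
    rng = rng or random.Random()
    total = sum(fitness_fn(ind) for ind in population)
    pick = rng.uniform(0, total)
    running = 0.0
    for ind in population:
        running += fitness_fn(ind)
        if running >= pick:
            return ind
    return population[-1]

# Fitness here is simply the number of 1-bits in the chromosome.
pop = [[1, 1, 1], [1, 0, 0], [0, 0, 0]]
parent = roulette_selection(pop, fitness_fn=sum)
```

Fitter chromosomes occupy a larger slice of the "wheel", so they are chosen more often, but weaker individuals still have a chance, which preserves diversity.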
4. Reproduction
After the selection process, the creation of a child occurs in the reproduction step. In this step, the genetic
algorithm uses two variation operators that are applied to the parent population. The two operators
involved in the reproduction phase are given below:
o Crossover: Crossover plays the most significant role in the reproduction phase of the genetic algorithm. In this process, a crossover point is selected at random within the genes. The crossover operator then swaps the genetic information of two parents from the current generation to produce a new individual representing the offspring.
The genes of the parents are exchanged up to the crossover point, and the newly generated offspring are added to the population. This process is also called recombination. Types of crossover available:
o One-point crossover
o Two-point crossover
o Uniform crossover
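One-point crossover can be sketched as follows; the function name is illustrative:

```python
import random

def one_point_crossover(parent_a, parent_b, point=None):
    # Swap the tails of the two parents after a (randomly chosen) point.
    if point is None:
        point = random.randint(1, len(parent_a) - 1)
    child1 = parent_a[:point] + parent_b[point:]
    child2 = parent_b[:point] + parent_a[point:]
    return child1, child2

c1, c2 = one_point_crossover([1, 1, 1, 1, 1, 1],
                             [0, 0, 0, 0, 0, 0], point=2)
# c1 == [1, 1, 0, 0, 0, 0] and c2 == [0, 0, 1, 1, 1, 1]
```

Two-point crossover works the same way but swaps only the segment between two cut points.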
o Mutation: The mutation operator inserts random genes into the offspring (new child) to maintain diversity in the population. It can be done by flipping some bits in the chromosome. Mutation helps solve the problem of premature convergence and enhances diversification. Types of mutation available:
o Flip bit mutation
o Gaussian mutation
o Exchange/Swap mutation
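Flip-bit mutation on a binary chromosome can be sketched as follows; the function name and the default mutation rate are illustrative:

```python
import random

def flip_bit_mutation(chromosome, mutation_rate=0.1, rng=None):
    # Flip each bit independently with probability mutation_rate
    # to keep diversity in the population.
    rng = rng or random.Random()
    return [1 - bit if rng.random() < mutation_rate else bit
            for bit in chromosome]

mutated = flip_bit_mutation([0, 1, 1, 0, 1], mutation_rate=0.1)
```

A rate of 0.0 leaves the chromosome unchanged, while a rate of 1.0 flips every bit; practical rates are usually small (a few percent per gene).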
5. Termination
After the reproduction phase, a stopping criterion is applied as the basis for termination. The algorithm terminates once an individual with the threshold fitness is found (or another stopping condition, such as a maximum number of generations, is met), and the best individual in the population is identified as the final solution.
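The five phases above can be combined into one toy genetic algorithm. The sketch below maximizes the number of 1-bits in a chromosome (the classic "OneMax" problem); all names, parameter values, and the choice of tournament selection are illustrative assumptions:

```python
import random

def genetic_algorithm(chrom_len=10, pop_size=20, generations=50,
                      mutation_rate=0.05, seed=0):
    """Toy GA maximizing the number of 1-bits (the OneMax problem)."""
    rng = random.Random(seed)
    fitness = sum  # fitness score = count of 1-bits in the chromosome

    # 1. Initialization: random binary strings
    pop = [[rng.randint(0, 1) for _ in range(chrom_len)]
           for _ in range(pop_size)]

    for _ in range(generations):
        # 5. Termination: stop early once a perfect individual appears
        if max(fitness(ind) for ind in pop) == chrom_len:
            break

        # 2 + 3. Fitness assignment and (tournament) selection
        def select():
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b

        next_pop = []
        while len(next_pop) < pop_size:
            p1, p2 = select(), select()
            # 4. Reproduction: one-point crossover ...
            point = rng.randint(1, chrom_len - 1)
            child = p1[:point] + p2[point:]
            # ... followed by flip-bit mutation
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop

    return max(pop, key=fitness)

best = genetic_algorithm()
```

Each pass through the loop is one generation; the population gradually drifts toward chromosomes of all 1s because fitter parents are selected more often.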
Expert System
An expert system solves the most complex issues as an expert would, by extracting the knowledge stored in its knowledge base. The system helps in decision-making for complex problems using both facts and heuristics, like a human expert. It is called an expert system because it contains the expert knowledge of a specific domain and can solve any complex problem of that particular domain. These systems are designed for a specific domain, such as medicine or science.
Characteristics of an expert system:
o High Performance: The expert system provides high performance for solving any type of complex problem of a specific domain with high efficiency and accuracy.
o Understandable: It responds in a way that can be easily understandable by the user. It can take
input in human language and provides the output in the same way.
o Reliable: It is highly reliable for generating efficient and accurate output.
o Highly responsive: ES provides the result for any complex query within a very short period of
time.
Components of an expert system:
o User Interface
o Inference Engine
o Knowledge Base
The knowledge base contains two types of knowledge:
o Factual Knowledge: Knowledge that is based on facts and accepted by knowledge engineers comes under factual knowledge.
o Heuristic Knowledge: This knowledge is based on practice, the ability to guess, evaluation, and
experiences.
Knowledge Representation: It is used to formalize the knowledge stored in the knowledge base, typically as IF-THEN-ELSE rules.
Knowledge Acquisition: It is the process of extracting, organizing, and structuring the domain knowledge, specifying the rules to acquire knowledge from various experts, and storing that knowledge in the knowledge base.
Knowledge Acquisition
The success of any expert system majorly depends on the quality, completeness, and accuracy of the
information stored in the knowledge base.
The knowledge base is built from the contributions of various experts, scholars, and knowledge engineers.
The knowledge engineer is a person with empathy, quick learning, and case-analysis skills. The knowledge engineer acquires information from subject experts by recording, interviewing, and observing them at work, then categorizes and organizes the information in a meaningful way, in the form of IF-THEN-ELSE rules, to be used by the inference engine. The knowledge engineer also monitors the development of the ES.
Advantages of Expert System
o These systems are highly reproducible.
o They can be used for risky places where the human presence is not safe.
o Error possibilities are less if the KB contains correct knowledge.
o The performance of these systems remains steady as it is not affected by emotions, tension, or
fatigue.
o They provide a very high speed to respond to a particular query.
Limitations of Expert System
o The response of the expert system may be wrong if the knowledge base contains wrong information.
Beyond the above components, some ES shells provide facilities for database connectivity through an interpreter, web integration, and natural language processing (NLP) features. The knowledge base can be connected to an external database (like MySQL), since the knowledge base is not optimal for storing extensive data. The knowledge base cannot directly access the database; such access is mediated through an interpreter.
Some ES shells have built-in knowledge-base editors, which let the knowledge engineer easily update and check the knowledge base. The knowledge engineer collects expert knowledge in a specific domain and models it to populate the knowledge base.
The inference engine, the most important part of an expert system, accesses the knowledge base and solves the problem by either backward chaining or forward chaining over the facts and rules in the knowledge base. In ES shells, the inference engine is also a built-in component, usually programmed in Prolog.
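Forward chaining can be illustrated with a minimal sketch: start from known facts and repeatedly fire any rule whose conditions are all satisfied, adding its conclusion, until nothing new can be derived. The rule format and the medical-domain rules below are hypothetical, for illustration only:

```python
def forward_chain(facts, rules):
    # Forward chaining: fire every rule whose conditions are all known
    # facts, add its conclusion, and repeat until no new fact appears.
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in facts and all(c in facts for c in conditions):
                facts.add(conclusion)
                changed = True
    return facts

# Hypothetical IF-THEN rules: (set of conditions, conclusion)
rules = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "high_temp"}, "see_doctor"),
]
derived = forward_chain({"fever", "cough", "high_temp"}, rules)
```

Backward chaining works in the opposite direction: it starts from a goal (e.g. "see_doctor") and recursively checks whether some rule's conditions can be established from the known facts.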
Most ES shells are composed of another component called ‘Explanation System’ which provides the user
with reasons and explanations to provide a certain answer, by considering the ‘case specification data’
available.
In an expert system shell, the design of the user interface and other software components is programmed by the software engineer. An expert system is therefore a collaborative design of three major parties: the expert, the knowledge engineer, and the software engineer. (Depending on the size of the project, these parties may vary from individuals to large teams.)
The advantage of expert system shells was that they were somewhat easier for nonprogrammers to use.
The advantage of Prolog environments was that they were not focused only on if-then rules; Prolog
environments provided a much better realization of a complete first-order logic environment.
(Subject In-charge)
Prof. S. B. Mehta