Unit 4
o An intelligent agent needs knowledge about the real world in order to make decisions and reason so that it can act efficiently.
o Knowledge-based agents are agents that can maintain an internal state of knowledge, reason over that knowledge, update it after observations, and take actions. These agents represent the world with a formal representation and act intelligently.
o Knowledge-based agents are composed of two main parts:
o Knowledge-base and
o Inference system.
The diagram above shows a generalized architecture for a knowledge-based agent. The knowledge-based agent (KBA) takes input by perceiving the environment. The input is passed to the inference engine of the agent, which communicates with the KB to decide on an action based on the knowledge stored there. The learning element of the KBA regularly updates the KB by learning new knowledge.
Knowledge base: The knowledge base, also known as the KB, is the central component of a knowledge-based agent. It is a collection of sentences (here 'sentence' is a technical term; it is not identical to a sentence in English). These sentences are expressed in a language called a knowledge representation language. The knowledge base of a KBA stores facts about the world.
Why use a knowledge base?
A knowledge base is required so that the agent can update its knowledge, learn from experience, and take actions based on that knowledge.
Inference system
Inference means deriving new sentences from old ones. The inference system allows us to add new sentences to the knowledge base. A sentence is a proposition about the world. The inference system applies logical rules to the KB to deduce new information.
The inference system generates new facts so that the agent can update the KB. An inference system mainly operates in two modes, which are:
o Forward chaining
o Backward chaining
Following are the three operations performed by a KBA in order to show intelligent behavior:
1. TELL: This operation tells the knowledge base what it perceives from the environment.
2. ASK: This operation asks the knowledge base what action it should perform.
3. Perform: It performs the selected action.
function KB-AGENT(percept) returns an action
    persistent: KB, a knowledge base
                t, a counter, initially 0, indicating time
    TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
    action ← ASK(KB, MAKE-ACTION-QUERY(t))
    TELL(KB, MAKE-ACTION-SENTENCE(action, t))
    t ← t + 1
    return action
The knowledge-based agent takes percept as input and returns an action as output. The agent maintains the
knowledge base, KB, and it initially has some background knowledge of the real world. It also has a counter to
indicate the time for the whole process, and this counter is initialized with zero.
Each time the function is called, it performs three operations:
First, MAKE-PERCEPT-SENTENCE builds a sentence asserting that the agent perceived the given percept at the given time, and this is TELLed to the knowledge base.
Then MAKE-ACTION-QUERY generates a sentence that asks which action should be done at the current time, and this is ASKed of the knowledge base.
Finally, MAKE-ACTION-SENTENCE generates a sentence asserting that the chosen action was executed, and this is TELLed to the knowledge base.
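A minimal Python sketch of this control loop is shown below. It is illustrative only: the kb object (with tell/ask methods) and the sentence-building helpers are hypothetical stand-ins for the TELL and ASK operations of a real knowledge base.

# Minimal, illustrative sketch of the KB-AGENT control loop.
# The sentence-building helpers and the kb object (with tell/ask methods)
# are hypothetical placeholders, not a real library API.

def make_percept_sentence(percept, t):
    return f"Percept({percept}, {t})"      # assert: percept observed at time t

def make_action_query(t):
    return f"WhichAction({t})"             # ask: which action to do at time t

def make_action_sentence(action, t):
    return f"Action({action}, {t})"        # assert: chosen action executed at time t

class KBAgent:
    def __init__(self, kb):
        self.kb = kb    # knowledge base seeded with background knowledge
        self.t = 0      # time counter, initially 0

    def __call__(self, percept):
        self.kb.tell(make_percept_sentence(percept, self.t))   # TELL the percept
        action = self.kb.ask(make_action_query(self.t))        # ASK for an action
        self.kb.tell(make_action_sentence(action, self.t))     # TELL the chosen action
        self.t += 1
        return action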
A knowledge-based agent can be viewed at different levels, which are given below:
1. Knowledge level
The knowledge level is the first level of a knowledge-based agent. At this level, we specify what the agent knows and what its goals are. With these specifications, we can fix its behavior. For example, suppose an automated taxi agent needs to go from station A to station B, and it knows the way from A to B; this belongs to the knowledge level.
2. Logical level:
At the logical level, knowledge is encoded into different logics; that is, an encoding of knowledge into logical sentences occurs. At this level we can expect the automated taxi agent to reach destination B.
3. Implementation level:
This is the physical representation of logic and knowledge. At the implementation level agent perform actions as per
logical and knowledge level. At this level, an automated taxi agent actually implement his knowledge and logic so that
he can reach to the destination.
1. Declarative approach: We can create a knowledge-based agent by initializing it with an empty knowledge base and telling the agent all the sentences we want it to start with. This approach is called the declarative approach.
2. Procedural approach: In the procedural approach, we directly encode the desired behavior as program code; that is, we write a program that already encodes the desired behavior of the agent.
However, in the real world, a successful agent can be built by combining both declarative and procedural approaches,
and declarative knowledge can often be compiled into more efficient procedural code.
2. Propositional logic
Propositional logic (PL) is the simplest form of logic where all the statements are made by propositions. A proposition
is a declarative statement which is either true or false. It is a technique of knowledge representation in logical and
mathematical form.
Example:
a) It is Sunday.
b) The Sun rises from West (False proposition)
c) 3+3= 7(False proposition)
d) 5 is a prime number.
The syntax of propositional logic defines the allowable sentences for the knowledge representation. There are two
types of Propositions:
a. Atomic Propositions
b. Compound propositions
o Atomic Proposition: Atomic propositions are simple propositions consisting of a single proposition symbol. These are sentences which must be either true or false.
Example: "2 + 2 = 4" is an atomic proposition (it is true); "2 + 2 = 5" is also an atomic proposition (it is false).
o Compound Proposition: Compound propositions are constructed by combining simpler or atomic propositions using logical connectives.
Example: "It is raining today, and the street is wet."
Logical connectives are used to connect two simpler propositions or to represent a sentence logically. We can create compound propositions with the help of logical connectives. There are mainly five connectives, which are given as follows:
1. Negation: A sentence such as ¬P is called the negation of P. A literal can be either a positive literal or a negative literal.
2. Conjunction: A sentence which has ∧ connective such as, P ∧ Q is called a conjunction.
Example: Rohan is intelligent and hardworking. It can be written as,
P= Rohan is intelligent,
Q= Rohan is hardworking. → P∧ Q.
3. Disjunction: A sentence which has ∨ connective, such as P ∨ Q. is called disjunction, where P and Q are the
propositions.
Example: "Ritika is a doctor or Engineer",
Here P = Ritika is a doctor, Q = Ritika is an engineer, so we can write it as P ∨ Q.
4. Implication: A sentence such as P → Q, is called an implication. Implications are also known as if-then rules.
It can be represented as
If it is raining, then the street is wet.
Let P= It is raining, and Q= Street is wet, so it is represented as P → Q
5. Biconditional: A sentence such as P ⇔ Q is a biconditional sentence. Example: "I am breathing if and only if I am alive."
P = I am breathing, Q = I am alive; it can be represented as P ⇔ Q.
Truth Table:
In propositional logic, we need to know the truth values of propositions in all possible scenarios. We can combine all possible combinations of truth values with logical connectives, and the representation of these combinations in tabular format is called a truth table. Following is the truth table for all the logical connectives:
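P       Q       ¬P      P ∧ Q   P ∨ Q   P → Q   P ⇔ Q
True    True    False   True    True    True    True
True    False   False   False   True    False   False
False   True    True    False   True    True    False
False   False   True    False   False   True    True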
Truth table with three propositions:
We can build a compound proposition from three propositions P, Q, and R. Its truth table is made up of 8 rows (2³ combinations), as we have taken three proposition symbols.
Precedence of connectives:
Just like arithmetic operators, there is a precedence order for propositional connectors or logical operators. This order
should be followed while evaluating a propositional problem. Following is the list of the precedence order for
operators:
Precedence              Operators
First precedence        Parentheses
Second precedence       Negation (¬)
Third precedence        Conjunction (AND, ∧)
Fourth precedence       Disjunction (OR, ∨)
Fifth precedence        Implication (→)
Sixth precedence        Biconditional (⇔)
Note: For a better understanding, use parentheses to make sure of the correct interpretation. For example, ¬R ∨ Q can be interpreted as (¬R) ∨ Q.
Logical equivalence:
Logical equivalence is one of the features of propositional logic. Two propositions are said to be logically equivalent if and only if their columns in the truth table are identical.
For example, take the propositions ¬A ∨ B and A → B. In the truth table below, the columns for ¬A ∨ B and A → B are identical, hence ¬A ∨ B is equivalent to A → B, which we can write as (¬A ∨ B) ⇔ (A → B).
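A       B       ¬A      ¬A ∨ B  A → B
True    True    False   True    True
True    False   False   False   False
False   True    True    True    True
False   False   True    True    True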
Properties of Operators:
o Commutativity:
o P∧ Q= Q ∧ P, or
o P ∨ Q = Q ∨ P.
o Associativity:
o (P ∧ Q) ∧ R= P ∧ (Q ∧ R),
o (P ∨ Q) ∨ R= P ∨ (Q ∨ R)
o Identity element:
o P ∧ True = P,
o P ∨ True= True.
o Distributive:
o P∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
o P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
o De Morgan's Law:
o ¬ (P ∧ Q) = (¬P) ∨ (¬Q)
o ¬ (P ∨ Q) = (¬ P) ∧ (¬Q).
o Double-negation elimination:
o ¬ (¬P) = P.
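These properties can be checked mechanically by enumerating all truth assignments. The short Python sketch below (not part of the original notes) verifies De Morgan's law and distributivity this way:

from itertools import product

# Check that two propositional formulas (given as Python functions of the
# variables P, Q, R) agree on every truth assignment.
def equivalent(lhs, rhs):
    return all(lhs(*v) == rhs(*v) for v in product([True, False], repeat=3))

# De Morgan's law: ¬(P ∧ Q) = (¬P) ∨ (¬Q)   (R is unused here)
print(equivalent(lambda p, q, r: not (p and q),
                 lambda p, q, r: (not p) or (not q)))      # True

# Distributivity: P ∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R)
print(equivalent(lambda p, q, r: p and (q or r),
                 lambda p, q, r: (p and q) or (p and r)))  # True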
Theorem proving: Applying rules of inference directly to the sentences in our knowledge base to
construct a proof of the desired sentence without consulting models.
Inference rules are patterns of sound inference that can be used to find proofs. The resolution rule yields
a complete inference algorithm for knowledge bases that are expressed in conjunctive normal
form. Forward chaining and backward chaining are very natural reasoning algorithms for knowledge
bases in Horn form.
Logical equivalence:
Two sentences α and β are logically equivalent if they are true in the same set of models (written as α ≡ β).
Also: α ≡ β if and only if α ⊨ β and β ⊨ α.
Validity: A sentence is valid if it is true in all models.
Valid sentences are also known as tautologies—they are necessarily true. Every valid sentence is
logically equivalent to True.
The deduction theorem: For any sentences α and β, α ⊨ β if and only if the sentence (α ⇒ β) is valid.
Satisfiability: A sentence is satisfiable if it is true in, or satisfied by, some model. Satisfiability can be
checked by enumerating the possible models until one is found that satisfies the sentence.
The SAT problem: The problem of determining the satisfiability of sentences in propositional logic.
Validity and satisfiability are connected:
α is valid iff ¬α is unsatisfiable;
α is satisfiable iff ¬α is not valid;
α ⊨ β if and only if the sentence (α ∧ ¬β) is unsatisfiable.
Proving β from α by checking the unsatisfiability of (α∧¬β) corresponds to proof by refutation / proof
by contradiction.
Modus Ponens: Whenever sentences of the form α ⇒ β and α are given, the sentence β can be inferred.
o And-Elimination: From a conjunction α ∧ β, either of the conjuncts (α or β) can be inferred.
o All of the logical equivalences (in Figure 7.11) can be used as inference rules. For example, the equivalence for biconditional elimination yields two inference rules: from α ⇔ β infer (α ⇒ β) ∧ (β ⇒ α), and from (α ⇒ β) ∧ (β ⇒ α) infer α ⇔ β.
We can apply any of the search algorithms in Chapter 3 to find a sequence of steps that constitutes a proof. We just need to define a proof problem as follows:
o INITIAL STATE: the initial knowledge base;
o ACTIONS: the set of actions consists of all the inference rules applied to all the sentences that match the top half of the inference rule;
o RESULT: the result of an action is to add the sentence in the bottom half of the inference rule;
o GOAL: the goal is a state that contains the sentence we are trying to prove.
In many practical cases, finding a proof can be more efficient than enumerating models, because the proof can ignore irrelevant propositions, no matter how many of them there are.
Monotonicity: A property of logical systems which says that the set of entailed sentences can only increase as information is added to the knowledge base.
For any sentences α and β,
if KB ⊨ α then KB ∧ β ⊨ α.
Monotonicity means that inference rules can be applied whenever suitable premises are found in the knowledge base; whatever else is in the knowledge base cannot invalidate any conclusion already inferred.
Proof by resolution
Resolution: An inference rule that yields a complete inference algorithm when coupled with any complete
search algorithm.
Clause: A disjunction of literals. (e.g. A∨ B). A single literal can be viewed as a unit clause (a disjunction of
one literal ).
Unit resolution inference rule: Takes a clause and a literal and produces a new clause: from the clause l1 ∨ … ∨ lk and the literal m, where li and m are complementary literals (one is the negation of the other), infer l1 ∨ … ∨ li−1 ∨ li+1 ∨ … ∨ lk.
The full resolution rule takes two clauses containing complementary literals li and m and produces a new clause containing all the literals of both original clauses except li and m.
Notice: The resulting clause should contain only one copy of each literal. The removal of multiple copies of a literal is called factoring.
e.g. resolve(A∨ B) with (A∨ ¬B), obtain(A∨ A) and reduce it to just A.
A resolution algorithm
e.g.
KB = (B1,1⟺(P1,2∨P2,1))∧¬B1,1
α = ¬P1,2
Notice: Any clause in which two complementary literals appear can be discarded, because it is always equivalent
to True.
e.g. B1,1∨¬B1,1∨P1,2 = True∨P1,2 = True.
PL-RESOLUTION is complete.
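As an illustration of resolution by refutation, here is a compact Python sketch in the spirit of PL-RESOLUTION (a simplified, illustrative version, not the textbook's exact algorithm). Clauses are frozensets of literal strings such as "P12" or "¬P12".

# Illustrative propositional resolution by refutation.
# A clause is a frozenset of literal strings, e.g. frozenset({"P", "¬Q"}).

def negate(lit):
    return lit[1:] if lit.startswith("¬") else "¬" + lit

def resolve(c1, c2):
    # Return all resolvents of two clauses (factoring happens via set semantics).
    return [frozenset((c1 - {lit}) | (c2 - {negate(lit)}))
            for lit in c1 if negate(lit) in c2]

def pl_resolution(kb_clauses, negated_query_clauses):
    # KB entails alpha iff KB ∧ ¬alpha is unsatisfiable (empty clause derivable).
    clauses = set(kb_clauses) | set(negated_query_clauses)
    while True:
        new = set()
        clause_list = list(clauses)
        for i in range(len(clause_list)):
            for j in range(i + 1, len(clause_list)):
                for r in resolve(clause_list[i], clause_list[j]):
                    if not r:                  # empty clause: contradiction found
                        return True
                    new.add(r)
        if new.issubset(clauses):              # fixed point: nothing new inferred
            return False
        clauses |= new

# Example from the text: KB = (B11 ⇔ (P12 ∨ P21)) ∧ ¬B11, query α = ¬P12.
kb = [frozenset({"¬B11", "P12", "P21"}),       # B11 ⇒ (P12 ∨ P21)
      frozenset({"¬P12", "B11"}),              # P12 ⇒ B11
      frozenset({"¬P21", "B11"}),              # P21 ⇒ B11
      frozenset({"¬B11"})]
print(pl_resolution(kb, [frozenset({"P12"})])) # True: KB entails ¬P12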
Definite clause: A disjunction of literals of which exactly one is positive. (e.g. ¬ L1,1∨¬Breeze∨B1,1)
Every definite clause can be written as an implication, whose premise is a conjunction of positive literals and whose
conclusion is a single positive literal.
Horn clause: A disjunction of literals of which at most one is positive. (All definite clauses are Horn clauses.)
In Horn form, the premise is called the body and the conclusion is called the head.
A sentence consisting of a single positive literal is called a fact; it too can be written in implication form.
Horn clauses are closed under resolution: if you resolve two Horn clauses, you get back a Horn clause.
Inference with Horn clauses can be done through the forward-chaining and backward-chaining algorithms.
Deciding entailment with Horn clauses can be done in time that is linear in the size of the knowledge base.
Goal clause: A clause with no positive literals.
Fixed point: The algorithm reaches a fixed point where no new inferences are possible.
Data-driven reasoning: Reasoning in which the focus of attention starts with the known data. It can be used within an agent to derive conclusions from incoming percepts, often without a specific query in mind. (Forward chaining is an example.)
The set of possible models, given a fixed propositional vocabulary, is finite, so entailment can be checked by
enumerating models. Efficient model-checking inference algorithms for propositional logic include backtracking and
local search methods and can often solve large problems quickly.
Two families of algorithms for the SAT problem are based on model checking:
a. algorithms based on backtracking
b. algorithms based on local hill-climbing search
DPLL embodies three improvements over the scheme of TT-ENTAILS?: early termination, the pure symbol heuristic, and the unit clause heuristic.
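A condensed Python sketch of the DPLL idea is given below. It is illustrative only (none of the scaling tricks listed next are included); literals are (symbol, polarity) pairs and clauses are frozensets of literals.

# Illustrative DPLL-style satisfiability check with early termination,
# the pure symbol heuristic, and the unit clause heuristic.

def dpll_satisfiable(clauses, symbols, model=None):
    model = dict(model or {})

    def clause_value(clause):
        # True if satisfied, False if falsified, None if still undetermined.
        undetermined = False
        for sym, pol in clause:
            if sym in model:
                if model[sym] == pol:
                    return True
            else:
                undetermined = True
        return None if undetermined else False

    values = [clause_value(c) for c in clauses]
    if all(v is True for v in values):          # early termination: all clauses satisfied
        return True
    if any(v is False for v in values):         # early termination: some clause falsified
        return False

    unassigned = [s for s in symbols if s not in model]

    # Pure symbol heuristic: a symbol occurring with only one polarity
    # in the still-undetermined clauses can safely be assigned that polarity.
    for sym in unassigned:
        polarities = {pol for c, v in zip(clauses, values) if v is None
                      for s, pol in c if s == sym}
        if len(polarities) == 1:
            model[sym] = polarities.pop()
            return dpll_satisfiable(clauses, symbols, model)

    # Unit clause heuristic: a clause with a single unassigned literal forces it.
    for c, v in zip(clauses, values):
        if v is None:
            unfixed = [(s, p) for s, p in c if s not in model]
            if len(unfixed) == 1:
                sym, pol = unfixed[0]
                model[sym] = pol
                return dpll_satisfiable(clauses, symbols, model)

    # Otherwise branch on the first unassigned symbol.
    sym = unassigned[0]
    return (dpll_satisfiable(clauses, symbols, {**model, sym: True}) or
            dpll_satisfiable(clauses, symbols, {**model, sym: False}))

# (P ∨ Q) ∧ (¬P ∨ Q) ∧ (¬Q ∨ R) is satisfiable:
clauses = [frozenset({("P", True), ("Q", True)}),
           frozenset({("P", False), ("Q", True)}),
           frozenset({("Q", False), ("R", True)})]
print(dpll_satisfiable(clauses, ["P", "Q", "R"]))   # True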
Tricks that enable SAT solvers to scale up to large problems: Component analysis, variable and value
ordering, intelligent backtracking, random restarts, clever indexing.
The notation CNFk(m, n) denotes a k-CNF sentence with m clauses and n symbols, i.e., n variables and k literals per clause.
Such random sentences are generated by choosing clauses uniformly, independently, and without replacement from among all clauses with k different literals, each of which is positive or negative at random.
Hardness: problems right at the threshold > overconstrained problems > underconstrained problems
Satisfiability threshold conjecture: A conjecture which says that for every k ≥ 3, there is a threshold ratio rk such that, as n goes to infinity, the probability that CNFk(rn, n) is satisfiable becomes 1 for all values of r below the threshold and 0 for all values above it. (The conjecture remains unproven.)
Frame problem: some information is lost because the effect axioms fail to state what remains unchanged as the result of an action.
Solution: add frame axioms explicitly asserting all the propositions that remain the same.
Representation frame problem: The proliferation of frame axioms is inefficient, the set of frame axioms will be
O(mn) in a world with m different actions and n fluents.
Solution: because the world exhibits locality (for humans, each action typically changes no more than some small number k of fluents), define the transition model with a set of axioms of size O(mk) rather than O(mn).
Inferential frame problem: the problem of projecting forward the results of a t-step plan of actions in time O(kt) rather than O(nt).
Solution: change one’s focus from writing axioms about actions to writing axioms about fluents.
For each fluent F, we will have an axiom that defines the truth value of F^(t+1) in terms of the fluents at time t and the actions that may have occurred at time t.
The truth value of F^(t+1) can be set in one of two ways:
Either a. the action at time t causes F to be true at t+1,
or b. F was already true at time t and the action at time t does not cause it to be false.
An axiom of this form is called a successor-state axiom and has the schema:
F^(t+1) ⇔ ActionCausesF^t ∨ (F^t ∧ ¬ActionCausesNotF^t).
Qualification problem: specifying all unusual exceptions that could cause the action to fail.
2. A hybrid agent
Hybrid agent: combines the ability to deduce various aspect of the state of the world with condition-action rules, and
with problem-solving algorithms.
The agent maintains and updates a knowledge base as well as a current plan.
The initial KB contains the atemporal axioms (those that do not depend on t).
At each time step, the new percept sentence is added along with all the axioms that depend on t (such as the
successor-state axioms).
Then the agent uses logical inference, by ASKing questions of the KB, to work out which squares are safe and which have yet to be visited.
The main body of the agent program constructs a plan based on a decreasing priority of goals (a rough sketch follows this list):
1. If there is a glitter, construct a plan to grab the gold, follow a route back to the initial location, and climb out of the cave;
2. Otherwise, if there is no current plan, plan a route (with A* search) to the closest safe square not yet visited, making sure the route goes through only safe squares;
3. If there are no safe squares to explore, and the agent still has an arrow, try to make a safe square by shooting at one of the possible wumpus locations;
4. If this fails, look for a square to explore that is not provably unsafe;
5. If there is no such square, the mission is impossible; retreat to the initial location and climb out of the cave.
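A rough Python sketch of this priority scheme is given below. Every helper used here (ask_glitter, plan_route, safe_unvisited_squares, plan_shot, not_provably_unsafe_squares) and the kb/state objects are hypothetical placeholders for the corresponding KB queries (ASK) and A* planning calls, not a real API.

# Hypothetical sketch of the hybrid agent's decreasing priority of goals.
# All helper functions and the kb/state objects are illustrative placeholders.

def choose_plan(kb, state):
    if ask_glitter(kb, state.t):
        # 1. Grab the gold, return to the start, and climb out.
        return ["Grab"] + plan_route(state.pos, state.start, safe_only=True) + ["Climb"]
    if state.plan:
        # 2. There is already a current plan: keep following it.
        return state.plan
    targets = safe_unvisited_squares(kb, state)
    if targets:
        # 2. Plan an A* route to the closest safe, unvisited square.
        return plan_route(state.pos, min(targets, key=state.distance), safe_only=True)
    if state.has_arrow:
        # 3. Try to make a square safe by shooting at a possible wumpus location.
        return plan_shot(kb, state)
    risky = not_provably_unsafe_squares(kb, state)
    if risky:
        # 4. Explore a square that is not provably unsafe.
        return plan_route(state.pos, min(risky, key=state.distance), safe_only=False)
    # 5. Nothing left: the mission is impossible, so retreat and climb out.
    return plan_route(state.pos, state.start, safe_only=True) + ["Climb"]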
Weakness: The computational expense goes up as time goes by.
We use a logical sentence involving the proposition symbols associated with the current time step and the temporal
symbols.
Logical state estimation involves maintaining a logical sentence that describes the set of possible states consistent
with the observation history. Each update step requires inference using the transition model of the environment,
which is built from successor-state axioms that specify how each fluent changes.
State estimation: The process of updating the belief state as new percepts arrive.
Exact state estimation may require logical formulas whose size is exponential in the number of symbols.
One common scheme for approximate state estimation: to represent belief state as conjunctions of literals (1-CNF
formulas).
The agent simply tries to prove Xt and ¬Xt for each symbol Xt, given the belief state at t-1.
The conjunction of provable literals becomes the new belief state, and the previous belief state is discarded.
(This scheme may lose some information as time goes along.)
The set of possible states represented by the 1-CNF belief state includes all states that are in fact possible given the
full percept history. The 1-CNF belief state acts as a simple outer envelope, or conservative approximation.
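A small sketch of this 1-CNF update is shown below; the entails argument stands for a hypothetical propositional entailment checker (for example, one built on DPLL or resolution), and the axiom lists are assumed to be plain sentence strings.

# Sketch of the 1-CNF (conjunction-of-literals) belief-state update.
# `entails(kb, sentence)` is a hypothetical entailment checker passed in by the caller.

def update_belief_state(prev_belief, percept_axioms, transition_axioms, symbols, entails):
    kb = list(prev_belief) + list(percept_axioms) + list(transition_axioms)
    new_belief = []
    for x in symbols:                    # proposition symbols for the current time step
        if entails(kb, x):
            new_belief.append(x)         # X is provably true
        elif entails(kb, "¬" + x):
            new_belief.append("¬" + x)   # X is provably false
        # otherwise X is left unconstrained; some information may be lost here
    return new_belief                    # the previous belief state is discarded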
Precondition axioms: stating that an action occurrence requires the preconditions to be satisfied, added to avoid
generating plans with illegal actions.
Action exclusion axioms: added to avoid the creation of plans with multiple simultaneous actions that interfere with
each other.
Propositional logic does not scale to environments of unbounded size because it lacks the expressive power to deal concisely with time, space, and universal patterns of relationships among objects.
6. First-order logic
In the topic of propositional logic, we have seen how to represent statements using propositional logic. But unfortunately, in propositional logic we can only represent facts which are either true or false. PL is not sufficient to represent complex sentences or natural language statements; it has very limited expressive power. Consider sentences such as "All men are mortal" or "Some students are intelligent", which we cannot represent concisely using PL.
To represent such statements, PL is not sufficient, so we require a more powerful logic, such as first-order logic.
First-Order logic:
o First-order logic is another way of knowledge representation in artificial intelligence. It is an extension of propositional logic.
o FOL is sufficiently expressive to represent natural language statements in a concise way.
o First-order logic is also known as predicate logic or first-order predicate logic. First-order logic is a powerful language that expresses information about objects in a natural way and can also express the relationships between those objects.
o First-order logic (like natural language) does not only assume that the world contains facts, as propositional logic does, but also assumes the following things in the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ......
o Relations: unary relations such as red, round, is adjacent, or n-ary relations such as the sister of, brother of, has color, comes between
o Functions: father of, best friend, third inning of, end of, ......
o As a natural language, first-order logic also has two main parts:
a. Syntax
b. Semantics
The syntax of FOL determines which collections of symbols are logical expressions in first-order logic. The basic syntactic elements of first-order logic are symbols. We write statements in shorthand notation in FOL.
Variables: x, y, z, a, b, ...
Connectives: ∧, ∨, ¬, ⇒, ⇔
Equality: =
Quantifiers: ∀, ∃
Atomic sentences:
o Atomic sentences are the most basic sentences of first-order logic. These sentences are formed from a predicate symbol followed by a parenthesized sequence of terms.
o We can represent atomic sentences as Predicate(term1, term2, ..., termN).
Complex Sentences:
o Complex sentences are made by combining atomic sentences using connectives.
Universal Quantifier:
The universal quantifier is a symbol of logical representation which specifies that the statement within its scope is true for everything or every instance of a particular thing.
If x is a variable, then ∀x is read as:
o For all x
o For each x
o For every x.
Example: All men drink coffee.
Let x be a variable that refers to a man; then the statement can be represented as:
∀x man(x) → drink(x, coffee).
It is read as: For all x, if x is a man, then x drinks coffee.
Existential Quantifier:
Existential quantifiers are the type of quantifiers which express that the statement within their scope is true for at least one instance of something.
The existential quantifier is denoted by the logical operator ∃, which resembles an inverted E. When it is used with a predicate variable, it is called an existential quantifier.
If x is a variable, then the existential quantifier is written ∃x or ∃(x), and it is read as:
o There exists an x
o For at least one x
o For some x.
Example: Some boys are intelligent.
This can be represented as ∃x boy(x) ∧ intelligent(x).
It is read as: There is some x such that x is a boy and x is intelligent.
Points to remember:
o The main connective used with the universal quantifier ∀ is implication →.
o The main connective used with the existential quantifier ∃ is conjunction ∧.
Properties of Quantifiers:
o In universal quantifier, ∀x∀y is similar to ∀y∀x.
o In Existential quantifier, ∃x∃y is similar to ∃y∃x.
o ∃x∀y is not similar to ∀y∃x.
Quantifiers interact with the variables that appear within their scope. There are two types of variables in first-order logic, which are given below:
Free Variable: A variable is said to be a free variable in a formula if it occurs outside the scope of the quantifier.
Bound Variable: A variable is said to be a bound variable in a formula if it occurs within the scope of the quantifier.
What is knowledge-engineering?
The process of constructing a knowledge base in first-order logic is called knowledge engineering. In knowledge engineering, someone who investigates a particular domain, learns the important concepts of that domain, and generates a formal representation of the objects is known as a knowledge engineer.
In this topic, we will understand the knowledge-engineering process in an electronic circuit domain, which is already familiar. This approach is mainly suitable for creating a special-purpose knowledge base.
Following are the main steps of the knowledge-engineering process. Using these steps, we will develop a knowledge base which will allow us to reason about a digital circuit (a one-bit full adder).
1. Identify the task:
The first step of the process is to identify the task; for the digital circuit, there are various reasoning tasks.
At the first (highest) level, we will examine the functionality of the circuit, e.g., whether the adder computes the correct outputs.
At the second level, we will examine circuit structure details, such as which gates are connected to which terminals.
2. Assemble the relevant knowledge:
In the second step, we will assemble the relevant knowledge required for digital circuits: circuits are composed of wires and gates; signals flow along wires to the input terminals of gates; and each gate produces a signal on its output terminal, whose value depends on the gate's type (AND, OR, XOR, or NOT).
3. Decide on vocabulary:
The next step of the process is to select functions, predicate, and constants to represent the circuits, terminals, signals,
and gates. Firstly we will distinguish the gates from each other and from other objects. Each gate is represented as an
object which is named by a constant, such as, Gate(X1). The functionality of each gate is determined by its type,
which is taken as constants such as AND, OR, XOR, or NOT. Circuits will be identified by a predicate: Circuit (C1).
For gate input, we will use the function In(1, X1) for denoting the first input terminal of the gate, and for output
terminal we will use Out (1, X1).
The function Arity(c, i, j) is used to denote that circuit c has i inputs and j outputs.
The connectivity between gates can be represented by the predicate Connect(Out(1, X1), In(1, X1)).
We use a unary predicate On(t), which is true if the signal at terminal t is on.
4. Encode general knowledge of the domain:
To encode the general knowledge about the logic circuit, we need the following rules:
o If two terminals are connected, then they have the same signal:
∀ t1, t2 Terminal(t1) ∧ Terminal(t2) ∧ Connect(t1, t2) → Signal(t1) = Signal(t2).
o The signal at every terminal is either 0 or 1:
∀ t Terminal(t) → Signal(t) = 1 ∨ Signal(t) = 0.
o The output of an XOR gate is 1 if and only if its inputs are different:
∀ g Gate(g) ∧ Type(g) = XOR → Signal(Out(1, g)) = 1 ⇔ Signal(In(1, g)) ≠ Signal(In(2, g)).
o The output of a NOT gate is the inverse of its input:
∀ g Gate(g) ∧ Type(g) = NOT → Signal(Out(1, g)) ≠ Signal(In(1, g)).
5. Encode a description of the specific problem instance:
Now we encode the problem of circuit C1. First, we categorize the circuit and its gate components. This step is easy if an ontology for the problem has already been worked out; it involves writing simple atomic sentences about instances of the concepts in the ontology.
For the given circuit C1, we can encode the problem instance in atomic sentences as below.
Since the circuit contains two XOR gates, two AND gates, and one OR gate, the atomic sentences for these gates will be:
Type(X1) = XOR, Type(X2) = XOR, Type(A1) = AND, Type(A2) = AND, Type(O1) = OR.
6. Pose queries to the inference procedure:
In this step, we will find all the possible sets of values of all the terminals of the adder circuit. The first query will be:
What combination of inputs would cause the first output of circuit C1 to be 0 and the second output to be 1?
∃ i1, i2, i3 Signal (In(1, C1))=i1 ∧ Signal (In(2, C1))=i2 ∧ Signal (In(3, C1))= i3
∧ Signal (Out(1, C1)) =0 ∧ Signal (Out(2, C1))=1
7. Debug the knowledge base:
Now we will debug the knowledge base, which is the last step of the complete process. In this step, we will try to debug any issues in the knowledge base.
Inference in First-Order Logic is used to deduce new facts or sentences from existing sentences. Before understanding
the FOL inference rule, let's understand some basic terminologies used in FOL.
Substitution:
Substitution is a fundamental operation performed on terms and formulas. It occurs in all inference systems in first-order logic. Substitution is more complex in the presence of quantifiers. If we write F[a/x], it refers to substituting the constant "a" for the variable "x" in F.
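As a small illustration (not part of the original notes), the sketch below applies a substitution to a term, with terms represented as nested tuples and variables and constants as plain strings:

# Illustrative substitution on FOL terms.
# A term is either a plain string (variable or constant) or a compound
# term (functor, arg1, arg2, ...); theta maps variable names to terms.

def substitute(term, theta):
    if isinstance(term, tuple):          # compound term: recurse into the arguments
        return (term[0],) + tuple(substitute(arg, theta) for arg in term[1:])
    return theta.get(term, term)         # variable (replaced if in theta) or constant

# F[a/x]: substitute the constant "a" for the variable "x" in P(x, f(x, y))
print(substitute(("P", "x", ("f", "x", "y")), {"x": "a"}))
# ('P', 'a', ('f', 'a', 'y'))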
Note: First-order logic is capable of expressing facts about some or all objects in the universe.
Equality:
First-order logic does not only use predicates and terms for making atomic sentences; it also provides another way, namely equality. We can use the equality symbol to specify that two terms refer to the same object.
For example, Brother(John) = Smith says that the object referred to by Brother(John) is the same as the object referred to by Smith. The equality symbol can also be used with negation to state that two terms do not refer to the same object.
As in propositional logic, we also have inference rules in first-order logic. Following are some basic inference rules in FOL:
o Universal Generalization
o Universal Instantiation
o Existential Instantiation
o Existential introduction
1. Universal Generalization:
o Universal generalization is a valid inference rule which states that if the premise P(c) is true for any arbitrary element c in the universe of discourse, then we can conclude ∀x P(x).
Example: Let P(c): "A byte contains 8 bits." Since this is true for every byte c, we can conclude ∀x P(x): "All bytes contain 8 bits."
2. Universal Instantiation:
o Universal instantiation, also called universal elimination or UI, is a valid inference rule. It can be applied multiple times to add new sentences.
o The new KB is logically equivalent to the previous KB.
o As per UI, we can infer any sentence obtained by substituting a ground term for the variable.
o The UI rule states that from ∀x P(x) we can infer any sentence P(c), where c is a ground term (a constant within the domain of x) denoting any object in the universe of discourse.
Example: If the KB contains ∀x King(x) ∧ Greedy(x) → Evil(x), then using Universal Instantiation we can infer any of the following statements: King(John) ∧ Greedy(John) → Evil(John), King(Richard) ∧ Greedy(Richard) → Evil(Richard), and so on.
3. Existential Instantiation:
Existential instantiation, also called existential elimination, is a valid inference rule in first-order logic. It replaces an existentially quantified variable with a new constant that does not appear elsewhere.
Example: From ∃x Crown(x) ∧ OnHead(x, John),
we can infer Crown(K) ∧ OnHead(K, John), as long as K does not appear elsewhere in the knowledge base.
4. Existential introduction
o An existential introduction is also known as an existential generalization, which is a valid inference rule in
first-order logic.
o This rule states that if there is some element c in the universe of discourse which has a property P, then we
can infer that there exists something in the universe which has the property P.
For the inference process in FOL, we have a single inference rule called Generalized Modus Ponens. It is the lifted version of Modus Ponens.
Generalized Modus Ponens can be summarized as: "P implies Q and P is asserted to be true, therefore Q must be true."
For atomic sentences pi, pi', and q, where there is a substitution θ such that SUBST(θ, pi') = SUBST(θ, pi), the rule can be stated as: from p1', p2', ..., pn' and (p1 ∧ p2 ∧ ... ∧ pn ⇒ q), infer SUBST(θ, q).
Example:
We will use this rule for "Kings who are greedy are evil": we will find some x such that x is a king and x is greedy, so we can infer that x is evil.
In artificial intelligence, forward and backward chaining are important topics; but before understanding forward and backward chaining, let's first understand where these two terms come from.
Inference engine:
The inference engine is the component of an intelligent system in artificial intelligence which applies logical rules to the knowledge base to infer new information from known facts. The first inference engine was part of an expert system. An inference engine commonly proceeds in two modes, which are:
A. Forward chaining
B. Backward chaining
Horn clauses and definite clauses are forms of sentences which enable the knowledge base to use a more restricted and efficient inference algorithm. Logical inference algorithms use forward- and backward-chaining approaches, which require the KB to be in the form of first-order definite clauses.
Definite clause: A clause which is a disjunction of literals with exactly one positive literal is known as a definite
clause or strict horn clause.
Horn clause: A clause which is a disjunction of literals with at most one positive literal is known as horn clause.
Hence all the definite clauses are horn clauses.
Example: (¬p ∨ ¬q ∨ k) is a definite clause; it is equivalent to the implication p ∧ q → k.
A. Forward Chaining
Forward chaining is also known as forward deduction or forward reasoning when using an inference engine. Forward chaining is a form of reasoning which starts with the atomic sentences in the knowledge base and applies inference rules (Modus Ponens) in the forward direction to derive more data until a goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules whose premises are satisfied, and adds their conclusions to the known facts. This process repeats until the problem is solved.
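A compact Python sketch of forward chaining over propositional Horn rules illustrates the idea (the first-order version additionally needs unification); the rules below are a propositionalized toy version of the crime example that follows, not the original first-order clauses.

# Forward chaining over propositional Horn rules.
# Each rule is (set_of_premises, conclusion); `facts` is a set of known atoms.

def forward_chain(rules, facts, goal):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Trigger every rule whose premises are all already known.
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
                if conclusion == goal:
                    return True
    return goal in facts

rules = [({"American", "Weapon", "Sells", "Hostile"}, "Criminal"),
         ({"Missile"}, "Weapon"),
         ({"Missile", "OwnsA"}, "Sells"),
         ({"EnemyOfAmerica"}, "Hostile")]
facts = {"American", "Missile", "OwnsA", "EnemyOfAmerica"}
print(forward_chain(rules, facts, "Criminal"))   # True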
Properties of Forward-Chaining:
o It is a bottom-up approach, as it moves from the bottom (known facts) to the top (goal).
o It is a process of drawing conclusions from known facts or data, starting from the initial state and working toward the goal state.
o The forward-chaining approach is also called data-driven, as we reach the goal using the available data.
o The forward-chaining approach is commonly used in expert systems such as CLIPS, and in business and production rule systems.
Consider the following famous example which we will use in both approaches:
Example:
"As per the law, it is a crime for an American to sell weapons to hostile nations. Country A, an enemy of
America, has some missiles, and all the missiles were sold to it by Robert, who is an American citizen."
To solve the above problem, first, we will convert all the above facts into first-order definite clauses, and then we will
use a forward-chaining algorithm to reach the goal.
Step-1:
In the first step we will start with the known facts and will choose the sentences which do not have implications, such
as: American(Robert), Enemy(A, America), Owns(A, T1), and Missile(T1). All these facts will be represented as
below.
Step-2:
At the second step, we will infer new facts from the available facts for the rules whose premises are satisfied.
Rule-(1) does not yet have its premises satisfied, so it will not be added in the first iteration.
Rule-(4) is satisfied with the substitution {p/T1}, so Sells(Robert, T1, A) is added; it is inferred from the conjunction of Rules (2) and (3). The rule Missile(p) → Weapon(p) is likewise satisfied with {p/T1}, so Weapon(T1) is added.
Rule-(6) is satisfied with the substitution {p/A}, so Hostile(A) is added; it is inferred from Rule-(7).
Step-3:
At step 3, we can check that Rule-(1) is satisfied with the substitution {p/Robert, q/T1, r/A}, so we can add Criminal(Robert), which is inferred from all the available facts. Hence we have reached our goal statement.
B. Backward Chaining:
Backward chaining is also known as backward deduction or backward reasoning when using an inference engine. A backward-chaining algorithm is a form of reasoning which starts with the goal and works backward, chaining through rules to find known facts that support the goal.
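A matching Python sketch of backward chaining over the same propositionalized Horn rules (again, the first-order version would add unification):

# Backward chaining over propositional Horn rules: work from the goal back to known facts.

def backward_chain(rules, facts, goal, seen=None):
    seen = seen or set()
    if goal in facts:
        return True
    if goal in seen:                     # avoid looping on the same subgoal
        return False
    seen = seen | {goal}
    for premises, conclusion in rules:
        if conclusion == goal and all(backward_chain(rules, facts, p, seen)
                                      for p in premises):
            return True
    return False

# Same propositionalized crime example as in the forward-chaining sketch:
rules = [({"American", "Weapon", "Sells", "Hostile"}, "Criminal"),
         ({"Missile"}, "Weapon"),
         ({"Missile", "OwnsA"}, "Sells"),
         ({"EnemyOfAmerica"}, "Hostile")]
facts = {"American", "Missile", "OwnsA", "EnemyOfAmerica"}
print(backward_chain(rules, facts, "Criminal"))  # True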
Example:
In backward-chaining, we will use the same above example, and will rewrite all the rules.
Backward-Chaining proof:
In Backward chaining, we will start with our goal predicate, which is Criminal(Robert), and then infer further rules.
Step-1:
At the first step, we take the goal fact; from the goal fact we infer other facts, and at last we prove those facts true. Our goal fact is "Robert is a criminal," so Criminal(Robert) is the goal predicate.
Step-2:
At the second step, we infer other facts from the goal fact which satisfy the rules. As we can see in Rule-(1), the goal predicate Criminal(Robert) is present with the substitution {Robert/p}. So we will add all the conjunctive facts below the first level and replace p with Robert.
Step-4:
At step 4, we can infer the facts Missile(T1) and Owns(A, T1) from Sells(Robert, T1, r), which satisfies Rule-(4) with the substitution of A in place of r. So these two statements are proved here.
Step-5:
At step 5, we can infer the fact Enemy(A, America) from Hostile(A), which satisfies Rule-(6). Hence all the statements are proved true using backward chaining.
Following is the difference between the forward chaining and backward chaining:
o Forward chaining, as the name suggests, starts from the known facts and moves forward by applying inference rules to extract more data, and it continues until it reaches the goal, whereas backward chaining starts from the goal and moves backward by using inference rules to determine the facts that satisfy the goal.
o Forward chaining is called a data-driven inference technique, whereas backward chaining is called a goal-
driven inference technique.
o Forward chaining is known as a bottom-up approach, whereas backward chaining is known as a top-down approach.
o Forward chaining uses breadth-first search strategy, whereas backward chaining uses depth-first
search strategy.
o Forward and backward chaining both apply the Modus Ponens inference rule.
o Forward chaining can be used for tasks such as planning, design process monitoring, diagnosis, and
classification, whereas backward chaining can be used for classification and diagnosis tasks.
o Forward chaining can be like an exhaustive search, whereas backward chaining tries to avoid unnecessary paths of reasoning.
o In forward chaining there can be many ASK questions to the knowledge base, whereas in backward chaining there are fewer ASK questions.
o Forward chaining is slower, as it checks all the rules, whereas backward chaining is faster, as it checks only the few required rules.
o Forward chaining starts from known facts and applies inference rules to extract more data until it reaches the goal, whereas backward chaining starts from the goal and works backward through inference rules to find the required facts that support the goal.
o Forward chaining tests all the available rules, whereas backward chaining tests only the few required rules.
o Forward chaining is aimed at reaching any conclusion, whereas backward chaining is aimed only at the required data.
11. Resolution in FOL
Resolution is a theorem-proving technique that proceeds by building refutation proofs, i.e., proofs by contradiction. It was invented by the mathematician John Alan Robinson in 1965.
Resolution is used when various statements are given and we need to prove a conclusion from those statements. Unification is a key concept in proofs by resolution. Resolution is a single inference rule which can efficiently operate on sentences in conjunctive normal form or clausal form.
Clause: A disjunction of literals is called a clause. A clause consisting of a single literal is also known as a unit clause.
Conjunctive Normal Form: A sentence represented as a conjunction of clauses is said to be conjunctive normal
form or CNF.
The resolution rule for first-order logic is simply a lifted version of the propositional rule. Resolution can resolve two
clauses if they contain complementary literals, which are assumed to be standardized apart so that they share no
variables.
This rule is also called the binary resolution rule because it resolves exactly two literals.
Example:
where the two complementary literals are Loves(f(x), x) and ¬Loves(a, b).
These literals can be unified with the unifier θ = [a/f(x), b/x], and resolution will then generate the resolvent clause:
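Unification itself can be sketched in a few lines of Python, using a nested-tuple term representation (variables are lowercase strings, constants and functors are capitalized strings; the occurs-check is omitted for brevity). This is illustrative only, not an optimized implementation.

# Illustrative unification of two FOL terms.
# Variables are lowercase strings, constants/functors are capitalized strings,
# compound terms are (functor, arg1, arg2, ...) tuples.

def is_var(t):
    return isinstance(t, str) and t[:1].islower()

def unify(x, y, theta=None):
    # Return a substitution (dict) that unifies x and y, or None on failure.
    if theta is None:
        theta = {}
    if x == y:
        return theta
    if is_var(x):
        return unify_var(x, y, theta)
    if is_var(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):
            theta = unify(xi, yi, theta)
            if theta is None:
                return None
        return theta
    return None

def unify_var(var, term, theta):
    if var in theta:
        return unify(theta[var], term, theta)
    return {**theta, var: term}           # bind the variable (occurs-check omitted)

# Loves(f(x), x) unified with Loves(a, b):
print(unify(("Loves", ("f", "x"), "x"), ("Loves", "a", "b")))
# {'a': ('f', 'x'), 'x': 'b'}  i.e. a ↦ f(x) and x ↦ b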
To better understand all the above steps, we will take an example in which we will apply resolution.
Example:
a. John likes all kinds of food.
b. Apples and vegetables are food.
c. Anything anyone eats and is not killed by is food.
d. Anil eats peanuts and is still alive.
e. Harry eats everything that Anil eats.
Prove by resolution that:
f. John likes peanuts.
In the first step we will convert all the given statements into first-order logic.
In first-order logic resolution, it is required to convert the FOL statements into CNF, as the CNF form makes resolution proofs easier.
Note: Statements "food(Apple) Λ food(vegetables)" and "eats (Anil, Peanuts) Λ alive(Anil)" can be written in two separate statements.
In this step, we will apply negation to the conclusion statement, which will be written as ¬likes(John, Peanuts).
Now in this step, we will solve the problem using a resolution tree and substitution. For the above problem, it is given as follows:
Hence the negation of the conclusion has been shown to produce a complete contradiction with the given set of statements.