5.3-Inductive Logic Programming
Logical reasoning: deduction

B ∪ T ⊢ E

Background theory B:           Derived examples E:
mother(penelope,victoria).     parent(penelope,victoria).
mother(penelope,arthur).       parent(penelope,arthur).
father(christopher,victoria).  parent(christopher,victoria).
father(christopher,arthur).    parent(christopher,arthur).

Theory T:
parent(X,Y) :- father(X,Y).
parent(X,Y) :- mother(X,Y).
Logical reasoning: induction

B ∪ E ⊢ T  (induction: find a theory T such that B ∪ T ⊢ E)

Background theory B:           Examples E:
mother(penelope,victoria).     parent(penelope,victoria).
mother(penelope,arthur).       parent(penelope,arthur).
father(christopher,victoria).  parent(christopher,victoria).
father(christopher,arthur).    parent(christopher,arthur).

Induced theory T:
parent(X,Y) :- father(X,Y).
parent(X,Y) :- mother(X,Y).
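The deduction direction can be sketched in a few lines of Python (the set encoding of facts is an illustrative assumption, not part of the slides): applying the theory T to the background facts B reproduces exactly the example facts E.

```python
# Deduction: B ∪ T ⊢ E. Facts are encoded as (predicate, arg1, arg2) tuples.
B = {("mother", "penelope", "victoria"), ("mother", "penelope", "arthur"),
     ("father", "christopher", "victoria"), ("father", "christopher", "arthur")}

# T: parent(X,Y) :- father(X,Y).  and  parent(X,Y) :- mother(X,Y).
def apply_theory(facts):
    derived = set()
    for pred, x, y in facts:
        if pred in ("father", "mother"):
            derived.add(("parent", x, y))
    return derived

E = apply_theory(B)
print(sorted(E))  # the four parent/2 facts
```

Induction runs the other way: given B and E, recover a theory T with this property.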
Inductive Logic Programming
The goal of ILP is to find hypotheses, expressed as logic programming clauses, from a set of
positive and negative examples, in the presence of a domain theory that is expressed in the
same formalism.
Given:
• a set of positive examples E+ and negative examples E− expressed in an 'observation language' LE,
• a domain (background) theory B,
• a hypothesis language LH that delimits the clauses allowed in the hypothesis space H,
• a relation covers(e, H, B) which determines the classification of an example e with respect to H,
taking B into account.
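As an illustrative sketch (the unary-predicate clause encoding and the bird facts below are assumptions, not taken from this slide), covers(e, H, B) can be read as: e is covered if the head of H matches e and every body literal of H holds for e's constant in B.

```python
# Background facts as (predicate, constant) pairs -- assumed example data.
B = {("lays-eggs", "sparrow"), ("flies", "sparrow"),
     ("lays-eggs", "eagle"), ("flies", "eagle"), ("has-talons", "eagle"),
     ("lays-eggs", "penguin")}

def covers(example, clause, background):
    """covers(e, H, B) for clauses of the form head(X) :- b1(X), ..., bn(X)."""
    head, const = example
    head_pred, body = clause
    return head == head_pred and all((b, const) in background for b in body)

H = ("bird", ["lays-eggs", "flies"])      # bird(X) :- lays-eggs(X), flies(X).
print(covers(("bird", "sparrow"), H, B))  # True
print(covers(("bird", "penguin"), H, B))  # False: penguin does not fly in B
```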
• Convenient handling of structured data and multiple relations, as needed e.g. for
descriptions of chemical molecules.
• Some ILP systems can also invent new predicates.

Train  Car          Car  Length  Shape      Axes  Roof    …
t1     c11          c11  short   rectangle  2     none    …
t1     c12          c12  long    rectangle  3     none    …
t1     c13          c13  short   rectangle  2     peaked  …
t1     c14          c14  long    rectangle  2     none    …
t2     c21
In search algorithms, the notions of generalization and specialization are incorporated using inductive and deductive
inference rules:
• A deductive inference rule r that maps a conjunction of clauses G onto a conjunction of clauses S such that G ⊧ S is
called a specialization rule.
- the typical ILP specialization rule is θ-subsumption
- the search space is a lattice under θ-subsumption
- there exists a least upper bound (lub) and a greatest lower bound (glb) for every pair of clauses
• An inductive inference rule r that maps a conjunction of clauses S onto a conjunction of clauses G such that G ⊧ S is
called a generalization rule.
- the typical ILP rules are inverse resolution and inverse entailment
- Bottom-up approaches establish the least general generalization (lgg) of the positive examples.
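The lgg mentioned above can be sketched for flat atoms (Plotkin's construction restricted to two atoms with the same predicate symbol; the tuple encoding and variable names are assumptions): equal arguments are kept, and each differing pair of terms is consistently replaced by the same fresh variable.

```python
def lgg(atom1, atom2):
    """Least general generalization of two flat atoms with the same predicate."""
    pred1, *args1 = atom1
    pred2, *args2 = atom2
    assert pred1 == pred2 and len(args1) == len(args2)
    mapping = {}          # maps a pair of differing terms to one shared variable
    out = [pred1]
    for a, b in zip(args1, args2):
        if a == b:
            out.append(a)
        else:
            out.append(mapping.setdefault((a, b), f"X{len(mapping)}"))
    return tuple(out)

print(lgg(("parent", "penelope", "victoria"), ("parent", "penelope", "arthur")))
# ('parent', 'penelope', 'X0')
```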
The algorithm:
• keeps track of a queue of candidate hypotheses QH
• repeatedly
1. deletes a hypothesis H from the queue
2. expands H using some generalization or specialization rules
3. the resulting hypotheses are then added to the queue of hypotheses QH
4. QH is pruned to discard unpromising hypotheses from further consideration
• continues until the stop-criterion is satisfied.
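The four steps above can be sketched as a generic search loop; expand, is_good and prune are abstract parameters standing in for the generalization/specialization rules, the stop-criterion and the pruning strategy:

```python
from collections import deque

def ilp_search(initial, expand, is_good, prune):
    QH = deque(initial)            # queue of candidate hypotheses
    while QH:                      # until the stop-criterion is satisfied
        H = QH.popleft()           # 1. delete a hypothesis H from the queue
        if is_good(H):             #    stop-criterion: correct hypothesis found
            return H
        QH.extend(expand(H))       # 2.+3. generalize/specialize H, enqueue results
        QH = deque(prune(QH))      # 4. discard unpromising hypotheses
    return None
```

With popleft (FIFO) this realizes breadth-first search; using pop (LIFO) instead gives depth-first. As a toy run, `ilp_search([0], lambda h: [h + 1], lambda h: h == 3, lambda q: q)` walks 0, 1, 2 and returns 3.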
ILP Algorithm – Elements
• Initialize the queue of hypotheses QH (e.g. QH = {true} in top-down search, QH = the examples
in bottom-up search).
• Delete influences the search strategy: one can realize a depth-first (Delete = LIFO), a breadth-first
(Delete = FIFO) or a best-first algorithm.
• Prune determines which candidate hypotheses are discarded from the queue. Delete and Prune
together implement the search strategy (e.g. for hill-climbing: keep only the best hypothesis).
• The stop-criterion states the conditions under which the algorithm stops, i.e. when the current set of
hypotheses is already good enough (e.g. when it contains a correct hypothesis).
Pruning the search space
Generalization and specialization form the basis for pruning the search space; this is
because:
• When B ∪ H ⊭ e, where e ∈ E+, B is the background theory, H is the hypothesis, then the
specializations H’ of H will not imply the evidence and can be pruned.
• When B ∪ H ∪ {e} ⊧ □, where e ∈ E-, B is the background theory, H is the hypothesis, then
all generalizations H’ of H will also be inconsistent with B ∪ E and can be pruned.
Some further ILP terminology
An ILP algorithm has the goal of forming correct hypotheses, in the sense that they satisfy the
following requirements:
• "Sufficiency" – the hypotheses explain all positive examples.
• "Necessity" – the positive examples are not explainable without the hypotheses.
• "Weak consistency" – the hypotheses do not contradict any element of the domain theory.
• "Strong consistency" – the hypotheses do not contradict the negative examples, i.e. they cover
no negative example.
Generalization through Inverse Resolution

Domain knowledge: male(adam). male(kain). male(abdullah). male(muhammad). male(moses).
                  parent(adam,kain). parent(eve,kain). parent(abdullah,muhammad).
Example: father(adam,kain).
Hypothesis: father(X,Y) :- parent(X,Y), male(X).

[Figure: a resolution proof deriving the example from the hypothesis and the domain
knowledge, and the inverse resolution steps that reconstruct the hypothesis from the
example and the domain knowledge.]
Θ-subsumption in predicate logic
A clause G subsumes a clause S if and only if there is a substitution θ such that Gθ ⊆ S;
for single atoms this reduces to Gθ = S.
e.g. p(X,Y,X) subsumes p(a,Y,a), and p(f(X),Y) subsumes p(f(a),Y);
p(X,Y,Z) subsumes p(a,b,c) under θ = {X/a, Y/b, Z/c}.
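For flat atoms (no function symbols), the subsumption test can be sketched as a consistency check on variable bindings; the string encoding (uppercase = variable, lowercase = constant) is an illustrative assumption:

```python
def subsumes(G, S):
    """Return a substitution theta with G@theta == S, or None if none exists."""
    if G[0] != S[0] or len(G) != len(S):   # same predicate, same arity
        return None
    theta = {}
    for g, s in zip(G[1:], S[1:]):
        if g[0].isupper():                 # variable: bind consistently
            if theta.setdefault(g, s) != s:
                return None
        elif g != s:                       # constant: must match exactly
            return None
    return theta

print(subsumes(("p", "X", "Y", "X"), ("p", "a", "Y", "a")))  # {'X': 'a', 'Y': 'Y'}
print(subsumes(("p", "X", "X"), ("p", "a", "b")))            # None
```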
Example

Positive examples: bird(penguin). bird(sparrow). bird(eagle).
Background knowledge: facts over the predicates lays-eggs/1, flies/1 and has-talons/1.
Hypotheses: bird(X).

Bottom-up search, typically based on inverse resolution, starts from the positive
examples and generalizes upwards through the lattice:

  bird(X) :- true.
  …
  bird(X) :- lays-eggs(X), flies(X).   bird(X) :- lays-eggs(X), has-talons(X).   bird(X) :- flies(X), has-talons(X).
  …
  bird(penguin).   bird(sparrow).   bird(eagle).

Top-down search, typically based on θ-subsumption, starts from the most general
clause and specializes downwards through the same lattice:

  bird(X) :- true.
  …
  bird(X) :- lays-eggs(X), flies(X).   bird(X) :- lays-eggs(X), has-talons(X).   bird(X) :- flies(X), has-talons(X).
  …
  bird(penguin).   bird(sparrow).   bird(eagle).
  false.
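One top-down refinement step of this lattice can be sketched as adding a single body literal to a clause (the list encoding of clause bodies is an assumption):

```python
PREDICATES = ["lays-eggs", "flies", "has-talons"]

def specialize(body):
    """All clauses obtained by adding one new literal to the body."""
    return [sorted(body + [p]) for p in PREDICATES if p not in body]

def show(body):
    return "bird(X) :- " + (", ".join(f"{p}(X)" for p in body) or "true") + "."

for clause in specialize([]):          # refine the top clause bird(X) :- true.
    print(show(clause))
```

Applying specialize twice from the empty body enumerates exactly the two-literal clauses shown in the lattice above.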
A sample of ILP systems
FOIL (Quinlan and Cameron-Jones, 1993): top-down hill-climbing search guided by a weighted
information-gain heuristic,
• where p⊕ is the number of possible bindings that make the clause cover positive examples,
p is the number of positive examples covered, and n is the number of negative examples covered.
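The gain formula itself did not survive in the slide; the sketch below uses the standard FOIL weighted information gain (an assumption, not taken from this slide), which scores a refinement from a clause covering p0 positive and n0 negative bindings to one covering p1 and n1:

```python
from math import log2

def foil_gain(p0, n0, p1, n1):
    """Gain = t * (log2(p1/(p1+n1)) - log2(p0/(p0+n0))).
    t is the number of positive bindings of the old clause still covered by
    the refined clause; this sketch takes t = p1."""
    t = p1
    return t * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))
```

A positive gain means the added literal makes the covered bindings purer in positives; hill-climbing picks the literal with the highest gain.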
Reinforcement Learning