Inductive Logic Programming: Anoop & Hector
(for Dummies)
More problems:
• Extending the key notations
• Efficiency concerns
Initially: Binary Classification
Now: Classification, Regression, Clustering, Association Analysis
EDAM Reading Group © 2004 5
ILP?
• India Literacy Project?
• International Language Program?
• Individualized Learning Program?
• Instruction Level Parallelism?
• International Lithosphere Program?
E ∪ B → T (induce)
E (examples):
parent(mary,vinni).
parent(mary,andre).
parent(carrey,vinni).
parent(carrey,andre).
B (background knowledge):
mother(mary,vinni).
mother(mary,andre).
father(carrey,vinni).
father(carrey,andre).
T (induced theory):
parent(X,Y) :- mother(X,Y).
parent(X,Y) :- father(X,Y).
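As a rough check of the family example, the sketch below applies the two induced rules to the background facts and tests whether every parent/2 example is derived. The tuple encoding and the names B, E, T and derive are assumptions made for illustration, and the father's name is assumed to be spelled carrey throughout.

```python
# Background knowledge B: ground facts as (predicate, arg1, arg2) tuples.
B = {
    ("mother", "mary", "vinni"),
    ("mother", "mary", "andre"),
    ("father", "carrey", "vinni"),
    ("father", "carrey", "andre"),
}

# Positive examples E.
E = {
    ("parent", "mary", "vinni"),
    ("parent", "mary", "andre"),
    ("parent", "carrey", "vinni"),
    ("parent", "carrey", "andre"),
}

# Theory T: each pair (head, body) stands for the rule head(X,Y) :- body(X,Y).
T = [("parent", "mother"), ("parent", "father")]


def derive(theory, background):
    """Apply every rule once to the background facts and return the derived facts."""
    return {
        (head, x, y)
        for head, body in theory
        for pred, x, y in background
        if pred == body
    }


derived = derive(T, B)
covered = E <= derived  # every example follows from B together with T
print(covered)  # True
```

One forward pass is enough here because the rules are not recursive; a recursive theory would need the derivation repeated until no new facts appear.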
big_spender(C1,Age1,Income1,TotSpent1) ←
married_to(C1,C2) ∧
customer(C2,Age2,Income2,TotSpent2,BigSpender2) ∧
Income2 ≥10000
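To see how such a multi-relational clause is evaluated, here is a minimal sketch that checks the body of the big_spender clause against two small tables. The customer rows, the married_to pairs and the column order are made-up assumptions, not data from the source.

```python
# Hypothetical customer table: id -> (age, income, total_spent, big_spender).
customer = {
    "c1": (35, 8000, 1200, False),
    "c2": (37, 15000, 9000, True),
    "c3": (52, 9000, 300, False),
    "c4": (50, 7000, 450, False),
}

# Hypothetical married_to relation, stored in both directions.
married_to = {("c1", "c2"), ("c2", "c1"), ("c3", "c4"), ("c4", "c3")}


def big_spender(c1):
    """True if c1 is married to some customer C2 with Income2 >= 10000."""
    return any(
        a == c1 and b in customer and customer[b][1] >= 10000
        for a, b in married_to
    )


# Customers for which the clause body is satisfied.
predicted = sorted(c for c in customer if big_spender(c))
```

The point of the example is that the test on C1 depends on a row of another customer reached through married_to, which a single-table attribute-value learner cannot express directly.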
– Where p is the number of possible bindings that make the clause
cover positive examples, p is the number of positive examples
covered and n is the number of negative examples covered.
– Background knowledge (B) is limited to ground facts.
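Counts of this kind are the inputs to coverage-based clause heuristics. As one illustration, the sketch below follows the textbook form of FOIL's information gain, which may differ from the exact heuristic intended here. The parameter names t, p0, n0, p1 and n1 are assumptions: t is the number of positive bindings still covered after adding a literal, and (p0, n0) and (p1, n1) are the positive and negative counts before and after.

```python
import math


def foil_gain(t, p0, n0, p1, n1):
    """FOIL-style gain: t * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))."""
    if p1 == 0:
        # No positives covered after refinement: treated as no gain
        # (a convention assumed for this sketch).
        return 0.0
    before = math.log2(p0 / (p0 + n0))
    after = math.log2(p1 / (p1 + n1))
    return t * (after - before)


# A literal that keeps all 4 positive bindings and removes all 4 negative ones.
gain = foil_gain(t=4, p0=4, n0=4, p1=4, n1=0)
```

A literal scores well when it keeps many positive bindings while raising the proportion of positives among everything the clause covers.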
[Figure: scatter of positive (+) and negative (-) examples, labeled "First seed selected", "First rule learned" and "Second seed selected".]
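The sequence named in the figure labels (first seed selected, first rule learned, second seed selected) can be sketched as a covering loop. The example below is a deliberately simplified stand-in: examples are numbers on a line and a "rule" is an open interval, both of which are assumptions for illustration rather than the clause search an ILP system would run.

```python
def learn_rule(seed, negatives):
    """Return the widest open interval (lo, hi) around the seed with no negatives inside."""
    lower = [n for n in negatives if n < seed]
    upper = [n for n in negatives if n > seed]
    lo = max(lower) if lower else float("-inf")
    hi = min(upper) if upper else float("inf")
    return lo, hi


def covering(positives, negatives):
    """Pick a seed, learn a rule from it, drop the covered positives, repeat."""
    rules = []
    remaining = list(positives)
    while remaining:
        seed = remaining[0]  # next uncovered positive example
        lo, hi = learn_rule(seed, negatives)
        rules.append((lo, hi))
        # The seed always satisfies lo < seed < hi, so the loop terminates.
        remaining = [x for x in remaining if not (lo < x < hi)]
    return rules


positives = [2, 3, 4, 8, 9]
negatives = [0, 6, 11]
rules = covering(positives, negatives)
```

With these toy values the first seed yields one rule that covers the first cluster of positives, and a second seed is needed for the rest.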
From Last Time
• Why ILP is not just Decision Trees.
– Language is First-Order Logic
• Natural representation for multi-relational settings
• Thus, a natural representation for full databases
– Not restricted to the classification task.
– So then, what is ILP?
\[
\frac{g}{1-g} \cdot \sum_{i=1}^{n} (p_{0i} - p_i)^2
\]
\[
\frac{478/1378}{1 - 478/1378} \cdot \left[ \left( \frac{906}{1378} - \frac{334}{478} \right)^2 + \left( \frac{472}{1378} - \frac{144}{478} \right)^2 \right] \approx 0.00181
\]
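The arithmetic can be checked with a few lines of code. This sketch assumes the reading g / (1 - g) · Σ (p0i - pi)², with g taken as the fraction of examples in the selected group (478 of 1378), p0 as the class distribution over all examples and p as the distribution inside the group. The function name quality and these interpretations are assumptions.

```python
def quality(g, p0, p):
    """Evaluate g / (1 - g) * sum_i (p0[i] - p[i]) ** 2."""
    return g / (1.0 - g) * sum((a - b) ** 2 for a, b in zip(p0, p))


total = 1378    # all examples
subgroup = 478  # examples in the selected group

g = subgroup / total
p0 = [906 / total, 472 / total]        # distribution over all examples
p = [334 / subgroup, 144 / subgroup]   # distribution inside the group

q = quality(g, p0, p)
print(round(q, 5))
```

The factor g / (1 - g) rewards larger groups, while the sum rewards groups whose distribution departs from the overall one.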
Scaling in Data Mining
• Scaling to large datasets
– Increasing number of training examples
• Scaling to size of the examples
– Increasing number of ground facts in background knowledge