KR 2020
17th International Conference on Principles of
Knowledge Representation and Reasoning
18th INTERNATIONAL WORKSHOP ON
NON-MONOTONIC REASONING
NMR 2020
Workshop Notes
Maria Vanina Martínez
Universidad de Buenos Aires and CONICET, Argentina
Ivan Varzinczak
CRIL, Univ. Artois & CNRS, France
Preface
NMR is the premier forum for results in the area of non-monotonic reasoning. Its aim
is to bring together active researchers in this broad field within knowledge representation and reasoning (KR), including belief revision, uncertain reasoning, reasoning
about actions, planning, logic programming, preferences, argumentation, causality,
and many other related topics including systems and applications.
NMR has a long history — it started in 1984, and has been held every two years
since then. The present edition is the 18th workshop in the series and it aims at
fostering connections between the different subareas of non-monotonic reasoning and
providing a forum for emerging topics.
This volume contains the papers accepted for presentation at NMR 2020, the 18th
International Workshop on Non-Monotonic Reasoning, held virtually on September
12-14, 2020, and collocated with the 17th International Conference on Principles of
Knowledge Representation and Reasoning (KR 2020). There were 26 submissions,
each of which was reviewed by two program-committee members. The committee decided to accept all 26 papers. The program also includes two invited talks,
by Francesca Toni (Imperial College London) and Andreas Herzig (IRIT-CNRS,
Toulouse). The latter was part of a joint session with the workshop on Description
Logics (DL 2020).
12 September 2020
Buenos Aires and Lens
Maria Vanina Martínez
Ivan Varzinczak
Contents

Counting with Bounded Treewidth: Meta Algorithm and Runtime Guarantees
J.K. Fichte and M. Hecher . . . . . . . . . . . . . . . . . . . . . . . . 9

Paraconsistent Logics for Knowledge Representation and Reasoning: advances and perspectives
W. Carnielli and R. Testa . . . . . . . . . . . . . . . . . . . . . . . . 19

Towards Interactive Conflict Resolution in ASP Programs
A. Thevapalan and G. Kern-Isberner . . . . . . . . . . . . . . . . . . 29

Towards Conditional Inference under Disjunctive Rationality
R. Booth and I. Varzinczak . . . . . . . . . . . . . . . . . . . . . . . 37

Treewidth-Aware Complexity in ASP: Not all Positive Cycles are Equally Hard
J. Fandinno and M. Hecher . . . . . . . . . . . . . . . . . . . . . . . 48

Towards Lightweight Completion Formulas for Lazy Grounding in Answer Set Programming
B. Bogaerts, S. Marynissen, and A. Weinzierl . . . . . . . . . . . . . 58

Splitting a Logic Program Efficiently
R. Ben-Eliyahu-Zohary . . . . . . . . . . . . . . . . . . . . . . . . . . 67

Interpreting Conditionals in Argumentative Environments
J. Heyninck, G. Kern-Isberner, M. Thimm, and K. Skiba . . . . . . . 73

Inductive Reasoning with Difference-making Conditionals
M. Sezgin, G. Kern-Isberner, and H. Rott . . . . . . . . . . . . . . . 83

Stability in Abstract Argumentation
J.-G. Mailly and J. Rossit . . . . . . . . . . . . . . . . . . . . . . . . 93

Weak Admissibility is PSPACE-complete
W. Dvořák, M. Ulbricht, and S. Woltran . . . . . . . . . . . . . . . . 100

Cautious Monotonicity in Case-Based Reasoning with Abstract Argumentation
G. Paulino-Passos and F. Toni . . . . . . . . . . . . . . . . . . . . . 110

A Preference-Based Approach for Representing Defaults in First-Order Logic
J. Delgrande and C. Rantsoudis . . . . . . . . . . . . . . . . . . . . . 120

Probabilistic Belief Fusion at Maximum Entropy by First-Order Embedding
M. Wilhelm and G. Kern-Isberner . . . . . . . . . . . . . . . . . . . . 130

Stratified disjunctive logic programs and the infinite-valued semantics
P. Rondogiannis and I. Symeonidou . . . . . . . . . . . . . . . . . . . 140

Information Revision: The Joint Revision of Belief and Trust
A. Yasser and H. Ismail . . . . . . . . . . . . . . . . . . . . . . . . . 150

Algebraic Foundations for Non-Monotonic Practical Reasoning
N. Ehab and H. Ismail . . . . . . . . . . . . . . . . . . . . . . . . . . 160

BKLM - An expressive logic for defeasible reasoning
G. Casini, T. Meyer, and G. Paterson-Jones . . . . . . . . . . . . . . 170

Towards Efficient Reasoning with Intensional Concepts
J. Heyninck, R. Gonçalves, M. Knorr, and J. Leite . . . . . . . . . . 179

Obfuscating Knowledge in Modular Answer Set Programming
R. Gonçalves, T. Janhunen, M. Knorr, J. Leite, and S. Woltran . . . 189

A framework for a modular multi-concept lexicographic closure semantics
L. Giordano and D. Theseider Dupré . . . . . . . . . . . . . . . . . . 198

An Approximate Model Counter for Answer Set Programming
F. Everardo, M. Hecher, and A. Shukla . . . . . . . . . . . . . . . . . 208

A Survey on Multiple Revision
F. Resina and R. Wassermann . . . . . . . . . . . . . . . . . . . . . . 217

A Principle-based Approach to Bipolar Argumentation
L. Yu and L. van der Torre . . . . . . . . . . . . . . . . . . . . . . . 227

Discursive Input/Output Logic: Deontic Modals, Norms, and Semantic Unification
A. Farjami . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236

Kratzer Style Deontic Logics in Formal Argumentation
H. Dong, B. Liao, and L. van der Torre . . . . . . . . . . . . . . . . 246
Program Committee
Ofer Arieli
Academic College of Tel-Aviv, Israel
Christoph Beierle
FernUniversitaet Hagen, Germany
Alexander Bochman
Holon Institute of Technology, Israel
Richard Booth
Cardiff University, United Kingdom
Arina Britz
Stellenbosch University, South Africa
Giovanni Casini
Université du Luxembourg
James Delgrande
Simon Fraser University, Canada
Juergen Dix
Clausthal University of Technology, Germany
Wolfgang Faber
Alpen-Adria-Universität Klagenfurt, Austria
Jorge Fandinno
Potsdam University, Germany
Bettina Fazzinga
Advanced Analytics on Complex Data - ICAR CNR, Italy
Eduardo Fermé
Universidade da Madeira, Portugal
Martin Gebser
University of Potsdam, Germany
Laura Giordano
University of Piemonte Orientale, Italy
Lluis Godo Lacasa
IIIA - CSIC, Spain
Andreas Herzig
IRIT-CNRS, France
Aaron Hunter
British Columbia Institute of Technology, Canada
Anthony Hunter
University College London, United Kingdom
Katsumi Inoue
National Institute of Informatics, Japan
Tomi Janhunen
Aalto University, Finland
Souhila Kaci
Université Montpellier 2, France
Antonis Kakas
University of Cyprus
Gabriele Kern-Isberner
Technische Universitaet Dortmund, Germany
Sébastien Konieczny
CRIL-CNRS, France
Thomas Lukasiewicz
University of Oxford, United Kingdom
Marco Maratea
DIBRIS, University of Genova, Italy
Thomas Meyer
University of Cape Town, South Africa
Nir Oren
University of Aberdeen, United Kingdom
Odile Papini
Aix-Marseille Université, France
Xavier Parent
Université du Luxembourg
Ramon Pino Perez
Universidad de Los Andes, Venezuela
Laurent Perrussel
Université de Toulouse, France
Ricardo O. Rodriguez
Universidad de Buenos Aires, Argentina
Ken Satoh
National Institute of Informatics and Sokendai, Japan
Gerardo Simari
Universidad Nacional del Sur and CONICET, Argentina
Guillermo R. Simari
Universidad del Sur in Bahia Blanca, Argentina
Christian Straßer
Ruhr-Universitaet Bochum, Germany
Matthias Thimm
Universität Koblenz-Landau, Germany
Leon van der Torre
Université du Luxembourg
Renata Wassermann
Universidade de São Paulo, Brazil
Emil Weydert
Université du Luxembourg
Stefan Woltran
Vienna University of Technology, Austria
Additional Reviewers
Flavio Everardo
University of Potsdam, Germany
Pedro Cabalar
Corunna University, Spain
Igor Câmara
Universidade de São Paulo, Brazil
Counting with Bounded Treewidth: Meta Algorithm and Runtime Guarantees∗
Johannes K. Fichte¹, Markus Hecher²
¹Faculty of Computer Science, TU Dresden, 01062 Dresden, Germany
²Institute of Logic and Computation, TU Wien, Favoritenstraße 9-11, 1040 Wien, Austria
johannes.fichte@tu-dresden.de, hecher@dbai.tuwien.ac.at
∗This work has been supported by the Austrian Science Fund (FWF), Grants P32830 and Y698, and the Vienna Science and Technology Fund, Grant WWTF ICT19-065. Markus Hecher is also affiliated with the University of Potsdam, Germany.

Abstract

In this paper, we present a meta result to construct algorithms for solution counting in various formalisms in knowledge representation and reasoning (KRR). Our meta algorithm exploits small treewidth of the input instance, which yields polynomial-time solvability in the input size for instances of bounded treewidth for various decision problems in graph theory, reasoning, and logic. For many problems, there are explicit dynamic programming algorithms, or results based on the well-known theorem by Courcelle, that allow one to decide a problem in time linear in the input size and some function of the treewidth. We follow this line of research but consider a much more elaborate question: counting and projected solution counting (PSC). PSC is a natural generalization of counting all solutions in which multiple indistinguishable solutions are considered as one single solution. Our meta result allows us to extend already existing dynamic programming (DP) algorithms while introducing only a single-exponential blowup in the runtime on top of the existing DP algorithm. The technique is widely applicable to problems in KRR. As an example, we present an application to projected solution counting on QBFs, which often also serves as a canonical counting problem for the polynomial hierarchy. Finally, we present a list of problems to which our result applies and for which the single-exponential blowup caused by the approach cannot be avoided under the ETH (exponential time hypothesis). This completes the picture of recently obtained results in argumentation, answer set programming, and epistemic logic programming.

Introduction

Counting solutions is a well-known task in mathematics, computer science, and other areas (Domshlak and Hoffmann 2007; Gomes, Sabharwal, and Selman 2009; Sang, Beame, and Kautz 2005). For instance, in mathematical combinatorics one characterizes the number of solutions to combinatorial problems by means of mathematical expressions, e.g., generating functions (Doubilet, Rota, and Stanley 1972). Other examples are applications to machine learning and probabilistic inference (Chavira and Darwiche 2008). The computational complexity of counting has been studied since the late 70s (Durand, Hermann, and Kolaitis 2005; Hemaspaandra and Vollmer 1995; Valiant 1979). Unsurprisingly, counting is at least as hard as solving the corresponding decision problem, because one can trivially solve the decision problem by counting and checking whether the count differs from zero (Hemaspaandra and Vollmer 1995).

While it sometimes suffices to count the number of solutions, many applications employ combinatorial solvers in practice by encoding the application into ASP, SAT, QBF, or ILP (Gaggl et al. 2015; Dueñas-Osorio et al. 2017). There, we often need auxiliary constructions (variables) in the encodings that are not necessarily in a functional dependency. If we are interested in the solutions with respect to certain variables, the standard concept is projection, which is used extensively in the area of databases (Abiteboul, Hull, and Vianu 1995) as well as in declarative problem specifications (Dueñas-Osorio et al. 2017; Gebser, Kaufmann, and Schaub 2009). Projected solution counting (PSC) then asks for the number of solutions after restricting each solution to the parts of interest (the projection set). In other words, multiple solutions that are identical with respect to the projection set count as a single projected solution. Recently, there has been growing interest in PSC, as witnessed by a variety of results in areas such as logic (Aziz 2015; Aziz et al. 2015; Capelli and Mengel 2019; Fichte et al. 2018; Lagniez and Marquis 2019; Sharma et al. 2019), reliability estimation (Dueñas-Osorio et al. 2017), answer set programming (Fichte and Hecher 2019), and argumentation (Fichte, Hecher, and Meier 2019). Interestingly, projected solution counting is often harder than the corresponding counting problem, in contrast to decision problems, where projecting the solutions to a projection set obviously does not change the complexity of the decision problem.

To deal with the high computational complexity and to design solving algorithms assuming that the input instance has a certain structure, ideas from parameterized algorithmics have proved valuable (Cygan et al. 2015). In particular, treewidth (Bodlaender and Kloks 1996) was successfully applied to solution counting for a range of problems (Curticapean 2018; Fichte et al. 2017; Fioretto et al. 2018; Kangas, Koivisto, and Salonen 2019; Pichler, Rümmele, and Woltran 2010; Samer and Szeider 2010). Some recent results also address projected solution counting when parameterized by treewidth (Capelli and Mengel 2019; Fichte et al. 2018; Fichte, Hecher, and Meier 2019).
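The observation that counting subsumes deciding can be made concrete with a toy brute-force model counter (our own illustrative sketch with hypothetical names; the clause representation is an assumption, not the paper's):

```python
from itertools import product

def count_models(clauses, variables):
    """Count assignments satisfying a CNF by brute force.
    A clause is a set of literals; a literal is (variable, polarity)."""
    count = 0
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        # A clause is satisfied if some literal agrees with the assignment.
        if all(any(assignment[v] == pol for (v, pol) in cl) for cl in clauses):
            count += 1
    return count

# CNF (a or b) and (not a or b): its models are {b} and {a, b}.
clauses = [{("a", True), ("b", True)}, {("a", False), ("b", True)}]
n = count_models(clauses, ["a", "b"])
print(n)       # 2
print(n > 0)   # the decision problem answered via the count
```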
A definability-based approach for problems that can be encoded into monadic second-order logic has also been considered, e.g., (Arnborg, Lagergren, and Seese 1991). Still, a generic approach to facilitate the development of algorithms for counting problems of bounded treewidth is missing. We address this research question and present a meta algorithm for solving PSC by utilizing small treewidth of the Gaifman graph (Gaifman 1982). It works for various graph problems and for problems in logic and reasoning, including problems located higher on the polynomial hierarchy, such as QBFs. Our meta algorithm allows for extending existing dynamic programming (DP) algorithms while causing only a single-exponential blowup in the treewidth on top of the existing DP. In fact, if we consider all solutions as distinguishable by taking an unrestricted projection set, projected counting simplifies to simple counting. Hence, our results immediately apply to simple counting.
Contributions. We give the following contributions.

1. We establish a novel meta approach to solve PSC for various problems. We simply assume that the input is given in terms of a finite structure, for which a dynamic programming (DP) algorithm for computing the solutions to the considered problem exists, and build a generic algorithm on top of the DP that solves PSC.

2. Since not every DP algorithm can be used to also solve PSC, we provide sufficient conditions under which a DP algorithm can be used in our framework for PSC.

3. For various PSC problems, we list complexity upper bounds that can be obtained from our framework, which complements the recently established lower bounds (Fichte, Hecher, and Pfandler 2020) for treewidth when assuming the ETH (exponential time hypothesis). As a running example, we illustrate the applicability of our framework on PSC for quantified Boolean formulas (QBFs), which spans the canonical counting problems #ΣℓQSAT and #ΠℓQSAT (Durand, Hermann, and Kolaitis 2005) of the polynomial counting hierarchy.

Related Work. Gebser, Kaufmann, and Schaub (2009) considered projected solution enumeration for conflict-driven solvers based on clause learning. Aziz (2015) introduced techniques to modify modern solvers for logic programming in order to count projected solutions. Recently, Fichte et al. (2018) gave DP algorithms for PSC in SAT and showed lower bounds under the ETH. This algorithm was then extended to related formalisms (Fichte and Hecher 2019; Fichte, Hecher, and Meier 2019). Our algorithm also traverses a tree decomposition multiple times and runs in linear time, while being single-exponential in the maximum number of records computed by the DP algorithm. However, we generalize these results by (i) providing a general framework to solve PSC, (ii) generalizing the PSC algorithm such that it can take a DP algorithm as input to solve various problems, and (iii) establishing necessary conditions for DP algorithms to be employed in our framework. For implementations of decision and counting problems on QBFs, one could adapt existing DP algorithms (Chen 2004) or use alternative approaches based on knowledge compilation (Charwat and Woltran 2019; Capelli and Mengel 2019).

Preliminaries

Basics and Computational Complexity. We assume familiarity with standard notions in computational complexity and use counting complexity classes as defined by Durand, Hermann, and Kolaitis (2005). For parameterized complexity, we refer to standard texts (Cygan et al. 2015). Let n ∈ ℕ be a natural number (including zero); then [n] := {1, . . . , n}. Further, for all ℓ ∈ ℕ, we define tower : ℕ × ℕ → ℕ by tower(1, n) := 2^n and tower(ℓ + 1, n) := 2^tower(ℓ,n). Given a family of finite sets X1, X2, . . . , Xn, the generalized combinatorial inclusion–exclusion principle (Graham, Grötschel, and Lovász 1995) states that the number of elements in the union is |⋃_{j=1}^{n} Xj| = Σ_{∅ ≠ I ⊆ {1,...,n}} (−1)^{|I|−1} · |⋂_{i∈I} Xi|. For a set X, let 2^X be the power set of X consisting of all subsets Y with ∅ ⊆ Y ⊆ X. Let ~s be a sequence of elements of X. When we address the i-th element of the sequence ~s for a given positive integer i, we write ~s(i). Similarly, for a set U of sequences, we let U(i) := {~s(i) | ~s ∈ U}.

Quantified Boolean Formulas (QBFs). We assume familiarity with notation and problems for quantified Boolean formulas (QBFs), their evaluation, and satisfiability (Biere et al. 2009). Literals are variables or their negations. For a Boolean formula F, we denote by var(F) the set of variables of F. A term is a conjunction of literals and a clause is a disjunction of literals. F is in conjunctive normal form (CNF) if F is a conjunction of clauses. We identify F with its set of sets of literals. From now on, assume that a Boolean formula F is in CNF and that each set in F has at most three literals. Let ℓ ≥ 0 be an integer. A quantified Boolean formula Q (Biere et al. 2009) of quantifier depth ℓ is of the form Q1V1.Q2V2. · · · QℓVℓ.F, where Qi ∈ {∀, ∃} for 1 ≤ i ≤ ℓ and Qj ≠ Qj+1 for 1 ≤ j ≤ ℓ − 1. Further, the sets Vi are disjoint, non-empty sets of Boolean variables, and F is a Boolean formula such that ⋃_{i=1}^{ℓ} Vi = var(F). We let mat(Q) := F be the matrix of Q. An assignment is a mapping ι : X → {0, 1} defined for a set X of variables. Sometimes we compactly denote assignments by {x | x ∈ X, ι(x) = 1}, i.e., the set of variables that are set to true. Given a QBF Q and an assignment ι, Q[ι] is the QBF obtained from Q, where every occurrence of any x ∈ X in mat(Q) is replaced by ι(x), and variables that do not occur in the result are removed from the preceding quantifiers accordingly. QBF Q evaluates to true (or is valid) if ℓ = 0 and the Boolean formula mat(Q) evaluates to true, denoted |= mat(Q). Otherwise (ℓ ≠ 0), we distinguish according to Q1. If Q1 = ∃, then Q evaluates to true if and only if there exists an assignment ι : V1 → {0, 1} such that Q[ι] evaluates to true. If Q1 = ∀, then Q evaluates to true if for every assignment ι : V1 → {0, 1}, Q[ι] evaluates to true. Deciding the validity of a given QBF is PSPACE-complete and (believed to be) harder than SAT (Stockmeyer and Meyer 1973).
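The inclusion–exclusion principle recalled in the preliminaries can be checked directly on small families of sets. A minimal sketch (our own helper name, not notation from the paper):

```python
from itertools import combinations

def union_size_by_inclusion_exclusion(sets):
    """|X1 ∪ ... ∪ Xn| = Σ over nonempty I ⊆ [n] of (-1)^(|I|-1) |∩_{i∈I} X_i|."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            # Size of the intersection over the subfamily indexed by idx,
            # added or subtracted depending on the parity of |idx|.
            inter = set.intersection(*(sets[i] for i in idx))
            total += (-1) ** (k - 1) * len(inter)
    return total

X = [{1, 2, 3}, {2, 3, 4}, {3, 5}]
print(union_size_by_inclusion_exclusion(X))  # 5
print(len(set().union(*X)))                  # 5, the direct computation
```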
Example 1. Let F := {c1, c2, c3, c4}, where c1 = ¬a ∨ b, c2 = c ∨ ¬e, c3 = ¬b ∨ ¬d ∨ e, c4 = b ∨ d ∨ e, and Q := ∃c, d, e.∀a.∃b.F. Take the assignment ι : {c, d, e} → {0, 1} with ι := {c, e}. Then, formula Q[ι] evaluates to true, because for any assignment ι′ : {a} → {0, 1} there is ι″ : {b} → {0, 1} with ι″ := {b} such that ((F[ι])[ι′])[ι″] evaluates to true. Similarly, for ζ : {c, d, e} → {0, 1} with ζ := ∅, formula Q[ζ] evaluates to true. In total, there are only four assignments over domain {c, d, e} witnessing the validity of Q, namely ι, ζ, and the assignments {c} and {c, d, e}.
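The four witnessing assignments of Example 1 can be verified by brute force (our own illustrative script, not the paper's dynamic programming algorithm; the clause encoding is an assumption):

```python
from itertools import product

# Clauses of F from Example 1 as (positive variables, negated variables):
clauses = [({"b"}, {"a"}),             # c1 = ¬a ∨ b
           ({"c"}, {"e"}),             # c2 = c ∨ ¬e
           ({"e"}, {"b", "d"}),        # c3 = ¬b ∨ ¬d ∨ e
           ({"b", "d", "e"}, set())]   # c4 = b ∨ d ∨ e

def sat(true_vars):
    """F holds if each clause has a true positive or a false negated variable."""
    return all((pos & true_vars) or (neg - true_vars) for pos, neg in clauses)

# Q = ∃c,d,e. ∀a. ∃b. F: collect all ι over {c,d,e} with ∀a ∃b: F
witnesses = []
for bits in product([0, 1], repeat=3):
    iota = {v for v, bit in zip("cde", bits) if bit}
    if all(any(sat(iota | ({"a"} if a else set()) | ({"b"} if b else set()))
               for b in (0, 1))
           for a in (0, 1)):
        witnesses.append(iota)

# The four witnessing assignments: ∅, {c}, {c,e}, {c,d,e}
print(sorted(map(sorted, witnesses)))
```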
Finite Structures and Projected Solution Counting. A vocabulary σ is a set of relation symbols, where each symbol Ṙ ∈ σ is of arity ar(Ṙ) ≥ 0. Let D be a finite set of elements, referred to as the domain. By a relation, we mean a set R ⊆ D^ar(Ṙ). A finite σ-structure I = ⟨D, (R)_Ṙ∈σ⟩ consists of a domain D and a relation for every symbol Ṙ in σ. We denote by algs(σ) the set of finite σ-structures and refer by ‖σ‖ to the size of σ. Given finite σ-structures I = ⟨D, R⟩ and I′ = ⟨D′, R′⟩ and a vocabulary ξ ⊆ σ. In order to access relation R for symbol Ṙ ∈ σ, we let R_Ṙ := R. Then, the structure Iξ restricted to ξ is the structure that consists of the relation symbols from I that occur in ξ, i.e., Iξ := ⟨D, (R)_Ṙ∈ξ⟩. Further, we define the intersection I ⊓ I′ of both σ-structures as the structure that consists of an intersection over each relation, i.e., I ⊓ I′ := ⟨D ∩ D′, (R_Ṙ ∩ R′_Ṙ)_Ṙ∈σ⟩.

Assume some integer ℓ ≥ 1. For QBFs, we define the primal vocabulary by σQBF := {DEPTH, FORALL, EXISTS, NEG, POS, INCLAUSE}, containing binary relation symbols only. Given a QBF Q = Q1V1.Q2V2. · · · QℓVℓ.F, the σQBF-structure Q of Q is given by Q := ⟨var(F), (R)_Ṙ∈σQBF⟩, where DEPTH = {ℓ}, FORALL = {(v, i) | Qi = ∀, v ∈ Vi}, EXISTS = {(v, i) | Qi = ∃, v ∈ Vi}, NEG = {(v, c) | c ∈ F, ¬v ∈ c}, POS = {(v, c) | c ∈ F, v ∈ c}, and INCLAUSE = {(u, v) | c ∈ F, {u, v} ⊆ var(c)}. Note that in the definition above we place constant symbols for integers 0 ≤ i ≤ ℓ, which we need below when decomposing the input. Instead of the constants that occur in tuples, we could also use multiple relation symbols of the form DEPTHℓ, FORALLi, and EXISTSi. Since this results only in a vacuous overhead, we treat them as given above.

We say that Q is the corresponding QBF of Q, and we sometimes use Q instead of Q for brevity. Let ι be an assignment for var(F). Then, we define the solution vocabulary ξT := {Ṫ} of arity 1, and the ξT-structure is given by ⟨ι, (ι)⟩.

Example 2. Consider QBF Q from Example 1. Then, we construct the σQBF-structure from Q as Q = ⟨mat(Q) ∪ var(mat(Q)), (R)_Ṙ∈σQBF⟩, where DEPTH = {3}, EXISTS = {(c, 1), (d, 1), (e, 1), (b, 3)}, FORALL = {(a, 2)}, NEG = {(a, c1), (e, c2), (b, c3), (d, c3)}, POS = {(b, c1), (c, c2), (e, c3), (b, c4), (d, c4), (e, c4)}. Observe that Q is the corresponding QBF of Q. Further, assignment ι of Example 1 is represented by the ξT-structure ⟨{c, e}, ({c, e})⟩.

Similarly to algorithms and specifications that use logic for verification (Gurevich 1995), as well as in descriptive complexity, we define problems in a very general way using finite structures. This then allows us to use these problems for projected solution counting. Formally, a problem (specification) P = ⟨σ, ξ, sol⟩ consists of disjoint vocabularies σ and ξ and a function sol : algs(σ) × algs(ξ) → {0, 1}. We consider a σ-structure I as an instance, a ξ-structure S as a solution, and sol as the solution checker. The solution checker sol returns 1 if and only if structure S is a solution of instance I.

From a problem specification, we define a (meta) problem #PSOLS(P) for projected solution counting as follows. We define the counting vocabulary ξsc, consisting of only one symbol sc of arity 1 that we use for the solution count, i.e., ξsc := {sc}. Then, formally, we let #PSOLS(P) := ⟨σ ∪ ξ, ξsc, psols⟩. Instances are (σ ∪ ξ)-structures, solutions are ξsc-structures, and psols is the solution checker. Since projected solution counting requires to specify a projection over the solutions (ξ-structures) to P, we also give a projection Iξ as input, which is defined by P := Iξ. Then, the number s of projected solutions is obtained by projecting each solution of input instance Iσ to projection P, i.e., s = |{S′ ⊓ P | S′ ∈ algs(ξ), sol(Iσ, S′) = 1}|. Now we can simply define the solution checker as psols(I, S) := 1 if and only if S is the ξsc-structure containing relation sc with just s, i.e., S = ⟨{s}, ({s})⟩.

For our running example with QBFs, we can now instantiate the definitions from above to specify the problem QSAT := ⟨σQBF, ξT, sol⟩. Recall from above that the instances are σQBF-structures Q and the solutions are ξT-structures S. Naturally, from the definition of QSAT we set sol as follows: sol(I, S) := 1 if and only if the QBF Q corresponding to instance I = Q evaluates to true under the assignment corresponding to S. If we restrict our σQBF-structures Q such that the corresponding QBFs of the instances are of quantifier depth ℓ, where the first quantifier is ∃, we call the resulting problem ΣℓQSAT. Naturally, we define #ΣℓQSAT := #PSOLS(ΣℓQSAT).

Example 3. Consider QBF Q and σQBF-structure Q = ⟨D, (R)_Ṙ∈σQBF⟩ from Example 1. The problem #ΣℓQSAT = ⟨σQBF ∪ ξT, ξsc, psols⟩ additionally assumes a projection as part of the instances. This projection is given as part of the (σQBF ∪ ξT)-structure. Hence, considering projection T = {d, e}, our instance of #ΣℓQSAT is given by I = ⟨D, (R)_Ṙ∈(σQBF ∪ ξT)⟩. Consequently, projection P = IξT = ⟨D, ({d, e})⟩. Recall the four assignments ∅, {c}, {c, e}, and {c, d, e} from Example 1 under which Q evaluates to true. When we project these assignments to {d, e}, we are left with only three assignments ∅, {e}, and {d, e}. As a result, ⟨{3}, ({3})⟩ is the only solution to instance I of problem #ΣℓQSAT.

Proposition 1 (Hemaspaandra and Vollmer, 1995). The problem #ΣℓQSAT is #·ΣℓP-complete.

Tree Decompositions (TDs) of Finite Structures. For a tree T and a node t of T, we let children(t) be the sequence of all child nodes of t in an arbitrary but fixed order. Let I = ⟨D, R⟩ be a σ-structure. A tree decomposition (TD) of I is a pair T = (T, χ), where T = (N, A) is a tree rooted at root(T) and χ is a mapping that assigns to each node t ∈ N a set χ(t) ⊆ D, called a bag, such that the following conditions hold: (i) D = ⋃_{t∈N} χ(t) and for each Ṙ ∈ σ, we have R ⊆ χ(t)^ar(Ṙ) for some t ∈ N; and (ii) for each r, s, and t such that s lies on the path from r to t, we have χ(r) ∩ χ(t) ⊆ χ(s). This definition of a TD of I coincides with a TD of the Gaifman graph (Gaifman 1982) of I. Then, width(T) := max_{t∈N} |χ(t)| − 1. The treewidth tw(G) of a graph G is the minimum width(T) over all TDs T of G. We denote the bags below t by χ≤t := ⋃_{t′ of T[t]} χ(t′), where T[t] is the subtree of T rooted at t.

For a node t ∈ N, we say that type(t) is leaf if children(t) = ⟨⟩; join if children(t) = ⟨t′, t″⟩, where χ(t) = χ(t′) = χ(t″) ≠ ∅; int ("introduce") if children(t) = ⟨t′⟩, χ(t′) ⊆ χ(t), and |χ(t)| = |χ(t′)| + 1; rem ("removal") if children(t) = ⟨t′⟩, χ(t′) ⊇ χ(t), and |χ(t′)| = |χ(t)| + 1. We use nice TDs, where for every node t ∈ N, type(t) ∈ {leaf, join, int, rem}, and the bags of the root node and the leaf nodes are empty; such a TD can be obtained in linear time without increasing the width (Kloks 1994).
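Returning to the running example, the projected solution count of Example 3 can be reproduced by projecting the witnessing assignments of Example 1 onto {d, e} (a brute-force sketch of the definition of #PSOLS, not the paper's dynamic programming):

```python
# The four assignments over {c, d, e} witnessing validity of Q (Example 1)
witnesses = [set(), {"c"}, {"c", "e"}, {"c", "d", "e"}]

# Project each witness onto the projection set {d, e}; identical projections
# collapse into a single projected solution, as in the definition of s.
projection = {"d", "e"}
projected = {frozenset(w & projection) for w in witnesses}

print(len(projected))  # 3 projected solutions: ∅, {e}, {d, e}
```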
Dynamic Programming on TDs of Finite Structures

Algorithms that utilize treewidth to solve problems typically proceed by dynamic programming along the TD (in post-order), where at each node of the tree information is gathered (Bodlaender and Kloks 1996) in a table by a table algorithm A. More generally, a table is a set of records, where a record ~u is a sequence of fixed length. The actual length, content, and meaning of the records depend on the algorithm A. Since we later traverse the tree decomposition repeatedly, running different algorithms, we explicitly say A-record if records of this type are syntactically used for algorithm A, and similarly A-table for tables. In order to access tables computed at certain nodes after a traversal, as well as to provide better readability, we augment tree decompositions with an additional mapping to store tables. Formally, a tabled tree decomposition (TTD) of a graph G is a triple T = (T, χ, τ), where (T, χ) is a tree decomposition of G and τ is a mapping which maps nodes t of T to tables. When a TTD has been computed by running algorithm A, we call the decomposition the A-TTD of the input instance.

Let T = (T, χ, ·) be a TTD of a σ-structure I = ⟨D, R⟩ for a problem P = ⟨σ, ξ, sol⟩, and let t be a node of T. Then, we define the bag-relations R_t := (R ∩ χ(t)^ar(Ṙ))_Ṙ∈σ. The bag-structure I_t is given by I_t := ⟨χ(t), R_t⟩. This allows us to define the bag-domain below t by D≤t := ⋃_{t′ in T[t]} χ(t′), the bag-relations below t by R≤t := (R ∩ D≤t^ar(Ṙ))_Ṙ∈σ, and the bag-structure below t by I≤t := ⟨D≤t, R≤t⟩.

Observation 1. Given a finite structure I = ⟨D, R⟩ over σ and a TD T = (T, χ) of I. Then, for n = root(T), we have D≤n = D, R≤n = R, and I≤n = I.

Let σ be a vocabulary, P = ⟨σ, ξ, sol⟩ a problem, and A a table algorithm for solving P on instances over σ. Then, dynamic programming (DP) on tree decompositions of finite structures performs the following steps for a σ-structure I:

1. Compute a TTD (T, χ, ι) of I. Later, we traverse the TD (T, χ) multiple times, where ι is a table mapping from an earlier traversal. Therefore, ι might be empty at the beginning of the first traversal.
2. Run algorithm DPA (see Listing 1). It takes a TTD T = (T, χ, ι) and traverses T in post-order. At each node t of T it computes a new A-table o(t) by executing the algorithm A. The algorithm A has a "local view" on the computation and can access only t, atoms in the bag χ(t), the bag-structure It, and the child A-tables o(t′) for the child nodes t′ of t.
3. Output the A-tabled tree decomposition (T, χ, o).
4. Print the result by interpreting o(n) for root n = root(T).

Usually, when giving a dynamic programming algorithm, one only describes algorithm A. Hence, we focus on this algorithm in the following and call A a table algorithm.

Listing 1: Algorithm DPA(I, T): Dynamic programming on TTD T, cf. (Fichte et al. 2017).
In: Problem instance I, TTD T = (T, χ, ι) of I such that n is the root of T and children(t) = ⟨t1, . . . , tℓ⟩.
Out: A-TTD (T, χ, o) with A-table mapping o.
1 o ← empty mapping
2 for iterate t in post-order(T, n) do
3   o(t) ← At(It, ι(t), ⟨o(t1), . . . , o(tℓ)⟩)
4 return (T, χ, o)

Table Algorithm for ΣℓQSAT

Next, we briefly present the table algorithm QALG that allows us to solve the problem ΣℓQSAT. To this end, consider a QBF Q = ∃V1. · · · QℓVℓ.F, its σQBF-structure Q, and a tabled tree decomposition T = (T, χ, ι) of Q. Then, algorithm DPQALG solves ΣℓQSAT, where algorithm QALG stores in table o(t) (nested) records of the form ⟨I, A⟩. The first position of such a record consists of an assignment I restricted to V1 ∩ χ(t). The second position consists of a nested set A of sequences that are of the same form as the records in o(t). Intuitively, I is an assignment restricted to the variables in V1. For a nested sequence ⟨I′, A′⟩ in A, the assignment I′ is restricted to the variables in V2 ∩ χ(t), and so on. The innermost sequences ⟨I∗, ∅⟩ store assignments restricted to Vℓ ∩ χ(t). In other words, the first position ~u(1) of any ~u ∈ o(t) characterizes a bag-relation T for the symbol Ṫ ∈ ξT.

Before we discuss algorithm QALG in more detail, we introduce some auxiliary notation. In order to evaluate quantifiers, we let checkForall(Q, ⟨τ1⟩) return true if and only if either Q1 = ∃, or for every ~u ∈ τ1 we have that QALGt(Q, ⟨{~u}⟩) outputs something different from ∅, i.e., we need for each record of τ1 a "succeeding record" for the parent node of t. Analogously, we let checkForall(Q, ⟨τ1, τ2⟩) be true if and only if either Q1 = ∃, or for every ~u ∈ τ1 and ~v ∈ τ2 we have QALGt(Q, ⟨{~u}, τ2⟩) ≠ ∅ as well as QALGt(Q, ⟨τ1, {~v}⟩) ≠ ∅. Intuitively, this reports whether records are missing in order to satisfy QBFs having an outermost universal quantifier.

Listing 2 presents the table algorithm QALG, which works as follows. The algorithm computes the nested records recursively; they are of the same depth as the quantifier depth ℓ.
Listing 2: Table algorithm QALGt(Qt, ⟨τ1, . . .⟩), influenced by previous work (Chen 2004).
In: Node t, bag-structure Qt = ⟨Dt, Rt⟩, and sequence ⟨τ1, . . .⟩ of QALG-tables of the children of t.
Out: QALG-table τt.
1 Q1V1 · · · QℓVℓ.F ← corresponding QBF of Qt
2 if ℓ = 0 then τt ← ∅
3 else if type(t) = leaf then
4   τt ← {⟨∅, QALGt(Q2V2 · · · QℓVℓ.F, ⟨⟩)⟩}
5 else if type(t) = int and a ∈ χt is introduced then
6   τt ← {⟨J, A′⟩ | ⟨I, A⟩ ∈ τ1, J ∈ {I} ∪ {Ia+ | a ∈ V1},
        |= mat(Qt[J]), A′ = QALGt(Qt[J], ⟨A⟩), A′ ≠ ∅,
        checkForall(Qt[J], ⟨A⟩)}
7 else if type(t) = rem and a ∉ χt is removed then
8   τt ← {⟨Ia−, QALGt(Qt[I], ⟨A⟩)⟩ | ⟨I, A⟩ ∈ τ1}
9 else if type(t) = join then
10  τt ← {⟨I, A′⟩ | ⟨I, A1⟩ ∈ τ1, ⟨I, A2⟩ ∈ τ2,
        A′ = QALGt(Qt[I], ⟨A1, A2⟩), A′ ≠ ∅,
        checkForall(Qt[I], ⟨A1, A2⟩)}
11 return τt
T : ∅ t14
i hI12.i ,A12.i i
1 h∅,
{h∅, {{b}}i}i
2 h∅,
{h∅, {∅, {b}}i}i
τ12
{b} t4 t12 {b}
τ11
t11 hI
i hI4.i ,A4.i i
i
t3
11.i ,A11.i i
{a,
b}
{b,
d}
1 h∅, {h∅, {∅, {b}}i,
h∅,
{h∅, {{b}}i}i 1
h∅, {{b}}i}i
h∅,
{h∅, {∅, {b}}i}i 2
t2 {a} t10 {b, d, e}
τ4
h{d}, {h∅, {∅}i}i
3
τ3
h{d}, {h∅, {∅, {b}}i}i 4
∅ t1 t9 {d, e}
i hI3.i ,A3.i i
hI10.i , A10.i i
i
1 h∅, {h∅, {∅, {b}}i,
t8 {e}
h∅,
{h∅, {{b}}i}i
1
h{a}, {{b}}i}i
h{e}, {h∅, {∅, {b}}i}i 2
t7
τ9
τ7
{c, e}
h{d}, {h∅, {∅}i}i
3
hI9.i , A9.i i
i hI7.i , A7.i i
h{d, e},{h∅, {∅, {b}}i}i 4
h∅,
{h∅, {∅}i}i 1 h∅,
{h∅, {∅}i}i {c} t6
τ10
h{e}, {h∅, {∅}i}i 2 h{c}, {h∅, {∅}i}i
hI5.i ,A5.i i
i
h{d}, {h∅, {∅}i}i 3 h{c, e},{h∅, {∅}i}i
τ5
∅ t5
1 h∅, {h∅, {∅}i}i
h{d, e},{h∅, {∅}i}i 4
i hI13.i ,A13.i i
τ13
1 h∅,
{h∅, {{b}}i}i
2 h∅,
{h∅, {∅, {b}}i}i
{b} t13
Figure 1: Selected tables of τ obtained by DPQALG on TTD T .
ing of the empty assignment for each depth d > 0, and a
set of assignments that contain an empty set, recursively
constructed with decreasing depth in Line 4. Node t2 is of
type int and introduces variable a. Line 6 makes sure that
results in τ2 contains only record h∅, {h∅, {∅}i, h{a}, {∅}i}i,
thereby guessing on the assignment of a for the second
(universal) quantifier of Q. Node t3 is of type int and
introduces b. Then, bag-relations Rt3 at node t3 contain IN C LAUSE={(a, b)}, POS={(b, c1 )}, NEG={(a, c1 )},
i.e., we need to ensure clause c1 = ¬a ∨ b is satisfied
in t3 . This is done in Line 6, as well as making sure that
we keep for universally quantified variable a all its assignments. Node t4 is of type rem. Here, we restrict the records
(see Line 8) such that they contain only variables occurring
in bag χ(t4 ) = {b}. Basic conditions of a TD ensure that
once a variable is removed, it does not occur in any bag
at an ancestor node, i.e., we encountered all clauses for a.
Nodes t5 , t6 , t7 , and t8 are symmetric to nodes t1 , t2 , t3 ,
and t4 . We proceed similar for nodes t9 − t12 . At node t13
we join tables τ4 and τ12 according to Line 10, where we
only match agreeing assignments such that no assignment
involving universally quantified variable a is lost. At the root
node t14 , it is then ensured that we have those records only
that lead to witnessing assignments. Since τ14 is not empty,
formula Q is valid.
We can reconstruct witnessing assignment {c, e} by combining parts I of the yellow highlighted records, as shown in
Figure 1.
Se+ :=S ∪ {e}, Se− :=S \ {e} and R∼
outputs R where relahṘ,N i
tion RṘ is replaced by the relation N .
depth ℓ. For leaf nodes, i.e., nodes t with type(t) = leaf,
we construct a record of the form h∅, {h∅, {. . .}i}i, which
is a nested record of depth ℓ, cf., Line 4. Note the recursive call to QALG and the base (termination) case in Line 2.
Intuitively, whenever a variable a is introduced (int), we decide whether we assign a to true and only keep records, cf.,
Line 6, where all clauses of the matrix of the corresponding
QBF of Qt are satisfied. Further, we need to guarantee that
the universal quantifiers are still satisfiable, which is ensure
by checkForall. When removing (rem) a variable a, we remove a from our records accordingly, cf., Line 8. If the node
is of type join, we combine two records in two different child
tables and ensure the satisfiability of the universal quantifier by means of checkForall. Intuitively, we are enforced
to agree on assignments I, and on the ones in A, which is
established in Line 10.
Example 4. Recall QBF Q from Example 1. Observe that
by the construction of Q, we have (u, v) ∈ IN C LAUSE for
every two variables u, v of a given clause. Consequently,
it is guaranteed (Kloks 1994) that in any TD of Q, we
have for each clause at least one bag containing all its
variables. In the light of this observation, Figure 1 depicts a TD T = (T, χ) of Q, assuming that clauses are
implicitly contained in those bags, which contain all of its
variables. The figure illustrates a snippet of tables of the
TTD (T, χ, τ ), which we obtain when running DPQALG on
instance Q and TTD T according to Listing 2. Note that
for the ease of presentation, we write X instead of hX, ∅i.
Further, for brevity we write τj instead of τ (tj ) and identify
records by their node and identifier i in the figure. For example, record ~u9.2 = hI9.2 , A9.2 i ∈ τ9 refers to the second
record of table τ9 for node t9 ; similarly we write for table τ3 ,
e.g., A3.1.2.1 to address {a}.
In the following, we briefly discuss selected records of
tables in τ . Node t1 is of type leaf. Therefore, table τ1
equals table τ5 which both have only one record, consist-
Lemma 1 (Chen, 2004). Given a QBF Q of quantifier
depth ℓ and a TTD T = (T, χ, ·) of Q of width k with g nodes.
Then, algorithm DPQALG runs in time O(tower(ℓ, k + 5) · g).
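The bound in Lemma 1 is stated in terms of the iterated exponential tower, which can be pinned down in two lines (a sketch under the usual convention tower(0, k) = k):

```python
def tower(l, k):
    """Iterated exponential: tower(0, k) = k and tower(l, k) = 2**tower(l-1, k)."""
    return k if l == 0 else 2 ** tower(l - 1, k)
```

For instance, tower(2, 3) = 2^(2^3) = 256, so already small quantifier depths make the dependence on the width k dramatic.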
A recent result establishes that one cannot significantly improve the running time of the algorithm above, assuming that the exponential time hypothesis (ETH) holds. The ETH (Impagliazzo, Paturi, and Zane 2001) states that there is some real s > 0 such that satisfiability of a given 3-CNF formula F cannot be decided in time 2^{s·|F|} · ‖F‖^{O(1)}.
Proposition 2 (Fichte, Hecher, and Pfandler, 2020). Under ETH, the problems ΣℓQSAT for a given QBF Q of quantifier depth ℓ cannot be solved in time tower(ℓ, o(k)) · 2^{o(‖Q‖)}, where k is the treewidth of the structure Q.

In Step 2.I, after running DPA, we traverse the resulting TTD Tcons in pre-order and remove all records from the tables that cannot be extended to a solution (“purge non-solution records”). Intuitively, these records are those that are not involved in τ(n) when recursively following A-origins(n, τ(n)) from the root n back to the leaf nodes of T. In other words, we keep a record ~u of table τ(t) only if ~u is involved in τ(n), i.e., if ~u participates in constructing a solution to our problem P. We call the table mapping ν the purged table mapping and let the resulting TTD be Tpurged = (T, χ, ν).

Step 2.II forms the main part of PSCA. By DPAPRJ, we traverse Tpurged to count solutions with respect to the projection and obtain Tproj = (T, χ, π). From the table π(n) at the root n of T, we can directly read off the projected solution count of I. In the following, we only describe the table algorithm APRJ, as the traversal in DPAPRJ is the same as before.

Records are of the form ⟨ρ, c⟩ ∈ π(t), where ρ ⊆ ν(t) is an A-table and c is a non-negative integer. For a set ρ of records, the intersection projected solution count ipsc is, vaguely speaking, the number of “projected” solutions to Iσ that all records in ρ have in common, that is, the cardinality of the intersection of the solutions to I≤t restricted to the given projection P for the set ρ of involved records. In the end, we are interested in the projected solution count (psc) of ρ, i.e., the cardinality of the union of the solutions to I≤t restricted to P for the records in ρ.

In the remainder, we additionally let ν be the purged table mapping, π the APRJ-table mapping as used above, and ρ ⊆ ν(t). The relation ≡P ⊆ ρ × ρ considers records equivalent with respect to the projection P: ≡P := {(~u, ~v) | ~u, ~v ∈ ρ, ⟨D, ~u(1)⟩ ⊓ P = ⟨D, ~v(1)⟩ ⊓ P}. Let bucketsP(ρ) be the set of equivalence classes induced by ≡P on the set ρ of records, i.e., bucketsP(ρ) := ρ/≡P = {[~u]P | ~u ∈ ρ}, where [~u]P = {~v ∈ ρ | ~v ≡P ~u}, and let sub-bucketsP(ρ) := {S | ∅ ⊊ S ⊆ B, B ∈ bucketsP(ρ)}.

Listing 3: Table algorithm APRJ(νt, It, ⟨π1, . . .⟩) for projected solution counting, cf. (Fichte et al. 2018).
In: Purged table mapping νt, bag-projection Pt = (It)ξ, and sequence ⟨π1, . . .⟩ of APRJ-tables of the children of t.
Out: APRJ-table πt of records ⟨ρ, c⟩, where ρ ⊆ νt and c ∈ N.
1 πt ← {⟨ρ, ipsc(t, ρ, ⟨π1, . . .⟩)⟩ | ρ ∈ sub-bucketsPt(νt)}
2 return πt

Figure 2: Algorithm PSCA consists of DPA and DPAPRJ: (1) create a TTD T of Iσ; (2.I) run DPA for P, visiting the nodes t of T in post-order and storing the results in tables τt, then purge non-solutions in τ, yielding ν; (2.II) run DPAPRJ for #PSols(P), again visiting the nodes t in post-order, applying APRJ to νt and storing the results in tables πt; (3) output the projected count.
Projected Solution Counting
In this section, we present a generic dynamic programming algorithm (PSCA) and a table algorithm (APRJ) that allow for solving projected solution counting, namely, problem #PSOLS(P). To this end, we let P = ⟨σ, ξ, ·⟩ be a problem, which we extend to projected counting, I = ⟨D, ·⟩ a σ ∪ ξ-structure, P := Iξ the considered projection, and A a table algorithm that solves P by dynamic programming. Since we reuse the output of algorithm A (a tabled tree decomposition) to solve the actual projected solution counting problem, we let T = (T, χ, τ) be an A-TTD of instance Iσ for problem P. By convention, we take t to be a node of T.
Generic Algorithm PSCA
Next, we define a new meta algorithm (PSCA) for a given table algorithm A. The core ideas that lead to algorithm PSCA are based on an algorithm for Boolean satisfiability (Fichte et al. 2018), which we lift to much more general problems in KRR that can be defined using finite structures. Before we discuss our approach, we provide a notion to reconstruct solutions from a tabled tree decomposition that has been computed using a table algorithm A. This requires determining, for a given record, its predecessor records in the corresponding child tables. Let therefore children(t) = ⟨t1, . . . , tℓ⟩. Given a sequence ~s = ⟨s1, . . . , sℓ⟩, we let ⟨{~s}⟩ := ⟨{s1}, . . . , {sℓ}⟩. For a given A-record ~u, we define the originating A-records of ~u in node t by A-origins(t, ~u) := {~s | ~s ∈ τ(t1) × · · · × τ(tℓ), ~u ∈ At(It, ⟨{~s}⟩)}. We extend this to an A-table ρ by A-origins(t, ρ) := ⋃_{~u∈ρ} A-origins(t, ~u). These origins allow us to collect records that are extendable to solutions of P and to combine them accordingly. Given any descendant node t′ of t, we call every record ~v ∈ ι(t′) that appears in some A-records, when recursively following every A-origins(t, ρ) back to the leaf nodes, involved in ρ.
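A brute-force reading of A-origins can be sketched as follows (illustrative Python, not the authors' code; table_alg stands in for the application of At to the singleton child tables, and records are assumed hashable):

```python
from itertools import product

def a_origins_record(u, child_tables, table_alg):
    """All sequences s in tau(t1) x ... x tau(tl) such that u is produced by
    the table algorithm applied to the singleton tables <{s1}, ..., {sl}>."""
    return {s for s in product(*child_tables)
            if u in table_alg([{r} for r in s])}

def a_origins_table(rho, child_tables, table_alg):
    """Union of the origins of all records in the table rho."""
    out = set()
    for u in rho:
        out |= a_origins_record(u, child_tables, table_alg)
    return out
```

In practice one would not enumerate the full product of the child tables, but the sketch makes the definition operational.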
Example 5. Recall QBF Q, TTD (T, χ, τ), and tables τ10, τ11 from Example 4 and Figure 1. Consider the records ~u11.1, ~u11.2 ∈ τ11. Observe that A-origins(t11, ~u11.1) = {~u10.1}, which is in contrast to A-origins(t11, ~u11.2) = {~u10.2, ~u10.4}.

Example 6. Consider again QBF Q, projection P, TTD (T, χ, τ), and tables τ10, τ11 from Example 4 and Figure 1. During purging, the records ~u10.3 and ~u11.3 are removed, as highlighted in gray. This results in tables ν10 and ν11. Then, the set ν10/≡P of equivalence classes is bucketsP(ν10) = {{~u10.1}, {~u10.2}, {~u10.4}}, whereas ν11/≡P = {{~u11.1, ~u11.2}, {~u11.4}}.
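The bucket construction used in Example 6 can be sketched generically (an illustrative Python encoding, not the authors' code: a record is modelled by its first position as a frozenset of atoms, and the projection as a set of projected atoms):

```python
from itertools import combinations

def buckets(rho, proj):
    """Group records u in rho by u & proj, i.e., by the projected part of
    their first position; each group is one equivalence class of =_P."""
    classes = {}
    for u in rho:
        classes.setdefault(u & proj, set()).add(u)
    return list(classes.values())

def sub_buckets(rho, proj):
    """All non-empty subsets S of some bucket B in buckets(rho, proj)."""
    out = []
    for b in buckets(rho, proj):
        for r in range(1, len(b) + 1):
            out += [set(s) for s in combinations(b, r)]
    return out
```

For example, with rho = {∅, {d}, {e}, {d, e}} and proj = {e}, the records ∅ and {d} agree on the projection, as do {e} and {d, e}, giving two buckets of size two and six sub-buckets in total.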
Later, we need to access already computed projected counts in the tables of the children of a given node t. Therefore, we define the stored ipsc of a table ρ ⊆ ν(t) in table π(t) by s-ipsc(π(t), ρ) := Σ_{⟨ρ,c⟩∈π(t)} c. We extend this to a sequence ~s = ⟨π(t1), . . . , π(tℓ)⟩ of tables of length ℓ and a set O = {⟨ρ1, . . . , ρℓ⟩, ⟨ρ′1, . . . , ρ′ℓ⟩, . . .} of sequences of ℓ tables by s-ipsc(~s, O) := Π_{i∈{1,...,ℓ}} s-ipsc(~s(i), O(i)). In other words, for each i we select the i-th position of the sequence together with the set of i-th positions from the set of sequences.
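The two lookups can be sketched directly (illustrative Python; a table π(t) is modelled as a list of (table, count) pairs with tables as frozensets):

```python
from math import prod

def s_ipsc(pi_t, rho):
    """Stored ipsc of table rho in pi(t): sum of the counters recorded for rho."""
    return sum(c for (r, c) in pi_t if r == rho)

def s_ipsc_seq(child_pis, origin_set):
    """Sequence version: for each child position i, look up the table formed
    by the i-th components of the origin sequences, and multiply the results."""
    return prod(
        s_ipsc(child_pis[i], frozenset(seq[i] for seq in origin_set))
        for i in range(len(child_pis)))
```

The product over the children mirrors the fact that records of different subtrees combine independently at a join node.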
Figure 2 illustrates an overview of the steps of PSCA. First, we compute a TD T = (T, χ) of Iσ. Then, we traverse the TD a first time by running DPA (Step 2.I), which outputs a TTD Tcons = (T, χ, τ). Afterwards, we traverse Tcons in pre-order and purge those records that cannot be extended to a solution.
Example 7. Recall QBF Q, TTD T = (T, χ, τ), and tables τ1, . . . , τ14 from Example 4 and Figure 1. Recall that, for some nodes t, there are records among the QALG-tables that are removed (highlighted gray in Figure 1) during purging, i.e., they are not contained in the purged table mapping ν. By purging we avoid having to correct stored counters (backtracking) whenever a record has no “succeeding” record in the parent table.

Next, recall Example 3 and consider Q, projection P, and the resulting instance I of #ΣℓQSAT. We discuss selected tables obtained by DPAPRJ(I, (T, χ, ν)). Figure 3 depicts selected tables of π1, . . . , π14 obtained after running DPAPRJ for projected counting. We assume that record i in table πt corresponds to ~vt.i = ⟨ρt.i, ct.i⟩, where ρt.i ⊆ ν(t).

Since type(t1) = leaf, we have π1 = {⟨{~u1.1}, 1⟩}. Intuitively, at t1 the record ~u1.1 belongs to one bucket; similarly for nodes t2, t3, t4, and t5. Node t6 introduces c, which results in table π6 = {⟨{~u6.1}, 1⟩, ⟨{~u6.2}, 1⟩, ⟨{~u6.1, ~u6.2}, 1⟩}, where ~u6.1 = ⟨∅, A6.1⟩ and ~u6.2 = ⟨{c}, A6.2⟩ with ~u6.1, ~u6.2 ∈ τ6. Consequently, c6.1 = ipsc(t6, {~u6.1}) = psc(t6, {~u6.1}) = s-ipsc(⟨π5⟩, {~u5.1}) = 1; analogously for ~u6.2. Further, c6.3 = ipsc(t6, {~u6.1, ~u6.2}) = |psc(t6, {~u6.1, ~u6.2}) − ipsc(t6, {~u6.1}) − ipsc(t6, {~u6.2})| = |1 − 1 − 1| = 1. Similarly for table π7 as given, but ν7 has two buckets, as well as for π8 and π9 of Figure 3.

Next, we discuss how to compute table π11, given table π10. For record ~v11.1 we compute the count c11.1 = ipsc(t11, {~u11.1}) = psc(t11, {~u11.1}) = s-ipsc(⟨π10⟩, {~u10.1}) = 1. Analogously for record ~v11.2, where c11.2 = ipsc(t11, {~u11.2}) = 1. In order to obtain ~v11.3, we compute c11.3 = ipsc(t11, {~u11.1, ~u11.2}) = |psc(t11, {~u11.1, ~u11.2}) − ipsc(t11, {~u11.1}) − ipsc(t11, {~u11.2})| = |2 − 1 − 1| = 0. We continue for tables π12 and π13. In the end, the projected solution count of Q is given in the root node t14 and corresponds to s-ipsc(⟨π14⟩, {~u14.1}) = c13.1 + c13.2 − c13.3 = 3.
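The final combination step at the root can be checked mechanically (counts taken from table π13 in the example; variable names are ours):

```python
# Counts from table pi13 in Example 7: two buckets with counts 1 and 2,
# and their "all-overlapping" (intersection) count 0.
c13_1, c13_2, c13_3 = 1, 2, 0

# Inclusion-exclusion at the root: union count = sum of the bucket counts
# minus the all-overlapping count.
projected_count = c13_1 + c13_2 - c13_3
```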
Figure 3: Tables of π obtained by DPAPRJ on TTD T and the purged table mapping ν of τ.
Intuitively, when we are at a node t in algorithm DPAPRJ, we have already computed π(t′) of Tproj for every node t′ below t. Then, the projected solution count of ρ ⊆ ν(t) is obtained by applying the inclusion-exclusion principle to the stored projected solution counts of the origins.
Definition 1. For a table ρ and a node t, the projected solution count psc is
psc(t, ρ, ⟨π(t1), . . .⟩) := Σ_{∅⊊O⊆A-origins(t,ρ)} (−1)^{|O|−1} · s-ipsc(⟨π(t1), . . .⟩, O).
Vaguely speaking, psc determines the A-origins of table ρ, iterates over all subsets of these origins, and looks up the stored counts (s-ipsc) in the APRJ-tables of the children ti of node t.
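Definition 1 translates almost directly into code (a self-contained sketch; tables are modelled as frozensets of record identifiers, and s_ipsc / s_ipsc_seq re-implement the stored-count lookups from the previous section):

```python
from itertools import combinations
from math import prod

def s_ipsc(pi_t, rho):
    # stored count of table rho in the child table pi_t
    return sum(c for (r, c) in pi_t if r == rho)

def s_ipsc_seq(child_pis, origin_set):
    # product over child positions of the stored counts of the i-th components
    return prod(s_ipsc(child_pis[i], frozenset(s[i] for s in origin_set))
                for i in range(len(child_pis)))

def psc(origins, child_pis):
    """Inclusion-exclusion over all non-empty subsets O of the A-origins:
    sum of (-1)^(|O|-1) * s-ipsc(<pi(t1), ...>, O)."""
    origins = list(origins)
    total = 0
    for r in range(1, len(origins) + 1):
        for subset in combinations(origins, r):
            total += (-1) ** (r - 1) * s_ipsc_seq(child_pis, subset)
    return total
```

With a single child whose table π stores counts 2 and 3 for two record tables and 4 for their union, two origins yield 2 + 3 − 4 = 1, i.e., duplicate projected solutions are counted only once.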
Finally, we provide a definition to compute ipsc, which can be evaluated at a node t for a given table ρ ⊆ ν(t) by computing the psc for the children ti of t using the stored ipsc values from the tables π(ti), and by subtracting and adding ipsc values for subsets ∅ ⊊ ϕ ⊊ ρ accordingly.
Runtime Analysis
Next, we present asymptotic upper bounds on the runtime of our algorithm DPAPRJ. To this end, we let γ(n) be the number of operations required to multiply two n-bit integers, which can be done in time O(n · log n · log log n) (Knuth 1998). If unit costs are assumed for the multiplication of numbers, then γ(n) = 1.
Definition 2. For a table ρ and a node t, the intersection projected solution count is ipsc(t, ρ, s) := 1 if type(t) = leaf, and otherwise
ipsc(t, ρ, s) := |psc(t, ρ, s) + Σ_{∅⊊ϕ⊊ρ} (−1)^{|ϕ|} · ipsc(t, ϕ, s)|,
where s = ⟨π(t1), . . .⟩.
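The recursion of Definition 2 can be sketched as follows (illustrative; the psc values are supplied as a callback so that the subset recursion stays visible, and the absolute value matches the computations in Example 7):

```python
from itertools import combinations

def ipsc(rho, psc_of, leaf=False):
    """Definition 2 sketch: 1 at leaf nodes; otherwise |psc(rho) +
    sum over strict non-empty subsets phi of rho of (-1)^|phi| * ipsc(phi)|."""
    if leaf:
        return 1
    rho = frozenset(rho)
    total = psc_of(rho)
    for r in range(1, len(rho)):                 # strict, non-empty subsets
        for phi in combinations(sorted(rho), r):
            total += (-1) ** r * ipsc(frozenset(phi), psc_of)
    return abs(total)
```

Replaying node t11 from Example 7 with psc({~u11.1}) = psc({~u11.2}) = 1 and psc({~u11.1, ~u11.2}) = 2 yields ipsc({~u11.1, ~u11.2}) = |2 − 1 − 1| = 0.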
Theorem 1 (⋆1). Given an instance I of problem P and a TTD Tpurged = (T, χ, ν) of I of width k with g nodes. Then, DPAPRJ runs in time O(2^{4m} · g · γ(‖I‖)), where m := max_{t∈N} |ν(t)|.
In other words, if a node is of type leaf, the ipsc is one, since the bags of leaf nodes are empty. Otherwise, we compute the count of a given table ρ ⊆ ν(t) with respect to P by exploiting the inclusion-exclusion principle on the A-origins of ρ, such that every projected solution is counted only once. Then we have to subtract and add ipsc values (“all-overlapping” counts) for strict subsets ϕ of table ρ.
Listing 3 presents the table algorithm APRJ, which stores in π(t) every sub-bucket of the given table ν(t) together with its ipsc. In the end, the solution to #PSOLS(P) is given by s-ipsc(⟨π(n)⟩, ν(n)).
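Putting the pieces together, the single line of Listing 3 amounts to the following (an illustrative sketch with sub_buckets_of and ipsc_of supplied as callbacks, mirroring the definitions above):

```python
def a_prj(nu_t, sub_buckets_of, ipsc_of):
    """Listing 3 sketch: one record per sub-bucket of nu(t), paired with its
    intersection projected solution count."""
    return [(rho, ipsc_of(rho)) for rho in sub_buckets_of(nu_t)]

def projected_count(pi_root, nu_root):
    """Final answer s-ipsc(<pi(n)>, nu(n)) at the root n: the sum of the
    counters stored for the full purged root table."""
    return sum(c for (rho, c) in pi_root if rho == nu_root)
```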
Corollary 1 (⋆). Given an instance Q of #Σℓ QSAT of
treewidth k. Then, PSCQALG runs in time O(tower(ℓ + 1, k +
7) · γ(kQk) · g), where ℓ is the quantifier depth of QBF of Q.
From recent results stated in Proposition 2, we can conclude
that one cannot significantly improve the runtime assuming
that the exponential time hypothesis (ETH) holds.
1 Proofs of statements marked with “⋆” will be made available in an author self-archived copy.
Definition 4 (Compatibility). Let children(t) = ⟨t1, . . . , tℓ⟩, û = ⟨Ŝ, . . .⟩ be an A-solution up to t, and v̂ = ⟨Ŝ′, . . .⟩ be an A-solution up to ti. Then, û is compatible with v̂ (and vice versa) if v̂(1)|χ≤(ti) = û(1)|χ≤(ti).
Corollary 2. Under ETH, the problem #ΣℓQSAT cannot be solved in time tower(ℓ + 1, o(k)) · 2^{o(‖Q‖)} for an instance Q of treewidth k and quantifier depth ℓ.
Formalization of Suitable Table Algorithms
For a table algorithm that correctly models the solutions to any instance of P, we require both soundness, indicating that computed records are not wrong, and completeness, which ensures that we do not miss any records.
In order to use our algorithm for various problems P, we need to characterize suitable table algorithms for P. To this end, we formalize the content of a table at a node t. Therefore, we define a record up to a node t as follows: a record û up to t is of the form û = ⟨û1, . . . , ûq⟩ such that, for each i with 1 ≤ i ≤ q, either ûi contains only elements of the sub-tree rooted at t, i.e., ûi ⊆ χ≤(t), or ûi is a set of records up to t. A set of records up to t is referred to as a table ρ̂ up to t.
Definition 5 (Soundness). Algorithm A is referred to as sound if, for any TTD T′ of any instance I′ of P, any node t′ of T′ with children(t′) = ⟨t1, . . . , tℓ⟩, any A-record u at t′, and any A-record solutions vi at ti for 1 ≤ i ≤ ℓ, we have: if ⟨v1, . . . , vℓ⟩ ∈ A-origins(t′, u), then u is also an A-record solution at node t′ and u is compatible with each vi.
Definition 6 (Completeness). Algorithm A is referred to as complete if, for any TTD T′ of any instance I′ of P, any node t′ of T′ with children(t′) = ⟨t1, . . . , tℓ⟩, ℓ ≥ 1, any A-record solution u at node t′, and any corresponding A-solution û up to t′ (of u), we have: there exists s = ⟨v1, . . . , vℓ⟩ with s ∈ A-origins(t′, u) such that, for every 1 ≤ i ≤ ℓ, vi is an A-record solution at ti and v̂i is a corresponding A-solution up to ti (of vi) that is compatible with û.
Formalizing Tables. Correctness of a table algorithm A for a problem P is typically established using a set C of conditions (Fichte et al. 2017; Jakl, Pichler, and Woltran 2009; Pichler, Rümmele, and Woltran 2010) that hold for every table computed by algorithm A. Let therefore ρ̂ be a table up to t, and let C be a set of conditions that depend only on the sub-tree T[t] of T rooted at t. Then, û ∈ ρ̂ is referred to as an A-solution up to t consistent with C if it satisfies C. However, we need to restrict this set such that it allows us to characterize the solutions to P. Therefore, we need the definition of sufficient conditions, which, vaguely speaking, make sure that the parts of a record, while potentially containing auxiliary data, correspond to the (relations of) solutions.
These definitions finally allow us to define table algorithms that capture the solutions to instances of problem P. This ensures, besides soundness and completeness, that checking the conditions in C can be done in polynomial time.
Definition 7 (Correctness). Algorithm A is referred to as correct for problem P if A is both sound and complete. Further, for any TD T′ = (T, χ′) of any instance I′ of P over σ, any node t′ ∈ N′, and any resulting A-TTD (·, ·, o), we have: (i) we can verify for every A-record u at node t′ of table o(t′) in time |u|^{O(1)} whether record u is an A-record solution at t′, by using only A-record solutions in o(·) for the children of t′; in other words, for every corresponding A-solution û up to t′ of record u, the conditions C hold; (ii) if t′ = root(T′) or type(t′) = leaf, then |ν(t′)| ≤ 1 for the purged table mapping ν of o.
Definition 3. A set C of conditions is called sufficient for A if the set of solutions of any instance I′ of problem P and any TD T′ of I′ is characterized as follows: the set {û(1) | û is an A-solution up to the root n′ of T′ consistent with C} corresponds to the set {R | S = ⟨·, ·, R⟩, S is a solution to instance I′}.
However, these table algorithms A do not store records up to a node. Instead, such algorithms store “local” records that only mention contents restricted to the bag of a node. To this end, we need the following definitions. Given a table ρ̂ = {v̂1, . . . , v̂s} up to t and a set P ⊆ χ(t), the table ρ̂ restricted to P is given by ρ̂|P := {v̂1|P, . . . , v̂s|P}, where, for v̂i ∈ ρ̂, if v̂i ⊆ χ≤(t), then v̂i|P := v̂i ∩ P; otherwise, if v̂i = ⟨û1, . . . , ûq⟩, then v̂i|P := ⟨û1|P, . . . , ûq|P⟩. This allows us to formalize the table ρ at node t, which is given by ρ := ρ̂|χ(t).
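The restriction operation can be sketched as follows (an illustrative encoding, not the authors' code: records up to t are tuples; sets of elements and sets of records are frozensets):

```python
def restrict(v, P):
    """Restrict a record (tuple), a set of records, or a set of elements to P."""
    if isinstance(v, tuple):                          # a record: component-wise
        return tuple(restrict(u, P) for u in v)
    if v and all(isinstance(u, tuple) for u in v):    # a set of records: recurse
        return frozenset(restrict(u, P) for u in v)
    return v & P                                      # a set of elements

def restrict_table(rho_hat, P):
    return {restrict(v, P) for v in rho_hat}
```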
Remark: Condition (ii) hardly restricts correctness, since the bags of the leaves and of the root are empty by definition, as we use nice TDs. In fact, reasonable table algorithms that compute solutions to problems P are correct, because the form of the table data structure, the correctness condition, and the monotonicity criterion via the compatibility notion are very weak requirements on top of an existing algorithm. In particular, these conditions still allow for solving monadic second-order logic using TDs (Bliem, Pichler, and Woltran 2013).
Characterization of Correctness. In the following, we let C be such a set of sufficient conditions for A and û be an A-solution up to t consistent with C. Then, û|χ(t) is referred to as an A-record solution at node t consistent with C. We say that û is a corresponding A-solution up to t of û|χ(t). Intuitively, to characterize correctness, we need some kind of monotonicity among bag relations over ξ, i.e., table records are not allowed to defer or change decisions made at some descendant node about domain elements in relations that are part of solutions. Therefore, we rely on the following notion of compatibility.
Proposition 3. Algorithm QALG is correct.
Proof (Idea). Correctness of QALG can be established by
adapting the original proof (Chen 2004) and establishing
conditions C similar to invariants for other formalisms (Samer
and Szeider 2010).
Results for our Meta Algorithm PSCA. Finally, we state that the new table algorithm APRJ is indeed correct, assuming that a correct table algorithm A is given.
Origin          Problem P                                   Runtime tower(i, Θ(k)) · ‖I‖^{O(1)} of #PSOLS(P)
Graphs          VERTEX COVER                                i = 2: N, H
Graphs          DOMINATING SET                              i = 2: N, H
Graphs          INDEPENDENT SET                             i = 2: N, H
Graphs          3-COLORABILITY                              i = 2: N, H
Logic           SAT                                         i = 2: N, △[1], ▽[1]
Logic           Σℓ−1QSAT, Πℓ−1QSAT, ℓ ≥ 2                   i = ℓ: N, △[3], ▽[2]
Logic Programs  ASP                                         i = 3: N, △[1], ▽[1,2]
Epistemic LPs   CANDIDATE WORLD VIEWS                       i = 4: N, ▽[2,5]
Epistemic LPs   WORLD VIEWS                                 i = 5: N, ▽[2,5]
Reasoning       ABDUCTION, CIRCUMSCRIPTION                  i = 3: N, ▽[2]
Argumentation   CREDpreferred, CREDsemi-st, CREDstage       i = 3: N, △[4], ▽[2,4]

Table 1: Runtime upper (N, △) and lower (H, ▽) bounds (under ETH) of #PSOLS(P) for selected problems P. I refers to an instance of #PSOLS(P), and k to the treewidth of I. (△, ▽) indicate previously known bounds. For problem definitions, we refer to the problem compendium of (Fichte, Hecher, and Pfandler 2020). Due to space reasons, we abbreviate references as follows: [1]: (Fichte et al. 2018); [2]: (Fichte, Hecher, and Pfandler 2020); [3]: (Chen 2004); [4]: (Fichte, Hecher, and Meier 2019); [5]: (Hecher, Morak, and Woltran 2020).
Theorem 2 (⋆). Given a correct algorithm A for problem P and an instance I of P. Then, Algorithm PSCA is correct, and s-ipsc(⟨π(n)⟩, ν(n)) for the TD-root n outputs the solution to problem #PSOLS(P) for the instance consisting of I and any projection P.
established lower bounds for projected solution counting problems (under ETH) by also providing the corresponding upper bounds that are achieved with our framework. While we did not elaborate on it in detail, our work also allows for different graph representations, such as the incidence graph.

The presented research opens up a variety of new questions. We believe that an implementation for projected counting can be quite interesting. Another interesting direction for future work is to extend the existing counting framework to projected solution enumeration with linear delay. We believe that projected counting or enumeration can be a promising extension for well-known graph problems, yielding new insights and wider applications, as was already the case for abstract argumentation (Fichte, Hecher, and Meier 2019).
Corollary 3. Algorithm PSCQALG is correct and outputs for
any given instance of #Σℓ QSAT its projected solution count.
As a side result, we immediately obtain a meta algorithm for counting (without projection). This includes problems P where the corresponding DP algorithm A might compute more involved records, such that a trivial extension of A with counters might count duplicate solutions.

Corollary 4. Given an instance ⟨D, (RṘ)Ṙ∈σ⟩ of problem P = ⟨σ, ξ, ·⟩ and a correct table algorithm A for P. If we set RṘ := D^{ar(Ṙ)} for each Ṙ ∈ ξ and run algorithm PSCA on the instance ⟨D, (RṘ)Ṙ∈σ∪ξ⟩, then the value s-ipsc(⟨π(n)⟩, ν(n)) for the TD-root n is the number of solutions to P.
References
Abiteboul, S.; Hull, R.; and Vianu, V. 1995. Foundations of
Databases: The Logical Level. Addison-Wesley, 1st edition.
Arnborg, S.; Lagergren, J.; and Seese, D. 1991. Easy problems for tree-decomposable graphs. J. Algorithms 12(2):308–340.
Aziz, R. A.; Chu, G.; Muise, C.; and Stuckey, P. 2015.
#(∃)SAT: Projected Model Counting. In SAT’15, 121–137.
Springer.
Aziz, R. A. 2015. Answer Set Programming: Founded
Bounds and Model Counting. Ph.D. Dissertation, University
of Melbourne.
Biere, A.; Heule, M.; van Maaren, H.; and Walsh, T., eds.
2009. Handbook of Satisfiability, volume 185 of Frontiers in
Artificial Intelligence and Applications. IOS Press.
Bliem, B.; Pichler, R.; and Woltran, S. 2013. Declarative
dynamic programming as an alternative realization of courcelle’s theorem. In IPEC’13, volume 8246 of LNCS, 28–40.
Springer.
Bodlaender, H. L., and Kloks, T. 1996. Efficient and constructive algorithms for the pathwidth and treewidth of graphs. J.
Algorithms 21(2):358–402.
Table 1 gives a brief overview of selected problems P and
their respective runtime upper bounds (N) obtained via DPA ,
as well as lower bounds (H, ▽) under the exponential time
hypothesis (ETH).
Conclusion and Future Work
We introduced a novel framework and meta algorithm for
counting projected solutions in a variety of domains. We use
finite structures for specifying the problem and the graph representations on which we run dynamic programming. With
this general tool at hand to describe problems, we employ
dynamic programming (DP) on tree decompositions of the
Gaifman graph (primal graph) of the input given in the finite
structure. Conveniently, we can reuse already established DP
algorithms and impose very weak conditions for their usage
for projected counting. Interestingly, a very general technique that describes implementations of counting techniques
(without projection) in a similar fashion, namely relational
algebra, is also useful and competitive in practice (Fichte et
al. 2020). Further, we completed the picture of previously
Capelli, F., and Mengel, S. 2019. Tractable QBF by knowledge compilation. In STACS’19, volume 126 of LIPIcs, 18:1–
18:16. Dagstuhl.
Charwat, G., and Woltran, S. 2019. Expansion-based QBF solving on tree decompositions. Fundam. Inform. 167(1-2):59–92.
Chavira, M., and Darwiche, A. 2008. On probabilistic inference by weighted model counting. Artificial Intelligence 172(6–7):772–799.
Chen, H. 2004. Quantified constraint satisfaction and
bounded treewidth. In ECAI’04, 161–170. IOS Press.
Curticapean, R. 2018. Counting problems in parameterized
complexity. In IPEC’18, volume 115 of LIPIcs, 1:1–1:18.
Dagstuhl. 978-3-95977-084-2.
Cygan, M.; Fomin, F. V.; Kowalik, Ł.; Lokshtanov, D.; Marx, D.; Pilipczuk, M.; Pilipczuk, M.; and Saurabh, S. 2015. Parameterized Algorithms. Springer.
Domshlak, C., and Hoffmann, J. 2007. Probabilistic planning
via heuristic forward search and weighted model counting. J.
Artif. Intell. Res. 30.
Doubilet, P.; Rota, G.-C.; and Stanley, R. 1972. On the
foundations of combinatorial theory. vi. the idea of generating
function. In Berkeley Symposium on Mathematical Statistics
and Probability, 2: 267–318.
Dueñas-Osorio, L.; Meel, K. S.; Paredes, R.; and Vardi,
M. Y. 2017. Counting-based reliability estimation for powertransmission grids. In AAAI’17, 4488–4494. AAAI Press.
Durand, A.; Hermann, M.; and Kolaitis, P. G. 2005. Subtractive reductions and complete problems for counting complexity classes. Theoretical Computer Science 340(3):496–513.
Fichte, J. K., and Hecher, M. 2019. Treewidth and counting
projected answer sets. In LPNMR’19, volume 11481 of
Paraconsistent Logics for Knowledge Representation and Reasoning:
advances and perspectives
Walter Carnielli 1,2,3,4 and Rafael Testa 1
1 Centre for Logic, Epistemology and the History of Science
2 Institute for Philosophy and Human Sciences
University of Campinas (UNICAMP)
3 Advanced Institute for Artificial Intelligence (AI2)
4 Modal Institute
walterac@unicamp.br, rafaeltesta@gmail.com
Abstract

This paper briefly outlines some advancements in paraconsistent logics for modeling knowledge representation and reasoning. Emphasis is given to the so-called Logics of Formal Inconsistency (LFIs), a class of paraconsistent logics that formally internalize the very concept(s) of consistency and inconsistency. A couple of specialized systems based on the LFIs will be reviewed, including belief revision and probabilistic reasoning. Potential applications of those systems in the AI area of KRR are tackled by illustrating some examples that emphasize the importance of a fine-tuned treatment of consistency in modeling reputation systems, preferences, argumentation, and evidence.

1 Introduction

Non-classical logics find several applications in artificial intelligence, including multi-agent systems and reasoning with vagueness, uncertainty, and contradictions, among others, mostly akin to the area of knowledge representation and reasoning (Thomason 2020). In this latter area there is a plethora of aims and applications in view when representing the knowledge of an agent, including fields beyond AI such as software engineering, databases, and robotics. Several logics have been studied for these purposes, including non-monotonic, epistemic, temporal, many-valued, and fuzzy logics. This paper highlights the use of paraconsistent logics in some inconsistency-tolerant frameworks, introducing the family of Logics of Formal Inconsistency (LFIs) for representing reasoning that makes use of the very notions of consistency and inconsistency, suitably formalized within the systems.

2 Reasoning under contradiction

2.1 The informative power of contradictions

Contradictory information is not only frequent, and more so as systems increase in complexity, but can have a positive role in human thought, in some cases not being totally undesirable. Finding contradictions in juridical testimonies, in statements from suspects of a crime or of tax fraud, for instance, can be an effective strategy: contradictions can be very informative in those cases (Carnielli and Coniglio 2016).

Indeed, the so-called Bar-Hillel-Carnap paradox already suggested, half a century ago, a collapse between the notions of contradiction and semantic information: the less probable a statement is, the more informative it is, and so contradictions carry the maximum amount of information (Carnap and Bar-Hillel 1952). However, in the light of standard logic, contradictions are "too informative to be true", as a famous quote by the latter has it. To face the task of reasoning under contradictions, a field where human agents excel, is a difficult philosophical problem for standard logic, which is forced to equate triviality and contradiction, and to regard all contradictions as equivalent. However, skipping all technicalities in favor of a clear intuition (technical details can be found in (Mendonça 2018)), the Bar-Hillel-Carnap observation is not paradoxical for LFIs.

2.2 The beginnings of Paraconsistent Logics (modern era)

The idea of a non-Aristotelian logic was advanced in a lecture in 1919 by Nicolai A. Vasiliev, who proposed a kind of reasoning free from the laws of excluded middle and contradiction, called Imaginary Logic in analogy with Lobachevsky's imaginary geometry. Such a logic would be valid, as its author has it, only for reasoning in "imaginary worlds" (Vasiliev 1912).

A more concrete example of a system for reasoning with contradictions can be found in the Discussive Logic (Jaśkowski 1948), advanced as a formal answer to a puzzling situation posed by J. Łukasiewicz: which logic applies in the situation where one has to defend some judgment A while also considering not-A for the sake of the argument? Jaśkowski's strategy is to avoid the combination of conflicting information by blocking the rule of adjunction. The idea is to make room for A and ¬A without entailing A ∧ ¬A, since classical explosion actually still holds in the form A ∧ ¬A ⊢ B. In terms of reasoning, this has a straightforward meaning: each agent must still be consistent! Jaśkowski's intuitions contributed to the proposal of the society semantics and, in the general case, of the possible-translations semantics. A discussion of some conceptual points involving society semantics and their role in collective intelligence can be found in (Carnielli and Lima-Marques 2017; Testa 2020).

Another precursor, with a multi-valued approach, is the Logic of Nonsense (Halldén 1949), which, despite its name, captured a meaningful form of reasoning, aiming to study logical paradoxes by means of 3-valued logical matrices (closely related to the Nonsense Logic introduced in 1938 by A. Bochvar). An analogous approach is that of F. Asenjo, who introduced a 3-valued logic as a formal framework for studying antinomies by means of 3-valued Kleene truth-tables for negation and conjunction, where the third truth-value is distinguished (Asenjo 1966). The same logic has been studied by G. Priest, from the perspective of matrix logics, in the form of the so-called Logic of Paradox (LP) (Priest 1979).

With respect to a constructive approach to intuitionistic negation, D. Nelson proposed an extension of positive intuitionistic logic with a strong negation, a connective designed to capture the notion of "constructible falsity". By eliminating explosion, Nelson obtained a (first-order) paraconsistent logic (Nelson 1959).

Focusing on the status of contradictions in mathematical reasoning, N. da Costa advanced a hierarchy of paraconsistent systems Cn (for n ≥ 1) tolerant to contradictions, where the consistency of a formula A (in his terminology, the 'well-behavior' of A) is defined in C1 by the formula A◦ = ¬(A ∧ ¬A). Let A1 =def A◦ and An+1 =def (An)◦. Then, in Cn, the following hold: (i) well-behavior is denoted by A(n) =def A1 ∧ · · · ∧ An; (ii) A, ¬A ⊬ B in general, but A(n), A, ¬A ⊢ B always holds; and (iii) A(n), B(n) ⊢ (A#B)(n) and A(n) ⊢ (¬A)(n).

By concentrating on the non-triviality of the systems rather than on the absence of contradictions, da Costa defined a logic to be paraconsistent with respect to ¬ if it can serve as a basis for ¬-contradictory yet non-trivial theories (da Costa 1974):

Definition 1. ∃Γ∃α∃β(Γ ⊢ α and Γ ⊢ ¬α and Γ ⊬ β)

2.3 Motivations: main approaches

Preservationism. Similarly to discussive logic, there is a clear distinction between an inconsistent data set like {A, ¬A} (which is considered tractable) and a contradiction of the form A ∧ ¬A (intractable). Thus, given an inconsistent collection of sentences (in an already defined logic L, usually classical logic), one should not try to reason about that collection as a whole, but rather focus on internally consistent subsets of premises (Schotch, Brown, and Jennings 2009).

Relevant Logics. Relevant logics are mainly concerned with a meaningful connection between the premises and the conclusion of an argument, thus not accepting, for example, inferences like B ⊢ A → B. This strategy induces a paraconsistent character in the resulting deductions, since A and ¬A, as premises, do not necessarily have a meaningful connection with an arbitrary conclusion B (Anderson, Belnap, and Dunn 1992).

Adaptive Logics. Human reasoning can be better understood as endowed with many dynamic consequence relations. Adaptive reasoning recognizes so-called abnormalities and develops formal strategies to deal with them: for instance, an abnormality might be an inconsistency (inconsistency-adaptive logics) or an inductive inference, and a strategy might be excluding a line of a proof (by marking it) or changing an inference rule (Batens 2001).

Dialetheism. A dialetheia is a sentence A such that both it and its negation ¬A are true. Assuming that falsity is the truth of negation, a dialetheia is then a sentence which is both true and false. Dialetheism, accordingly, is the metaphysical view that there are dialetheias, i.e., that there are true contradictions. As such, dialetheism opposes the Law of Non-Contradiction in the form ¬(A ∧ ¬A) (Priest 1987). A system admitting 'both' as a truth-value, for instance, is the aforementioned Logic of Paradox.

Inconsistent (or rather Contradictory) Formal Systems. The main idea is that there are situations in which contradictions can, at least temporarily, be admissible if their "behavior can be somehow controlled", as da Costa has it (op. cit.). Contemporaneously, (Carnielli and Marcos 2002) extended and further generalized such notions, giving rise to the so-called Logics of Formal Inconsistency, presented in the next section.

3 Logics of Formal Inconsistency (LFIs)

3.1 Contradiction, consistency, inconsistency, and triviality

LFIs are a family of paraconsistent logics designed to express the notion(s) of consistency and inconsistency (sometimes one defined from the other, sometimes taken as primitive, depending on the strength of the axioms) within the object language, by employing a connective "◦" (or "•"), where ◦α means that "α is consistent" (and •α means that "α is inconsistent"), further expanding and generalizing da Costa's hierarchy of C-systems. Accordingly, the principle of explosion is not valid in general; this law is not abolished, however, but restricted to the so-called "consistent sentences", a feature captured by the following law, referred to as the "principle of Gentle Explosion" (PGE):

α, ¬α, ◦α ⊢ β, for every β, but α, ¬α ⊬ β for some β (1)

In formal terms, we have the following (Carnielli and Coniglio 2016):

Definition 2 (A formal definition of LFI). Let L be a Tarskian logic with a negation ¬. The logic L is an LFI if there is a non-empty set ◯(p) of formulas in some language L of L which depends only on the propositional variable p, satisfying the following:
a. ∃α∃β(¬α, α ⊬ β)
b. ∃α∃β(◯(α), α ⊬ β)
c. ∃α∃β(◯(α), ¬α ⊬ β)
d. ∀α∀β(◯(α), α, ¬α ⊢ β)
For any formula α, the set ◯(α) is intended to express, in a specific sense, the consistency of α relative to the logic L. When this set is a singleton, its sole element is denoted by ◦α, thus defining a consistency operator.

The connective "◦", as mentioned, is not necessarily a primitive one. Indeed, LFI is an umbrella definition that covers many paraconsistent logics in the literature.

Remark 3 (Some notable LFIs). Following Definition 2, it can easily be proved that some well-known logics in the literature are LFIs, including the aforementioned Jaśkowski's Discussive Logic, Halldén's nonsense logic and, as expected, da Costa's C-systems (Carnielli and Coniglio 2016; Carnielli, Coniglio, and Marcos 2007; Carnielli and Marcos 2002).

It is worth observing that each of the aforementioned logics has its own motivations and particularities; Remark 3 is to be understood as a logico-mathematical reminder that those logics share some common results and properties.
3.2 A family of LFIs

It should be clear that the notions of consistency and non-contradiction do not coincide in the LFIs, and that the same holds for the notions of inconsistency and contradiction. There is, however, a fully-fledged hierarchy of LFIs where consistency is gradually connected to non-contradiction.

Starting from positive classical logic plus tertium non datur (α ∨ ¬α), mbC is one of the basic logics intended to comply with Definition 2 in a minimal way: an axiom schema called (bc1) is added solely to capture the aforementioned principle of gentle explosion.

Definition 4 (mbC (Carnielli and Marcos 2002)). The logic mbC is defined over the language L (generated by the connectives ∧, ∨, →, ¬, ◦) by means of a Hilbert system as follows:
Axioms:
(A1) α → (β → α)
(A2) (α → β) → ((α → (β → δ)) → (α → δ))
(A3) α → (β → (α ∧ β))
(A4) (α ∧ β) → α
(A5) (α ∧ β) → β
(A6) α → (α ∨ β)
(A7) β → (α ∨ β)
(A8) (α → δ) → ((β → δ) → ((α ∨ β) → δ))
(A9) α ∨ (α → β)
(A10) α ∨ ¬α
(bc1) ◦α → (α → (¬α → β))
Inference Rule:
(Modus Ponens (MP)) α, α → β ⊢ β

(A1)-(A10) plus (MP) coincide with Batens' paraconsistent logic CLuN; it is worth mentioning that a non-monotonic characterization of the Ci-hierarchy (presented in Section 6) can be found in (Batens 2009). Furthermore, (A1)-(A9) plus (MP) define positive classical propositional logic CPL+.

mbC can be characterized in terms of valuations over {0, 1} (also called bivaluations), but cannot be semantically characterized by finite matrices (cf. (Carnielli, Coniglio, and Marcos 2007)). Surprisingly, however, mbC can be characterized by 5-valued non-deterministic matrices, as shown in (Avron 2005) (details also in Example 6.3.3 of (Carnielli and Coniglio 2016)).

Definition 5 (Valuations for mbC). A function v : L → {0, 1} is a valuation for mbC if it satisfies the following clauses:
(Biv1) v(α ∧ β) = 1 ⟺ v(α) = 1 and v(β) = 1
(Biv2) v(α ∨ β) = 1 ⟺ v(α) = 1 or v(β) = 1
(Biv3) v(α → β) = 1 ⟺ v(α) = 0 or v(β) = 1
(Biv4) v(¬α) = 0 ⟹ v(α) = 1
(Biv5) v(◦α) = 1 ⟹ v(α) = 0 or v(¬α) = 0

The semantic consequence relation associated with valuations for mbC is defined as expected: X ⊨mbC α iff, for every mbC-valuation v, if v(β) = 1 for every β ∈ X then v(α) = 1.

Definition 6 (Extensions of mbC (Carnielli and Marcos 2002; Carnielli, Coniglio, and Marcos 2007; Carnielli and Coniglio 2016)). Consider the following axioms:
(ciw) ◦α ∨ (α ∧ ¬α)
(ci) ¬◦α → (α ∧ ¬α)
(cl) ¬(α ∧ ¬α) → ◦α
(cf) ¬¬α → α
(ce) α → ¬¬α
Some interesting extensions of mbC are the following:
mbCciw = mbC + (ciw)
mbCci = mbC + (ci)
bC = mbC + (cf)
Ci = mbC + (ci) + (cf) = mbCci + (cf)
mbCcl = mbC + (cl)
Cil = mbC + (ci) + (cf) + (cl) = mbCci + (cf) + (cl) = mbCcl + (cf) + (ci) = Ci + (cl)

The semantic characterization by bivaluations for all these extensions of mbC can easily be obtained from the one for mbC (see (Carnielli, Coniglio, and Marcos 2007; Carnielli and Coniglio 2016)). For instance, mbCciw is characterized by mbC-valuations such that v(◦α) = 1 if and only if v(α) = 0 or v(¬α) = 0 (if and only if v(α) ≠ v(¬α)).

Notation 7 (Derived bottom particle and strong negation). ⊥ =def α ∧ ¬α ∧ ◦α and ∼α =def α → ⊥ (for any α).

It is then clear that the LFIs are at the same time subsystems and extensions of CPL. They can be seen as classical logic extended by two connectives: a paraconsistent negation and a consistency connective (or an inconsistency one, dual to it). In formal terms, consider CPL defined over the language L0 generated by the connectives ∧, ∨, →, ¬, where ¬ represents the classical negation instead of the paraconsistent one. If Y ⊆ L0 then ◦(Y) = {◦α : α ∈ Y}. Then the following result can be obtained:

Observation 8 (Derivability Adjustment Theorem (Carnielli and Marcos 2002)). Let X ∪ {α} be a set of formulas in L0. Then X ⊢CPL α if and only if ◦(Y), X ⊢mbC α for some Y ⊆ L0.
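Because mbC-valuations are not truth-functional (clauses (Biv4) and (Biv5) only constrain, rather than determine, the values of ¬α and ◦α), the consequence relation of Definition 5 can be checked by brute-force enumeration over a subformula closure. The sketch below is our own illustration, not code from the paper; the formula encoding and all names are ours. It verifies the failure of explosion, gentle explosion, and the fact that ◦p entails ¬(¬p ∧ p) in mbC but not conversely.

```python
from itertools import product

# Formulas as nested tuples: ('p',), ('not', A), ('and', A, B),
# ('or', A, B), ('imp', A, B), ('circ', A)  -- 'circ' is the "o" sign.

def closure(formulas):
    """Subformula closure; for each circ-formula also add the negation
    of its argument, since clause (Biv5) mentions v(not a)."""
    seen, stack = set(), list(formulas)
    while stack:
        f = stack.pop()
        if f in seen:
            continue
        seen.add(f)
        if f[0] in ('not', 'circ'):
            stack.append(f[1])
        if f[0] == 'circ':
            stack.append(('not', f[1]))
        if f[0] in ('and', 'or', 'imp'):
            stack.append(f[1]); stack.append(f[2])
    return sorted(seen, key=str)

def valuations(formulas):
    """Every 0/1 assignment on the closure satisfying (Biv1)-(Biv5)."""
    subs = closure(formulas)
    for bits in product((0, 1), repeat=len(subs)):
        v = dict(zip(subs, bits))
        ok = True
        for f in subs:
            if f[0] == 'and':
                ok &= v[f] == (v[f[1]] and v[f[2]])            # (Biv1)
            elif f[0] == 'or':
                ok &= v[f] == (v[f[1]] or v[f[2]])             # (Biv2)
            elif f[0] == 'imp':
                ok &= v[f] == (1 if v[f[1]] == 0 or v[f[2]] == 1 else 0)
            elif f[0] == 'not' and v[f] == 0:
                ok &= v[f[1]] == 1                             # (Biv4)
            elif f[0] == 'circ' and v[f] == 1:
                ok &= v[f[1]] == 0 or v[('not', f[1])] == 0    # (Biv5)
        if ok:
            yield v

def entails(premises, conclusion):
    """X |=mbC alpha by brute force over bivaluations."""
    forms = list(premises) + [conclusion]
    return all(v[conclusion] == 1
               for v in valuations(forms)
               if all(v[x] == 1 for x in premises))

p, q = ('p',), ('q',)
np_, cp = ('not', p), ('circ', p)
print(entails([p, np_], q))                     # False: no explosion
print(entails([cp, p, np_], q))                 # True: gentle explosion
print(entails([cp], ('not', ('and', np_, p))))  # True
print(entails([('not', ('and', np_, p))], cp))  # False: converse fails
```

The last two checks anticipate the separation of consistency from non-contradictoriness discussed in Section 4.2.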
4 Paraconsistent Belief Change

Belief Change in a wide sense has been a subject of philosophical reflection since antiquity, including discussions about the mechanisms by which scientific theories develop and the proposal of rationality criteria for revisions of probability assignments (Fermé and Hansson 2018). Contemporaneously, there is a strong tendency towards confluence of the research traditions on the subject from philosophy and from computer science (Hansson 1999).

The most influential paradigm in this area of study is the AGM model (Alchourrón, Gärdenfors, and Makinson 1985), in which epistemic states are represented as theories, considered simply as sets of sentences closed under logical consequence. Three types of epistemic changes (or operations) are considered in this model: expansion, the incorporation of a sentence into a given theory; contraction, the retraction of a sentence from a given theory; and revision, the incorporation of a sentence into a given consistent theory while ensuring the consistency of the resulting one.

Notably, given the possibility of reasoning with contradictions (as paraconsistent logics have it), as well as the aforementioned scrutiny of the very concept of "consistency", the definition of revision can be refined. Indeed, there are some investigations in the literature along this direction:

Based on the four-valued relevant logic of first-degree entailment, (Restall and Slaney 1995) define an AGM-like contraction that does not satisfy the recovery postulate. Revision is obtained from contraction by the Levi identity (to be introduced).

Also based on first-degree entailment, (Tamminga 2001) advances a system that puts forth a distinction between information and belief. Techniques of expansion, contraction and revision are applied to information (which can be contradictory), while operations of another kind are advanced for extracting beliefs from that information. The demand for consistency (i.e. non-contradictoriness) applies only to those beliefs.

(Mares 2002) proposes a model in which an agent's belief state is represented by a pair of sets: one of these is the belief set, and the other consists of the sentences that the agent rejects. A belief state is coherent if and only if the intersection of these two sets is empty, i.e. if and only if there is no statement that the agent both accepts and rejects. In this model, belief revision preserves coherence but does not necessarily preserve consistency.

Also departing from a distinction between consistency and coherence, (Chopra and Parikh 1999) advance a model based on Belnap and Dunn's logic that preserves an agent's ability to answer contradictory queries in a coherent way, splitting the language to distinguish between implicit and explicit beliefs.

In (Priest 2001) and (Tanaka 2005), it is suggested that revision can be performed by just adding sentences without removing anything, i.e., revision can be defined as a simple expansion. Furthermore, Priest first pointed out that in a paraconsistent framework, revision on belief sets can be performed as external revision, defined with the reversed Levi identity as advanced for belief bases (Hansson 1993).

The fact is that the literature contains several systems that could be understood as endowed with a certain paraconsistent character, each one based on distinct strategies and motivations (see for instance (Fermé and Wassermann 2017) for an Iterated Belief Change perspective). An approach to Belief Change from the perspective of inconsistent formal systems was conceptually suggested by (da Costa and Bueno 1998). Departing from the technical advances of mbC and its extensions, (Testa, Coniglio, and Ribeiro 2017) go further in this direction, defining external and semi-revisions for belief sets, as well as consolidation (operations that were originally presented for belief bases (Hansson 1993; 1997)). By considering consistency as an epistemic attitude, and by allowing temporary contradictions, the informational power of the operations is maximized (as argued by (Testa 2015)).

It is worth mentioning that, as proposed by Priest and Tanaka (op. cit.), paraconsistent revision could be understood as a plain expansion. As explained by (Testa et al. 2018), to equate paraconsistent revision with expansion it is necessary to assume that consistency is equivalent to non-triviality in a paraconsistent setting and, furthermore, that no paraconsistent logic endows a bottom particle (primitive or defined). As this paper intends to highlight, neither assumption is true.

Remark 9. From now on, let us assume an LFI, namely L = ⟨L, ⊢L⟩, such that L is mbC or one of the extensions presented above. Since the context is clear, we will omit the subscript and simply denote ⊢L by ⊢ and, accordingly, the respective closure by Cn.
4.1 Revisions in the LFIs
In (Testa, Coniglio, and Ribeiro 2017) the so-called AGMp system is proposed, in which it is shown that a paraconsistent revision of a belief set K by a belief-representing sentence α (the operation K ∗ α) can be defined not only by the Levi identity as in classical AGM (that is, by a prior contraction by ¬α followed by an expansion by α) but also by the reversed Levi identity and other kinds of constructions where contradictions are temporarily accepted. Formally, we have the following.

Let K = Cn(K). The expansion of K by α (K + α) is given by:

Definition 10. K + α = Cn(K ∪ {α})

There are several constructions for defining a contraction operator. The one adopted here is partial meet contraction, constructed as follows (Alchourrón, Gärdenfors, and Makinson 1985):
1. Choose some maximal subsets of K (with respect to inclusion) that do not entail α.
2. Take the intersection of such sets.

The set of all maximal subsets of K that do not entail α is called the remainder set of K by α:

Definition 11 (Remainder). The remainder set of K by α is denoted by K⊥α; that is, K′ ∈ K⊥α iff:
(i) K′ ⊆ K.
(ii) α ∉ Cn(K′).
(iii) If K′ ⊂ K′′ ⊆ K then α ∈ Cn(K′′).

Typically K⊥α contains more than one maximal subset. The main idea in constructing a contraction function is to apply a selection function γ which, intuitively, selects the sets in K⊥α containing the beliefs that the agent holds in higher regard (those beliefs that are more entrenched).

Definition 12 (Selection function). A selection function for K is a function γ such that, for every α:
1. γ(K⊥α) ⊆ K⊥α if K⊥α ≠ ∅.
2. γ(K⊥α) = {K} otherwise.

Partial meet contraction is the intersection of the sets of K⊥α selected by γ.

Definition 13 (Partial meet contraction). Let K be a belief set, and γ a selection function for K. The partial meet contraction on K generated by γ is the operation −γ such that, for all sentences α:

K −γ α = ⋂ γ(K⊥α)
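Definitions 11-13 can be made concrete for a finite belief base, with classical truth-table entailment standing in for Cn. This is a simplifying sketch of our own (AGM proper operates on logically closed sets, and the paper's setting is an LFI rather than CPL); all names are ours.

```python
from itertools import combinations, product

# Formulas: ('p',), ('not', A), ('and', A, B), ('or', A, B).

def atoms(f, acc=None):
    acc = set() if acc is None else acc
    if len(f) == 1:
        acc.add(f[0])
    else:
        for sub in f[1:]:
            atoms(sub, acc)
    return acc

def ev(f, row):
    """Classical truth-table evaluation (stand-in for the LFI's Cn)."""
    if len(f) == 1:
        return row[f[0]]
    if f[0] == 'not':
        return not ev(f[1], row)
    if f[0] == 'and':
        return ev(f[1], row) and ev(f[2], row)
    return ev(f[1], row) or ev(f[2], row)

def entails(base, alpha):
    names = atoms(alpha)
    for f in base:
        names |= atoms(f)
    names = sorted(names)
    for bits in product((False, True), repeat=len(names)):
        row = dict(zip(names, bits))
        if all(ev(f, row) for f in base) and not ev(alpha, row):
            return False
    return True

def remainder(base, alpha):
    """K ⊥ α: maximal subsets of the base not entailing α (Def. 11)."""
    subsets = [frozenset(c) for n in range(len(base), -1, -1)
               for c in combinations(base, n) if not entails(c, alpha)]
    return {s for s in subsets if not any(s < t for t in subsets)}

def partial_meet_contraction(base, alpha, select=None):
    """K -_γ α = ⋂ γ(K ⊥ α); the default γ selects every remainder,
    i.e. full meet contraction (Defs. 12-13)."""
    rem = remainder(base, alpha)
    if not rem:                      # α is a tautology: γ returns {K}
        return set(base)
    chosen = select(rem) if select else rem
    return set.intersection(*map(set, chosen))

p, q = ('p',), ('q',)
K = [p, q]
print(partial_meet_contraction(K, ('and', p, q)))  # set(): full meet drops both
print(partial_meet_contraction(K, p))              # {('q',)}
```

The first call shows why the selection function matters: K⊥(p ∧ q) is {{p}, {q}}, and selecting all remainders (full meet) wipes out both beliefs, while a more discriminating γ would keep one of them.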
The distinct revisions are then defined as follows:

Definition 14. Internal revision: (K − ¬α) + α
External revision: (K + α) − ¬α
Semi-revision: (K + α)!

The aforementioned operator "!", originally advanced for belief bases (Hansson 1997), is a particular case of contraction, called consolidation. In Hansson's original presentation, this operator is defined as a contraction by "⊥". In the context of LFIs, it is defined as the contraction by ΩK = {α ∈ K : there exists β ∈ L such that α = β ∧ ¬β}. The technical details of those operations, alongside a presentation through postulates and the respective representation theorems, can be found in the references.

4.2 Reasoning with consistency and inconsistency

Each of the LFIs in the aforementioned family (recall Definition 6) captures distinct properties regarding the notion of formal consistency. For instance, mbC separates the notion of consistency from that of non-contradictoriness (◦α ⊢ ¬(¬α ∧ α), but the converse does not hold), and also separates the notion of inconsistency from that of contradictoriness (α ∧ ¬α ⊢ ¬◦α, but the converse does not hold). In Ci, inconsistency and contradictoriness are identified (¬◦α ⊣⊢ α ∧ ¬α) and, in Cil, consistency and non-contradictoriness are identified (◦α ⊣⊢ ¬(α ∧ ¬α)).

This cautious way of dealing with the formal concept of consistency allows the modeling of significant forms of reasoning, as illustrated by the following example, adapted from (Hansson 1999). In Hansson's original presentation, it was intended to show a case of an external partial meet revision that is not also an internal partial meet revision; indeed, neither one can be subsumed under the other. In our analysis, the same conclusion applies: avoiding contradictions at every step of the reasoning prevents the revision from adducing the following significant results. Let ¬◦α =def •α, and let us consider Ci as the underlying logic.

Example 1. A man has died in a remote place in which only two other persons, Adam and Bob, were present. Initially, the public prosecutor believes that neither Adam nor Bob killed him. Thus her belief state contains ¬A (Adam has not killed the deceased) and ¬B (Bob has not killed the deceased). For simplicity, we may assume that her belief state is K0 = Cn({¬A, ¬B}).

Case 1: The prosecutor receives a police report saying (1) that the deceased has been murdered, and that either Adam or Bob must have done it; and (2) that Adam has previously been convicted of murder several times. After receiving the report, she revises her belief set by (A ∨ B) and by the assumption that Bob's innocence is indeed consistent, ◦¬B; i.e., she revises her initial belief set by (A ∨ B) ∧ ◦¬B.

Case 2: This differs from Case 1 only in that it is Bob who has previously been convicted of murder. Thus, the new piece of information consists of (A ∨ B) ∧ ◦¬A.

Internal Revision approach: If represented as an internal partial meet revision, when the first suboperation is performed (namely, contraction by ¬((A ∨ B) ∧ ◦¬B) and by ¬((A ∨ B) ∧ ◦¬A), respectively, in Case 1 and Case 2), we have that

K0⊥(¬((A ∨ B) ∧ ◦¬B)) = K0⊥(¬((A ∨ B) ∧ ◦¬A)).

The subsequent expansion does not necessarily add or delete Adam's or Bob's guilt/innocence in either case, since the previous contraction could indiscriminately delete Adam's or Bob's innocence, not taking advantage of the new piece of information as a whole.

External Revision approach: If represented as an external partial meet revision, we have the following.
Case 1: The police report brings about the expansion of K0 to K1 = Cn(K0 + (A ∨ B) ∧ ◦¬B). Notably, A ∈ K1 (on the grounds that ◦¬B, ¬B, A ∨ B ⊢ ◦¬B, ¬B, A ∨ ¬¬B ⊢ A). In plain English, Adam is now proven to be guilty. Moreover, •¬A ∈ K1 (for A ∧ ¬A ⊢ ¬¬A ∧ ¬A ⊢ •¬A), i.e., the initial assumption about Adam's innocence is logically proven to be inconsistent. The subsequent contraction thus has the means to delete the initial supposition of Adam's innocence.
Case 2: Mutatis mutandis.

Semi-revision approach: The semi-revision approach is analogous to the external revision one, with the distinction that the second suboperation (namely, contraction) does not necessarily delete Adam's or Bob's innocence (respectively in Case 1 and Case 2) but, rather, gives the option of deleting the new piece of information given by the police report.
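The three compositions of Definition 14 can be sketched schematically. The toy below is deliberately crude and entirely ours: beliefs are literals, contraction simply discards a sentence, and consolidation resolves each contradictory pair via a choice function standing in for γ. It does not reproduce the AGMp constructions; it only shows how the suboperations compose, and how external and semi-revision pass through an intermediate state that tolerates a contradictory pair.

```python
def neg(x):
    """Toy literal negation: 'a' <-> '~a'."""
    return x[1:] if x.startswith('~') else '~' + x

def expand(K, a):                  # K + a
    return K | {a}

def contract(K, a):                # K - a  (naive: drop the sentence)
    return K - {a}

def consolidate(K, keep):          # K!
    """For each contradictory pair {x, ~x} in K, retain only the
    literal chosen by `keep` (a stand-in for a selection function)."""
    out = set(K)
    for pair in {frozenset((x, neg(x))) for x in K if neg(x) in K}:
        x, y = sorted(pair)
        out.discard(y if keep(x, y) == x else x)
    return out

def internal_revision(K, a):       # (K - ~a) + a : Levi identity
    return expand(contract(K, neg(a)), a)

def external_revision(K, a):       # (K + a) - ~a : reversed Levi identity
    # the intermediate base K + a may contain a contradictory pair
    return contract(expand(K, a), neg(a))

def semi_revision(K, a, keep):     # (K + a)!
    return consolidate(expand(K, a), keep)

K = {'~a', 'b'}
print(sorted(internal_revision(K, 'a')))                # ['a', 'b']
print(sorted(external_revision(K, 'a')))                # ['a', 'b']
print(sorted(semi_revision(K, 'a', lambda x, y: 'a')))  # ['a', 'b']
print(sorted(semi_revision(K, 'a', lambda x, y: '~a'))) # ['b', '~a']
```

The last call illustrates the distinctive trait of semi-revision noted above: depending on the choice function, the agent may reject the incoming information rather than the prior belief.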
4.3 Formal consistency as an epistemic attitude

An alternative system considered in (Testa, Coniglio, and Ribeiro 2017), called AGM◦, relies heavily on the formal consistency operator. This means that the explicit constructions themselves (and, accordingly, the postulates) assume that such an operator plays a central role. In a static paradigm (i.e., when the focus is the logical consequence relation) this is already the case: assuming the consistency of a sentence involved in a contradiction entails a trivialization (as elucidated by the gentle explosion principle), which somehow captures and describes the intuition behind expansion.

The main idea of AGM◦ is to also incorporate the notion of consistency into contraction. In this case, a belief being consistent is interpreted as meaning that it is not liable to be removed from the belief set in question, adducing that the contraction endows the postulate of failure (namely, that if ◦α ∈ K then K − α = K).

The strategy is to incorporate the idea of non-revisibility into the selection function: a consistent belief remains in the epistemic state in any situation, unless the agent retracts the very fact that such a belief is consistent.

Definition 15 (Selection function for AGM◦ contraction). A selection function for K is a function γ′ such that, for every α:
1. γ′(K, α) ⊆ K⊥α if α ∉ Cn(∅) and ◦α ∉ K.
2. γ′(K, α) = {K} otherwise.

Contraction is then defined as in Definition 13.
In short, the seven epistemic attitudes defined in AGM ◦
are:
Definition 16 (Possible epistemic attitudes in AGM◦, see
figure 1 (Testa, Coniglio, and Ribeiro 2017; Testa 2014)).
Let K be a given belief set. Then, a sentence α is said to be:
Accepted if α ∈ K.
Rejected if ¬α ∈ K.
Under-determined if α ∈
/ K and ¬α ∈
/ K.
Over-determined if α ∈ K and ¬α ∈ K.
Consistent if ◦α ∈ K.
Boldly accepted if ◦α ∈ K and α ∈ K.
Boldly rejected if ◦α ∈ K and ¬α ∈ K (i.e. ∼G ∈ K).
2. Ellen, on the other hand, is a believer (G ∈ K). However, it may very well happen that she loses her faith so
definitely that she can never become a believer in God
again (◦¬G ∈ K).
3. Florence is an inveterate doubter. Nothing can bring her
to a state of firm (irreversible) belief (◦G 6∈ K) and
neither can she be brought to a state of firm disbelief
(◦¬G 6∈ K)
Paraconsistent Belief Revision based on the LFIs are
an important step for further advancements on systems
for detecting and handling with contradictions, mostly if
combined with tools for expressing probabilistic reasoning.
Some progress in this direction are overviewed in the following sections.
Figure 1: Epistemic attitudes in AGM◦ (a diagram relating the conditions ⊤; ◦α, α ∈ K; ◦α, ¬α ∈ K; α, ¬α ∈ K; α ∈ K; ◦α ∈ K; ¬α ∈ K; and α, ¬α ∉ K).

5 Sound probabilistic reasoning under contradiction

This section briefly surveys the research initiative on paraconsistent probability theory based on the LFIs and its consequences, which makes it possible to treat realistic probabilistic reasoning under contradiction.
Paraconsistent probabilities can be regarded as degrees of belief that a rational agent attaches to events, even if such degrees of belief might be contradictory. Thus it is not impossible for an agent to believe the propositions α and ¬α and still be rational, provided this belief is justified by evidence, as argued in (Bueno-Soler and Carnielli 2016).
A quite general notion of probability function can be defined, in such a way that different logics can be combined with probabilistic functions, giving rise to new measures that may reflect some subtle aspects of probabilistic reasoning.
Definition 17. A probability function for a language L of a logic L, or an L-probability function, is a function P : L → R satisfying the following conditions, where ⊢L stands for the syntactic derivability relation of L:
1. Non-negativity: 0 ≤ P(ϕ) ≤ 1 for all ϕ ∈ L
2. Tautologicity: If ⊢L ϕ, then P(ϕ) = 1
3. Anti-tautologicity: If ϕ ⊢L ⊥, then P(ϕ) = 0
4. Comparison: If ψ ⊢L ϕ, then P(ψ) ≤ P(ϕ)
5. Finite additivity: P(ϕ ∨ ψ) = P(ϕ) + P(ψ) − P(ϕ ∧ ψ)
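For the classical instance of ⊢L, the five conditions can be verified mechanically on a finite propositional model. A small sketch (the world-based uniform measure and the formula encoding are our illustrative assumptions, not part of the paper):

```python
from itertools import product

# Worlds: truth assignments to atoms p, q; formulas are functions world -> bool.
worlds = [dict(zip("pq", bits)) for bits in product([True, False], repeat=2)]
P = lambda phi: sum(1 for w in worlds if phi(w)) / len(worlds)  # uniform measure

p       = lambda w: w["p"]
q       = lambda w: w["q"]
taut    = lambda w: w["p"] or not w["p"]       # p or not-p
contra  = lambda w: w["p"] and not w["p"]      # p and not-p
p_and_q = lambda w: w["p"] and w["q"]
p_or_q  = lambda w: w["p"] or w["q"]

assert 0 <= P(p) <= 1                          # non-negativity
assert P(taut) == 1                            # tautologicity
assert P(contra) == 0                          # anti-tautologicity
assert P(p_and_q) <= P(p)                      # comparison (p and q entails p)
assert P(p_or_q) == P(p) + P(q) - P(p_and_q)   # finite additivity
```

A paraconsistent L-probability differs precisely in which entailments ⊢L feed the tautologicity, anti-tautologicity and comparison clauses.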
This collection of meta-axioms, by assuming an appropriate ⊢L (for instance, the classical, intuitionistic or paraconsistent derivability relation), defines distinct probabilities, each one deserving a full investigation. In particular, for the sake of this project, we have in mind paraconsistent probability theory based on the Logics of Formal Inconsistency, as treated in (Bueno-Soler and Carnielli 2016) and (Carnielli and Bueno-Soler 2017).
Several central properties of probability are preserved, such as the notion of paraconsistent updating, which is materialized through new versions of Bayes' theorem for conditionalization. Other papers have already proposed connections between non-classical logics and probabilities, even for the paraconsistent case (references can be found in the aforementioned works), recognizing that some non-classical logics
The following examples illustrate an important feature of human belief that, in classical AGM, has no room in a model based solely on contractions and revisions: the stubbornness of human belief. Instead of introducing the notions of necessity and possibility in the metalanguage, as suggested by (Hansson 1999), it is possible to capture such notions via the concept of bold acceptance. Indeed, as interpreted by (Testa 2014), this fact illustrates a well-studied feature regarding the proximity of the LFIs to modal logics.
Example 2. (Adapted from (Hansson 1999))
1. Doris is not religious, but she has religious leanings. She does not believe that God exists (G ∉ K), but it is possible for her to become a believer (∼G ∉ K).
are better suited to support uncertain reasoning in particular domains. The combination of probabilities and LFIs deserves to be emphasized, as it offers a quite natural and intuitive extension of standard probabilities which is useful and philosophically meaningful.
The following example uses the system Ci, a member of the LFI family with some features that make it reasonably close to classical logic (recall Definition 6); it is appropriate, in this way, for defining a generalized notion of probability strong enough to enjoy useful properties.
Observation 18 (Paraconsistent Bayes' Conditionalization Rule (PBCR) (Bueno-Soler and Carnielli 2016)). If P(α ∧ ¬α) ≠ 0, then:

P(α/β) = [P(β/α) · P(α)] / [P(β/α) · P(α) + P(β/¬α) · P(¬α) − δα]

where δα = P(β/α ∧ ¬α) · P(α ∧ ¬α) is the 'contradictory residue' of α.
It is clear that this rule generalizes the classical conditionalization rule, as it reduces to the classical case if P(α ∧ ¬α) = 0 or if α is consistent: indeed, in the latter case, P(β ∧ ◦α) = P(β ∧ ◦α ∧ α) + P(β ∧ ◦α ∧ ¬α), since P(◦α ∧ α ∧ ¬α) = 0.
We can interpret (PBCR) as Bayes' rule taking into account the likelihood relative to the contradiction. It is possible, however, to formulate other kinds of conditionalization rules by combining the notions of conditional probability, contradictoriness, consistency and inconsistency.
Example 3. Suppose that a doping test for an illegal drug is 98% accurate in the case of a regular user of that drug (i.e., it produces a positive result, showing "doping", with probability 0.98 in the case that the tested individual often uses the drug), and 90% accurate in the case of a non-user of the drug (i.e., it produces a negative result, showing "no doping", with probability 0.9 in the case that the tested individual has never used the drug or does not often use it).
Suppose, additionally, that: (i) it is known that 10% of the entire population of athletes often uses this drug; (ii) 95% of the entire population of athletes does not often use the drug or has never used it; and (iii) the test produces a positive result, showing "doping", with probability 0.11 for the whole population, independently of the tested individual.
Let the following be some mnemonic abbreviations:
D : the event that the drug test has declared "doping" (positive) for an individual;
C : the event that the drug test has declared "clear" or "no doping" (negative) for an individual;
A : the event that the person tested often uses the drug;
¬A : the event that the person tested does not often use the drug or has never used it.
We know that P(A) = 0.1 and P(¬A) = 0.95. The situation is clearly contradictory with respect to the events A and ¬A, as they are not mutually exclusive. Therefore, by finite additivity, P(A ∨ ¬A) = 1 = (P(A) + P(¬A)) − P(A ∧ ¬A), and thus P(A ∧ ¬A) = (P(A) + P(¬A)) − 1 = 0.05.
Furthermore, as given in the problem, P(D/A) = 0.98, P(C/¬A) = 0.9 and P(D) = 0.11. The results of the test have no paraconsistent character, since the events D ("doping") and C ("no doping") exclude each other. Thus, P(D/¬A) = 1 − P(C/¬A) = 0.1 and P(C/A) = 1 − P(D/A) = 0.02.
Suppose someone has been tested, and the test is positive ("doping"). What is the probability that the tested individual regularly uses this illegal drug, that is, what is P(A/D)? By applying the paraconsistent Bayes' rule:

P(A/D) = [P(D/A) · P(A)] / [P(D/A) · P(A) + P(D/¬A) · P(¬A) − δA]

where δA = P(D/A ∧ ¬A) · P(A ∧ ¬A), since P(A ∧ ¬A) ≠ 0.
All of the values are known, with the exception of P(D/A ∧ ¬A). Since

P(D/A ∧ ¬A) = P(D ∧ A ∧ ¬A) / P(A ∧ ¬A),

it remains to compute P(D ∧ A ∧ ¬A). It follows directly from easy properties of probability that P(D ∧ A ∧ ¬A) = P(D ∧ A) + P(D ∧ ¬A) − P(D) = P(D/A) · P(A) + P(D/¬A) · P(¬A) − P(D) = 0.083. Therefore, by plugging in all of the values, it follows that P(A/D) = 51.9%.¹
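The rule in Observation 18 can be transcribed as a small function; the names and the explicit δ parameter are our illustrative choices, not from the paper (δ is passed in separately, since its value depends on how P(β/α ∧ ¬α) is estimated):

```python
def pbcr(p_b_given_a, p_a, p_b_given_not_a, p_not_a, delta=0.0):
    """Paraconsistent Bayes' conditionalization P(alpha/beta), where
    delta is the 'contradictory residue' P(beta/alpha&~alpha)*P(alpha&~alpha)."""
    num = p_b_given_a * p_a
    return num / (num + p_b_given_not_a * p_not_a - delta)

# With delta = 0 (no contradictory residue) the rule collapses to
# ordinary Bayesian conditionalization:
classical = (0.98 * 0.1) / (0.98 * 0.1 + 0.1 * 0.9)
assert abs(pbcr(0.98, 0.1, 0.1, 0.9) - classical) < 1e-12
```

Since δ is subtracted in the denominator, a positive contradictory residue always raises the conditional probability relative to the classical value.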
This example suggests, as argued below, that the paraconsistent Bayes' conditionalization rule is more robust than traditional conditionalization, as it can provide useful results even in cases where the test could be regarded as ineffective due to contradictions. The following table compares the paraconsistent result with the results obtained by trying to remove the contradiction involving the events A (the event that the person tested often uses the drug) and ¬A (the event that the person tested does not often use the drug or has never used it), that is, by trying to make them "classical".
Since A and ¬A overlap by 5%, we might consider revising the values by "removing the contradiction" according to three hypothetical scenarios: an alarming scenario, lowering the value of P(¬A) by 5%; a happy scenario, lowering the value of P(A) by 5%; and a cautious scenario, dividing the surplus equally between A and ¬A. In each case we recompute the probability P(A/D) that the tested individual regularly uses the illegal drug.
Table 1: Removing the contradiction

Scenario    P(A)    P(¬A)    P(D/A)   P(D/¬A)   Result P(A/D)
Alarming    10%     90%      98%      10%       52%
Cautious    7.5%    92.5%    98%      10%       44%
Happy       5%      95%      98%      10%       34%
¹ The values correct some miscalculations in (Bueno-Soler and Carnielli 2016).
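The three "classical" scenarios in Table 1 are plain Bayesian computations once the 5% overlap has been reassigned; a quick check (variable names are ours):

```python
def bayes(p_pos_given_a, p_a, p_pos_given_not_a, p_not_a):
    # Classical conditionalization: P(A | positive test).
    num = p_pos_given_a * p_a
    return num / (num + p_pos_given_not_a * p_not_a)

# Overlap: P(A) + P(notA) - 1 = 0.10 + 0.95 - 1 = 0.05
scenarios = {
    "alarming": (0.10, 0.90),    # remove the 5% from P(notA)
    "cautious": (0.075, 0.925),  # split the surplus equally
    "happy":    (0.05, 0.95),    # remove the 5% from P(A)
}
for name, (p_a, p_not_a) in scenarios.items():
    print(name, round(100 * bayes(0.98, p_a, 0.10, p_not_a)))
# alarming 52, cautious 44, happy 34 -- matching Table 1
```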
Using paraconsistent probabilities, one obtains, in the case of this example, a value close to (even if a bit below) the "alarming" hypothetical scenario, helping one to make a decision even where the contradictory character of the data would make the test be seen as ineffective. In other words, the presence of a contradiction does not mean that we need to discard the test, provided we have reasoning tools that are sensitive and robust enough.

6 Possibility and necessity measures

Possibility theory is a generalization of (or an alternative to) probability theory devoted to dealing with certain types of uncertainty by means of possibility and necessity measures.
As mentioned above, it is well recognized that reasoning with contradictory premises is a critical issue, since large knowledge bases are inexorably prone to incorporate contradictions. Contradictory information arises because data is provided by different sources, or by a single source that delivers contradictory data as certain.
The connections between the possibilistic and the paraconsistent paradigms are complex, and various forms of contradiction can be accommodated within possibilistic logic, defining concepts such as 'paraconsistency degree' and 'paraconsistent completion' (Dubois and Prade 2015). Paraconsistent logics offer simple and effective models for reasoning in the presence of contradictions, as they avoid collapsing into deductive trivialism by a natural logical machinery. Taking into consideration that it is more natural and effective to reason from a contradictory information scenario than to try to remove the contradictions involved, the investigation of credal calculi concerned with necessity and possibility is naturally justified.
On the one hand, possibility theory based on classical logic is able to handle contradictions, but at the cost of expensive maneuvers (Dubois and Prade 2015). On the other hand, paraconsistent logics cannot easily express uncertainty in a gradual way. The blend of both via the LFIs, in view of the operators of consistency and inconsistency, offers a simple and natural qualitative and quantitative tool for reasoning with uncertainty.
The idea of defining possibility and necessity models, dubbed credal calculi, based on the Logics of Formal Inconsistency takes advantage of the flexibility of the notions of consistency "◦" and inconsistency "•". Some basic properties of possibility and necessity functions over the Logics of Formal Inconsistency have been investigated in (Carnielli and Bueno-Soler 2017), making clear that paraconsistent possibility and necessity reasoning can, in general, attain realistic models for artificial judgement.
A generic notion of logic-dependent necessity measure is given by the conditions below.
Definition 19 ((Carnielli and Bueno-Soler 2017)). A necessity function (or measure) for a language L in an LFI, called an LFI-necessity function, is a function N : L → R satisfying the following conditions, where ⊢L stands for the syntactic derivability relation of L:
1. Non-negativity: 0 ≤ N(ϕ) ≤ 1 for all ϕ ∈ L
2. Tautologicity: If ⊢L ϕ, then N(ϕ) = 1
3. Anti-tautologicity: If ϕ ⊢L ⊥, then N(ϕ) = 0
4. Comparison: If ψ ⊢L ϕ, then N(ψ) ≤ N(ϕ)
5. Conjunction: N(ϕ ∧ ψ) = min{N(ϕ), N(ψ)}
6. Metaconsistency: N(•α) + N(◦α) = 1
A condition N(α) = λ can be understood as expressing that 'α is certain to degree λ' (in all normal states of affairs). Possibilistic measures are also useful for representing preferences expressed as sets of prioritized goals, as with the lattice-valued possibility measures studied in the literature in place of real-valued ones. The parameter L in the above definition can be Cie, or the three-valued logic LFI1, or XXX (see the references for details).
Analogously to the necessity function, a generic notion of logic-dependent possibility measure (dual to a necessity function) is defined as follows:
Definition 20. A possibility function (or measure) for the language L of Cie, or a Cie-possibility function, is a function Π : L → R satisfying the following conditions:
1. Non-negativity: 0 ≤ Π(ϕ) ≤ 1 for all ϕ ∈ L
2. Tautologicity: If ⊢L ϕ, then Π(ϕ) = 1
3. Anti-tautologicity: If ϕ ⊢L ⊥, then Π(ϕ) = 0
4. Comparison: If ψ ⊢L ϕ, then Π(ψ) ≤ Π(ϕ)
5. Disjunction: Π(ϕ ∨ ψ) = max{Π(ϕ), Π(ψ)}
6. Metaconsistency: Π(•α) + Π(◦α) = 1
Standard necessity and possibility measures do not cope well with contradictions, since they treat contradictions in a global form (even if in a gradual way). This is the main reason to define new forms of necessity and possibility measures based upon paraconsistent logics: although they lack graduality, the LFIs offer a tool for handling contradictions in knowledge bases in a local form, by locating the contradictions on critical sentences. The combination of the two reaches a good balance: the paraconsistent paradigm by itself does not allow for any fine-grained graduality in the treatment of contradictions, which may lead to some loss of information when contradictions appear in a knowledge base. When enriched with possibility and necessity functions, however, a new reasoning tool emerges.
It is possible to define a natural non-monotonic consequence relation on databases acting under one of the logics L as above. Non-monotonic logics are structurally close to the internal reasoning of belief revision, as argued in (Gärdenfors 1990), where it is shown that the formal structures of the two theories are similar. The resulting logic systems have great potential for use in real-life knowledge representation and reasoning systems.
Another important concept that can be advantageously treated by the paraconsistent paradigm is the concept of evidence. The paper (Rodrigues, Bueno-Soler, and Carnielli 2020) introduces the logic of evidence and truth LETF as an extension of the Belnap-Dunn four-valued logic FDE. LETF is equipped with a classicality operator ◦ and its dual non-classicality operator •. It would be interesting to define possibility and necessity measures over LETF, generalizing the probability measures defined over LETF, and to further investigate the connections between the formal notions of evidence and the graded notions of possibility and necessity.
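Definitions 19 and 20 can be instantiated on a finite set of worlds in the standard possibilistic way: a possibility distribution π assigns each world a degree, Π(ϕ) is the best degree of a ϕ-world, and N(ϕ) = 1 − Π(¬ϕ). A sketch under these classical-instance assumptions (the encoding is ours):

```python
# Possibility distribution over four worlds (truth values for p, q).
pi = {(True, True): 1.0, (True, False): 0.7,
      (False, True): 0.4, (False, False): 0.1}

def Pi(phi):                      # possibility: best world satisfying phi
    return max((d for w, d in pi.items() if phi(w)), default=0.0)

def N(phi):                       # necessity: dual of possibility
    return 1.0 - Pi(lambda w: not phi(w))

p = lambda w: w[0]
q = lambda w: w[1]

# Disjunction axiom for Pi and conjunction axiom for N:
assert Pi(lambda w: p(w) or q(w)) == max(Pi(p), Pi(q))
assert N(lambda w: p(w) and q(w)) == min(N(p), N(q))
```

The paraconsistent versions differ in the underlying ⊢L and in the metaconsistency clauses for ◦ and •, which have no counterpart in this classical sketch.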
7 Other applications and further work

Description Logics (DLs) play an important role in the semantic web domain and in connection with computational ontologies, and incorporating uncertainty into DL reasoning has been the topic of lively research. DLs can be expanded with paraconsistent, probabilistic and possibilistic tools, or with combinations of them (one example of the relevance of paraconsistent reasoning for the Semantic Web can be found in (Zhang, Lin, and Wang 2010)). Enhancing DLs with LFI-probabilities and possibility measures is research in progress, and will represent a considerable step forward for DLs in regard to the representation of more realistic ontologies.
A second problem concerns clarifying the concept of evidence. As mentioned, (Rodrigues, Bueno-Soler, and Carnielli 2020) introduces the logic of evidence and truth LETF, a Logic of Formal Inconsistency and Undeterminedness that extends the Belnap-Dunn four-valued logic and formalizes a notion of evidence as a concept weaker than truth, in the sense that there may be evidence for a proposition α even if α is not true.
The paper proposes a probabilistic semantics for LETF taking into account paraconsistent and paracomplete scenarios (where the sum of the probabilities for α and ¬α, P(α) + P(¬α), is respectively greater or less than 1). Classical reasoning can be recovered when consistency and inconsistency behave within normality, that is, when P(◦α) = 1 or P(•α) = 0. In this way it is possible to obtain new versions of standard results of probability theory. By relating the concepts of evidence and coherence, it may be possible to obtain an enhanced version of the model proposed in (Chopra and Parikh 1999). This may represent an important leap forward in the clarification of the notion of evidence, which is ever more in demand in AI and KR.
Paraconsistent Bayesian networks are another topic of great interest. Bayesian networks are indispensable tools for expressing the dependency among events and assigning probabilities to them, thus ascertaining the effect of changes of occurrence in one event given the others.
Bayesian networks can be (roughly) represented as the nodes of an annotated acyclic graph (a set of directed edges between variables) that represents a joint (paraconsistent) probability distribution over a finite set of random variables V = {V1, ..., Vn}. Practice usually supposes that each variable has only a finite number of possible values, though this is not a mandatory restriction: numeric or continuous variables that take values from a set of continuous numbers can also be used.
For such discrete random variables, conditional probabilities are usually represented by a table containing the probability that a child node takes on each of its values given each combination of values of its parents; that is, to each variable Vi with parents {B1, ..., Bni} there is attached a conditional probability table relating Vi to its parents (regarded as "causes").
Paraconsistent Bayesian networks, notably when combined with paraconsistent belief revision (including (Testa, Coniglio, and Ribeiro 2017)) and with belief maintenance systems, can lead to a new approach to detecting and handling contradictions, and to producing explanations for their conclusions. This is naturally relevant, for instance, in medical diagnosis, natural language understanding, forensic sciences and other areas where evidence interpretation is an important issue.
Again, this is work in progress, but it seems clear that paraconsistent Bayesian networks may be useful and stimulating in a range of circumstances where contradictions are present.
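The parent/child conditional-probability-table representation described above can be sketched for the smallest possible network, a single edge A → D (the numbers reuse the doping example of Section 5 with classical values; the encoding is ours):

```python
# Node A (drug use) has no parents; node D (positive test) has parent A.
p_a = 0.10                               # prior P(A)
cpt_d = {True: 0.98, False: 0.10}        # P(D | A) for each value of A

# Marginal P(D) by summing over the parent's values:
p_d = cpt_d[True] * p_a + cpt_d[False] * (1 - p_a)

# Diagnostic query P(A | D) by Bayes' rule:
p_a_given_d = cpt_d[True] * p_a / p_d

assert abs(p_d - 0.188) < 1e-9
assert abs(p_a_given_d - 0.098 / 0.188) < 1e-12
```

A paraconsistent network would additionally carry entries for contradictory parent configurations such as A ∧ ¬A, which is where a residue like the δ of the PBCR would enter.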
Acknowledgments
The authors are grateful to the colleagues who participated in advancing the results presented in this paper. Carnielli acknowledges support from the National Council for Scientific and Technological Development (CNPq), Brazil, under research grant 307376/2018-4, and from the Modal Institute, Brasilia. Testa acknowledges support from the São Paulo Research Foundation, under research grants FAPESP 2014/22119-2 (at CLE-Unicamp, Brazil) and FAPESP 2017/10836-0 (at University of Madeira, Portugal).
References
Alchourrón, C. E.; Gärdenfors, P.; and Makinson, D. 1985.
On the logic of theory change: Partial meet contraction and
revision functions. The Journal of Symbolic Logic 50:510–
530.
Anderson, A. R.; Belnap, N. D.; and Dunn, J. M. 1992.
Entailment: The Logic of Relevance and Necessity, Volume
2. Princeton: Princeton University Press.
Asenjo, F. G. 1966. A calculus of antinomies. Notre Dame
Journal of Formal Logic 7(1):103–105.
Avron, A. 2005. Non-deterministic matrices and modular
semantics of rules. In Béziau, J.-Y., ed., Logica Universalis,
149–167. Basel: Birkhäuser Verlag.
Batens, D. 2001. A general characterization of adaptive
logics. Logique et Analyse 44(173-175):45–68.
Batens, D. 2009. Adaptive Cn logics. In Carnielli, W.;
Coniglio, M.; and D’Ottaviano, I. M. L., eds., The many
sides of logic, volume 21, 27–45. London, UK: College
Publications.
Bueno-Soler, J., and Carnielli, W. A. 2016. Paraconsistent probabilities: consistency, contradictions and Bayes' theorem. In Stern, J., ed., Statistical Significance and the Logic of Hypothesis Testing, Entropy 18(9). MDPI Publications. Open access at http://www.mdpi.com/1099-4300/18/9/325/htm.
Carnap, R., and Bar-Hillel, Y. 1952. An outline of a theory
of semantic information. In Research laboratory of electronics technical report 247. Massachusetts Institute of Technology.
Carnielli, W. A., and Bueno-Soler, J. 2017. Paraconsistent
probabilities, their significance and their uses. In Caleiro,
C.; Dionisio, F.; Gouveia, P.; Mateus, P.; and Rasga, J., eds.,
Essays in Honour of Amilcar Sernadas, volume 10500. London: College Publications. 197–230.
Carnielli, W., and Coniglio, M. 2016. Paraconsistent Logic: Consistency, Contradiction and Negation. New York: Logic, Epistemology, and the Unity of Science Series, Springer.
Carnielli, W., and Lima-Marques, M. 2017. Society semantics and the logic way to collective intelligence. Journal of
Applied Non-Classical Logics 27(3-4):255–268.
Carnielli, W., and Marcos, J. 2002. A taxonomy of
c-systems. In Carnielli, W. A.; Coniglio, M. E.; and
D’Ottaviano, I. M. L., eds., Paraconsistency: The Logical
Way to the Inconsistent, volume 228 of Lecture Notes in
Pure and Applied Mathematics, 1–94. Marcel Dekker.
Carnielli, W.; Coniglio, M.; and Marcos, J. 2007. Logics of
formal inconsistency. In Gabbay, D. M., and Guenthner, F.,
eds., Handbook of Philosophical Logic, volume 14, 1–93.
Springer.
Chopra, S., and Parikh, R. 1999. An inconsistency tolerant
model for belief representation and belief revision. In Proceedings of the Sixteenth International Joint Conference on
Artificial Intelligence, IJCAI 99. Stockholm, Sweden.
da Costa, N. C., and Bueno, O. 1998. Belief change and
inconsistency. Logique et Analyse 41(161/163):31–56.
da Costa, N. C. A. 1974. On the theory of inconsistent formal systems. Notre Dame Journal of Formal Logic
15(4):497–510.
Dubois, D., and Prade, H. 2015. Inconsistency management from the standpoint of possibilistic logic. International Journal of Uncertainty, Fuzziness and KnowledgeBased Systems 23:15–30.
Fermé, E., and Hansson, S. O. 2018. Belief Change: Introduction and Overview. Switzerland: Springer Briefs in
Intelligent Systems, Springer.
Fermé, E., and Wassermann, R. 2017. Iterated belief change
the case of expansion into inconsistency. In 2017 Brazilian
Conference on Intelligent Systems (BRACIS), 420–425.
Gärdenfors, P. 1990. Belief revision and nonmonotonic
logic: two sides of the same coin? In European Workshop
on Logics in Artificial Intelligence, 52–54. Springer.
Halldén, S. 1949. The Logic of Nonsense. Uppsala: A.-B. Lundequistska Bokhandeln.
Hansson, S. O. 1993. Reversing the Levi identity. Journal of
Philosophical Logic 22(6):637–669.
Hansson, S. 1997. Semi-revision. Journal of Applied Non-Classical Logics 7(1-2):151–175.
Hansson, S. O. 1999. A Textbook of Belief Dynamics. Theory
Change and Database Updating. Kluwer.
Jaśkowski, S. 1948. Rachunek zdań dla systemów dedukcyjnych sprzecznych. Studia Societatis Scientiarum
Torunensi (Sectio A) 1(5):55–77.
Mares, E. D. 2002. A paraconsistent theory of belief revision. Erkenntnis 56(2):229–246.
Mendonça, B. R. 2018. Traditional theory of semantic information without scandal of deduction: a moderately externalist reassessment of the topic based on urn semantics
and a paraconsistent application. Ph.D. Dissertation, IFCH,
Unicamp.
Nelson, D. 1959. Negation and separation of concepts in
constructive systems. In Heyting, A., ed., Constructivity in
Mathematics, volume 40, 208–225. North-Holland, Amsterdam.
Priest, G. 1979. The logic of paradox. Journal of Philosophical Logic 8(1):219–241.
Priest, G. 1987. In Contradiction: A Study of the Transconsistent. Dordrecht: Martinus Nijhoff. second edition, Oxford: Oxford University Press, 2006.
Priest, G. 2001. Paraconsistent belief revision. Theoria
67:214–228.
Restall, G., and Slaney, J. 1995. Realistic belief revision.
In De Glas, M., and Pawlak, Z., eds., Proceedings of the
Second World Conference in the Fundamentals of Artificial
Intelligence, 367–378. Paris: Angkor.
Rodrigues, A.; Bueno-Soler, J.; and Carnielli, W. A. 2020.
Measuring evidence: a probabilistic approach to an extension of Belnap-Dunn logic. Synthese. In print.
Schotch, P.; Brown, B.; and Jennings, R. 2009. On Preserving: Essays on Preservationism and Paraconsistent Logic.
Toronto: University of Toronto Press.
Tamminga, A. 2001. Belief Dynamics: (Epistemo)logical Investigations. Ph.D. Dissertation, Institute for Logic, Language and Computation, Universiteit van Amsterdam.
Tanaka, K. 2005. The AGM theory and inconsistent belief
change. Logique et Analyse 48:113–150.
Testa, R.; Fermé, E.; Garapa, M.; and Reis, M. 2018. How
to construct remainder sets for paraconsistent revisions: Preliminary report. Proceedings of the 17th International Workshop on Nonmonotonic Reasoning.
Testa, R.; Coniglio, M.; and Ribeiro, M. 2015. Paraconsistent belief revision based on a formal consistency operator.
CLE e-prints 15(8).
Testa, R.; Coniglio, M.; and Ribeiro, M. 2017. AGM-like
paraconsistent belief change. Logic Journal of the IGPL
25(4):632–672.
Testa, R. R. 2014. Revisão de Crenças Paraconsistente
baseada em um operador formal de consistência. Ph.D. Dissertation, IFCH, Unicamp.
Testa, R. 2015. The cost of consistency: information economy in paraconsistent belief revision. South American Journal of Logic 1(2):461–480.
Testa, R. 2020. Judgment aggregation and paraconsistency.
(working manuscript).
Thomason, R. 2020. Logic and artificial intelligence. In
Zalta, E. N., ed., The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, summer 2020 edition.
Vasiliev, N. 1912. Imaginäre (nichtaristotelische) logik. In
Zhurnal m–va nar. prosveshcheniya, volume 40, 207–246.
Zhang, X.; Lin, Z.; and Wang, K. 2010. Towards a paradoxical description logic for the semantic web. In Link, S., and
Prade, H., eds., Foundations of Information and Knowledge
Systems, 306–325. Springer.
Towards Interactive Conflict Resolution in ASP Programs
Andre Thevapalan, Gabriele Kern-Isberner
Department of Computer Science, TU Dortmund University, Dortmund, Germany
andre.thevapalan@tu-dortmund.de, gabriele.kern-isberner@cs.tu-dortmund.de
Abstract

Updating knowledge bases often requires topical expertise as to whether prior knowledge should be corrected, simply deleted, or merged with the new information. In this paper we introduce a formalism to update non-disjunctive ASP programs interactively with the user, by generating suitable suggestions on how to solve each conflict; the suggestions are based on an ASP update procedure by Eiter et al. The main goal is the development of a lean method to efficiently update ASP programs by highlighting possible causes of conflicts, generating solution suggestions for the user, and eventually modifying parts of the program, so that an updated, conflict-free program results in a guided way.

1 Introduction

In (Thevapalan et al. 2018) a prototype decision support system for mammary carcinoma therapy plans (MAMMA-DSCS) was introduced. The core of this system is based on answer set programs (ASP) with the extension HEX, which allows the connection of answer set programs to external sources (Eiter et al. 2005). MAMMA-DSCS was motivated by the steady growth of knowledge in the medical sector, especially in the oncological field. At a fast pace, new drugs and therapies are developed, and new ways are found to detect specific cancer subtypes (e.g., specific gene markers) which improve the therapy possibilities. Thus an application with a logic program as its core component was introduced. Rule-based systems like ASP offer a declarative paradigm which allows the extension of a program by simply adding more rules. However, despite the declarative paradigm, updating an existing ASP program with additional rules can be quite complex due to the contradictions the rules can potentially cause. In this paper, we present a method to update ASP programs interactively by handling all arising conflicts jointly with the user. Figure 1 gives an overview of the general approach of the method presented in this paper. There, a logic program P1 has to be updated with new information P2. Basically, the original programs P1, P2 are successively modified to programs P̂1, P̂2 by altering the conflict-causing rules. At the end of the process the update can be realized by simply uniting P̂1 and P̂2, because all conflicts have been resolved before. To find all conflicts we modify an approach to updating answer set programs presented in (Eiter et al. 2002). The resolution of each conflict is done in interaction with the user. For each conflict, suggestions on how to resolve it are generated. Each conflict is presented to the expert together with the matching suggestions, of which the expert can choose one. The active involvement of the expert guarantees an updated program which represents the knowledge of the expert in the best possible way. Especially in sensitive fields like medical therapies it is of the utmost importance that the encoded knowledge remains correct. The presented approach ensures this not only by providing transparency, showing the expert where the conflicts are located and what modifications are made, but also by actively involving the expert. The presentation of the interactive update approach is furthermore accompanied by a running example which depicts a scenario where the knowledge about the determination of therapies for cancer patients has to be updated with new, but conflicting, knowledge.
Note that instead of the update procedure of (Eiter et al. 2002), any other ASP update system that produces answer sets containing information on conflicting rules could be used for our interactive update approach.
The rest of the paper is organised as follows: Section 2 provides some necessary preliminaries regarding answer set programming. In Section 3 we explain the update of extended logic programs by causal rejection as presented in (Eiter et al. 2002). In Sections 4 and 5 we introduce our approach to detecting conflicts when updating extended logic programs with the mechanism of Section 3, and to resolving the conflicts interactively with an expert. Section 6 deals with related work. The paper ends in Section 7 with a short summary and a discussion of further extensions and improvements regarding the presented approach. For reasons of readability, we moved the larger programs of the running example to the appendix.

2 Preliminaries
In this paper we look at non-disjunctive extended logic programs (ELPs) (Gelfond and Lifschitz 1991). An ELP is a
finite set of rules over a set A of propositional atoms. A
literal L is either an atom A (positive literal) or a negated atom ¬A (negative literal). For a literal L, the complementary literal L̄ is ¬A if L = A, and A otherwise. For a set X of literals, X̄ = {L̄ | L ∈ X} is the set of corresponding complementary literals. LitA then denotes the set A ∪ Ā of all literals over A. A default-negated literal L is written as not L. A rule r is of the form

L0 ← L1, . . . , Lm, not Lm+1, . . . , not Ln.

with literals L0, . . . , Ln and 0 ≤ m ≤ n. The literal L0 is the head of r, denoted by H(r), and {L1, . . . , Lm, not Lm+1, . . . , not Ln} is the body of r, denoted by B(r). Furthermore, {L1, . . . , Lm} is denoted by B+(r) and {Lm+1, . . . , Ln} by B−(r). A rule r with B(r) = ∅ is called a fact, and if H(r) = ∅, rule r is called a constraint. A set of literals is inconsistent if it contains complementary literals. A set X of non-complementary literals is called an interpretation. An interpretation X is called a model of an ELP P if for every rule r ∈ P the following holds: H(r) ∈ X whenever B+(r) ⊆ X and B−(r) ∩ X = ∅.

Figure 1: Overview of the whole interactive update procedure: starting from (P1, P2), conflict detection on AS(P1 ⊳ P2) and interactive conflict resolution with the user yield the conflict-free union P̂1 ∪ P̂2.

In this paper, we will deal only with update sequences of length n = 2, because we focus on situations where a consistent ELP is available, and some new information has to be integrated in such a way that a consistent ELP results that represents the current knowledge. Furthermore, in medical environments like hospitals, new information is usually provided and implemented periodically rather than continuously, and after the update, a new consistent view is expected that will be the base for the next update. Therefore, in the following we will briefly describe the translation explicitly for an update sequence of length n = 2.
Given an update sequence P = (P1, P2) over A, the set A∗ is the extension of A by pairwise distinct atoms rej(r) and Ai, for each rule r occurring in P, each atom A ∈ A, and each i ∈ {1, 2}. The literal created by replacing the atomic formula A of a literal L by Ai will be denoted by Li.
The reduct P^X of a program P relative to a set X of literals is defined by
Definition 2 (Update program (Eiter et al. 2002)). Given
an update sequence P = (P1 , P2 ) over a set of atoms A,
the update program P⊳ = P1 ⊳ P2 over A∗ consists of the
following items:
(i) all constraints occurring in P1 , P2 ;
(ii-a) for each r ∈ P1 :
L1 ←B(r), not rej(r).
(ii-b) for each r ∈ P2 :
L2 ←B(r).
if H(r) = L;
(iii) for each r ∈ P1 :
P X = {H(r) ← B + (r). | r ∈ P, B − (r) ∩ X = ∅}.
rej(r) ←B(r), L2 .
An answer set of an ELP P is an interpretation X which
is a ⊆-minimal model of P X (Gelfond and Lifschitz 1991).
The set of all answer sets of a program P will be denoted by
AS(P ), and P is called consistent iff AS(PS) 6= ∅. We say
a literal L is derivable in an ELP P iff L ∈ AS(P ).
3
if H(r) = L;
if H(r) = L;
(iv) for each literal L occurring in P:
L1 ←L2 .;
L ←L1 ..
Note that transformations of type (ii-a) are only applied
to P1 because there is no P3 . Consequently, the rules of P2
do not need to be modified (see (ii-b)).
The answer set of an update sequence P is the projection
of the answer set of the update program P⊳ onto the set of
atoms A.
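The translation of Definition 2 can be sketched in Python as follows. The rule representation and atom naming are our own (not from the paper), with strong negation encoded by a leading '-':

```python
from typing import NamedTuple, Tuple

class Rule(NamedTuple):
    head: str              # a literal such as "a" or "-a"; "" for a constraint
    pos: Tuple[str, ...]   # B+(r), the positive body literals
    neg: Tuple[str, ...]   # B-(r), the default-negated body literals

def annotate(lit: str, i: int) -> str:
    """Replace the atom A of a literal L by A_i (L -> L_i)."""
    return "-" + lit[1:] + str(i) if lit.startswith("-") else lit + str(i)

def complement(lit: str) -> str:
    """Map L to its complementary literal."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def update_program(p1, p2):
    """Sketch of Definition 2: build the update program of (P1, P2) over A*."""
    out = []
    # (i) constraints are taken over unchanged
    out += [r for r in p1 + p2 if not r.head]
    # (ii-a) rules of P1 become rejectable via a fresh rej-atom
    for k, r in enumerate(p1):
        if r.head:
            out.append(Rule(annotate(r.head, 1), r.pos,
                            r.neg + (f"rej(p1_{k})",)))
    # (ii-b) rules of P2 are kept, with their heads annotated by 2
    for r in p2:
        if r.head:
            out.append(Rule(annotate(r.head, 2), r.pos, r.neg))
    # (iii) rej(r) fires when B(r) holds but P2 derives the complement of H(r)
    for k, r in enumerate(p1):
        if r.head:
            out.append(Rule(f"rej(p1_{k})",
                            r.pos + (annotate(complement(r.head), 2),), r.neg))
    # (iv) propagate annotated literals: L1 <- L2 and L <- L1
    lits = set()
    for r in p1 + p2:
        lits.update([r.head] if r.head else [])
        lits.update(r.pos)
        lits.update(r.neg)
    for l in sorted(lits):
        out.append(Rule(annotate(l, 1), (annotate(l, 2),), ()))
        out.append(Rule(l, (annotate(l, 1),), ()))
    return out
```

The update answer sets of P are then obtained by projecting each answer set of the translated program onto A, i.e., S = S′ ∩ A.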
3 Update by causal rejection
Our approach to detect conflicts is based on the update of
extended answer set programs by causal rejection (Eiter et
al. 2002). In that approach the extended logic programs are
given in a sequence (P1 , . . . , Pn ) of ELPs, where each Pi
updates the information encoded in (P1 , . . . , Pi−1 ).
Definition 1 (Update sequence (Eiter et al. 2002)). An update sequence P = (P1 , . . . , Pn ) is a series of consistent
ELPs over A, where A is the set of all atoms occurring in
P1 , . . . , Pn and where Pi contains newer information than
Pi−1 .
Definition 3 (Update Answer Set (Eiter et al. 2002)). Let
P = (P1 , P2 ) be an update sequence over a set of atoms A.
Then S ⊆ LitA is an update answer set of P iff S = S ′ ∩ A
for some answer set S ′ of P⊳ .
To illustrate the update mechanism we present an example
which represents a possible scenario in a medical setting.
Example 1. Consider the following extended logic program
P1 :
An update sequence P = (P1 , . . . , Pn ) is translated into
a single program P⊳ which encodes the information of P
and whose answer sets represent the answer sets of P. Informally, the translated program P⊳ merges the information
of the programs in P, but in case of conflicting rules, P⊳ rejects the rule of the program with the older information by
deriving the corresponding rej-atom.
r1 : tnbc met.
r2 : pdl pos.
r3 : treat.
r4 : mono th ← treat, tnbc met, not visc crisis.
From the medical experts' point of view, not only the resulting updated and consistent program is important: the information about explicit rule rejections is crucial for them
and has to be analyzed further. Generally, it is important
to know which rules were rejected and especially why they
were rejected. To be precise, from a medical expert's point
of view the following questions are relevant regarding rule
rejections:
r5 : ¬mono th ← treat, tnbc met, visc crisis.
r6 : nab pt ← treat, tnbc met.
r7 : carbopl ← treat, tnbc met, visc crisis.
r8 : low success ← treat, tnbc met.
Let P1 encode the following scenario: A patient (let us
call her Agent A) has metastasized triple-negative breast
cancer (r1 ), and one of the tests showed that the tumor
is also PD-L1-positive (r2 ). The patient is being treated
at a cancer clinic (r3 ). According to the clinic's guidelines, Agent A should be treated with a drug called nab-paclitaxel (r6 ). Usually, treatment with a single drug
(monochemotherapy) is indicated (r4 ). But if Agent A's cancer is rapidly progressing due to severe organ dysfunction
(visceral crisis), a more aggressive approach can be chosen
by additionally treating with carboplatin (r7 ). The use of
multiple drugs in a chemotherapy is called polychemotherapy, which in this scenario can be interpreted as the opposite, i.e., the negation, of a monochemotherapy (r5 ). Generally, though, the treatment of metastasized tnbc is known
to have a low success rate, and the chemotherapy is a palliative treatment (r8 ). However, recent studies show
(cf. (Schmid et al. 2018; Schneeweiss et al. 2019)) that for
PD-L1-positive patients, who are not in a visceral crisis, the
treatment with an additional immunotherapy consisting of
the PD-L1-inhibitor atezolizumab is advisable (r9 ). Such a
combination therapy with atezolizumab and nab-paclitaxel
(r10 ) can prolong the life of a tnbc patient significantly (r11 ).
This is encoded in following program P2 :
r9 : atzmab ← treat, tnbc met, pdl pos, not visc crisis.
(Q1) Which rule r in P1 was rejected?
(Q2) Which rule r′ in P2 caused the rejection?
(Q3) What are the options to handle the rejection?
(Q3a) Remove r and/or r′ completely?
(Q3b) Modify the body of r? How?
(Q3c) Modify the body of r′ ? How?
(Q3d) Are there deeper causes of the rejection (e.g., which rule led to the applicability of r′ )?
But neither the update answer set S ′ nor the answer set S
shows how these conflicts have arisen. On the
basis of the update program P⊳ one is only able to see which
rules in P1 were rejected. In the following we describe
how we extend the update procedure to use it for an interactive update process.
4 Conflict Detection
With (Eiter et al. 2002) it is possible to compute information
about syntactical correlations between two programs. We
will use this as meta-information to detect conflicts. In this
paper we will define conflicts via conflicting rules.
r10 : ¬mono th ← treat, tnbc met, nab pt, atzmab.
Definition 4 (Conflicting Rules, Conflict). Let P be an
ELP and let LitP be the set of all literals derivable in P .
Two rules r, r′ ∈ P are conflicting if H(r) and H(r′ ) are
complementary and there exists an interpretation X ⊆ LitP
such that B(r) and B(r′ ) are true in X. A conflict is a pair
(r, r′ ) of rules such that r, r′ are conflicting.
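A syntactic approximation of Definition 4 can be sketched as follows, with rules represented as our own triples (head, B+, B−) and strong negation encoded by a leading '-'. The existence of a suitable interpretation is approximated by checking that the joint bodies are not directly contradictory; this is our simplification, not the paper's exact condition:

```python
def complement(lit: str) -> str:
    """Map a literal to its complementary literal."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def conflicting(r, rp) -> bool:
    """Approximate check for Definition 4: the heads must be complementary,
    and the joint bodies must be jointly satisfiable, i.e. the positive
    literals clash neither with each other nor with the default-negated ones."""
    if not r[0] or not rp[0] or complement(r[0]) != rp[0]:
        return False
    pos = set(r[1]) | set(rp[1])   # B+(r) united with B+(r')
    neg = set(r[2]) | set(rp[2])   # B-(r) united with B-(r')
    return all(complement(l) not in pos for l in pos) and not (pos & neg)
```

With this encoding, r4 and r10 of the running example come out as conflicting, while a rule whose body default-negates a body literal of r10 does not.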
r11 : ¬low success ← treat, tnbc met, atzmab, ¬mono th.
Which treatment would the clinic recommend to Agent A
now, and how can the new information P2 be integrated into
the prior knowledge P1 ?
Following Definition 2 we can generate the update
program P⊳ = P1 ⊳ P2 (cf. Appendix A.1), whose
only answer set is S ′ = {tnbc met1 , tnbc met,
pdl pos1 , pdl pos, treat1 , treat, nab pt1 , nab pt,
atzmab2 , atzmab1 , atzmab, ¬mono th2 , ¬mono th1 ,
¬mono th, ¬low success2 , ¬low success1 , ¬low success,
rej(r4 ), rej(r8 )}. Consequently, we get the update answer
set of P = (P1 , P2 ) by S = S ′ ∩ A = {tnbc met, pdl pos,
treat, nab pt, atzmab, ¬mono th, ¬low success}.
Note that requiring an ELP to be conflict-free is
stronger than requiring it to be consistent, as a conflict-free ELP
is consistent but not vice versa.
Instead of automating the update of programs, we will
compute suggestions for resolving these conflicts. To this end,
we extend the meta-information given in an update program
by modifying the update method in (Eiter et al. 2002). In
order to use the update program itself to control the update
process, we add the possibility to recognize the immediate
cause of a rejection. Therefore, in addition to the rej-atoms,
we introduce rej cause- and active-atoms to enable
the backtracking of rejections. To realize these modifications, our generated update program P◭ will be over a set
of atoms A∗∗ which is the extension of A∗ by pairwise distinct atoms rej cause(r, r′ ) and active(r), for each rule r and
for each pair of rules r ≠ r′ occurring in P1 , P2 .
With ¬low success ∈ S one can see that the therapy's expected success is updated correctly. Indeed, the new information in P2 is crucial for helping Agent A effectively. By the
rules
low success1 ←treat, tnbc met, not rej(r8 ).,
rej(r8 ) ←treat, tnbc met, ¬low success2 .
in P⊳ it is guaranteed that the update program does not
conclude low success if newer information suggests otherwise. Literal rej(r8 ) being in answer set S ′ shows that r8
cannot hold, which consequently prevents a conflict. Similarly, the information that the suggested therapy should be a
monotherapy is rejected via the rej(r4 )-literal.
Definition 5 (Modified update program). Given an update
sequence P = (P1 , P2 ) over A, the modified update program
(MUP) P◭ = P1 ◭ P2 over A∗∗ consists of the following
rules:
(m-i) all constraints occurring in P1 , P2 ;
(m-ii-a) for each r ∈ P1 :
L1 ← B(r), not rej(r).
if H(r) = L;
(m-ii-b) for each r′ ∈ P2 :
L2 ← B(r′ ).
if H(r′ ) = L;
active(r′ ) ← B(r′ ).;
(m-iii) for each r ∈ P1 , if there exists a rule r′ ∈ P2 such
that H(r), H(r′ ) are complementary:
rej cause(r, r′ ) ← B(r), active(r′ ).;
rej(r) ← rej cause(r, r′ ).;
(m-iv) for each literal L occurring in P:
L1 ← L2 .;
L ← L1 ..
Corollary 1. Let P = (P1 , P2 ) be an update sequence over
a set of atoms A. Then for the set AS(P⊳ ) of answer sets of
the update program and the set AS(P◭ ) of answer sets of
the modified update program, the following holds:
AS(P⊳ ) = {S | S = S + ∩ A∗ , S + ∈ AS(P◭ )}
Example 2. The update program P◭ of the modified approach can be found in Appendix A.2. The only answer set of
P◭ = P1 ◭ P2 is T ′ = {tnbc met1 , tnbc met, pdl pos1 ,
pdl pos, treat1 , treat, nab pt1 , nab pt, atzmab2 ,
atzmab1 , atzmab, active(r9 ), ¬mono th2 , ¬mono th1 ,
¬mono th, active(r10 ), ¬low success2 , ¬low success1 ,
¬low success, active(r11 ), rej cause(r4 , r10 ), rej(r4 ),
rej cause(r8 , r11 ), rej(r8 )}. The update answer set
of the update sequence P = (P1 , P2 ) with the modified approach is T = T ′ ∩ A = {tnbc met, pdl pos, treat,
nab pt, atzmab, ¬mono th, ¬low success}.
We can see that the update answer set S in Example 1
and the answer set T in Example 2 are identical. However, the
answer sets of a modified update program P◭ (e.g., answer
set T ′ in Example 2) enable us to analyze the rejections and
their causes. With the modified approach we can detect the
immediate causes of rejections.
We can show that our modifications to the original approach in (Eiter et al. 2002) only add meta-information in
the form of custom literals to each answer set of the update program, without changing the (intended) update answer sets
themselves.
Proposition 1. Let P = (P1 , P2 ) be an update sequence
over a set of atoms A. Then for every answer set
S ∈ AS(P⊳ ) there exists a corresponding answer set
S + ∈ AS(P◭ ) such that S = S + ∩ A∗ , meaning S + is
a composition of all literals in S and possibly additional
active- and rej cause-literals. Conversely, for each answer set
S + ∈ AS(P◭ ), S = S + ∩ A∗ is an answer set of P⊳ .
Proof. Let P = (P1 , P2 ) be an update sequence over a set
of atoms A.
Let r be a rule in P2 and H(r) = L. Then for each answer
set S ∈ AS(P⊳ ) we have: L2 ∈ S through r iff B(r) is true
in S. Likewise, for every answer set S + ∈ AS(P◭ ) with
S = S + ∩ A∗ we have: L2 ∈ S through r iff B(r) is true
in S iff active(r) ∈ S + . This shows that the conditions
for L2 to be derived via r on the basis of A∗ are identical.
Now, let r be a rule in P1 and H(r) = L.
Then for each answer set S ∈ AS(P⊳ ) we have: L1 ∈ S
through r iff B(r) is true in S and rej(r) ∉ S. For literal
rej(r), the following holds: rej(r) ∈ S iff B(r) is true in S
and ¬L2 ∈ S. Consequently, we have: rej(r) ∈ S iff B(r) is
true in S and there exists a rule r′ ∈ P2 such that H(r′ ) = ¬L
and B(r′ ) is true in S. Altogether, we have L1 ∈ S through
r iff B(r) is true in S, and there is no rule r′ ∈ P2 such that
H(r′ ) = ¬L and B(r′ ) is true in S.
Likewise, for each answer set S + ∈ AS(P◭ ) with
S = S + ∩ A∗ we have: L1 ∈ S through r iff B(r) is true
in S and rej(r) ∉ S. For literal rej(r), the following
holds: rej(r) ∈ S through r iff there exists a rule r′ in P2
such that H(r′ ) = ¬L and rej cause(r, r′ ) ∈ S + . For literal
rej cause(r, r′ ) we have: rej cause(r, r′ ) ∈ S + iff B(r)
is true in S and active(r′ ) ∈ S + . Furthermore, we have
active(r′ ) ∈ S + if B(r′ ) is true in S. Hence, the following
holds: rej(r) ∈ S iff B(r) is true in S, and there exists a
rule r′ in P2 such that H(r′ ) = ¬L and B(r′ ) is true in S.
Consequently, we have: L1 ∈ S through r iff B(r) is true
in S, and there is no rule r′ in P2 such that H(r′ ) = ¬L and
B(r′ ) is true in S.
Again, both update procedures employ equivalent strategies to derive literals from A∗ in their answer sets. Due to these considerations, it is clear that
for each answer set S ∈ AS(P⊳ ) there is an answer set
S + ∈ AS(P◭ ) with S = S + ∩ A∗ and, conversely, for each
S + ∈ AS(P◭ ), S = S + ∩ A∗ is an answer set of P⊳ .
In the previous example, the rules
¬low success2 ←treat, tnbc met, atzmab, ¬mono th.,
active(r11 ) ←treat, tnbc met, atzmab, ¬mono th.,
rej cause(r8 , r11 ) ←active(r11 ).
make sure that the answer set of P◭ contains the literal
rej cause(r8 , r11 ). This tells us that the knowledge about
the recommended therapy having a low success rate is ignored due to the statement given in rule r11 ∈ P2 . The other
literal rej cause(r4 , r10 ) ∈ T ′ indicates that rule r4
is also rejected.
The conflict detection therefore provides a way to locate
each conflict between two programs of an update sequence
P by using the corresponding MUP P◭ , as each literal
rej cause(r, r′ ) in an answer set of P◭ represents a conflict (r, r′ ). After the determination of all conflicts in P we
can now look at how to generate suggestions for resolving
each conflict.
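Assuming answer sets are given as sets of strings and rej cause-literals are rendered as rej_cause(r,r') (our own rendering), collecting the conflicts reported by a MUP answer set is a matter of pattern matching:

```python
import re

def conflicts_from_answer_set(answer_set):
    """Extract the conflict pairs (r, r') encoded by rej_cause-literals."""
    pat = re.compile(r"rej_cause\(([^,()]+),\s*([^,()]+)\)")
    found = []
    for lit in answer_set:
        m = pat.fullmatch(lit)
        if m:
            found.append((m.group(1), m.group(2)))
    return sorted(found)
```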
Figure 2: Interactive conflict resolution (suggestions for conflict c are generated; conflict c and the suggestions are displayed to the user; the user selects a suggestion; the rule is modified)

Proposition 2. Let r, r′ be two conflicting rules, let
C = B(r′ ) − B(r), and let P ot(C) = {C ′ | C ′ ⊆ C} be the powerset of C.
Furthermore, let r̂ be a possible modification of r with
r̂ : H(r) ← B(r), C ′not .
where C ′ ∈ P ot(C) and C ′not = { not c | c ∈ C ′ }, with
not not c ≡ c. Then for every non-empty set C ′ ∈ P ot(C)
the rules r̂, r′ are non-conflicting.
5 Interactive Conflict Resolution
As mentioned above, instead of an automated, rule-based
update of ELPs according to (Eiter et al. 2002), we propose
an interactive conflict resolution approach. It uses the meta-information given in the MUP P◭ of an update sequence
P = (P1 , P2 ) to recognize the conflicts which the two programs
P1 , P2 cause. The goal is to gradually modify P1 , P2 such
that the resulting programs P̂1 and P̂2 do not contain conflicting rules. Figure 2 shows the components of the interactive part of the update process. For every conflict, suitable
suggestions are generated based on P◭ . These suggestions
are modifications of the original rules involved in the conflict. The rules involved in the conflict are shown to
the user, and solutions in the form of possible rule modifications are suggested. The user can choose the most suitable
modification, which is then applied to the corresponding
modified logic program (e.g., P̂1 ). This interaction can be
done for each conflict. The original programs of the update
sequence are thereby successively modified in such a way that
the update can be realized by simply uniting the two modified programs without creating conflicts.
In the previous section we showed how to detect conflicts in an update sequence P = (P1 , P2 ). To detect the
conflicts, we have to look at every answer set S + of the
MUP P◭ = P1 ◭ P2 . Each rej cause(r, r′ ) ∈ S + represents a conflict. In the proof of the modified approach
it is shown that rej cause(r, r′ ) ∈ S + iff the following
holds: r ∈ P1 , r′ ∈ P2 , and B(r), B(r′ ) are true in S + . This
means that to resolve a conflict it is necessary to manipulate the
rules r and r′ such that the modified rules are no longer conflicting
and hence can replace the rules r, r′ . As one can
see, there can be a large number of possibilities to resolve
a single conflict, mainly the addition, removal, or modification of rules. In the remainder of this paper we will focus
our attention on the most difficult case and generate suggestions for the modification of rules. We adhere
to the following principle: a conflict between two rules
r, r′ is resolved by only modifying r (as it stems from the
program with the older knowledge), where B(r) is modified using literals which occur in B(r), B(r′ ). The actual
absence of conflicts will be ensured by following a principle which is also exploited in (Eiter et al. 2002). In step
(ii-a) of Definition 2 the original rule r ∈ P1 is extended by
not rej(r), which prevents that r and a rule r′ ∈ P2 hold
simultaneously. In our approach, a suggestion for a conflict (r, r′ ) consists of an alternative rule r̂ for r. Similar
to the extension of a rule in step (ii-a), we extend r to
r̂ by adding body literals which ensure that r̂ and r′ are not
conflicting; the added literals are determined by comparing B(r) and
B(r′ ).
Example 3. For the conflict between r4 and r10 we get
C = {nab pt, atzmab}. Consequently, the following
modifications are possible:
r̂4 : mono th ←treat, tnbc met, not visc crisis,
not nab pt.
r̂4′ : mono th ←treat, tnbc met, not visc crisis,
not atzmab.
r̂4′′ : mono th ←treat, tnbc met, not visc crisis,
not nab pt, not atzmab.
For the conflict between r8 and r11 we get
C = {atzmab, ¬mono th}. This leads to the following possible modifications:
r̂8 : low success ←treat, tnbc met, not atzmab.
r̂8′ : low success ←treat, tnbc met, not ¬mono th.
r̂8′′ : low success ←treat, tnbc met, not atzmab,
not ¬mono th.
Note that the suggestions r̂4′′ and r̂8′′ contain all literals of their respective set C. If the human expert
is not able to choose a suggestion for a conflict, this
type of suggestion can be chosen by default. For
each conflict (r, r′ ) the fallback solution would then be
the modification of r with r̂ : H(r) ← B(r), Cnot ., where
Cnot = { not c | c ∈ (B(r′ ) − B(r))}.
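Proposition 2's suggestion scheme, including the fallback that negates all of C, can be sketched as follows. For simplicity we restrict C to the positive body literals B+(r′) − B+(r) (the "not not c ≡ c" case for default-negated literals is left out), rules are our own triples (head, B+, B−), and strong negation is written as a leading '-':

```python
from itertools import chain, combinations

def suggestions(r, rp):
    """Generate the candidate modifications of r for a conflict (r, r'):
    one per non-empty subset C' of C = B+(r') - B+(r), obtained by adding
    the default negation of C' to the body of r."""
    head, pos, neg = r
    c = [l for l in rp[1] if l not in pos]
    subsets = chain.from_iterable(
        combinations(c, n) for n in range(1, len(c) + 1))
    return [(head, pos, tuple(neg) + cp) for cp in subsets]

def fallback(r, rp):
    """The default suggestion: negate the whole difference C."""
    return suggestions(r, rp)[-1]
```

For the conflict (r8, r11) of the running example this yields three candidates, mirroring r̂8, r̂8′, and r̂8′′.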
After resolving all detected conflicts we get two programs
P̂1 , P̂2 whose union results in a conflict-free ELP.
Example 4. To resolve conflict (r4 , r10 ), the medical expert
would choose r̂4′ , as besides a visceral crisis the drug atezolizumab is primarily relevant for the decision whether the
patient should get a monotherapy or not. Likewise, for conflict
(r8 , r11 ) the expert would choose r̂8′ , as the low success rate
of the longer-known monotherapy continues to apply.
One resulting pair of modified programs would then be
P̂2 = P2 and P̂1 :
r1 : tnbc met.
r2 : pdl pos.
r3 : treat.
r4 : mono th ← treat, tnbc met, not visc crisis,
not atzmab.
r5 : ¬mono th ← treat, tnbc met, visc crisis.
r6 : nab pt ← treat, tnbc met.
r7 : carbopl ← treat, tnbc met, visc crisis.
r8 : low success ← treat, tnbc met, not ¬mono th.
6 Related Work
In the context of logic programs, the connection of different knowledge bases, rather than sequences of programs, can often be found in multi-agent systems. In (Vos et al. 2005b)
a multi-agent architecture is presented which allows deductive reasoning via Ordered Choice Logic Programs (OCLP).
OCLP is an extension of ASP, which allows choice rules and
a preferential order over rule sets. Each agent is encoded
as an OCLP and can communicate with other agents. Compared to the approach in this paper, the knowledge update in
(Vos et al. 2005b) is realized by the exchange of information
between the agents. An agent’s knowledge can be updated
by the incoming information. Although an extension of ASP
is used, explicit negation is not allowed, and therefore contradictions are not directly possible. But each agent has specific goals in the form of rules and facts. Incoming information
is only incorporated into an agent's knowledge if it does not contradict the agent's goals. The authors
mention that negation and contradictory information could
be implemented and handled, amongst others, by removing
knowledge or adding a notion of trust between agents (Vos
et al. 2005a). The implementation in (Vos et al. 2005b) allows a human agent. Similar to our approach, the human
agent can control the agents' updates if needed. But unlike our approach, the multi-agent platform is designed to
run mostly autonomously while the updating of each agent’s
knowledge is mainly done in an automated manner using the
preferential order and choice rules provided in OCLPs.
Given the update sequence P = (P1 , P2 ) the update of
P1 with P2 is therefore the conflict-free program P̂ with
P̂ = P̂1 ∪ P̂2 .
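Conflict-freeness of the final program P̂ = P̂1 ∪ P̂2 can then be checked mechanically; a sketch under our triple representation (head, positive body, default-negated body), with strong negation written as a leading '-' and a syntactic approximation of Definition 4's body condition:

```python
from itertools import combinations

def complement(lit: str) -> str:
    """Map a literal to its complementary literal."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def conflict_free(program) -> bool:
    """True iff no pair of rules has complementary heads together with
    jointly satisfiable bodies (syntactic approximation)."""
    for r, rp in combinations(program, 2):
        if r[0] and rp[0] and complement(r[0]) == rp[0]:
            pos = set(r[1]) | set(rp[1])
            neg = set(r[2]) | set(rp[2])
            if all(complement(l) not in pos for l in pos) and not (pos & neg):
                return False
    return True
```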
As one can see, the strategy in Proposition 2 can potentially lead to multiple suggestions per conflict. In these cases
the human expert, who is informed about each conflict, has
to actively step in and choose the suggestion which is, according to the expert's knowledge, the most suitable one. On
the one hand, this ensures full transparency of the update
sequence to the expert regarding the modifications. On the
other hand, the approach creates an updated program whose
professional suitability is guaranteed by the expert. In this
context we say a program is professionally suitable if the
represented knowledge is suitable according to the experts.
Proposition 3. Let P = (P1 , P2 ) be an update sequence.
The interactive conflict resolution modifies the rules in
P1 , P2 such that the union P̂ = P̂1 ∪ P̂2 of their corresponding modified programs P̂1 , P̂2 is conflict-free and professionally suitable.
7 Conclusion and Future Work
In this paper we presented a method to update ELPs of an
update sequence interactively with an expert. We used the
approach in (Eiter et al. 2002) to find all conflicts between
the programs. To resolve each conflict we defined a strategy
to generate possible modifications of conflicting rules which
resolve the conflict. To ensure professional suitability, for
each conflict the expert can choose the suggestion which is
the most suitable. This procedure leads to the successive
modification of the programs given in the update sequence,
resulting in modified ELPs which do not have conflicting
rules and can therefore be updated by simply uniting the programs. During the interactive conflict resolution the expert
maintains full control over each modification.
Revisiting the questions relevant to the medical experts
in Section 3, the modified approach delivers the following improvements: Questions (Q1) and (Q2) can be answered
by presenting the information of the rej cause-literals in
AS(P◭ ). Question (Q3a) should be answered and
acted upon solely by the expert. In Section 5 we delivered answers for questions (Q3b) and (Q3c). Regarding question
(Q3d), further research is conceivable. The presented approach only generates modification suggestions for rules
directly involved in a conflict. This purely syntactical procedure can be limiting, considering that, due to the active involvement of the expert, the expert's knowledge is already
available. It is possible that the actual conflict lies in other
rules which are not part of the conflict pair itself. One possible solution to find deeper causes for a conflict could be the
active inclusion of the expert. This enables the search for
the cause of a conflict in all rules of the update sequence on
a professional level. One can also consider implementing the
search for rules involved in a conflict syntactically, together with the
computation of matching resolutions.
One can also consider various semantic extensions, like the support of default negation in rule heads.
(Slota, Baláz, and Leite 2014) point out that the consideration of both types of negation leads to more fine-grained
control over each atom. Especially in medical scenarios, a
distinction between positive and negative test results (strong
negation) and the absence of symptoms (default negation) is
important. By allowing default negation in rule heads, rules
can be defined more precisely, and the general conflict potential when updating a program could be mitigated.
Another useful improvement would be the extension of
the conflict detection. Currently, the detection of conflicts
depends on the facts given in the programs of an update
sequence. Looking at the running example of the paper, one
can imagine a scenario where a patient with different patient
data is given. This results in a different set of facts, and the
update then has to be executed specifically for this patient.
Detecting conflicts independently of the programs' facts
would save the time and effort of computing the update for
each patient.
Furthermore, as mentioned in the introduction, the
method to detect conflicts can be switched out. This also
applies to the method of generating possible rule modifications. Possible implementation approaches could be the
manual conflict resolution by the expert, the complete removal of rules with older knowledge, or the generation of modifications of rules with newer knowledge.
For larger programs one can consider improving the computation of answer sets, and implicitly the detection of conflicts, by using approaches like multi-shot ASP solving developed by (Gebser et al. 2019). This approach enables more
control when grounding and solving an ELP, which would
lead to faster and more efficient interaction processes.
atzmab2 ←treat, tnbc met, pdl pos, not visc crisis.
¬mono th2 ←treat, tnbc met, nab pt, atzmab.
¬low success2 ←treat, tnbc met, atzmab, ¬mono th.
rej(r1 ) ←¬tnbc met2 .
rej(r2 ) ←¬pdl pos2 .
rej(r3 ) ←¬treat2 .
rej(r4 ) ←treat, tnbc met, not visc crisis,
¬mono th2 .
rej(r5 ) ←treat, tnbc met, visc crisis, mono th2 .
rej(r6 ) ←treat, tnbc met, ¬nab pt2 .
rej(r7 ) ←treat, tnbc met, visc crisis, ¬carbopl2 .
rej(r8 ) ←treat, tnbc met, ¬low success2 .
tnbc met1 ←tnbc met2 .
tnbc met ←tnbc met1 .
pdl pos1 ←pdl pos2 .
pdl pos ←pdl pos1 .
treat1 ←treat2 .
treat ←treat1 .
mono th1 ←mono th2 .
mono th ←mono th1 .
¬mono th1 ←¬mono th2 .
¬mono th ←¬mono th1 .
visc crisis1 ←visc crisis2 .
visc crisis ←visc crisis1 .
nab pt1 ←nab pt2 .
nab pt ←nab pt1 .
carbopl1 ←carbopl2 .
carbopl ←carbopl1 .
atzmab1 ←atzmab2 .
atzmab ←atzmab1 .
low success1 ←low success2 .
low success ←low success1 .
Acknowledgements
We would like to thank the anonymous reviewers for their
helpful suggestions and comments.
A Update Programs of Example
A.1 Update program P⊳
tnbc met1 ← not rej(r1 ).
pdl pos1 ← not rej(r2 ).
treat1 ← not rej(r3 ).
mono th1 ←treat, tnbc met, not visc crisis,
not rej(r4 ).
¬low success1 ←¬low success2 .
¬low success ←¬low success1 .
¬mono th1 ←treat, tnbc met, visc crisis,
not rej(r5 ).
nab pt1 ←treat, tnbc met, not rej(r6 ).
carbopl1 ←treat, tnbc met, visc crisis, not rej(r7 ).
low success1 ←treat, tnbc met, not rej(r8 ).
A.2 Modified Update program P◭
tnbc met1 ← not rej(r1 ).
pdl pos1 ← not rej(r2 ).
treat1 ← not rej(r3 ).
mono th1 ←treat, tnbc met, not visc crisis,
not rej(r4 ).
¬mono th1 ←treat, tnbc met, visc crisis,
not rej(r5 ).
nab pt1 ←treat, tnbc met, not rej(r6 ).
carbopl1 ←treat, tnbc met, visc crisis,
not rej(r7 ).
low success1 ←treat, tnbc met, not rej(r8 ).
References
Eiter, T.; Fink, M.; Sabbatini, G.; and Tompits, H. 2002.
On properties of update sequences based on causal rejection.
Theory Pract. Log. Program. 2(6):711–767.
Eiter, T.; Ianni, G.; Schindlauer, R.; and Tompits, H. 2005.
A uniform integration of higher-order reasoning and external evaluations in answer-set programming. In Proceedings
of the 19th International Joint Conference on Artificial Intelligence, IJCAI’05. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. 90–96.
Gebser, M.; Kaminski, R.; Kaufmann, B.; and Schaub, T.
2019. Multi-shot ASP solving with clingo. Theory Pract.
Log. Program. 19(1):27–82.
Gelfond, M., and Lifschitz, V. 1991. Classical negation
in logic programs and disjunctive databases. New Gener.
Comput. 9(3/4):365–386.
Schmid, P.; Adams, S.; Rugo, H. S.; Schneeweiss, A.;
Barrios, C. H.; Iwata, H.; Diéras, V.; Hegg, R.; Im, S.A.; Shaw Wright, G.; Henschel, V.; Molinero, L.; Chui,
S. Y.; Funke, R.; Husain, A.; Winer, E. P.; Loi, S.; and
Emens, L. A. 2018. Atezolizumab and nab-paclitaxel in
advanced triple-negative breast cancer. New England Journal of Medicine 379(22):2108–2121.
Schneeweiss, A.; Denkert, C.; Fasching, P. A.; Fremd, C.;
Gluz, O.; Kolberg-Liedtke, C.; Loibl, S.; and Lück, H.J. 2019. Diagnosis and therapy of triple-negative breast
cancer (tnbc)–recommendations for daily routine practice.
Geburtshilfe und Frauenheilkunde 79(06):605–617.
Slota, M.; Baláz, M.; and Leite, J. 2014. On strong and default negation in logic program updates (extended version).
CoRR abs/1404.6784.
Thevapalan, A.; Kern-Isberner, G.; Howey, D.; Beierle, C.;
Meyer, R. G.; and Nietzke, M. 2018. Decision support core
system for cancer therapies using ASP-HEX. In Brawner,
K., and Rus, V., eds., Proceedings of the Thirty-First International Florida Artificial Intelligence Research Society
Conference, FLAIRS 2018, Melbourne, Florida, USA. May
21-23 2018, 531–536. AAAI Press.
Vos, M. D.; Cliffe, O.; Watson, R.; Crick, T.; Padget,
J. A.; and Needham, J. 2005a. T-LAIMA: answer set programming for modelling agents with trust. In Gleizes, M.;
Kaminka, G. A.; Nowé, A.; Ossowski, S.; Tuyls, K.; and
Verbeeck, K., eds., EUMAS 2005 - Proceedings of the Third
European Workshop on Multi-Agent Systems, Brussels, Belgium, December 7-8, 2005, 126–136. Koninklijke Vlaamse
Academie van België voor Wetenschappen en Kunsten.
Vos, M. D.; Crick, T.; Padget, J. A.; Brain, M.; Cliffe, O.;
and Needham, J. 2005b. LAIMA: A multi-agent platform
using ordered choice logic programming. In Baldoni, M.;
Endriss, U.; Omicini, A.; and Torroni, P., eds., Declarative
Agent Languages and Technologies III, Third International
Workshop, DALT 2005, Utrecht, The Netherlands, July 25,
2005, Selected and Revised Papers, volume 3904 of Lecture
Notes in Computer Science, 72–88. Springer.
atzmab2 ←treat, tnbc met, pdl pos,
not visc crisis.
active(r9 ) ←treat, tnbc met, pdl pos,
not visc crisis.
¬mono th2 ←treat, tnbc met, nab pt, atzmab.
active(r10 ) ←treat, tnbc met, nab pt, atzmab.
¬low success2 ←treat, tnbc met, atzmab, ¬mono th.
active(r11 ) ←treat, tnbc met, atzmab, ¬mono th.
rej cause(r4 , r10 ) ←active(r10 ).
rej(r4 ) ←rej cause(r4 , r10 ).
rej cause(r8 , r11 ) ←active(r11 ).
rej(r8 ) ←rej cause(r8 , r11 ).
tnbc met1 ←tnbc met2 .
tnbc met ←tnbc met1 .
pdl pos1 ←pdl pos2 .
pdl pos ←pdl pos1 .
treat1 ←treat2 .
treat ←treat1 .
mono th1 ←mono th2 .
mono th ←mono th1 .
¬mono th1 ←¬mono th2 .
¬mono th ←¬mono th1 .
visc crisis1 ←visc crisis2 .
visc crisis ←visc crisis1 .
nab pt1 ←nab pt2 .
nab pt ←nab pt1 .
carbopl1 ←carbopl2 .
carbopl ←carbopl1 .
atzmab1 ←atzmab2 .
atzmab ←atzmab1 .
low success1 ←low success2 .
low success ←low success1 .
¬low success1 ←¬low success2 .
¬low success ←¬low success1 .
Towards Conditional Inference under Disjunctive Rationality
Richard Booth1 , Ivan Varzinczak2,3
1 Cardiff University, United Kingdom
2 CRIL, Univ. Artois & CNRS, France
3 CAIR, Computer Science Division, Stellenbosch University, South Africa
boothr2@cardiff.ac.uk, varzinczak@cril.fr
Abstract

The question of conditional inference, i.e., of which conditional sentences of the form "if α then, normally, β" should follow from a set KB of such sentences, has been one of the classic questions of non-monotonic reasoning, with several well-known solutions proposed. Perhaps the most notable is the rational closure construction of Lehmann and Magidor, under which the set of inferred conditionals forms a rational consequence relation, i.e., satisfies all the rules of preferential reasoning, plus Rational Monotonicity. However, this last-named rule is not universally accepted, and other researchers have advocated working within the larger class of disjunctive consequence relations, which satisfy the weaker requirement of Disjunctive Rationality. While there are convincing arguments that the rational closure forms the "simplest" rational consequence relation extending a given set of conditionals, the question of what is the simplest disjunctive consequence relation has not been explored. In this paper, we propose a solution to this question and explore some of its properties.

1 Introduction

The question of conditional inference, i.e., of which conditional sentences of the form "if α then, normally, β" should follow from a set KB of such sentences, has been one of the classic questions of non-monotonic reasoning, with several well-known solutions proposed. Since the work of Lehmann and colleagues in the early '90s, the so-called preferential approach to defeasible reasoning has established itself as one of the most elegant frameworks within which to answer this question. Central to the preferential approach is the notion of rational closure of a conditional knowledge base, under which the set of inferred conditionals forms a rational consequence relation, i.e., satisfies all the rules of preferential reasoning, plus Rational Monotonicity. One of the reasons for accepting rational closure is the fact that it delivers a venturous notion of entailment that is nevertheless conservative enough. Given that, rationality has long been accepted as the core baseline for any appropriate form of non-monotonic entailment.

Very few have stood against this position. Among them is Makinson (1994), who considered Rational Monotonicity too strong and briefly advocated the weaker rule of Disjunctive Rationality instead. This rule is implied by Rational Monotonicity and may still be desirable in cases where the latter does not hold. Quite surprisingly, the debate did not catch on, and, for lack of rivals of the same stature, Rational Closure has since reigned alone as a role model in non-monotonic inference. That remained the case until Rott (2014) reignited interest in Disjunctive Rationality by considering interval models in connection with belief contraction. Inspired by that, here we revisit disjunctive consequence relations and take the first steps in the quest for a suitable notion of disjunctive rational closure of a conditional knowledge base.

The plan of the paper is as follows. First, in Section 2, we give the usual summary of the formal background assumed in the following sections, in particular of the rational closure construction. Then, in Section 3, we make a case for weakening the rationality requirement and propose a semantics, with an accompanying representation result, for a weaker form of rationality enforcing the rule of Disjunctive Rationality. We move on, and in Section 4, we investigate a notion of closure of (or entailment from) a conditional knowledge base under Disjunctive Rationality. Our analysis is in terms of a set of postulates, all reasonable at first glance, that one can expect a suitable notion of closure to satisfy. Following that, in Section 5, we propose a specific construction for the Disjunctive Rational Closure of a conditional knowledge base and assess its suitability in the light of the postulates previously put forward (Section 6). We conclude with some remarks on future directions of investigation.
2 Formal preliminaries
In this section, we provide the required formal background
for the remainder of the paper. In particular, we set up
the notation and conventions that shall be followed in the
upcoming sections. (The reader conversant with the KLM
framework for non-monotonic reasoning can safely skip to
Section 3.)
Let P be a finite set of propositional atoms. We use
p, q, . . . as meta-variables for atoms. Propositional sentences are denoted by α, β, . . ., and are recursively defined
in the usual way:
α ::= ⊤ | ⊥ | p | ¬α | α ∧ α | α ∨ α | α → α | α ↔ α, where p ∈ P
We use L to denote the set of all propositional sentences. With U := {0, 1}^P we denote the set of all propositional valuations, where 1 represents truth and 0 falsity. We use
v, u, . . ., possibly with primes, to denote valuations. Whenever it eases the presentation, we shall represent valuations
as sequences of atoms (e.g., p) and barred atoms (e.g., p̄),
with the understanding that the presence of a non-barred
atom indicates that the atom is true (has the value 1) in the
valuation, while the presence of a barred atom indicates that
the atom is false (has the value 0) in the valuation. Thus,
for the logic generated from P = {b, f, p}, where the atoms
stand for, respectively, “being a bird”, “being a flying creature”, and “being a penguin”, the valuation in which b is
true, f is false, and p is true will be represented as bf̄p.
Satisfaction of a sentence α ∈ L by a valuation v ∈ U is defined in the usual truth-functional way and is denoted by v ⊩ α. The set of models of a sentence α is defined as JαK := {v ∈ U | v ⊩ α}. This notion is extended to a set of sentences X in the usual way: JXK := ⋂_{α∈X} JαK. We say a set of sentences X (classically) entails α ∈ L, denoted X |= α, if JXK ⊆ JαK. α is valid, denoted |= α, if JαK = U.
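Since P is finite, these definitions are directly executable. As a small illustrative sketch (our encoding, not the authors'), the following Python code enumerates U = {0, 1}^P for a three-atom vocabulary, represents sentences as predicates on valuations, and checks classical entailment as JXK ⊆ JαK:

```python
from itertools import product

P = ["b", "f", "p"]  # atoms: bird, flies, penguin

# U = {0,1}^P: all valuations, as dicts from atoms to 0/1.
U = [dict(zip(P, bits)) for bits in product((0, 1), repeat=len(P))]

# A sentence is encoded as a function from valuations to bool.
def atom(name):     return lambda v: bool(v[name])
def neg(a):         return lambda v: not a(v)
def conj(a, c):     return lambda v: a(v) and c(v)
def impl(a, c):     return lambda v: (not a(v)) or c(v)

def models(alpha):
    """JαK = {v in U | v satisfies α}."""
    return [v for v in U if alpha(v)]

def entails(X, alpha):
    """X |= α  iff  JXK ⊆ JαK."""
    return all(alpha(v) for v in U if all(x(v) for x in X))

b, f, p = atom("b"), atom("f"), atom("p")
# "penguins are birds" and "penguins don't fly" classically entail
# "penguins are non-flying birds"
print(entails([impl(p, b), impl(p, neg(f))], impl(p, conj(b, neg(f)))))  # True
```

The same predicate encoding is reused below when valuations need to be ranked rather than merely enumerated.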
2.1 KLM-style rational defeasible consequence

Several approaches to non-monotonic reasoning have been proposed in the literature over the past 40 years. The preferential approach, initially put forward by Shoham (1988) and subsequently developed by Kraus et al. (1990) in much depth (the reason why it became known as the KLM approach), has established itself as one of the main references in the area. This stems from at least three of its features: (i) its intuitive semantics and elegant proof-theoretic characterisation; (ii) its generality w.r.t. alternative approaches to non-monotonic reasoning such as circumscription (McCarthy 1980), default logic (Reiter 1980), and many others; and (iii) its formal links with AGM-style belief revision (Gärdenfors and Makinson 1994). The fruitfulness of the preferential approach is also witnessed by the great deal of recent work extending it to languages that are more expressive than that of propositional logic, such as those of description logics (Bonatti 2019; Britz, Meyer, and Varzinczak 2011; Casini et al. 2015; Britz and Varzinczak 2017; Giordano et al. 2007; Giordano et al. 2015; Pensel and Turhan 2017; Varzinczak 2018), modal logics (Britz and Varzinczak 2018a; Britz and Varzinczak 2018b; Chafik et al. 2020), and others (Booth, Meyer, and Varzinczak 2012).

A defeasible consequence relation |∼ is defined as a binary relation on sentences of our underlying propositional logic, i.e., |∼ ⊆ L × L. We say that |∼ is a preferential consequence relation (Kraus, Lehmann, and Magidor 1990) if it satisfies the following set of (Gentzen-style) rules, each written premises / conclusion:

(Ref) α |∼ α
(LLE) |= α ↔ β, α |∼ γ / β |∼ γ
(And) α |∼ β, α |∼ γ / α |∼ β ∧ γ
(Or) α |∼ γ, β |∼ γ / α ∨ β |∼ γ
(RW) α |∼ β, |= β → γ / α |∼ γ
(CM) α |∼ β, α |∼ γ / α ∧ β |∼ γ

If, in addition to the preferential rules, the defeasible consequence relation |∼ also satisfies the following Rational Monotonicity rule (Lehmann and Magidor 1992), it is said to be a rational consequence relation:

(RM) α |∼ β, α ̸|∼ ¬γ / α ∧ γ |∼ β

Rational consequence relations can be given an intuitive semantics in terms of ranked interpretations.

Definition 1. A ranked interpretation R is a function from U to N ∪ {∞} such that R(v) = 0 for some v ∈ U, and satisfying the following convexity property: for every i ∈ N, if R(u) = i, then, for every j s.t. 0 ≤ j < i, there is a u′ ∈ U for which R(u′) = j.

In a ranked interpretation, we call R(v) the rank of v w.r.t. R. The intuition is that valuations with a lower rank are deemed more normal (or typical) than those with a higher rank, while those with an infinite rank are regarded as so atypical as to be 'forbidden', e.g. by some background knowledge (see below). Given a ranked interpretation R, we therefore partition the set U into the set of plausible valuations (those with finite rank) and that of implausible ones (with rank ∞).¹

¹In the literature, it is customary to omit implausible valuations from ranked interpretations. Since they are not logically impossible, but rather judged as irrelevant on the grounds of contingent information (e.g. a knowledge base) which is prone to change, we shall include them in our semantic definitions. This does not mean that we do anything special with them in this paper; they are rather kept for future use.

Figure 1 depicts an example of a ranked interpretation for P = {b, f, p}. (In our graphical representations of ranked interpretations, and of interval-based interpretations later on, we shall plot the set of valuations in U on the y-axis so that the preference relation reads more naturally across the x-axis, from lower to higher. Moreover, plausible valuations are associated with the colour blue, whereas the implausible ones with red.)

[Figure 1: A ranked interpretation for P = {b, f, p}: the eight valuations b̄f̄p, b̄fp, bfp, bf̄p, bf̄p̄, bfp̄, b̄fp̄, b̄f̄p̄ plotted against the ranks 0, 1, 2, ∞ (figure not reproduced).]

Given a ranked interpretation R and α ∈ L, with JαK^R we denote the set of plausible valuations satisfying α (α-valuations for short) in R. If JαK^R = J⊤K^R, then we say α is true in R and denote it R ⊩ α. With R(α) := min{R(v) |
v ∈ JαK^R} we denote the rank of α in R. By convention, if JαK^R = ∅, we let R(α) = ∞. Defeasible consequence of the form α |∼ β is then given a semantics in terms of ranked interpretations in the following way: we say α |∼ β is satisfied in R (denoted R ⊩ α |∼ β) if R(α) < R(α ∧ ¬β). (Here we adopt Jaeger's (1996) convention that ∞ < ∞ always holds.) It is easy to see that, for every α ∈ L, R ⊩ α if and only if R ⊩ ¬α |∼ ⊥. If R ⊩ α |∼ β, we say R is a ranked model of α |∼ β. In the example in Figure 1, we have R ⊩ b |∼ f, R ⊩ p → b (and therefore R ⊩ ¬(p → b) |∼ ⊥), R ⊩ p |∼ ¬f, R ⊮ f |∼ b, and R ⊩ p ∧ ¬b |∼ b, which all accord with the intuitive expectations.

That this semantic characterisation of rational defeasible consequence is appropriate is a consequence of a representation result linking the seven rationality rules above to precisely the class of ranked interpretations (Lehmann and Magidor 1992; Gärdenfors and Makinson 1994).

2.2 Rational closure

One can also view defeasible consequence as formalising some form of (defeasible) conditional and bring it down to the level of statements. Such was the stance adopted by Lehmann and Magidor (1992). A conditional knowledge base KB is thus a finite set of statements of the form α |∼ β, with α, β ∈ L, possibly containing classical statements. As an example, let KB = {b |∼ f, p → b, p |∼ ¬f}. Given a conditional knowledge base KB, a ranked model of KB is a ranked interpretation satisfying all statements in KB. As it turns out, the ranked interpretation in Figure 1 is a ranked model of the above KB. It is not hard to see that, in every ranked model of KB, the valuations b̄f̄p and b̄fp are deemed implausible; note, however, that they are still logically possible, which is the reason why they feature in all ranked interpretations.

An important reasoning task in this setting is that of determining which conditionals follow from a conditional knowledge base. Of course, even when interpreted as a conditional in (and under) a given knowledge base KB, |∼ is expected to adhere to the rules of Section 2.1. Intuitively, that means whenever appropriate instantiations of the premises in a rule are sanctioned by KB, so should be the suitable instantiation of its conclusion.

To be more precise, we can take the defeasible conditionals in KB as the core elements of a defeasible consequence relation |∼_KB. By closing the latter under the preferential rules (in the sense of exhaustively applying them), we get a preferential extension of |∼_KB. Since there can be more than one such extension, the most cautious approach consists in taking their intersection. The resulting set, which also happens to be closed under the preferential rules, is the preferential closure of |∼_KB, which we denote by |∼_PC^KB. When interpreted again as a conditional knowledge base, the preferential closure of |∼_KB contains all the conditionals entailed by KB. (Hence, the notions of closure of and entailment from a conditional knowledge base are two sides of the same coin.) The same process and definitions carry over when one requires the defeasible consequence relations also to be closed under the rule RM, in which case we talk of rational extensions of |∼_KB. Nevertheless, as pointed out by Lehmann and Magidor (1992, Section 4.2), the intersection of all such rational extensions is not, in general, a rational consequence relation: it coincides with the preferential closure and therefore may fail RM. Among other things, this means that the corresponding entailment relation, called rank entailment and defined as KB |=_R α |∼ β if every ranked model of KB also satisfies α |∼ β, is monotonic and therefore falls short of being a suitable form of entailment in a defeasible-reasoning setting. As a result, several alternative notions of entailment from conditional knowledge bases have been explored in the literature on non-monotonic reasoning (Booth and Paris 1998; Booth et al. 2019; Casini, Meyer, and Varzinczak 2019; Giordano et al. 2012; Giordano et al. 2015; Lehmann 1995; Weydert 2003), with rational closure (Lehmann and Magidor 1992) commonly acknowledged as the gold standard in the matter.

Rational closure (RC) is a form of inferential closure extending the notion of rank entailment above. It formalises the principle of presumption of typicality (Lehmann 1995, p. 63), which, informally, specifies that a situation (in our case, a valuation) should be assumed to be as typical as possible (w.r.t. background information in a knowledge base).

Assume an ordering ⪯_KB on all ranked models of a knowledge base KB, defined as follows: R1 ⪯_KB R2 if, for every v ∈ U, R1(v) ≤ R2(v). Intuitively, ranked models lower down in the ordering are more typical. It is easy to see that ⪯_KB is a weak partial order. Giordano et al. (2015) showed that there is a unique ⪯_KB-minimal element. The rational closure of KB is defined in terms of this minimum ranked model of KB.

Definition 2. Let KB be a conditional knowledge base, and let R_RC^KB be the minimum element of the ordering ⪯_KB on ranked models of KB. The rational closure of KB is the defeasible consequence relation |∼_RC^KB := {α |∼ β | R_RC^KB ⊩ α |∼ β}.

As an example, Figure 1 shows the minimum ranked model of KB = {b |∼ f, p → b, p |∼ ¬f} w.r.t. ⪯_KB. Hence we have that ¬f |∼ ¬b is in the rational closure of KB.

Observe that there are two levels of typicality at work in rational closure, namely within ranked models of KB, where valuations lower down are viewed as more typical, but also between ranked models of KB, where ranked models lower down in the ordering are viewed as more typical. The most typical ranked model R_RC^KB is the one in which valuations are as typical as KB allows them to be (the principle of presumption of typicality we alluded to above).

Rational closure is commonly viewed as the basic (although certainly not the only acceptable) form of non-monotonic entailment, on which other, more venturous forms of entailment can be and have been constructed (Booth et al. 2019; Casini et al. 2014; Casini, Meyer, and Varzinczak 2019; Lehmann 1995).
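For finitely many atoms, the minimum ranked model can be computed by the familiar iterative ranking procedure: valuations violating the classical statements are sent to ∞, and each remaining valuation receives the first level at which it satisfies the material counterpart of every not-yet-settled conditional (a conditional is settled once some already-ranked valuation satisfies its antecedent). The sketch below is our illustrative rendering of that procedure, not code from the paper; it applies it to the running example KB = {b |∼ f, p → b, p |∼ ¬f} and confirms that ¬f |∼ ¬b is in the rational closure:

```python
from itertools import product

P = ("b", "f", "p")
U = list(product((0, 1), repeat=len(P)))   # valuations as (b, f, p) bit tuples

def atom(name):
    i = P.index(name)
    return lambda v: v[i] == 1

b, f, p = atom("b"), atom("f"), atom("p")
NOT = lambda a: (lambda v: not a(v))

HARD = [lambda v: (not p(v)) or b(v)]      # classical statement p -> b
CONDS = [(b, f), (p, NOT(f))]              # conditionals b |~ f and p |~ ¬f

def minimal_ranked_model(U, hard, conds):
    """Iteratively assign ranks; None stands for rank ∞."""
    rank = {v: None for v in U if not all(h(v) for h in hard)}
    pending_vals = [v for v in U if v not in rank]
    pending_conds = list(conds)
    level = 0
    while pending_vals:
        # valuations satisfying the material version of every pending conditional
        layer = [v for v in pending_vals
                 if all((not a(v)) or c(v) for a, c in pending_conds)]
        if not layer:                       # the rest would be ranked ∞
            rank.update({v: None for v in pending_vals})
            break
        for v in layer:
            rank[v] = level
        pending_vals = [v for v in pending_vals if v not in layer]
        # settle conditionals whose antecedent rank is now fixed
        pending_conds = [(a, c) for a, c in pending_conds
                         if not any(a(v) for v in layer)]
        level += 1
    return rank

R = minimal_ranked_model(U, HARD, CONDS)

def rank_of(alpha):
    finite = [r for v, r in R.items() if r is not None and alpha(v)]
    return min(finite) if finite else float("inf")

def rc_satisfies(a, c):
    """R ⊨ a |~ c iff R(a) < R(a ∧ ¬c); the ∞ < ∞ convention is not
    needed for this finite example."""
    return rank_of(a) < rank_of(lambda v: a(v) and not c(v))

print(rc_satisfies(NOT(f), NOT(b)))        # ¬f |~ ¬b: in the rational closure
```

On this input the procedure places flying birds and the penguin-free bird-free valuations at rank 0, non-flying birds at rank 1, flying penguins at rank 2, and birdless penguins at ∞, in line with the two-levels-of-typicality reading above.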
3 Disjunctive rationality and interval-based preferential semantics

One may argue that there are cases in which Rational Monotonicity is too strong a rule to enforce and for which a weaker defeasible consequence relation would suffice (Giordano et al. 2010; Makinson 1994). Nevertheless, doing away completely with rationality (i.e., sticking to the preferential rules only) is not particularly appropriate in a defeasible-reasoning context. Indeed, as is widely known in the literature, preferential systems induce entailment relations that are monotonic. In that respect, here we are interested in defeasible consequence relations (or defeasible conditionals) that do not necessarily satisfy Rational Monotonicity while still encapsulating some form of rationality, i.e., a venturous passage from the premises to the conclusion. A case in point is that of the Disjunctive Rationality (DR) rule (Kraus, Lehmann, and Magidor 1990) below:

(DR) α ∨ β |∼ γ / α |∼ γ or β |∼ γ

Intuitively, DR says that if one may draw a conclusion from a disjunction of premises, then one should be able to draw this conclusion from at least one of these premises taken alone (Freund 1993). A preferential consequence relation is called disjunctive if it also satisfies DR.

As it turns out, every rational consequence relation is also disjunctive, but not the other way round (Lehmann and Magidor 1992). Therefore, DR is a weaker form of rationality, as its name suggests. Given that, Disjunctive Rationality is indeed a suitable candidate for the type of investigation we have in mind.

A semantic characterisation of disjunctive consequence relations was given by Freund (1993) based on a filtering condition on the underlying ordering. Here, we provide an alternative semantics in terms of interval-based interpretations. (We conjecture that Freund's semantic constructions and ours can be shown to be equivalent in the finite case.)

Definition 3. An interval-based interpretation is a tuple I := ⟨L, U⟩, where L and U are functions from U to N ∪ {∞} s.t. (i) L(v) = 0 for some v ∈ U; (ii) if L(u) = i or U(u) = i, then for every 0 ≤ j < i, there is u′ s.t. either L(u′) = j or U(u′) = j; (iii) L(v) ≤ U(v), for every v ∈ U; and (iv) L(u) = ∞ iff U(u) = ∞. Given I = ⟨L, U⟩ and v ∈ U, L(v) is the lower rank of v in I, and U(v) is the upper rank of v in I. Hence, for any v, the pair (L(v), U(v)) is the interval of v in I. We say u is more preferred than v in I, denoted u ≺ v, if U(u) < L(v).

The preference order ≺ on U defined above via an interval-based interpretation forms an interval order, i.e., it is a strict partial order that additionally satisfies the interval condition: if u ≺ v and u′ ≺ v′, then either u ≺ v′ or u′ ≺ v. Furthermore, every interval order can be defined from an interval-based interpretation in this way. See the work of Fishburn (1985) for a detailed treatise on interval orders.

Figure 2 illustrates an example of an interval-based interpretation for P = {b, f, p}. In our depictions of interval-based interpretations, it will be convenient to see I as a function from U to intervals on the set N ∪ {∞}. Whenever the intervals associated to valuations u and v overlap, the intuition is that the two valuations are incomparable in I; otherwise the leftmost interval is seen as more preferred than the rightmost one.

[Figure 2: An interval-based interpretation for P = {b, f, p}: intervals over the levels 0, 1, 2, ∞ for the valuations b̄f̄p, b̄fp, bfp, bf̄p, bf̄p̄, bfp̄, b̄fp̄, b̄f̄p̄ (figure not reproduced).]

In Figure 2, the rationale behind the ordering is as follows: situations with flying birds are the most normal ones; situations with non-flying penguins are more normal than the flying-penguin ones, but both are incomparable to non-penguin situations; the situations with penguins that are not birds are the implausible ones; and finally those that are not about birds or penguins are so irrelevant as to be seen as incomparable with any of the plausible ones.

The notions of plausible and implausible valuations, as well as that of α-valuations, carry over to interval-based interpretations, only now the plausible valuations are the ones with finite lower ranks (and hence also finite upper ranks, by part (iv) of the previous definition). With L(α) := min{L(v) | v ∈ JαK^I} and U(α) := min{U(v) | v ∈ JαK^I} we denote, respectively, the lower and the upper rank of α in I. By convention, if JαK^I = ∅, we let L(α) = U(α) = ∞. We say α |∼ β is satisfied in I (denoted I ⊩ α |∼ β) if U(α) < L(α ∧ ¬β). (Recall the convention that ∞ < ∞.) As an example, in the interval-based interpretation of Figure 2, we have I ⊩ b |∼ f, I ⊩ p |∼ ¬f, and I ⊮ ¬f |∼ ¬p (contrary to the ranked interpretation R in Figure 1, which endorses the latter statement).

In the tradition of the KLM approach to defeasible reasoning, we define the defeasible consequence relation induced by an interval-based interpretation: |∼_I := {α |∼ β | I ⊩ α |∼ β}. We can now state a KLM-style representation result establishing that our interval-based semantics is suitable for characterising the class of disjunctive defeasible consequence relations, which is a variant of Freund's (1993) result:

Theorem 1. A defeasible consequence relation is a disjunctive consequence relation if and only if it is defined by some
interval-based interpretation, i.e., |∼ is disjunctive if and only if there is I such that |∼ = |∼_I.
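Checking I ⊩ α |∼ β against this semantics is straightforward to mechanise. The sketch below is ours; the interval assignment is an illustrative choice for the single conditional b |∼ f over two atoms (the paper's figures are not reproduced in this copy), in which the only strict preference puts bf before the bird-that-does-not-fly valuation:

```python
from itertools import product

P = ("b", "f")
VALS = list(product((0, 1), repeat=len(P)))   # valuations as (b, f) tuples

b = lambda v: v[0] == 1
f = lambda v: v[1] == 1
NOT = lambda a: (lambda v: not a(v))
AND = lambda a, c: (lambda v: a(v) and c(v))

# Lower and upper ranks per valuation (our illustrative numbers).
L = {(1, 1): 0, (1, 0): 1, (0, 1): 0, (0, 0): 0}
U = {(1, 1): 0, (1, 0): 1, (0, 1): 1, (0, 0): 1}

def lower(alpha):
    xs = [L[v] for v in VALS if alpha(v)]
    return min(xs) if xs else float("inf")

def upper(alpha):
    xs = [U[v] for v in VALS if alpha(v)]
    return min(xs) if xs else float("inf")

def satisfies(a, c):
    """I ⊩ a |~ c  iff  U(a) < L(a ∧ ¬c)."""
    return upper(a) < lower(AND(a, NOT(c)))

def preferred(u, v):
    """u ≺ v iff the whole interval of u lies before that of v."""
    return U[u] < L[v]

print(satisfies(b, f))            # the conditional b |~ f holds
print(satisfies(NOT(f), NOT(b)))  # ¬f |~ ¬b fails here
```

Note that the failure of ¬f |∼ ¬b under this interval assignment is exactly the kind of cautious behaviour, relative to rational closure, that motivates the disjunctive setting.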
4 Towards disjunctive rational closure

Given a conditional knowledge base KB, the obvious definition of closure under Disjunctive Rationality consists in taking the intersection of all disjunctive extensions of |∼_KB (cf. Section 2.2). Let us call it the disjunctive closure of |∼_KB, with interval-based entailment, defined as KB |=_I α |∼ β if every interval-based model of KB also satisfies α |∼ β, being its semantic counterpart. The following result shows that the notion of disjunctive closure is stillborn, i.e., it does not even satisfy Disjunctive Rationality.

Theorem 2. Given a conditional knowledge base KB, (i) the disjunctive closure of KB coincides with its preferential closure |∼_PC^KB; (ii) there exists KB such that |∼_PC^KB does not satisfy Disjunctive Rationality.

For a simple counterexample showing that |∼_PC^KB need not satisfy Disjunctive Rationality, consider KB = {⊤ |∼ b}. Clearly we have p ∨ ¬p |∼_PC^KB b, but one can easily construct interval-based interpretations I1, I2 whose corresponding consequence relations both satisfy KB but for which p ̸|∼_I1 b and ¬p ̸|∼_I2 b.

This result suggests that the quest for a suitable definition of entailment under disjunctive rationality should follow the footprints in the road which led to the definition of rational closure. Such is our contention here, and our research question is now: 'Is there a single best disjunctive relation extending the one induced by a given conditional knowledge base KB?'

Let us denote by |∼_*^KB the special defeasible consequence relation that we are looking for. In the remainder of this section, we consider some desirable properties for the mapping from KB to |∼_*^KB, and consider some simple examples in order to build intuitions. In the following section, we will offer a concrete construction: the Disjunctive Rational Closure of KB.

4.1 Basic postulates

Starting with our most basic requirements, we put forward the following two postulates:

Inclusion If α |∼ β ∈ KB, then α |∼_*^KB β.

D-Rationality |∼_*^KB is a disjunctive consequence relation.

Note that, given Theorem 1, D-Rationality is equivalent to saying there is an interval-based interpretation I such that |∼_*^KB = |∼_I. If we replace "disjunctive consequence" in the statement by "rational consequence", then that is the postulate that is usually considered in the area.

Another reasonable property to require from an induced consequence relation is for two equivalent knowledge bases to yield exactly the same set of inferences. This prompts the question of what it means to say that two conditional knowledge bases are equivalent. One weak notion of equivalence can be defined as follows.

Definition 4. Let α1, α2, β1, β2 ∈ L. We say α1 |∼ β1 is equivalent to α2 |∼ β2 if |= (α1 ↔ α2) ∧ (β1 ↔ β2). We say two knowledge bases KB1, KB2 are equivalent, written KB1 ≡ KB2, if there is a bijection f : KB1 −→ KB2 s.t. each α |∼ β ∈ KB1 is equivalent to f(α |∼ β).

Given this, we can express a weak form of syntax independence:

Equivalence If KB1 ≡ KB2, then |∼_*^KB1 = |∼_*^KB2.

Finally, the last of our basic postulates requires rational closure to be the upper bound on how venturous our consequence relation should be.

Infra-Rationality |∼_*^KB ⊆ |∼_RC^KB.

4.2 Minimality postulates

Echoing a fundamental principle of reasoning in general, and of non-monotonic reasoning in particular, is a property requiring |∼_*^KB to contain only conditionals whose inferences can be justified on the basis of KB. The first idea to achieve this would be to set |∼_*^KB to be a set-theoretically minimal disjunctive consequence relation that extends KB.

Example 1. Suppose the only knowledge we have is a single conditional saying "birds normally fly", i.e., KB = {b |∼ f}. Assuming just two variables, we have a unique ⊆-minimal disjunctive consequence relation extending this knowledge base, which is given by the interval-based interpretation I in Figure 3. Indeed, the conditional b |∼ f is saying precisely that bf ≺ bf̄, but is telling us nothing with regard to the relative typicality of the other two possible valuations, so any pair of valuations other than this one is incomparable. For this reason, we do not have ¬f |∼_I ¬b here. Note that the rational closure in this example does endorse this latter conclusion, thus providing further evidence that the rational closure arguably gives some unwarranted conclusions.

[Figure 3: Interval-based model of KB = {b |∼ f}: intervals over the levels 0 and 1 for the valuations bf̄, bf, b̄f, b̄f̄ (figure not reproduced).]

The next example illustrates the fact that there might be more than one ⊆-minimal extension of a KB-induced consequence relation.

Example 2. Assume a COVID-19-inspired scenario with only two propositions, m and s, standing for, respectively, "you wear a mask" and "you observe social distancing". Let KB = {m |∼ s, ¬m |∼ s}. There are two ⊆-minimal disjunctive consequence relations extending |∼_KB, corresponding to the two interval-based interpretations I1 and I2 (from left to right) in Figure 4. The first conditional is saying ms ≺ ms̄, while the second is saying m̄s ≺ m̄s̄. According
to the interval condition (see the paragraph following Definition 3), we must then have either ms ≺ m̄s̄ or m̄s ≺ ms̄. The choice of which gives rise to I1 and I2, respectively.

[Figure 4: Interval-based models of the two ⊆-minimal extensions of |∼_KB, for KB = {m |∼ s, ¬m |∼ s}: intervals over the levels 0, 1, 2 for the valuations ms̄, ms, m̄s, m̄s̄ (figures not reproduced).]

In the light of Example 2 above, a question that arises is what to do when one has more than a single ⊆-minimal extension of |∼_KB. Theorem 2 already tells us we cannot, in general, take the obvious approach of taking their intersection. However, even though returning the disjunctive/preferential closure |∼_PC^KB is not enough to ensure D-Rationality, we might still expect the following postulates as reasonable.

Vacuity If |∼_PC^KB is a disjunctive consequence relation, then |∼_*^KB = |∼_PC^KB.

Preferential Extension |∼_PC^KB ⊆ |∼_*^KB.

(Note that, given Theorem 2, the postulate above follows from Inclusion and D-Rationality.)

Justification If α |∼_*^KB β, then α |∼′ β for at least one ⊆-minimal disjunctive relation |∼′ extending |∼_KB.

4.3 Representation independence postulates

Going back to Example 2, what should the expected output be in this case? Intuitively, faced with the choice of which of the pairs ms ≺ m̄s̄ or m̄s ≺ ms̄ to include, and in the absence of any reason to prefer either one, it seems the right thing to do is to include both, and thereby let the interval-based interpretation depicted in Figure 5 yield the output. Notice that this will be the same as the rational closure in this case.

[Figure 5: Interval-based model of the union of the two ⊆-minimal extensions of |∼_KB, for KB = {m |∼ s, ¬m |∼ s}: intervals over the levels 0 and 1 for the valuations ms̄, ms, m̄s, m̄s̄ (figure not reproduced).]

We can express the desired symmetry requirement in a syntactic form, using the notion of symbol translations (Marquis and Schwind 2014). A symbol translation (on P) is a function σ : P −→ L. A symbol translation can be extended to a function on L by setting, for each sentence α, σ(α) to be the sentence obtained from α by replacing each atom p occurring in α by its image σ(p) throughout.² Similarly, given a conditional knowledge base KB and a symbol translation σ(·), we denote by σ(KB) the knowledge base obtained by replacing each conditional α |∼ β in KB by σ(α) |∼ σ(β).

²Marquis and Schwind (2014) consider much more general settings, but this is all we need in the present paper.

Representation Independence For any symbol translation σ(·), we have α |∼_*^KB β iff σ(α) |∼_*^σ(KB) σ(β).

Note that Weydert (2003) also considers Representation Independence (RI) in the context of conditional inference, but in a slightly different framework. The idea behind it has also been explored by Jaeger (1996), who, in particular, looked at the property in relation to rational closure. As noted by Marquis and Schwind (2014), the property is a very demanding one that is likely hard to satisfy in its full, unrestricted form above. And indeed this is confirmed in our setting, since it can be shown that Representation Independence is jointly incompatible with two of our basic postulates, namely Inclusion and Infra-Rationality. This motivates the need to focus on specific families of symbol translations. Some examples are the following:

1. σ(·) is a permutation on P, i.e., is just a renaming of the propositional variables;

2. σ(p) ∈ {p, ¬p}, for all p ∈ P. Then, instead of using p to denote, say, "it's raining", we use it rather to denote "it's not raining". We call any symbol translation of this type a negation-swapping symbol translation.

Each special subfamily of symbol translations yields a corresponding weakening of RI that applies to just that kind of translation. In particular we have the following postulate:

Negated Representation Independence For any negation-swapping symbol translation σ(·), we have α |∼_*^KB β iff σ(α) |∼_*^σ(KB) σ(β).

Example 3. Going back to Example 2, when modelling the scenario, instead of using propositional atom m to denote "you wear a mask" we could equally well have used it to denote "you do not wear a mask". Then the statement "if you wear a mask then, normally, you do social distancing" would be modelled by ¬m |∼ s, etc. This boils down to taking a negation-swapping symbol translation such that σ(m) = ¬m and σ(s) = s. Then σ(KB) = {¬m |∼ s, ¬¬m |∼ s}, and if we inferred, say, m ↔ s |∼ s from KB then we would expect to infer ¬m ↔ s |∼ s from σ(KB).

4.4 Cumulativity postulates

The idea behind a notion of Cumulativity in our setting is that adding to the knowledge base a conditional that was already inferred should not change anything in terms of its consequences. We can split this into two 'halves'.

Cautious Monotonicity If α |∼_*^KB β and KB′ = KB ∪ {α |∼ β}, then |∼_*^KB ⊆ |∼_*^KB′.
′
Cut If α |∼KB
β and KB ′ = KB ∪ {α |∼ β}, then |∼KB
⊆
∗
∗
KB
|∼∗ .
take all conditionals holding in this interpretation as the consequences of KB. In this section, we give our construction
KB
of the interpretation IDC
that gives us the disjunctive rational closure of a conditional knowledge base.
KB
KB
KB
To specify IDC
, we will construct the pair hLDC
, UDC
i
of functions specifying the lower and upper ranks for each
valuation. Since we aim to satisfy Infra-Rationality, our conKB
struction method takes the rational closure RRC
of KB as a
point of departure. Starting with the lower ranks, we simply
set, for all v ∈ U:
def
KB
KB
LDC
(v) =
RRC
(v).
That is, the lower ranks are given by the rational closure.
KB
, if we happen to have
For the upper ranks UDC
KB
KB
LDC (v) = RRC (v) = ∞, then, to conform with the definition of interval-based interpretation, it is clear that we must
KB
KB
(v) 6= ∞, then the construc(v) = ∞ also. If LDC
set UDC
KB
tion of UDC (v) becomes a little more involved. We require
first the following definition.
Definition 5. Given a ranked interpretation R and a conditional α |∼ β such that R
α |∼ β, we say a valuation v
verifies α |∼ β in R if R(v) = R(α).
Now, assuming L_DC^KB(v) ≠ ∞, our construction of U_DC^KB(v) splits into two cases, according to whether v verifies any of the conditionals from KB in R_RC^KB or not.
We conclude this section with an impossibility result concerning a subset of the postulates we have mentioned so far.
Theorem 3. There is no method ∗ simultaneously satisfying all of Inclusion, D-Rationality, Equivalence, Vacuity,
Cautious Monotonicity and Negated Representation Independence.
Proof. Assume, for contradiction, that ∗ satisfies all the
listed properties. Suppose P = {m, s} and let KB be the
knowledge base from Example 2, i.e., {m |∼ s, ¬m |∼ s}.
By Inclusion, m |∼_∗^KB s and ¬m |∼_∗^KB s. By D-Rationality, we know |∼_∗^KB satisfies the Or rule, so, from these two, we get m ∨ ¬m |∼_∗^KB s which, in turn, yields (m ↔ s) ∨ (¬m ↔ s) |∼_∗^KB s, by LLE. Applying DR to this means we have:

(m ↔ s) |∼_∗^KB s or (¬m ↔ s) |∼_∗^KB s.   (1)
Now, let σ(·) be the negation-swapping symbol translation mentioned in Example 3, i.e., σ(m) = ¬m, σ(s) = s, so σ(KB) = {¬m |∼ s, ¬¬m |∼ s}. Then, by Negated Representation Independence, we have (m ↔ s) |∼_∗^KB s iff (¬m ↔ s) |∼_∗^σ(KB) s. But clearly we have KB ≡ σ(KB), so, by Equivalence, we obtain from this:
(m ↔ s) |∼_∗^KB s iff (¬m ↔ s) |∼_∗^KB s.   (2)

Putting (1) and (2) together gives us both (m ↔ s) |∼_∗^KB s and (¬m ↔ s) |∼_∗^KB s. Now, let KB′ = KB ∪ {(m ↔ s) |∼ s}. By Cautious Monotonicity, |∼_∗^KB ⊆ |∼_∗^KB′. In particular, (¬m ↔ s) |∼_∗^KB′ s. It can be checked that the disjunctive/preferential closure of KB′ is itself a disjunctive consequence relation. In fact, it corresponds to the interval-based interpretation on the left of Figure 4. Hence, by Vacuity, this particular interval-based interpretation corresponds also to |∼_∗^KB′. But, by inspecting this picture, we see (¬m ↔ s) ̸|∼_∗^KB′ s, which leads to a contradiction.

Case 1: v does not verify any of the conditionals in KB in R_RC^KB. In this case, we set:

U_DC^KB(v) := max{R_RC^KB(u) | R_RC^KB(u) ≠ ∞}.
Case 2: v verifies at least one conditional from KB in R_RC^KB. In this case, the idea is to extend the upper rank of v as much as possible while still ensuring the constraints represented by KB are respected in the resulting I_DC^KB. If v verifies α |∼ β in R_RC^KB, then this is achieved by setting U_DC^KB(v) = R_RC^KB(α ∧ ¬β) − 1; or, if R_RC^KB(α ∧ ¬β) = ∞, then again just set U_DC^KB(v) = max{R_RC^KB(u) | R_RC^KB(u) ≠ ∞}, as in Case 1. (This takes care of ‘redundant’ conditionals that might occur in KB, like α |∼ α.) We introduce now the following notation. Given sentences α, β:

t_RC^KB(α, β) := R_RC^KB(α ∧ ¬β) − 1, if R_RC^KB(α ∧ ¬β) ≠ ∞; max{R_RC^KB(u) | R_RC^KB(u) ≠ ∞}, otherwise.
But we need to take care of the situation in which v possibly verifies more than one conditional from KB in R_RC^KB. In order to ensure that all conditionals in KB will still be satisfied, we need to take:

U_DC^KB(v) := min{t_RC^KB(α, β) | (α |∼ β) ∈ KB and v verifies α |∼ β in R_RC^KB}.

Theorem 3 is both surprising and disappointing, since all of the properties mentioned seem to be rather intuitive and desirable. Note that a close inspection of the proof shows that even just Vacuity and Cautious Monotonicity together place some quite severe restrictions on the behaviour of ∗.

Corollary 1. Let P = {p, q} and KB = {p |∼ q, ¬p |∼ q}. There is no operator ∗ satisfying Vacuity and Cautious Monotonicity that infers both (p ↔ q) |∼_∗^KB q and (¬p ↔ q) |∼_∗^KB q.
So, summarising the two cases, we arrive at our final definition of U_DC^KB:

U_DC^KB(v) := min{t_RC^KB(α, β) | α |∼ β ∈ KB and v verifies α |∼ β in R_RC^KB}, if v verifies at least one conditional from KB in R_RC^KB; max{R_RC^KB(u) | R_RC^KB(u) ≠ ∞}, otherwise.
What can we do in the face of these results? Our strategy will be to seek to construct a method that satisfies as many of these properties as possible. We now provide our candidate for such a method: the disjunctive rational closure.
5 A construction for disjunctive rational closure
In order to satisfy D-Rationality, we can focus on constructing a special interval-based interpretation from KB and then
Note that if v verifies α |∼ β ∈ KB in R_RC^KB, then R_RC^KB(v) = R_RC^KB(α) ≤ R_RC^KB(α ∧ ¬β) − 1 = t_RC^KB(α, β).
Thus, in both cases above, we have L_DC^KB(v) ≤ U_DC^KB(v), and so the pair L_DC^KB and U_DC^KB forms a legitimate interval-based interpretation.
We thus arrive at our final definition of the disjunctive
rational closure of a conditional knowledge base.
Definition 6. Let I_DC^KB := ⟨L_DC^KB, U_DC^KB⟩ be the interval-based interpretation specified by L_DC^KB and U_DC^KB as above. The disjunctive rational closure of KB is the defeasible consequence relation |∼_DC^KB := {α |∼ β | I_DC^KB ⊩ α |∼ β}.
In the remainder of this section, we revisit the examples
we have seen throughout the paper, to see what answer the
disjunctive rational closure gives.
Example 4. Going back to Example 1, with KB = {b |∼ f}, the rational closure yields R_RC^KB(bf) = R_RC^KB(b̄f) = R_RC^KB(b̄f̄) = 0 and R_RC^KB(bf̄) = 1. Since L_DC^KB = R_RC^KB, this gives us the lower ranks for each valuation in I_DC^KB. Turning to the upper ranks, the only valuation that verifies the single conditional b |∼ f in KB is bf, thus U_DC^KB(bf) = t_RC^KB(b, f) = R_RC^KB(b ∧ ¬f) − 1 = 1 − 1 = 0, meaning that the interval assigned to bf is (0, 0). The other three valuations all get assigned the same upper rank, which is just the maximum finite rank occurring in R_RC^KB, which is 1. Thus the interval assigned to bf̄ is (1, 1), while both the valuations in ⟦¬b⟧ are assigned (0, 1). So I_DC^KB outputs exactly the same interval-based interpretation depicted in Figure 3 which, recall, gives the unique ⊆-minimal disjunctive consequence relation extending KB in this case.
Example 5. Returning to Example 2, with KB = {m |∼ s, ¬m |∼ s}, the rational closure yields R_RC^KB(ms) = R_RC^KB(m̄s) = 0 and R_RC^KB(ms̄) = R_RC^KB(m̄s̄) = 1, which gives us the lower ranks. The valuation ms verifies only the conditional m |∼ s, and so U_DC^KB(ms) = t_RC^KB(m, s) = R_RC^KB(m ∧ ¬s) − 1 = 1 − 1 = 0. Similarly, the valuation m̄s verifies only the conditional ¬m |∼ s and so, by analogous reasoning, U_DC^KB(m̄s) = t_RC^KB(¬m, s) = 0. So both of these valuations are assigned the interval (0, 0) by I_DC^KB. The other two valuations, which verify neither conditional in KB, are assigned (1, 1). Thus, in this case, I_DC^KB returns just the rational closure of KB, as pictured in Figure 5.
In both the above examples, the disjunctive rational closure returns arguably the right answers.
Example 6. Consider KB = {b |∼ f, p → b, p |∼ ¬f}. As previously mentioned, the rational closure R_RC^KB for this KB is depicted in Figure 1. Since both of the valuations in ⟦p ∧ ¬b⟧ (in red at the top of the picture) are deemed implausible (i.e., have rank ∞), they are both assigned interval (∞, ∞). Focusing then on just the plausible valuations, the only valuation verifying b |∼ f in R_RC^KB is bfp̄ (which verifies no other conditional in KB), so U_DC^KB(bfp̄) = R_RC^KB(b ∧ ¬f) − 1 = 1 − 1 = 0. The only valuation verifying p |∼ ¬f is bf̄p, so U_DC^KB(bf̄p) = R_RC^KB(p ∧ f) − 1 = 2 − 1 = 1. All other plausible valuations get assigned as their upper rank the maximum finite rank, which is 2. The resulting I_DC^KB is the interval-based interpretation depicted in Figure 2.
We end this section by considering our construction from
the standpoint of complexity. The construction method
above runs in time that grows (singly) exponentially with the
size of the input, even if the rational closure of the knowledge base has been computed offline. To see why, let the
input be a set of propositional atoms P together with a conditional knowledge base KB, and let |KB| = n. (For simplicity, we assume the size of KB to be the number of conditionals therein.) We know that |U| = 2|P| . Now, for
each valuation v ∈ U , one has to check whether v verifies at least one conditional α |∼ β in KB (cf. Definition 5).
In the worst case, (i) all conditionals in KB will be checked against v, i.e., we will have n checks per valuation. Each such check amounts to comparing R(v)
with R(α), where α is the antecedent of the conditional under inspection. While R(v) is already known, R(α) has
to be computed (unless, of course, we also assume it has
been done offline in the computation of the rational closure).
Computing R(α) is done by searching for the lowest valuations in R_RC^KB satisfying α. In the worst case, we have that (ii) 2^|P| valuations have to be inspected. Each such
inspection amounts to a propositional verification, which
is a polynomial-time task. Every time v verifies a conditional α |∼ β, the computation of t_RC^KB(·) also requires that of R_RC^KB(α ∧ ¬β). In the worst case, the latter requires 2^|P| propositional verifications. So, the computation of t_RC^KB(·) takes at most (iii) n × 2^|P| checks. From (i), (ii) and (iii), it follows that n² × 2^(2|P|) propositional verifications are required. This has to be done for each of the 2^|P| valuations, and therefore we have a total of n² × 2^(3|P|) verifications in
the worst case, from which the result follows.
Let us now take a look at the complexity of entailment
checking, i.e., that of checking whether a conditional α |∼
β is satisfied by I_DC^KB. This task amounts to computing U_DC^KB(α) and L_DC^KB(α ∧ ¬β) and comparing them. It is
easy to see that in the worst-case scenario both require 2|P|
propositional verifications.
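For illustration, the entailment check just described can be sketched as follows. The sketch assumes the satisfaction condition used in the proof of Proposition 1 (α |∼ β holds iff U(α) < L(α ∧ ¬β), with U(α) and L(γ) taken as the minimal upper and lower ranks over the relevant valuations), and hard-codes the intervals that the construction assigns to Example 2's KB = {m |∼ s, ¬m |∼ s}.

```python
# Sketch of entailment checking against an interval-based interpretation
# (assumed semantics: alpha |~ beta iff U(alpha) < L(alpha & ~beta)).
import math
from itertools import product

ATOMS = ["m", "s"]
VALS = [dict(zip(ATOMS, bits)) for bits in product([True, False], repeat=len(ATOMS))]

# Intervals from Example 5: ms and (not m)s get (0, 0); the other two get (1, 1).
def interval(v):
    return (0, 0) if v["s"] else (1, 1)

def lower(sat):
    # L(gamma): minimal lower rank over gamma-worlds (inf if gamma is unsatisfiable).
    return min((interval(v)[0] for v in VALS if sat(v)), default=math.inf)

def upper(sat):
    # U(alpha): minimal upper rank over alpha-worlds (inf if alpha is unsatisfiable).
    return min((interval(v)[1] for v in VALS if sat(v)), default=math.inf)

def entails(ant, cons):
    """alpha |~ beta iff U(alpha) < L(alpha & not beta)."""
    return upper(ant) < lower(lambda v: ant(v) and not cons(v))

print(entails(lambda v: v["m"], lambda v: v["s"]))        # m |~ s
print(entails(lambda v: not v["m"], lambda v: v["s"]))    # ~m |~ s
print(entails(lambda v: v["s"], lambda v: v["m"]))        # s |~ m
```

On these intervals, both m |∼ s and ¬m |∼ s come out as entailed while s |∼ m does not, matching Example 5. Each call scans all valuations once, in line with the 2^|P| worst-case bound above.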
6 Properties of the Disjunctive Rational Closure
We now turn to the question of which of the postulates from
Section 4 are satisfied by the disjunctive rational closure. We
start by observing that we obtain all of the basic postulates
proposed in Section 4.1:
Proposition 1. The disjunctive rational closure satisfies Inclusion, D-Rationality, Equivalence and Infra-Rationality.
Proof. (Outline) D-Rationality is immediate since we construct an interval-based interpretation. Equivalence is also straightforward. For Infra-Rationality, first recall that α |∼_DC^KB β iff U_DC^KB(α) < L_DC^KB(α ∧ ¬β). Since L_DC^KB(α) ≤ U_DC^KB(α) (which follows by the definition of interval-based interpretation) and L_DC^KB(α ∧ ¬β) = R_RC^KB(α ∧ ¬β) (by construction), we have that U_DC^KB(α) < L_DC^KB(α ∧ ¬β) implies R_RC^KB(α) = L_DC^KB(α) < R_RC^KB(α ∧ ¬β), giving α |∼_RC^KB β, as required for Infra-Rationality. For Inclusion, suppose α |∼ β ∈ KB.
If R_RC^KB(α) = ∞, then L_DC^KB(α) = U_DC^KB(α) = ∞ by construction, and so α |∼_DC^KB β. So assume R_RC^KB(α) ≠ ∞. Then, to show α |∼_DC^KB β, it suffices to show U_DC^KB(v) < L_DC^KB(α ∧ ¬β) = R_RC^KB(α ∧ ¬β) for at least one v ∈ ⟦α⟧. Since rational closure satisfies Inclusion, we know α |∼_RC^KB β and so, since R_RC^KB(α) ≠ ∞, there must exist at least one v′ verifying α |∼ β in R_RC^KB. By construction of U_DC^KB, we have U_DC^KB(v′) ≤ t_RC^KB(α, β) = R_RC^KB(α ∧ ¬β) − 1, as required.
On the other hand, adding a conditional that is already inferred never leads to an increase in the upper ranks, which means the disjunctive rational closure does satisfy Cautious Monotonicity.
Proposition 3. The disjunctive rational closure satisfies
Cautious Monotonicity.
Proof. (Outline) Suppose α |∼_DC^KB β and let KB′ = KB ∪ {α |∼ β}. Since disjunctive rational closure satisfies Infra-Rationality, we know α |∼_RC^KB β, and so, since rational closure satisfies Cautious Monotonicity, R_RC^KB′ = R_RC^KB, i.e., the lower ranks of all valuations in I_DC^KB′ are unchanged from I_DC^KB. To show |∼_DC^KB ⊆ |∼_DC^KB′, it thus suffices to show U_DC^KB′(v) ≤ U_DC^KB(v) for all valuations v. If v does not verify α |∼ β in R_RC^KB′, then U_DC^KB′(v) = U_DC^KB(v) (since all terms and cases in the definition of U_DC^KB′ depend only on R_RC^KB′ = R_RC^KB), while if v does verify α |∼ β in R_RC^KB′, then U_DC^KB′(v) = min{U_DC^KB(v), t_RC^KB′(α, β)} ≤ U_DC^KB(v), as required.
We remind the reader that, since Inclusion and D-Rationality hold, disjunctive rational closure also satisfies Preferential Extension.
Now we look at the Cumulativity properties. It is known from the work of Lehmann and Magidor (1992) that rational closure satisfies both Cautious Monotonicity and Cut and, in fact, that if α |∼_RC^KB β and KB′ = KB ∪ {α |∼ β}, then R_RC^KB = R_RC^KB′. We can show the following for disjunctive rational closure.
As we have seen in Corollary 1 in Section 4.4, the satisfaction of Cautious Monotonicity, together with the seemingly very reasonable behaviour displayed by disjunctive rational closure in Example 5, comes at the cost of Vacuity, i.e., even if the preferential closure happens to be a disjunctive relation, the output may sanction extra conclusions.
Proposition 2. The disjunctive rational closure does not satisfy Cut.

Proof. Assume P = {b, f}, and KB is again the knowledge base from Example 1, i.e., {b |∼ f}. We have seen in Example 4 that I_DC^KB is given by the interval-based interpretation depicted in Figure 3. By inspecting this picture, we see I_DC^KB ⊩ ⊤ |∼ (b → f). Now let KB′ = KB ∪ {⊤ |∼ (b → f)}. Then I_DC^KB′ is given by the model in Figure 6. We now have I_DC^KB′ ⊩ ¬f |∼ ¬b, whereas before we had I_DC^KB ̸⊩ ¬f |∼ ¬b.

Figure 6: Output for KB′ = {b |∼ f, ⊤ |∼ (b → f)}.

Proposition 4. The disjunctive rational closure does not satisfy Vacuity.

Proof. By Corollary 1, there can be no operator ∗ satisfying Cautious Monotonicity and Vacuity that infers both (¬m ↔ s) |∼_DC^KB s and (m ↔ s) |∼_DC^KB s. We saw in Example 5 that the disjunctive rational closure returns the rational closure for this KB, and so yields both these conditional inferences. We have also just seen that disjunctive rational closure satisfies Cautious Monotonicity. Hence we deduce that disjunctive rational closure cannot satisfy Vacuity.

What about the Representation Independence postulates? Concerning full Representation Independence, we have remarked earlier that this postulate is not compatible with the basic postulates, and so Proposition 1 already tells us that disjunctive rational closure fails it. However, we conjecture that Negated Representation Independence is satisfied, since we can show that if rational closure satisfies it, then disjunctive rational closure will inherit the property. Although Jaeger (1996) showed that rational closure does indeed conform with his version of Representation Independence, it remains to be proved that his notion coincides precisely with ours.
Essentially, the reason for the failure of Cut is that, by adding a new conditional α |∼ β to the knowledge base, even when that conditional is already inferred by the disjunctive rational closure, we give certain valuations (namely those in ⟦α⟧) the opportunity to verify one more conditional from the knowledge base in R_RC^KB′. (See, e.g., the two valuations in ⟦¬b⟧ in the above counterexample.) This leads, potentially, to a corresponding decrease in their upper ranks U_DC^KB′, leading in turn to more inferences being made available. This behaviour reveals that disjunctive rational closure can be termed a base-driven approach, since the conditionals that are included explicitly in the knowledge base have more influence than those that are merely derived. However, adding an inferred conditional will never lead to an increase in the upper ranks.
7 Concluding remarks
In this paper, we have set ourselves the task of reviving interest in weaker alternatives to Rational Monotonicity when reasoning with conditional knowledge bases. We have studied the case of Disjunctive Rationality, a property already known to the community from the work of Kraus et al. and Freund in the early ’90s, which we have then coupled with a semantics in terms of interval orders borrowed from more recent work by Rott in belief revision.
In our quest for a suitable form of entailment ensuring
Disjunctive Rationality, we started by putting forward a set
of postulates, all reasonable at first glance, characterising
its expected behaviour. As it turns out, not all of them can
be satisfied simultaneously, which suggests there might be
more than one answer to our research question. We have
then provided a construction of the disjunctive rational closure of a conditional knowledge base, which infers a set of
conditionals intermediate between the preferential closure
and the rational closure.
Regarding the properties of disjunctive rational closure,
the news is somewhat mixed, with several basic postulates
satisfied, as well as Cautious Monotonicity, but with neither Cut nor Vacuity holding in general. Regarding Cut, the
reason for its failure seems tied to the fact that disjunctive
rational closure places special importance on the conditionals that are explicitly written as part of the knowledge base.
In this regard it shares commonalities with other base-driven
approaches to defeasible inference such as the lexicographic
closure (Lehmann 1995). We conjecture that a weaker version of Cut will still hold for our approach, one in which the added conditional α |∼ β is such that α already appears as the antecedent of some conditional in KB.
Regarding Vacuity, our impossibility result and surrounding discussion tells us that its failure is unavoidable given the
other, reasonable, behaviour that we have shown disjunctive
rational closure to exhibit. Essentially, when trying to devise
a method for conditional inference under Disjunctive Rationality, we are faced with a choice between Vacuity and Cautious Monotonicity, with disjunctive rational closure favouring the latter at the expense of the former. It is possible, of
course, to tweak the current approach by treating the case
when |∼_PC^KB happens to be a disjunctive relation separately, outputting the preferential closure in this case, while returning the disjunctive rational closure otherwise. However, the full ripple effects of this manoeuvre on the other properties of |∼_DC^KB remain to be worked out.
As for future work, we plan to start by checking
whether disjunctive rational closure satisfies Negated Representation Independence, as well as the Justification postulate. We also plan to investigate suitable definitions of
a preference relation on the set of interval-based interpretations. We hope our construction can be shown to be the
most preferred extension of the knowledge base according
to some intuitively defined preference relation, as has been
done in the rational case.
In this work we required the postulate of Infra-Rationality.
As a result our construction of disjunctive rational closure
took the rational closure as a starting point and then performed a particular modification to it to obtain a special
‘privileged’ subset of it that extends the input knowledge
base and forms a disjunctive consequence relation. However
it is clear that this modification could just as well be applied
to any of the other conditional inference methods that have
been suggested in the literature and that output a rational
consequence relation, such as the lexicographic closure or
System JLZ (Weydert 2003) or those based on c-revisions
(Kern-Isberner 2001). It will be interesting to see what kind
of properties will be gained or lost in these cases.
Finally, given the recent trend in applying defeasible reasoning to formal ontologies in Description Logics (Bonatti
et al. 2015; Bonatti and Sauro 2017; Britz, Meyer, and Varzinczak 2011; Britz and Varzinczak 2019; Giordano et al.
2015; Pensel and Turhan 2018), an investigation of our approach beyond the propositional case is also envisaged.
Acknowledgments
This work is based upon research supported in part by the
“Programme de Recherche Commun” Non-Classical Reasoning for Enhanced Ontology-based Semantic Technologies between the CNRS and the Royal Society. Thanks to
the anonymous NMR reviewers for some helpful suggestions.
References
Bonatti, P., and Sauro, L. 2017. On the logical properties of
the nonmonotonic description logic DLN . Artificial Intelligence 248:85–111.
Bonatti, P.; Faella, M.; Petrova, I.; and Sauro, L. 2015. A
new semantics for overriding in description logics. Artificial
Intelligence 222:1–48.
Bonatti, P. 2019. Rational closure for all description logics.
Artificial Intelligence 274:197–223.
Booth, R., and Paris, J. 1998. A note on the rational closure of knowledge bases with both positive and negative
knowledge. Journal of Logic, Language and Information
7(2):165–190.
Booth, R.; Casini, G.; Meyer, T.; and Varzinczak, I. 2019.
On rational entailment for propositional typicality logic. Artificial Intelligence 277.
Booth, R.; Meyer, T.; and Varzinczak, I. 2012. PTL:
A propositional typicality logic. In Fariñas del Cerro, L.;
Herzig, A.; and Mengin, J., eds., Proceedings of the 13th
European Conference on Logics in Artificial Intelligence
(JELIA), number 7519 in LNCS, 107–119. Springer.
Britz, K., and Varzinczak, I. 2017. Toward defeasible
SROIQ. In Proceedings of the 30th International Workshop on Description Logics.
Britz, K., and Varzinczak, I. 2018a. From KLM-style conditionals to defeasible modalities, and back. Journal of Applied Non-Classical Logics (JANCL) 28(1):92–121.
Britz, K., and Varzinczak, I. 2018b. Preferential accessibility and preferred worlds. Journal of Logic, Language and
Information (JoLLI) 27(2):133–155.
Britz, K., and Varzinczak, I. 2019. Contextual rational closure for defeasible ALC. Annals of Mathematics and Artificial Intelligence 87(1-2):83–108.
Britz, K.; Meyer, T.; and Varzinczak, I. 2011. Semantic foundation for preferential description logics. In Wang,
D., and Reynolds, M., eds., Proceedings of the 24th Australasian Joint Conference on Artificial Intelligence, number
7106 in LNAI, 491–500. Springer.
Casini, G.; Meyer, T.; Moodley, K.; and Nortjé, R. 2014.
Relevant closure: A new form of defeasible reasoning for
46
description logics. In Fermé, E., and Leite, J., eds., Proceedings of the 14th European Conference on Logics in Artificial Intelligence (JELIA), number 8761 in LNCS, 92–106.
Springer.
Casini, G.; Meyer, T.; Moodley, K.; Sattler, U.; and Varzinczak, I. 2015. Introducing defeasibility into OWL ontologies. In Arenas, M.; Corcho, O.; Simperl, E.; Strohmaier,
M.; d’Aquin, M.; Srinivas, K.; Groth, P.; Dumontier, M.;
Heflin, J.; Thirunarayan, K.; and Staab, S., eds., Proceedings
of the 14th International Semantic Web Conference (ISWC),
number 9367 in LNCS, 409–426. Springer.
Casini, G.; Meyer, T.; and Varzinczak, I. 2019. Taking defeasible entailment beyond rational closure. In Calimeri, F.;
Leone, N.; and Manna, M., eds., Proceedings of the 16th
European Conference on Logics in Artificial Intelligence
(JELIA), number 11468 in LNCS, 182–197. Springer.
Chafik, A.; Cheikh, F.; Condotta, J.-F.; and Varzinczak, I.
2020. On the decidability of a fragment of preferential LTL.
In Proceedings of the 27th International Symposium on Temporal Representation and Reasoning (TIME).
Fishburn, P. 1985. Interval Orders and Interval Graphs: A
Study of Partially Ordered Sets. Wiley.
Freund, M. 1993. Injective models and disjunctive relations.
Journal of Logic and Computation 3(3):231–247.
Gärdenfors, P., and Makinson, D. 1994. Nonmonotonic
inference based on expectations. Artificial Intelligence
65(2):197–245.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G.
2007. Preferential description logics. In Dershowitz, N.,
and Voronkov, A., eds., Logic for Programming, Artificial
Intelligence, and Reasoning (LPAR), number 4790 in LNAI,
257–272. Springer.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2010.
Preferential vs rational description logics: which one for reasoning about typicality? In Proceedings of the European
Conference on Artificial Intelligence (ECAI), 1069–1070.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2012.
A minimal model semantics for nonmonotonic reasoning.
In Fariñas del Cerro, L.; Herzig, A.; and Mengin, J., eds.,
Proceedings of the 13th European Conference on Logics in
Artificial Intelligence (JELIA), number 7519 in LNCS, 228–
241. Springer.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2015.
Semantic characterization of rational closure: From propositional logic to description logics. Artificial Intelligence
226:1–33.
Jaeger, M. 1996. Representation independence of nonmonotonic inference relations. In Aiello, L.; Doyle, J.;
and Shapiro, S., eds., Proceedings of the 5th International
Conference on Principles of Knowledge Representation and
Reasoning (KR), 461–472. Morgan Kaufmann Publishers.
Kern-Isberner, G. 2001. Conditionals in Nonmonotonic
Reasoning and Belief Revision. Springer, Lecture Notes in
Artificial Intelligence.
Kraus, S.; Lehmann, D.; and Magidor, M. 1990. Nonmono-
tonic reasoning, preferential models and cumulative logics.
Artificial Intelligence 44:167–207.
Lehmann, D., and Magidor, M. 1992. What does a conditional knowledge base entail? Artificial Intelligence 55:1–
60.
Lehmann, D. 1995. Another perspective on default reasoning. Annals of Mathematics and Artificial Intelligence
15(1):61–82.
Makinson, D. 1994. General patterns in nonmonotonic reasoning. In Gabbay, D.; Hogger, C.; and Robinson, J., eds.,
Handbook of Logic in Artificial Intelligence and Logic Programming, volume 3. Oxford University Press. 35–110.
Marquis, P., and Schwind, N. 2014. Lost in translation:
Language independence in propositional logic – application
to belief change. Artificial Intelligence 206:1–24.
McCarthy, J. 1980. Circumscription, a form of nonmonotonic reasoning. Artificial Intelligence 13(1-2):27–39.
Pensel, M., and Turhan, A.-Y. 2017. Including quantification in defeasible reasoning for the description logic EL⊥ .
In Balduccini, M., and Janhunen, T., eds., Proceedings of
the 14th International Conference on Logic Programming
and Nonmonotonic Reasoning (LPNMR), number 10377 in
LNCS, 78–84. Springer.
Pensel, M., and Turhan, A.-Y. 2018. Reasoning in the defeasible description logic EL⊥ – computing standard inferences
under rational and relevant semantics. International Journal
of Approximate Reasoning 112:28–70.
Reiter, R. 1980. A logic for default reasoning. Artificial
Intelligence 13(1-2):81–132.
Rott, H. 2014. Four floors for the theory of theory change:
The case of imperfect discrimination. In Fermé, E., and
Leite, J., eds., Proceedings of the 14th European Conference
on Logics in Artificial Intelligence (JELIA), number 8761 in
LNCS, 368–382. Springer.
Shoham, Y. 1988. Reasoning about Change: Time and Causation from the Standpoint of Artificial Intelligence. MIT
Press.
Varzinczak, I. 2018. A note on a description logic of concept
and role typicality for defeasible reasoning over ontologies.
Logica Universalis 12(3-4):297–325.
Weydert, E. 2003. System JLZ - rational default reasoning
by minimal ranking constructions. Journal of Applied Logic
1(3-4):273–308.
Treewidth-Aware Complexity in ASP:
Not all Positive Cycles are Equally Hard∗
Jorge Fandinno¹, Markus Hecher¹,²
¹ University of Potsdam, Germany
² TU Wien, Austria
{jorgefandinno, mhecher}@gmail.com
Abstract

It is well-known that deciding consistency for normal answer set programs (ASP) is NP-complete, thus as hard as the satisfiability problem for classical propositional logic (SAT). The best algorithms to solve these problems take exponential time in the worst case. The exponential time hypothesis (ETH) implies that this result is tight for SAT, that is, SAT cannot be solved in subexponential time. This immediately establishes that the result is also tight for the consistency problem for ASP. However, accounting for the treewidth of the problem, the consistency problem for ASP is slightly harder than SAT: while SAT can be solved by an algorithm that runs in exponential time in the treewidth k, it was recently shown that ASP requires exponential time in k · log(k). This extra cost is due to checking that there are no self-supported true atoms caused by positive cycles in the program. In this paper, we refine the above result and show that the consistency problem for ASP can be solved in exponential time in k · log(λ), where λ is the minimum of the treewidth and the size of the largest strongly-connected component in the positive dependency graph of the program. We provide a dynamic programming algorithm that solves the problem and a treewidth-aware reduction from ASP to SAT, both of which adhere to the above limit.

1 Introduction

Answer Set Programming (ASP) (Brewka, Eiter, and Truszczyński 2011; Gebser et al. 2012) is a problem modeling and solving paradigm well-known in the area of knowledge representation and reasoning that is experiencing an increasing number of successful applications (Balduccini, Gelfond, and Nogueira 2006; Nogueira et al. 2001; Guziolowski et al. 2013). The flexibility of ASP comes with a high computational complexity cost: its consistency problem, that is, deciding the existence of a solution (answer set) for a given logic program, is Σ^P_2-complete (Eiter and Gottlob 1995) in general. Fragments with lower complexity are also known. For instance, the consistency problem for normal ASP or head-cycle-free (HCF) ASP is NP-complete. Even for solving this class of programs, the best known algorithms require exponential time with respect to the size of the program. Still, existing solvers (Gebser et al. 2012; Alviano et al. 2017) are able to find solutions for many interesting problems in reasonable time. A way to shed light on this discrepancy is by means of parameterized complexity (Cygan et al. 2015), which conducts a more fine-grained complexity analysis in terms of parameters of a problem. For ASP, several results were achieved in this direction (Gottlob, Scarcello, and Sideri 2002; Lonc and Truszczynski 2003; Lin and Zhao 2004; Fichte and Szeider 2015); some insights even involve combinations of parameters (Lackner and Pfandler 2012; Fichte, Kronegger, and Woltran 2019). More recent studies focus on the influence of the parameter treewidth for solving ASP (Jakl, Pichler, and Woltran 2009; Fichte et al. 2017; Fichte and Hecher 2019; Bichler, Morak, and Woltran 2018; Bliem et al. 2020). These works directly make use of the treewidth of a given logic program in order to solve, e.g., the consistency problem, in polynomial time in the program size, while being exponential only in the treewidth. Recently, it was shown that for normal ASP deciding consistency is expected to be slightly superexponential in the treewidth (Hecher 2020). More concretely, a lower bound was established saying that, under reasonable assumptions such as the Exponential Time Hypothesis (ETH) (Impagliazzo, Paturi, and Zane 2001), consistency for any normal logic program of treewidth k cannot be decided in time significantly better than 2^(k·⌈log(k)⌉) · poly(n), where n is the number of variables (atoms) of the program. This result matches the known upper bound (Fichte and Hecher 2019) and shows that the consistency of normal ASP is slightly harder than the satisfiability (SAT) of a propositional formula, which under the ETH cannot be decided in time 2^(o(k)) · poly(n).

We address this result and provide a more detailed analysis where, besides treewidth, we also consider the size ℓ of the largest strongly-connected component (SCC) of the positive dependency graph as a parameter. This allows us to obtain runtimes below 2^(k·⌈log(k)⌉) · poly(n) and to show that not all positive cycles of logic programs are equally hard. Then, we also provide a treewidth-aware reduction from head-cycle-free ASP to the fragment of tight ASP, which prohibits cycles in the corresponding positive dependency graph. This reduction turns a given head-cycle-free program of treewidth k into a tight program of treewidth O(k · log(ℓ)), which improves known results (Hecher 2020). Finally, we establish that tight ASP is as hard as SAT in terms of treewidth.

∗The work has been supported by the Austrian Science Fund (FWF), Grants Y698 and P32830, and the Vienna Science and Technology Fund, Grant WWTF ICT19-065. It is also accepted for presentation at the ASPOCP’20 workshop (Fandinno and Hecher 2020).
and Truszczyński 2011). Let m, n, o be non-negative integers such that m ≤ n ≤ o, a1 , . . ., ao be distinct propositional atoms. Moreover, we refer by literal to an atom or the
negation thereof. A (logic) program Π is a set of rules of
the form a1 ∨ · · · ∨ am ← am+1 , . . . , an , ¬an+1 , . . . , ¬ao .
For a rule r, we let Hr := {a1 , . . . , am }, Br+ :=
{am+1 , . . . , an }, and Br− := {an+1 , . . . , ao }. We denote
the sets of atoms occurring in a rule r or in a program Π by at(r) := Hr ∪ Br+ ∪ Br− and at(Π) := ⋃r∈Π at(r). For a set X ⊆ at(Π) of atoms, we let X̄ := {¬x | x ∈ X}. Program Π is normal if |Hr | ≤ 1 for every r ∈ Π. The positive dependency digraph DΠ of Π is the directed graph defined on the set of atoms from ⋃r∈Π Hr ∪ Br+ , where there is a directed edge from vertex a to vertex b iff there is a rule r ∈ Π with a ∈ Br+ and b ∈ Hr . A head-cycle of DΠ is an {a, b}-cycle¹ for two distinct atoms a, b ∈ Hr for some rule r ∈ Π.
A program Π is head-cycle-free (HCF) if DΠ contains no
head-cycle (Ben-Eliyahu and Dechter 1994) and Π is called
tight if DΠ contains no cycle at all (Lin and Zhao 2003). The
class of tight, normal, and HCF programs is referred to by
tight, normal, and HCF ASP, respectively.
An interpretation I is a set of atoms. I satisfies a rule r if (Hr ∪ Br− ) ∩ I ≠ ∅ or Br+ \ I ≠ ∅. I is a model of Π if it satisfies all rules of Π, in symbols I |= Π. For brevity, we view propositional formulas as sets of clauses that need to be satisfied, and use the notions of interpretations, models, and satisfiability analogously. The Gelfond-Lifschitz (GL) reduct of Π under I is the program ΠI obtained from Π by first removing all rules r with Br− ∩ I ≠ ∅ and then removing all ¬z where z ∈ Br− from every remaining rule r (Gelfond and Lifschitz 1991). I is an answer set of a program Π if I is a minimal model of ΠI . The problem of deciding whether an ASP program has an answer set is called consistency, which is ΣP2-complete (Eiter and Gottlob 1995). If the input is restricted to normal programs, the complexity drops to NP-complete (Bidoit and Froidevaux 1991;
Marek and Truszczyński 1991). A head-cycle-free program Π can be translated into a normal program in polynomial time (Ben-Eliyahu and Dechter 1994). The following characterization of answer sets is often invoked when
considering normal programs (Lin and Zhao 2003). Given a
set A ⊆ at(Π) of atoms, a function σ : A → {0, . . . , |A|−1}
is called level mapping over A. Given a model I of a normal
program Π and a level mapping σ over I, an atom a ∈ I is
proven if there is a rule r ∈ Π proving a with σ, where a ∈
Hr with (i) Br+ ⊆ I, (ii) I ∩Br− = ∅ and I ∩(Hr \{a}) = ∅,
and (iii) σ(b) < σ(a) for every b ∈ Br+ . Then, I is an answer set of Π if (i) I is a model of Π, and (ii) I is proven, i.e.,
every a ∈ I is proven. This characterization vacuously extends to head-cycle-free programs (Ben-Eliyahu and Dechter
1994) and allows for further simplification when considering SCCs of DΠ (Janhunen 2006). To this end, we denote
for each atom a ∈ at(Π) the strongly-connected component
(SCC) of atom a in DΠ by scc(a). Then, Condition (iii)
above can be relaxed to σ(b) < σ(a) for every b ∈ Br+ ∩ C,
where C = scc(a) is the SCC of a.
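For illustration, the GL reduct and the answer-set check can be sketched by brute force in Python. The rule encoding below (head, positive body, negative body as sets) is a hypothetical one chosen only for this sketch, and the minimality test enumerates all subsets, so it is exponential and meant for tiny programs such as the program Π of Example 1.

```python
from itertools import chain, combinations

# Hypothetical encoding: a rule is (head, positive body, negative body).
def reduct(program, I):
    """Gelfond-Lifschitz reduct: drop rules whose negative body intersects I,
    then drop the negative bodies of the remaining rules."""
    return [(h, bp) for (h, bp, bn) in program if not (bn & I)]

def is_model(positive_program, I):
    # I satisfies h <- bp iff bp is not contained in I, or head and I intersect.
    return all(not bp <= I or bool(h & I) for (h, bp) in positive_program)

def is_answer_set(program, I):
    """I is an answer set iff I is a subset-minimal model of the reduct."""
    red = reduct(program, I)
    if not is_model(red, I):
        return False
    proper_subsets = chain.from_iterable(combinations(I, r) for r in range(len(I)))
    return all(not is_model(red, set(s)) for s in proper_subsets)

# The program of Example 1:
P = [({"a"}, {"d"}, set()),            # r1: a <- d
     ({"b"}, {"a"}, set()),            # r2: b <- a
     ({"b"}, {"d"}, set()),            # r3: b <- d
     ({"b"}, {"e"}, {"f"}),            # r4: b <- e, not f
     ({"c"}, {"b"}, set()),            # r5: c <- b
     ({"d"}, {"b", "c"}, set()),       # r6: d <- b, c
     ({"e", "f", "g"}, set(), set())]  # r7: e v f v g <-
```

On this program the check confirms the answer sets {a, b, c, d, e}, {f}, and {g}, while, e.g., {e} alone fails because the reduct then forces b.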
Contributions. More concretely, we present the following.
1. First, we establish a parameterized algorithm for deciding consistency of any head-cycle-free program Π that runs in time 2^{O(k·log(ℓ))} · poly(|at(Π)|), where k is the treewidth of Π and ℓ is the size of the largest strongly-connected component (SCC) of the dependency graph of Π. Combining this result with results from (Hecher 2020), consistency of any head-cycle-free program can be decided in time 2^{O(k·log(λ))} · poly(|at(Π)|), where λ is the minimum of k and ℓ. Besides, our algorithm bijectively preserves answer sets with respect to the atoms of Π and can therefore be easily extended, see, e.g., (Pichler, Rümmele, and Woltran 2010), for counting and enumerating answer sets.
2. Then, we present a treewidth-aware reduction from headcycle-free ASP to tight ASP. Our reduction takes any headcycle-free program Π and creates a tight program, whose
treewidth is at most O(k · log(ℓ)), where k is the treewidth
of Π and ℓ is the size of the largest SCC of the dependency
graph of Π. In general, the treewidth of the resulting tight
program cannot be in o(k · log(k)), unless ETH fails. Our
reduction forms a major improvement for the particular
case where ℓ ≪ k.
3. Finally, we show a treewidth-aware reduction that takes any tight logic program Π and creates a propositional formula whose treewidth is linear in the treewidth of the program. This reduction cannot be significantly improved under ETH. Our result also establishes that for deciding consistency of tight logic programs of bounded treewidth k, one indeed obtains the same runtime as for SAT, namely 2^{O(k)} · poly(|at(Π)|), which is ETH-tight.
Related Work. While the largest SCC size has already been considered (Janhunen 2006), it has not been studied in combination with treewidth. Programs where the number of even and/or odd cycles is bounded have also been analyzed (Lin and Zhao 2004), which is orthogonal to the size of the largest cycle or the largest SCC size ℓ. Indeed, in the worst case, each component might have a number of cycles exponential in ℓ. Further, the literature distinguishes the so-called feedback width (Gottlob, Scarcello, and Sideri 2002), which involves the number of atoms required to break the positive cycles. There are also related measures, called smallest backdoor size, where the removal of a backdoor, i.e., a set of atoms, from the program results in a normal or acyclic program (Fichte and Szeider 2015; Fichte and Szeider 2017).
2 Background
We assume familiarity with graph terminology. Given a directed graph G = (V, E), a set C ⊆ V of vertices of G is a strongly-connected component (SCC) of G if C is a ⊆-largest set such that for every two distinct vertices u, v in C there is a directed path from u to v in G. A cycle over some vertex v of G is a directed path from v to v.
Answer Set Programming (ASP). Further, we assume familiarity with propositional satisfiability (SAT) and follow
standard definitions of propositional ASP (Brewka, Eiter,
¹ Let G = (V, E) be a digraph and W ⊆ V . Then, a cycle in G is a W -cycle if it contains all vertices from W .
Figure 1: Positive dependency graph DΠ of Π of Example 1.

Figure 2: Graph G (left) and a tree decomposition T of G (right), with bags χ(t1 ) = {a, b, d}, χ(t2 ) = {b, c, d}, χ(t3 ) = {b, d, e}, χ(t4 ) = {e, f, g}, and χ(t5 ) = {b, e, f }.
Example 1. Consider the program Π given by the following rules:

Π = { r1 : a ← d;  r2 : b ← a;  r3 : b ← d;  r4 : b ← e, ¬f ;  r5 : c ← b;  r6 : d ← b, c;  r7 : e ∨ f ∨ g ← }.

Observe that Π is head-cycle-free. Figure 1 shows the positive dependency graph DΠ consisting of SCCs scc(e), scc(f ), scc(g), and scc(a) = scc(b) = scc(c) = scc(d). Then, I := {a, b, c, d, e} is an answer set of Π, since I |= Π, and with the level mapping σ := {e ↦ 0, f ↦ 0, g ↦ 0, b ↦ 0, c ↦ 1, d ↦ 2, a ↦ 3} we can prove atom e by rule r7 , atom b by rule r4 , atom c by rule r5 , atom d by rule r6 , and atom a by rule r1 . Further answer sets are {f } and {g}.

Example 3. Recall program Π from Example 1 and observe that graph G of Figure 2 is the primal graph of Π. Further, we have Πt1 = {r1 , r2 , r3 }, Πt2 = {r3 , r5 , r6 }, Πt3 = ∅, Πt4 = {r7 }, and Πt5 = {r4 }.
3 Bounding Treewidth and Positive Cycles
Recently, it was shown that under reasonable assumptions,
namely the exponential time hypothesis (ETH), deciding
consistency of normal logic programs is slightly superexponential and one cannot expect to significantly improve in the
worst case. For a given normal logic program Π, where k is the treewidth of the primal graph of the program, this implies that one cannot decide consistency in time significantly better than 2^{k·⌈log(k)⌉} · poly(|at(Π)|).
Tree Decompositions (TDs). A tree decomposition (TD) (Robertson and Seymour 1986) of a given
graph G=(V, E) is a pair T =(T, χ) where T is a tree
rooted at root(T ) and χ assigns to each node t of T a set χ(t) ⊆ V , called bag, such that (i) V = ⋃t of T χ(t), (ii) E ⊆ {{u, v} | t in T, {u, v} ⊆ χ(t)}, and (iii) "connectedness": for each r, s, t of T , such that s lies on the path from r to t, we have χ(r) ∩ χ(t) ⊆ χ(s). For every
node t of T , we denote by chld(t) the set of child nodes
of t in T . The bags χ≤t below t consist of the union
of all bags of nodes below t in T , including t. We let
width(T ) := maxt of T |χ(t)|−1. The treewidth tw (G) of G
is the minimum width(T ) over all TDs T of G. TDs can be
5-approximated in single exponential time (Bodlaender et al.
2016) in the treewidth. For a node t of T , we say that type(t)
is leaf if t has no children and χ(t) = ∅; join if t has children t′ and t′′ with t′ ≠ t′′ and χ(t) = χ(t′ ) = χ(t′′ );
int (“introduce”) if t has a single child t′ , χ(t′ ) ⊆ χ(t)
and |χ(t)| = |χ(t′ )| + 1; forget if t has a single child t′ ,
χ(t′ ) ⊇ χ(t) and |χ(t′ )| = |χ(t)| + 1. If for every node
t of T , type(t) ∈ {leaf, join, int, forget}, the TD is called
nice. A TD can be turned into a nice TD (Kloks 1994, Lem. 13.1.3) in linear time without increasing the width.
Example 2. Figure 2 illustrates a graph G and a TD T
of G of width 2, which is also the treewidth of G, since G contains a completely connected subgraph on the vertices b, c, d (Kloks 1994).
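The three TD conditions can be verified mechanically. The sketch below encodes the graph G and the TD T of Figure 2; the tree edges chosen here are one plausible arrangement, an assumption, since only the bags are specified above.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check conditions (i)-(iii) of a tree decomposition."""
    # (i) every vertex occurs in some bag
    if set().union(*bags.values()) != set(vertices):
        return False
    # (ii) every graph edge is contained in some bag
    if not all(any({u, v} <= bag for bag in bags.values()) for (u, v) in edges):
        return False
    # (iii) connectedness: nodes whose bags contain a fixed vertex
    # must induce a connected subtree of T
    for x in vertices:
        nodes = {t for t, bag in bags.items() if x in bag}
        seen, todo = set(), [next(iter(nodes))]
        while todo:
            t = todo.pop()
            if t in seen:
                continue
            seen.add(t)
            todo += [s for (u, v) in tree_edges for s in (u, v)
                     if t in (u, v) and s in nodes and s not in seen]
        if seen != nodes:
            return False
    return True

# Graph G (primal graph of Example 1) and the bags of Figure 2
V = set("abcdefg")
E = [("a", "b"), ("a", "d"), ("b", "d"), ("b", "c"), ("c", "d"),
     ("b", "e"), ("b", "f"), ("e", "f"), ("e", "g"), ("f", "g")]
bags = {"t1": {"a", "b", "d"}, "t2": {"b", "c", "d"}, "t3": {"b", "d", "e"},
        "t4": {"e", "f", "g"}, "t5": {"b", "e", "f"}}
tree = [("t1", "t3"), ("t2", "t3"), ("t3", "t5"), ("t5", "t4")]  # assumed shape
```

The width of this decomposition is max |χ(t)| − 1 = 2, matching Example 2.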
In order to use TDs for ASP, we need dedicated graph
representations of programs (Jakl, Pichler, and Woltran 2009).
The primal graph² GΠ of program Π has the atoms of Π as
vertices and an edge {a, b} if there exists a rule r ∈ Π and
a, b ∈ at(r). Let T = (T, χ) be a TD of primal graph GΠ of
a program Π, and let t be a node of T . The bag program Πt
contains rules entirely covered by the bag χ(t). Formally,
Πt := {r | r ∈ Π, at(r) ⊆ χ(t)}.
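Both constructions can be computed directly from the definitions; the sketch below (rules as hypothetical head/positive-body/negative-body triples) reproduces the bag programs of Example 3 for the bags {a, b, d} and {e, f, g}.

```python
from itertools import combinations

def at(rule):
    """at(r): all atoms occurring in the rule."""
    head, bp, bn = rule
    return set(head) | set(bp) | set(bn)

def primal_graph(program):
    """Vertices: atoms of the program; edges: atoms co-occurring in a rule."""
    vertices = set().union(*(at(r) for r in program))
    edges = {frozenset(p) for r in program
             for p in combinations(sorted(at(r)), 2)}
    return vertices, edges

def bag_program(program, bag):
    """Pi_t: the rules entirely covered by the bag chi(t)."""
    return [r for r in program if at(r) <= bag]

# Program of Example 1 (r1, ..., r7)
P = [({"a"}, {"d"}, set()), ({"b"}, {"a"}, set()), ({"b"}, {"d"}, set()),
     ({"b"}, {"e"}, {"f"}), ({"c"}, {"b"}, set()), ({"d"}, {"b", "c"}, set()),
     ({"e", "f", "g"}, set(), set())]
V, E = primal_graph(P)
```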
Proposition 1 (Lower Bound for Treewidth (Hecher 2020)).
Given a normal or head-cycle-free logic program Π, where k
is the treewidth of the primal graph of Π. Then, under ETH
one cannot decide consistency of Π in time 2^{o(k·log(k))} · poly(|at(Π)|).
While according to Proposition 1, we cannot expect to
significantly improve the runtime for normal logic programs
in the worst case, it is still worth studying the underlying
reason that makes the worst case so bad. It is well-known
that positive cycles are responsible for the hardness (Lifschitz
and Razborov 2006; Janhunen 2006) of computing answer
sets of normal logic programs. The particular issue with logic
programs Π in combination with treewidth and large cycles
is that in a tree decomposition of GΠ it might be the case that
the cycle spreads across the whole decomposition, i.e., tree
decomposition bags only contain parts of such cycles, which prevents viewing these cycles (and dependencies) as a whole.
This is also the reason for the hardness given in Proposition 1
and explains why under bounded treewidth evaluating normal logic programs is harder than evaluating propositional
formulas. However, if a given normal logic program only has
positive cycles of length at most 3, and each atom appears
in at most one positive cycle, the properties of tree decompositions already ensure that the atoms of each such positive
cycle appear in at least one common bag. Indeed, a cycle
of length at most 3 forms a completely connected subgraph
and therefore it is guaranteed (Kloks 1994) that the atoms of
the cycle are in one common bag of any tree decomposition
of GΠ .
Example 4. Recall program Π of Example 1. Observe that in any TD of GΠ it is required that there are nodes t, t′ with {b, c, d} ⊆ χ(t) and {a, b, d} ⊆ χ(t′ ), since a cycle of length 3 in the positive dependency graph DΠ (cf. Figure 1) forms a completely connected graph in the primal graph, cf. Figure 2 (left).
² Analogously, the primal graph GF of a propositional formula F uses the variables of F as vertices and adjoins two vertices a, b by an edge if there is a clause in F containing a, b.
In the following, we generalize this result to cycles of
length at most ℓ, where we bound the size of these positive
cycles in order to improve the lower bound of Proposition 1
on programs of bounded positive cycle lengths. This not only provides a significant improvement in the running time on programs where the size of positive cycles is bounded, but also shows that indeed the case of positive cycle lengths
up to 3 can be generalized to lengths beyond 3. Consequently,
we establish that not all positive cycles are bad assuming that
the maximum size ℓ of the positive cycles is bounded, which
provides an improvement of Proposition 1 as long as ℓ ≪ k,
where k is the treewidth of GΠ .
The overall idea of the algorithm relies on so-called dynamic programming, which we briefly recap next.
Dynamic Programming on Tree Decompositions. Dynamic programming (DP) on TDs, see, e.g., (Bodlaender
and Koster 2008), evaluates a given input instance I in parts
along a given TD of a graph representation G of the instance.
Thereby, for each node t of the TD, intermediate results are
stored in a table τt . This is achieved by running a table algorithm, which is designed for a certain graph representation,
and stores in τt results of problem parts of I, thereby considering tables τt′ for child nodes t′ of t. DP works for many
problem instances I as follows.
1. Construct a graph representation G of I.
2. Compute a TD T = (T, χ) of G. For simplicity and
better presentation of the different cases within our table
algorithms, we use nice TDs for DP.
3. Traverse the nodes of T in post-order (bottom-up tree
traversal of T ). At every node t of T during post-order
traversal, execute a table algorithm that takes as input a bag
χ(t), a certain bag instance It depending on the problem,
as well as previously computed child tables of t. Then, the
results of this execution are stored in table τt .
4. Finally, interpret table τn for the root node n of T in order
to output the solution to the problem for instance I.
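The four steps can be condensed into a generic driver. The sketch below is a minimal skeleton under the assumption that `table_algorithm` is supplied by the concrete problem; the toy table algorithm merely counts subsets of the bag and is for illustration only.

```python
def dp_on_td(children, root, chi, local_instance, table_algorithm):
    """Generic DP driver: post-order traversal of a TD, one table per node.
    children: node -> list of child nodes; chi: node -> bag."""
    tables = {}
    def visit(t):
        for c in children.get(t, []):
            visit(c)
        child_tables = [tables[c] for c in children.get(t, [])]
        # run the table algorithm on the bag and the bag-local instance part
        tables[t] = table_algorithm(t, chi[t], local_instance(t), child_tables)
    visit(root)
    return tables  # tables[root] is interpreted to obtain the final answer

def toy_table_algorithm(t, bag, instance_part, child_tables):
    # illustration only: the "table" is just the number of subsets of the bag
    return 2 ** len(bag)
```

Concrete instantiations replace `toy_table_algorithm` by, e.g., the model-computing table algorithm for SAT or the algorithm BndCyc of Listing 1.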
Bounding Positive Cycles. In the remainder, we assume
an HCF logic program Π, whose treewidth is given by k =
tw (GΠ ). We let ℓscc(a) for each atom a be the number of
atoms (size) of the SCC of a in DΠ . Further, we let ℓ :=
maxa∈at(Π) ℓscc(a) be the largest SCC size. This also bounds
the lengths of positive cycles. If each atom a appears in
at most one positive cycle, we have that ℓscc(a) is the cycle
length of a and then ℓ is the length of the largest cycle in Π.
We refer to the class of HCF logic programs whose largest SCC size is bounded by a parameter ℓ as SCC-bounded ASP. Observe that the largest SCC size ℓ is orthogonal to the
measure treewidth.
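Computing ℓ amounts to one SCC decomposition of DΠ. A sketch (rules again as hypothetical head/positive-body/negative-body triples; Kosaraju's two-pass algorithm, chosen here for brevity):

```python
def positive_dependency_graph(program):
    """Directed edge a -> b iff some rule has a in B+ and b in the head."""
    edges = {}
    for head, bp, _ in program:
        for x in set(head) | set(bp):
            edges.setdefault(x, set())
        for a in bp:
            for b in head:
                edges[a].add(b)
    return edges

def largest_scc_size(edges):
    """Size of the largest SCC via Kosaraju's algorithm."""
    order, seen = [], set()
    def dfs(v):
        seen.add(v)
        for w in edges[v]:
            if w not in seen:
                dfs(w)
        order.append(v)  # vertices in post-order of finishing time
    for v in edges:
        if v not in seen:
            dfs(v)
    reverse = {v: set() for v in edges}
    for v, succs in edges.items():
        for w in succs:
            reverse[w].add(v)
    seen, best = set(), 0
    for v in reversed(order):  # collect SCCs on the reversed graph
        if v in seen:
            continue
        stack, size = [v], 0
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            size += 1
            stack += [w for w in reverse[u] if w not in seen]
        best = max(best, size)
    return best

P = [({"a"}, {"d"}, set()), ({"b"}, {"a"}, set()), ({"b"}, {"d"}, set()),
     ({"b"}, {"e"}, {"f"}), ({"c"}, {"b"}, set()), ({"d"}, {"b", "c"}, set()),
     ({"e", "f", "g"}, set(), set())]  # Example 1
```

On Example 1 this yields ℓ = 4, the size of the SCC {a, b, c, d}.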
Example 5. Consider program Π from Example 1. Then, ℓscc(e) = ℓscc(f ) = ℓscc(g) = 1, ℓscc(a) = ℓscc(b) = ℓscc(c) = ℓscc(d) = 4, and ℓ = 4. Now, assume a program Π′ whose primal graph equals its dependency graph, which is just one large (positive) cycle. It is easy to see that this program has treewidth 2, and one can define a TD of GΠ′ whose bags are constructed along the cycle. However, the largest SCC size coincides with the number of atoms. Conversely, there are instances of large treewidth without any positive cycle.
Bounding cycle lengths or sizes of SCCs seems similar
to the non-parameterized context, where the consistency of
normal logic programs is compiled to a propositional formula
(SAT) by a reduction based on level mappings that is applied
on a SCC-by-SCC basis (Janhunen 2006). However, this
reduction does not preserve the treewidth. On the other hand,
while our approach also uses level mappings and proceeds on an SCC-by-SCC basis, the overall evaluation is not SCC-based, since this might completely destroy the treewidth in
the worst-case. Instead, the evaluation is still guided along a
tree decomposition, which is presented in two flavors. First,
we show a dedicated parameterized algorithm for the evaluation of logic programs of bounded treewidth, followed by a
treewidth-aware reduction to propositional satisfiability.
3.1 Towards Exploiting Treewidth for SCC-bounded ASP

In the course of this and the next section, we establish the following result.

Theorem 1 (Runtime of SCC-bounded ASP). Assume an HCF logic program Π, where the treewidth of the primal graph GΠ of Π is at most k and ℓ is the largest SCC size. Then, there is an algorithm for deciding the consistency of Π, running in time 2^{O(k·log(λ))} · poly(|at(Π)|), where λ = min({k, ℓ}).

Now, the missing ingredient for solving problems via dynamic programming along a given TD is a suitable table algorithm. Such algorithms have already been presented for SAT (Samer and Szeider 2010) and ASP (Jakl, Pichler, and Woltran 2009; Fichte et al. 2017; Fichte and Hecher 2019). We only briefly sketch the ideas of a table algorithm using the primal graph that computes models (not answer sets) of a given program Π. Each table τt consists of rows storing interpretations over atoms in the bag χ(t). Then, the table τt for leaf nodes t consists of the empty interpretation. For nodes t with introduced variable a ∈ χ(t), we store in τt interpretations of the child table, but for each such interpretation we decide whether a is in the interpretation or not, and ensure that the interpretation satisfies Πt . When an atom b is forgotten in a forget node t, we store interpretations of the child table, but restricted to atoms in χ(t). By the properties of a TD, it is then guaranteed that all rules containing b have been processed so far. For join nodes, we store in τt interpretations that are also in both child tables of t.

3.2 An Algorithm for SCC-bounded ASP and Treewidth

Similar to the table algorithm sketched above, we present next a table algorithm BndCyc for solving consistency of SCC-bounded ASP. Let therefore Π be a given SCC-bounded program of largest SCC size ℓ and T = (T, χ) be a tree decomposition of GΠ . Before we discuss the tables and the algorithm itself, we need to define level mappings similar to related work (Janhunen 2006), but adapted to SCC-bounded programs. Formally, a level mapping σ : A → {0, . . . , ℓ−1} over atoms A ⊆ at(Π) is a function mapping each atom a ∈ A to a level σ(a) such that the level does not exceed ℓscc(a) , i.e., σ(a) < ℓscc(a) .
[Figure 3: a nice TD T′ of GΠ with nodes t1 , . . . , t16 , annotated with selected DP tables τ4 , τ5 , τ9 , τ10 , τ11 , τ12 , τ14 , and τ16 , whose rows are triples ⟨I, P, σ⟩.]
Listing 1: Table algorithm BndCyc(t, χ(t), Πt , ⟨τ1 , . . . , τo ⟩) for nodes of nice TDs.
In: Node t, bag χ(t), bag program Πt , sequence ⟨τ1 , . . . , τo ⟩ of child tables of t.
Out: Table τt .
1 if type(t) = leaf then τt ← {⟨∅, ∅, ∅⟩}
2 else if type(t) = int and a ∈ χ(t) is the introduced atom then
3   τt ← {⟨I′ , P′ , σ′ ⟩ | ⟨I, P, σ⟩ ∈ τ1 , I′ ∈ {I, Ia+ }, I′ |= Πt ,
4        σ′ ∈ levelMaps(σ, {a} ∩ I′ ), isMin(σ′ , Πt ), P′ = P ∪ proven(I′ , σ′ , Πt )}
5 else if type(t) = forget and a ∉ χ(t) is the forgotten atom then
6   τt ← {⟨Ia− , Pa− , σa∼ ⟩ | ⟨I, P, σ⟩ ∈ τ1 , a ∈ P ∪ ({a} \ I)}
7 else if type(t) = join /* o = 2 children of t */ then
8   τt ← {⟨I, P1 ∪ P2 , σ⟩ | ⟨I, P1 , σ⟩ ∈ τ1 , ⟨I, P2 , σ⟩ ∈ τ2 }
9 return τt

For a function σ mapping x to σ(x), we let σx∼ := σ \ {x ↦ σ(x)} be the function σ without containing x. Further, for a given set S and an element e, we let Se+ := S ∪ {e} and Se− := S \ {e}.
These level mappings are used in the construction of the
tables of BndCyc, where each table τt for a node t of TD T
consists of rows of the form hI, P, σi, where I ⊆ χ(t) is
an interpretation of atoms χ(t), P ⊆ χ(t) is a set of atoms
in χ(t) that are proven, and σ is a level mapping over χ(t).
Before we discuss the table algorithm, we need auxiliary
notation. Let proven(I, σ, Πt ) be the subset of I containing all atoms a ∈ I for which there is a rule r ∈ Πt proving a with σ. However, σ provides for a only a level number within the SCC of a, i.e., proven requires the relaxed characterization of provability that considers scc(a), as given in Section 2. Then, we denote by levelMaps(σ, I) the set of level mappings σ′ that extend σ by the atoms in I, where for each atom a ∈ I, we have a level σ′ (a) with σ′ (a) < ℓscc(a) .
Further, we let isMin(σ, Πt ) be 0 if σ is not minimal, i.e.,
if there is an atom a with σ(a) > 0 where a rule r ∈ Πt
proves a with a level mapping ρ that is identical to σ, but
sets ρ(a) = σ(a) − 1, and be 1 otherwise.
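These helpers can be sketched directly from the definitions. The encoding below is hypothetical (rules as head/positive-body/negative-body triples, scc as a precomputed map from atoms to component identifiers), and sigma is assumed to be defined on the atoms relevant to the checked rules:

```python
def proves(rule, a, I, sigma, scc):
    """Does the rule prove atom a with sigma, with condition (iii)
    relaxed to atoms of the same SCC as a (cf. Section 2)?"""
    head, bp, bn = rule
    return (a in head and bp <= I and not (bn & I)
            and not ((head - {a}) & I)
            and all(sigma[b] < sigma[a] for b in bp if scc[b] == scc[a]))

def proven(I, sigma, rules, scc):
    """The subset of I provable by some rule of the given (bag) program."""
    return {a for a in I if any(proves(r, a, I, sigma, scc) for r in rules)}

def is_min(sigma, I, rules, scc):
    """0 if some level could be decreased by one while keeping provability,
    1 otherwise."""
    for a, level in sigma.items():
        if level > 0:
            rho = dict(sigma)
            rho[a] = level - 1
            if any(proves(r, a, I, rho, scc) for r in rules):
                return 0
    return 1

# Example 1 data: program, SCC identifiers, answer set, and level mapping
P = [({"a"}, {"d"}, set()), ({"b"}, {"a"}, set()), ({"b"}, {"d"}, set()),
     ({"b"}, {"e"}, {"f"}), ({"c"}, {"b"}, set()), ({"d"}, {"b", "c"}, set()),
     ({"e", "f", "g"}, set(), set())]
scc = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "f": 2, "g": 3}
I = {"a", "b", "c", "d", "e"}
sigma = {"e": 0, "b": 0, "c": 1, "d": 2, "a": 3}
```

On Example 1, every atom of I is proven and sigma is minimal, as required.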
Figure 3: Tables obtained by DP on a TD T ′ using algorithm BndCyc of Listing 1.
Listing 1 depicts an algorithm BndCyc for solving consistency of SCC-bounded ASP. The algorithm is inspired by an approach for HCF logic programs (Fichte and Hecher 2019), whose idea is to evaluate Π in parts, given by the tree decomposition T . For ease of presentation, algorithm BndCyc is presented for nice tree decompositions, where we have a clear case distinction for every node t depending on the node type type(t) ∈ {leaf, int, forget, join}. For arbitrary decompositions the cases are interleaved. If type(t) = leaf, we have that χ(t) = ∅ and therefore the interpretation, the set of proven atoms, as well as the level mapping are empty, cf. Line 1 of Listing 1. Whenever an atom a ∈ χ(t) is introduced, i.e., if type(t) = int, we construct succeeding rows of the form ⟨I′ , P′ , σ′ ⟩ for every row in the table τ1 of the child node of t. We take such a row ⟨I, P, σ⟩ of τ1 and guess whether a is in I, resulting in I′ , and ensure that I′ satisfies Πt , as given in Line 3. Consequently, I′ is a model (not necessarily an answer set) of Πt . Then, Line 4 takes succeeding level mappings σ′ of σ, as given by levelMaps, that are minimal (see isMin), and we finally ensure that the proven atoms P′ update P by proven(I′ , σ′ , Πt ). Notably, if duplicate answer sets are not an issue, one can remove the occurrence of isMin in Line 4. Whenever an atom a is forgotten in node t, i.e., if type(t) = forget, we take in Line 6 only rows of the table τ1 for the child node of t, where either a is not in the interpretation or a is proven, and remove a from the row accordingly. By the properties of TDs, it is guaranteed that we have encountered all rules involving a in any node below t. Finally, if t is a join node (type(t) = join), we ensure in Line 8 that we take only rows of both child tables of t which agree on interpretations and level mappings, and that an atom is proven if it is proven in one of the two child rows.

Example 6. Recall program Π with ℓ = 4 from Example 1. Figure 3 shows a nice TD T′ of GΠ and lists selected tables τ1 , . . . , τ16 that are obtained during DP by using BndCyc (cf. Listing 1) on TD T′ . Rows highlighted in gray are discarded and do not lead to an answer set; yellow highlighted rows form one answer set. For brevity, we compactly represent tables by grouping rows according to similar level mappings. We write [ℓ] for any value in {0, . . . , ℓ−1} and we sloppily write, e.g., σ9.3 (b) < σ9.3 (c) to indicate any level mapping σ9.3 in row 3 of table τ9 , where b has a smaller level than c. Node t1 is a leaf (type(t1 ) = leaf) and therefore τ1 = {⟨∅, ∅, ∅⟩} as stated in Line 1. Then, nodes t2 , t3 and t4 are introduce nodes. Therefore, table τ4 is the result of Lines 3 and 4 executed for nodes t2 , t3 and t4 , by introducing a, b, and d, respectively. Table τ4 contains all interpretations restricted to {a, b, d} that satisfy Πt4 = {r1 , r2 , r3 }, cf. Line 3. Further, each row contains a level mapping among atoms in the interpretation such that the corresponding set of proven atoms is obtained, cf. Line 4. Row 4 of τ4 , for example, requires a level mapping σ4.4 with σ4.4 (d) < σ4.4 (a) for a to be proven. Then, node t5 forgets a, which keeps only rows where either a is not in the interpretation or a is in the set of proven atoms, and removes a from the result. The result of Line 6 on t5 is displayed in table τ5 , where Row 3 of τ4 does not have a successor in τ5 since a is not proven. For leaf node t6 we have τt6 = τt1 . Similarly to before, t7 , t8 , and t9 are introduce nodes and τ9 depicts the resulting table for t9 . Table τ10 does not contain any successor row of Row 2 of τ9 , since c is not proven. Node t11 is a join node combining rows of τ5 and τ9 as given by Line 8. Observe that Row 3 of τ5 does not match with any row in τ9 . Further, combining Row 3 of τ5 with Row 3 of τ9 results in Row 4 of τ11 (since ℓ−1 = 3). The remaining tables can be obtained similarly. Table τ16 for the root node only depicts (solution) rows, where each atom is proven.
Consequences on Correctness and Runtime. Next, we sketch correctness and finally show Theorem 1.

Lemma 1 (Correctness). Let Π be an HCF program, where the treewidth of GΠ is at most k and where every SCC C satisfies |C| ≤ ℓ. Then, for a given tree decomposition T = (T, χ) of primal graph GΠ , algorithm BndCyc executed for each node t of T in post-order is correct.

Proof (Sketch). The proof consists of both soundness, which shows that only correct data is in the tables, and completeness, saying that no row of any table is missing. Soundness is established by showing an invariant for every node t, where the invariant is assumed for every child node of t. For the invariant, we use as auxiliary notation the program Π<t strictly below t, consisting of Πt′ for any node t′ below t, as well as the program Π≤t below t, where Π≤t := Π<t ∪ Πt . Intuitively, this invariant for t states that every row ⟨I, P, σ⟩ of table τt ensures (1) "satisfiability": I |= Πt , (2) "answer set extendability": I can be extended to an answer set of Π<t , (3) "provability": a ∈ P if and only if there is a rule in Π≤t proving a with σ, and (4) "minimality": there is no a ∈ P, r ∈ Π≤t such that r proves a with σ′ , where σ′ coincides with σ, but sets σ′ (a) = σ(a) − 1. Notably, the invariant for the empty root node n = root(T ) ensures that if τn ≠ ∅, there is an answer set of Π. Completeness can be shown by establishing that if τt is complete, then every potential row that fulfills the invariant for any child node t′ of t is indeed present in the corresponding table τt′ .

Theorem 1 (Runtime of SCC-bounded ASP). Assume an HCF logic program Π, where the treewidth of the primal graph GΠ of Π is at most k and ℓ is the largest SCC size. Then, there is an algorithm for deciding the consistency of Π, running in time 2^{O(k·log(λ))} · poly(|at(Π)|), where λ = min({k, ℓ}).

Proof. First, we compute (Bodlaender et al. 2016) a tree decomposition T = (T, χ) of GΠ that is a 5-approximation of k = tw (GΠ ) and has a linear number of nodes, in time 2^{O(k)} · poly(|at(Π)|). Computing ℓscc(a) for each atom a ∈ at(Π) can be done in polynomial time. If ℓ > k, we directly run an algorithm (Fichte and Hecher 2019) for the consistency of Π. Otherwise, i.e., if ℓ ≤ k, we run Listing 1 on each node t of T in a bottom-up (post-order) traversal. In both cases, we obtain a total runtime of 2^{O(k·log(λ))} · poly(|at(Π)|).

In contrast to existing work (Fichte and Hecher 2019), if the largest SCC size ℓ < k, where k is the treewidth of primal graph GΠ , our algorithm runs in time better than the lower bound given by Proposition 1. Further, existing work (Fichte and Hecher 2019) does not precisely characterize answer sets, but algorithm BndCyc of Listing 1 exactly computes all the answer sets of Π. Intuitively, the reason for this is that level mappings for an atom x ∈ at(Π) do not differ in different bags of T ; instead we use the same level (at most ℓscc(x) many possibilities) for x in all bags. Notably, capturing all the answer sets of Π allows BndCyc to be slightly extended to count the answer sets of Π by extending the rows by an integer for counting accordingly. This can be extended further for answer set enumeration with linear delay, which results in an anytime enumeration algorithm that keeps for each row of a table its predecessor rows.

4 Treewidth-Aware Reductions for SCC-bounded ASP

Next, we present a novel reduction from HCF ASP to tight ASP. Given a head-cycle-free logic program, we present a treewidth-aware reduction that constructs a tight logic program with little overhead in terms of treewidth. Concretely, if each SCC of the given head-cycle-free logic program Π has at most ℓ atoms, the resulting tight program has treewidth O(k · log(ℓ)). In the course of this section, we establish the following theorem.

Theorem 2 (Removing Cyclicity of SCC-bounded ASP). Let Π be an HCF program, where the treewidth of GΠ is at most k and where every SCC C satisfies |C| ≤ ℓ. Then, there is a tight program Π′ with treewidth in O(k · log(ℓ)) such that for each answer set of Π there is exactly one answer set of Π′ , and vice versa.

4.1 Reduction to Tight ASP

The overall construction of the reduction is inspired by the idea of treewidth-aware reductions (Hecher 2020), where in the following, we assume an SCC-bounded program Π and a tree decomposition T = (T, χ) of GΠ such that the construction of the resulting tight logic program Π′ is heavily guided along T . In contrast to existing work (Hecher 2020), bounding cycles with the largest SCC size additionally allows us to have a "global" level mapping (Janhunen 2006), i.e., we do not have different levels for an atom in different bags. Then, while the overall reduction is still guided along the tree decomposition T in order to take care not to increase treewidth too much, these global level mappings ensure that the tight program is guaranteed to preserve all answer sets (projected to the atoms of Π), as stated in Theorem 2.

Before we discuss the construction in detail, we require auxiliary atoms and notation as follows. In order to guide the evaluation of the provability of an atom x ∈ at(Π) in a node t in T along the decomposition T , we use atoms pxt and px≤t to indicate that x was proven in node t (with some rule in Πt ) and below t, respectively. Further, we require atoms bjx , called level bits, for x ∈ at(Π) and 1 ≤ j ≤ ⌈log(ℓscc(x) )⌉, which are used as bits in order to represent in a level mapping the level of x in binary. To this end, we denote for x and a number i with 0 ≤ i < ℓscc(x) as well as a position number 1 ≤ j ≤ ⌈log(ℓscc(x) )⌉, the j-th position
of i in binary by [i]j . Then, we let [[x]]i be the consistent set
of literals over level bits bjx that is used to represent level
number i for x in binary. More precisely, for each position
number j, [[x]]i contains bjx if [i]j = 1 and ¬bjx otherwise,
i.e., if [i]j = 0. Finally, we also use auxiliary atoms of the
form x ≺ i (could be optimized out) to indicate that the level
for x represented by [[x]]i is indeed smaller than i > 0.
{x} ←    for each x ∈ χ(t)    (1)
← Br+ , {¬a | a ∈ Br− ∪ Hr }    for each r ∈ Πt    (2)
{bjx } ←    for each x ∈ χ(t) and 1 ≤ j ≤ ⌈log(ℓscc(x) )⌉    (3)
Example 7. Recall program Π, level mapping σ, and largest
SCC size ℓ = 4 from Example 1. For representing σ in binary,
we require ⌈log(ℓ)⌉ = 2 bits per atom a ∈ at(Π) and we
assume that bits are ordered from least to most significant
bit. So [σ(e)]0 = [σ(e)]1 = 0, [σ(c)]0 = 1 and [σ(c)]1 = 0.
Then, we have [[e]]σ(e) = {¬b0e , ¬b1e }, [[b]]σ(b) = {¬b0b , ¬b1b },
[[c]]σ(c) = {b0c , ¬b1c }, [[d]]σ(d) = {¬b0d , b1d }, and [[a]]σ(a) =
{b0a , b1a }.
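The binary machinery of Example 7 can be checked mechanically. The sketch below uses hypothetical string literals for the level bits, with bit j being the j-th least significant bit as assumed in Example 7, and implements the comparison underlying x ≺ i:

```python
from math import ceil, log2

def bits(i, nbits):
    """[i]_j for j = 0, ..., nbits-1, least significant bit first."""
    return [(i >> j) & 1 for j in range(nbits)]

def level_literals(x, i, nbits):
    """[[x]]_i: literals over the level bits b^j_x representing level i."""
    return {(f"b{j}{x}" if bit else f"-b{j}{x}")
            for j, bit in enumerate(bits(i, nbits))}

def smaller(x_bits, i, nbits):
    """Level of x smaller than i: some bit j has [i]_j = 1 and bit j of x
    is 0, and every larger bit j' with [i]_j' = 0 also has bit 0 for x."""
    i_bits = bits(i, nbits)
    for j in range(nbits):
        if i_bits[j] == 1 and x_bits[j] == 0:
            if all(x_bits[k] == 0 for k in range(j + 1, nbits) if i_bits[k] == 0):
                return True
    return False

nbits = ceil(log2(4))  # ℓ = 4 as in Example 1, hence 2 level bits per atom
```

The last helper agrees with ordinary integer comparison, which is exactly what Rules (4) of the reduction encode.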
x≺i←
Next, we are ready to discuss the treewidth-aware reduction from SCC-bounded ASP to tight ASP, which takes Π
and T and creates a tight logic program Π′ . To this end, let t
be any node of T . First, truth values for each atom x ∈ χ(t)
are guessed by Rules (1), and Rules (2) ensure
that all rules of Πt are satisfied. Notably, by the
definition of tree decompositions, Rules (1) and Rules (2) indeed cover all the atoms of Π and all rules of Π, respectively.
Then, the next block of rules consisting of Rules (3)–(10)
is used for ensuring provability and finally the last block of
Rules (11)–(13) is required in order to preserve answer sets,
i.e., these rules prevent duplicate answer sets of Π′ for one
specific answer set of Π.
For the block of Rules (3)–(10) to ensure provability, we
need to guess the level bits for each atom x as given in
Rules (3). Rules (4) ensure that we correctly define x ≺ i,
which is the case if there exists a bit [i]j that is set to 1, but we
have ¬bjx , and for all larger bits [i]j′ that are set to 0 (j ′ > j),
we also have ¬bj′x . Then, for Rules (5) we slightly abuse
notation x ≺ i and use it also for a set X, where X ≺ i
denotes a set of atoms of the form x ≺ i for each x ∈ X.
Rules (5) make sure that whenever a rule r ∈ Πt proves x
with the level mapping given by the level bits over atoms
in χ(t), we have provability pxt for x in t. However, only for
the atoms of the positive body Br+ which are also in the same
SCC C = scc(x) as x we need to check that the levels are
smaller than the level of x, since by definition of SCCs, there
cannot be a positive cycle among atoms of different SCCs.
As a result, if there is a rule where no atom of the positive
body is in C, satisfying the rule is enough to prove x, as
given by Rules (6). If provability pxt holds, we also have px≤t
by Rules (7) and provability is propagated from node t′ to its
parent node t by setting p≤t if p≤t′ , as indicated by Rules (8).
Finally, whenever an atom x is forgotten in a node t, we
require provability px≤t to hold, which is ensured by Rules (9)
and, since t might be root(T ), by Rules (10).
Preserving answer sets (bijectively): The last block, consisting of Rules (11), (12), and (13), makes sure that atoms
that are false or not in the answer set of Π′ get level 0, and
prohibits levels for an atom x that could be safely decreased
by one without losing provability. This ensures that
for each answer set of Π we get exactly one corresponding
answer set of Π′ and vice versa.
pxt ← x, [[x]]i , Br+ , Br− ∪ (Hr \ {x}), (Br+ ∩ C) ≺ i    for each r ∈ Πt , x ∈ χ(t) with x ∈ Hr , C = scc(x), 1 ≤ i < ℓC , and Br+ ∩ C ≠ ∅    (5)
pxt ← x, Br+ , Br− ∪ (Hr \ {x})    for each r ∈ Πt , x ∈ χ(t) with x ∈ Hr , Br+ ∩ scc(x) = ∅    (6)
px≤t ← pxt    for each x ∈ χ(t)    (7)
px≤t ← px≤t′    for each x ∈ χ(t), t′ ∈ chld(t), x ∈ χ(t′ )    (8)
← x, ¬px≤t′    for each t′ ∈ chld(t), x ∈ χ(t′ ) \ χ(t)    (9)
← x, ¬px≤n    for each x ∈ χ(n), n = root(T )    (10)
← ¬x, bjx    for each x ∈ χ(t), 1 ≤ j ≤ ⌈log(ℓscc(x) )⌉    (11)
← x, [[x]]i , Br+ , Br− ∪ (Hr \ {x}), (Br+ ∩ C) ≺ i−1    for each r ∈ Πt , x ∈ χ(t) with x ∈ Hr , C = scc(x), 2 ≤ i < ℓC , and Br+ ∩ C ≠ ∅    (12)
← x, [[x]]i , Br+ , Br− ∪ (Hr \ {x})    for each r ∈ Πt , x ∈ χ(t) with x ∈ Hr , C = scc(x), 1 ≤ i < ℓC , and Br+ ∩ C = ∅    (13)
Example 8. Recall program Π of Example 1 and TD T =
(T, χ) of GΠ as given in Figure 2. Rules (1) and Rules (2) are
constructed for each atom a ∈ at(Π) and for each rule r ∈
Π, respectively. Similarly, Rules (3) are constructed for
each of the ⌈log(ℓscc(a) )⌉ many bits of each atom a ∈ at(Π).
Rules (4) serve as auxiliary definition, where for, e.g., atom c
we construct c≺1 ← ¬b0c , ¬b1c ; c≺2 ← ¬b1c ; c≺3 ← ¬b0c ;
and c≺3 ← ¬b1c . Next, we show Rules (5)–(13) for node t2 .
No. Rules
(5) pbt2 ← b, [[b]]1 , d≺1, d; pbt2 ← b, [[b]]2 , d≺2, d;
pbt2 ← b, [[b]]3 , d≺3, d;
pct2 ← c, [[c]]1 , d≺1, d; pct2 ← c, [[c]]2 , d≺2, d;
pct2 ← c, [[c]]3 , d≺3, d;
pdt2 ← d, [[d]]1 , b≺1, c≺1, b, c; pdt2 ← d, [[d]]2 , b≺2, c≺2,
b, c; pdt2 ← d, [[d]]3 , b≺3, c≺3, b, c
³A choice rule (Simons, Niemelä, and Soininen 2002) is of the
form {a} ← and in an HCF logic program it corresponds to a
disjunctive rule a ∨ a′ ← , where a′ is a fresh atom.
(7) pb≤t2 ← pbt2 ; pc≤t2 ← pct2 ; pd≤t2 ← pdt2
(11) ← ¬b, b0b ; ← ¬b, b1b ; ← ¬c, b0c ; ← ¬c, b1c ;
← ¬d, b0d ; ← ¬d, b1d
(12) ← b, [[b]]2 , d≺1, d; ← b, [[b]]3 , d≺2, d;
← c, [[c]]2 , d≺1, d; ← c, [[c]]3 , d≺2, d;
← d, [[d]]2 , b≺1, c≺1, b, c; ← d, [[d]]3 , b≺2, c≺2, b, c
For root node t5 of T , we obtain the following Rules (5)–(13).
No. Rules
(6) pbt5 ← b, e, ¬f
(7) pb≤t5 ← pbt5 ; pe≤t5 ← pet5 ; pf≤t5 ← pft5
(8) pb≤t5 ← pb≤t3 ; pe≤t5 ← pe≤t3 ; pe≤t5 ← pe≤t4 ; pf≤t5 ← pf≤t4
(9) ← d, ¬pd≤t3 ; ← g, ¬pg≤t4
(10) ← b, ¬pb≤t5 ; ← e, ¬pe≤t5 ; ← f, ¬pf≤t5
(11) ← ¬b, b0b ; ← ¬b, b1b ; ← ¬e, b0e ; ← ¬e, b1e ;
← ¬f, b0f ; ← ¬f, b1f
(13) ← b, [[b]]1 , e, ¬f ; ← b, [[b]]2 , e, ¬f ; ← b, [[b]]3 , e, ¬f
Lemma 3 (Treewidth-Awareness). Let Π be an HCF program, where every SCC C satisfies |C| ≤ ℓ. Then, the
treewidth of tight program Π′ obtained by the reduction
above, i.e., Rules (1)–(13), by using Π and a TD T = (T, χ)
of primal graph GΠ of width k, is in O(k · log(ℓ)).
Proof (Sketch). We take T = (T, χ) and construct a
TD T ′ :=(T, χ′ ) of GΠ′ , where χ′ is defined as follows.
For every node t of T , whose parent node is t∗ , we
let χ′ (t) :=χ(t) ∪ {bjx | x ∈ χ(t), 0 ≤ j ≤ ⌈log(ℓscc(x) )⌉} ∪
{pxt , px≤t , p≤t∗ | x ∈ χ(t)}. It is easy to see that indeed
all atoms of every instance of Rules (1)–(13) appear in at
least one common bag of χ′ . Further, we have connectedness
of T ′ , i.e., T ′ is a TD of GΠ′ , and |χ′ (t)| is in O(k · log(ℓ)).
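The construction of χ′ in this proof sketch can be mirrored in a few lines of Python. The helper below (its name and the atom-string format are our own, not the paper's) illustrates why the bag size stays in O(k · log(ℓ)): each of the at most k atoms of χ(t) contributes ⌈log(ℓscc(x))⌉ + 1 level bits plus three provability atoms:

```python
import math

def extended_bag(bag, scc_size, node, parent):
    """chi'(t): original atoms, their level bits b^j_x, and the
    provability atoms p^x_t, p^x_{<=t}, p^x_{<=t*} for the parent t*."""
    new_bag = set(bag)
    bits = math.ceil(math.log2(scc_size))
    for x in bag:
        new_bag |= {f"b{j}_{x}" for j in range(bits + 1)}  # 0 <= j <= ceil(log l)
        new_bag |= {f"p_{x}_{node}", f"ple_{x}_{node}", f"ple_{x}_{parent}"}
    return new_bag
```

A bag of k = 3 atoms with ℓ = 4 grows to 3 · (1 + 3 + 3) = 21 elements, i.e., by a factor in O(log ℓ).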
Finally, we are in the position to prove Theorem 2.
Theorem 2 (Removing Cyclicity of SCC-bounded ASP).
Let Π be an HCF program, where the treewidth of GΠ is at
most k and where every SCC C satisfies |C| ≤ ℓ. Then, there
is a tight program Π′ with treewidth in O(k · log(ℓ)) such
that for each answer set of Π there is exactly one answer set
of Π′ , and vice versa.
Correctness and Treewidth-Awareness. We discuss correctness and treewidth-awareness as follows.
Lemma 2 (Correctness). Let Π be an HCF program, where
the treewidth of GΠ is at most k and where every SCC C
satisfies |C| ≤ ℓ. Then, the tight program Π′ obtained by the
reduction above on Π and a tree decomposition T = (T, χ)
of primal graph GΠ , is correct. Formally, for any answer
set I of Π there is exactly one answer set I ′ of Π′ as given
by Rules (1)–(13) and vice versa.
Proof (of Theorem 2). First, we compute a tree decomposition T = (T, χ)
of GΠ that is a 5-approximation of k = tw (GΠ ) in
time 2^O(k) · poly(|at(Π)|). Observe that the reduction consisting of Rules (1)–(13) on Π and T runs in polynomial time,
precisely in time O(k · log(ℓ) · poly(|at(Π)|)). The claim follows by correctness (Lemma 2) and by treewidth-awareness
as given by Lemma 3.
Proof (of Lemma 2). “=⇒”: Let I be any answer set of Π. Then, there exists a unique minimal level mapping σ (Janhunen 2006) proving each x ∈ I with 0 ≤ σ(x) < ℓscc(x) . Let P := {pxt , px≤t |
r ∈ Πt proves x with σ, x ∈ I, t in T }. From this we construct an interpretation I ′ :=I ∪ {bjx | [σ(x)]j = 1, 0 ≤
j ≤ ⌈log(ℓscc(x) )⌉, x ∈ I} ∪ P ∪ {px≤t | x ∈ I, t′ ∈
T, t′ is below t in T, px≤t′ ∈ P }, which sets atoms as I and
additionally encodes σ in binary and sets provability accordingly. It is easy to see that I ′ is an answer set of Π′ .
“⇐=”: Let I ′ be any answer set of Π′ . From this we construct I := I ′ ∩ at(Π) as well as the level mapping σ := {x 7→
fI ′ (x) | x ∈ at(Π)}, where the function fI ′ : at(Π) → {0, . . . , ℓ−1}
returns for atom x ∈ at(Π) the value i with 0 ≤
i < ℓscc(x) such that {bjx | 0 ≤ j ≤ ⌈log(ℓscc(x) )⌉, [i]j = 1} =
{bjx ∈ I ′ | 0 ≤ j ≤ ⌈log(ℓscc(x) )⌉}, i.e., the atoms in answer
set I ′ binary-encode i for x. Assume towards a contradiction
that I ⊭ Π. But then I ′ does not satisfy at least one instance
of Rules (1) and (2), contradicting that I ′ is an answer set
of Π′ . Again, towards a contradiction assume that I is not an
answer set of Π, i.e., at least one x ∈ at(Π) cannot be proven
with σ. We still have px≤n ∈ I ′ for n = root(T ), by Rules (9)
and (10). However, then we either have that px≤t ∈ I ′
or pxn ∈ I ′ by Rules (7) and (8) for at least one child node t
of n. Finally, by the connectedness property (iii) of the definition of TDs, we have that there has to be a node t′ that is
either n or a descendant of n where we have pxt′ ∈ I ′ . Consequently, by Rules (5) and (6) as well as auxiliary Rules (3)
and (4) we have that there is a rule r ∈ Π that proves x
with σ, contradicting the assumption. Similarly, Rules (11),
(12), and (13) ensure minimality of σ.
Having established Theorem 2, the reduction above easily allows for an alternative proof of Theorem 1. Instead of
Algorithm BndCyc of Listing 1, one could also compile the
resulting tight program of the reduction above to a propositional formula (SAT), and use an existing algorithm for SAT
to decide satisfiability. Indeed, such algorithms run in time
single-exponential in the treewidth (Samer and Szeider 2010)
and we end up with similar running times as in Theorem 1.
4.2 Reduction to SAT
Having established the reduction of SCC-bounded ASP to
tight ASP, we now present a treewidth-aware reduction of
tight ASP to SAT, which together allow reducing SCC-bounded ASP to SAT. While the step from tight ASP to
SAT might seem straightforward for the program Π′ obtained
by the reduction above, in general it is not guaranteed that
existing reductions, e.g. (Fages 1994; Lin and Zhao 2003;
Janhunen 2006), do not cause a significant blowup in the
treewidth of the resulting propositional formula. Indeed, one
needs to take care and define a treewidth-aware reduction.
Let Π be any given tight logic program and T = (T, χ) be
a tree decomposition of GΠ . Similar to the reduction from
SCC-bounded ASP to tight ASP, we use as variables besides
the original atoms of Π also auxiliary variables. In order to
preserve treewidth, we still need to guide the evaluation of the
provability of an atom x ∈ at(Π) in a node t in T along the
TD T , whereby we use atoms pxt and px≤t to indicate that x
was proven in node t and below t, respectively. However, we
do not need any level mappings, since there is no positive
cycle in Π, but we still guide the idea of Clark’s completion (Clark 1977) along TD T . Consequently, we construct
the following propositional formula, where for each node t
of T we add Formulas (14)–(18). Intuitively, Formulas (14)
ensure that all rules are satisfied, cf., Rules (2). Formulas (15)
and (16) take care that ultimately an atom that is set to true
requires to be proven, similar to Rules (9) and (10). Finally,
Formulas (17) and (18) provide the definition for an atom to
be proven in a node and below a node, respectively, which is
similar to Rules (5)–(8), but without the level mappings.
Preserving answer sets: Answer sets are already preserved,
i.e., we obtain exactly one model of the resulting propositional formula F for each answer set of Π and vice versa. If
the equivalence (↔) in Formulas (17) and (18) is replaced by
an implication (→), we might get duplicate models for one
answer set while still ensuring preservation of consistency,
i.e., the answers to both decision problems coincide.
⋁a∈Br+ ¬a ∨ ⋁a∈Br− ∪Hr a    for each r ∈ Πt    (14)
x → px≤t′    for each t′ ∈ chld(t), x ∈ χ(t′ ) \ χ(t)    (15)
x → px≤n    for each x ∈ χ(n), n = root(T )    (16)
pxt ↔ ⋁r∈Πt ,x∈Hr (⋀a∈Br+ a ∧ x ∧ ⋀b∈Br− ∪(Hr \{x}) ¬b)    for each x ∈ χ(t)    (17)
px≤t ↔ pxt ∨ (⋁t′ ∈chld(t),x∈χ(t′ ) px≤t′ )    for each x ∈ χ(t)    (18)
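For contrast with the TD-guided variant above, plain (non-treewidth-aware) Clark's completion of a ground tight program can be sketched as follows. The rule representation (head, list of body literals with a "-" prefix marking negation) and the string output are illustrative assumptions, not the paper's notation:

```python
def clark_completion(program):
    """For every atom x, produce x <-> OR over the rules with head x
    of the AND of their body literals; an atom without rules gets
    x <-> false, and a fact contributes the disjunct 'true'."""
    bodies_of = {}
    atoms = set()
    for head, body in program:
        bodies_of.setdefault(head, []).append(body)
        atoms.add(head)
        atoms |= {lit.lstrip('-') for lit in body}
    formulas = []
    for x in sorted(atoms):
        disj = " | ".join("(" + " & ".join(b) + ")" if b else "true"
                          for b in bodies_of.get(x, []))
        formulas.append(f"{x} <-> {disj if disj else 'false'}")
    return formulas
```

For the program {a ← b; b ←} this yields `a <-> (b)` and `b <-> true`; since the program is tight, the models of the completion coincide with its answer sets (Fages 1994).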
Knowing that under ETH tight ASP has roughly the same
complexity for treewidth as SAT, we can derive the following corollary that complements the existing lower bound for
normal ASP as given by Proposition 1.
Corollary 1. Let Π be any normal logic program, where
the treewidth of GΠ is at most k. Then, under ETH,
there is no reduction to a tight logic program Π′ running
in time 2^o(k·log(k)) · poly(|at(Π)|) such that tw (GΠ′ ) is
in o(k · log(k)).
Proof (of Proposition 2). First, we reduce SAT to tight ASP, i.e., capture
all models of a given formula F in a tight program Π.
Thereby Π consists of a choice rule for each variable of F
and a constraint for each clause. Towards a contradiction assume the contrary of this proposition. Then, we
reduce Π back to a propositional formula F ′ , running in
time 2^o(k) · poly(|at(Π)|) with tw (GF ′ ) being in o(k). Consequently, we use an algorithm for SAT (Samer and Szeider
2010) on F ′ to effectively solve F in time 2^o(k) · poly(n),
where F has n variables, which finally contradicts ETH.
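The first step of this proof, encoding a CNF into a tight program with one choice rule per variable and one constraint per clause, is straightforward to write down. The sketch below uses DIMACS-style integer literals and our own atom naming v1, v2, ...:

```python
def sat_to_tight_asp(clauses):
    """Clauses are lists of nonzero integers; a negative integer is a
    negated variable. Emit {v} <- per variable and, per clause, a
    constraint forbidding that every literal of the clause is false."""
    variables = sorted({abs(l) for clause in clauses for l in clause})
    rules = [f"{{v{v}}}." for v in variables]  # choice rules
    for clause in clauses:
        # the clause is violated iff all of its literals are false
        body = ", ".join(f"not v{l}" if l > 0 else f"v{-l}" for l in clause)
        rules.append(f":- {body}.")  # integrity constraint
    return rules
```

The resulting program is tight, since no atom depends positively on any atom, and its answer sets correspond exactly to the models of the CNF.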
5 Conclusion and Future Work
This paper deals with improving algorithms for the consistency of head-cycle-free (HCF) ASP for bounded treewidth.
The existing lower bound states that under the exponential
time hypothesis (ETH), we cannot solve an HCF program
with n atoms and treewidth k in time 2^o(k·log(k)) · poly(n).
In this work, in addition to the treewidth, we also consider
the size ℓ of the largest strongly-connected component of
the positive dependency graph. Considering both parameters, we obtain a more precise characterization of the runtime: 2^O(k·log(λ)) · poly(n), where λ = min({k, ℓ}). This
improves the previous result when the strongly-connected
components are smaller than the treewidth. Further, we provide a treewidth-aware reduction from HCF ASP to tight
ASP, where the treewidth increases from k to O(k · log(ℓ)).
Finally, we show that under ETH, tight ASP has roughly the
same complexity lower bounds as SAT, which implies that
there cannot be a reduction from HCF ASP to tight ASP such
that the treewidth only increases from k to o(k · log(k)).
Currently, we are performing experiments and practical
analysis of our provided reductions. For future work we
suggest investigating precise lower bounds by considering
extensions of ETH like the strong ETH (Impagliazzo and
Paturi 2001). It might be also interesting to establish lower
bounds by taking both parameters k and ℓ into account.
Correctness and Treewidth-Awareness. Conceptually,
the proofs of Lemmas 4 and 5 proceed similarly to the proofs
of Lemmas 2 and 3, respectively, but without level mappings.
Lemma 4 (Correctness). Let Π be a tight logic program,
where the treewidth of GΠ is at most k. Then, the propositional formula F obtained by the reduction above on Π and a
TD T of primal graph GΠ , consisting of Formulas (14)–(18),
is correct. Formally, for any answer set I of Π there is exactly
one satisfying assignment of F and vice versa.
Lemma 5 (Treewidth-Awareness). Let Π be a tight logic
program. Then, the treewidth of propositional formula F
obtained by the reduction above by using Π and a TD T of
GΠ of width k, is in O(k).
Proof. The proof proceeds similarly to that of Lemma 3. However,
due to Formulas (18), one needs to consider, without loss of
generality, only TDs where every node has constantly many
child nodes. Such a TD can be easily obtained from any
given TD by adding auxiliary nodes (Kloks 1994).
References
Alviano, M.; Calimeri, F.; Dodaro, C.; Fuscà, D.; Leone, N.;
Perri, S.; Ricca, F.; Veltri, P.; and Zangari, J. 2017. The
ASP system DLV2. In LPNMR’17, volume 10377 of LNAI,
215–221. Springer.
Balduccini, M.; Gelfond, M.; and Nogueira, M. 2006. Answer set based design of knowledge systems. Ann. Math.
Artif. Intell. 47(1-2):183–219.
Ben-Eliyahu, R., and Dechter, R. 1994. Propositional semantics for disjunctive logic programs. Ann. Math. Artif. Intell.
12(1):53–87.
However, we cannot do much better, as shown next.
Proposition 2 (ETH-Tightness). Let Π be a tight logic program, where the treewidth of GΠ is at most k. Then, under
ETH, the treewidth of the resulting propositional formula F
cannot be significantly improved, i.e., under ETH there
is no reduction running in time 2^o(k) · poly(|at(Π)|) such
that tw (GF ) is in o(k).
Bichler, M.; Morak, M.; and Woltran, S. 2018. Single-shot
epistemic logic program solving. In IJCAI’18, 1714–1720.
ijcai.org.
Bidoı́t, N., and Froidevaux, C. 1991. Negation by default and
unstratifiable logic programs. Theoretical Computer Science
78(1):85–112.
Bliem, B.; Morak, M.; Moldovan, M.; and Woltran, S. 2020.
The impact of treewidth on grounding and solving of answer
set programs. J. Artif. Intell. Res. 67:35–80.
Bodlaender, H. L., and Koster, A. M. C. A. 2008. Combinatorial optimization on graphs of bounded treewidth. The
Computer Journal 51(3):255–269.
Bodlaender, H. L.; Drange, P. G.; Dregi, M. S.; Fomin,
F. V.; Lokshtanov, D.; and Pilipczuk, M. 2016. A c^k n
5-Approximation Algorithm for Treewidth. SIAM J. Comput.
45(2):317–378.
Brewka, G.; Eiter, T.; and Truszczyński, M. 2011. Answer
set programming at a glance. Communications of the ACM
54(12):92–103.
Clark, K. L. 1977. Negation as failure. In Logic and Data
Bases, Advances in Data Base Theory, 293–322. Plemum
Press.
Cygan, M.; Fomin, F. V.; Kowalik, Ł.; Lokshtanov, D.;
Marx, D.; Pilipczuk, M.; Pilipczuk, M.; and Saurabh, S. 2015.
Parameterized Algorithms. Springer.
Eiter, T., and Gottlob, G. 1995. On the computational cost
of disjunctive logic programming: Propositional case. Ann.
Math. Artif. Intell. 15(3–4):289–323.
Fages, F. 1994. Consistency of Clark’s completion and existence of stable models. Methods Log. Comput. Sci. 1(1):51–
60.
Fandinno, J., and Hecher, M. 2020. Treewidth-Aware Complexity in ASP: Not all Positive Cycles are Equally Hard. In
ASPOCP@ICLP.
Fichte, J. K., and Hecher, M. 2019. Treewidth and counting
projected answer sets. In LPNMR’19, volume 11481 of
LNCS, 105–119. Springer.
Fichte, J. K., and Szeider, S. 2015. Backdoors to tractable
answer-set programming. Artificial Intelligence 220(0):64–
103.
Fichte, J. K., and Szeider, S. 2017. Backdoor trees for answer
set programming. In ASPOCP@LPNMR, volume 1868 of
CEUR Workshop Proceedings. CEUR-WS.org.
Fichte, J. K.; Hecher, M.; Morak, M.; and Woltran, S. 2017.
Answer set solving with bounded treewidth revisited. In
LPNMR’17, volume 10377 of LNCS, 132–145. Springer.
Fichte, J. K.; Kronegger, M.; and Woltran, S. 2019. A
multiparametric view on answer set programming. Ann. Math.
Artif. Intell. 86(1-3):121–147.
Gebser, M.; Kaminski, R.; Kaufmann, B.; and Schaub, T.
2012. Answer Set Solving in Practice. Morgan & Claypool.
Gelfond, M., and Lifschitz, V. 1991. Classical negation in
logic programs and disjunctive databases. New Generation
Comput. 9(3/4):365–386.
Gottlob, G.; Scarcello, F.; and Sideri, M. 2002. Fixed-parameter complexity in AI and nonmonotonic reasoning.
Artif. Intell. 138(1-2):55–86.
Guziolowski, C.; Videla, S.; Eduati, F.; Thiele, S.; Cokelaer,
T.; Siegel, A.; and Saez-Rodriguez, J. 2013. Exhaustively
characterizing feasible logic models of a signaling network
using answer set programming. Bioinformatics 29(18):2320–
2326. Erratum see Bioinformatics 30, 13, 1942.
Hecher, M. 2020. Treewidth-Aware Reductions of normal
ASP to SAT – Is normal ASP harder than SAT after all? In
KR’20. In Press.
Impagliazzo, R., and Paturi, R. 2001. On the complexity of
k-sat. J. Comput. Syst. Sci. 62(2):367–375.
Impagliazzo, R.; Paturi, R.; and Zane, F. 2001. Which problems have strongly exponential complexity? J. of Computer
and System Sciences 63(4):512–530.
Jakl, M.; Pichler, R.; and Woltran, S. 2009. Answer-set programming with bounded treewidth. In IJCAI’09, volume 2,
816–822.
Janhunen, T. 2006. Some (in)translatability results for normal
logic programs and propositional theories. Journal of Applied
Non-Classical Logics 16(1-2):35–86.
Kloks, T. 1994. Treewidth. Computations and Approximations, volume 842 of LNCS. Springer.
Lackner, M., and Pfandler, A. 2012. Fixed-parameter algorithms for finding minimal models. In KR’12. AAAI
Press.
Lifschitz, V., and Razborov, A. A. 2006. Why are there so
many loop formulas? ACM Trans. Comput. Log. 7(2):261–
268.
Lin, F., and Zhao, J. 2003. On tight logic programs and yet
another translation from normal logic programs to propositional logic. In IJCAI’03, 853–858. Morgan Kaufmann.
Lin, F., and Zhao, X. 2004. On odd and even cycles in normal
logic programs. In AAAI, 80–85. AAAI Press / MIT Press.
Lonc, Z., and Truszczynski, M. 2003. Fixed-parameter
complexity of semantics for logic programs. ACM Trans.
Comput. Log. 4(1):91–119.
Marek, W., and Truszczyński, M. 1991. Autoepistemic logic.
J. of the ACM 38(3):588–619.
Nogueira, M.; Balduccini, M.; Gelfond, M.; Watson, R.; and
Barry, M. 2001. An A-Prolog decision support system for the
Space Shuttle. In PADL’01, volume 1990 of LNCS, 169–183.
Springer.
Pichler, R.; Rümmele, S.; and Woltran, S. 2010. Counting and enumeration problems with bounded treewidth. In
LPAR’10, volume 6355 of LNCS, 387–404. Springer.
Robertson, N., and Seymour, P. D. 1986. Graph minors II:
Algorithmic aspects of tree-width. J. Algorithms 7:309–322.
Samer, M., and Szeider, S. 2010. Algorithms for propositional model counting. J. Discrete Algorithms 8(1):50–64.
Simons, P.; Niemelä, I.; and Soininen, T. 2002. Extending
and implementing the stable model semantics. Artif. Intell.
138(1-2):181–234.
Towards Lightweight Completion Formulas for Lazy Grounding in Answer Set
Programming
Bart Bogaerts¹, Simon Marynissen¹,², Antonius Weinzierl³
¹Vrije Universiteit Brussel  ²KU Leuven  ³TU Wien
bart.bogaerts@vub.be, simon.marynissen@kuleuven.be, antonius.weinzierl@kr.tuwien.ac.at
Abstract
Lazy grounding takes the idea of lazily generating the
SAT encoding one step further by also lazily performing
the grounding process. That is, ASP rules are only instantiated when some algorithm detects that they are useful for
the solver in its current state. The most prominent class of
lazy grounding systems for ASP is based on computation
sequences (Liu et al. 2007) and includes systems such as
Omiga (Dao-Tran et al. 2012), GASP (Dal Palù et al. 2009),
ASPeRiX (Lefèvre and Nicolas 2009) and the recently introduced ALPHA (Weinzierl 2017). The latter is the youngest
and most modern of the family and the only one that integrates lazy grounding with a CDCL solver, resulting in superior search performance over its predecessors. Our work
extends the ALPHA algorithm.
Contrary to more traditional ASP systems, lazy grounding
systems aim more at applications in which the full grounding is so large that simply creating it would pose issues (e.g.,
if it does not fit in your main memory). This phenomenon
is known as the grounding bottleneck (Balduccini, Lierler,
and Schüller 2013). Examples of such problems include
queries over a large graph; planning problems, with a very
large number of potential time steps, or problems where the
full grounding contains a lot of unnecessary information and
the actual search problem is not very hard.
The essential idea underlying lazy grounding is that all
parts of the grounding that do not help the solver in its quest
to find a satisfying assignment (a stable model) or prove
unsatisfiability are better not given to the solver since they
only consume precious time and memory. Unfortunately,
it is not easy to detect which parts those are, and a trade-off
arises (Taupe, Weinzierl, and Friedrich 2019): producing larger parts of the grounding will improve search performance (e.g., propagation can prune larger parts of the search
space) but grounding too much will — on the type of instances lazy grounding is built for — result in an unmanageable explosion of the ground theory. Lazy grounding systems and ground-and-solve systems reside on two extremes
of this trade-off: the former produce a minimal required part
of the theory to ensure correctness while the latter produce
the entire bottom-up grounding.
Our work moves lazy grounding a bit more to the eager side
of this trade-off. Specifically, we focus on completion formulas (Clark 1978) that essentially express that when an
atom is true, there must be a rule that supports it (a rule with
Lazy grounding is a technique for avoiding the so-called
grounding bottleneck in Answer Set Programming (ASP).
The core principle of lazy grounding is to only add parts of
the grounding when they are needed to guarantee correctness
of the underlying ASP solver. One of the main drawbacks of
this approach is that a lot of (valuable) propagation is missed.
In this work, we take a first step towards solving this problem by developing a theoretical framework for investigating
completion formulas in the context of lazy grounding.
1 Introduction
Answer set programming (ASP) (Marek and Truszczyński
1999) is a well-known knowledge representation paradigm
in which logic programs under the stable semantics (Gelfond and Lifschitz 1988) are used to encode problems in the
complexity class NP and beyond. From a practical perspective, ASP offers users a rich first-order language, ASP-Core2
(Calimeri et al. 2013), to express knowledge in, and many
efficient ASP solvers (Gebser, Maratea, and Ricca 2017) can
subsequently be used to solve problems related to knowledge expressed in ASP-Core2.
Traditional ASP systems work in two phases. First, the
input program is grounded (variables are eliminated). Second, a solver is used to find the stable models of the resulting
ground theory. For a long time, the ASP community has focused strongly on developing efficient solvers, while only a
few grounders were developed. Most modern ASP solvers
are in essence extensions of satisfiability (SAT) (Marques
Silva, Lynce, and Malik 2009) solvers, building on conflict-driven clause learning (CDCL) (Marques-Silva and Sakallah
1999). In recent years, in many formalisms that build on
top of SAT, we have seen a move towards only generating
parts of the SAT encoding on-the-fly, on moments when it is
deemed useful for the solver. This idea lies at the heart of
the CDCL(T) algorithm for SAT modulo theories (Barrett et
al. 2009) and is embraced under the name lazy clause generation (Stuckey 2010) in constraint programming (Rossi, van
Beek, and Walsh 2006). Answer set programming is no exception: the so-called unfounded set propagator and aggregate propagator are implemented using the same principles;
when needed, they generate clauses for the underlying SAT
algorithm. Additionally, lazy clause generation forms the
basis of recent constraint ASP solvers (Banbara et al. 2017).
The set of all atoms is denoted by A. If a ∈ A, then var(a)
denotes the set of variables occurring in a. We say that a is
ground if var(a) = ∅. The set of all ground atoms is denoted
Agr . A literal is an atom p or its negation ¬p. The former is
called a positive literal, the latter a negative literal. Slightly
abusing notation, if l is a literal, we use ¬l to denote the
literal that is the negation of l, i.e., we use ¬(¬p) to denote
p. The set of all literals is denoted L and the set of ground
literals Lgr . A clause is a disjunction of literals. A (normal)
rule is an expression of the form
true body and that atom in the head). While ground-and-solve systems add these formulas (in the form of clauses)
to their ground theory, lazy grounders cannot do this easily;
the reason is that the set of ground rules that could derive
a certain atom is not known (more instantiations could be
found later on). Consider, for example, the atom p(a) and
a rule p(X) ← q(X, Y ), where the set of ground instantiations of this rule with p(a) in the head depends on the
set of atoms over the binary predicate q. Unless those instances over q are fully grounded, a lazy grounder cannot
add the corresponding completion formula. In this paper,
we develop lightweight algorithms to detect when that set of
rules is complete and hence when completion formulas can be
added. Our hypothesis is that doing this will improve search
performance without blowing up the grounding and as such
result in overall improved performance of lazy-grounding
ASP systems, and specifically the ALPHA system.
The main contribution of our paper is the development of
a novel method to discover completion formulas during lazy
grounding. Our method starts from a static analysis of the
input program in which we discover functional dependencies between variable occurrences. During the search, this
static analysis is then used to figure out the right moment to
add the completion formulas in a manner that is inspired by
the two-watched literal scheme from SAT to avoid adding
the completion constraints on moments they have no chance
of propagating anyway. We do not have an implementation
of this idea available yet, but instead focus on the theoretical
principles.
The rest of this paper is structured as follows. In Section 2 we recall some preliminaries. Section 3 contains the
different methods for discovering completion formulas. In
Section 4, we discuss extensions of our work that could be
used to find even more completion formulas. We conclude
in Section 5.
p←L
where p is an atom and L a set of literals. If r is such
a rule, its head, positive body, negative body and body are
defined as H(r) = p, B+ (r) = A ∩ L, B− (r) = {q ∈ A |
¬q ∈ L} and B(r) = L respectively. We call r a fact if
B(r) = ∅ and ground if p and all literals in L are ground.
We use var(r) to denote the set of variables occurring in r,
i.e., var(r) = var(p) ∪ ⋃q∈L var(q).
A rule r is safe if all variables in r occur in its positive body,
i.e., if var(r) ⊆ var(B+ (r)). A logic program P is a finite
set of safe rules. P is ground if each r ∈ P is. In our examples, logic programs are presented in a more general format,
using, e.g., choice rules (see (Calimeri et al. 2020)). These
can easily be translated into the format considered here.
If X is a set of variables, a grounding substitution of X is
a mapping σ : X → C. The set of all substitutions of X is
denoted sub(X). If e is an expression, a grounding substitution for e is a grounding substitution of its variables. We write
[c1 /X1 , . . . , cn /Xn ] for the substitution that maps each Xi
to ci and each other variable to itself. The result of applying
a substitution σ to an expression e is the expression obtained
by replacing all variables X by σ(X) and is denoted σ(e).
The most general unifier of two substitutions is defined as
usual (Martelli and Montanari 1982). A substitution σ extends a substitution τ if σ is equal to τ in the domain of τ .
The grounding of a rule is given by
2 Preliminaries
We now introduce some preliminaries related to answer set
programming in general and the ALPHA algorithm specifically. This section is based on the preliminaries of (Bogaerts
and Weinzierl 2018).
gr(r) = {σ(r) | σ is a grounding substitution}
and the (full) grounding of a program P is defined as
gr(P) = ⋃r∈P gr(r).
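Under the restrictions stated here (terms are constants or variables, no function symbols), gr(P) can be computed directly. The tuple-based atom representation and the uppercase-variable convention in this sketch are our own illustrative choices:

```python
from itertools import product

def ground_program(program, constants):
    """Full grounding gr(P): instantiate each rule (head, body) with
    every substitution of its variables by constants. Atoms are tuples
    such as ('q', 'X', 'Y'); uppercase terms are variables."""
    def apply(atom, sub):
        return (atom[0],) + tuple(sub.get(t, t) for t in atom[1:])

    ground = []
    for head, body in program:
        variables = sorted({t for atom in [head] + body
                            for t in atom[1:] if t[:1].isupper()})
        for values in product(constants, repeat=len(variables)):
            sub = dict(zip(variables, values))
            ground.append((apply(head, sub), [apply(a, sub) for a in body]))
    return ground
```

For the rule p(X) ← q(X, Y) and constants {a, b}, this yields the four expected instances, e.g., p(a) ← q(a, b); the blow-up for larger programs is exactly the grounding bottleneck discussed in the introduction.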
A (Herbrand) interpretation I is a finite set of ground
atoms. The satisfaction relation between interpretations and
literals is given by
Answer set programming. Let C be a set of constants, V
be a set of variables, and Q be a set of predicates, each with
an associated arity, i.e., elements of Q are of the form p/k
where p is the predicate name and k its arity. We assume
the existence of built-in predicates, such as equality, with a
fixed interpretation. A (non-ground) term is an element of
C ∪ V.1 The set of all terms is denoted T . Our definition of
a term does not allow for nesting. This eases our exposition,
but is not essential for our results. For instance, it allows us
to view + as a ternary predicate +/3, i.e. +(X, Y, Z) means
that X + Y = Z. A (non-ground) atom is an expression of
the form p(t1 , . . . , tk ) where p/k ∈ Q and ti ∈ T for each i.
I |= p if p ∈ I, and
I |= ¬p if p ∉ I.
An interpretation satisfies a set L of literals if it satisfies each
literal in L. A partial (Herbrand) interpretation I is a consistent set of ground literals (consistent here means that it
does not contain both an atom and its negation). The value
of a literal l in a partial interpretation I is lI = t if l ∈ I, f
if ¬l ∈ I and u otherwise.
Given a (partial) interpretation I and a ground program P,
we inductively define when an atom is justified (Denecker,
Brewka, and Strass 2015) as follows. An atom p is justified
1
Following Weinzierl (2017), we omit function symbols to simplify the presentation. All our results still hold in the presence
of function symbols, except for termination, for which additional
(syntactic) restrictions must be imposed.
in I by P if there is a rule r ∈ P with H(r) = p such that each q+ ∈ B+(r) is justified in I by P and each q− ∈ B−(r) is false in I. A built-in atom is justified in I by P if it is true in I.

An interpretation I is a model of a ground program P if for each rule r ∈ P with I |= B(r), also I |= H(r). An interpretation I is a stable model (or answer set) of a ground program P (Gelfond and Lifschitz 1988) if it is a model of P and each true atom in I is justified in I by P. This non-standard characterization of stable models coincides with the original reduct-based characterization, as shown by Denecker, Brewka, and Strass (2015), but simplifies the rest of our presentation. If P is non-ground, we say that I is an answer set of P if it is an answer set of gr(P). The set of all answer sets of P is denoted AS(P).
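To make these definitions concrete, the following self-contained sketch grounds a toy program by brute force and checks the justification-based answer-set condition. The encoding (atoms as name/argument-tuple pairs, rules as head/positive-body/negative-body triples) is invented for illustration; it is not the representation used inside ALPHA.

```python
from itertools import product

# Toy encoding (ours, for illustration): an atom is (name, args); a rule is
# (head, positive_body, negative_body); variables are upper-case strings.
def ground(rules, constants):
    """Enumerate gr(P) by applying every grounding substitution to every rule."""
    grounded = []
    for head, pos, neg in rules:
        vs = sorted({t for atom in [head, *pos, *neg] for t in atom[1]
                     if t.isupper()})
        for values in product(constants, repeat=len(vs)):
            s = dict(zip(vs, values))
            def apply(a, s=s):
                return (a[0], tuple(s.get(t, t) for t in a[1]))
            grounded.append((apply(head), [apply(a) for a in pos],
                             [apply(a) for a in neg]))
    return grounded

def is_answer_set(interp, grounded):
    """Model check plus 'every true atom is justified' (Denecker et al. 2015)."""
    is_model = all(head in interp for head, pos, neg in grounded
                   if all(a in interp for a in pos)
                   and not any(a in interp for a in neg))
    justified, changed = set(), True
    while changed:  # least fixpoint: derive justified atoms bottom-up
        changed = False
        for head, pos, neg in grounded:
            if head not in justified and set(pos) <= justified \
               and not any(a in interp for a in neg):
                justified.add(head)
                changed = True
    return is_model and interp <= justified

# P:  p(a).   q(X) <- p(X), not r(X).
rules = [(("p", ("a",)), [], []),
         (("q", ("X",)), [("p", ("X",))], [("r", ("X",))])]
g = ground(rules, ["a", "b"])
print(is_answer_set({("p", ("a",)), ("q", ("a",))}, g))  # True
print(is_answer_set({("p", ("a",))}, g))                 # False: q(a) missing
```

The full enumeration over all substitutions is exactly the blow-up that lazy grounding avoids; the sketch is only meant to pin down the definitions.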
The ALPHA algorithm. We now recall the formalization of ALPHA of Bogaerts and Weinzierl (2018). This differs from the original presentation of Weinzierl (2017) in that it does not use the truth value MUST-BE-TRUE, but instead makes the justifiedness of atoms explicit. The state of ALPHA is a tuple ⟨P, Pg, C, α, SJ⟩, where

• P is a logic program,
• Pg ⊆ gr(P) is the so-far grounded program; we use Σg ⊆ Agr to denote the set of ground atoms that occur in Pg,
• C is a set of (learned) clauses,
• α is the trail; this is a sequence of tuples (l, c) with l a literal and c either the symbol δ, a rule in Pg, or a clause in C. α is restricted to not containing two tuples (l, c) and (¬l, c′); in a tuple (l, c) ∈ α, c represents the reason for making l true: either decision (denoted δ) or propagation because of some rule or clause. α implicitly determines a partial interpretation denoted Iα = {l | (l, c) ∈ α for some c}.
• SJ ⊆ A is the set of atoms that are justified by Pg in Iα.

For clause learning and propagation, a rule p ← L is treated as the clause p ∨ ⋁_{l∈L} ¬l. Hence, whenever we refer to "a clause" in the following, we mean any rule in Pg (viewed as a clause) or any clause in C. We refer to rules whenever the rule structure is needed (for determining justified atoms).

ALPHA interleaves CDCL and grounding. It performs (iteratively) the following steps (listed by priority).

conflict If a clause in C ∪ Pg is violated, analyze the conflict, learn a new clause (add to C), and back-jump (undo changes to α and SJ that happened since a certain point) following the so-called 1UIP schema (Zhang et al. 2001).

(unit) propagate If all literals of a clause c ∈ C ∪ Pg except for l are false in Iα, add (l, c) to α.

justify If there is a rule r such that B+(r) ⊆ SJ and ¬B−(r) ⊆ Iα, add H(r) to SJ.

ground If, for some grounding substitution σ and r ∈ P, B+(σ(r)) ⊆ Iα, add σ(r) to Pg. In practice, when adding this rule, ALPHA makes a new (intermediate) propositional variable β(σ(r)) to represent the body of the rule, similar to (Anger et al. 2006).

decide Pick (using some heuristics (Taupe, Weinzierl, and Schenner 2017)) one atom p occurring in Pg that is unknown in Iα and add (p, δ) or (¬p, δ) to α.²

justification-conflict If all atoms in Pg are assigned while some atom is true but not justified, learn a new clause that avoids visiting this assignment again. In the worst case, the learned clause contains the negation of all decisions, but Bogaerts and Weinzierl (2018) developed more optimized analysis methods. After learning this clause, ALPHA backjumps.

3 Deriving Completion Formulas

We now discuss our modifications to the ALPHA algorithm that allow us to add completion formulas. There are two main problems to be tackled here. The first, and most fundamental, is Question 1: how to generate completion clauses or, stated differently, how to find all the rules that can derive a certain atom, without creating the full grounding. The second is Question 2: when to add completion formulas to the solver. The general idea for the generation is that we will develop approximation methods that overapproximate the set of instantiations of rules that can derive a given atom, based on a static analysis of the program. The reason why we look for an overapproximation is that, in general, finding the exact set of such instantiations would require a semantical analysis. Our methods below are designed based on the principle that such an overapproximation should be as tight as possible. Specifically, our methods will be based on functional dependencies and determined predicates.

This section starts by providing definitions of bounds. After that we explain how bounds can be used in ALPHA. The last subsection describes the different types of bounds and how they can be detected and combined.
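The control flow of ALPHA described above (steps tried in priority order, restarting from the highest-priority step after every change) can be pictured with a small dispatch loop. This is a toy illustration with invented step functions and state, not ALPHA's actual implementation.

```python
# Hypothetical skeleton of a prioritized solver loop: each step reports
# whether it fired; after any change the loop restarts from the top.
def run(steps):
    trace = []
    while True:
        for name, step in steps:
            if step():
                trace.append(name)
                break  # restart from the highest-priority step
        else:
            return trace  # no step applicable: loop terminates

# Toy instance: "propagate" can fire twice, then "decide" once.
state = {"propagations": 2, "decisions": 1}

def propagate():
    if state["propagations"]:
        state["propagations"] -= 1
        return True
    return False

def decide():
    if state["decisions"]:
        state["decisions"] -= 1
        return True
    return False

print(run([("propagate", propagate), ("decide", decide)]))
# ['propagate', 'propagate', 'decide']
```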
3.1 Bounds

The core concept of our detection mechanism is the notion of bounds. We have already stated that we want to find overapproximations of grounding substitutions. We now formalize this.

Definition 3.1. Let r be a rule in a program P. A grounding substitution σ is relevant in r with respect to P if B+(σ(r)) is justified in some partial interpretation of P.

The following lemma follows immediately from the characterization of stable models in terms of justifications (Denecker, Brewka, and Strass 2015).

Lemma 3.2. Let I be an answer set of P. If I |= p, then there is a rule r in P and a relevant substitution σ in r such that σ(H(r)) = p.

Proof. By the justification characterization of answer sets, p is justified in I. The claim then follows from the definition of justifiedness.

Definition 3.3. Let r be a rule and let X and Y be two sets of variables in r. A function f : sub(X) → 2^sub(Y) is called a bound in r if for all σ ∈ sub(X) it holds that f(σ) is a superset of the elements τ ∈ sub(Y) for which there is a relevant substitution in r that extends both σ and τ.

To denote that f is a bound, we write f : X ⇝ Y. If X = ∅, then we say Y is bounded by f in r. If f(σ) contains at most one element for each σ ∈ sub(X), then f is called a functional bound.

²ALPHA actually only allows deciding on certain atoms (those of the form β(r)), hence our presentation is slightly more general.
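Definition 3.3 can be checked by brute force on small instances. In the sketch below (our own encoding, with the relevant substitutions of a rule given explicitly as a list of dictionaries), a bound is a Python function from a substitution over X to a set of substitutions over Y:

```python
from itertools import product

def is_bound(f, X, Y, relevant, constants):
    """Definition 3.3, brute force: f is a bound iff for every sigma over X,
    f(sigma) covers every tau over Y that some relevant rho extends."""
    for values in product(constants, repeat=len(X)):
        sigma = dict(zip(X, values))
        needed = {tuple(sorted((v, rho[v]) for v in Y))
                  for rho in relevant
                  if all(rho[v] == sigma[v] for v in X)}
        produced = {tuple(sorted(tau.items())) for tau in f(sigma)}
        if not needed <= produced:
            return False
    return True

# A rule with two relevant substitutions, given explicitly for illustration.
relevant = [{"X": 1, "Y": 2}, {"X": 2, "Y": 3}]
f = lambda s: [{"Y": s["X"] + 1}]   # a functional bound X ~> Y
g = lambda s: [{"Y": 0}]            # not a bound: misses {Y: 2} and {Y: 3}
print(is_bound(f, ["X"], ["Y"], relevant, [1, 2, 3]))  # True
print(is_bound(g, ["X"], ["Y"], relevant, [1, 2, 3]))  # False
```

In practice the relevant substitutions are of course not available; the whole point of the following subsections is to compute such f statically.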
3.2 How to use bounds

Bounds can be used to calculate overapproximations of completion formulas. To start, assume that a predicate p is defined only in a single rule r. Assume there is a bound f : var(H(r)) ⇝ var(r), and let σ ∈ sub(var(H(r))). Then with σ, we can determine an overapproximation of the completion formula of h = σ(H(r)) as follows:

¬h ∨ ⋁_{τ∈f(σ)} β(τ(r)).

The case when p has multiple rules is similar and is formalized in the following proposition.

Proposition 3.4. Let h be a ground atom. Let r1, . . . , rn be the rules in P whose head unifies with h. Let σi denote the most general unifier of h and H(ri). If there is a bound fi : var(H(ri)) ⇝ var(ri) for all i, then

¬h ∨ ⋁_{1≤i≤n} ⋁_{τ∈fi(σi)} β(τ(ri))

holds in all answer sets of P.

Proof. For all answer sets I of P for which I |= ¬h, the clause trivially holds in I. So assume an answer set I for which I |= h. This means there is a rule ri in P that derives h. Hence by Lemma 3.2, there is a relevant substitution ρ in ri that extends σi. This means that I |= β(ρ(ri)). By the definition of a bound, it holds that ρ ∈ fi(σi). Therefore I satisfies the clause, which we needed to show.

Remark 3.5. By Lemma 3.11 and Lemma 3.12, it is sufficient to have a bound var(H(r)) ⇝ var(B(r)) for each rule r.

For both the multiple and the single rule case, the generated clause might be unwieldy, in particular if the bounds are bad overapproximations. Therefore, it is crucial that good bounds are detected, which is discussed in the next subsection.

Of course, a question that remains unanswered is when such bounds should be added to the solver. We see two ways to do this.

The first way is a very lightweight mechanism that happens during the ground reasoning step. The idea is that as soon as all rules that can derive a specific head h have been grounded, we add the completion formula for h. Keeping track of this can be done very cheaply: the bounds provide us with an upper bound on the number of rules that can derive a given atom; it suffices to keep a simple counter for each atom to know when the criterion is satisfied. As soon as this is the case, all the atoms β(τ(ri)) mentioned in Proposition 3.4 are defined in the solver and it makes sense to add the completion constraint. This method is very lightweight: it does not trigger additional grounding, does not change the fundamental algorithm underlying ALPHA, and only adds very few additional constraints. It does enable better pruning of the search space.

The second way is more proactive, but also more invasive. It happens during the justification-conflict reasoning step. If an atom h is true but not justified, instead of triggering the justification analysis to resolve why this situation happens, we add the completion formula for h, thereby also avoiding the justification-conflict. However, since certain atoms β(τ(ri)) from Proposition 3.4 are not yet known to the solver, the corresponding rules also need to be grounded. For this reason the second way is more intrusive into the grounding algorithm.

3.3 How to find bounds

In the previous subsection, we showed how bounds can be used to improve the lazy grounding algorithm. We now turn our attention to the question of how to find bounds. In particular, the various types of bounds we define in this section can all be found using a static analysis of the program. We present our methods in order of increasing difficulty, illustrating each of them with examples of rules we encountered in practice, in encodings of the 5th ASP competition (Calimeri et al. 2016).

Case 1: Non-projective rules The first case is very simple: in case all variables occurring in a rule also occur in the head, we know that for each atom, there is at most one variable substitution that turns the head of the rule into the specified atom. We call such a rule non-projective since no body variables are projected out.

Proposition 3.6. If r is a non-projective rule, i.e., if var(H(r)) = var(r), then the following is a bound:

id : sub(var(H(r))) → 2^sub(var(r)) : σ ↦ {σ}.

Proof. Take σ ∈ sub(var(H(r))). Let τ ∈ sub(var(r)) for which there is a relevant substitution ρ in r that extends both σ and τ. Then τ = ρ = σ. Therefore τ ∈ id(σ), which proves that id is a bound.

In case a predicate has a single non-projective rule, for each ground instance of the rule, the head is in fact equivalent to the body. This is a very specific and restricted case. We mention it here for two reasons. First, this is the only case for which ALPHA, without our extensions, already adds completion constraints. Second, this (restricted) situation does show up in practical problems. For instance, the following rule was taken from the new Knight Tour with Holes encoding used in the 5th ASP competition (Calimeri et al. 2016).

move(X, Y, XX, YY) ← valid(X, Y, XX, YY), ¬other(X, Y, XX, YY).

Of course, if all the rules for a predicate are non-projective, then we can combine the trivial bounds on each rule to find a completion formula; however, this is not yet detected in the existing ALPHA algorithm.
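A minimal sketch of this detection: check var(H(r)) = var(r) and, if every rule whose head matches a ground atom h is non-projective, emit the clause of Proposition 3.4 with the identity bound. The rule encoding is invented for illustration, and the negated body literal is folded into the body list (here as the hypothetical name `not_other`), since the clause only mentions the body atoms β(τ(r)).

```python
def variables(atom):
    name, args = atom
    return {t for t in args if t[:1].isupper()}  # upper-case strings are variables

def is_non_projective(head, body):
    """var(r) is contained in var(H(r)): no body variable is projected out."""
    return set().union(*(variables(a) for a in body)) <= variables(head)

def completion_clause(h, rules):
    """Clause of Prop. 3.4 via the identity bound, for a ground atom h; returns
    None when some matching rule is projective (identity bound unavailable)."""
    name, cargs = h
    disjuncts = []
    for head, body in rules:
        if head[0] != name:
            continue
        if not is_non_projective(head, body):
            return None
        sigma = dict(zip(head[1], cargs))  # unifier of h with the head
        disjuncts.append(("beta", tuple((a[0], tuple(sigma.get(t, t) for t in a[1]))
                                        for a in body)))
    return [("not", h)] + disjuncts

# move(X,Y,XX,YY) <- valid(X,Y,XX,YY), not other(X,Y,XX,YY)
rules = [(("move", ("X", "Y", "XX", "YY")),
          [("valid", ("X", "Y", "XX", "YY")),
           ("not_other", ("X", "Y", "XX", "YY"))])]
print(completion_clause(("move", ("1", "1", "2", "3")), rules))
```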
Case 2: Direct functional dependencies In certain cases, the body of a rule can contain variables the head does not, yet without increasing the number of instantiations that can derive the same head. This happens especially if some arithmetic operations are present. To illustrate this, consider the rule

{gt(A, X, U)} ← elem(A, X), comUnit(U), comUnit(U1), U1 = U + 1, rule(A), U < X.

taken from the new Partner Units encoding used in the 5th ASP competition (Calimeri et al. 2016). This type of pattern occurs quite often, for instance also in Tower of Hanoi, in many temporal problems in which a time parameter is incremented by one, and in problems over a grid in which coordinates are incremented by one. We can see that even though the variable U1 occurs only in the body of the rule, for each instantiation of the head there can be at most one grounding substitution of the rule that derives it. Hence, if all rules for gt have this structure, the completion can also be detected here. We now formalize this idea.

If p is a predicate with arity n, by pj (with 1 ≤ j ≤ n) we denote the j-th argument position of p. For any set J of argument positions, denote by sub(J) the set of assignments of constants to the positions in J. A tuple of constants c1, . . . , cn is succinctly denoted by c. If p(c) is an atom and J a set of argument positions in p, we write c|J to denote the element in sub(J) that maps each pj ∈ J to cj.

Definition 3.7. A ground atom h is relevant in P if there is a rule r in P and a relevant grounding substitution σ in r such that σ(H(r)) = h. A ground built-in atom is relevant in P if it is true.

Definition 3.8. Let J and K be sets of argument positions of a predicate p in P. We say that J → K is a functional dependency if for all σ ∈ sub(J), there exists at most one τ ∈ sub(K) and relevant atom p(c) in P such that c|J = σ and c|K = τ.

For instance, if p is equality, the following are some functional dependencies: {=1} → {=2}, {=2} → {=1}, {=1, =2} → {=1}. Of the ones mentioned here, the last one is the least interesting. Another example is the predicate +/3. It has, among others, the following functional dependencies: {+1, +2} → {+3}, {+1, +3} → {+2}, {+3, +2} → {+1}.

If a built-in predicate p with arity n occurs in the positive body of a rule r, then a functional dependency of p determines a bound in r.

Proposition 3.9. Assume p is a built-in predicate and p(t) ∈ B+(r). A functional dependency J → K of p induces a functional bound (denoted p(t)^{J→K}) in r:

var({ti | pi ∈ J}) ⇝ var({ti | pi ∈ K}).

Proof. Let X = var({ti | pi ∈ J}) and Y = var({ti | pi ∈ K}). Let σ ∈ sub(X). Since J → K is a functional dependency, there exists at most one τσ ∈ sub(Y) such that the atom p(t) is satisfied under some extension of both σ and τσ. Define

f : sub(X) → 2^sub(Y)

mapping a σ to {τσ} if τσ exists and to ∅ otherwise. We prove that f is a bound; hence take any σ ∈ sub(X). If there is no τ ∈ sub(Y) for which there is a relevant substitution in r that extends both σ and τ, then we are done. So suppose there is such a τ. We prove that τ = τσ. Any relevant substitution in r extending both τ and σ justifies p(t); hence satisfies p(t). By definition of τσ we have that τ = τσ. Therefore, τ ∈ f(σ). This proves that f is a bound. That f is functional follows directly from its definition.

As we will see later, bounds originating from functional dependencies of built-in predicates will act as a base case for further functional bounds.

Case 3: Determined predicates Given a program P, we call a predicate determined if it is defined only by facts. The interpretation of determined predicates can be computed efficiently prior to the solving process, and their value can be used to find bounds on the instantiations of other rules. An example can be found in graph coloring, in which a rule

colored(N) ← assign(N, C), color(C)     (1)

expresses that a node is colored if it is assigned a color. The predicate color here is determined since it is given by facts. Thus, we know that for each node n, there are at most as many instances of the rule that derive colored(n) as there are colors. Notably, the completion constraint that would be added by taking this into account is exactly the redundant constraint that was added manually in the graph coloring experiments of Leutgeb and Weinzierl (2017) to help lazy grounding, i.e.,

¬colored(n) ∨ assign(n, col1) ∨ · · · ∨ assign(n, colk)

Our new methods obtain this constraint automatically, thereby easing the life of the modeler.

Proposition 3.10. Let r be a rule with d(t) ∈ B+(r) and d a determined predicate. Then there exists a bound ∅ ⇝ X, where X is the set of variables in t.

Proof. Every fact d(c) for a tuple of constants c corresponds to at most one element σc in sub(X). Since d is given by facts, we can enumerate its interpretation I_d. Let

f : sub(∅) → 2^sub(X) : σ ↦ {σc | c ∈ I_d}

We prove that f is a bound. Take σ ∈ sub(∅). Note that σ is necessarily the trivial substitution. Take τ ∈ sub(X) for which there is a relevant substitution in r that extends both σ and τ. We prove that τ ∈ f(σ), i.e., τ = σc for some c ∈ I_d. By the existence of that relevant substitution in r, we have that d(t) is satisfied under τ; hence τ is equal to σc for some c ∈ I_d. This proves that f is a bound.

Typical ASP encodings of graph coloring do not contain rule (1) but instead use the rule

colored(N) ← assign(N, C).

Even in this case, it is possible to determine that C is bounded by a determined predicate by inspecting the defining rules of assign. This is formalized in the remainder of this section.
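Returning to rule (1), the determined-predicate bound of Case 3 is easy to sketch: enumerate the color facts, turn each fact into a substitution for C, and instantiate the body once per substitution. The encoding and fact names below are illustrative only.

```python
color_facts = [("col1",), ("col2",), ("col3",)]  # determined predicate color/1

def determined_bound(facts, positions):
    """Bound induced by a determined atom d(t) in B+(r): one substitution
    per fact of d (Proposition 3.10)."""
    return [dict(zip(positions, fact)) for fact in facts]

def completion_for(node, facts):
    # colored(N) <- assign(N, C), color(C): C is bounded by the color facts.
    return ["not colored(%s)" % node] + \
           ["assign(%s,%s)" % (node, tau["C"])
            for tau in determined_bound(facts, ["C"])]

print(" | ".join(completion_for("n", color_facts)))
# not colored(n) | assign(n,col1) | assign(n,col2) | assign(n,col3)
```

This reproduces, for each node, exactly the manually added redundant constraint of Leutgeb and Weinzierl (2017) quoted above.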
Case 4: Combining bounds Bounds can be obtained from other bounds in several ways. We already found three base cases of bounds, given in Propositions 3.6, 3.9, and 3.10:

1. If Y ⊆ X ⊆ var(r), then id : X ⇝ Y is a bound, where id is the function mapping σ to {σ|Y}.
2. The bound p(t)^{J→K} induced by a built-in atom p(t) ∈ B+(r) with functional dependency J → K.
3. The bound induced by an atom d(t) ∈ B+(r) for a determined predicate d.

Additionally, bounds of different types can be altered or combined to get new bounds, as shown in the following lemmas.

Lemma 3.11. Let f : X ⇝ Y be a bound in r. Then for any X ⊆ X′ and Y′ ⊆ Y, the function

f′ : sub(X′) → 2^sub(Y′) : σ ↦ {τ|Y′ | τ ∈ f(σ|X)}

is also a bound. (σ|X denotes σ restricted to the variables in X.)

Proof. Take σ ∈ sub(X′). Let τ′ ∈ sub(Y′) for which there exists a relevant substitution ρ in r that extends both σ and τ′. We prove that τ′ ∈ f′(σ), i.e., there exists a τ ∈ f(σ|X) such that τ′ = τ|Y′. Take τ = ρ|Y. By definition, τ|Y′ = τ′. We know that ρ extends both σ|X and τ. Therefore, since f is a bound, it holds that τ ∈ f(σ|X). This proves that f′ is a bound.

Lemma 3.12. Let f : X ⇝ Y be a bound in r and let U ⊆ var(r). Let h denote the function

h : sub(X ∪ U) → 2^sub(Y ∪ U)

where

h(σ) = {τ · σ|U\Y | τ ∈ f(σ|X)}

and · is used to denote the combination of two disjoint projected substitutions. The function h is a bound from X ∪ U to Y ∪ U.

Proof. Take σ ∈ sub(X ∪ U). Let τ ∈ sub(Y ∪ U) for which there is a relevant substitution ρ in r that extends both σ and τ. We prove that τ ∈ h(σ). We know that ρ also extends both σ|X and τ|Y. Now, since f is a bound, τ|Y ∈ f(σ|X). Since ρ extends both σ and τ, it holds that σ|U\Y = τ|U\Y because U \ Y is contained in the domains of both σ and τ. Therefore τ = τ|Y · τ|U\Y = τ′ · σ|U\Y for some τ′ ∈ f(σ|X). This proves that τ ∈ h(σ); hence h is a bound.

Lemma 3.13. If f : X ⇝ Y and g : Y ⇝ Z are bounds in r, then the following function is a bound:

h : sub(X) → 2^sub(Z) : σ ↦ ⋃_{τ∈f(σ)} g(τ)

Proof. Take σ ∈ sub(X). Let υ ∈ sub(Z) for which there is a relevant substitution ρ in r that extends both σ and υ. As usual, we prove that υ ∈ h(σ). Take τ = ρ|Y. Then ρ is a relevant substitution that extends both τ and υ. Therefore, since g is a bound, υ ∈ g(τ). Likewise, ρ is a relevant substitution that extends both σ and τ. Hence, τ ∈ f(σ) since f is a bound. Combining this proves that υ ∈ h(σ); hence h is a bound.

If only functional bounds are considered, then Lemma 3.12 and Lemma 3.13, together with our first base case, form the axiomatic system for functional dependencies developed by Armstrong (1974). To illustrate the combination of bounds, consider a rule

h(X) ← +(X, 1, Z), =(Z, U).

In this case, X ⇝ U is a functional bound in r: by using the functional dependency of + we see that X ⇝ Z is a functional bound; by using the dependencies of =, we see that Z ⇝ U is a functional bound; hence we can combine them, using Lemma 3.13, to get the desired bound.

Even more is possible. If f : X ⇝ Y and g : X ⇝ Y are bounds, then the pointwise union and intersection are also bounds. While the union will not be of much benefit for finding good overapproximations of completion formulas, the intersection of two bounds can be useful since it allows for more precise approximations.

Case 5: Bounds on argument positions We have shown that if d is a determined predicate, then it induces a bound. However, sometimes bounds by determined predicates are not explicit. For instance, in the graph coloring example it would make perfect sense to drop color(C) from the body of the rule, since the fact that C is a color should follow already from its occurrence in assign(N, C), resulting in the rule

colored(N) ← assign(N, C).

However, from the definition of assign, one can see that C is bound by the determined predicate color and hence the completion constraint could, in principle, still be derived. We now formally show how to do this.

Definition 3.14. Let p be a predicate with arity n in a program P and let J and K be sets of argument positions in p. If f is a function from sub(J) to 2^sub(K) such that for every relevant atom p(c) in P it holds that c|K ∈ f(c|J), then f is said to be a bound in p, which we denote by f : J ⇝ K. If J = ∅, then we say K is bounded by f.

Bounds in rules and predicates are not independent: bounds in rules determine bounds on argument positions and vice versa. This is formalized in the following two propositions.

Proposition 3.15. Let p be a predicate symbol and J and K sets of argument positions in p. Assume that for each rule r of the form p(t) ← ϕ in P, fr : var(t|J) ⇝ var(t|K) is a bound in r. Then the union of these fr induces a bound in p.

Proof. Let A be any set of argument positions in p. Then A corresponds uniquely to a set Vr ⊆ var(H(r)) for each rule r of p, and we may assume Vr is the same for each rule r of p; this set is denoted VA. It is straightforward that sub(A) is in a one-to-one relation with sub(VA). Misusing notation, we assume sub(A) = sub(VA). Then, we can define f : sub(J) → 2^sub(K) mapping σ to ∪r fr(σ). We now prove that f is a bound in p. Hence, take a relevant atom p(c) in P. It suffices to prove that c|K ∈ f(c|J). Since p(c) is relevant, there is a rule r of p and a relevant grounding substitution ρ such that ρ(H(r)) = p(c). By the one-to-one correspondence between sub(J) and sub(VJ) and between sub(K) and sub(VK), we know that c|J ∈ sub(VJ) and c|K ∈ sub(VK). Therefore, since fr is a bound, we know that c|K ∈ fr(c|J). Hence, c|K ∈ f(c|J), which proves that f is a bound in p.
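Treating bounds as functions from a substitution to a set of substitutions, the transitivity of Lemma 3.13 and the pointwise union used to combine the bounds of several rules are one-liners. The sketch below uses an invented encoding of projected substitutions as frozen sets of bindings and replays the rule h(X) ← +(X, 1, Z), =(Z, U) as well as the two rules for p discussed next:

```python
def sub(**bindings):
    """A projected substitution as a hashable set of (variable, value) pairs."""
    return frozenset(bindings.items())

def compose(f, g):
    """Lemma 3.13: from f : X ~> Y and g : Y ~> Z, build h : X ~> Z."""
    return lambda sigma: {tau2 for tau in f(sigma) for tau2 in g(tau)}

def union(f1, f2):
    """Pointwise union, as used to combine the bounds of several rules."""
    return lambda sigma: f1(sigma) | f2(sigma)

# h(X) <- +(X, 1, Z), =(Z, U): compose X ~> Z with Z ~> U.
f = lambda s: {sub(Z=dict(s)["X"] + 1)}  # functional bound from +/3
g = lambda s: {sub(U=dict(s)["Z"])}      # functional bound from =/2
h = compose(f, g)
print(h(sub(X=4)) == {sub(U=5)})  # True

# p(X, Y) <- X = Y + 1 and p(X, Y) <- X = Y - 1: union maps X to {X-1, X+1}.
u = union(lambda s: {sub(Y=dict(s)["X"] - 1)},
          lambda s: {sub(Y=dict(s)["X"] + 1)})
print(u(sub(X=3)) == {sub(Y=2), sub(Y=4)})  # True
```

The union example also shows concretely why combining two functional bounds need not yield a functional bound: u maps one substitution to two.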
A simple example illustrating this proposition is as follows. Suppose we have the following rules for p:

p(X, Y) ← X = Y + 1.
p(X, Y) ← X = Y − 1.

Both rules have functional bounds from X to Y and vice versa. By taking the union of these two bounds, we get the bound p1 ⇝ p2 where X is mapped to {X − 1, X + 1}. This shows that functional bounds on rules do not necessarily give rise to functional bounds on argument positions.

If new bounds in predicates are detected, then these can be used to find new bounds in rules, analogous to Proposition 3.9.

Proposition 3.16. Let p be a predicate with a bound f : J ⇝ K in p. If p(t) ∈ B+(r), then there is a bound var(t|J) ⇝ var(t|K) in r. This bound is functional if f is functional.

Proof. Let X = var(t|J) and Y = var(t|K). Any element τ′ ∈ sub(K) corresponds to a unique element τ ∈ sub(Y). Similarly, any σ ∈ sub(X) corresponds to a unique element σ′ ∈ sub(J). Define

g : sub(X) → 2^sub(Y) : σ ↦ {τ | τ′ ∈ f(σ′)}

Take σ ∈ sub(X). Let τ ∈ sub(Y) and let ρ be a relevant substitution in r that extends both σ and τ. We prove that τ ∈ g(σ). Since f is a bound in p, for each relevant atom p(c) it holds that c|K ∈ f(c|J). Since ρ is relevant, we know that p(t) is justified; hence p(ρ(t)) is a relevant atom. Therefore, ρ(t)|K ∈ f(ρ(t)|J) because f is a bound. We can see that ρ(t)|J corresponds to σ and ρ(t)|K corresponds to τ, which completes the proof.

The interaction between Proposition 3.15 and Proposition 3.16 is shown in the following example program:

u(1..3). w(3..5).
p(A, B) ← u(A), w(B).
q(B) ← p(C, B).
r(X, Y) ← q(Y), X = Y.
r(X, Y) ← p(X, Y).
o(a) ← r(X, a).

We know that both u and w are determined predicates. Therefore, in the rule of p, A is bounded by u and B is bounded by w. This indicates that p1 is bounded by u and p2 is bounded by w. Similarly, q1 is bounded by w. In the first rule of r, Y is bounded by w, and by transitivity X is bounded by w as well. In the second rule of r, X is bounded by u and Y is bounded by w. Therefore, r1 is bounded by the union of u and w, while r2 is bounded by w. Finally, we obtain the following completion formula for o:

¬o(a) ∨ r(1, a) ∨ r(2, a) ∨ r(3, a) ∨ r(4, a) ∨ r(5, a)

In theory, to find bounds we repeat the two steps below until a fixpoint is reached:

1. find all bounds on variables in rules (using a fixpoint procedure, using the base cases and lemmas in Case 4 and Proposition 3.16);
2. find all bounds on argument positions of predicates (using a fixpoint procedure, using Proposition 3.15); here we can restrict ourselves to the predicates occurring in positive bodies, since those are the only predicates useful for generating completion formulas.

4 Future work

While the cases studied in the previous section allow for adding completion constraints in a wide variety of applications, we see the current work as a stepping stone towards a more extensive theory of approximations that enable adding completion constraints. In this section, we provide several directions in which the current work can be extended.

Dynamic overapproximations The approximations developed and described in the previous section can all be determined statically. However, during solving, sometimes more consequences at decision level zero are derived. Taking these into account as well (instead of just the determined predicates) can result in better approximations and hence more completion constraints.

More bounds in predicates For finding new opportunities to add completion formulas, it is necessary that (especially functional) bounds between argument positions are detected, even though they are not directly used in generating the completion formulas. This detection can be done by syntactic means, such as inspecting their defining rules, or by semantic means (De Cat and Bruynooghe 2013). We already supplied Proposition 3.15; however, this is not sufficient to find all useful bounds.

For example, in each rule below we have functional bounds {2, 3} ⇝ {4, 5} and {4, 5} ⇝ {2, 3}, but the complete predicate has the following fundamental functional bounds: {1, 2, 3} ⇝ {4, 5} and {1, 4, 5} ⇝ {2, 3}. This is because if you know the first argument position, then you know the rule that is used.

neighbor(D, X, Y, X, YY) ← D = n, Y = YY − 1.
neighbor(D, X, Y, X, YY) ← D = s, Y = YY + 1.
neighbor(D, X, Y, XX, Y) ← D = w, X = XX − 1.
neighbor(D, X, Y, XX, Y) ← D = e, X = XX + 1.

If, for example, you have neighbor(n, X, Y, XX, YY) in the positive body of a rule, then you know the first rule is applicable: X = XX and Y = YY − 1. These dependencies are not detected by the double fixpoint procedure. Intuitively, what is going on here is that the first argument of neighbor is inherently linked to which rule is applicable. Depending on that first argument, we can decide which functional dependency can be generalized to the predicate level (but it is not always the same).

To tackle this problem in its most general form, one could develop methods similar to grounding with bounds (Wittocx, Mariën, and Denecker 2010), which were developed in the context of model expansion (Mitchell and Ternovska 2005) for an extension of first-order logic (Denecker and Ternovska 2008) that closely relates to answer set programming (Denecker et al. 2019).
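The neighbor example can be verified with a brute-force check over its relevant atoms: positions {2, 3} do not functionally determine {4, 5} at the predicate level, but adding the first argument does. The helper below, with a small invented grid, is illustrative only.

```python
def relevant_neighbor_atoms(coords):
    """Relevant atoms of neighbor/5 on a toy grid; one tuple per rule,
    following the four rules above (n: YY = Y + 1, s: YY = Y - 1, etc.)."""
    atoms = []
    for x in coords:
        for y in coords:
            atoms += [("n", x, y, x, y + 1), ("s", x, y, x, y - 1),
                      ("w", x, y, x + 1, y), ("e", x, y, x - 1, y)]
    return atoms

def functional(atoms, J, K):
    """True iff argument positions J (1-based) functionally determine K
    over the given relevant atoms (Definition 3.8, brute force)."""
    seen = {}
    for a in atoms:
        key = tuple(a[j - 1] for j in J)
        val = tuple(a[k - 1] for k in K)
        if seen.setdefault(key, val) != val:
            return False
    return True

atoms = relevant_neighbor_atoms(range(3))
print(functional(atoms, [2, 3], [4, 5]))     # False: the rule used matters
print(functional(atoms, [1, 2, 3], [4, 5]))  # True: D fixes the rule
```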
5 Conclusion

In this paper, we highlighted the issue of missing completion formulas in lazy grounding and provided lightweight solutions for this issue based on static program analysis. In our theoretical analysis, we found that the completion formulas that can now be added are in some cases identical to redundant constraints added to improve search performance; hence, usage of our techniques eliminates this burden for the programmer.

Our next step in this research will be implementing the presented ideas and experimenting to find out what their impact is on the runtime of lazy grounders.

In Section 4, we identified several directions in which this work can continue that would allow for the detection of even more completion constraints. We intend to evaluate these as well in follow-up research.

References

Anger, C.; Gebser, M.; Janhunen, T.; and Schaub, T. 2006. What's a head without a body? In Brewka, G.; Coradeschi, S.; Perini, A.; and Traverso, P., eds., ECAI, 769–770. IOS Press.

Armstrong, W. W. 1974. Dependency structures of data base relationships. IFIP Congress, 580–583.

Balduccini, M.; Lierler, Y.; and Schüller, P. 2013. Prolog and ASP inference under one roof. In Cabalar, P., and Son, T. C., eds., Logic Programming and Nonmonotonic Reasoning, 12th International Conference, LPNMR 2013, Corunna, Spain, September 15-19, 2013. Proceedings, volume 8148 of LNCS, 148–160. Springer.

Banbara, M.; Kaufmann, B.; Ostrowski, M.; and Schaub, T. 2017. Clingcon: The next generation. TPLP 17(4):408–461.

Barrett, C. W.; Sebastiani, R.; Seshia, S. A.; and Tinelli, C. 2009. Satisfiability modulo theories. In Biere et al. (2009), 825–885.

Biere, A.; Heule, M.; van Maaren, H.; and Walsh, T., eds. 2009. Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications. IOS Press.

Bogaerts, B., and Weinzierl, A. 2018. Exploiting justifications for lazy grounding of answer set programs. In Lang, J., ed., Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden, 1737–1745. ijcai.org.

Calimeri, F.; Faber, W.; Gebser, M.; Ianni, G.; Kaminski, R.; Krennwallner, T.; Leone, N.; Ricca, F.; and Schaub, T. 2013. ASP-Core-2 input language format. Technical report, ASP Standardization Working Group.

Calimeri, F.; Gebser, M.; Maratea, M.; and Ricca, F. 2016. Design and results of the fifth answer set programming competition. Artif. Intell. 231:151–181.

Calimeri, F.; Faber, W.; Gebser, M.; Ianni, G.; Kaminski, R.; Krennwallner, T.; Leone, N.; Maratea, M.; Ricca, F.; and Schaub, T. 2020. ASP-Core-2 input language format. TPLP 20(2):294–309.

Clark, K. L. 1978. Negation as failure. In Logic and Data Bases, 293–322. Plenum Press.

Dal Palù, A.; Dovier, A.; Pontelli, E.; and Rossi, G. 2009. GASP: Answer set programming with lazy grounding. Fundam. Inform. 96(3):297–322.

Dao-Tran, M.; Eiter, T.; Fink, M.; Weidinger, G.; and Weinzierl, A. 2012. Omiga: An open minded grounding on-the-fly answer set solver. In del Cerro, L. F.; Herzig, A.; and Mengin, J., eds., JELIA, volume 7519 of LNCS, 480–483. Springer.

De Cat, B., and Bruynooghe, M. 2013. Detection and exploitation of functional dependencies for model generation. TPLP 13(4-5):471–485.

Denecker, M., and Ternovska, E. 2008. A logic of nonmonotone inductive definitions. ACM Trans. Comput. Log. 9(2):14:1–14:52.

Denecker, M.; Lierler, Y.; Truszczynski, M.; and Vennekens, J. 2019. The informal semantics of answer set programming: A Tarskian perspective. CoRR abs/1901.09125.

Denecker, M.; Brewka, G.; and Strass, H. 2015. A formal theory of justifications. In Calimeri, F.; Ianni, G.; and Truszczyński, M., eds., Logic Programming and Nonmonotonic Reasoning - 13th International Conference, LPNMR 2015, Lexington, KY, USA, September 27-30, 2015. Proceedings, volume 9345 of Lecture Notes in Computer Science, 250–264. Springer.

Gebser, M.; Maratea, M.; and Ricca, F. 2017. The sixth answer set programming competition. J. Artif. Intell. Res. 60:41–95.

Gelfond, M., and Lifschitz, V. 1988. The stable model semantics for logic programming. In Kowalski, R. A., and Bowen, K. A., eds., ICLP/SLP, 1070–1080. MIT Press.

Lefèvre, C., and Nicolas, P. 2009. The first version of a new ASP solver: ASPeRiX. In Erdem, E.; Lin, F.; and Schaub, T., eds., LPNMR, volume 5753 of LNCS, 522–527. Springer.

Leutgeb, L., and Weinzierl, A. 2017. Techniques for efficient lazy-grounding ASP solving. In Seipel, D.; Hanus, M.; and Abreu, S., eds., Declare 2017 – Conference on Declarative Programming, proceedings, number 499 in Institut für Informatik technical report, 123–138.

Liu, L.; Pontelli, E.; Son, T. C.; and Truszczynski, M. 2007. Logic programs with abstract constraint atoms: The role of computations. In Dahl, V., and Niemelä, I., eds., Logic Programming, 23rd International Conference, ICLP 2007, Porto, Portugal, September 8-13, 2007, Proceedings, volume 4670 of Lecture Notes in Computer Science, 286–301. Springer.

Marek, V., and Truszczyński, M. 1999. Stable models and an alternative logic programming paradigm. In Apt, K. R.; Marek, V.; Truszczyński, M.; and Warren, D. S., eds., The Logic Programming Paradigm: A 25-Year Perspective. Springer-Verlag. 375–398.

Marques-Silva, J. P., and Sakallah, K. A. 1999. GRASP: A search algorithm for propositional satisfiability. IEEE Transactions on Computers 48(5):506–521.

Marques Silva, J. P.; Lynce, I.; and Malik, S. 2009. Conflict-driven clause learning SAT solvers. In Biere et al. (2009), 131–153.

Martelli, A., and Montanari, U. 1982. An efficient unification algorithm. ACM Trans. Program. Lang. Syst. 4(2):258–282.

Mitchell, D. G., and Ternovska, E. 2005. A framework for representing and solving NP search problems. In Veloso, M. M., and Kambhampati, S., eds., AAAI, 430–435. AAAI Press / The MIT Press.

Rossi, F.; van Beek, P.; and Walsh, T., eds. 2006. Handbook of Constraint Programming, volume 2 of Foundations of Artificial Intelligence. Elsevier.
Stuckey, P. J. 2010. Lazy clause generation: Combining
the power of SAT and CP (and mip?) solving. In Lodi,
A.; Milano, M.; and Toth, P., eds., Integration of AI and
OR Techniques in Constraint Programming for Combinatorial Optimization Problems, 7th International Conference,
CPAIOR 2010, Bologna, Italy, June 14-18, 2010. Proceedings, volume 6140 of Lecture Notes in Computer Science,
5–9. Springer.
Taupe, R.; Weinzierl, A.; and Friedrich, G. 2019. Degrees
of laziness in grounding - effects of lazy-grounding strategies on ASP solving. In Balduccini, M.; Lierler, Y.; and
Woltran, S., eds., Logic Programming and Nonmonotonic
Reasoning - 15th International Conference, LPNMR 2019,
Philadelphia, PA, USA, June 3-7, 2019, Proceedings, volume 11481 of Lecture Notes in Computer Science, 298–311.
Springer.
Taupe, R.; Weinzierl, A.; and Schenner, G. 2017. Introducing Heuristics for Lazy-Grounding ASP Solving. In 1st
International Workshop on Practical Aspects of Answer Set
Programming.
Weinzierl, A. 2017. Blending lazy-grounding and CDNL
search for answer-set solving. In Balduccini, M., and Janhunen, T., eds., Logic Programming and Nonmonotonic
Reasoning - 14th International Conference, LPNMR 2017,
Espoo, Finland, July 3-6, 2017, Proceedings, volume 10377
of Lecture Notes in Computer Science, 191–204. Springer.
Wittocx, J.; Mariën, M.; and Denecker, M. 2010. Grounding
FO and FO(ID) with bounds. J. Artif. Intell. Res. (JAIR)
38:223–269.
Zhang, L.; Madigan, C. F.; Moskewicz, M. W.; and Malik,
S. 2001. Efficient conflict driven learning in Boolean satisfiability solver. In ICCAD, 279–285.
66
Splitting a Logic Program Efficiently
Rachel Ben-Eliyahu-Zohary
Department of Software Engineering
Azrieli College of Engineering,
Jerusalem, Israel
rbz@jce.ac.il
Abstract
Answer Set Programming (ASP) is a successful method for solving a range of real-world applications. Despite the availability of fast ASP solvers, computing answer sets demands very large computational power, since the problem tackled is on the second level of the polynomial hierarchy. A speed-up in answer set computation may be attained if the program can be split into two disjoint parts, bottom and top. The bottom part is then evaluated independently of the top part, and the results of the bottom part evaluation are used to simplify the top part. Lifschitz and Turner have introduced the concept of a splitting set, i.e., a set of atoms that defines the splitting.
In this paper, we address two issues regarding splitting. First, we show that the problem of computing a splitting set with some desirable properties can be reduced to a classic search problem and solved in polynomial time. Second, we show that the definition of splitting sets can be adjusted to allow splitting of a broader class of programs.

1 Introduction
Answer Set Programming (ASP) is a successful method for solving a range of real-world applications. Despite the availability of fast ASP solvers, the task of computing answer sets demands extensive computational power, because the problem tackled is on the second level of the polynomial hierarchy. A speed-up in answer set computation may be gained if the program can be divided into several modules in which each module is computed separately [Lifschitz and Turner, 1994; Janhunen et al., 2009; FLL, 2009]. Lifschitz and Turner propose to split a logic program into two disjoint parts, bottom and top, such that the bottom part is evaluated independently of the top part, and the results of the bottom part evaluation are used to simplify the top part. They introduced the concept of a splitting set, i.e., a set of atoms that defines the splitting [Lifschitz and Turner, 1994]. In addition to inspiring incremental ASP solvers [Gebser et al., 2008], splitting sets have been shown to be useful also in investigating answer set semantics [Dao-Tran et al., 2009; Oikarinen and Janhunen, 2008; FLL, 2009].
In this paper we raise and answer two questions regarding splitting sets. The first question is: how do we compute a splitting set? We show that if we are looking for a splitting set having a desirable property that can be tested efficiently, we can find it in polynomial time. Examples of desirable splitting sets are minimum-size splitting sets, splitting sets that include certain atoms, or splitting sets that define a bottom part with a minimum number of rules or a bottom that is easy to compute, for example, a bottom which is an HCF program [Ben-Eliyahu and Dechter, 1994].
Second, we ask whether it is possible to relax the definition of splitting sets so that we can split programs that could not be split using the original definition. We answer the second question affirmatively as well, and we present a more general and relaxed definition of a splitting set.

2 Preliminaries
2.1 Disjunctive Logic Programs and Stable Models
A propositional Disjunctive Logic Program (DLP) is a collection of rules of the form
A1 | ... | Ak ←− Ak+1, ..., Am, not Am+1, ..., not An,    n ≥ m ≥ k ≥ 0,
where the symbol “not” denotes negation by default, and each Ai is an atom (or variable). For k + 1 ≤ i ≤ m, we say that Ai appears positively in the body of the rule, while for m + 1 ≤ i ≤ n, we say that Ai appears negatively in the body of the rule. If k = 0, then the rule is called an integrity rule. If k > 1, then the rule is called a disjunctive rule. The expression to the left of ←− is called the head of the rule, while the expression to the right of ←− is called the body of the rule. Given a rule r, head(r) denotes the set of atoms in the head of r, and body(r) denotes the set of atoms in the body of r. From now on, when we refer to a program, we mean a DLP.
Stable models [Gelfond and Lifschitz, 1991] of a program P are defined as follows. Let Lett(P) denote the set of all atoms occurring in P, and let a context be any subset of Lett(P). Let P be a negation-by-default-free program. Call a context S closed under P iff for each rule A1 | ... | Ak ← Ak+1, ..., Am in P, if Ak+1, ..., Am ∈ S, then Ai ∈ S for some i = 1, ..., k. A stable model of P is any minimal context S such that S is closed under P. A stable model of a general DLP is defined as follows: let the reduct of P w.r.t. the context S be the DLP obtained from P by deleting (i) each rule that has not A in its body for some A ∈ S, and (ii) all subformulae of the form not A from the bodies of the remaining rules. Any context S which is a stable model of the reduct of P w.r.t. S is a stable model of P.
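The reduct construction and the closedness test above are directly executable. The sketch below is our own minimal illustration (the rule-triple encoding and the brute-force minimality check are our assumptions, not the paper's code), applied to the classic two-rule program a ← not b, b ← not a:

```python
from itertools import combinations

# A program is a list of rules (head, positive body, negative body).
# Illustrative program: a <- not b.  b <- not a.
PROG = [({"a"}, set(), {"b"}),
        ({"b"}, set(), {"a"})]

def reduct(rules, S):
    # delete rules containing "not A" for A in S; drop remaining negative parts
    return [(head, pos) for head, pos, neg in rules if not (neg & S)]

def closed(nf_rules, X):
    # X is closed iff every rule whose positive body holds has a head atom in X
    return all(head & X for head, pos in nf_rules if pos <= X)

def is_stable(rules, S):
    red = reduct(rules, S)
    if not closed(red, S):
        return False
    # minimality: no proper subset of S may be closed under the reduct
    return not any(closed(red, set(sub))
                   for k in range(len(S))
                   for sub in combinations(sorted(S), k))

assert is_stable(PROG, {"a"}) and is_stable(PROG, {"b"})
assert not is_stable(PROG, set()) and not is_stable(PROG, {"a", "b"})
```

The subset enumeration is exponential and only meant for small hand examples; real ASP solvers avoid it entirely.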
2.2 Programs and graphs
With every program P we associate a directed graph, called the dependency graph of P, in which (a) each atom in Lett(P) is a node, and (b) there is an arc directed from a node A to a node B if there is a rule r in P such that A ∈ body(r) and B ∈ head(r). A super dependency graph SG is an acyclic graph built from a dependency graph G as follows: for each strongly connected component (SCC) c in G there is a node in SG, and for each arc in G from a node in a strongly connected component c1 to a node in a strongly connected component c2 (where c1 ≠ c2) there is an arc in SG from the node associated with c1 to the node associated with c2. A program P is Head-Cycle-Free (HCF) if no two atoms in the head of some rule in P belong to the same component in the super dependency graph of P [Ben-Eliyahu and Dechter, 1994].
Let G be a directed graph and SG be a super dependency graph of G. A source in G (or SG) is a node with no incoming edges. By abuse of terminology, we shall sometimes use the term “source” or “SCC” for the set of nodes in a certain source or a certain SCC in SG, respectively, and when there is no possibility of confusion we shall use the term rule for the set of all atoms that appear in the rule. Given a node v in G, scc(v) denotes the set of all nodes in the SCC in SG to which v belongs, and tree(v) denotes the set of all nodes that belong to any SCC S such that there is a path in SG from S to scc(v). Similarly, when S is a set of nodes, tree(S) is the union of tree(v) for every v ∈ S. For example, given the super dependency graph in Figure 1, scc(e) = {e, h}, tree(e) = {a, b, e, h}, tree({f, g}) = {a, b, c, d, f, g}, and tree(r), where r = c|f ←− not d, is actually tree({c, d, f}), which is {a, b, c, d, f}.
A source in a program will serve as a shorthand for “a source in the super dependency graph of the program.” Given a source S of a program P, PS denotes the set of rules in P that use only atoms from S.
Example 2.1 (Running Example) Suppose we are given the following program P:
1. a ←− not b
2. e|b ←− not a
3. f ←− not b
4. g|d ←− c
5. c|f ←− not d
6. h ←− e
7. e ←− a, not h
8. h ←− a
In Figure 1 the dependency graph of P is illustrated in solid lines.
The SG is marked with dotted lines. Note that {a, b} is a source
in the SG of P, but it is not a splitting set.
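These graph notions are easy to check mechanically. The sketch below is our own illustration (the triple encoding of rules is an assumption, not the paper's code): it encodes the rules of Example 2.1, builds the dependency graph, and computes scc and tree via naive transitive closure:

```python
# Example 2.1 encoded as (head, positive body, negative body) triples.
RULES = [
    ({"a"}, set(), {"b"}),
    ({"e", "b"}, set(), {"a"}),
    ({"f"}, set(), {"b"}),
    ({"g", "d"}, {"c"}, set()),
    ({"c", "f"}, set(), {"d"}),
    ({"h"}, {"e"}, set()),
    ({"e"}, {"a"}, {"h"}),
    ({"h"}, {"a"}, set()),
]

# arc A -> B whenever A is in body(r) and B is in head(r)
EDGES = {(a, b) for head, pos, neg in RULES for a in pos | neg for b in head}
LETTERS = {x for r in RULES for part in r for x in part}

def reach_map():
    # strict transitive closure of the dependency graph, by fixpoint iteration
    succ = {a: {b for (x, b) in EDGES if x == a} for a in LETTERS}
    changed = True
    while changed:
        changed = False
        for a in LETTERS:
            new = set().union(succ[a], *(succ[b] for b in succ[a]))
            if new != succ[a]:
                succ[a], changed = new, True
    return succ

REACH = reach_map()

def scc(v):
    # atoms mutually reachable with v: v's strongly connected component
    return {u for u in LETTERS if u == v or (u in REACH[v] and v in REACH[u])}

def tree(S):
    # S plus every atom with a path into an atom of S
    return set(S) | {u for u in LETTERS if REACH[u] & set(S)}

assert scc("e") == {"e", "h"}
assert tree({"e"}) == {"a", "b", "e", "h"}
assert tree({"f", "g"}) == {"a", "b", "c", "d", "f", "g"}
```

On the running example this reproduces scc(e) = {e, h}, tree(e) = {a, b, e, h}, and tree({f, g}) = {a, b, c, d, f, g} from the text.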
2.3 Splitting Sets
The definitions of the Splitting Set and the Splitting Set Theorem are adapted from a paper by Lifschitz and Turner [Lifschitz and Turner, 1994]. We restate them here using the notation and the limited form of programs discussed in our work.
Definition 2.2 (Splitting Set) A splitting set for a program P is a set of atoms U such that for each rule r in P, if one of the atoms in the head of r is in U, then all the atoms in r are in U. We denote by bU(P) the set of all rules in P having only atoms from U.
The empty set is a splitting set for any program. For an example
of a nontrivial splitting set, the set {a, b, e, h} is a splitting set for
the program P introduced in Example 2.1. The set b{a,b,e,h} (P)
is {r1 , r2 , r6 , r7 , r8 }.
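Definition 2.2 and the bottom bU(P) translate directly into code. The triple encoding below is our own illustration of Example 2.1 (not from the paper):

```python
# Example 2.1 encoded as (head, positive body, negative body) triples.
RULES = [
    ({"a"}, set(), {"b"}),
    ({"e", "b"}, set(), {"a"}),
    ({"f"}, set(), {"b"}),
    ({"g", "d"}, {"c"}, set()),
    ({"c", "f"}, set(), {"d"}),
    ({"h"}, {"e"}, set()),
    ({"e"}, {"a"}, {"h"}),
    ({"h"}, {"a"}, set()),
]

def atoms(rule):
    head, pos, neg = rule
    return head | pos | neg

def is_splitting_set(rules, U):
    # Definition 2.2: a head atom in U forces every atom of the rule into U
    return all(atoms(r) <= U for r in rules if r[0] & U)

def bottom(rules, U):
    # b_U(P): the rules of P that use only atoms from U
    return [r for r in rules if atoms(r) <= U]

U = {"a", "b", "e", "h"}
assert is_splitting_set(RULES, U)
assert not is_splitting_set(RULES, {"a", "b"})  # rule 2 also needs e
assert len(bottom(RULES, U)) == 5               # rules 1, 2, 6, 7, 8
```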
For the Splitting Set Theorem, we need a procedure called Reduce, which resembles many reasoning methods in knowledge representation, such as unit propagation in DPLL and other constraint-satisfaction algorithms [Davis et al., 1962; Dechter, 2003]. Reduce(P,X,Y) returns the program obtained from a given program P in which all atoms in X are set to true, and all atoms in Y are set to false. The procedure is shown below.

Procedure Reduce(P,X,Y)
Input: A program P and two sets of atoms, X and Y
Output: An update of P assuming all the atoms in X are true and all atoms in Y are false
1  foreach atom a ∈ X do
2    foreach rule r in P do
3      if a appears negatively in the body of r, delete r;
4      if a is in the head of r, delete r;
5      delete each positive appearance of a in the body of r;
6  foreach atom a ∈ Y do
7    foreach rule r in P do
8      if a appears positively in the body of r, delete r;
9      if a is in the head of r, delete a from the head of r;
10     delete each negative appearance of a in the body of r;
11 return P;

For example, Reduce(P,{a, e, h},{b}), where P is the program from Example 2.1, is the following program (the rule numbers are those of the corresponding rules in Example 2.1):
3. f ←−
4. g|d ←− c
5. c|f ←− not d
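As a sanity check, Procedure Reduce fits in a few lines over a set-triple rule encoding (our own illustration; the encoding and names are not from the paper):

```python
# Each rule is (head, positive body, negative body), all sets of atom names.
RULES = [
    ({"a"}, set(), {"b"}),          # 1. a <- not b
    ({"e", "b"}, set(), {"a"}),     # 2. e|b <- not a
    ({"f"}, set(), {"b"}),          # 3. f <- not b
    ({"g", "d"}, {"c"}, set()),     # 4. g|d <- c
    ({"c", "f"}, set(), {"d"}),     # 5. c|f <- not d
    ({"h"}, {"e"}, set()),          # 6. h <- e
    ({"e"}, {"a"}, {"h"}),          # 7. e <- a, not h
    ({"h"}, {"a"}, set()),          # 8. h <- a
]

def reduce_program(rules, X, Y):
    """Set atoms in X to true and atoms in Y to false (Procedure Reduce)."""
    out = []
    for head, pos, neg in rules:
        if (neg & X) or (head & X):   # lines 3-4: delete the rule
            continue
        if pos & Y:                   # line 8: delete the rule
            continue
        # lines 5, 9, 10: drop satisfied/falsified occurrences of the atoms
        out.append((head - Y, pos - X, neg - Y))
    return out

result = reduce_program(RULES, {"a", "e", "h"}, {"b"})
# Rules 3, 4, 5 survive as: f <- ; g|d <- c ; c|f <- not d
assert result == [({"f"}, set(), set()),
                  ({"g", "d"}, {"c"}, set()),
                  ({"c", "f"}, set(), {"d"})]
```

This reproduces exactly the reduced program shown above for Reduce(P,{a, e, h},{b}).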
Theorem 2.3 (Splitting Set Theorem) (adapted from [Lifschitz and Turner, 1994]) Let P be a program, and let U be a splitting set for P. A set of atoms S is a stable model of P if and only if S = X ∪ Y, where X is a stable model of bU(P), and Y is a stable model of Reduce(P, X, U − X).
As seen in Example 2.1, a source is not necessarily a splitting set. A slightly different definition of a dependency graph is possible: the nodes are the same as in our definition, but in addition to the edges that we already have, we add a directed arc from a variable A to a variable B whenever A and B are in the head of the same rule. It is clear that a source in this variant of the dependency graph must be a splitting set. The problem is that the size of a dependency graph built using this new definition may be exponential in the size of the heads of the rules, while we are looking for a polynomial-time algorithm for computing a nontrivial splitting set.
2.4 Search Problems
The area of search is one of the most studied and best-known areas in AI (see, for example, [Pearl, 1984]). In this paper we show how the problem of computing a nontrivial minimum-size splitting set can be expressed as a search problem. We first recall basic definitions in the area of search. A search problem is defined by five elements: a set of states, an initial state, actions (or a successor function), a goal test, and a path cost. A solution is a sequence of actions leading from the initial state to a goal state. Figure 2 provides a basic search algorithm [Russell and Norvig, 2010].
There are many different strategies for choosing the next leaf node to expand. In this paper we use uniform-cost search, according to which we expand the leaf node with the lowest path cost.
Figure 2: Tree Search Algorithm

3 Between Splitting Sets and Dependency Graphs
In this section we show that a splitting set is actually a tree in the SG of the program P. The first lemma states that if an atom Q is in some splitting set, all the atoms in scc(Q) must be in that splitting set as well.

Lemma 3.1 Let P be a program, let SP be a splitting set in P, let Q ∈ SP, and let S = scc(Q). It must be the case that S ⊆ SP.

Proof: Let R ∈ S. We will show that R ∈ SP. Since Q ∈ S, and S is a strongly connected component, for each Q′ ∈ S there is a path in SG, the super dependency graph of P, from Q′ to Q such that all the atoms along the path belong to S. The proof goes by induction on i, the number of edges in the shortest path from Q′ to Q.
Case i = 0. Then Q = Q′, and so obviously Q′ ∈ SP.
Induction step. Suppose that for all atoms Q′ ∈ S such that the shortest path from Q′ to Q is of length i, Q′ belongs to SP. Let R be an atom in S such that the shortest path from R to Q is of length i + 1. Then there must be an atom R′ such that there is an edge in SG from R to R′, and the shortest path from R′ to Q is of length i. By the induction hypothesis, R′ ∈ SP. Since there is an edge from R to R′ in SG, there must be a rule r in P such that R ∈ body(r) and R′ ∈ head(r). Since R′ ∈ SP and SP is a splitting set, it must be the case that R ∈ SP. ✷

Lemma 3.2 Let P be a program, let SP be a splitting set in P, let r be a rule in P, and let S be an SCC in SG, the super dependency graph of P. If head(r) ∩ SP ≠ ∅, then tree(r) ⊆ SP.

Proof: The set tree(r) is a union of SCCs. We shall show that for every SCC S such that S ⊆ tree(r), S ⊆ SP. Let S′ be the root of tree(r). The proof is by induction on the distance i from S to S′.
Case i = 0. Then S = S′, and since S′ is the root of tree(r) and head(r) ∩ SP ≠ ∅, by Lemma 3.1 S ⊆ SP.
Induction step. Suppose that for all SCCs S ∈ tree(r) such that the distance from S to S′ is i, S ⊆ SP. Let R be an SCC in tree(r) such that the distance from R to S′ is i + 1. Then there must be an SCC R′ such that there is an edge in tree(r) from R to R′, and the distance from R′ to S′ is i. By the induction hypothesis, R′ ⊆ SP. Since there is an edge from R to R′ in tree(r), there must be a rule r′ in P such that an atom from R, say Q, is in body(r′), and an atom from R′, say Q′, is in head(r′). By the induction hypothesis, Q′ ∈ SP, and since SP is a splitting set, it must be that Q ∈ SP. By Lemma 3.1, R ⊆ SP. ✷

Figure 1: The super dependency graph of the program P.

Corollary 3.3 Every splitting set is a collection of trees.

Note that the converse of Corollary 3.3 does not hold. In our running example, for instance, tree(g) = {c, d, g}, but {c, d, g} is not a splitting set.

4 Computing a minimum-size splitting set as a search problem
We shall now confront the problem of computing a splitting set with a desirable property. We shall focus on computing a nontrivial minimum-size splitting set. Given a program P, we view this task as a search problem as follows. We assume that there is an order over the rules in the program.
State Space. The state space is a collection of forests that are subgraphs of the super dependency graph of P.
Initial State. The empty set.
Actions.
1. The initial state can be united with one of the sources in the super dependency graph of P.
2. A state S, other than the initial state, has only one possible action:
(a) Find the lowest rule r (recall that the rules are ordered) such that head(r) ∩ S ≠ ∅ and Lett(r) ⊈ S;
(b) Unite S with tree(r).
Transition Model. The result of applying an action to a state S is a state S′ that is a superset of S, as the actions describe.
Goal Test. A state S is a goal state if there is no rule r ∈ P such that head(r) ∩ S ≠ ∅ and Lett(r) ⊈ S. (In other words, a goal state is a state that represents a splitting set.)
Path Cost. The cost of moving from a state S to a state S′ is |S′| − |S|, that is, the number of atoms added to S when it is transformed into S′. So, the path cost is actually the number of atoms in the final state of the path.
Once the problem is formulated as a search problem, we can use any of the search algorithms developed in the AI community to solve it. We do claim, however, that the computation of a nontrivial minimum-size splitting set can be done in time that is polynomial in the size of the program. This search problem can be solved, for example, by the Uniform Cost search algorithm. Uniform Cost [Russell and Norvig, 2010] is a variation of Dijkstra's single-source shortest-path algorithm [Dijkstra, 1959; Felner, 2011]. Uniform Cost is optimal, that is, it returns a shortest path to a goal state. Since the search problem is formulated so that the length of the path to a goal state is the size of the splitting set that the goal state represents, Uniform Cost will find a minimum-size splitting set.
The time complexity of this algorithm is O(b^m), where b is the branching factor of the search tree generated and m is the depth of the optimal solution. It is easy to see that m cannot be larger than the number of rules in the program, because once we use a rule for computing the next state, this rule cannot be used again in any later state. As for the branching factor b: except for the initial state, each state can have at most one child; to generate a child we apply the lowest rule that demonstrates that the current state is not a splitting set. For a given state, the time required to compute its child is polynomial in the size of the program. Therefore, this search problem can be solved in polynomial time. This claim is summarized in the following proposition.
Proposition 4.1 A minimum-size nontrivial splitting set can be
computed in time polynomial in the size of the program.
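The whole formulation fits in a short uniform-cost search sketch. This is our own illustration of the construction (set-triple rule encoding, naive closure and SCC computation; none of it is the paper's implementation), run on Example 2.1:

```python
import heapq

# Example 2.1, rules in their given order: (head, positive body, negative body).
RULES = [
    ({"a"}, set(), {"b"}),
    ({"e", "b"}, set(), {"a"}),
    ({"f"}, set(), {"b"}),
    ({"g", "d"}, {"c"}, set()),
    ({"c", "f"}, set(), {"d"}),
    ({"h"}, {"e"}, set()),
    ({"e"}, {"a"}, {"h"}),
    ({"h"}, {"a"}, set()),
]

def atoms(rule):
    head, pos, neg = rule
    return head | pos | neg

def edges(rules):
    # dependency graph: arc A -> B whenever A is in body(r) and B is in head(r)
    return {(a, b) for head, pos, neg in rules for a in pos | neg for b in head}

def reach_map(rules):
    # strict transitive closure of the dependency graph (naive fixpoint)
    letters = set().union(*(atoms(r) for r in rules))
    succ = {a: {b for (x, b) in edges(rules) if x == a} for a in letters}
    changed = True
    while changed:
        changed = False
        for a in letters:
            new = set().union(succ[a], *(succ[b] for b in succ[a]))
            if new != succ[a]:
                succ[a], changed = new, True
    return letters, succ

def tree(rules, S):
    # tree(S): S together with every atom that has a path into an atom of S
    letters, reach = reach_map(rules)
    return set(S) | {a for a in letters if reach[a] & set(S)}

def sources(rules):
    # SCCs of the super dependency graph with no incoming external arc
    letters, reach = reach_map(rules)
    seen, out = set(), []
    for a in letters:
        comp = frozenset(b for b in letters
                         if b == a or (b in reach[a] and a in reach[b]))
        if comp not in seen:
            seen.add(comp)
            if not any(y in comp and x not in comp for (x, y) in edges(rules)):
                out.append(set(comp))
    return out

def min_splitting_set(rules):
    # uniform-cost search; the path cost of a state is its number of atoms
    frontier, tick = [(0, 0, frozenset())], 0
    while frontier:
        _, _, S = heapq.heappop(frontier)
        if not S:  # initial state: branch on the sources
            for src in sources(rules):
                tick += 1
                heapq.heappush(frontier, (len(src), tick, frozenset(src)))
            continue
        # lowest rule witnessing that S is not a splitting set
        viol = next((r for r in rules if (r[0] & S) and not atoms(r) <= S), None)
        if viol is None:
            return set(S)  # goal: S is a nontrivial splitting set
        S2 = frozenset(S | tree(rules, atoms(viol)))
        tick += 1
        heapq.heappush(frontier, (len(S2), tick, S2))
    return None

assert min_splitting_set(RULES) == {"a", "b", "e", "h"}
```

On the running example the search expands {a, b} and {c, d} and returns {a, b, e, h}, matching the walkthrough below Proposition 4.1.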
The following example demonstrates how the search algorithm works, assuming that we are looking for the smallest nonempty splitting set and we are using uniform-cost search.

Example. Suppose we are given the program P of Example 2.1, and we want to apply the search procedure to compute a nontrivial minimum-size splitting set. The search tree is shown in Figure 3. Our initial state is the empty set. By the definition of the search problem, the successors of the empty set are the sources of the super dependency graph of the program, which in this case are {a, b} and {c, d}, both with action cost 2. Since both current leaves have the same path cost, we choose one of them at random, say {c, d}, and check whether it is a goal state, or in other words, a splitting set. It turns out {c, d} is not a splitting set, and the lowest rule that proves it is Rule no. 4, which requires a splitting set that includes d to contain also c and g. So, we make the leaf {c, d, g} the child of {c, d} with action cost 1 (only one atom, g, was added to {c, d}). Now we have two leaves in the search tree: the leaf {a, b} with path cost 2, which was there before, and the just-added leaf {c, d, g} with path cost 3. So we check whether {a, b} is a splitting set and find that Rule no. 2 is the lowest rule that proves it is not. We add the tree of Rule no. 2 and get the child {a, b, e, h} with path cost 4. We then check whether {c, d, g} is a splitting set and find that Rule no. 5 is the lowest rule that proves that it is not. We add the tree of Rule no. 5 and get the child {c, d, g, f, a, b} with path cost 6. Back at the leaf {a, b, e, h}, the leaf with the shortest path, we find that it is also a splitting set, and we stop the search. ✷

Figure 3: The search tree for P.

5 Experiments
We have implemented our algorithm and tested it on randomly generated programs having no negation as failure. A stable model is actually a minimal model for this type of program. For each program we computed a nontrivial minimum-size splitting set. The average nontrivial minimum size of a splitting set, and the median of all nontrivial minimum-size splitting sets, as a function of the ratio of rules to variables, are shown in Figure 4 and Figure 5, respectively. The average and median were taken over 100 programs generated randomly, starting with a ratio of 2 and generating 100 random programs for each interval of 0.25. It is clear from the graphs that at the phase-transition value of 4.25 (see [Selman et al., 1996]) the size of the splitting set is maximal and equal to the number of variables in the program. This is a new way of explaining why programs at the phase-transition ratio of rules to variables are hard to solve.

Figure 4: Average size of nonempty splitting sets.
Figure 5: Median size of nonempty splitting sets.

6 Relaxing the splitting set condition
As the experiments indicate, in the hard random problems the only nonempty splitting set is the set of all atoms in the program. In such cases splitting is not useful at all. In this section we introduce the concept of a generalized splitting set (g-splitting set), which is a relaxation of the concept of a splitting set. Every splitting set is a g-splitting set, but there are g-splitting sets that are not splitting sets.

Definition 6.1 (Generalized Splitting Set) A Generalized Splitting Set (g-splitting set) for a program P is a set of atoms U such that for each rule r in P, if one of the atoms in the head of r is in U, then all the atoms in the body of r are in U.

Thus, g-splitting sets that are not splitting sets may be found only when there are disjunctive rules in the program.

Example 6.2 Suppose we are given the following program P:
1. a ←− not b
2. b ←− not a
3. b|c ←− a
4. a|d ←− b
The program has only two trivial splitting sets: the empty set and {a, b, c, d}. However, the set {a, b} is a g-splitting set of P.

We next demonstrate the usefulness of g-splitting sets. We show that it is possible to compute a stable model of a program P by computing a stable model of PS for a g-splitting set S of P, and then propagating the values assigned to atoms in S to the rest of the program.

Theorem 6.3 (Program Decomposition) Let P be a program. For any g-splitting set S in P, let X be a stable model of PS. Moreover, let P′ = Reduce(P, X, S − X), where Reduce(P, X, S − X) is the result of propagating the assignments of the model X in the program P. Then, for any stable model M′ of P′, M′ ∪ X is a stable model of P.

The proof can be found in the full version of the paper.
Consider the program P from Example 6.2, which has two stable models: {a, c} and {b, d}. Let us compute the stable models of P according to Theorem 6.3. We take U = {a, b}, which is a g-splitting set for P. The bottom of P according to U, denoted b{a,b}(P), consists of Rules 1 and 2, that is: {a ←− not b, b ←− not a}. So the bottom has two stable models: {a} and {b}. If we propagate the model {a} to the top of the program, we are left with the rule {c ←− }, and we get the stable model {a, c}. If we propagate the model {b} to the top of the program, we are left with the rule {d ←− }, and we get the stable model {b, d}.

7 Related Work
The idea of splitting is discussed in many publications. Here we discuss papers that deal with generating splitting sets and relaxing the definition of a splitting set.
The work in [Ji et al., 2015] suggests a new way of splitting that introduces a possibly exponential number of new atoms to the program. The authors show that for some typical programs their splitting method is efficient, but clearly it can be quite resource-demanding in the worst case.
Baumann [Baumann, 2011] discusses splitting sets and graphs, but does not go all the way in introducing a polynomial algorithm for computing classical splitting sets, as we do here. The authors of [Baumann et al., 2012] suggest quasi-splitting, a relaxation of the concept of splitting that requires the introduction of new atoms to the program, and they describe a polynomial algorithm, based on the dependency graph of the program, to compute a quasi-splitting set efficiently. Our algorithm is essentially a search algorithm with fractions of the dependency graph as states in the search space. We do not need the introduction of new atoms to define g-splitting sets.
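The difference between a splitting set (Definition 2.2) and a g-splitting set (Definition 6.1) is only whether the head atoms of a triggered rule are themselves forced into U. A short sketch (our own set-triple encoding of the program of Example 6.2, not the paper's code) makes the contrast checkable:

```python
# Example 6.2 as (head, positive body, negative body) triples.
PROG = [({"a"}, set(), {"b"}),       # 1. a <- not b
        ({"b"}, set(), {"a"}),       # 2. b <- not a
        ({"b", "c"}, {"a"}, set()),  # 3. b|c <- a
        ({"a", "d"}, {"b"}, set())]  # 4. a|d <- b

def is_splitting_set(rules, U):
    # Definition 2.2: a head atom in U forces *all* atoms of the rule into U
    return all(head | pos | neg <= U for head, pos, neg in rules if head & U)

def is_g_splitting_set(rules, U):
    # Definition 6.1: a head atom in U forces only the *body* atoms into U
    return all(pos | neg <= U for head, pos, neg in rules if head & U)

assert is_g_splitting_set(PROG, {"a", "b"})
assert not is_splitting_set(PROG, {"a", "b"})      # rules 3 and 4 block it
assert is_splitting_set(PROG, {"a", "b", "c", "d"})
assert is_g_splitting_set(PROG, {"a", "b", "c", "d"})
```

As claimed, {a, b} is a g-splitting set but not a splitting set of this program, while every splitting set also passes the g-splitting test.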
8 Conclusions
The concept of splitting has a considerable role in logic programming. This paper makes two major contributions. First, we show that the task of looking for an appropriate splitting set can be formulated as a classical search problem and solved in time that is polynomial in the size of the program. Search has been studied extensively in AI, and when we formulate a problem as a search problem, we immediately benefit from the library of search algorithms and strategies that have been developed in the past and will be developed in the future. Our second contribution is the introduction of g-splitting sets, a generalization of the splitting sets of Lifschitz and Turner. This allows a larger class of programs to be split into nontrivial parts.
References
[Baumann et al., 2012] Ringo Baumann, Gerhard Brewka, Wolfgang Dvořák, and Stefan Woltran. Parameterized Splitting: A
Simple Modification-Based Approach, pages 57–71. Springer
Berlin Heidelberg, Berlin, Heidelberg, 2012.
[Baumann, 2011] Ringo Baumann. Splitting an argumentation
framework. In James P. Delgrande and Wolfgang Faber, editors, Logic Programming and Nonmonotonic Reasoning, pages
40–53, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg.
[Ben-Eliyahu and Dechter, 1994] Rachel Ben-Eliyahu and Rina
Dechter. Propositional semantics for disjunctive logic programs. Annals of Mathematics and Artificial Intelligence,
12:53–87, 1994.
[Dao-Tran et al., 2009] Minh Dao-Tran, Thomas Eiter, Michael
Fink, and Thomas Krennwallner. Modular nonmonotonic logic
programming revisited. In Patricia M. Hill and David S. Warren, editors, Logic Programming, pages 145–159, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg.
[Davis et al., 1962] Martin Davis, George Logemann, and Donald Loveland. A machine program for theorem-proving. Communications of the ACM, 5(7):394–397, 1962.
[Dechter, 2003] Rina Dechter. Constraint processing. Morgan
Kaufmann, 2003.
[Dijkstra, 1959] Edsger W. Dijkstra. A note on two problems
in connexion with graphs. Numerische mathematik, 1(1):269–
271, 1959.
[Felner, 2011] Ariel Felner. Position paper: Dijkstra's algorithm versus uniform cost search or a case against Dijkstra's algorithm. In Fourth Annual Symposium on Combinatorial Search, 2011.
[FLL, 2009] Symmetric splitting in the general theory of stable models, 2009.
[Gebser et al., 2008] Martin Gebser, Roland Kaminski, Benjamin Kaufmann, Max Ostrowski, Torsten Schaub, and Sven
Thiele. Engineering an incremental ASP solver. In Maria Garcia
de la Banda and Enrico Pontelli, editors, Logic Programming,
pages 190–205, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg.
[Gelfond and Lifschitz, 1991] Michael Gelfond and Vladimir
Lifschitz. Classical negation in logic programs and disjunctive
databases. New Generation Computing, 9:365–385, 1991.
[Janhunen et al., 2009] Tomi Janhunen, Emilia Oikarinen, Hans
Tompits, and Stefan Woltran. Modularity aspects of disjunctive stable models. Journal of Artificial Intelligence Research,
35:813–857, 2009.
[Ji et al., 2015] Jianmin Ji, Hai Wan, Ziwei Huo, and Zhenfeng
Yuan. Splitting a logic program revisited. In Proceedings of
the Twenty-Ninth AAAI Conference on Artificial Intelligence,
AAAI’15, pages 1511–1517. AAAI Press, 2015.
[Lifschitz and Turner, 1994] Vladimir Lifschitz and Hudson
Turner. Splitting a logic program. In ICLP, volume 94, pages
23–37, 1994.
[Oikarinen and Janhunen, 2008] Emilia Oikarinen and Tomi Janhunen. Achieving compositionality of the stable model semantics for smodels programs. Theory and Practice of Logic Programming, 8(5-6):717–761, 2008.
[Pearl, 1984] Judea Pearl. Heuristics: intelligent search strategies
for computer problem solving. 1984.
[Russell and Norvig, 2010] Stuart J. Russell and Peter Norvig.
Artificial Intelligence - A Modern Approach, Third International Edition. Pearson Education, 2010.
[Selman et al., 1996] Bart Selman, David G Mitchell, and Hector J Levesque. Generating hard satisfiability problems. Artificial intelligence, 81(1-2):17–29, 1996.
Interpreting Conditionals in Argumentative Environments
Jesse Heyninck,1 Gabriele Kern-Isberner,1 Kenneth Skiba,2 Matthias Thimm2
1 Technical University Dortmund, Dortmund, Germany
2 University of Koblenz-Landau, Koblenz, Germany
jesse.heyninck@tu-dortmund.de, gabriele.kern-isberner@cs.tu-dortmund.de,
kennethskiba@uni-koblenz.de, thimm@uni-koblenz.de
conditionals with the informal meaning “if φ is true then, usually, ψ is true as well”, written as (ψ|φ). In abstract dialectical frameworks, these pairs serve as acceptance conditions and are read as “if φ is accepted then ψ is accepted as well”. The resemblance of these informal readings is striking, but the two approaches use fundamentally different semantics to formalise them.
In previous work (Kern-Isberner and Thimm 2018; Heyninck, Kern-Isberner, and Thimm 2020) we looked at the question of what happens if we translate an ADF into a conditional logic knowledge base, use conditional logic reasoning mechanisms on the latter, and interpret the results in argumentative terms. Our results showed that the intuitions behind the semantics of the two worlds are generally different, but there are also cases where the semantics coincide. In this paper, we look at the complementary question: we investigate what happens if we translate a conditional logic knowledge base into an ADF, use ADF reasoning mechanisms on the latter, and interpret the results in conditional logic terms.
Outline of this Paper: After introducing the necessary preliminaries in Section 2 on propositional logic (Section 2.1),
conditional logic (Section 2.2) and abstract dialectial frameworks (Section 2.3), we present our argumentative interpretation of conditionals in Section 3. We first present our translation for literal conditional knowledge bases (Section 3.2)
and discuss the behaviour of the negation needed in this
translation (Section 3.3). Thereafter we show the adequacy
of this translation under both two-valued semantics in Section 3.4 and under other semantics in Section 3.5. We then
generalize the translation as to allow for what we call extended literal conditional knowledge bases (Section 3.6) and
discuss several properties of our translation in Section 3.7.
Thereafter, we further motivate the design choices made
in our interpretation in Section 4. Finally, we compare our
work with related work (Section 5) and conclude in Section 6.
Abstract
In the field of knowledge representation and reasoning, different paradigms have co-existed for many years. Two central such paradigms are conditional logics and formal argumentation. Despite recent intensified efforts, the gap between
these two approaches has not been fully bridged yet. In this
paper, we contribute to the bridging of this gap by showing
how plausible conditionals can be interpreted in argumentative reasoning enviroments. In more detail, we provide interpretations of conditional knowledge bases in abstract dialectical frameworks, one of the most general approaches to computational models of argumentation. We motivate the design
choices made in our translation, show that different semantics give rise to several forms of adequacy, and show several
desirable properties of our translation.
1
Introduction
Different paradigms of modelling human-like reasoning behaviour have emerged over the years within the field of
Knowledge Representation and Reasoning. For one, conditional logics (Kraus, Lehmann, and Magidor 1990; Nute
1984) are a classical approach to non-monotonic reasoning that focus on the role of defeasible rules of the form
(φ|ψ) with the intuitive interpretation “if ψ is true then,
usually, φ is true as well”. There exist several sophisticated reasoning approaches (Goldszmidt and Pearl 1996;
Kern-Isberner 2001) that aim at resolving issues pertaining to contradictory rules. On the other hand, the more recent argumentative approaches (Atkinson et al. 2017) focus
on the role of arguments, i. e., derivations of claims involving multiple rules, and how to resolve issues between arguments with contradictory claims. In particular, the abstract
approach to formal argumentation (Dung 1995) has gained
quite some interest in the wider community. One of the most
general and expressive formalisms to abstract argumentation
are Abstract Dialectical Frameworks (ADFs) (Brewka et al.
2013), which model the acceptability of arguments via general acceptability functions.
In this paper we investigate the correspondence between
abstract dialectical frameworks and conditional logics. Syntactically, both frameworks focus on pairs of objects such
as (φ, ψ). In conditional logic, these pairs are interpreted as
2 Preliminaries
In the following, we briefly recall some general preliminaries on propositional logic, as well as technical details on conditional logic and ADFs (Brewka et al. 2013).
Copyright c 2019, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
2.1 Propositional Logic

For a set At of atoms let L(At) be the corresponding propositional language constructed using the usual connectives ∧ (and), ∨ (or), ¬ (negation) and → (material implication). We will sometimes write φ̊ to denote some element of {φ, ¬φ}. The set of literals is denoted by Lit = {φ̊ | φ ∈ At}. A (classical) interpretation (also called possible world) ω for a propositional language L(At) is a function ω : At → {⊤, ⊥}. Let Ω(At) denote the set of all interpretations for At. We simply write Ω if the set of atoms is implicitly given. An interpretation ω satisfies (or is a model of) an atom a ∈ At, denoted by ω |= a, if and only if ω(a) = ⊤. The satisfaction relation |= is extended to formulas as usual. As an abbreviation, we sometimes identify an interpretation ω with its complete conjunction, i. e., if a1, …, an ∈ At are those atoms that are assigned ⊤ by ω and an+1, …, am ∈ At are those atoms that are assigned ⊥ by ω, we identify ω with a1 … an ān+1 … ām (or any permutation of this). For example, the interpretation ω1 on {a, b, c} with ω1(a) = ω1(c) = ⊤ and ω1(b) = ⊥ is abbreviated by ab̄c. For Φ ⊆ L(At) we also define ω |= Φ if and only if ω |= φ for every φ ∈ Φ. Define the set of models Mod(X) = {ω ∈ Ω(At) | ω |= X} for every formula or set of formulas X. A formula or set of formulas X1 entails another formula or set of formulas X2, denoted by X1 ⊢ X2, if Mod(X1) ⊆ Mod(X2).

2.2 Reasoning with Nonmonotonic Conditionals

Conditional logics are concerned with conditionals of the form (φ|ψ) whose informal meaning is "if ψ is true then, usually, φ is true as well". A conditional knowledge base ∆ is a set of such conditionals. It is atomic if for every (φ|ψ) ∈ ∆, φ, ψ ∈ At, and it is literal if for every (φ|ψ) ∈ ∆, φ, ψ ∈ Lit. We will not count the constants ⊤ or ⊥ as atoms or literals. If for every (φ|ψ) ∈ ∆, φ, ψ ∈ Lit ∪ {⊤}, we say ∆ is an extended literal conditional knowledge base. There are many different conditional logics (cf., e. g., (Kraus, Lehmann, and Magidor 1990; Nute 1984)); we will just use basic properties of conditionals that are common to many conditional logics and are especially important for nonmonotonic reasoning. Basically, we follow the approach of de Finetti (de Finetti 1974), who considered conditionals as generalized indicator functions for possible worlds resp. propositional interpretations ω:

((ψ|φ))(ω) = 1 if ω |= φ ∧ ψ, 0 if ω |= φ ∧ ¬ψ, and u if ω |= ¬φ,   (1)

where u stands for unknown or indeterminate. In other words, a possible world ω verifies a conditional (ψ|φ) iff it satisfies both antecedent and conclusion ((ψ|φ)(ω) = 1); it falsifies, or violates, it iff it satisfies the antecedent but not the conclusion ((ψ|φ)(ω) = 0); otherwise the conditional is not applicable, i. e., the interpretation does not satisfy the antecedent ((ψ|φ)(ω) = u). We say that ω satisfies a conditional (ψ|φ) iff it does not falsify it, i. e., iff ω satisfies its material counterpart φ → ψ. Hence, conditionals are three-valued logical entities and thus extend the binary setting of classical logics substantially, in a way that is compatible with the probabilistic interpretation of conditionals as conditional probabilities. Such a conditional (ψ|φ) can be accepted as plausible if its verification φ ∧ ψ is more plausible than its falsification φ ∧ ¬ψ, where plausibility is often modelled by a total preorder on possible worlds. This is in full compliance with nonmonotonic inference relations φ |∼ ψ (Makinson 1988) expressing that from φ, ψ may be plausibly/defeasibly derived. An obvious implementation of total preorders are ordinal conditional functions (OCFs, also called ranking functions) κ : Ω → N ∪ {∞} (Spohn 1988). They express degrees of (im)plausibility of possible worlds and of propositional formulas φ by setting κ(φ) := min{κ(ω) | ω |= φ}. OCFs κ provide a particularly convenient formal environment for nonmonotonic and conditional reasoning, allowing for simply expressing the acceptance of conditionals and nonmonotonic inferences via stating that (ψ|φ) is accepted by κ iff φ |∼κ ψ iff κ(φ ∧ ψ) < κ(φ ∧ ¬ψ), implementing formally the intuition of conditional acceptance based on plausibility mentioned above. For an OCF κ, Bel(κ) denotes the propositional beliefs that are implied by all most plausible worlds, i. e. Bel(κ) = {φ | ∀ω ∈ κ⁻¹(0) : ω |= φ}. We write κ |= φ if φ ∈ Bel(κ).

Specific examples of ranking models are system Z, yielding the inference relation |∼Z (Goldszmidt and Pearl 1996), and c-representations (Kern-Isberner 2001). We discuss system Z, defined as follows. A conditional (ψ|φ) is tolerated by a finite set of conditionals ∆ if there is a possible world ω with (ψ|φ)(ω) = 1 and (ψ′|φ′)(ω) ≠ 0 for all (ψ′|φ′) ∈ ∆, i. e. ω verifies (ψ|φ) and does not falsify any (other) conditional in ∆. The Z-partitioning (∆0, …, ∆n) of ∆ is defined as:
• ∆0 = {δ ∈ ∆ | ∆ tolerates δ};
• ∆1, …, ∆n is the Z-partitioning of ∆ \ ∆0.
For δ ∈ ∆ we define Z∆(δ) = i iff δ ∈ ∆i and (∆0, …, ∆n) is the Z-partitioning of ∆. Finally, the ranking function κZ∆ is defined via κZ∆(ω) = max{Z∆(δ) | δ(ω) = 0, δ ∈ ∆} + 1, with max ∅ = −1. We can now define ∆ |∼Z φ iff ⊤ |∼κZ∆ φ (which can be seen to be equivalent to φ ∈ Bel(κZ∆)).

Below, the following lemma about system Z will prove useful:

Lemma 1. Let ω ∈ Ω and ∆ be a conditional knowledge base. Then ω ∉ (κZ∆)⁻¹(0) iff δ(ω) = 0 for some δ ∈ ∆.

Proof. This follows immediately in view of the fact that ω ∈ (κZ∆)⁻¹(0) iff δ(ω) ≠ 0 for every δ ∈ ∆.

We now illustrate OCFs in general and system Z in particular with the well-known "Tweety the penguin" example.

Example 1. Let ∆ = {(f|b), (b|p), (¬f|p)}, which expresses that most birds (b) fly (f), most penguins (p) are birds, and most penguins do not fly. This conditional knowledge base has the following Z-partitioning: ∆0 = {(f|b)} and ∆1 = {(b|p), (¬f|p)}. This gives rise to the following κZ∆-ordering over the worlds based on the signature {b, f, p}:

ω      κZ∆(ω)    ω      κZ∆(ω)    ω      κZ∆(ω)    ω      κZ∆(ω)
bpf    2         bp̄f    0         b̄pf    2         b̄p̄f    0
bpf̄    1         bp̄f̄    1         b̄pf̄    2         b̄p̄f̄    0

As an example of a κZ∆-belief, observe that ¬p, ¬(b ∧ ¬f) ∈ Bel(κZ∆).
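To make the definitions of tolerance and κZ∆ concrete, the following Python sketch (our own illustration, not code from the paper; names such as `z_partition` and `kappa_z` are ours) computes the Z-partitioning and ranking for the knowledge base of Example 1 by brute-force enumeration of worlds.

```python
# Sketch (our own illustration) of system Z for Delta = {(f|b), (b|p), (~f|p)}.
from itertools import product

ATOMS = ["b", "f", "p"]

def lit(name, positive=True):
    # a literal as a function from worlds (dicts atom -> bool) to booleans
    return lambda w: w[name] if positive else not w[name]

# a conditional (psi|phi) is stored as the pair (phi, psi)
DELTA = [
    (lit("b"), lit("f")),                  # (f|b)
    (lit("p"), lit("b")),                  # (b|p)
    (lit("p"), lit("f", positive=False)),  # (~f|p)
]

def verifies(w, c):
    phi, psi = c
    return phi(w) and psi(w)          # (psi|phi)(w) = 1

def falsifies(w, c):
    phi, psi = c
    return phi(w) and not psi(w)      # (psi|phi)(w) = 0

def worlds():
    for vals in product([True, False], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, vals))

def z_partition(delta):
    """Iteratively split off the tolerated conditionals: (Delta_0, Delta_1, ...)."""
    parts, rest = [], list(delta)
    while rest:
        tolerated = [c for c in rest
                     if any(verifies(w, c)
                            and not any(falsifies(w, d) for d in rest)
                            for w in worlds())]
        if not tolerated:
            raise ValueError("knowledge base is inconsistent")
        parts.append(tolerated)
        rest = [c for c in rest if c not in tolerated]
    return parts

def kappa_z(w, delta):
    """kappa_Z(w) = max{Z(d) | d(w) = 0} + 1, with the empty maximum being -1."""
    z = {id(c): i for i, part in enumerate(z_partition(delta)) for c in part}
    ranks = [z[id(c)] for c in delta if falsifies(w, c)]
    return max(ranks) + 1 if ranks else 0
```

For instance, `kappa_z({"b": True, "p": False, "f": True}, DELTA)` returns 0, matching the ordering in Example 1.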
2.3
Abstract Dialectical Frameworks
We briefly recall some technical details on abstract dialectical frameworks (ADF) following loosely the notation from
(Brewka et al. 2013). We can depict an ADF D as a directed
graph whose nodes represent statements or arguments which
can be accepted or not. With links we represent dependencies between nodes. The status of a node s depends on the status of the nodes with a direct link to s, its parent nodes parD(s).
With an acceptance function Cs we define the cases when
the statement s can be accepted (truth value ⊤), depending
on the acceptance status of its parents in D.
An ADF D is a tuple D = (S, L, C) where S is a set of
statements, L ⊆ S × S is a set of links, and C = {Cs }s∈S
is a set of total functions Cs : 2parD (s) → {⊤, ⊥} for each
s ∈ S with parD (s) = {s′ ∈ S | (s′ , s) ∈ L}. By abuse
of notation, we will often identify an acceptance function
Cs by its equivalent acceptance condition which models the
acceptable cases as a propositional formula.
An ADF D = (S, L, C) is interpreted through 3-valued
interpretations v : S → {⊤, ⊥, u}, which assign to each
statement in S either the value ⊤ (true, accepted), ⊥ (false,
rejected), or u (unknown).
A 3-valued interpretation v can be extended to arbitrary
propositional formulas over S via strong Kleene semantics:
1. v(¬φ) = ⊥ iff v(φ) = ⊤, v(¬φ) = ⊤ iff v(φ) = ⊥, and
v(¬φ) = u iff v(φ) = u;
2. v(φ ∧ ψ) = ⊤ iff v(φ) = v(ψ) = ⊤, v(φ ∧ ψ) = ⊥ iff v(φ) = ⊥ or v(ψ) = ⊥, and v(φ ∧ ψ) = u otherwise;
3. v(φ ∨ ψ) = ⊤ iff v(φ) = ⊤ or v(ψ) = ⊤, v(φ ∨ ψ) = ⊥ iff v(φ) = v(ψ) = ⊥, and v(φ ∨ ψ) = u otherwise.
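The three clauses above can be encoded directly; the following Python sketch (our own encoding, not from the paper) implements the strong Kleene connectives, with `True`/`False` for the classical values and `None` playing the role of u.

```python
# Strong Kleene connectives; None plays the role of the truth value u.
def k_not(x):
    return None if x is None else not x

def k_and(x, y):
    if x is False or y is False:   # falsity dominates conjunction
        return False
    if x is True and y is True:
        return True
    return None                    # otherwise the value is unknown

def k_or(x, y):
    if x is True or y is True:     # truth dominates disjunction
        return True
    if x is False and y is False:
        return False
    return None
```

For example, `k_and(True, None)` is `None` while `k_and(False, None)` is `False`, mirroring clause 2.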
Figure 1: Graph representing links between nodes for D in Example 2.
V consists of all three-valued interpretations whereas V2 consists of all the two-valued interpretations (i. e. interpretations such that for every s ∈ S, v(s) ∈ {⊤, ⊥}). Then v is a model of D if for all s ∈ S, if v(s) ≠ u then v(s) = v(Cs).

We define an order ≤i over {⊤, ⊥, u} by making u the minimal element: u <i ⊤ and u <i ⊥; this order is lifted pointwise as follows (given two valuations v, w over S): v ≤i w iff v(s) ≤i w(s) for every s ∈ S. So, intuitively, the classical truth values contain more information than the truth value u. The set of two-valued interpretations extending a valuation v is defined as [v]2 = {w ∈ V2 | v ≤i w}. Given a set of valuations V, ⊓iV(s) = v(s) if for every v′ ∈ V, v(s) = v′(s), and ⊓iV(s) = u otherwise. ΓD(v) : S → {⊤, ⊥, u} where s ↦ ⊓i{w(Cs) | w ∈ [v]2}.

For the definition of the stable model semantics, we need the reduct Dv of D given v, defined as Dv = (Sv, Lv, Cv) with:
• Sv = {s ∈ S | v(s) = ⊤},
• Lv = L ∩ (Sv × Sv), and
• Cv = {Cs[{φ | v(φ) = ⊥}/⊥] | s ∈ Sv},
where Cs[φ/ψ] is the formula obtained by substituting every occurrence of φ in Cs by ψ.

Definition 1. Let D = (S, L, C) be an ADF with v : S → {⊤, ⊥, u} an interpretation:
• v is a 2-valued model iff v ∈ V2 and v is a model.
• v is complete for D iff v = ΓD(v).
• v is preferred for D iff v is ≤i-maximally complete for D.
• v is grounded for D iff v is ≤i-minimally complete for D.
• v is stable iff v is a model of D and {s ∈ S | v(s) = ⊤} = {s ∈ S | w(s) = ⊤} where w is the grounded interpretation of Dv.

We denote by 2mod(D), complete(D), preferred(D) and stable(D) respectively the sets of 2-valued models and complete, preferred and stable interpretations of D. The grounded interpretation, which in (Brewka and Woltran 2010) is shown to be unique, will be denoted by vgD. If D is clear from the context we will just write vg. Notice that any complete interpretation is also a model.

We finally define consequence relations for ADFs:

Definition 2. Given sem ∈ {2mod, preferred, stable}, an ADF D = (S, L, C) and s ∈ L(S), we define: D |∼∩sem s [¬s] iff v(s) = ⊤ [⊥] for all v ∈ sem(D). D |∼grounded s [¬s] iff vgD(s) = ⊤ [⊥].

We illustrate ADFs by looking at a naive formalization of the penguin example in abstract dialectical argumentation:

Example 2. Let D = ({p, b, f}, L, C) with Cp = p, Cb = p and Cf = ¬p ∨ b. The corresponding graph for D can be found in Figure 1. This ADF has two two-valued models, which are also its preferred models: v1 with v1(p) = v1(b) = ⊥ and v1(f) = ⊤, and v2 with v2(p) = v2(b) = v2(f) = ⊤. The grounded interpretation assigns u to all nodes p, b and f.
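The ADF of Example 2 is small enough to check mechanically. The following Python sketch (our own brute-force helper, not the paper's code) enumerates its two-valued models, i. e. the two-valued interpretations v with v(s) = v(Cs) for every statement s.

```python
# Two-valued models of the ADF of Example 2: Cp = p, Cb = p, Cf = ~p v b.
from itertools import product

STATEMENTS = ["p", "b", "f"]
CONDITIONS = {
    "p": lambda v: v["p"],
    "b": lambda v: v["p"],
    "f": lambda v: (not v["p"]) or v["b"],
}

def two_valued_models():
    """A two-valued v is a model iff v(s) equals v(Cs) for every statement s."""
    models = []
    for vals in product([True, False], repeat=len(STATEMENTS)):
        v = dict(zip(STATEMENTS, vals))
        if all(v[s] == CONDITIONS[s](v) for s in STATEMENTS):
            models.append(v)
    return models
```

Running `two_valued_models()` yields exactly the two models v1 and v2 of Example 2.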
3 Interpreting Conditionals in ADFs
In (Heyninck, Kern-Isberner, and Thimm 2020) we looked
at the problem of translating an ADF into a conditional logic
knowledge base. We now look at the complementary question, namely translating a conditional logic knowledge base
into an ADF. These two translations will help to better understand the connection between argumentation and reasoning from conditional knowledge bases.
In this section, we present an interpretation of conditional
knowledge bases into abstract dialectical frameworks. In
Section 3.1 we introduce the language used for translating
knowledge bases and formulate several notions of adequacy
used for evaluating our translation. The translation itself is presented in Section 3.2. In Section 3.3 we discuss the use of
the newly introduced negation, whereafter we show the adequacy of our translation under two-valued (Section 3.4) and
other semantics (Section 3.5). Thereafter, we discuss how to
translate normality statements in Section 3.6 and finally we
discuss properties of the translation in Section 3.7.
3.1 Translations of Conditionals into ADFs

To obtain an adequate translation, it will prove useful to extend the language with a new atomic negation operator ∼. We denote the set of atoms negated by this new negation by Ãt = {φ̃ | φ ∈ At}, and set Lit∼(At) = At ∪ Ãt. When At is clear from the context, we will sometimes just write Lit∼. It will prove useful to define the following notions:

Definition 3. We define the functions −· : Lit∼ → Lit∼, x·y : Lit∼ → Lit and p·q : Lit → Lit∼ with:
• −φ = φ̃ if φ ∈ At, and −φ = ψ if φ = ψ̃ for some ψ ∈ At;
• xφy = φ if φ ∈ At, and xφy = ¬ψ if φ = ψ̃ for some ψ ∈ At;
• pφq = φ if φ ∈ At, and pφq = ψ̃ if φ = ¬ψ for some ψ ∈ At.

Let Cli(At) be the set of all literal conditional knowledge bases over At and D(Lit∼(At)) the set of all ADFs defined on the basis of Lit∼(At) (i. e. D = (Lit∼(At), L, C)). In this paper, we consider translations D : Cli(At) → D(Lit∼(At)), and in particular translations which preserve the meaning of the translated knowledge base ∆. In more detail, we will use two notions of adequacy to evaluate translations.

The first notion is respecting ∆ and is based on de Finetti's conception of conditionals as generalized indicator functions for worlds described above. Indeed, given a conditional knowledge base ∆ we can straightforwardly extend (de Finetti 1974)'s notion of conditionals as generalized indicator functions to worlds in Ω(Lit∼(At(∆))). In more detail, for such an ω ∈ Ω(Lit∼(At(∆))), we define:

((ψ|φ))(ω) = 1 if ω |= pφq ∧ pψq, 0 if ω |= pφq ∧ −pψq, and u if ω |= ¬pφq.

We will say that an interpretation ω ∈ Ω(Lit∼(At(∆))) respects ∆ if δ(ω) ≠ 0 for any δ ∈ ∆.

The second notion of adequacy is stronger and requires equivalence on the level of the non-monotonic inference relation. In more detail, we say a translation D is inferentially equivalent w.r.t. an ADF-based inference relation |∼ if for any conditional knowledge base ∆: ∆ |∼Z φ iff D(∆) |∼ φ. (An ADF-based inference relation is a relation |∼ ⊆ D(S) × L(S); examples of such inference relations are those defined in Definition 2.) Clearly, inferential equivalence w.r.t. |∼sem (for some semantics sem) of a translation D : Cli(At) → D(Lit∼(At)) implies that all the interpretations in sem(D(∆)) respect ∆ for any literal conditional knowledge base ∆.

3.2 Translation D1

The guiding idea behind our first translation is that given a conditional (p|q), what we take into account is the following behaviour: if q is believed then p should be believed. Now one way to translate this into ADFs is to have q as a positive or supporting link for p. Another way to formalize this idea, however, is to require that q can be believed only if so is p, i. e. {Cq} ⊢ p. In other words, the consequent p is a supporting link of the antecedent q. We will here explore the latter idea and show in Section 4.2 that the former idea leads to inadequate translations.

We are now ready to define our translation D1 from conditional knowledge bases into ADFs.

Definition 4. Given a literal conditional knowledge base ∆, we define D1(∆) = (Lit∼(At(∆)), L, C) where Cφ = ¬−φ ∧ ⋀(ψ|xφy)∈∆ pψq for any φ ∈ {ψ, ψ̃ | ψ ∈ At(∆)}.

Given a literal φ ∈ Lit∼(At(∆)), the intuition behind Cφ is the following. The first part ¬−φ ensures that ∼ behaves like a negation by ensuring that the contrary −φ of φ is not believed when φ is believed. The second part of the condition Cφ, ⋀(ψ|xφy)∈∆ pψq, ensures that conditionals are interpreted adequately. In more detail, it ensures that φ is only believed if for every conditional (ψ|xφy) which has φ as an antecedent (modulo transformation to the original language Lit), the consequent pψq is believed (again, modulo transformation into the extended language Lit∼).

Notice that for any φ ∈ At, the conditions can be equivalently written as (where ψ is an atom):
• Cφ = ¬φ̃ ∧ ⋀(ψ̊|φ)∈∆ pψ̊q.
• Cφ̃ = ¬φ ∧ ⋀(ψ̊|¬φ)∈∆ pψ̊q.

We illustrate our translation by first looking at the Tweety example:

Example 3. ∆ = {(f|b), (b|p), (¬f|p)}. The following nodes are part of the ADF: {b, b̃, f, f̃, p, p̃}. We have the following conditions:
• Cb = ¬b̃ ∧ f
• Cp = ¬p̃ ∧ b ∧ f̃
• Cx = ¬−x for x ∈ {f, f̃, b̃, p̃}.
The corresponding graph can be found in Figure 2.

Figure 2: Graph representing the links between nodes of D1(∆) in Example 3.

We can read this as follows: b can be believed whenever it is not believed that b̃ (i. e. nothing is both a bird and a not-bird) and it is believed that f (i. e. something is a bird only if it flies). Argumentatively, b̃ attacks b and f supports b. Likewise, b and f̃ support p (whereas p̃ attacks p).
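The translation of Example 3 is easy to check mechanically. The following Python sketch (our own encoding; the helper names are ours) builds the acceptance conditions of D1(∆), writing the ∼-negation of an atom a as the string "a~", and enumerates the two-valued models by brute force.

```python
# Brute-force check of D1(Delta) for Delta = {(f|b), (b|p), (~f|p)}.
from itertools import product

ATOMS = ["b", "f", "p"]
NODES = ATOMS + [a + "~" for a in ATOMS]

def neg(node):
    """The function -(.), swapping a node with its tilde-negation."""
    return node[:-1] if node.endswith("~") else node + "~"

# conditionals as (antecedent, consequent), both already mapped into Lit~
DELTA = [("b", "f"), ("p", "b"), ("p", "f~")]

def condition(node, v):
    """C_node: the contrary of node is not believed, and the consequent of
    every conditional whose antecedent is node is believed."""
    ok = not v[neg(node)]
    for ante, cons in DELTA:
        if ante == node:
            ok = ok and v[cons]
    return ok

def two_valued_models():
    models = []
    for vals in product([True, False], repeat=len(NODES)):
        v = dict(zip(NODES, vals))
        if all(v[n] == condition(n, v) for n in NODES):
            models.append(v)
    return models
```

This yields exactly three two-valued models, all of which assign ⊥ to p.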
D1(∆) has the following two-valued models:

i   vi(b)   vi(b̃)   vi(f)   vi(f̃)   vi(p)   vi(p̃)
1   ⊤       ⊥       ⊤       ⊥       ⊥       ⊤
2   ⊥       ⊤       ⊤       ⊥       ⊥       ⊤
3   ⊥       ⊤       ⊥       ⊤       ⊥       ⊤
Example 6. ∆ = {(¬p|q), (¬p|¬q)}. We have D1(∆) = ({p, q, p̃, q̃}, L, C) with Cq = ¬q̃ ∧ p̃, Cq̃ = ¬q ∧ p̃, Cp = ¬p̃ and Cp̃ = ¬p. This ADF has the following two-valued models:

i   vi(p)   vi(p̃)   vi(q)   vi(q̃)
1   ⊥       ⊤       ⊥       ⊤
2   ⊥       ⊤       ⊤       ⊥
3   ⊤       ⊥       ⊥       ⊥

Notice that v3 is a two-valued model since v3(p̃) = ⊥ and thus v3(Cq) = v3(Cq̃) = ⊥. This two-valued model interprets ∼ as an incomplete negation (i. e. there might be ∼-gaps), since both q and q̃ are false in v3.
Notice that these two-valued models correspond to the most plausible worlds according to κZ∆ (see Example 1).
Another benchmark example well-known from the literature is the so-called Nixon diamond, where equally plausible
rules lead to mutually inconsistent conclusions.
Example 4 (The Nixon Diamond). Let ∆ = {(p|q), (¬p|r)}. Then D1(∆) = ({p, p̃, q, q̃, r, r̃}, L, C) with:
• Cq = ¬q̃ ∧ p
• Cr = ¬r̃ ∧ p̃
• Cx = ¬−x for x ∈ {p, p̃, q̃, r̃}
2mod(D1(∆)) = {v1, v2, v3, v4} with:

i   vi(q)   vi(q̃)   vi(r)   vi(r̃)   vi(p)   vi(p̃)
1   ⊤       ⊥       ⊥       ⊤       ⊤       ⊥
2   ⊥       ⊤       ⊤       ⊥       ⊥       ⊤
3   ⊥       ⊤       ⊥       ⊤       ⊥       ⊤
4   ⊥       ⊤       ⊥       ⊤       ⊤       ⊥
It can be observed that (κZ∆)⁻¹(0) = {pqr̄, p̄q̄r, p̊q̄r̄}. As in the previous example, 2mod(D1(∆)) corresponds to (κZ∆)⁻¹(0). In Section 3.4, we will see that the correspondence between 2mod(D1(∆)) and (κZ∆)⁻¹(0) in the above examples is no coincidence.

However, for any literal knowledge base ∆ and any two-valued model ω of D1(∆), ∼ is consistent in ω (i. e. there are no ∼-gluts):

Proposition 1. Let a literal conditional knowledge base ∆, some φ ∈ At(∆), and ω ∈ 2mod(D1(∆)) be given. Then ω(φ) = ⊤ implies ω(φ̃) = ⊥ and ω(φ̃) = ⊤ implies ω(φ) = ⊥.

Proof. Suppose ∆ is a literal conditional knowledge base, φ ∈ At(∆) and ω ∈ 2mod(D1(∆)). Suppose now ω(φ) = ⊤. Since ω ∈ 2mod(D1(∆)), ω(φ) = ω(Cφ). Since Cφ = ¬φ̃ ∧ ⋀(ψ|φ)∈∆ pψq, ω(Cφ) = ⊤ implies ω(¬φ̃) = ⊤, i. e. ω(φ̃) = ⊥. The case for ω(φ̃) = ⊤ is analogous.

3.3 Properties of ∼
Before discussing the adequacy of the translation D1, it is important to ask whether ∼ fulfils some well-known properties of negations, such as completeness and consistency. Completeness of ∼ in an interpretation ω means that for every φ ∈ At, at least one of φ and φ̃ is true in ω, whereas consistency in an interpretation ω means that at most one of φ and φ̃ is true in ω (for any φ ∈ At).

Definition 5. Given ω ∈ Ω(At ∪ Ãt), we say ∼ is:
• complete in ω if for all φ ∈ At, ω(φ) = ⊤ or ω(φ̃) = ⊤.
• consistent in ω if for all φ ∈ At, ω(φ) = ⊥ or ω(φ̃) = ⊥.

We can illustrate these definitions with a simple example:

Example 5. Consider the following interpretations of {p, p̃}:

i   vi(p)   vi(p̃)   is vi consistent?   is vi complete?
1   ⊥       ⊥       yes                 no
2   ⊥       ⊤       yes                 yes
3   ⊤       ⊤       no                  yes
4   u       u       no                  no

We first observe that there exist knowledge bases ∆ for which there are two-valued models ω of D1(∆) s.t. ∼ is not complete in ω, as witnessed by Example 6.

3.4 Adequacy of Translation D1

We first show that two-valued models of D1(∆) respect ∆:

Proposition 2. Let a literal conditional knowledge base ∆, ω ∈ 2mod(D1(∆)) and (φ|ψ) ∈ ∆ be given. Then ω(pψq) = ⊤ implies ω(pφq) = ⊤.

Proof. Suppose that ω ∈ 2mod(D1(∆)) and let (φ|ψ) ∈ ∆. Suppose that ω(pψq) = ⊤. We assume first that ψ, φ ∈ At. Since Cψ = ¬ψ̃ ∧ ⋀(φ′|ψ)∈∆ pφ′q and (φ|ψ) ∈ ∆,

Cψ = ¬ψ̃ ∧ φ ∧ ⋀(φ′|ψ)∈∆\{(φ|ψ)} pφ′q

and thus Cψ ⊢ φ. Since ω ∈ 2mod(D1(∆)), ω(ψ) = ω(Cψ) = ⊤. Since Cψ ⊢ φ, this means ω(φ) = ⊤. Since φ ∈ At, this implies ω(pφq) = ⊤. The other cases are analogous.

Corollary 1. Let a literal conditional knowledge base ∆ be given. Then any ω ∈ 2mod(D1(∆)) respects ∆.

Proof. By Proposition 2, for any ω ∈ 2mod(D1(∆)) and any (φ|ψ) ∈ ∆, ω(pψq) = ⊥ or ω(pψq ∧ pφq) = ⊤, which implies that ω((φ|ψ)) ≠ 0.

We can now easily show that every two-valued model of D1(∆) corresponds to a maximally plausible world. We first have to define a function that allows us to associate two-valued models in the language using ∼ with the worlds in Ω(At) (and vice versa).
Definition 6. Where ω ∈ Ω(Lit∼(At)) and ∼ is complete in ω, we define ω↓ ∈ Ω(At) as the world such that for every φ ∈ At: ω↓(φ) = ⊤ if ω(φ) = ⊤, and ω↓(φ) = ⊥ if ω(φ̃) = ⊤. Where ω ∈ Ω(At), we define ω↑ ∈ Ω(Lit∼) as the world such that for every φ ∈ At: ω↑(φ) = ⊤ and ω↑(φ̃) = ⊥ iff ω(φ) = ⊤, and ω↑(φ̃) = ⊤ and ω↑(φ) = ⊥ iff ω(φ) = ⊥. Given some ADF D, we define D |∼∩,c2mod φ iff ω(φ) = ⊤ for every ω ∈ 2mod(D) for which ∼ is complete in ω.

Fact 1. For any ω ∈ Ω, ∼ is complete in ω↑.

We can now show the correspondence between ∼-complete two-valued models and maximally plausible worlds.

Proposition 3. Let a literal conditional knowledge base ∆ and an ω ∈ 2mod(D1(∆)) for which ∼ is complete in ω be given. Then κZ∆(ω↓) = 0.

Proof. Suppose ∆ is a literal conditional knowledge base, ω ∈ 2mod(D1(∆)) and ∼ is complete in ω. Let (φ|ψ) ∈ ∆ and suppose ω↓ |= ψ. By Definition 6, this implies ω |= pψq. With Proposition 2, this implies that ω |= pφq. Again with Definition 6, this implies ω↓ |= φ. Thus, we have established that if ω ∈ 2mod(D1(∆)) and ∼ is complete in ω, then ω↓ ̸|= ψ ∧ ¬φ for any (φ|ψ) ∈ ∆, i. e. ((φ|ψ))(ω↓) ≠ 0 (for any (φ|ψ) ∈ ∆). With Lemma 1 this means κZ∆(ω↓) = 0.

Lemma 2. Let a literal conditional knowledge base ∆ and some ω ∈ Ω be given. If κZ∆(ω) = 0 then ω↑ ∈ 2mod(D1(∆)).

Proof. Let a literal conditional knowledge base ∆ and some ω ∈ Ω with κZ∆(ω) = 0 be given. Consider some φ ∈ Lit∼. We show that ω↑ |= φ iff ω↑ |= Cφ, which implies that ω↑ is a two-valued model of D1(∆). For this, suppose first that ω↑ |= φ and suppose towards a contradiction that ω↑ |= ¬Cφ, i. e. ω↑ |= −φ ∨ ¬⋀(ψ|xφy)∈∆ pψq. With Proposition 1 and since ω↑ |= φ, ω↑ ̸|= −φ, which implies ω↑ |= ¬⋀(ψ|xφy)∈∆ pψq, i. e. there is some (ψ|xφy) ∈ ∆ s.t. ω↑ |= ¬pψq. By definition of ω↑, this implies that ω |= ¬ψ. But then ω |= xφy ∧ ¬ψ for some (ψ|xφy) ∈ ∆, a contradiction to κZ∆(ω) = 0. Suppose now (again towards a contradiction) that ω↑ |= Cφ and ω↑ ̸|= φ. By Fact 1, ω↑ ̸|= φ implies ω↑ |= −φ. Since Cφ = ¬−φ ∧ ⋀(ψ|xφy)∈∆ pψq, this contradicts ω↑ |= Cφ.

Fact 2. Let some ω ∈ Ω(Lit∼) s.t. ∼ is complete in ω and some φ ∈ At be given. Then ω |= φ̃ iff ω |= ¬φ.

Proof. Suppose first ω |= φ̃. By Proposition 1, ω ̸|= φ and thus ω |= ¬φ. Suppose now that ω |= ¬φ. Since ∼ is complete in ω, ω |= φ̃.

Lemma 3. Let some ω ∈ Ω(Lit∼) s.t. ∼ is complete in ω and some φ ∈ L(At) be given. Then ω↓ |= φ iff ω |= φ.

Proof. We show this by showing the claim for any φ ∈ L(At) in disjunctive normal form, i. e. φ = ⋁ni=1 ⋀mj=1 φ̊ij. Suppose ω↓ |= φ, i. e. there is some 1 ≤ i ≤ n s.t. ω↓ |= ⋀mj=1 φ̊ij. By Fact 2 and Definition 6, this implies ω |= ⋀mj=1 φ̊ij and thus ω |= ⋁ni=1 ⋀mj=1 φ̊ij. The other direction is analogous.

Theorem 1. Given a literal conditional knowledge base ∆, ∆ |∼Z φ iff D1(∆) |∼∩,c2mod φ.

Proof. Suppose first that ∆ |∼Z φ, i. e. for every ω ∈ Ω s.t. κZ∆(ω) = 0, ω |= φ. Take now some ω ∈ 2mod(D1(∆)) s.t. ∼ is complete in ω. With Proposition 3, κZ∆(ω↓) = 0 and thus ω↓ |= φ. With Lemma 3, also ω |= φ. Thus, we have shown that for any ω ∈ 2mod(D1(∆)) s.t. ∼ is complete in ω, ω |= φ, which implies D1(∆) |∼∩,c2mod φ.

Suppose now that D1(∆) |∼∩,c2mod φ, i. e. for every ω ∈ 2mod(D1(∆)) s.t. ∼ is complete in ω, ω |= φ. Take now some ω ∈ Ω(At) s.t. κZ∆(ω) = 0. With Lemma 2, ω↑ ∈ 2mod(D1(∆)), and with Fact 1, ∼ is complete in ω↑. Thus, ω↑ |= φ. With Lemma 3, this implies that ω |= φ. Thus we have shown that for every ω ∈ Ω(At), κZ∆(ω) = 0 implies ω |= φ, which implies that ∆ |∼Z φ.

3.5 Other Semantics

In this section we show that other semantics also respect ∆. We first investigate the two-valued stable semantics and then move to the three-valued complete, preferred and grounded semantics.

Stable Semantics We first notice that not every two-valued model of D1(∆) is stable:

Example 7. Let ∆ = {(p|q), (q|p)}. Then D1(∆) = ({p, q, p̃, q̃}, L, C) with Cp = ¬p̃ ∧ q, Cq = ¬q̃ ∧ p and Cx̃ = ¬x for any x ∈ {p, q}. Notice that ω with ω(p) = ω(q) = ⊤ and ω(p̃) = ω(q̃) = ⊥ is a two-valued model of D1(∆). It is, however, not stable. To see this, notice that (D1(∆))ω = ({p, q}, L, Cω) with Cpω = ⊤ ∧ q and Cqω = ⊤ ∧ p. The grounded extension v of (D1(∆))ω assigns v(p) = v(q) = u.

Furthermore, stable models might be incomplete w.r.t. ∼, just like the two-valued models:

Example 8. Recall the conditional knowledge base from Example 6. There, v3 ∈ 2mod(D1(∆)) with v3(p) = ⊤ and v3(p̃) = v3(q) = v3(q̃) = ⊥. We have (D1(∆))v3 = ({p}, L, Cv3) with Cp = ¬⊥. Since the grounded extension v of (D1(∆))v3 = ({p}, L, Cv3) assigns v(p) = ⊤, we see that v3 is stable. As was argued in Example 6, ∼ is incomplete in v3.

However, we can make some immediate observations about the stable models of D1(∆). We first recall the following result:
Theorem 2 ((Brewka et al. 2017, Theorem 3.1)). For any
ADF D, stable(D) ⊆ 2mod(D).
It follows from Theorem 2 and Proposition 2 that every stable model of D1(∆) for which ∼ is complete respects ∆:

Proposition 4. Let a literal conditional knowledge base ∆ and some (φ|ψ) ∈ ∆ be given. Then for any ω ∈ stable(D1(∆)), if ω |= pψq then ω |= pφq.
We can furthermore show that any stable model of D1(∆) is maximally plausible according to κZ∆ (modulo the ↓-transformation):

Proposition 5. Let a literal conditional knowledge base ∆ and an ω ∈ stable(D1(∆)) for which ∼ is complete be given. Then κZ∆(ω↓) = 0.
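The failure of stability in Example 7 can be checked computationally. The following Python sketch (our own brute-force code; the helper names `gamma` and `grounded` are ours) computes the grounded interpretation of the reduct from Example 7 by iterating the operator ΓD from the everywhere-u valuation.

```python
# Example 7: the reduct of D1({(p|q), (q|p)}) w.r.t. v(p) = v(q) = T has
# statements {p, q} with the simplified conditions Cp = q and Cq = p.
from itertools import product

def gamma(statements, conditions, v):
    """One application of Gamma_D to a 3-valued v (None plays the role of u)."""
    new = {}
    for s in statements:
        unknowns = [x for x in statements if v[x] is None]
        vals = set()
        for bits in product([True, False], repeat=len(unknowns)):
            w = dict(v)
            w.update(zip(unknowns, bits))
            vals.add(conditions[s](w))  # value of C_s in a two-valued extension of v
        # consensus over all two-valued extensions, u if they disagree
        new[s] = vals.pop() if len(vals) == 1 else None
    return new

def grounded(statements, conditions):
    """Iterate Gamma_D from the everywhere-u valuation up to its least fixpoint."""
    v = {s: None for s in statements}
    while True:
        nxt = gamma(statements, conditions, v)
        if nxt == v:
            return v
        v = nxt

reduct_statements = ["p", "q"]
reduct_conditions = {"p": lambda w: w["q"], "q": lambda w: w["p"]}
```

Here `grounded(reduct_statements, reduct_conditions)` leaves both p and q unknown, so the candidate model of Example 7 (which sets both to ⊤) is not stable.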
Three-Valued Semantics For all of the well-known three-valued semantics, we can show (just like for the two-valued and stable models) that any corresponding interpretation of the translation D1(∆) respects ∆ (thus generalizing Proposition 2):

Proposition 6. Let a literal conditional knowledge base ∆ and a model v ∈ V of D1(∆) be given. Then for any (φ|ψ) ∈ ∆, if v(pψq) = ⊤ then v(pφq) = ⊤.

Proof. Suppose that v ∈ V is a model and let (φ|ψ) ∈ ∆. Suppose that v(pψq) = ⊤. Since v is a model, v(pψq) = ⊤ implies v(Cpψq) = ⊤. Since (φ|ψ) ∈ ∆, Cpψq = ¬−pψq ∧ pφq ∧ ⋀(φ′|ψ)∈∆\{(φ|ψ)} pφ′q, and thus v(Cpψq) = ⊤ implies v(pφq) = ⊤.

Corollary 2. Let a literal conditional knowledge base ∆ and some (φ|ψ) ∈ ∆ be given. Then:
1. For any sem ∈ {complete, preferred} and v ∈ sem(D1(∆)), v respects ∆.
2. vg (the grounded interpretation of D1(∆)) respects ∆.

3.6 Extended Literal Conditional Knowledge Bases

Since in our translation D1 a conditional (φ|ψ) results in a support link from φ to ψ, it is not immediately clear how to translate a normality statement of the form (φ|⊤), among others since ⊤ will not correspond to a node in the ADF. We circumvent this problem by modelling normality statements (φ|⊤) by requiring that −pφq is not believed, i. e. by setting C−pφq = ⊥. This results in the following translation for extended literal conditional knowledge bases:

Definition 7. Given an extended literal conditional knowledge base ∆, we define D1elcb(∆) = (Lit∼(At(∆)), L, C) where, for any φ ∈ Lit∼(At(∆)):
Cφ = ⊥ if there is some (x−φy|⊤) ∈ ∆, and Cφ = ¬−φ ∧ ⋀(ψ|xφy)∈∆ pψq otherwise.

We notice that the first case can be expanded into the following form (where φ ∈ At):
• Cφ = ⊥ if there is some (¬φ|⊤) ∈ ∆
• Cφ̃ = ⊥ if there is some (φ|⊤) ∈ ∆

We illustrate D1elcb(∆) with an example:

Example 9. Let ∆ = {(p|⊤), (q|p)}. Then D1elcb(∆) = ({p, p̃, q, q̃}, L, C) with Cp = ¬p̃ ∧ q, Cp̃ = ⊥ and Cx = ¬−x for any x ∈ {q, q̃}. We have two two-valued models, v1 and v2, with v1(p) = v1(q) = ⊤, v1(p̃) = v1(q̃) = ⊥, and v2(q̃) = ⊤, v2(p) = v2(q) = v2(p̃) = ⊥. Even though this option gives rise to an interpretation, v2, in which ∼ is incomplete, there is no two-valued interpretation of D1elcb(∆) that falsifies any rule in ∆. This is no coincidence, as we show below.

Figure 3: Graph representing the links between nodes of D1elcb(∆) in Example 9.

We now show the adequacy of D1elcb for extended literal knowledge bases:

Proposition 7. Let an extended literal conditional knowledge base ∆ and an ω ∈ 2mod(D1elcb(∆)) for which ∼ is complete in ω be given. Then κZ∆(ω↓) = 0.

Proof. Suppose ∆ is an extended literal conditional knowledge base and ∼ is complete in ω. We show that ω↓ ̸|= ψ ∧ ¬φ for any (φ|ψ) ∈ ∆, which with Lemma 1 implies the proposition. We show the claim for ψ = ⊤, since the case where ψ ≠ ⊤ is identical to the proof of Proposition 3. Thus consider (φ|⊤) ∈ ∆. Since this means with Definition 7 that C−pφq = ⊥, and ∼ is complete in ω, ω |= pφq. With Definition 6, this means ω↓ |= φ.

An analogous adequacy result for the stable models of D1elcb(∆) follows from Theorem 2 and Proposition 7.

Proposition 8. Let an extended literal conditional knowledge base ∆ and an ω ∈ Ω(At) be given. If κZ∆(ω) = 0 then ω↑ ∈ 2mod(D1elcb(∆)).

Proof sketch. Suppose that φ ∈ {ψ, ψ̃ | ψ ∈ At} and there is some (x−φy|⊤) ∈ ∆ (and thus Cφ = ⊥) and ω↑ |= φ. Since κZ∆(ω) = 0, (x−φy|⊤) ∈ ∆ implies that ω |= x−φy, which with Definition 6 implies ω↑ |= −φ, contradicting ω↑ |= φ and Proposition 1. Thus, for any φ ∈ {ψ, ψ̃ | ψ ∈ At} for which there is some (x−φy|⊤) ∈ ∆: ω↑ |= φ iff ω↑ |= Cφ. The other case is identical to the proof of Lemma 2.

The proof of the following theorem, stating the inferential equivalence of D1elcb w.r.t. |∼∩,c2mod, is completely analogous to the proof of Theorem 1:

Theorem 3. Given an extended literal conditional knowledge base ∆, ∆ |∼Z φ iff D1elcb(∆) |∼∩,c2mod φ.

The reader might wonder why we did not simply set Cφ = ⊤ for any (φ|⊤) ∈ ∆. This would result in an inadequate translation, since any information about conditionals with φ as an antecedent would be removed from the ADF, as illustrated by the following example.
79
for example, p q ∈ 2mod(D2 ({(p|¬q)}) since (p|¬q) is not
taken into account in Cq . We could propose making the following adjustment to avoid this:
Definition 9. Given a literal conditional knowledge
base ∆,
V
we let D3 (∆) = (At(∆), L, C) where: Cφ = (ψ|φ)∈∆ ψ ∧
V
(ψ|¬φ)∈∆ ¬ψ if there is some (ψ|φ) ∈ ∆ or some
(ψ|¬φ) ∈ ∆ and Cφ = φ otherwise.
However, since 2mod(D3 ({(q|p), (q|¬p)}) = {q̊p},
this also results in an inadequate translation, since
((q|¬p))(qp) = 0 and thus κZ
∆ (qp) = 1. A third option
would be to take:
Definition 10. Given a literal conditional knowledge base
∆,
V we let D4V(∆) = (At(∆), L, C) where: Cφ =
(ψ|¬φ)∈∆ ¬ψ if there is some (ψ|φ) ∈ ∆
(ψ|φ)∈∆ ψ ∨
or some (ψ|¬φ) ∈ ∆ and Cφ = φ otherwise.
Notice that 2mod(D4 ({(q|p), (s|¬p)}) contains pqs.
Since ((q|p))(pqs) = 0, this means D4 is not an adequate
translation. There are, of course, some other variations possible, which do, however, lead to similar inadequacies. We
hope to have convinced the reader of the fact that any translation which is based purely on the syntax of conditional
knowledge bases does require a second negation.3
Example 10 (Example 9 continued). We consider ∆ =
{(p|⊤), (q|p)} (as in Example 9). If we translated this
knowledge base using D1 and by in addition setting Cp =
⊤ from above, we get: D′ (∆) = ({p, pe, q, qe}, L, C) with
Cp = ⊤ and Cx = ¬ − x for x ∈ {e
p, q, qe}. In that case,
there are two two-valued models, v3 and v4 with: v3 (p) =
v1 (q) = ⊤, v3 (e
p) = v3 (e
q ) = ⊥, v4 (p) = v4 (e
q) = ⊤
and v4 (e
p) = v4 (q) = ⊥. In that case, there is a (complete)
two-value model, namely v2 , that validates p but not q, even
though (q|p) ∈ ∆ (in fact, (q|p) is even in ∆0 ).
3.7
Properties of the Translation
(Gottlob 1994) proposed several desirable properties for
translations between non-monotonic formalisms like adequacy, polynomiality and modularity. In Section 3.4 we already discussed adequacy in-depth and we have shown, that
our translation is adequate on the level of beliefs for all semantics and for any extended literal knowledge base.
A translation satisfies polynomiality if the translation is
calculable with reasonable bounds. It is easy to see, that our
translation is polynomial in the length of the translated conditional knowledge base.
For modularity we follow the formulation of (Strass 2013)
for a translation from ADFs to a target formalism, even
though modularity was originally defined for translations between circumscription and default logic (Imielinski 1987).
In other words modular means that “local” changes in the
translated conditional knowledge base results in “local”
changes in the translation. A minimal notion of modularity
would be that if we have to syntactically disjoint conditional
knowledge bases ∆1 and ∆2 , then changes in ∆1 will result
only in changes to Cs for some s ∈ Lite (At(∆1 )). Clearly
the translation presented in this paper is modular.
The biggest downside of this translation is the fact, that it
is not language-preserving since we use a language extension in this translation to construct the ADFs.
Finally, it is clear, that this translation is syntax-based, in
the sense that the translation D1 (∆) can be derived purely
on the basis of the logical form of the knowledge base ∆.
4
4.2
One guiding idea behind our translation D1 is that, relative
to a conditional knowledge base ∆, a node φ ∈ Lite can be
believed only if for every conditional (ψ|xφy) ∈ ∆, pψq is
believed. In other words, the links go from the consequent
pψq to the antecedent φ. One might wonder if adequacy is
preserved when we let the links between nodes run from antecedent to consequent. Such an alternative translation could
be the following:
Definition 11. Given a literal conditional knowledge base
∆, we define: D5 (∆) = ({φ, φe | φ ∈ At(∆)}, L, C) where
W
Cφ = ¬φe ∧ (xφy|ψ)∈∆ pψq for any φ ∈ Lite .
This translation is not adequate, however:
Example 11. Let ∆ = {(p|q), (¬p|s)}. Then D5 (∆) =
({p, pe, q, qe, s, se}, L, C) with: Cp = ¬e
p ∧ q, Cpe = ¬p ∧ s,
Cx = ¬ − x for any x ∈ {q, qe, s, se}. We depicted the corresponding graph in Figure 4.
Consider v(q) = v(s) = v(e
p) = ⊤ and v(e
q ) = v(e
s) =
v(p) = ⊥. Then v is a two-valued model of D3 (∆) (indeed,
observe that v(Cp ) = v(¬e
p ∧ q) = ⊥ since v(e
p) = ⊤).
However, notice that κZ
∆ (pqs) = 1 since ((p|q))(pqs) = 0.
Thus, two-valued models of D5 (∆) might not correspond to
Design Choices
In this section we motivate some important design choices
underlying our translation D1 , especially the extension of
the language to include the negation e, the direction of supporting links resulting from conditionals (φ|ψ) in the translated conditional knowledge base and the restriction to literal
conditional knowledge bases.
4.1
Antecedents as Partial Sufficient Conditions
The necessity of e
The critical reader might wonder, given that ADFs allow for
the negation ¬ to be used in formulating acceptance conditions for nodes, if a second negation e is really needed?
Indeed, a first proposal for a translation avoiding e would be
the following:
Definition 8. Given a literal conditional knowledge
base ∆,
V
we let D2 (∆) = (At(∆), L, C) where: Cφ = (ψ|φ)∈∆ ψ if
there is some (ψ|φ) ∈ ∆. and Cφ = φ otherwise.
Such a translation would be inadequate since conditionals
with negative antecedents are not taken into account. Thus,
3
Since ADFs under two-valued model semantics are equiexpressive with propositional logic (Strass 2014), it is not hard
to come up with a translation that is adequate. For example, it is
straightforward to show the adequacy (under two-valued semantics) of the following translation. Let D⋆ (∆) = (Atoms(∆), L, C)
with:
_
^
_
Cφ =
ω∧
¬ω∨
ω
κZ
∆ (ω)=0 and ω|=φ
κZ
∆ (ω)>0 and ω|=φ
κZ
∆ (ω)>0 and ω|=¬φ
for any φ ∈ At(∆). But such a translation is dependent on the
semantics of system Z and therefore is not syntax-based.
80
pe
qe
p
q
paring the strength of arguments and counterarguments. Our
approach differs both in goal (we investigate the correspondence between argumentation and conditional logics instead
of integrating insights from the latter into the former) and
generality (DeLP is a specific and arguably rather peculiar
argumentation formalism whereas ADFs are some of the
most general formalism around).
Several works investigate postulates for nonmonotonic
reasoning known from conditional logics (Kraus, Lehmann,
and Magidor 1990) for specific structured argumentation formalisms, such as assumption-based argumentation
(Čyras and Toni 2015; Heyninck and Straßer 2018) and
ASPIC+ (Li, Oren, and Parsons 2017). These works revealed gaps between nonmonotonic reasoning and argumentation which we try to bridge in this paper.
Besnard et al. (Besnard, Grégoire, and Raddaoui 2013)
develop a structured argumentation approach where general
conditional logic is used as the base knowledge representation formalism. Their framework is constructed in a similar
fashion as the deductive argumentation approach (Besnard
and Hunter 2008) but they also provide with conditional
contrariety a new conflict relation for arguments, based on
conditional logical terms. Even though insights from conditional logics are used in that paper, this approach stays well
within the paradigm of structured argumentation.
In (Strass 2015) Strass presents a translation from an ASPIC-style defeasible logic theory to ADFs. While actually
Strass embeds one argumentative formalism (the ASPICstyle theory) into another argumentative formalism (ADFs)
and shows how the latter can simulate the former, the process of embedding is similar to our approach. However, inferentially the formalism of (Strass 2015) is more akin to
ASPIC+ , in the sense that literals cannot be accepted unless
there is some rule deriving them. Arguably, this formalism
is more akin to D5 (see Definition 4.2), as in the ADFs generated by (Strass 2015), rules result in support of the consequents of rules.
se
s
Figure 4: Graph representing the links between nodes of
D5 (∆) in Example 11.
maximally plausible worlds (even if the negation e is complete in such a model).
4.3
Literal Conditionals
The final design choice made in this paper we motivate is
the fact that we restricted attention to (possibly extended)
literal conditional knowledge base as the object of translation. The reason is that we choose to represent conditionals
(φ|ψ) as links between nodes φ and ψ (modulo transformation to the extend language). Moving to conditionals with
arbitrary propositional formulas as antecedents and consequents would make it impossible to retain such a representation, since in abstract dialectical argumentation, nodes are
essentially atomic.
5
Related Work
Our aim in this paper is to lay foundations of integrative techniques for argumentative and conditional reasoning.
There are previous works, which have similar aims or are
otherwise related to this endeavour. We will discuss those in
the following.
First, there is huge body of work on structured argumentation (see e. g. (Besnard et al. 2014)). In these approaches,
arguments are constructed on the basis of a knowledge base
possibly consisting of conditionals. An attack relation between these arguments is constructed based on some syntactic criteria. Acceptable arguments are then identified by
applying argumentation semantics to the resulting argumentation frameworks. Even though these formalisms also allow for argumentation-based inferences from a set of conditionals, these approaches will often give rise to inferences
rather different from conditional logics. For example, in
ASPIC+ (Modgil and Prakken 2018), the knowledge base
consisting solely of the defeasible rule p ⇒ q will warrant no inference (in fact the set of arguments based on
this knowledge base will be empty), whereas, for example,
∩,c
D1 ({(q|p)}) |∼ 2mod ¬(p ∧ ¬q). This difference is caused by
the fact that in structured argumentation, arguments are typically constructed in a proof-like manner. This means that defeasible rules can only be applied when there is positive evidence for the antecedent. Conditional logics, and our translation by extension, on the other hand, generate models that
do not falsify any plausible conditional.
There have been some attempts to bridge the gap between
specific structured argumentation formalisms and conditional reasoning. For example, in (Kern-Isberner and Simari
2011) conditional reasoning based on System Z (Goldszmidt
and Pearl 1996) and DeLP (Garcı́a and Simari 2004) are
combined in a novel way. Roughly, the paper provides a
novel semantics for DeLP by borrowing concepts from System Z that allows using plausibility as a criterion for com-
6 Outlook and Conclusion
In this paper we have presented and investigated a translation from conditional knowledge bases into abstract dialectical argumentation based on the syntactic similarities between
the two frameworks. We provide an interpretation of plausible conditionals in abstract dialectical argumentation. We
have shown that this interpretation is adequate under all of
the well-known semantics for ADFs and have shown that
the translation is polynomial and modular. Interestingly, the
translation requires an extension of the language, which we
have argued in Section 4 cannot be avoided.
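Since adequacy throughout the paper is measured against system Z, it may help to recall how κZ∆ is obtained. The following brute-force sketch (our own illustration following the standard construction of Goldszmidt and Pearl, not part of the paper) computes the tolerance partition and κZ∆ for the knowledge base of Example 10:

```python
from itertools import product

def worlds(atoms):
    return [dict(zip(atoms, vals))
            for vals in product([True, False], repeat=len(atoms))]

def verifies(w, cond):   # cond = (consequent, antecedent) as Boolean functions
    conseq, ante = cond
    return ante(w) and conseq(w)

def falsifies(w, cond):
    conseq, ante = cond
    return ante(w) and not conseq(w)

def tolerated(cond, delta, ws):
    # cond is tolerated by delta if some world verifies cond
    # while falsifying no conditional in delta
    return any(verifies(w, cond) and not any(falsifies(w, d) for d in delta)
               for w in ws)

def z_partition(delta, ws):
    # repeatedly peel off the conditionals tolerated by the remainder
    partition, rest = [], list(delta)
    while rest:
        layer = [c for c in rest if tolerated(c, rest, ws)]
        if not layer:
            raise ValueError("knowledge base is inconsistent")
        partition.append(layer)
        rest = [c for c in rest if c not in layer]
    return partition

def kappa_z(delta, atoms):
    ws = worlds(atoms)
    part = z_partition(delta, ws)
    z = {id(c): i for i, layer in enumerate(part) for c in layer}
    def kappa(w):
        falsified = [c for c in delta if falsifies(w, c)]
        return 0 if not falsified else 1 + max(z[id(c)] for c in falsified)
    return kappa, ws

# Delta = {(p|T), (q|p)} from Example 10: "normally p", "if p, normally q"
delta = [(lambda w: w["p"], lambda w: True),
         (lambda w: w["q"], lambda w: w["p"])]
kappa, ws = kappa_z(delta, ["p", "q"])
best = [w for w in ws if kappa(w) == 0]  # the most plausible worlds
```

Here both conditionals are tolerated by the whole base, so they land in ∆0, and the unique κZ∆-minimal world is pq, matching the discussion of Example 10.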
Another limitation of our interpretation is that adequacy is only shown with respect to the level of beliefs Bel(κZ∆) (or, equivalently, the level of the most plausible worlds (κZ∆)^{-1}(0)). In future work, we plan to investigate methods to obtain conditional inferences from ADFs and compare them with system Z. One proposal to do this is founded upon the Ramsey test (Ramsey 2007), which says that a conditional (φ|ψ) is accepted if belief in ψ leads to belief in φ. Several ways of modelling the hypothetical belief in ψ are to be considered, such as revision by ψ (using e.g. revision of ADFs as proposed by (Linsbichler and
Woltran 2016)), observations of φ (Booth et al. 2012) or interventions with φ (Rienstra 2014). Furthermore, we plan to
tackle the combination of the translation presented in this
paper and the one from ADFs into conditional logics analyzed in previous works (Kern-Isberner and Thimm 2018;
Heyninck, Kern-Isberner, and Thimm 2020). We want to answer the question of what happens if we apply these translations one after the other. Finally, we plan to generalize the results
of this paper to other conditional logics besides system Z,
which we have chosen because of the many desirable properties it satisfies.
Acknowledgements The research reported here was supported by the Deutsche Forschungsgemeinschaft under grant KE 1413/11-1.

References
Atkinson, K.; Baroni, P.; Giacomin, M.; Hunter, A.; Prakken, H.; Reed, C.; Simari, G. R.; Thimm, M.; and Villata, S. 2017. Toward artificial argumentation. AI Magazine 38(3):25–36.
Besnard, P., and Hunter, A. 2008. Elements of Argumentation. MIT Press.
Besnard, P.; Garcia, A.; Hunter, A.; Modgil, S.; Prakken, H.; Simari, G.; and Toni, F. 2014. Introduction to structured argumentation. Argument & Computation 5(1):1–4.
Besnard, P.; Grégoire, É.; and Raddaoui, B. 2013. A conditional logic-based argumentation framework. In International Conference on Scalable Uncertainty Management, 44–56. Springer.
Booth, R.; Kaci, S.; Rienstra, T.; and van der Torre, L. 2012. Conditional acceptance functions. In 4th International Conference on Computational Models of Argument (COMMA 2012), 470–477.
Brewka, G., and Woltran, S. 2010. Abstract dialectical frameworks. In Twelfth International Conference on the Principles of Knowledge Representation and Reasoning.
Brewka, G.; Strass, H.; Ellmauthaler, S.; Wallner, J. P.; and Woltran, S. 2013. Abstract dialectical frameworks revisited. In Twenty-Third International Joint Conference on Artificial Intelligence.
Brewka, G.; Ellmauthaler, S.; Strass, H.; Wallner, J. P.; and Woltran, S. 2017. Abstract dialectical frameworks: An overview. The IfCoLog Journal of Logics and their Applications 4(8):2263–2317.
Čyras, K., and Toni, F. 2015. Non-monotonic inference properties for assumption-based argumentation. In TAFA, 92–111. Springer.
de Finetti, B. 1974. Theory of Probability (2 vols.).
Dung, P. M. 1995. On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence 77:321–358.
García, A. J., and Simari, G. R. 2004. Defeasible logic programming: An argumentative approach. TPLP 4(1+2):95–138.
Goldszmidt, M., and Pearl, J. 1996. Qualitative probabilities for default reasoning, belief revision, and causal modeling. AI 84(1-2):57–112.
Gottlob, G. 1994. The power of beliefs or translating default logic into standard autoepistemic logic. In Foundations of Knowledge Representation and Reasoning, 133–144. Springer.
Heyninck, J., and Straßer, C. 2018. A comparative study of assumption-based approaches to reasoning with priorities. In Second Chinese Conference on Logic and Argumentation.
Heyninck, J.; Kern-Isberner, G.; and Thimm, M. 2020. On the correspondence between abstract dialectical frameworks and non-monotonic conditional logics. In 33rd International FLAIRS Conference.
Imielinski, T. 1987. Results on translating defaults to circumscription. Artificial Intelligence 32(1):131–146.
Kern-Isberner, G., and Simari, G. R. 2011. A default logical semantics for defeasible argumentation. In FLAIRS.
Kern-Isberner, G., and Thimm, M. 2018. Towards conditional logic semantics for abstract dialectical frameworks. In C. I. C. et al., ed., Argumentation-based Proofs of Endearment, volume 37 of Tributes. College Publications.
Kern-Isberner, G. 2001. Conditionals in Nonmonotonic Reasoning and Belief Revision: Considering Conditionals as Agents. Springer-Verlag.
Kraus, S.; Lehmann, D.; and Magidor, M. 1990. Nonmonotonic reasoning, preferential models and cumulative logics. AI 44(1-2):167–207.
Li, Z.; Oren, N.; and Parsons, S. 2017. On the links between argumentation-based reasoning and nonmonotonic reasoning. In TAFA, 67–85. Springer.
Linsbichler, T., and Woltran, S. 2016. Revision of abstract dialectical frameworks: Preliminary report. In First International Workshop on Argumentation in Logic Programming and Non-Monotonic Reasoning, Arg-LPNMR 2016.
Makinson, D. 1988. General theory of cumulative inference. In NMR, 1–18. Springer.
Modgil, S., and Prakken, H. 2018. Abstract rule-based argumentation.
Nute, D. 1984. Conditional logic. In Handbook of Philosophical Logic, 387–439. Springer.
Ramsey, F. P. 2007. General propositions and causality.
Rienstra, T. 2014. Argumentation in Flux: Modelling Change in the Theory of Argumentation. Ph.D. Dissertation, University of Luxembourg.
Spohn, W. 1988. Ordinal conditional functions: A dynamic theory of epistemic states. In Causation in Decision, Belief Change, and Statistics, 105–134. Springer.
Strass, H. 2013. Approximating operators and semantics for abstract dialectical frameworks. Artificial Intelligence 205:39–70.
Strass, H. 2014. On the relative expressiveness of argumentation frameworks, normal logic programs and abstract dialectical frameworks. In 15th International Workshop on Non-Monotonic Reasoning, 292.
Strass, H. 2015. Instantiating rule-based defeasible theories in abstract dialectical frameworks and beyond. Journal of Logic and Computation 28(3):605–627.
Inductive Reasoning with Difference-making Conditionals
Meliha Sezgin¹, Gabriele Kern-Isberner¹, Hans Rott²
¹Department of Computer Science, TU Dortmund University, Germany
²Department of Philosophy, University of Regensburg, Germany
meliha.sezgin@tu-dortmund.de, gabriele.kern-isberner@cs.uni-dortmund.de, hans.rott@ur.de
Abstract
In belief revision theory, conditionals are often interpreted via the Ramsey Test. However, the classical Ramsey Test fails to take into account a fundamental feature of conditionals as used in natural language: typically, the antecedent is relevant to the consequent. Rott has extended the Ramsey Test by introducing so-called difference-making conditionals that encode a notion of relevance. This paper explores difference-making conditionals in the framework of Spohn's ranking functions. We show that they can be expressed by standard conditionals together with might conditionals. We prove that this reformulation is fully compatible with the logic of difference-making conditionals, as introduced by Rott. Moreover, using c-representations, we propose a method for inductive reasoning with sets of difference-making conditionals and also provide a method for revising ranking functions by a set of difference-making conditionals.

1 Introduction
On most accounts of conditionals, a conditional of the form 'If A then B' is true or accepted if (but not only if) B is true or accepted and A does not undermine B's truth or acceptance. On the suppositional account, for instance, if you believe B and the supposition that A is true does not remove B, you may (and must!) accept 'If A, then B'. On this account, there is no need that A furthers B or supports B or is evidence or a reason for B. This does not square well with the way we use conditionals in natural language. Skovgaard-Olsen et al. (2019) have conducted an empirical study and concluded that the positive relevance reading (reason-relation reading) of indicative conditionals is a conventional aspect of their meaning which cannot be cancelled 'without contradiction'. This, of course, is helpful only if the notion of contradiction is clear, but we aim to flesh out the positive relevance reading in an intuitive and yet precise way. The difference-making conditionals studied in this paper aim at capturing the relevance reading that is conveyed semantically or pragmatically by the utterance of conditionals in natural language. (Unfortunately, use of the term 'relevance conditionals' has been preempted by a completely different use in linguistics.) Let us begin by giving an example that illustrates what we mean by the term 'relevance':

Example 1. An agent wanted to escape the hustle and bustle of the city and decided to move into an old farm house in the countryside. Unfortunately, the weather quickly changed and it became cold (c). Due to the low temperatures, one of the rather old pipes in the house broke (b) and the agent had to call a plumber (p) to get the damage fixed.

In this example, it is clear that the cold temperatures are the reason for the broken pipe. Yet, this is not well reflected if we use a standard conditional 'If it is cold then the pipe will break'. We would rather say that the pipe broke because it was cold. The notion of relevance featuring here is encoded in the Relevant Ramsey Test, which governs difference-making conditionals, first introduced under a different name by Rott (1986) and then studied in Rott (2019). Except for a very recent paper by Raidl (2020), the logic of difference-making conditionals has been explored only in a purely qualitative framework. We characterize difference-making conditionals in the framework of Spohn's (1988) ranking functions and provide a simple and elegant semantics which we can use to define an inductive representation, that is, to build up an epistemic state from a (conditional) knowledge base, as well as a revision method for difference-making conditionals. Our main contributions in this paper are the following:
• We transfer Rott's notion of difference-making conditionals to the framework of ordinal conditional functions and reformulate the Relevant Ramsey Test in this framework.
• We define an inductive representation for a set of difference-making conditionals in the framework of ranking functions.
• We set up a method for revising a ranking function by a set of difference-making conditionals, and we elaborate this general method for revising by a single difference-making conditional in the ranking functions framework, based on the c-revisions introduced by Kern-Isberner (2001).
• We compare the notion of evidence or support captured by difference-making conditionals to the one offered in related approaches like the 'evidential conditionals' of Crupi and Iacona (2019a) or Spohn's (2012) notion of 'reason'.

The rest of this paper is organized as follows: In section 2, we define the formal preliminaries and notations used throughout the paper. Section 3 summarizes concepts and results from Rott's (2019) work on difference-making conditionals. Then, in section 4, we define a ranking semantics for difference-making conditionals via an OCF-version of the Relevant Ramsey Test and prove the basic principles using a reformulation of a difference-making conditional as a pair of more standard conditionals. In section 5, we construct an inductive representation for sets of difference-making conditionals using c-representations. Section 6 introduces a method for revising by difference-making conditionals based on c-revisions in the framework of ranking functions. In section 7, we discuss alternative approaches to incorporating relevance in conditionals. The concluding section 8 sums up our findings.

2 Formal Preliminaries
Let L be a finitely generated propositional language over an alphabet Σ with atoms a, b, c, . . . and with formulas A, B, C, . . .. For conciseness of notation, we will omit the logical and-connective, writing AB instead of A ∧ B, and overlining formulas will indicate negation, i.e., Ā means ¬A. The set of all propositional interpretations over Σ is denoted by ΩΣ. As the signature will be fixed throughout the paper, we will usually omit the subscript and simply write Ω. ω |= A means that the propositional formula A ∈ L holds in the possible world ω ∈ Ω; then ω is called a model of A, and the set of all models of A is denoted by Mod(A). For propositions A, B ∈ L, A |= B holds iff Mod(A) ⊆ Mod(B), as usual. By slight abuse of notation, we will use ω both for the model and the corresponding conjunction of all positive or negated atoms. This allows us to ease notation considerably; since ω |= A means the same for both readings of ω, no confusion will arise. The set of classical consequences of a set of formulas A ⊆ L is Cn(A) = {B | A |= B}. The deductively closed set of formulas which has exactly a subset W ⊆ Ω as its set of models is called the formal theory of W and is defined as Th(W) = {A ∈ L | ω |= A for all ω ∈ W}.

We extend L to a conditional language (L|L) by introducing a conditional operator (·|·), so that (L|L) = {(B|A) | A, B ∈ L}. (L|L) is a flat conditional language: no nesting of conditionals is allowed. A is called the antecedent of (B|A), and B is its consequent. (B|A) expresses 'If A, then (plausibly) B'. In the following, conditionals (B|A) ∈ (L|L) are referred to as standard conditionals or, if there is no danger of confusion, simply conditionals.

We further extend our framework of conditionals to a language with might conditionals ⟨L|L⟩ by introducing a might conditional operator ⟨·|·⟩ (the angle brackets are supposed to remind the reader of a split diamond operator). For a might conditional ⟨D|C⟩, we call C the antecedent and D the consequent. As for standard conditionals, ⟨L|L⟩ is a flat conditional language, and ⟨D|C⟩ expresses 'If C, then D might be the case'. In a way, the might conditional ⟨D|C⟩ is the negation of the standard conditional (D̄|C) (Lewis 1973): the former is accepted iff the latter isn't.

A (conditional) knowledge base is a finite set of conditionals ∆ = {(B1|A1), . . . , (Bn|An)} ∪ {⟨Bn+1|An+1⟩, . . . , ⟨Bm|Am⟩}. To give an appropriate semantics to (standard resp. might) conditionals and knowledge bases, we need richer semantic structures like epistemic states in the sense of Halpern (2003), most commonly represented as probability distributions, possibility distributions (Dubois and Prade 2006) or ordinal conditional functions (Spohn 1988, 2012). A knowledge base is consistent if and only if there is (a representation of) an epistemic state that accepts the knowledge base, i.e., all conditionals in ∆.

Ordinal conditional functions (OCFs, also called ranking functions) κ : Ω → N ∪ {∞}, with κ^{-1}(0) ≠ ∅, assign to each world ω an implausibility rank κ(ω). OCFs were first introduced by Spohn (1988). The higher κ(ω), the less plausible ω is, and the normalization constraint requires that there are worlds having maximal plausibility. Then one puts κ(A) := min{κ(ω) | ω |= A} and κ(∅) = ∞. Due to κ^{-1}(0) ≠ ∅, at least one of κ(A) and κ(Ā) must be 0. A proposition A is believed if κ(Ā) > 0, and the belief set of a ranking function κ is defined as Bel(κ) = Th(κ^{-1}(0)).
Definition 1. A (standard) conditional (B|A) is accepted in an epistemic state represented by an OCF κ, written as κ |= (B|A), iff κ(AB) < κ(AB̄) or κ(A) = ∞.

That is, the verification of (B|A) is more plausible than its falsification, or the premise of the conditional is always false.

Definition 2. A might conditional ⟨D|C⟩ is accepted in an epistemic state represented by an OCF κ, written as κ |= ⟨D|C⟩, if and only if κ ⊭ (D̄|C) or κ(C) = ∞, i.e., κ(CD) ≤ κ(CD̄) or κ(C) = ∞.

Note that accepting a might conditional is not equivalent to the acceptance of the conditional with negated consequent (κ |= (D̄|C)) but weaker, since it allows for indifference between CD and CD̄. In that case both (D|C) and (D̄|C) fail to be accepted.

3 The Ramsey Test, the Relevant Ramsey Test and difference-making conditionals
In the following, let Ψ be an epistemic state of any general format, and let Bel be an operator on belief states that assigns to Ψ the set of beliefs held in Ψ. Let ∗ be a revision operator on epistemic states, and let (B|A) be a conditional. The Ramsey Test (so called after a footnote in Ramsey 1931) was made popular by Stalnaker (1968). According to it, 'If A then B' is accepted in a belief state just in case B is an element of the belief set Bel(Ψ ∗ A) that results from a revision of the belief state Ψ by the sentence A. Formally:

(RT) Ψ |= (B|A) iff B ∈ Bel(Ψ ∗ A).

If belief states are identified with ranking functions, the Ramsey Test reads as follows: κ |= (B|A) iff B ∈ Bel(κ ∗ A); this, taken together with Definition 1, implies a constraint on κ ∗ A. The condition B ∈ Bel(Ψ ∗ A) can be reformulated using some basic properties of ranking functions:

B ∈ Bel(κ ∗ A) = Th((κ ∗ A)^{-1}(0))
⇔ ∀ω ∈ min(Mod(κ ∗ A)) it holds that ω |= B
⇔ (κ ∗ A)(B) < (κ ∗ A)(B̄)
⇔ (κ ∗ A)(B̄) > 0
⇔ κ ∗ A |= B.
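Definitions 1 and 2 can be checked mechanically on small examples. The following sketch (our own, with an arbitrary example ranking over two atoms) encodes acceptance of standard and might conditionals exactly as in the two definitions:

```python
from itertools import product

# Worlds over two atoms as dicts; a ranking function maps worlds to ranks.
atoms = ["a", "b"]
worlds = [dict(zip(atoms, vals)) for vals in product([True, False], repeat=2)]

def rank(kappa, prop):
    """kappa(A) = min over the models of A; the minimum of nothing is infinity."""
    ranks = [kappa(w) for w in worlds if prop(w)]
    return min(ranks) if ranks else float("inf")

def accepts(kappa, conseq, ante):
    """Definition 1: kappa |= (B|A) iff kappa(AB) < kappa(A not-B), or kappa(A) = inf."""
    return rank(kappa, lambda w: ante(w) and conseq(w)) < \
           rank(kappa, lambda w: ante(w) and not conseq(w)) or \
           rank(kappa, ante) == float("inf")

def might_accepts(kappa, conseq, ante):
    """Definition 2: kappa |= <B|A> iff kappa does not accept (not-B|A), or kappa(A) = inf."""
    return not accepts(kappa, lambda w: not conseq(w), ante) or \
           rank(kappa, ante) == float("inf")

# An example ranking: ab most plausible, a(not-b) less plausible, not-a least.
kappa = lambda w: 0 if w["a"] and w["b"] else (1 if w["a"] else 2)

assert accepts(kappa, lambda w: w["b"], lambda w: w["a"])        # kappa |= (b|a)
assert might_accepts(kappa, lambda w: w["b"], lambda w: w["a"])  # kappa |= <b|a>
```

With this ranking, (b|a) is accepted since κ(ab) = 0 < κ(ab̄) = 1, and accordingly ⟨b|a⟩ is accepted while ⟨b̄|a⟩ is not, illustrating that might conditionals are the duals of standard ones.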
Rott took ≫ to be an intrinsically contrastive connective. It
is important to note, however, that unlike ‘B because A’ and
‘Since A, B’, which can only be accepted if A is believed to
be true, the acceptance of A ≫ B neither entails nor is entailed by a particular belief status of A. (RRT) provides a
clear and simple doxastic semantics for relevance-encoding
conditionals with antecedents and consequents that may be
arbitrary compounds of propositional sentences.
Since (RRT) is more complex than (RT), it is hardly
surprising that difference-making conditionals don’t satisfy
some of the usual principles for standard conditionals such
as CM, Cut and Or. Rott discusses some examples showing
how CM, Cut and OR can fail with difference-making conditionals. The most striking fact, however, is that differencemaking conditionals do not even validate Right Weakening
which has long seemed entirely innocuous to conditional
logicians. Rott even called the invalidity of RW the hallmark of difference-making conditionals and indeed of the
relevance relation. Another notable property of differencemaking conditionals is that B ∈ Cn(A) does not imply that
A ≫ B is accepted. If B is accepted “anyway” (like for instance a logical truth B is), then A cannot be relevant to B,
even if it implies B.
That many of the familiar principles for standard conditionals become invalid for difference-making conditionals
does not mean that there is no logic to the latter. Here are
the basic principles of difference-making conditional operators that Rott (2019) shows to be complete with respect to
the basic AGM postulates for belief revision (actually Rott
uses a slight weakening of the basic AGM postulates that
allows that revisions by non-contradictions may result in inconsistent belief sets):
(≫0) ⊥ ≫ ⊥.
(≫1) If A ≫ BC, then A ≫ B or A ≫ C.
(≫2a) A ≫ C iff (A ≫ AC and A ≫ A ∨ C).
(≫2b) Ā ≫ ĀC iff (not A ≫ A ∨ C and Ā ≫ Ā).
(≫3–4) ⊥ ≫ A ∨ C iff (⊥ ≫ A and A ≫ A ∨ C).
(≫5) A ∨ B ≫ ⊥ iff (A ≫ ⊥ and B ≫ ⊥).
(≫6) If Cn(A) = Cn(B) and Cn(C) = Cn(D), then: A ≫ C iff B ≫ D.
All of these principles are to be read as quantified over all belief states Ψ: 'A ≫ C' is short for 'Ψ |= A ≫ C' and 'not A ≫ C' is short for 'Ψ ⊭ A ≫ C'. Roughly, a principle of the form 'If ∆, then Γ' is valid iff for every belief state Ψ, if the (possibly negated) conditionals mentioned in ∆ are all accepted in Ψ, then the (possibly negated) conditionals mentioned in Γ are accepted in Ψ, too.
It follows from principles (≫0)–(≫6) that (And) is also valid for difference-making conditionals. (≫1) is dual to the well-known principle of Disjunctive Rationality; it is called Conjunctive Rationality in Rott (2020). Like its dual, Conjunctive Rationality is a non-Horn condition. Another non-Horn condition is the right-to-left direction of (≫2b). The presence of non-Horn conditions means that reasoning with difference-making conditionals is not trivial. In order to determine what may be inferred from a knowledge base containing difference-making conditionals, we cannot
We can also define a Ramsey Test for might conditionals: Ψ |= ⟨B|A⟩ iff B̄ ∉ Bel(Ψ ∗ A), that is, iff Ψ ⊭ (B̄|A). Or more specifically, in terms of ranking functions: κ |= ⟨B|A⟩ iff B̄ ∉ Bel(κ ∗ A), that is, iff κ ⊭ (B̄|A), which follows from Definition 2. The condition B̄ ∉ Bel(Ψ ∗ A) can again be reformulated using some properties of ranking functions:

B̄ ∉ Bel(κ ∗ A) = Th((κ ∗ A)⁻¹{0})
⇔ ∃ ω ∈ min(Mod(κ ∗ A)) such that ω |= B
⇔ (κ ∗ A)(B) = 0
⇔ κ ∗ A ⊭ B̄.
Given assumptions on belief revision in the tradition of Alchourrón, Gärdenfors and Makinson (1985), Ramsey Test conditionals are known to satisfy, among other things, the following principles of And, Right Weakening, Cautious Monotonicity, Cut and Or:

(And) If (B|A) and (C|A), then (BC|A).
(RW) If (B|A) and C ∈ Cn(B), then (C|A).
(CM) If (B|A) and (C|A), then (C|AB).
(Cut) If (B|A) and (C|AB), then (C|A).
(Or) If (C|A) and (C|B), then (C|A ∨ B).
All of these principles are to be read as quantified over all
belief states Ψ: ‘(B|A)’ is short for ‘Ψ |= (B|A)’. Roughly,
a principle of the form ‘If ∆, then (B|A)’ is valid iff for
every belief state Ψ, if the conditionals mentioned in ∆ are
all accepted in Ψ, then (B|A) is accepted in Ψ, too.
The Ramsey Test falls squarely within the paradigm of the suppositional account mentioned above. Assume that an agent happens to believe B. Assume further that her beliefs are consistent with A (or that she actually already believes that A). Then, given a widely endorsed condition of belief preservation, the Ramsey Test rules that the agent is committed to accepting the conditional (B|A). There need not be any relation of relevance or support between A and B. In particular, if you happen to believe A and B, this is sufficient to require acceptance of (B|A).

How can the Ramsey Test be adapted to capture the idea that the antecedent should be relevant to the consequent? One straightforward way is to interpret conditionals as being contrastive: the antecedent should make a difference to the consequent. In order to implement this idea without introducing a dependence on the actual belief status of the antecedent, Rott (2019) suggests the following Relevant Ramsey Test:

(RRT) Ψ |= A ≫ B iff B ∈ Bel(Ψ ∗ A) and B ∉ Bel(Ψ ∗ Ā).

We call conditionals that are governed by (RRT) difference-making conditionals, and we have changed the notation here from (B|A) to A ≫ B in order to mark our transition from standard would conditionals to difference-making conditionals. A ≫ B can be read as 'If A, then (relevantly) B.' Here the consequent is accepted if we revise the belief state by the antecedent, but the consequent fails to be accepted if we revise by the negation of the antecedent. Rott's idea was to liken conditionals to the natural-language connectives 'because' and 'since' that are widely taken to express the contrast that a cause or a reason is making to its effect. Thus
simply use the axioms as closure operators. This is analogous to the problem of rational consequence relations in the sense of Lehmann and Magidor (1992), which have made it necessary to invent special inference methods like rational closure/system Z and c-representations. In the following, we will use the method of c-representations to deal with difference-making conditionals. A major part of our task ahead may be described as doing for c-representations what Booth and Paris (1998) achieved for rational closure.
only used standard conditionals. The might conditionals express that if it is not cold, then the pipe might not break, and if the pipe does not break, we might not call the plumber. Here the might conditionals formulated in natural language perhaps sound a bit odd, but together with the standard conditionals they express the reason relations introduced by the difference-making conditionals.

4 Ranking semantics for difference-making conditionals

In this section, we define a semantics for difference-making conditionals in the framework of Spohn's ranking functions. We make use of standard conditionals and might conditionals in order to express that the antecedent of the conditional is relevant to the consequent. We justify our definition of difference-making conditionals by showing that the Relevant Ramsey Test holds, and we show that the basic principles are satisfied.

Next, we turn to the basic principles for difference-making conditionals. Note that when checking the principles of Rott, instead of a general epistemic state Ψ, we use a ranking function κ.

Theorem 1. Let κ be a ranking function and let κ |= A ≫ B be as defined in (2). Then · ≫ · satisfies the basic principles of difference-making conditionals.
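Theorem 1 lends itself to a quick numerical sanity check. The following Python sketch is ours, not the paper's: it samples random OCFs over three atoms and tests instances of Conjunctive Rationality (≫1), with acceptance of A ≫ B implemented via conditions (3) and (4).

```python
import random
from itertools import product

# Worlds over three atoms; an OCF is a list of ranks, one per world.
ATOMS = ("a", "b", "c")
WORLDS = [dict(zip(ATOMS, bits)) for bits in product([True, False], repeat=3)]
INF = float("inf")

def rank(kappa, f):
    """kappa(F): minimal rank of a world satisfying F (INF if unsatisfiable)."""
    return min((kappa[i] for i, w in enumerate(WORLDS) if f(w)), default=INF)

def accepts(kappa, A, B):
    """kappa |= A >> B per conditions (3) and (4)."""
    return (rank(kappa, lambda w: A(w) and B(w)) < rank(kappa, lambda w: A(w) and not B(w))
            and rank(kappa, lambda w: not A(w) and not B(w)) <= rank(kappa, lambda w: not A(w) and B(w)))

A = lambda w: w["a"]
B = lambda w: w["b"]
C = lambda w: w["c"]
BC = lambda w: w["b"] and w["c"]

random.seed(1)
violations = 0
for _ in range(500):
    kappa = [random.randrange(4) for _ in WORLDS]
    m = min(kappa)
    kappa = [r - m for r in kappa]  # normalise: at least one world has rank 0
    # (>>1): if a >> bc is accepted, then a >> b or a >> c must be accepted
    if accepts(kappa, A, BC) and not (accepts(kappa, A, B) or accepts(kappa, A, C)):
        violations += 1
```

No violation of (≫1) turns up, in line with the validity proof given below; the same harness can be pointed at (And), or at the failure of RW.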
Proof. (≫0): We show that κ |= ⊥ ≫ ⊥, i.e., κ ∗ ⊥ |= ⊥ and κ ∗ ⊤ ⊭ ⊥. These are true by the success and consistency conditions for revisions, respectively.

(≫1): Let κ |= A ≫ BC. We have to show that κ |= A ≫ B or κ |= A ≫ C. Via (2) it follows that we have to show that κ(ABC) < κ(A(B̄ ∨ C̄)) and κ(Ā(B̄ ∨ C̄)) ≤ κ(ĀBC) implies κ(AB) < κ(AB̄), κ(ĀB̄) ≤ κ(ĀB), or κ(AC) < κ(AC̄), κ(ĀC̄) ≤ κ(ĀC).
From κ(ABC) < κ(A(B̄ ∨ C̄)), we derive κ(ABC) < κ(AB̄ ∨ AC̄) = min{κ(AB̄), κ(AC̄)}, and hence both κ(ABC) < κ(AB̄) and κ(ABC) < κ(AC̄). Since ABC |= AB, AC we obtain κ(AB) < κ(AB̄) and κ(AC) < κ(AC̄).
Moreover, from κ(Ā(B̄ ∨ C̄)) ≤ κ(ĀBC), we derive that either κ(ĀB̄) ≤ κ(ĀBC) or κ(ĀC̄) ≤ κ(ĀBC). Since ĀBC |= ĀB, ĀC we obtain that either κ(ĀB̄) ≤ κ(ĀB) or κ(ĀC̄) ≤ κ(ĀC).

(≫2a): We have to show that κ |= A ≫ C iff (κ |= A ≫ AC and κ |= A ≫ A ∨ C). Via (2) it follows that we have to show that κ(AC) < κ(AC̄) and κ(ĀC̄) ≤ κ(ĀC) iff (κ(AC) < κ(AC̄) and κ(Ā) ≤ κ(⊥)) and (κ(A) < κ(⊥) and κ(ĀC̄) ≤ κ(ĀC)). This holds trivially.

(≫2b): We have to show that κ |= Ā ≫ ĀC iff (not κ |= A ≫ A ∨ C and κ |= Ā ≫ Ā). Via (2) it follows that we have to show that κ(ĀC) < κ(ĀC̄), κ(A) ≤ κ(⊥) iff (κ(A) ≥ κ(⊥) or κ(ĀC) < κ(ĀC̄)) and κ(Ā) < κ(⊥) and κ(A) ≤ κ(⊥). This holds trivially.

(≫3–4): We have to show that κ |= ⊥ ≫ A ∨ C iff (κ |= ⊥ ≫ A and κ |= A ≫ A ∨ C). Via (2) it follows that we have to show that κ(ĀC̄) ≤ κ(A ∨ C) iff κ(Ā) ≤ κ(A) and κ(ĀC̄) ≤ κ(ĀC). The direction from left to right is immediate. For the converse direction, note that κ(ĀC̄) ≤ κ(ĀC) implies that κ(ĀC̄) = κ(Ā). So we get from κ(Ā) ≤ κ(A) and κ(ĀC̄) ≤ κ(ĀC) that κ(ĀC̄) ≤ min{κ(A), κ(ĀC)} = κ(A ∨ C), as desired.

(≫5): We have to show that κ |= A ∨ B ≫ ⊥ iff (κ |= A ≫ ⊥ and κ |= B ≫ ⊥). But conditionals with impossible consequents are accepted iff the antecedents are impossible, i.e., have κ-rank ∞. So the claim follows from the fact that κ(A ∨ B) = min{κ(A), κ(B)}.
Definition 3 (Relevant Ramsey Test for OCFs). Let κ be an OCF, A ≫ B be a difference-making conditional and ∗ a revision operator for OCFs. We define the Relevant Ramsey Test for OCFs as follows:

(RRTocf) κ |= A ≫ B iff B ∈ Bel(κ ∗ A) and B ∉ Bel(κ ∗ Ā).

Using some basic properties of ranking functions, we can reformulate (RRTocf):

κ |= A ≫ B iff κ ∗ A |= B and κ ∗ Ā ⊭ B.   (1)

From (1), we obtain for A with κ(A), κ(Ā) < ∞:

κ |= A ≫ B iff κ |= {(B|A), ⟨B̄|Ā⟩}   (2)

iff both of the following two conditions hold:

κ(AB) < κ(AB̄) and   (3)
κ(ĀB̄) ≤ κ(ĀB).   (4)

Difference-making conditionals defined by (RRTocf) can be expressed by pairs of conditionals. The first conditional (B|A) corresponds to the first part of (RRTocf), B ∈ Bel(κ ∗ A), using basically the standard Ramsey Test. The clause for (RRTocf) implies the clause for the standard Ramsey Test. The second conditional ⟨B̄|Ā⟩ corresponds to the second part of (RRTocf), namely B ∉ Bel(κ ∗ Ā). We now continue with Example 1 in order to elucidate our reformulation in (2).
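Conditions (3) and (4) make the relevance requirement concrete, and a two-line check shows how (4) filters out antecedents that are irrelevant to an already-believed consequent. The sketch below is ours (names assumed), over two atoms a and b:

```python
from itertools import product

WORLDS = [dict(zip("ab", bits)) for bits in product([True, False], repeat=2)]

def rank(kappa, f):
    """kappa(F): minimal rank of a world satisfying F."""
    return min(kappa[i] for i, w in enumerate(WORLDS) if f(w))

def accepts_standard(kappa):
    """(b|a): condition (3)."""
    return rank(kappa, lambda w: w["a"] and w["b"]) < rank(kappa, lambda w: w["a"] and not w["b"])

def accepts_dmc(kappa):
    """a >> b: conditions (3) and (4) together."""
    return (accepts_standard(kappa)
            and rank(kappa, lambda w: not w["a"] and not w["b"])
                <= rank(kappa, lambda w: not w["a"] and w["b"]))

relevant   = [0, 1, 2, 1]   # worlds ab, ab', a'b, a'b' (primes for negation)
irrelevant = [0, 2, 0, 2]   # b is believed no matter what a does
```

Both OCFs accept the standard conditional (b|a); only the first accepts a ≫ b, because in the second κ(āb̄) = 2 > 0 = κ(āb) violates condition (4).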
Example 2 (Continues Example 1). The agent's pipe broke because the temperatures were too low, and therefore she had to call a plumber to have the pipe fixed. These connections can be expressed using the difference-making conditionals c ≫ b and b ≫ p. Applying (2), we can reformulate ∆≫ = {c ≫ b, b ≫ p} = {(b|c), ⟨b̄|c̄⟩, (p|b), ⟨p̄|b̄⟩}. The standard conditionals express that if it is cold, then the pipe will break, and if the pipe breaks, then the agent will call a plumber. But the reason relation would get neglected if we
⊥ ≫ ⊥           κ ∗ ⊤ ⊭ ⊥                                 Bel(κ) is consistent
A ≫ ⊥           κ ∗ A |= ⊥                                A is a doxastic impossibility
Ā ≫ ⊥           κ ∗ Ā |= ⊥                                A is a doxastic necessity
⊥ ≫ A           κ ∗ ⊤ ⊭ A                                 A is a non-belief
A ≫ A           κ ∗ Ā ⊭ A                                 A is contingent
A ≫ AC          κ ∗ A |= C, κ ∗ A ⊭ ⊥ and A is contingent  C is in Bel(κ ∗ A)
A ≫ A ∨ C       κ ∗ Ā ⊭ C                                 C is not in Bel(κ ∗ Ā)
not A ≫ A ∨ C   κ ∗ Ā |= C                                C is in Bel(κ ∗ Ā)

Table 1: The meanings of some basic difference-making conditionals.

First, we will turn to the application of the technique of c-representations to sets of standard and might conditionals.

Proposition 2 (C-representation of sets of standard and might conditionals). Let ∆ = {(Bi|Ai)}i=1,…,n ∪ {⟨Bi|Ai⟩}i=n+1,…,m be a set of standard and might conditionals. A c-representation of ∆ is given by an OCF of the form

κc∆(ω) = Σ_{ω |= Ai B̄i} κi−   (5)

with non-negative impact factors κi− for each conditional (Bi|Ai) ∈ ∆ resp. ⟨Bi|Ai⟩ ∈ ∆ satisfying

κi− (≥) min_{ω |= Ai Bi} { Σ_{k≠i, ω |= Ak B̄k} κk− } − min_{ω |= Ai B̄i} { Σ_{k≠i, ω |= Ak B̄k} κk− }   (6)

for all 1 ≤ i ≤ m. If i ∈ {1, …, n}, i.e. the impact factor stands for a standard conditional, then we need the strict inequality '>'. If i ∈ {n + 1, …, m}, i.e. the impact factor stands for a might conditional, then we do not need strict inequality and '≥' is sufficient.
(≫6): If Cn(A) = Cn(B) and Cn(C) = Cn(D), then A ≫ C iff B ≫ D. This follows trivially, since structurally analogous compounds of logically equivalent sentences are logically equivalent and thus get the same κ-ranks.

The basic principles explore the logic of conditionals governed by (RRT). The reformulation in Definition 3 shows that the notion of the Relevant Ramsey Test can be transferred to the OCF framework. The relevance of the antecedent to the consequent can be expressed by splitting the two directions within (RRTocf) into two conditionals, one might and one standard conditional. In Theorem 1, we have shown that this reformulation serves the logic behind difference-making conditionals. Theorem 1 should be compared with the results of Raidl (2020).

Rott (2019) explained the meanings of some basic difference-making conditionals, and the explanations still work within the OCF framework. They are collected in Table 1. Note that the meanings also reflect the idea of the basic principles. For example, (≫2a) says that C is in the revision κ ∗ A and not in the revision κ ∗ Ā iff A ≫ AC and A ≫ A ∨ C, which is exactly the meaning of these two basic difference-making conditionals. Also for (≫2b) the meanings of the difference-making conditionals on both sides of 'iff' are exactly the same.

5 Inductive representation of difference-making conditionals

In this section, we define an inductive representation of sets of difference-making conditionals ∆≫ by setting up epistemic states in the form of OCFs that are admissible with respect to ∆≫. We use the approach of c-representations first introduced by Kern-Isberner (2001). C-representations are not only capable of setting up epistemic states that represent sets of standard conditionals; they were also extended to might conditionals (see Eichhorn, Kern-Isberner and Ragni 2018). By combining the representation of standard and might conditionals, we get a c-representation of sets of difference-making conditionals.

To calculate a c-representation of a set of conditionals ∆ we need to solve a system of inequalities, given by formula (6) for each i = 1, …, m, which ensures κc∆ |= ∆. More precisely, with the ranks of formulas and formula (5), the constraint κc∆(AiBi) < κc∆(AiB̄i) for 1 ≤ i ≤ n resp. κc∆(AiBi) ≤ κc∆(AiB̄i) for n + 1 ≤ i ≤ m expands to

min_{ω |= AiBi} { Σ_{ω |= AkB̄k} κk− }  (≤)  min_{ω |= AiB̄i} { Σ_{ω |= AkB̄k} κk− }   (7)

where we call the sums on the left-hand side (7a) and those on the right-hand side (7b). The left minimum ranges over the models of AiBi, so the conditional (Bi|Ai) resp. ⟨Bi|Ai⟩ is not falsified by any considered world and thus κi− is no element of any sum (7a). As opposed to this, the right minimum ranges over the models of AiB̄i, so the conditional (Bi|Ai) resp. ⟨Bi|Ai⟩ is falsified by every considered world and thus κi− is an element of every sum (7b). With these deliberations, we can rewrite the inequalities to

min_{ω |= AiBi} { Σ_{k≠i, ω |= AkB̄k} κk− }  (≤)  κi− + min_{ω |= AiB̄i} { Σ_{k≠i, ω |= AkB̄k} κk− }   (8)

and therefore

κi− (≥) min_{ω |= AiBi} { Σ_{k≠i, ω |= AkB̄k} κk− } − min_{ω |= AiB̄i} { Σ_{k≠i, ω |= AkB̄k} κk− }   (9)
the set ∆≫ consists of just one difference-making conditional ∆≫ = {A ≫ B}. In this case, the system of inequalities always has a solution and we can define κc∆≫ as follows:

Theorem 3. Let A ≫ B be a difference-making conditional. κcA≫B is a c-representation of A ≫ B iff there are integers κst−, κw− such that, for all ω ∈ Ω,

κcA≫B(ω) = { κst− if ω |= AB̄;  κw− if ω |= ĀB;  0 otherwise }   (13)

and

κst− > 0 and κw− ≥ 0.   (14)
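The shape in (13) is easy to validate against conditions (3) and (4): penalising exactly the AB̄-worlds by some κst− > 0 and the ĀB-worlds by some κw− ≥ 0 yields an OCF accepting A ≫ B, while κst− = 0 does not. A sketch (our code, two atoms a and b):

```python
from itertools import product

WORLDS = [dict(zip("ab", bits)) for bits in product([True, False], repeat=2)]

def kappa_13(k_st, k_w):
    """Formula (13): k_st on the a&~b-worlds, k_w on the ~a&b-worlds, 0 elsewhere."""
    ranks = []
    for w in WORLDS:
        if w["a"] and not w["b"]:
            ranks.append(k_st)
        elif not w["a"] and w["b"]:
            ranks.append(k_w)
        else:
            ranks.append(0)
    return ranks

def accepts_dmc(ranks):
    """a >> b per conditions (3) and (4)."""
    k = {(w["a"], w["b"]): r for w, r in zip(WORLDS, ranks)}
    return k[(True, True)] < k[(True, False)] and k[(False, False)] <= k[(False, True)]
```

Any pair with κst− > 0 and κw− ≥ 0 is accepted; dropping κst− to 0 falsifies condition (3), in line with (14).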
then ∆ is consistent and (5) is a model of ∆. For the converse, Kern-Isberner (2001, p. 69; 2004, p. 26) has shown that every finite consistent knowledge base consisting solely of standard conditionals has a c-representation; but it is still an open question whether this result extends to knowledge bases including might conditionals.

According to (2), a difference-making conditional A ≫ B can be reformulated as a set of one standard and one might conditional, {(B|A), ⟨B̄|Ā⟩}. So a set of difference-making conditionals ∆≫ = {Ai ≫ Bi | i = 1, …, n} can be implemented via {(Bk|Ak) | k = 1, …, n} ∪ {⟨B̄l|Āl⟩ | l = 1, …, n}. In this way, we can get an inductive representation of ∆≫ by a c-representation as follows:
Definition 4 (C-representation for sets of difference-making conditionals). Let ∆≫ = {Ai ≫ Bi | i = 1, …, n} be a set of difference-making conditionals. An OCF κ is a c-representation of ∆≫ iff

κc∆≫(ω) = Σ_{k=1,…,n; ω |= AkB̄k} κk− + Σ_{l=1,…,n; ω |= ĀlBl} λl−   (10)

with non-negative impact factors κk− resp. λl− for each conditional (Bk|Ak) ∈ ∆≫ resp. ⟨B̄l|Āl⟩ ∈ ∆≫ satisfying

κj− > min_{ω |= AjBj} { Σ_{k≠j, ω |= AkB̄k} κk− + Σ_{ω |= ĀlBl} λl− } − min_{ω |= AjB̄j} { Σ_{k≠j, ω |= AkB̄k} κk− + Σ_{ω |= ĀlBl} λl− }   (11)

and

λj− ≥ min_{ω |= ĀjB̄j} { Σ_{ω |= AkB̄k} κk− + Σ_{l≠j, ω |= ĀlBl} λl− } − min_{ω |= ĀjBj} { Σ_{ω |= AkB̄k} κk− + Σ_{l≠j, ω |= ĀlBl} λl− }.   (12)

Let us now continue with our example concerning the agent's broken pipe:

Example 3 (Continues Example 2). Using the representation of the set of difference-making conditionals ∆≫ = {c ≫ b, b ≫ p} as pairs of standard and might conditionals from Example 2, we can construct a c-representation κc∆≫ using Definition 4. First we have to solve the system of inequalities defining the impact factors. Let κ1− and λ1− correspond to the standard and the might conditional representations of c ≫ b, and let κ2− and λ2− apply similarly to b ≫ p:

κ1− > min{0, κ2−} − min{0, λ2−} = 0,
λ1− ≥ min{0, λ2−} − min{0, κ2−} = 0,
κ2− > min{0, λ1−} − min{0, λ1−} = 0,
λ2− ≥ min{0, κ1−} − min{0, κ1−} = 0.
The minima on the left-hand side range over worlds verifying the corresponding (standard resp. might) conditional, and the minima on the right-hand side range over worlds falsifying it. We take the minimum of the summed-up impact factors of the other conditionals that are falsified. Since the impact factors are non-negative, the minima equal zero. We choose κ1− = κ2− = 1 and λ1− = λ2− = 0 and get the c-representation presented in Table 2. It is easy to verify that κc∆≫ |= c ≫ p, so in this example the difference-making conditionals satisfy transitivity. Note, however, that transitivity is only 'valid by default', that is, it can easily be undercut by the addition of another premise. For instance, it is possible to consistently add c ≫ p̄ as a third premise to ∆≫. The extended knowledge base has a c-representation (based on κ1− = κ3− = 2, κ2− = 1 and λ1− = λ2− = λ3− = 0) that does not satisfy c ≫ p because it does not even satisfy (p|c).
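Example 3 can be reproduced end to end. The sketch below is our own code (all names assumed): it solves the system (11)/(12) by a simple fixed-point iteration over the impact factors, builds κc∆≫ via (10), and confirms both the ranks of Table 2 and the 'transitivity by default' κc∆≫ |= c ≫ p.

```python
from itertools import product

WORLDS = [dict(zip("cbp", bits)) for bits in product([True, False], repeat=3)]

# Delta_>> = {c >> b, b >> p} as {(b|c), <~b|~c>, (p|b), <~p|~b>}:
# each entry is (verifying worlds, falsifying worlds, needs strict inequality).
CONDS = [
    (lambda w: w["c"] and w["b"],         lambda w: w["c"] and not w["b"], True),
    (lambda w: not w["c"] and not w["b"], lambda w: not w["c"] and w["b"], False),
    (lambda w: w["b"] and w["p"],         lambda w: w["b"] and not w["p"], True),
    (lambda w: not w["b"] and not w["p"], lambda w: not w["b"] and w["p"], False),
]

def solve_impacts(conds, rounds=50):
    """Smallest non-negative impact factors satisfying (11)/(12), by iteration."""
    imp = [0] * len(conds)
    for _ in range(rounds):
        changed = False
        for i, (ver, fal, strict) in enumerate(conds):
            def others(w):
                return sum(imp[k] for k, c in enumerate(conds) if k != i and c[1](w))
            need = (min(others(w) for w in WORLDS if ver(w))
                    - min(others(w) for w in WORLDS if fal(w))
                    + (1 if strict else 0))
            if max(need, 0) > imp[i]:
                imp[i], changed = max(need, 0), True
        if not changed:
            return imp
    raise RuntimeError("inequalities admit no solution within the given rounds")

IMPACTS = solve_impacts(CONDS)

def kappa_c(w):
    """Formula (10): sum the impact factors of the conditionals that w falsifies."""
    return sum(x for x, c in zip(IMPACTS, CONDS) if c[1](w))
```

The solver returns κ1− = κ2− = 1 and λ1− = λ2− = 0, and the eight resulting ranks reproduce Table 2; acceptance of c ≫ p follows via conditions (3) and (4).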
Equations (11) and (12) ensure that the impact factors are chosen such that κc∆≫ |= ∆≫. Just as in (6), (11) resp. (12) follows from the success condition in (3) resp. (4). Since we chose different impact factors κ− resp. λ− for the standard resp. the might conditionals, the terms in the minima look more complex even though they can be derived from (6). Also, we replaced the general form of might conditionals ⟨B̄|Ā⟩ by the more specific might conditional ⟨B̄i|Āi⟩, taking advantage of the special structure of difference-making conditionals. C-representations of difference-making conditionals exist iff all inequalities (11) and (12) are solvable. Sets of difference-making conditionals can thus be inductively represented by a c-representation. The crucial part is the reformulation of difference-making conditionals as sets of one standard and one might conditional in (2). Due to the high adaptability of the approach of c-representations, it is possible to deal with such a set of mixed conditionals.

In order to illustrate c-representations of difference-making conditionals, we now turn to the special case when
Proof. Let ∆≫ = {A ≫ B}. Since AB̄ ∧ ĀB ≡ ⊥, (13) follows immediately from (10). κst− > 0 follows from (11) and κw− ≥ 0 follows from (12), since there is no other difference-making conditional to interact with.
6 Revision by difference-making conditionals

In this section we discuss a revision method for epistemic states represented by an OCF with one difference-making conditional. To this end, we make use of the characterisation
ω      κc∆≫(ω)             ω      κc∆≫(ω)
cbp    0                   c̄bp    λ1− = 0
cbp̄    κ2− = 1             c̄bp̄    κ2− + λ1− = 1
cb̄p    κ1− + λ2− = 1       c̄b̄p    λ2− = 0
cb̄p̄    κ1− = 1             c̄b̄p̄    0
Since κ0 is a constant factor, it can be removed from the inequality. As in c-representations, the factor κi− is no element of the left sum, whereas the right sum ranges over worlds falsifying (Bi|Ai) resp. ⟨Bi|Ai⟩ and therefore the factor κi− is an element of every sum. With these deliberations we can rewrite the inequalities to (16) for all 1 ≤ i ≤ m. Note that the impact factors defining c-revisions are not unique because there are multiple solutions of the system of inequalities in (16). The question as to which choice of the impact factors is 'best' is part of our ongoing work.

Now we turn to the revision of an epistemic state by a single difference-making conditional in the framework of OCFs. In (2) we showed that the revision by a difference-making conditional is equivalent to revising a ranking function by a special set of conditionals, since A ≫ B corresponds to {(B|A), ⟨B̄|Ā⟩}. Thus, we need a revision method which is capable of dealing with a mixed set of conditionals. As we have seen before, c-revisions are an adaptable revision method for sets of conditionals, both for standard and for might conditionals. Following the general schema of c-revisions, we get:
Table 2: The ranking function κc∆≫ of Example 3.
of a difference-making conditional as a set of one standard conditional and one might conditional in (2) and provide a method for simultaneously revising an epistemic state with a standard and a might conditional.

C-revisions, introduced by Kern-Isberner (2001), provide a highly general framework for revising epistemic states by sets of conditionals. In the framework of ranking functions, c-revisions are capable of revising an OCF by a set of conditionals with respect to conditional interaction within the new information, while preserving conditional beliefs in the former belief state. This is all captured in the principle of conditional preservation, which implies the Darwiche-Pearl postulates for revising epistemic states (Kern-Isberner 2001, 2004). We will now introduce a simplified version of c-revisions for sets of standard and might conditionals.

Proposition 4 (C-revisions by sets of standard and might conditionals). Let κ be an OCF specifying a prior epistemic state and let ∆ = {(Bi|Ai) | i = 1, …, n} ∪ {⟨Bi|Ai⟩ | i = n + 1, …, m} be a set of standard and might conditionals which represents the new information. Then a c-revision of κ by ∆ is given by an OCF of the form

κ ∗ ∆(ω) = κ∗∆(ω) = κ0 + κ(ω) + Σ_{ω |= AiB̄i} κi−   (15)
with non-negative impact factors κi− for each conditional (Bi|Ai) ∈ ∆ resp. ⟨Bi|Ai⟩ ∈ ∆ satisfying

κi− (≥) min_{ω |= AiBi} { κ(ω) + Σ_{k≠i, ω |= AkB̄k} κk− } − min_{ω |= AiB̄i} { κ(ω) + Σ_{k≠i, ω |= AkB̄k} κk− }.   (16)

κ0 is a normalization factor which ensures that κ∗∆ is an OCF. The κi− can be considered as impact factors of the single conditionals (Bi|Ai) ∈ ∆ resp. ⟨Bi|Ai⟩ ∈ ∆ for falsifying the conditionals in ∆; they have to be chosen so as to ensure success, κ∗∆ |= ∆, by (16). As before, we use '(≥)' as a dummy operator which is replaced by the strict inequality symbol > for standard conditionals, while for might conditionals it is replaced by the inequality symbol ≥. From the success condition κ∗∆(AiBi) (≤) κ∗∆(AiB̄i) and the ranks of formulas, it holds that

min_{ω |= AiBi} { κ0 + κ(ω) + Σ_{ω |= AkB̄k} κk− }  (≤)  min_{ω |= AiB̄i} { κ0 + κ(ω) + Σ_{ω |= AkB̄k} κk− }.

Definition 5 (C-revision by a difference-making conditional). Let κ be an OCF specifying a prior epistemic state and let A ≫ B = {(B|A), ⟨B̄|Ā⟩} be a difference-making conditional which represents the new information. Then a c-revision of κ by A ≫ B is given by an OCF of the form

κ ∗ A ≫ B(ω) = κ∗∆≫(ω) = κ0 + κ(ω) + { κst− if ω |= AB̄;  κw− if ω |= ĀB;  0 otherwise }   (17)

with

κst− > κ(AB) − κ(AB̄)   (18)

and

κw− ≥ κ(ĀB̄) − κ(ĀB).   (19)

As before, κ0 is a normalization factor. The premises of the standard and the might conditional defining the difference-making conditional A ≫ B are exclusive, so the set A ≫ B = {(B|A), ⟨B̄|Ā⟩} is consistent and κ∗∆≫ always exists. The form of κ∗∆≫ in (17) follows from (15):

κ ∗ (A ≫ B)(ω) = κ ∗ {(B|A), ⟨B̄|Ā⟩}(ω) = κ0 + κ(ω) + Σ_{ω |= AB̄} κst− + Σ_{ω |= ĀB} κw−.

Since AB̄ and ĀB are exclusive and we revise with just a single difference-making conditional, we get (17). The success condition for the standard conditional (B|A) in (3) and the success condition for the might conditional ⟨B̄|Ā⟩ in (4) lead to inequalities defining the impact factors κst− resp. κw−. For κst− it holds that (18) follows immediately from (16):

κst− > min_{ω |= AB} { κ(ω) + Σ_{ω |= AkB̄k} κk− } − min_{ω |= AB̄} { κ(ω) + Σ_{ω |= AkB̄k} κk− }.

The minima range over worlds satisfying AB resp. AB̄, so the might conditional ⟨B̄|Ā⟩ is not falsified by any considered world
ω       κ∗(ω)               ω       κ∗(ω)
cbpd    0                   c̄bpd    0
cbpd̄    0 + κw− = 0         c̄bpd̄    0 + κw− = 0
cbp̄d    1                   c̄bp̄d    1
cbp̄d̄    1 + κw− = 1         c̄bp̄d̄    1 + κw− = 1
cb̄pd    1 + κst− = 2        c̄b̄pd    0 + κst− = 1
cb̄pd̄    1                   c̄b̄pd̄    0
cb̄p̄d    1 + κst− = 2        c̄b̄p̄d    0 + κst− = 1
cb̄p̄d̄    1                   c̄b̄p̄d̄    0
The idea of incorporating relevance into the analysis of conditionals has been around for a long time, and several attempts to implement this kind of connective have been made. In this section, we explore and compare some of these ideas.

The earliest work establishing a tight connection between conditionals and belief revision was Gärdenfors (1979). In a similar vein, Fariñas and Herzig (1996) uncover a strong link between belief contraction (which is known to be dual to belief revision) and dependence. Their idea is close to the idea of relevance introduced in Rott (1986), and their work is cited by Rott (2019). This is what Fariñas and Herzig understand by the phrase 'B depends on A':

(FHD) Ψ |= A ❀ B iff B ∈ Bel(Ψ) and B ∉ Bel(Ψ −̇ A).

So B depends on A if and only if B is believed in the current belief state Ψ and B is no longer believed if A is withdrawn from the belief set of Ψ. There are some notable differences to the Relevant Ramsey Test. The most striking one is that the domain of Fariñas and Herzig's dependency relation is restricted to the agent's current belief set, since Ψ |= A ❀ B implies that A, B ∈ Bel(Ψ). It fails to acknowledge dependencies between non-beliefs, i.e., propositions that the agent either believes to be false or suspends judgement on, like the propositions featuring in counterfactuals, which typically are non-beliefs.

A second strand of research to compare with the present one is the study of conditionals incorporating relevance in a probabilistic framework that was begun by Douven (2016) and Crupi and Iacona (2019b). Crupi and Iacona (2019a) suggested a non-probabilistic possible-worlds semantics for the 'evidential conditional' that can be defined as follows:

(CPC) A ✄ B iff (B|A) and (Ā|B̄).

Let us call such conditionals contraposing conditionals. Crupi and Iacona call a rule essentially identical to (CPC) the 'Chrysippus Test' (Crupi and Iacona 2019a) and say that it characterizes the evidential interpretation of conditionals according to which 'a conditional is true just in case its antecedent provides evidence [or support] for its consequent.' Raidl (2019) provided the first completeness proof for the 'evidential conditional', which has been improved in Raidl, Crupi and Iacona (2020). Independently, Booth and Chandler (2020, Proposition 12) hit upon the same concept of contraposing conditionals and have started investigating it.

Rott (2020) raises doubts as to whether contraposition really captures the idea of evidence or support. It is true that contraposing conditionals do violate RW, and this violation was called the hallmark of relevance by Rott. Except for that, contraposing conditionals are formally very well-behaved, as they validate, for example, Or, Cautious Monotony, Negation Rationality and Disjunctive Rationality. These principles are all violated by difference-making conditionals. However, Rott argues that the contrastive notion of difference-making is better motivated as an explication of evidence and support than contraposition. The Relevant Ramsey Test (which can be found, under the name 'Strong Ramsey Test', already in Rott 1986) has ancestors in Gärdenfors' (1980) notion of explanation and in Spohn's (1983) notion of reason, which both encode the idea that the
Table 3: Schematic c-revised ranking function κ∗ = κc∆≫ ∗(d ≫ b)
of Example 4. Note that κ0 = 0 which is why it is not represented
in this table.
and thus the sums are empty and we get (18). Analogously, (19) follows from (16):

κw− ≥ min_{ω |= ĀB̄} { κ(ω) + Σ_{ω |= ĀlBl} λl− } − min_{ω |= ĀB} { κ(ω) + Σ_{ω |= ĀlBl} λl− }.

The sums in the minima are empty because the standard conditional is not falsified by any world satisfying ĀB̄ resp. ĀB. Conditions (18) and (19) ensure the success condition κ∗∆≫ |= ∆≫.
As we have seen, c-revision provides a revision method for OCFs which can handle sets of standard and might conditionals. The admissible impact factors allow for a combination of standard and might conditionals in the revision. Together with the special structure of difference-making conditionals, we obtain a revision method for epistemic states which takes a difference-making conditional as input and thereby ensures that the antecedent of the conditional is relevant to the consequent.

Now we give an example of a c-revision by a single difference-making conditional:

Example 4 (Continues Example 3). The plumber arrives at the agent's house and tells her that another common reason for broken pipes is deposits in the pipe (d). Since the house is pretty old, the pipe could also have broken because of these deposits. The agent revises her belief state κc∆≫ with the new information d ≫ b = {(b|d), ⟨b̄|d̄⟩}.
Note that κc∆≫(ċḃṗ) = κc∆≫(ċḃṗḋ), with ȧ ∈ {a, ā} for any Boolean variable a. Using (18) and (19), we calculate κst− > κc∆≫(db) − κc∆≫(db̄) = 0 and κw− ≥ κc∆≫(d̄b̄) − κc∆≫(d̄b) = 0, and choose κst− = 1 and κw− = 0. Using Definition 5 we get κc∆≫ ∗ (d ≫ b) = κ∗, which is depicted in Table 3. Note that in κ∗ the difference-making conditional c ≫ b still holds, so the new reason relation between the deposits and the broken pipe does not overwrite the connection between cold temperatures and the broken pipe.
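The revision of Example 4 is easy to replay. The following sketch is our own code (names assumed): it takes the prior ranks κc∆≫ of Table 2, applies (17) with κ0 = 0, κst− = 1 and κw− = 0, and then re-checks both d ≫ b and c ≫ b against conditions (3) and (4).

```python
from itertools import product

WORLDS = [dict(zip("cbpd", bits)) for bits in product([True, False], repeat=4)]

def prior(w):
    """kappa_c of Example 3 (Table 2); it does not depend on d."""
    r = 0
    if w["c"] and not w["b"]: r += 1     # falsifies (b|c), kappa_1 = 1
    if w["b"] and not w["p"]: r += 1     # falsifies (p|b), kappa_2 = 1
    return r                             # lambda_1 = lambda_2 = 0

K_ST, K_W = 1, 0                         # impact factors chosen for d >> b

def revised(w):
    """Formula (17) with kappa_0 = 0."""
    r = prior(w)
    if w["d"] and not w["b"]: r += K_ST  # penalise worlds falsifying (b|d)
    if not w["d"] and w["b"]: r += K_W   # penalise worlds falsifying <~b|~d>
    return r

def rank(f):
    return min(revised(w) for w in WORLDS if f(w))

new_dmc_holds = (rank(lambda w: w["d"] and w["b"]) < rank(lambda w: w["d"] and not w["b"])
                 and rank(lambda w: not w["d"] and not w["b"]) <= rank(lambda w: not w["d"] and w["b"]))
old_dmc_holds = (rank(lambda w: w["c"] and w["b"]) < rank(lambda w: w["c"] and not w["b"])
                 and rank(lambda w: not w["c"] and not w["b"]) <= rank(lambda w: not w["c"] and w["b"]))
```

Both conditionals come out accepted, matching the ranks of Table 3.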
7 Related Work
Difference-making conditionals establish a notion of relevance for conditionals, namely that the antecedent A of a
conditional ‘If A, then B’ is relevant for its consequent B.
κ2(m̄r̄) = 2 and κ2(mr̄) = 3 captures Scenario 2. As we can see, κ1 |= m ≫ r, since κ1(mr) = 0 < 1 = κ1(mr̄) and κ1(m̄r̄) = 2 ≤ 3 = κ1(m̄r), but κ1 ⊭ m ✄ r, since κ1(m̄r̄) = 2 > 1 = κ1(mr̄). For the second scenario, it holds that κ2 |= m ✄ r but κ2 ⊭ m ≫ r. If we compare this with our intuition about the relation between medicine and recovery, we find that the difference-making conditional gets the example right.
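Our reading of the ranks above can be verified mechanically. The sketch below (our code, names assumed) encodes κ1 and κ2 over the four worlds and tests both connectives, using conditions (3)/(4) for ≫ and (20)/(21) for ✄:

```python
# Worlds keyed by (m, r): truth values for medicine and recovery.
K1 = {(True, True): 0, (True, False): 1, (False, False): 2, (False, True): 3}
K2 = {(True, True): 0, (False, True): 1, (False, False): 2, (True, False): 3}

def dmc(k):
    """m >> r: conditions (3) and (4)."""
    return k[(True, True)] < k[(True, False)] and k[(False, False)] <= k[(False, True)]

def contraposing(k):
    """m |> r: conditions (20) and (21)."""
    return k[(True, True)] < k[(True, False)] and k[(False, False)] < k[(True, False)]
```

κ1 accepts m ≫ r but not m ✄ r, while κ2 accepts m ✄ r but not m ≫ r, just as the text argues.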
Another argument for the notion of relevance encoded by difference-making conditionals is that it accords with the work of Spohn, who defines causation as follows:

A is a cause of B iff A and B obtain, A precedes B, and A raises the metaphysical or epistemic status of B given the obtaining circumstances. (Spohn 2012, p. 352)

As we can see, this is a compound of facts, times, obtaining circumstances and a reason relation. We do not deal with the first three components, but we can compare difference-making conditionals with Spohn's concept of reason. In terms of ranking functions, A is a reason for B if the following inequality holds for a ranking function κ:
explanans (or the reason) should raise the doxastic status of the explanandum (or of what the reason is a reason for).

If we define a ranking semantics for contraposing conditionals using the framework of Spohn's ranking functions, we can compare these two notions of relevance from a technical point of view. Let κ be a ranking function and A ✄ B be a contraposing conditional with contingent A and B. Then

(CPCocf) κ |= A ✄ B iff κ |= (B|A) and κ |= (Ā|B̄)

iff both of the following two conditions hold:

κ(AB) < κ(AB̄) and   (20)
κ(ĀB̄) < κ(AB̄).   (21)
Difference-making and contraposing conditionals both require the acceptance of the standard conditional (B|A), but they differ in the case when the antecedent is denied. Compare (20) and (21) with (3) and (4). Difference-making conditionals require the ĀB̄-worlds to be at least as plausible as the ĀB-worlds, stressing that the denial of the antecedent should not lead to acceptance of the consequent. For contraposing conditionals, the denial of the consequent leads to denial of the antecedent, so some ĀB̄-worlds are required to be strictly more plausible than all the AB̄-worlds. Difference-making conditionals place inequality constraints on all possible worlds in Ω{A,B}, whereas contraposing conditionals do not deal with the position of the ĀB-worlds at all.
To give a feel for the contrast between difference-making
conditionals and contraposing conditionals, we present an
example from Rott (2020) and transfer it to the framework
of ranking functions. Suppose an infectious disease breaks
out with millions of cases, and consider the following two
scenarios concerning a treatment:
Scenario 1: Almost all of the people infected were administered a medicine and almost all of them have recovered.
However, only few of the persons who did not receive the
medicine have recovered.
Scenario 2: Only very few of the people infected were administered the medicine. But fortunately, most people end
up recovering anyway. It turns out that within the group
of people who got the medicine slightly less people have
recovered than within the group who did not get it.
We compare these two scenarios and imagine an agent who
has contracted the disease, but of whom it is not know
whether she got the medicine. In Scenario 1, the fact that
the agent received the medicine would clearly support the
fact that she recovered, as it would clearly make the recovery more likely. So we are justified in accepting the conditional ‘If the agent received the medicine, she has recovered’. However, in scenario 2 it does not make sense to apply
this conditional. It is likely that the agent has recovered, but
having received the medicine would not be evidence for the
recovery. We depict these two scenarios using ranking functions. Let m stand for ‘the agent received the medicine’ and
r for ‘the agent recovered’. The ranking function κ1 with
κ1 (mr) = 0, κ1 (mr) = 1, κ1 (m r) = 2 and κ1 (mr) = 3
captures scenario 1 and κ2 with κ2 (mr) = 0, κ2 (m r) = 1,
κ(B|A) − κ(B|A) > κ(B|A) − κ(B|A).
(22)
Compare Spohn (2012, p. 105, using the definition of twosided ranks τ (B|A) = κ(B|A) − κ(B|A)). Inequality (22)
expresses that the conditional (B|A) is stronger than (B|A).
Thus, A is a direct[!] cause of B in Spohn’s sense just in case
A and B are true, the event represented by A precedes the
event represented by B and Spohn’s inequality (22) holds,
given the obtaining circumstances. For κ |= A ≫ B, equations (3) and (4) hold. Via the definition of ranks for conditionals we first elaborate on (22):
(22) ⇔ κ(AB) − κ(A) − (κ(AB) − κ(A))
> κ(AB) − κ(A) − (κ(AB) − κ(A))
⇔ κ(AB) − κ(AB) > κ(AB) − κ(AB).
Now if κ |= A ≫ B, then the left-hand side is positive, due to (3), whereas the right-hand side is not, due
to (4). So, the inequality expressing the notion of reason
defined by Spohn follows immediately from the definition
of difference-making conditionals as a set of standard and
might conditionals. As was pointed out by Eric Raidl (2020,
p. 17), A ≫ C expresses that A is a ‘sufficient reason’ for
C in the terminology of Spohn (2012, pp. 107–108).
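To make the comparison concrete, the three acceptance conditions can be checked mechanically. The following sketch is ours, not from the paper: the ranks are illustrative values for Scenario 1, and all helper names are our own. It evaluates the difference-making conditional m ≫ r (conditions (3) and (4)), the contraposing conditional m ✄ r (conditions (20) and (21)), and Spohn's reason inequality (22):

```python
# Worlds are pairs (m, r) of truth values; kappa maps each world to its
# implausibility rank. These are illustrative ranks for Scenario 1
# (medicine strongly supports recovery), not values taken from the paper.
kappa = {(True, True): 0, (True, False): 1, (False, False): 2, (False, True): 3}

def rank(phi):
    """Rank of a formula = minimum rank of its models (infinity if none)."""
    return min((k for w, k in kappa.items() if phi(w)), default=float("inf"))

M = lambda w: w[0]          # "the agent received the medicine"
R = lambda w: w[1]          # "the agent recovered"
NOT = lambda p: (lambda w: not p(w))
AND = lambda p, q: (lambda w: p(w) and q(w))

def difference_making(a, b):
    # kappa |= A >> B iff kappa(AB) < kappa(A~B) and kappa(~A~B) <= kappa(~AB)
    return (rank(AND(a, b)) < rank(AND(a, NOT(b)))
            and rank(AND(NOT(a), NOT(b))) <= rank(AND(NOT(a), b)))

def contraposing(a, b):
    # kappa |= A |> B iff kappa(AB) < kappa(A~B) and kappa(~A~B) < kappa(A~B)
    return (rank(AND(a, b)) < rank(AND(a, NOT(b)))
            and rank(AND(NOT(a), NOT(b))) < rank(AND(a, NOT(b))))

def spohn_reason(a, b):
    # Inequality (22): kappa(~B|A) - kappa(B|A) > kappa(~B|~A) - kappa(B|~A),
    # where the conditional rank is kappa(B|A) = kappa(AB) - kappa(A).
    cond = lambda y, x: rank(AND(x, y)) - rank(x)
    return cond(NOT(b), a) - cond(b, a) > cond(NOT(b), NOT(a)) - cond(b, NOT(a))

print(difference_making(M, R))  # True: the medicine is relevant to recovery
print(contraposing(M, R))       # False: condition (21) fails for these ranks
print(spohn_reason(M, R))       # True, as implied by m >> r
```

With these particular ranks the difference-making conditional holds while the contraposing one does not, and the Spohn-reason inequality follows from the difference-making conditional, consistent with the derivation above.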
8 Conclusion
Difference-making conditionals aim at capturing the intuition that the antecedent A of a conditional is relevant to its
consequent B, that A supports B or is a reason or evidence
for it. The Relevant Ramsey Test encodes this idea, ruling
that revising by the antecedent should lead to acceptance of
the consequent, which is the standard Ramsey Test, but also
ruling that revising by the negation of the antecedent should
not lead to the acceptance of the consequent. Rott (2019) defined the Relevant Ramsey Test and difference-making conditionals in a purely qualitative framework. In the present
paper we extended his approach to ranking functions by first
transferring the Relevant Ramsey Test to the framework of OCFs. We defined difference-making conditionals as a pair consisting of a standard and a might conditional, which is in full compliance with the basic principles that Rott identified for difference-making conditionals. Using this transformation, we benefitted from the flexible approach of c-representations and c-revisions, defining an inductive representation and a revision method for conditionals incorporating relevance. To the best of our knowledge, there is no other revision method capable of dealing not only with sets of conditionals but also with sets of conditionals of different types, namely standard and might-conditionals. Finally, drawing on the ranking semantics for difference-making conditionals, we compared different approaches to relevance or evidence in conditionals. We showed that difference-making conditionals express something very close to Spohn's concept of reason in the context of ranking functions, but that they are fundamentally different from the evidential (or contraposing) conditionals studied by Crupi, Iacona and Raidl.

For future work we plan to elaborate on the inductive representation of mixed sets of conditionals. Moreover, we will continue working on the incorporation of relevance in different kinds of epistemic states and examine different revision methods for conditionals incorporating relevance.

References

Alchourrón, C.; Gärdenfors, P.; and Makinson, D. 1985. On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic 50(2):510–530.
Booth, R., and Chandler, J. 2020. On strengthening the logic of iterated belief revision: Proper ordinal interval operators. Artificial Intelligence 285:103289.
Booth, R., and Paris, J. 1998. A note on the rational closure of knowledge bases with both positive and negative knowledge. Journal of Logic, Language and Information 7(2):165–190.
Crupi, V., and Iacona, A. 2019a. The evidential conditional. PhilSci-Archive, http://philsci-archive.pitt.edu/16759.
Crupi, V., and Iacona, A. 2019b. Three ways of being non-material. PhilSci-Archive, http://philsci-archive.pitt.edu/16478.
Douven, I. 2016. The Epistemology of Indicative Conditionals: Formal and Empirical Approaches. Cambridge: Cambridge University Press.
Dubois, D., and Prade, H. 2006. Possibility theory and its applications: a retrospective and prospective view. In Della Riccia, G.; Dubois, D.; Kruse, R.; and Lenz, H.-J., eds., Decision Theory and Multi-Agent Planning. Vienna: Springer. 89–109.
Eichhorn, C.; Kern-Isberner, G.; and Ragni, M. 2018. Rational inference patterns based on conditional logic. In McIlraith, S. A., and Weinberger, K. Q., eds., Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 1827–1834. Menlo Park, CA: AAAI Press.
Fariñas del Cerro, L., and Herzig, A. 1996. Belief change and dependence. In Proceedings of the 6th Conference on Theoretical Aspects of Rationality and Knowledge, 147–161. San Francisco, CA: Morgan Kaufmann.
Gärdenfors, P. 1979. Conditionals and changes of belief. In Niiniluoto, I., and Tuomela, R., eds., The Logic and Epistemology of Scientific Change, volume 30(2–4) of Acta Philosophica Fennica. Amsterdam: North-Holland. 381–404.
Gärdenfors, P. 1980. A pragmatic approach to explanations. Philosophy of Science 47(3):404–423.
Halpern, J. 2003. Reasoning about Uncertainty. Cambridge, MA: MIT Press.
Kern-Isberner, G. 2001. Conditionals in Nonmonotonic Reasoning and Belief Revision, volume 2087 of Lecture Notes in Computer Science. Berlin: Springer.
Kern-Isberner, G. 2004. A thorough axiomatization of a principle of conditional preservation in belief revision. Annals of Mathematics and Artificial Intelligence 40(1–2):127–164.
Lehmann, D., and Magidor, M. 1992. What does a conditional knowledge base entail? Artificial Intelligence 55(1):1–60.
Lewis, D. K. 1973. Counterfactuals. Oxford: Blackwell.
Raidl, E.; Iacona, A.; and Crupi, V. 2020. The logic of the evidential conditional. Manuscript, March 2020.
Raidl, E. 2019. Quick completeness for the evidential conditional. PhilSci-Archive, http://philsci-archive.pitt.edu/16664.
Raidl, E. 2020. Definable conditionals. Topoi. https://doi.org/10.1007/s11245-020-09704-3.
Rott, H. 1986. Ifs, though, and because. Erkenntnis 25(3):345–370.
Rott, H. 2019. Difference-making conditionals and the relevant Ramsey test. Review of Symbolic Logic. https://doi.org/10.1017/S1755020319000674.
Rott, H. 2020. Notes on contraposing conditionals. PhilSci-Archive, http://philsci-archive.pitt.edu/17092.
Skovgaard-Olsen, N.; Collins, P.; Krzyżanowska, K.; Hahn, U.; and Klauer, K. C. 2019. Cancellation, negation, and rejection. Cognitive Psychology 108:42–71.
Spohn, W. 1983. Deterministic and probabilistic reasons and causes. In Hempel, C. G.; Putnam, H.; and Essler, W. K., eds., Methodology, Epistemology, and Philosophy of Science. Dordrecht: Springer. 371–396.
Spohn, W. 1988. Ordinal conditional functions: A dynamic theory of epistemic states. In Harper, W. L., and Skyrms, B., eds., Causation in Decision, Belief Change, and Statistics. Dordrecht: Springer. 105–134.
Spohn, W. 2012. The Laws of Belief. Oxford: Oxford University Press.
Stalnaker, R. C. 1968. A theory of conditionals. In Rescher, N., ed., Studies in Logical Theory (American Philosophical Quarterly Monographs 2). Oxford: Blackwell. 98–112.
Stability in Abstract Argumentation
Jean-Guy Mailly , Julien Rossit
LIPADE, Université de Paris
{jean-guy.mailly, julien.rossit}@u-paris.fr
Abstract

The notion of stability in a structured argumentation setup characterizes situations where the acceptance status associated with a given literal will not be impacted by any future evolution of this setup. In this paper, we abstract away from the logical structure of arguments, and we transpose this notion of stability to the context of Dungean argumentation frameworks. In particular, we show how this problem can be translated into reasoning with Argument-Incomplete AFs. Then we provide preliminary complexity results for stability under four prominent semantics, in the case of both credulous and skeptical reasoning. Finally, we illustrate to what extent this notion can be useful with an application to argument-based negotiation.

1 Introduction

Formal argumentation is a family of non-monotonic reasoning approaches with applications to (e.g.) multi-agent systems (McBurney, Parsons, and Rahwan 2012), automated negotiation (Dimopoulos, Mailly, and Moraitis 2019) or decision making (Amgoud and Vesic 2012). Roughly speaking, we can group the research in this domain into two families: abstract argumentation (Dung 1995) and structured argumentation (Besnard et al. 2014). The former is mainly based on the seminal paper by Dung, where abstract argumentation frameworks (AFs) are defined as directed graphs whose nodes represent arguments and whose edges represent attacks between them. In this setting, the nature of arguments and attacks is not defined; only their interactions are represented in order to determine the acceptability status of arguments. On the opposite side, different settings have been proposed where the arguments are built from logical formulas or rules, and the nature of attacks is based on logical conflicts between the elements inside the arguments. See e.g. (Baroni, Gabbay, and Giacomin 2018) for a recent overview of abstract and structured argumentation.

In a particular structured argumentation setting, the notion of stability has been defined recently (Testerink, Odekerken, and Bex 2019). Intuitively, it represents a situation where a certain argument of interest will no longer have the possibility to change its acceptability status. Either it is currently accepted and will remain so, or on the contrary it is currently rejected, and nothing could make it accepted in the future. In the existing work on this topic, the authors mention some application to crime investigation (more precisely, Internet trade fraud). We also have in mind some other natural applications, like automated negotiation. For instance, if an agent is certain that her argument supporting her preferred offer cannot be accepted at any future step of the debate, she can switch her offer to another one that may be less preferred, but could at least be accepted.

In this paper, we adapt the notion of stability to abstract argumentation, and we show that checking stability is equivalent to performing some well-known reasoning tasks in Argument-Incomplete AFs (Baumeister, Rothe, and Schadrack 2015; Baumeister, Neugebauer, and Rothe 2018; Niskanen et al. 2020). While existing work on stability in structured argumentation focuses on a particular semantics (namely the grounded semantics), our approach is generic with respect to the underlying extension-based semantics. Moreover, we consider both credulous and skeptical variants of argumentative reasoning.

This paper is organized as follows. Section 2 introduces the basic notions of the abstract argumentation literature in which our work takes place, and presents the concept of stability for structured argumentation frameworks. We then propose in Section 3 a counterpart of this notion of stability adapted to abstract argumentation frameworks, and we show how we can reduce it to well-known reasoning tasks. We provide some lower and upper bounds for the computational complexity of checking whether an AF is stable. Section 4 then describes an application scenario in the context of automated negotiation. Finally, Section 5 discusses related work, and Section 6 concludes the paper by highlighting some promising future works.

2 Background

2.1 Abstract Argumentation
Let us first introduce the abstract argumentation framework
defined in (Dung 1995).
Definition 1. An argumentation framework (AF) is a pair F = ⟨A, R⟩ where A is the set of arguments and R ⊆ A × A is the attack relation.
In this framework, we are not concerned with the precise
nature of arguments (e.g. their internal structure or their origin) and attacks (e.g. the presence of contradictions between
elements on which arguments are built). Only the relations
between arguments (i.e. the attacks) are taken into account
to evaluate the acceptability of arguments.
We focus on finite AFs, i.e. AFs with a finite set of arguments. For a, b ∈ A, we say that a attacks b if (a, b) ∈ R.
Moreover, if b attacks some c ∈ A, then a defends c against
b. These notions are extended to sets of arguments: S ⊆ A
attacks (respectively defends) b ∈ A if there is some a ∈ S
that attacks (respectively defends) b. The acceptability of arguments is evaluated through a notion of extension, i.e. a set
of arguments that are jointly acceptable. To be considered
as an extension, a set has to satisfy some minimal requirements:
• S ⊆ A is conflict-free (denoted S ∈ cf(F)) iff ∀a, b ∈ S, (a, b) ∉ R;
• S ∈ cf(F) is admissible (denoted S ∈ ad(F)) iff S defends all its elements against all their attackers.

Then, Dung defines several semantics:

Definition 2. Given an AF F = ⟨A, R⟩, a set S ⊆ A is:
• a complete extension (S ∈ co(F)) iff S ∈ ad(F) and S contains all the arguments that it defends;
• a preferred extension (S ∈ pr(F)) iff S is a ⊆-maximal complete extension;
• the unique grounded extension (S ∈ gr(F)) iff S is the ⊆-minimal complete extension;
• a stable extension (S ∈ st(F)) iff S ∈ cf(F) and S attacks each a ∈ A \ S,
where ⊆-maximal and ⊆-minimal denote respectively the maximal and the minimal elements for classical set inclusion.

Example 1. Let F = ⟨A, R⟩ be the AF depicted in Figure 1. Nodes in the graph represent the arguments A, while the edges correspond to the attacks R. Its extensions for σ ∈ {gr, st, pr, co} are given in Table 1.

[Figure 1: An Example of AF F — a directed graph over the arguments a1, ..., a7.]

Semantics σ    σ-extensions
grounded       {{a1}}
stable         {{a1, a4, a6}}
preferred      {{a1, a4, a6}, {a1, a3}}
complete       {{a1, a4, a6}, {a1, a3}, {a1}}

Table 1: σ-Extensions of F

We refer the interested reader to (Baroni, Caminada, and Giacomin 2018) for more details about these semantics, as well as other ones defined after Dung's initial work. From the set of extensions σ(F) (for σ ∈ {co, pr, gr, st}), we define two reasoning modes:
• an argument a ∈ A is credulously accepted with respect to σ iff a ∈ S for some S ∈ σ(F);
• an argument a ∈ A is skeptically accepted with respect to σ iff a ∈ S for each S ∈ σ(F).

Then, a possible enrichment of Dung's framework consists in taking into account some uncertainty in the AF. This yields the notion of Incomplete AFs, studied e.g. in (Baumeister, Rothe, and Schadrack 2015; Baumeister, Neugebauer, and Rothe 2018; Niskanen et al. 2020). Here, we focus on a particular type, namely Argument-Incomplete AFs, but for the sake of simplicity we just refer to them as Incomplete AFs.

Definition 3. An incomplete argumentation framework (IAF) is a tuple I = ⟨A, A?, R⟩ where
• A is the set of certain arguments;
• A? is the set of uncertain arguments;
• R ⊆ (A ∪ A?) × (A ∪ A?) is the attack relation;
and A, A? are disjoint sets of arguments.

Example 2. The IAF I = ⟨A, A?, R⟩ is shown in Figure 2. The dotted nodes represent the uncertain arguments A?. Plain nodes and arrows have the same meaning as previously.

[Figure 2: An Example of IAF I — the graph of Figure 1 where a4 and a7 are dotted (uncertain) nodes.]

Uncertain arguments are those that may not actually belong to the system (for instance because of some uncertainty about the agent's environment). There are different ways to "solve" the uncertainty in an IAF, which correspond to different completions:

Definition 4. Given an IAF I = ⟨A, A?, R⟩, a completion is an AF F = ⟨A′, R′⟩ where
• A ⊆ A′ ⊆ A ∪ A?;
• R′ = R ∩ (A′ × A′).

Example 3. Considering again I from the previous example, we show all its completions in Figure 3. For each uncertain argument in A? = {a4, a7}, there are two possibilities: either the argument is present, or it is not. Thus, there are four completions.
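On small AFs, the semantics of Definition 2 can be computed by brute-force enumeration of subsets of arguments. The sketch below is ours (the toy AF is an invented example, not the AF of Figure 1), with attacks represented as a set of pairs:

```python
from itertools import combinations

def powerset(xs):
    xs = list(xs)
    for r in range(len(xs) + 1):
        for c in combinations(xs, r):
            yield set(c)

def conflict_free(S, R):
    return not any((a, b) in R for a in S for b in S)

def defends(S, x, A, R):
    # S defends x iff every attacker of x is attacked by some member of S
    return all(any((s, b) in R for s in S) for b in A if (b, x) in R)

def complete(A, R):
    return [S for S in powerset(A)
            if conflict_free(S, R)
            and all(defends(S, x, A, R) for x in S)             # admissible
            and all(x in S for x in A if defends(S, x, A, R))]  # contains all it defends

def grounded(A, R):
    # the unique subset-minimal complete extension (also the smallest one,
    # since it is contained in every complete extension)
    return min(complete(A, R), key=len)

def preferred(A, R):
    cos = complete(A, R)
    return [S for S in cos if not any(S < T for T in cos)]  # subset-maximal

def stable(A, R):
    return [S for S in powerset(A)
            if conflict_free(S, R)
            and all(any((s, a) in R for s in S) for a in A - S)]

# Toy AF: a and b attack each other, both attack c, and c attacks d.
A = {"a", "b", "c", "d"}
R = {("a", "b"), ("b", "a"), ("a", "c"), ("b", "c"), ("c", "d")}
print(grounded(A, R))                              # set() -- the empty extension
print(sorted(sorted(S) for S in preferred(A, R)))  # [['a', 'd'], ['b', 'd']]
print(sorted(sorted(S) for S in stable(A, R)))     # [['a', 'd'], ['b', 'd']]
```

The toy AF illustrates how the semantics can differ: the grounded extension is empty because the mutual attack between a and b leaves nothing unconditionally defended, while the preferred and stable semantics each commit to one side of that conflict.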
[Figure 3: The Completions of I — the four completions (a) C1, (b) C2, (c) C3, (d) C4.]

This means that a completion is a "classical" AF made of all the certain arguments, some of the uncertain elements, and all the attacks that concern the selected arguments. Reasoning with an IAF generalizes reasoning with an AF, by taking into account either some or each completion. Formally, given an IAF I and a semantics σ, the status of an argument a ∈ A is:
• possibly credulously accepted with respect to σ iff a belongs to some σ-extension of some completion of I;
• possibly skeptically accepted with respect to σ iff a belongs to each σ-extension of some completion of I;
• necessarily credulously accepted with respect to σ iff a belongs to some σ-extension of each completion of I;
• necessarily skeptically accepted with respect to σ iff a belongs to each σ-extension of each completion of I.

Example 4. Let us consider again I from the previous example, and its completions C1, C2, C3 and C4. We observe that a1 is necessarily skeptically accepted for any semantics, since it appears unattacked in every completion (thus, it belongs to every extension of every completion).
On the opposite, a6 is possibly credulously accepted with respect to the preferred semantics: it belongs to some extension of C4. It is not skeptically accepted (because {a1, a3} is a preferred extension of C4 as well), and it is not necessarily accepted (because in C1 it is not defended against a5, thus it cannot belong to any extension).

2.2 Stability in Structured Argumentation

Now we briefly introduce the argumentation setting from (Testerink, Odekerken, and Bex 2019), based on ASPIC+ (Modgil and Prakken 2014).
Let us start with the notation used to represent the negation of a literal: for a propositional variable p, −p = ¬p and −(¬p) = p, with ¬ the classical negation. We call p (respectively ¬p) a positive (respectively negative) literal.

Definition 5. An argumentation setup is a tuple AS = ⟨L, R, Q, K, τ⟩ where:
• L is a set of literals s.t. l ∈ L implies −l ∈ L;
• R is a set of defeasible rules p1, ..., pm ⇒ q s.t. p1, ..., pm, q ∈ L. Such a rule is called "a rule for q";
• Q ⊆ L is a set of queryable literals, s.t. no q ∈ Q is a negative literal;
• K ⊆ L is the agent's (consistent) knowledge base;
• τ ∈ L is a particular literal called the topic.

Usual mechanisms are used to define arguments and attacks. An argument for a literal q is an inference tree rooted in a rule p1, ..., pm ⇒ q, such that for each pi, there is a child node that is either an argument for pi, or an element of the knowledge base. Then, an argument A attacks an argument B if the literal supported by A is the negation of some literal involved in the construction of B. From the sets of arguments and attacks built in this way, the grounded extension is defined as usual (see Definition 2).

Given an argumentation setup AS, the status of the topic τ may be:
• unsatisfiable if there is no argument for τ in AS;
• defended if there is an argument for τ in the grounded extension of AS;
• out if there are some arguments for τ in AS, and all of them are attacked by the grounded extension;
• blocked in the remaining case.

Then, stability can be defined, based on the following notion of future setups:

Definition 6. Let AS = ⟨L, R, Q, K, τ⟩ be an argumentation setup. The set of future setups of AS, denoted by F(AS), is defined by F(AS) = {⟨L, R, Q, K′, τ⟩ | K ⊆ K′}. AS is called stable if for each AS′ ∈ F(AS), the status of τ is the same as in AS.

Intuitively, a future setup is built by adding new literals to the knowledge base (keeping the consistency property, of course). Then, new arguments and attacks may be built thanks to these new literals. The setup is stable if these new arguments and attacks do not change the status of the topic.

To conclude this section, let us mention that (Testerink, Odekerken, and Bex 2019) provides a sound algorithm that approximates the reasoning task of checking the stability of a setup. This algorithm is however not complete: AS is guaranteed to be stable if the algorithm returns a positive answer, but there are stable setups that are not identified by the algorithm. The algorithm has the advantage of being polynomially computable (more precisely, it stops in O(n²) steps, where n = |L| + |R|).

3 Stability in Abstract Argumentation

In this section, we describe how we adapt the notion of stability to abstract argumentation. Contrary to previous works, we do not focus on a specific semantics, and thus we consider both credulous and skeptical reasoning. Moreover, we provide a translation of the stability problem into reasoning with AFs and IAFs. Despite being theoretically intractable, these problems can be solved efficiently in practice by existing algorithms. This paves the way to future implementations of an exact algorithm for checking stability, and to its application in concrete scenarios.
3.1 Formal Definition of Stability in AFs

From now on, we consider a finite argumentation universe that is represented by an AF FU = ⟨AU, RU⟩. We suppose that any "valid" AF is made of arguments and attacks in FU, i.e. F = ⟨A, R⟩ s.t. A ⊆ AU and R = RU ∩ (A × A).

Definition 7. Given an AF F = ⟨A, R⟩, we call the future AFs of F the set of AFs F(F) = {F′ = ⟨A′, R′⟩ | A ⊆ A′}.

Intuitively speaking, this means that a future AF represents a possible way to continue the argumentative process (by adding arguments and attacks), in accordance with FU. This corresponds to some kind of expansions of F (Baumann and Brewka 2010), where the authorized expansions are constrained by FU. This is reminiscent of the set of authorized updates defined in (de Saint-Cyr et al. 2016). Notice that F is itself a particular future AF.

Now we have all the elements to define stability.

Definition 8. Given an AF F = ⟨A, R⟩, a ∈ A an argument, and σ a semantics, we say that F is credulously (respectively skeptically) σ-stable with respect to a iff
• either ∀F′ ∈ F(F), a is credulously (respectively skeptically) accepted with respect to σ;
• or ∀F′ ∈ F(F), a is not credulously (respectively skeptically) accepted with respect to σ.

Although in this paper we focus on σ ∈ {gr, st, pr, co}, the definition of stability is generic, and the concept can be applied with any extension semantics (Baroni, Caminada, and Giacomin 2018).

Example 5. Let us consider the argumentation universe FU and the AF F, both depicted in Figure 4. The argument a3 is not credulously σ-stable for σ = st, since it is credulously accepted in F, but not in the future AF where a2 is added. On the contrary, it is skeptically σ-stable since it is not skeptically accepted in F, nor in any future AF.
a6 is skeptically σ-stable as well, but for another reason: indeed we observe that in F (and in any future AF), a6 is defended by the (unattacked) argument a7, thus it belongs to every extension.

[Figure 4: The Argumentation Universe FU (a) and a Possible AF F (b), over the arguments a1, ..., a7.]

On this simple example, it may seem obvious to determine that a5, a6 and a7 will keep their status. However, let us notice that determining whether an argument keeps its status when an AF is updated has been studied, and is not a trivial question in the general case (Baroni, Giacomin, and Liao 2014; Alfano, Greco, and Parisi 2019).

3.2 Computational Issues

We now provide a method for checking the stability of an AF with respect to some argument. The method is generic regarding the underlying extension semantics. It is based on the observation that the set of future AFs can be encoded into a single IAF (see Definition 3).

Definition 9. Given F = ⟨A, R⟩, the corresponding IAF is IF = ⟨A, AU \ A, RU⟩.

The corresponding IAF is built from the whole set of arguments that appear in the universe. The ones that belong to F are the certain arguments, while the other ones are uncertain. Then, of course, all the attacks from the universe appear in the IAF. The set of completions of IF is actually F(F).

Example 6. Figure 5 shows the IAF corresponding to F. The arguments that belong to the universe but not to F (namely, a1 and a2) appear as uncertain arguments. This means that the four completions of this IAF correspond to F(F).

[Figure 5: The IAF IF Corresponding to F — a1 and a2 are dotted (uncertain) nodes.]

We give a characterization of stability based on the IAF corresponding to an AF.

Proposition 1. Given an AF F = ⟨A, R⟩, a ∈ A an argument, and σ a semantics, F is credulously (respectively skeptically) σ-stable with respect to a iff
• either a is necessarily credulously (respectively skeptically) accepted in IF with respect to σ;
• or a is not possibly credulously (respectively skeptically) accepted in IF with respect to σ.

This result shows that solving the stability problem efficiently is possible, using for instance the SAT-based piece of software taeydennae (Niskanen et al. 2020) for reasoning in IF.

Now, we provide preliminary complexity results. We start with upper bounds for the computational complexity of stability.
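Under the grounded semantics, where credulous and skeptical acceptance coincide because the grounded extension is unique, this encoding yields a direct, if exponential, stability check: enumerate the completions of the corresponding IAF and test whether the acceptance status of the argument is the same in all of them. The following self-contained sketch is ours (toy universe and helper names are invented for illustration):

```python
from itertools import combinations

def grounded(A, R):
    # Least fixpoint of the characteristic function F(S) = {x | S defends x},
    # computed by iteration from the empty set.
    S = set()
    while True:
        defended = {x for x in A
                    if all(any((s, b) in R for s in S) for b in A if (b, x) in R)}
        if defended == S:
            return S
        S = defended

def completions(A, A_unc, R_univ):
    # One completion per subset of the uncertain arguments (Definition 4).
    for r in range(len(A_unc) + 1):
        for extra in combinations(sorted(A_unc), r):
            A2 = A | set(extra)
            yield A2, {(x, y) for (x, y) in R_univ if x in A2 and y in A2}

def gr_stable(a, A, A_univ, R_univ):
    # F is gr-stable wrt a iff a has the same status in every completion of
    # the corresponding IAF <A, A_univ \ A, R_univ> (cf. Proposition 1).
    statuses = {a in grounded(A2, R2)
                for A2, R2 in completions(A, A_univ - A, R_univ)}
    return len(statuses) == 1

# Toy universe: b may attack a, and c may attack b (thereby defending a).
A_univ = {"a", "b", "c"}
R_univ = {("b", "a"), ("c", "b")}
print(gr_stable("a", {"a"}, A_univ, R_univ))       # False: status depends on b, c
print(gr_stable("a", {"a", "c"}, A_univ, R_univ))  # True: c shields a everywhere
```

In the toy universe, a alone is not stable (adding b without c rejects it), but once c is present a is defended in every completion, so its status can no longer change. A SAT-based tool such as taeydennae replaces this exponential enumeration in practice.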
96
Proposition 2. The upper bound complexity of checking
whether an AF is (credulously or skeptically) σ-stable with
respect to an argument is as presented in Table 2.
σ
st
co
gr
pr
Credulous
∈ ΠP
2
∈ ΠP
2
∈ coNP
∈ ΠP
3
instance, these preferences can be obtained from a notion of
utility associated with each offer). So, each agent’s goal is to
make her preferred practical argument (i.e. the one that supports the preferred offer) accepted at the end of the debate.
Each agent, in turn, can add one (or more) argument(s) that
defend her preferred argument. In this first version of the
negotiation framework, agents have a total ignorance about
their opponent.
Then, an enriched version of this protocol can be defined,
where the agents use the notion of argumentation universe
to model their (uncertain) knowledge about the opponent.
Then, stability can help the agent to obtain a better outcome:
if at some point, the agent’s preferred practical argument is
rejected and stable, this means that this argument will not
be accepted at the end of the debate, whatever the actual
moves of the other agents. It is then profitable to the agent to
change her goal, defending now the argument that supports
her second preferred offer instead of the first one. This can
reduce the number of rounds in the negotiation (and thus,
any communication cost associated with these rounds), and
even improve the outcome of the negotiation for the agent.
Let us now provide a concrete example. We suppose that
the offers O = {o1 , o2 , o3 } are supported by one practical
argument each, i.e. {p1 , p2 , p3 } with pi supporting oi . The
practical arguments are mutually exclusive. The preferences
of the agents are opposed: agent 1 has a preference ranking o3 >1 o2 >1 o1 , while the preferences of agent 2 are
o1 >2 o2 >2 o3 . So, at the beginning of the debate, the
goal of agent 1 (respectively agent 2) is to accept the argument p3 (respectively p1 ). Let us suppose that the first round
consists in agent 1 attacking the argument p1 with three arguments a1 , a2 and a3 , thus defending p3 . This situation is
depicted in Figure 6.
Skeptical
∈ ΣP
2
∈ coNP
∈ coNP
∈ ΣP
3
Table 2: Upper Bound Complexity of Checking Stability
Sketch of proof. Non-deterministically guess a pair of future
AFs F ′ and F ′′ . Check that a is credulously (respectively
skeptically) accepted in F ′ , and a is not credulously (respectively skeptically) accepted in F ′′ . The complexity of credulous (respectively skeptical) acceptance in AFs (Dvorák and
Dunne 2018) allows to deduce an upper bound for credulous
(respectively skeptical) stability.
Now we also identify lower bounds for the computational
complexity of stability.
Proposition 3. The lower bound complexity of checking
whether an AF is (credulously or skeptically) σ-stable with
respect to an argument is as presented in Table 3.
σ
st
co
gr
pr
Credulous
NP-hard
NP-hard
P-hard
NP-hard
Skeptical
coNP-hard
P-hard
P-hard
ΠP
2 -hard
Table 3: Lower Bound Complexity of Checking Stability
p1
Sketch of proof. Credulous (respectively skeptical) acceptance in an AF F can be reduced to credulous (respectively
skeptical) stability, such that the current AF is F, and the
argumentation universe is FU = F. Thus, F is credulously
(respectively skeptically) σ-stable with respect to some argument a iff a is credulously (respectively skeptically) accepted in F with respect to σ. The nature of the reduction (its computation is bounded with logarithmic space and
polynomial time) makes it suitable for determining both Phardness and C-hardness, for C ∈ {NP, coNP, ΠP
2 }. Thus,
we can conclude that stability is at least as hard as acceptance in AFs. From known complexity results for AFs
(Dvorák and Dunne 2018), we deduce the lower bounds
given in Table 3.
4
a1
p2
p3
a3
a2
Figure 6: The Negotiation Debate F1
In F1 , that represents the state of the debate after agent
1’s move, the argument p1 is clearly rejected under the stable semantics,1 since it is not defended against a1 , a2 and a3 .
Consider that agent 2 has one argument at her disposal, a4 ,
with the corresponding attacks (a4 , a3 ) and (a4 , p2 ). Without a possibility to anticipate the evolution of the debate, the
best action for agent 2 is to utter this argument, thus defending p1 against a3 and p2 .
Now, let us suppose that agent 2 has an opponent modelling in the form of the argumentation universe FU , described at Figure 7.
Now, we observe that p1 does not appear in any extension
of any future framework. Indeed, it is obvious that, if one
of a4 , a5 and a6 is not present at the end of the debate, then
Applying Stability to Automated
Negotiation
Now, we discuss the benefit of stability in a concrete application scenario, namely automated negotiation. Let us
consider a simple negotiation framework, where practical
arguments (i.e. those that support some offers) are mutually exclusive, and for each agent there is a preference relation between the offers supported by these arguments (for
1
97
As well as any semantics considered in this paper.
p1
a1
p2
6
p3
In this paper, we have presented a first study investigating to what extent the notion of stability can be adapted to abstract argumentation frameworks. In particular, we have shown how it relates to Incomplete AFs (IAFs), a model that integrates uncertainty into abstract argumentation. Our preliminary complexity results, as well as the translation of stability into reasoning with IAFs, pave the way to the development of efficient computational approaches for stability, benefiting from SAT-based techniques. Finally, we have shown that, besides the existing application of stability to Internet fraud inquiry (Testerink, Odekerken, and Bex 2019), this concept has other potential applications, like automated negotiation.
This paper opens the way to several promising research tracks. First of all, we plan to study complexity issues in more depth in order to determine tight results for the semantics studied here. Other direct future work includes the investigation of other semantics, and the implementation of our stability solving technique in order to experimentally evaluate its impact in a context of automated negotiation.
We have focused on stability under extension semantics, which means that an argument either remains accepted or remains unaccepted. However, in some cases it is important to deal more finely with unaccepted arguments. This is possible with 3-valued labellings (Caminada 2006). Studying the notion of stability when such labellings are used to evaluate the acceptability of arguments is a natural extension of our work.
In some contexts, the assumption of a completely known argumentation universe is too strong. For such cases, using arbitrary IAFs (with uncertainty also on the attack relation) is a potential solution. Uncertainty on the existence (or direction) of attacks makes sense, for instance, when preferences are at play. Indeed, dealing with preferences in abstract argumentation usually involves a notion of defeat relation, i.e. a combination of the attacks and preferences. This defeat relation may somehow "cancel" or "reverse" the initial attack (Amgoud and Cayrol 2002; Amgoud and Vesic 2014; Kaci, van der Torre, and Villata 2018), thus some uncertainty or ignorance about the other agents' preferences can be represented as uncertainty in the attack relation of the argumentation universe.
We are also interested in stability for other abstract argumentation frameworks. Besides the preference-based argumentation already mentioned, Dung's AFs have been generalized by adding a support relation (Amgoud et al. 2008), by associating quantitative weights with attacks (Dunne et al. 2011) or arguments (Rossit et al. 2020), or by associating values with arguments (Bench-Capon 2002). But adapting the notion of stability to these frameworks may require different techniques than the one used in this paper. Also, the recent claim-based argumentation (Dvorák and Woltran 2020) provides an interesting bridge between structured argumentation and purely abstract frameworks. It makes sense to study stability in this setting, as a step that would make our results for different semantics and reasoning modes available for structured argumentation frameworks.
Figure 7: The Argumentation Universe FU
p1 is not defended against (respectively) a3, a2 or a1. Otherwise, if a4, a5 and a6 appear together, the mutual attack between a5 and a6 gives rise to two extensions, one where a6 appears with a2 (thus defeating p1), and the other containing a5 and a1 (thus again defeating p1). This means that p1 is rejected in F1, and it is (both credulously and skeptically) σ-stable. In this situation, it is in the interest of agent 2 to stop arguing, and to propose instead the option supported by the argument p2. Indeed, according to the agent's preferences, p2 is the best option if p1 is not available anymore. Not only does using the notion of stability in the argumentation universe allow the debate to stop earlier, it also allows agent 2 to propose her second best option, which would not be possible if she had uttered a4.
5 Related Work
The dynamics of abstract argumentation frameworks (Doutre and Mailly 2018) has received much attention in the last decade. We can summarize this field in two kinds of approaches: the goal is either to modify an AF to enforce some (set of) arguments as accepted, or to determine to what extent the acceptability of arguments is impacted by some changes in the AF. In the first family, we can mention extension enforcement (Baumann and Brewka 2010), which is somehow dual to stability. Enforcement is exactly the operation that consists in determining whether it is possible to modify an AF to ensure that a set of arguments becomes (included in) an extension, while stability is the property of an argument that keeps its acceptance status whatever the future evolution of the AF. Control Argumentation Frameworks (Dimopoulos, Mailly, and Moraitis 2018) are also somehow related to stability, since they are a generalization of Dung's AFs that permits extension enforcement under uncertainty.
The second family of works in the field of argumentation dynamics comprises those that propose efficient approaches to recompute the extensions or the set of (credulously/skeptically) accepted arguments when the AF is modified (Baroni, Giacomin, and Liao 2014; Alfano, Greco, and Parisi 2019). Although related to stability, these approaches do not provide an algorithmic solution to the problem studied in our work, since they focus on one update of the AF at a time, instead of the set of all future AFs.
References
Alfano, G.; Greco, S.; and Parisi, F. 2019. An efficient algorithm for skeptical preferred acceptance in dynamic argumentation frameworks. In Proc. of IJCAI'19, 18–24.
Amgoud, L., and Cayrol, C. 2002. A reasoning model based on the production of acceptable arguments. Ann. Math. Artif. Intell. 34(1-3):197–215.
Amgoud, L., and Vesic, S. 2012. On the use of argumentation for multiple criteria decision making. In Proc. of IPMU'12, 480–489.
Amgoud, L., and Vesic, S. 2014. Rich preference-based argumentation frameworks. Int. J. Approx. Reason. 55(2):585–606.
Amgoud, L.; Cayrol, C.; Lagasquie-Schiex, M.; and Livet, P. 2008. On bipolarity in argumentation frameworks. Int. J. Intell. Syst. 23(10):1062–1093.
Baroni, P.; Caminada, M.; and Giacomin, M. 2018. Abstract argumentation frameworks and their semantics. In Baroni, P.; Gabbay, D.; Giacomin, M.; and van der Torre, L., eds., Handbook of Formal Argumentation. College Publications. 159–236.
Baroni, P.; Gabbay, D. M.; and Giacomin, M. 2018. Handbook of Formal Argumentation. College Publications.
Baroni, P.; Giacomin, M.; and Liao, B. 2014. On topology-related properties of abstract argumentation semantics. A correction and extension to dynamics of argumentation systems: A division-based method. Artif. Intell. 212:104–115.
Baumann, R., and Brewka, G. 2010. Expanding argumentation frameworks: Enforcing and monotonicity results. In Proc. of COMMA'10, 75–86.
Baumeister, D.; Neugebauer, D.; and Rothe, J. 2018. Credulous and skeptical acceptance in incomplete argumentation frameworks. In Proc. of COMMA'18, 181–192.
Baumeister, D.; Rothe, J.; and Schadrack, H. 2015. Verification in argument-incomplete argumentation frameworks. In Proc. of ADT'15, 359–376.
Bench-Capon, T. J. M. 2002. Value-based argumentation frameworks. In Proc. of NMR'02, 443–454.
Besnard, P.; García, A. J.; Hunter, A.; Modgil, S.; Prakken, H.; Simari, G. R.; and Toni, F. 2014. Introduction to structured argumentation. Argument & Computation 5(1):1–4.
Caminada, M. 2006. On the issue of reinstatement in argumentation. In Proc. of JELIA'06, 111–123.
de Saint-Cyr, F. D.; Bisquert, P.; Cayrol, C.; and Lagasquie-Schiex, M. 2016. Argumentation update in YALLA (yet another logic language for argumentation). Int. J. Approx. Reason. 75:57–92.
Dimopoulos, Y.; Mailly, J.-G.; and Moraitis, P. 2018. Control argumentation frameworks. In Proc. of AAAI'18, 4678–4685.
Dimopoulos, Y.; Mailly, J.-G.; and Moraitis, P. 2019. Argumentation-based negotiation with incomplete opponent profiles. In Proc. of AAMAS'19, 1252–1260.
Doutre, S., and Mailly, J.-G. 2018. Constraints and changes: A survey of abstract argumentation dynamics. Argument & Computation 9(3):223–248.
Dung, P. M. 1995. On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artif. Intell. 77(2):321–358.
Dunne, P. E.; Hunter, A.; McBurney, P.; Parsons, S.; and Wooldridge, M. J. 2011. Weighted argument systems: Basic definitions, algorithms, and complexity results. Artif. Intell. 175(2):457–486.
Dvorák, W., and Dunne, P. E. 2018. Computational problems in formal argumentation and their complexity. In Baroni, P.; Gabbay, D.; Giacomin, M.; and van der Torre, L., eds., Handbook of Formal Argumentation. College Publications. 631–688.
Dvorák, W., and Woltran, S. 2020. Complexity of abstract argumentation under a claim-centric view. Artif. Intell. 285:103290.
Kaci, S.; van der Torre, L. W. N.; and Villata, S. 2018. Preference in abstract argumentation. In Proc. of COMMA'18, 405–412.
McBurney, P.; Parsons, S.; and Rahwan, I., eds. 2012. Proc. of ArgMAS'11, volume 7543 of Lecture Notes in Computer Science. Springer.
Modgil, S., and Prakken, H. 2014. The ASPIC+ framework for structured argumentation: a tutorial. Argument & Computation 5(1):31–62.
Niskanen, A.; Neugebauer, D.; Järvisalo, M.; and Rothe, J. 2020. Deciding acceptance in incomplete argumentation frameworks. In Proc. of AAAI'20, 2942–2949.
Rossit, J.; Mailly, J.-G.; Dimopoulos, Y.; and Moraitis, P. 2020. United we stand: Accruals in strength-based argumentation. Argument & Computation. To appear.
Testerink, B.; Odekerken, D.; and Bex, F. 2019. A method for efficient argument-based inquiry. In Proc. of FQAS'19, 114–125.
Weak Admissibility is PSPACE-complete
Wolfgang Dvořák¹, Markus Ulbricht², Stefan Woltran¹
¹ TU Wien, Institute of Logic and Computation
² Leipzig University
{dvorak,woltran}@dbai.tuwien.ac.at, mulbricht@informatik.uni-leipzig.de
Abstract

We study the complexity of decision problems for weak admissibility, a recently introduced concept in abstract argumentation to deal with arguments of self-defeating nature. Our results reveal that semantics based on weak admissibility are of much higher complexity (under typical assumptions) compared to all argumentation semantics which have been analysed in terms of complexity so far. In fact, we show PSPACE-completeness of all standard decision problems for w-admissible and w-preferred semantics (with the exception of skeptical w-admissible acceptance, which is trivial). As a strategy for implementation we also provide a polynomial-time reduction to DATALOG with stratified negation.

In a nutshell, weak admissibility captures this idea of defense only against "reasonable" arguments in the following way: given a conflict-free candidate set E of a framework F, E is weakly admissible if no attacker of E is weakly admissible in the subframework F^E containing the arguments whose acceptance state is still undecided w.r.t. E (i.e. neither contained in E nor attacked by E).
As a matter of fact this definition includes all admissible sets (the subframework F^E induced by an admissible set E does not contain any attacker of E whatsoever), but it also tolerates, among other situations, self-attacking attackers (since those are never contained in any weakly admissible set) or attackers which are contained in a self-defeating odd cycle which is not resolved.
For example, b is weakly admissible in both F and G:
1 Introduction
Abstract argumentation frameworks as introduced by Dung (1995) are nowadays identified as a key concept for understanding the fundamental mechanisms behind formal argumentation and nonmonotonic reasoning. In these frameworks, it is solely the attack relation between (abstract) arguments that is used to determine acceptable sets of arguments.
A central property for a set of arguments to be acceptable is
admissibility, which states that (i) arguments from the set do
not attack each other and (ii) each attacker of an argument in
the set is attacked by the set. The vast majority of semantics for abstract argumentation are based on this concept, most prominently preferred semantics, which is defined via subset-maximal admissible sets. However, already Dung noticed
that the concept of defense as expressed by Condition (ii)
can be seen problematic when self-defeating arguments are
involved, i. e. are attacking the candidate set. Indeed, the
concern comes from the fact that a defense against an argument which is self-contradicting might be not necessary at
all. Although this issue has been known for a long time, no
semantics for abstract argumentation among the numerous
invented so far (see e.g. (Baroni, Caminada, and Giacomin
2011)) has addressed this problem in a commonly agreed
way.
A recent approach to tackle this particular problem has been proposed by Baumann, Brewka, and Ulbricht (2020), where the concept of weak admissibility is introduced. The underlying idea is to weaken admissibility in a way that counterattacks are only required against "proper" arguments, i.e. arguments that are not directly or indirectly self-defeating.
[Figure: the AFs F (arguments a, b) and G (arguments a1, a2, a3, b)]
but not in H, because here the self-defeat is resolved:
[Figure: the AF H (arguments a1, a2, a3, b, c)]
The price we have to pay is that weak admissibility is recursive in nature, since a set E of arguments is verified by checking weak admissibility of certain sets of arguments contained in an induced subframework F^E, and so on. The key question in terms of computational complexity is now whether this recursion does any harm. In this paper, we answer this question affirmatively.
Our main contributions are as follows:
• We show that all standard decision problems for w-admissible and w-preferred semantics (with the exception of skeptical w-admissible acceptance) are PSPACE-complete.
• Towards implementation we provide a polynomial-time
reduction to non-recursive DATALOG with stratified
negation which is known to be PSPACE-complete in
terms of program-complexity (cf. (Dantsin et al. 2001)).
By definition, F^E is the subframework of F obtained by removing the range of E as well as corresponding attacks, i.e. F^E = F↓_{A \ E_F^⊕}. Intuitively, the E-reduct contains those arguments whose status still needs to be decided, assuming the arguments in E are accepted. Consider therefore the following illustrating example.
Example 2.3 (Reduct and Admissibility). Let F be the AF depicted below. In contrast to {a}, the set {b} is admissible in F. However, their reducts are identical and contain only the self-defeating argument c.
The complexity analysis we provide is of particular interest, since all known complexity results for argumentation semantics are located within the first two layers of the
polynomial hierarchy (see, e.g. (Dvořák and Dunne 2018)).
This holds even for semantics which have a certain recursive nature like cf2- or stage2-semantics; see (Gaggl and
Woltran 2013; Dvořák and Gaggl 2016) for the respective
complexity analyses. We recall that under the assumption
that the polynomial hierarchy does not collapse, problems
complete for PSPACE are rated as significantly harder than
problems located at lower levels of the polynomial hierarchy. Our results are mirrored in the complexity landscape of
nonmonotonic reasoning in the broad sense, where decision
problems for many prominent formalisms (like default logic
or circumscription) are located on the second level of the
polynomial hierarchy (see, e.g. (Cadoli and Schaerf 1993;
Thomas and Vollmer 2010) for survey articles), and only a
few formalisms reach PSPACE-hardness. Examples for the
latter are nested circumscription (Cadoli, Eiter, and Gottlob 2005), nested counterfactuals (Eiter and Gottlob 1996),
model-preference default logic (Papadimitriou 1991), and
theory curbing (Eiter and Gottlob 2006).
2 Background

Let us start by giving the necessary preliminaries.
2.1 Standard Concepts and Classical Semantics
We fix an infinite background set U. An argumentation
framework (AF) (Dung 1995) is a directed graph F =
(A, R) where A ⊆ U represents a set of arguments and
R ⊆ A × A models attacks between them. In this paper
we consider finite AFs only. Let F denote the set of all finite AFs over U .
Now assume F = (A, R). For S ⊆ A we let F ↓S =
(A ∩ S, R ∩ (S × S)). For a, b ∈ A, if (a, b) ∈ R we say
that a attacks b as well as a attacks (the set) E given that
b ∈ E ⊆ A. Moreover, E is conflict-free in F (for short,
E ∈ cf (F )) iff for no a, b ∈ E, (a, b) ∈ R. We say a set E
defends an argument a (in F ) if any attacker of a is attacked
by some argument of E, i. e. for each (b, a) ∈ R, there is
c ∈ E such that (c, b) ∈ R.
A semantics σ is a mapping σ : F → 2^U with F ↦ σ(F) ⊆ 2^A, i.e. given an AF F = (A, R) a semantics returns a subset of 2^A. In this paper we consider the so-called admissible and preferred semantics (abbr. ad and pr).
Definition 2.1. Let F = (A, R) be an AF and E ∈ cf (F).
1. E ∈ ad (F ) iff E defends all its elements,
2. E ∈ pr (F ) iff E is ⊆-maximal in ad (F ).
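To make Definition 2.1 concrete, here is a minimal brute-force sketch in Python (the encoding of an AF as a set of argument names plus a set of attack pairs is our own choice, not the paper's):

```python
from itertools import combinations

def conflict_free(E, attacks):
    # E is conflict-free iff no attack holds between two of its members
    return not any((a, b) in attacks for a in E for b in E)

def defends(E, a, args, attacks):
    # E defends a iff every attacker of a is attacked by some member of E
    return all(any((c, b) in attacks for c in E)
               for b in args if (b, a) in attacks)

def admissible(E, args, attacks):
    return conflict_free(E, attacks) and all(defends(E, a, args, attacks) for a in E)

def preferred(args, attacks):
    # subset-maximal admissible sets, found by brute force over all subsets
    adm = [set(E) for r in range(len(args) + 1)
           for E in combinations(args, r) if admissible(E, args, attacks)]
    return [E for E in adm if not any(E < F for F in adm)]

# chain a -> b -> c: {a, c} is the unique preferred extension
args = {"a", "b", "c"}
attacks = {("a", "b"), ("b", "c")}
print([sorted(E) for E in preferred(args, attacks)])  # [['a', 'c']]
```

The exponential enumeration is of course only meant to mirror the definitions, not to be efficient.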
[Figure: the AF F of Example 2.3 and the identical reducts F^{{a}} = F^{{b}}]
Observe that the reduct does not contain any attacker of the
admissible set {b} in contrast to the non-admissible set {a}.
The reduct is the central notion in the definition of
weak admissible semantics (Baumann, Brewka, and Ulbricht 2020):
Definition 2.4. For an AF F = (A, R), E ⊆ A is called weakly admissible (or w-admissible) in F (E ∈ ad^w(F)) iff
1. E ∈ cf(F) and
2. for any attacker y of E we have y ∉ ⋃ ad^w(F^E).
The major difference between the standard definition of
admissibility and the “weak” one is that arguments do not
have to defend themselves against all attackers: attackers
which do not appear in any w-admissible set of the reduct
can be neglected.
Example 2.5 (Example 2.3 ctd.). In the previous example we observed {a} ∉ ad(F). Let us verify the weak admissibility of {a} in F. Obviously, {a} is conflict-free in F (condition 1). Moreover, since c is the only attacker of {a} in F^{{a}}, we have to check c ∉ ⋃ ad^w(F^{{a}}) (condition 2). Since {c} violates conflict-freeness in the reduct F^{{a}} = ({c}, {(c, c)}), we find {c} ∉ ad^w(F^{{a}}), yielding ⋃ ad^w(F^{{a}}) = ∅. Hence, c ∉ ⋃ ad^w(F^{{a}}) holds, proving the claim.
Example 2.6. Now assume a is attacked by an odd cycle
a1 , a2 , a3 . Let us check whether {a} ∈ ad w (F ):
[Figure: the AF F of Example 2.6 (odd cycle a1, a2, a3 attacking a) and the reduct F^{{a}}]
2.2 Weak Admissibility
In fact, the only conflict-free set attacking a is {a3}. However, in the reduct (F^{{a}})^{{a3}} the set {a2} is weakly admissible. Since a2 attacks a3, {a3} ∉ ad^w((F^{{a}})^{{a3}}):
The reduct is a central subject of study in this paper. For a compact definition we use, given an AF F = (A, R), E_F^+ = {a ∈ A | E attacks a in F} as well as E_F^⊕ = E ∪ E_F^+. The latter set is known as the range of E in F. When clear from the context, we omit the subscript F.
Definition 2.2. Let F = (A, R) be an AF and let E ⊆ A. The E-reduct of F is the AF F^E = (E*, R ∩ (E* × E*)) where E* = A \ E_F^⊕.
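Definition 2.2 translates directly into code. The following sketch uses our own encoding of an AF as a pair of sets, and an AF in the spirit of Example 2.3 (the exact attack relation of that example is our reconstruction):

```python
def reduct(args, attacks, E):
    """Compute the E-reduct F^E: drop the range of E and all incident attacks."""
    E = set(E)
    range_E = E | {b for (a, b) in attacks if a in E}   # E^⊕ = E ∪ E^+
    rest = args - range_E                               # E* = A \ E^⊕
    return rest, {(a, b) for (a, b) in attacks if a in rest and b in rest}

# a and b attack each other; c attacks a and itself (our guess at Example 2.3's AF)
args = {"a", "b", "c"}
attacks = {("a", "b"), ("b", "a"), ("c", "a"), ("c", "c")}
print(reduct(args, attacks, {"a"}))  # ({'c'}, {('c', 'c')})
```

Note that the reducts of {a} and {b} indeed coincide here, each containing only the self-defeating argument c.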
[Figure: the reduct (F^{{a}})^{{a3}}]
Π₂^P = coNP^NP, i.e. the problems whose complements can be solved in nondeterministic polynomial time when the algorithm has access to an NP oracle. Finally, PSPACE contains the problems that can be solved in polynomial space (and exponential time). We have P ⊆ NP/coNP ⊆ Π₂^P ⊆ PSPACE.
Let us consider another example illustrating the mechanisms of weak admissibility beyond self-defeating arguments.
Example 2.7. Consider the following AF F .
[Figure: the AF F of Example 2.7 (arguments a1, a2, a3, a4)]
In this section, we investigate the complexity of the standard
decision problems in argumentation for w-admissibility and
w-preferred semantics. Let us start by building up some intuition about the complexity of weak admissibility semantics. As for most semantics the verification problem is a
suitable groundwork.
Let us verify that, although this seems a bit surprising at first glance, {a4} ∈ ad^w(F). To see this, we note that a2 is the only attacker in the corresponding reduct:
Example 3.1. Consider the following AF F , adapted from
the well-known standard translation from propositional formulas to AFs:
3 Complexity Analysis
Now since a2 is attacked by a1, it stands no chance of being w-admissible in F^{{a4}}, although it is not a self-defeating argument. Thus {a4} ∈ ad^w(F).
Although Example 2.7 may appear somewhat counterintuitive, it is similar in spirit to Example 2.6: in both cases,
weak admissibility verifies whether a certain attacker can be
neglected. In Example 2.6, a3 does no harm since it is contained in a self-defeating odd loop; in Example 2.7 a2 does
no harm since it is defeated by the undisputed a1 .
Following the classical Dung-style semantics, weakly preferred extensions are defined as ⊆-maximal w-admissible
extensions.
[Figure: the AF F of Example 3.1, with arguments t, c1, c2, c3, x1, x̄1, x2, x̄2, x3, x̄3]
Let us check whether E = {t} ∈ ad^w(F). This is the case if none of the attackers c1, c2, or c3 occurs in a w-admissible extension of the reduct F^E, which is obtained by removing the argument t from F.
Now take c1. We see that c1 does not occur in a w-admissible extension in F^E: it is attacked by both x1 and x̄1, which are in turn both w-admissible in any relevant sub-AF of F. Similarly, neither c2 nor c3 occurs in a w-admissible extension of F^E. Thus E ∈ ad^w(F).
Definition 2.8. For an AF F = (A, R), E ⊆ A is called
weakly preferred (or w-preferred) in F (E ∈ pr w (F )) iff E
is ⊆-maximal in ad w (F ).
Although this example was quite straightforward, several
observations can be made:
For more details regarding the definition and basic properties of weak admissibility we refer the reader to (Baumann,
Brewka, and Ulbricht 2020).
• weak admissibility does not appear to be a local property: the reason why E = {t} is w-admissible in the previous example is the arguments x1, …, x̄3, which are not contained in E; we also see that this example is quite small and can be extended to chains of arbitrary length,
2.3 Decision Problems and Complexity Classes
For an AF F = (A, R) and a semantics σ, we say an argument a ∈ A is credulously accepted (skeptically accepted) in F w.r.t. σ if a ∈ ⋃σ(F) (a ∈ ⋂σ(F)). The corresponding decision problems for a semantics σ, given an AF F and argument a, are as follows: Credulous Acceptance Cred_σ: deciding whether a is credulously accepted in F w.r.t. σ; Skeptical Acceptance Skept_σ: deciding whether a is skeptically accepted in F w.r.t. σ. We also consider the following decision problems, given an AF F: Verification of an extension Ver_σ: deciding whether a set of arguments is in σ(F); and Existence of a non-empty extension NEmpty_σ: deciding whether σ(F) contains a non-empty set.
Finally, we assume the reader to be familiar with the basic concepts of computational complexity theory (see, e.g.
(Dvořák and Dunne 2018)) as well as the standard classes
P, NP as well as coNP. In addition we consider the class
• unless there is some shortcut, several sub-AFs need to be
computed, inducing a recursion with depth in O(|A|) in
the worst case,
• it is not clear at first glance whether deciding credulous
acceptance is actually much easier, because guessing a
suitable set (here {t, x1 , x2 , x3 }) might skip computationally expensive recursive steps.
The main contribution of this paper is to formally prove that
there are no shortcuts and no suitable guessing in any case:
All considered non-trivial problems are PSPACE-complete.
Our results are summarized in Table 1 together with the results for admissible and preferred semantics (cf. (Dvořák
and Dunne 2018)).
For NEmpty_ad^w = NEmpty_pr^w we iterate over all nonempty subsets of the arguments and test whether the set is w-admissible. If one of them is w-admissible we terminate and return yes, otherwise we return no.
Table 1: Complexity of w-admissible / w-preferred semantics in comparison with admissible / preferred semantics.

σ     | Cred_σ   | Skept_σ  | Ver_σ    | NEmpty_σ
ad    | NP-c     | trivial  | in P     | NP-c
pr    | NP-c     | Π₂^P-c   | coNP-c   | NP-c
ad^w  | PSPACE-c | trivial  | PSPACE-c | PSPACE-c
pr^w  | PSPACE-c | PSPACE-c | PSPACE-c | PSPACE-c
3.2 Hardness Results
We show hardness by a reduction from the PSPACE-complete problem of deciding whether a QBF is valid. To this end we consider QBFs of the form
Φ = ∀x_n ∃x_{n−1} … ∀x_2 ∃x_1 : φ(x_1, x_2, …, x_{n−1}, x_n).
3.1 Membership Results
Notice that Φ might start with a universal or an existential quantifier; it then alternates between universal and existential quantifiers after each variable and ends with an existential quantifier. φ is a propositional formula in CNF given by a set of clauses C, i.e. φ = ⋀_{c∈C} ⋁_{l∈c} l. We call a QBF starting with a universal quantifier a ∀-QBF and a QBF starting with an existential quantifier an ∃-QBF. Finally, observe that we named variables in reverse order to avoid renaming variables in our proofs by induction.
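For concreteness, a closed QBF of this shape can be evaluated by straightforward recursion over the quantifier prefix. The following helper is our own (not part of the reduction); it represents φ as clauses of signed literals:

```python
def eval_qbf(prefix, clauses, assignment=None):
    """prefix: list of ('A'|'E', var); clauses: list of lists of (var, polarity)."""
    assignment = dict(assignment or {})
    if not prefix:
        # evaluate the CNF matrix under the complete assignment
        return all(any(assignment[v] == pol for (v, pol) in c) for c in clauses)
    (q, v), rest = prefix[0], prefix[1:]
    branches = (eval_qbf(rest, clauses, {**assignment, v: val}) for val in (True, False))
    return all(branches) if q == "A" else any(branches)

# ∀x2 ∃x1 : (¬x2 ∨ x1) ∧ (x2 ∨ ¬x1), the valid QBF used in Example 3.5 below
clauses = [[("x2", False), ("x1", True)], [("x2", True), ("x1", False)]]
print(eval_qbf([("A", "x2"), ("E", "x1")], clauses))  # True
```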
We start with a reduction that maps QBFs to AFs such that the validity of the QBF can be read off by inspecting the w-admissible sets of the AF. We will later extend this reduction to encode the specific decision problems under our consideration.
Reduction 3.4. Given a QBF Φ with propositional formula
φ(x1 , . . . , xn ) we define the AF GΦ = (A, R) with A and
R as follows.
In this section we provide an algorithm that can be implemented in PSPACE and closely follows the definition of wadmissibility.
Lemma 3.2. Verad w is in PSPACE.
Proof. An algorithm for verifying that E ∈ ad^w(F) proceeds as follows:
• Test whether E ∈ cf(F); if not return false,
• compute the reduct F^E,
• iterate over all subsets S of F^E that contain at least one attacker of E and test whether S is w-admissible; if so return false; else return true.
Notice that the last step involves recursive calls. However,
the size of the considered AF is decreasing in each step and
thus the recursion depth is in O(n). Moreover, we only need
to store the current AF as well as the set S to verify. Finally,
iterating over all subsets of an AF can be done in PSPACE
as well. Hence, the above algorithm is in PSPACE.
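The recursive procedure from this proof can be sketched directly in Python (our own encoding; exponential time of course, but each recursive call only stores the current sub-AF and candidate set, mirroring the PSPACE argument):

```python
from itertools import combinations

def conflict_free(E, attacks):
    return not any((a, b) in attacks for a in E for b in E)

def reduct(args, attacks, E):
    rng = set(E) | {b for (a, b) in attacks if a in E}
    rest = args - rng
    return rest, {(a, b) for (a, b) in attacks if a in rest and b in rest}

def w_admissible(E, args, attacks):
    E = set(E)
    if not conflict_free(E, attacks):
        return False
    r_args, r_atts = reduct(args, attacks, E)
    attackers = {a for (a, b) in attacks if b in E and a in r_args}
    # E is w-admissible iff no subset of F^E containing an attacker of E
    # is itself w-admissible in F^E (recursion on a strictly smaller AF)
    for r in range(1, len(r_args) + 1):
        for S in combinations(r_args, r):
            if set(S) & attackers and w_admissible(S, r_args, r_atts):
                return False
    return True

# F: a self-attacker a attacking b — {b} is w-admissible although not admissible
args, attacks = {"a", "b"}, {("a", "a"), ("a", "b")}
print(w_admissible({"b"}, args, attacks))  # True
```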
A = {x_i, x̄_i, p_i | 1 ≤ i ≤ n} ∪ {c | c ∈ C}
R = {(x_i, x̄_i), (x̄_i, x_i) | 1 ≤ i ≤ n} ∪
    {(x_i, x_{i+1}), (x_i, x̄_{i+1}) | 1 ≤ i < n} ∪
    {(x̄_i, x_{i+1}), (x̄_i, x̄_{i+1}) | 1 ≤ i < n} ∪
    {(x_i, c) | x_i ∈ c ∈ C} ∪ {(x̄_i, c) | ¬x_i ∈ c ∈ C} ∪
    {(c, x_1), (c, x̄_1) | c ∈ C} ∪
    {(p_i, p_{i+1}) | 1 ≤ i < n} ∪
    {(x_i, p_i), (x̄_i, p_i) | 1 ≤ i ≤ n} ∪
    {(p_i, x_{i−1}), (p_i, x̄_{i−1}) | 2 ≤ i ≤ n} ∪ {(p_1, c) | c ∈ C}
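Reduction 3.4 is mechanical to implement. A sketch in Python (our own encoding: x̄_i is rendered as "nx{i}", clauses are sets of (variable-index, polarity) pairs, and clause arguments are named "c0", "c1", …):

```python
def build_g_phi(n, clauses):
    """Build the AF G_Phi = (A, R) of Reduction 3.4 for n variables."""
    x = lambda i: f"x{i}"
    nx = lambda i: f"nx{i}"
    p = lambda i: f"p{i}"
    C = [f"c{j}" for j in range(len(clauses))]
    A = {x(i) for i in range(1, n + 1)} | {nx(i) for i in range(1, n + 1)} \
        | {p(i) for i in range(1, n + 1)} | set(C)
    R = set()
    for i in range(1, n + 1):
        R |= {(x(i), nx(i)), (nx(i), x(i))}               # x_i and its negation clash
        R |= {(x(i), p(i)), (nx(i), p(i))}                # both literals attack p_i
        if i < n:
            R |= {(x(i), x(i + 1)), (x(i), nx(i + 1)),    # layer i attacks layer i+1
                  (nx(i), x(i + 1)), (nx(i), nx(i + 1))}
            R.add((p(i), p(i + 1)))
        if i >= 2:
            R |= {(p(i), x(i - 1)), (p(i), nx(i - 1))}    # p_i attacks layer i-1
    for j, c in enumerate(clauses):
        for (i, pol) in c:
            R.add((x(i) if pol else nx(i), C[j]))         # literal attacks its clause arg
        R |= {(C[j], x(1)), (C[j], nx(1))}                # clause args attack layer 1
        R.add((p(1), C[j]))
    return A, R

# phi = (¬x2 ∨ x1) ∧ (x2 ∨ ¬x1) from Example 3.5
A, R = build_g_phi(2, [frozenset({(2, False), (1, True)}),
                       frozenset({(2, True), (1, False)})])
print(len(A), len(R))  # 8 25
```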
Given that verification is in PSPACE we can adapt standard algorithms to obtain PSPACE membership for the other problems. Notice that Skept_ad^w is always false, as the empty set is always w-admissible.
Proposition 3.3. All of the following problems can be solved in PSPACE: Cred_ad^w, Ver_ad^w, NEmpty_ad^w, Cred_pr^w, Skept_pr^w, Ver_pr^w, and NEmpty_pr^w.
Example 3.5. Let us consider the valid QBF ∀x2 ∃x1 : φ
with φ = c1 ∧ c2 = (¬x2 ∨ x1 ) ∧ (x2 ∨ ¬x1 ). Let us
apply Reduction 3.4 to obtain an AF F . It will be convenient
to think of several layers, each one induced by a variable
occurring in the QBF at hand. We thus have two layers here,
with xi and x̄i attacking each other in the expected way and
each layer attacked by its predecessor:
Proof. Ver_ad^w ∈ PSPACE is by Lemma 3.2. The other memberships are by the following algorithms, which can easily be implemented in PSPACE with calls to other PSPACE problems, e.g. Ver_ad^w, and thus are themselves in PSPACE.
Ver_pr^w can be solved by first verifying that the set is w-admissible and then iterating over all supersets and verifying that they are not w-admissible. For Cred_ad^w = Cred_pr^w we iterate over all subsets of the arguments that contain the query argument and test whether the set is w-admissible. As soon as we find a subset that is w-admissible we can stop and return that the argument is credulously accepted. Otherwise, if none of the sets is w-admissible, the argument is not credulously accepted. For Skept_pr^w we iterate over all subsets of the arguments that do not contain the query argument and test whether the set is w-preferred. As soon as we find a subset that is w-preferred we can stop and return that the argument is not skeptically accepted. Otherwise, if none of the sets is w-preferred, the argument is skeptically accepted.
[Figure: the layers {x2, x̄2} and {x1, x̄1}]
The x-arguments attack the c-arguments in the natural way.
The c-arguments attack the X1 layer only.
[Figure: block diagram of the layers X2, X1 and the clause arguments C]
The arguments p1 and p2 induce odd cycles to forbid certain
possible extensions.
is w-admissible: The reduct is the same as the one depicted above, with the attack from x̄1 to c2 removed. Thus neither {x1} nor {x̄1} is w-admissible in (F′)^E: the argument c2 occurs in both ((F′)^E)^{{x1}} and ((F′)^E)^{{x̄1}} and hence witnesses that both arguments are not w-admissible in (F′)^E. This means in turn that E = {x̄2} is w-admissible in F′.
Example 3.6. For the sake of demonstrating our construction, let us assume our QBF consists of three variables, i. e.
consider ∃x3 ∀x2 ∃x1 : φ with φ = (¬x2 ∨ x1 ) ∧ (x2 ∨ ¬x1 )
as above. The AF F induced by Reduction 3.4 is the following:
Written out in full, Reduction 3.4 applied to our QBF looks as follows:
[Figure: the full AF GΦ for ∀x2 ∃x1 : φ, with arguments c1, c2, x2, x̄2, x1, x̄1, p2, p1]
[Figure: the AF GΦ for ∃x3 ∀x2 ∃x1 : φ]
Now E = {x3} is w-admissible: the reduct F^{{x3}} is the AF F from the previous example, where both {x2} and {x̄2} are not w-admissible (recall that we consider the formula φ from above). Thus E is w-admissible.
The previous examples hint at the following behavior of
Reduction 3.4:
Now regarding our QBF, note that setting x2 to true requires x1 to be true as well, and setting x2 to false requires x1 to be false. This translates to F as follows: take E = {x̄2}, corresponding to setting x2 to false. The set E is not w-admissible in F. To see this, consider the reduct F^E:
• if a QBF of the form ∀x2 ∃x1 : φ evaluates to true, then neither {x2} nor {x̄2} is w-admissible,
• if a QBF of the form ∃x3 ∀x2 ∃x1 : φ evaluates to true, then at least one of {x3} and {x̄3} is w-admissible, and
• analogous reasoning applies to QBFs which evaluate to
false.
We also want to mention that e.g. in Example 3.6 the arguments x3 and x̄3 are the only possible candidates for wadmissible extensions:
• p2 , for example, is attacked by x2 and x̄2 in the corresponding reduct; this can only be prevented by also including x1 or x̄1 , which in turn are in conflict with p2 ,
• c1 , for example, is attacked by p1 which can only be removed from the reduct by including x1 or x̄1 , but both are
attacked by c1 ,
Now {x̄1} (corresponding to ¬x1 in the QBF) is w-admissible in F^E (even admissible) and attacks x̄2, witnessing that E ∉ ad^w(F). Similarly, {x2} is not w-admissible since it is attacked by x1 in the corresponding reduct.
Let us now consider a QBF which evaluates to false. For
this, we move from φ to φ′ = C1 ∧ C2 = (¬x2 ∨ x1 ) ∧ (x2 ).
Note that φ′ is obtained from φ by removing ¬x1 from C2 .
Consider the induced QBF ∀x2 ∃x1 : φ′ . Let F ′ be the AF
obtained by applying Reduction 3.4. This time, E = {x̄2 }
• x2 , for example, is attacked by p3 , but attacks x3 , x̄3 and
p2 which are the only attackers of p3 .
The following proposition formalizes that these observations
are true in general.
Proposition 3.7. For a QBF Φ,
First assume that Φ is valid and consider {x_n} (the argument for {x̄_n} is analogous). We show that {x_n} is not w-admissible. Consider G_Φ^{{x_n}} = G_{Φ_1}. By the induction hypothesis we have that {x_{n−1}} or {x̄_{n−1}} is weakly admissible in G_Φ^{{x_n}}, and as both x_{n−1} and x̄_{n−1} attack x_n we have that {x_n} is not w-admissible.
Now assume that Φ is not valid and w.l.o.g. assume that Φ_1 is not valid. By the induction hypothesis we have that neither {x_{n−1}} nor {x̄_{n−1}} is weakly admissible in G_Φ^{{x_n}} = G_{Φ_1}, and thus {x_n} is w-admissible.
1. if Φ is of the form ∃x_n ∀x_{n−1} … ∃x_1 : φ(x_1, x_2, …, x_n) we have that ad^w(G_Φ) ∩ {{x_n}, {x̄_n}} ≠ ∅ if Φ is valid and ad^w(G_Φ) = {∅} otherwise; and
2. if Φ is of the form ∀x_n ∃x_{n−1} … ∃x_1 : φ(x_1, x_2, …, x_n) we have that ad^w(G_Φ) = {∅} if Φ is valid and ad^w(G_Φ) ∩ {{x_n}, {x̄_n}} ≠ ∅ otherwise.
Moreover, in both cases ad^w(G_Φ) ⊆ {{x_n}, {x̄_n}, ∅}.
In order to prove the above proposition we introduce several technical lemmas.
Lemma 3.8. For a QBF Φ, ad^w(G_Φ) ⊆ {{x_n}, {x̄_n}, ∅}.
Proof. Let E ∈ ad w (GΦ ).
Assume pi ∈ E for some i ≥ 2. Since E must be conflict-free, we have xi−1 ∉ E, x̄i−1 ∉ E, and pi+1 ∉ E, as well as xi ∉ E and x̄i ∉ E. Thus both xi and x̄i occur in the reduct F E and do not have any attacker in F E . In this case E cannot be w-admissible since pi ∈ E is attacked by {xi } and {x̄i }, which are w-admissible in F E . So the assumption pi ∈ E for some i ≥ 2 must be false.
Consider now p1 ∈ E. Similarly, if E ∈ cf (F ), then cj ∉ E for each j, as well as x1 , x̄1 ∉ E, and hence p1 is attacked by x1 and x̄1 , which are unattacked in F E ; contradiction.
Now assume cj ∈ E for some j. Since cj attacks both x1
and x̄1 , {p1 } is w-admissible in F E which in turn attacks
cj . Thus cj ∈ E is impossible.
Finally, if xi ∈ E or x̄i ∈ E for i ≤ n − 1, then either pi+1 is unattacked in F E (which attacks both arguments) or pi ∈ E, xi+1 ∈ E, or x̄i+1 ∈ E (which contradicts E ∈ cf (F )). Hence xi ∉ E.
Lemma 3.11. If Proposition 3.7 holds for ∀-QBFs with n−1
variables then it also holds for ∃-QBFs with n variables.
Proof. Consider an ∃-QBF Φ = ∃xn ∀xn−1 . . . ∃x1 : φ(x1 , x2 , . . . , xn ). We have that Φ is valid iff one of Φ1 = ∀xn−1 ∃xn−2 . . . ∃x1 : φ(x1 , x2 , . . . , ⊤) and Φ2 = ∀xn−1 ∃xn−2 . . . ∃x1 : φ(x1 , x2 , . . . , ⊥) is valid. Moreover, GΦ^{{xn}} = GΦ1 and GΦ^{{x̄n}} = GΦ2 .

First assume that Φ is valid and w.l.o.g. assume that Φ1 is valid. We show that {xn } is w-admissible. Consider GΦ^{{xn}} = GΦ1 . By the induction hypothesis we have that neither {xn−1 } nor {x̄n−1 } is weakly admissible in GΦ^{{xn}} and thus {xn } is w-admissible.

Now assume that Φ is not valid and consider {xn } (the argument for {x̄n } is analogous). By the induction hypothesis we have that {xn−1 } or {x̄n−1 } is weakly admissible in GΦ^{{xn}} and, as both xn−1 and x̄n−1 attack xn , we have that {xn } is not w-admissible.
The remainder of the proof proceeds by induction on the
number of variables: Lemma 3.9 is the base case and the
consecutive lemmata constitute the induction step.
Lemma 3.9. For Φ = ∃x1 : φ(x1 ) we have that ad w (GΦ ) ∩ {{x1 }, {x̄1 }} ≠ ∅ iff Φ is valid.

Proof. By Lemma 3.8 it suffices to consider the sets {x1 }, {x̄1 }. The formula Φ is valid iff x1 or ¬x1 appears in all clauses.
⇒: Assume {x1 } is a w-admissible set but Φ is not valid, i.e. there is a c ∈ C such that x1 ∉ c. By construction x1 attacks p1 and is attacked by c; thus c is unattacked in the reduct, so {c} is w-admissible in the reduct, which contradicts {x1 } being w-admissible. A similar reasoning applies to the case where {x̄1 } is w-admissible but Φ is not valid.
⇐: Assume that the formula is valid and w.l.o.g. assume that x1 appears in all clauses. Then by construction x1 attacks all the other arguments in GΦ and thus {x1 } is a w-admissible set.

We next extend our reduction by two further arguments φ, pn+1 in order to show our hardness results.

Reduction 3.12. Given a ∀-QBF Φ = ∀xn ∃xn−1 . . . ∃x1 : φ(x1 , x2 , . . . , xn ) we define the AF FΦ = GΦ ∪ ({φ, pn+1 }, {(φ, pn+1 ), (pn+1 , xn ), (pn+1 , x̄n ), (xn , φ), (x̄n , φ)}).
Example 3.13. Recall the valid QBF from our first example:
∀x2 ∃x1 : φ with φ = C1 ∧ C2 = (¬x2 ∨ x1 ) ∧ (x2 ∨ ¬x1 ).
Augmenting Reduction 3.4 with Reduction 3.12 yields the
following AF F :
[Figure: the AF F , with arguments c1 , c2 , x1 , x̄1 , x2 , x̄2 , p1 , p2 , p3 , and φ.]
Lemma 3.10. If Proposition 3.7 holds for ∃-QBFs with n−1
variables then it also holds for ∀-QBFs with n variables.
Proof. Consider a ∀-QBF Φ = ∀xn ∃xn−1 . . . ∃x1 : φ(x1 , x2 , . . . , xn ). We have that Φ is valid iff both Φ1 = ∃xn−1 ∀xn−2 . . . ∃x1 : φ(x1 , x2 , . . . , ⊤) and Φ2 = ∃xn−1 ∀xn−2 . . . ∃x1 : φ(x1 , x2 , . . . , ⊥) are valid. Moreover, GΦ^{{xn}} = GΦ1 and GΦ^{{x̄n}} = GΦ2 .
Note the similarity to Example 3.6: basically, φ replaces the pair x3 , x̄3 of arguments. Hence it is easy to see that {φ} is w-admissible since the reduct F^{{φ}} is again the first AF from Example 3.5, possessing no w-admissible argument.
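The reduct-based recursion used in these examples can be made concrete with a small executable sketch: a set is weakly admissible iff it is conflict-free and no weakly admissible set of its reduct attacks it. This is a brute-force check (exponential, as the hardness results lead one to expect); the function names are ours.

```python
from itertools import combinations

def conflict_free(E, attacks):
    return not any((a, b) in attacks for a in E for b in E)

def reduct(args, attacks, E):
    # remove E and everything E attacks; keep the remaining attacks
    attacked = {b for (a, b) in attacks if a in E}
    rest = args - set(E) - attacked
    return rest, {(a, b) for (a, b) in attacks if a in rest and b in rest}

def w_admissible(args, attacks, E):
    E = frozenset(E)
    if not E <= args or not conflict_free(E, attacks):
        return False
    rargs, ratt = reduct(args, attacks, E)
    # no w-admissible set of the reduct may attack E (attacks read in F)
    for k in range(len(rargs) + 1):
        for D in combinations(sorted(rargs), k):
            if any((d, e) in attacks for d in D for e in E) \
               and w_admissible(rargs, ratt, D):
                return False
    return True
```

On the AF where a self-attacker c attacks a, the sketch confirms that {a} is w-admissible although it is not classically admissible, matching the motivation of discounting the damage caused by self-defeating arguments.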
We now formally characterize the potential w-admissible sets in Reduction 3.12.

Lemma 3.14. For a QBF Φ, ad w (FΦ ) ⊆ {∅, {φ}}.

Proof. In comparison to Lemma 3.8, we are only left to consider pn+1 . The assumption pn+1 ∈ E ∈ ad w (F ) yields an analogous contradiction: E ∈ cf (F ) implies φ, xn ∉ E and hence pn+1 is attacked by {φ} ∈ ad w (F E ).

Proposition 3.15. Given a ∀-QBF Φ = ∀xn ∃xn−1 . . . ∃x1 : φ(x1 , x2 , . . . , xn ) we have that Φ is valid if and only if ad w (FΦ ) = {∅, {φ}}, and ad w (FΦ ) = {∅} otherwise.

Proof. We have that the empty set is always w-admissible and, by Lemma 3.14, that {φ} is the only candidate for being a w-admissible set. Now consider {φ} and the reduct FΦ^{{φ}}. We have that FΦ^{{φ}} = GΦ , with xn and x̄n being the attackers of φ. By Proposition 3.7 we have that {xn } or {x̄n } is w-admissible in the reduct iff Φ is not valid. Thus {φ} is w-admissible iff Φ is valid.

Theorem 3.16. All of the following problems are PSPACE-complete: Credad w , Verad w , NEmpty ad w , Credpr w , Skeptpr w , Verpr w , and NEmpty pr w .

Proof. The membership results are by Proposition 3.3. The hardness results are all by Reduction 3.12 and Proposition 3.15. It only remains to state the precise problem instances that are equivalent to testing the validity of the ∀-QBF Φ. First, consider Credad w = Credpr w . In the AF FΦ we have that φ is credulously accepted w.r.t. w-admissible semantics iff {φ} ∈ ad w (FΦ ) iff Φ is valid. Now, consider Verad w and Verpr w . We have that {φ} ∈ ad w (FΦ ) iff {φ} ∈ pr w (FΦ ) iff Φ is valid. Next, consider Skeptpr w . We have that φ is skeptically accepted iff pr w (FΦ ) = {{φ}} iff Φ is valid. Finally, consider NEmpty ad w = NEmpty pr w . We have that the only w-preferred/w-admissible extension is the empty set iff Φ is not valid.

That is, Reduction 3.12 provides a reduction from ∀-QBF to all of the considered problems, and as it can clearly be performed in polynomial time, the PSPACE-hardness of all these problems follows.

4 DATALOG Encoding

In this section we provide a DATALOG encoding for w-admissible semantics. Our reduction will generate a polynomial-size logic program that falls into the class of non-recursive DATALOG with stratified negation, which is known to be PSPACE-complete in terms of program-complexity (Dantsin et al. 2001).

We first briefly recall the syntax of DATALOG programs. We fix a countable set U of constants. An atom is an expression p(t1 , . . . , tn ), where p is a predicate symbol of arity n ≥ 0 and each term ti is either a variable or an element from U . An atom is ground if it is free of variables. BU denotes the set of all ground atoms over U . A rule r is of the form

a ← b1 , . . . , bk , not bk+1 , . . . , not bm .

with m ≥ k ≥ 0, where a, b1 , . . . , bm are atoms, and "not" stands for default negation. The head of r is a and the body of r is body(r) = {b1 , . . . , bk , not bk+1 , . . . , not bm }. Furthermore, body + (r) = {b1 , . . . , bk } is the positive body and body − (r) = {bk+1 , . . . , bm } is the negative body. When convenient, we write (parts of) a rule body as a conjunction. A rule r is ground if r does not contain variables. Moreover, the DATALOG safety condition requires that all variables of a rule appear in the positive body. We say that a predicate A depends on predicate B if there is a rule where A appears in the head and B in the body. We say a program is non-recursive if the dependencies between predicates have no cycle.

In DATALOG we distinguish between input predicates, which are given by an input database, i.e. a set of ground atoms, and the predicates that are defined by the rules of the program. The complexity analysis of DATALOG distinguishes data-complexity, where one considers an arbitrary but fixed program and analyses the complexity w.r.t. the size of the database, and program-complexity, where one considers an arbitrary but fixed database and analyses the complexity w.r.t. the size of the program. Our encoding refers to the latter notion, as we encode weakly admissible sets of an AF as a non-recursive DATALOG program which is then applied to a fixed input database.

For our encoding we consider an AF F = (A, R) with arguments A = {a1 , . . . , an }. The weakly admissible sets will be encoded as an n-ary predicate wadm(e1 , . . . , en ) where variable ei indicates whether argument ai is in the extension or not. That is, our fixed database will be over the boolean domain {0, 1}.

Our encoding closely follows the definition of w-admissible sets, which of course is recursive. To avoid recursion in the DATALOG program we will exploit that the recursion depth is bounded by n, as a reduct is always smaller than the AF from which it is built (we delete at least the arguments from the extension). We will introduce n copies of certain predicates, each of which can only be used on a certain recursion depth of the w-admissible definition.

The input database contains a unary predicate dom = {0, 1} defining the boolean domain of our variables and standard predicates that allow us to encode the arithmetic operations we are using in our rules, e.g. the binary predicates equal = {(0, 0), (1, 1)} and leq = {(0, 0), (0, 1), (1, 1)} (below we denote them via "=" and "≤" symbols).

We will first introduce certain auxiliary predicates which we require in order to define wadm(e1 , . . . , en ). In our encoding we will use variables {xi , yi , di , ei | 1 ≤ i ≤ n} to represent whether arguments are in certain sets or not. We will use the following shorthands to group variables that together represent a set of arguments: X = x1 , . . . , xn , Y = y1 , . . . , yn , D = d1 , . . . , dn , and E = e1 , . . . , en . We will use each set of variables to represent a set of arguments such that the i-th variable is set to 1 iff the i-th argument is in the set and 0 otherwise.
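The intended reading of these 0/1 vectors can be mirrored by plain Python checks (a sketch only: the function names are ours, and the attack relation R below is an illustrative stand-in chosen to be consistent with the attacks (a1, a2), (a2, a4) and the range computations used in Examples 4.2-4.4; the full AF of Example 2.7 is not restated here).

```python
# A set over arguments a1..an is a 0/1 vector: position i is 1 iff ai is in the set.
n = 4
R = {(0, 1), (1, 3), (3, 2)}   # 0-based attacks: a1->a2, a2->a4, a4->a3 (illustrative)

def subset(X, Y):
    # X is a subset of Y iff xi <= yi at every position
    return all(x <= y for x, y in zip(X, Y))

def cf(E):
    # conflict-free: for every attack (i, j), not both endpoints are in E
    return all(E[i] + E[j] <= 1 for (i, j) in R)

def att(D, E):
    # D attacks E iff some attack (i, j) has di = 1 and ej = 1
    return any(D[i] == 1 and E[j] == 1 for (i, j) in R)

def rng(X, E):
    # range of E in the sub-AF F|X (assuming E is a subset of X):
    # E together with every argument of F|X attacked by E
    return tuple(
        max(E[j], int(X[j] == 1 and any(E[i] == 1 for (i, k) in R if k == j)))
        for j in range(n))

def red(X, E):
    # reduct: yi = xi - di, the arguments of F|X outside the range of E
    D = rng(X, E)
    return tuple(x - d for x, d in zip(X, D))
```

With this R, `red((1, 1, 1, 1), (0, 0, 0, 1))` yields `(1, 1, 0, 0)`, matching the reduct computed in Example 4.4.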
We start with encoding the subset relation between two sets of arguments X, Y by

X ⊆ Y ← ∧_{i=1}^{n} xi ≤ yi .

Next we define a predicate cf (·) encoding conflict-freeness. Conflict-free sets can be defined by a rule which for each attack checks that not both incident arguments are in the set:

cf (E) ← ∧_{i=1}^{n} dom(ei ), ∧_{(i,j)∈R} ei + ej ≤ 1.

Notice that we added the dom(ei ) predicates in the body only to meet the safety condition of DATALOG (for arguments that are not involved in any attack).

Example 4.1. We will use the AF F from Example 2.7 as our running example for this section. [Figure: the AF F with arguments a1 , a2 , a3 , a4 .] For our example we obtain that cf (e1 , e2 , e3 , e4 ) = {(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1), (0, 1, 1, 0), (1, 0, 0, 1)}, which corresponds to the seven conflict-free sets of F .

Next we define the predicate Att(·, ·) which encodes that the first set of arguments attacks the second set. To this end, for each attack (aj , ak ) ∈ R, we add the following rule to the DATALOG program:

Att(D, E) ← ∧_{i=1}^{n} dom(di ), ∧_{i=1}^{n} dom(ei ), dj = 1, ek = 1.

The dom(·) predicates again ensure that we meet the safety condition of DATALOG.

Example 4.2. Consider our running example and sets D = {a2 }, E = {a4 }. We have that D attacks E as (a2 , a4 ) is an attack of F . Our DATALOG program thus has a rule Att(D, E) ← ∧_{i=1}^{4} dom(di ), ∧_{i=1}^{4} dom(ei ), d2 = 1, e4 = 1, and we thus obtain Att(0, 1, 0, 0, 0, 0, 0, 1). Similarly we obtain Att(1, 0, 0, 0, 0, 1, 0, 0) as a1 attacks a2 .

In the following, with slight abuse of notation, we will use F ↓X to refer to the sub-AF of F that is given by the arguments in the set represented by X, i.e. F ↓X = (A′ , R ∩ (A′ × A′ )) with A′ = {ai | xi = 1}. We define a predicate Range(X, E, D) that defines the range D of an extension E in the AF F ↓X :

Range(X, E, D) ← E ⊆ D, ∧_{(i,j)∈R} min(ei , xj ) ≤ dj , ∧_{i=1}^{n} (di ≤ ei + max_{(j,i)∈R} ej ), D ⊆ X.

The first constraint ensures that each argument in E is also in the range. The second constraint ensures that each argument in F ↓X that is attacked by E is in the range (but makes no statement about arguments not in F ↓X ). The third constraint encodes that an argument is only in the range if it is in E or attacked by E, and the final constraint ensures that only arguments in F ↓X can be in the range.

Example 4.3. Consider our running example, the initial AF, and the set E = {a4 }. We then obtain Range(1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1), which represents that the range of E equals {a3 , a4 }. Now consider the sub-AF G = F ↓{a1 ,a2 } and the set E = {a2 }. We obtain Range(1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0), which represents that the range of E in G equals {a2 }.

We are now ready to encode w-admissible semantics. In a first step we encode the reduct operation and use a predicate Red(X, E, Y ) encoding that when we are in the sub-AF F ↓X and build the reduct for the argument set E we obtain the sub-AF F ↓Y :

Red(X, E, Y ) ← Range(X, E, D), ∧_{i=1}^{n} yi = xi − di .

Range(X, E, D) defines the range of E within the sub-framework F ↓X and the second constraint makes sure that exactly those arguments which are in X but not in the range of E are included in the reduct F ↓Y (notice that by the definition of Range we have di ≤ xi ).

Example 4.4. Consider our running example, the initial AF, and the set E = {a4 }. This determines the first eight variables of Red and we then obtain Red(1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0) as the Range predicate sets D to (0, 0, 1, 1). This reflects that F E = F ↓{a1 ,a2 } . Now consider the sub-AF G = F ↓{a1 ,a2 } and the set E = {a2 }. We obtain Red(1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0) as the Range predicate sets D to (0, 1, 0, 0). This reflects that GE = F ↓{a1 } .

In order to define the predicate wadm(E) we introduce predicates Pi (X, E), 1 ≤ i ≤ n, that encode the w-admissible sets of the reducts on the i-th recursion level. Recall that the recursion depth of the computation is bounded by n. The variables X in Pi (X, E) encode the arguments of the reduct and the variables E encode the extension, i.e. E represents a w-admissible set of F ↓X . We have that the initial AF F corresponds to the reduct containing all arguments:

wadm(E) ← P1 (1, . . . , 1, E).

Example 4.5. We want to show that E = {a4 } is w-admissible in our running example, i.e. we want to prove that wadm(0, 0, 0, 1). By the above rule this is equivalent to showing P1 (1, 1, 1, 1, 0, 0, 0, 1).

Next we define the w-admissible sets of each reduct. We first state the rules for 1 ≤ i ≤ n − 1 and then consider the special case Pn where at most one argument is left in the
reduct.
Pi (X, E) ← E ⊆ X, cf (E), not Qi (X, E).
Qi (X, E) ← Red(X, E, Y ), D ⊆ Y, Att(D, E), Pi+1 (Y, D).
Pn (X, E) ← Σ_{i=1}^{n} xi ≤ 1, E ⊆ X, cf (E).

The first two rules are a direct encoding of the definition of w-admissible sets. That is, a set E is weakly admissible in F if it is conflict-free in F and there is no weakly admissible set D in the reduct F E that attacks E. The last rule covers the special case at recursion depth n, where at most one argument is left in the reduct.

Example 4.6. In order to prove that {a4 } is w-admissible we need to show that P1 (1, 1, 1, 1, 0, 0, 0, 1) is derived by the encoding. By the above we have that (0, 0, 0, 1) ⊆ (1, 1, 1, 1) and cf (0, 0, 0, 1) hold and thus we have to investigate Q1 (1, 1, 1, 1, 0, 0, 0, 1). We have that, by Red(X, E, Y ), Y must be (1, 1, 0, 0), and further, by D ⊆ Y , D must be one of (0, 0, 0, 0), (0, 1, 0, 0), (1, 0, 0, 0), (1, 1, 0, 0). Finally, by Att(D, 0, 0, 0, 1), we have that D must be either (0, 1, 0, 0) or (1, 1, 0, 0), corresponding to the sets {a2 } and {a1 , a2 }. We thus have to test P2 (1, 1, 0, 0, 0, 1, 0, 0) and P2 (1, 1, 0, 0, 1, 1, 0, 0).

The latter immediately fails as (1, 1, 0, 0) ∉ cf . For the former we have that (0, 1, 0, 0) ⊆ (1, 1, 0, 0) and cf (0, 1, 0, 0) hold and we have to investigate Q2 (1, 1, 0, 0, 0, 1, 0, 0). For Red(X, E, Y ), Y must be (1, 0, 0, 0) and further, by D ⊆ Y and Att(D, 0, 1, 0, 0), D must be (1, 0, 0, 0). We thus have to test P3 (1, 0, 0, 0, 1, 0, 0, 0).

We have that (1, 0, 0, 0) ⊆ (1, 0, 0, 0) and cf (1, 0, 0, 0) hold, and we have to investigate Q3 (1, 0, 0, 0, 1, 0, 0, 0). Now, Y must be (0, 0, 0, 0) to make Red(X, E, Y ) hold, and by D ⊆ Y also D = (0, 0, 0, 0). But as (0, 0, 0, 0, 1, 0, 0, 0) ∉ Att, i.e. the empty set does not attack {a1 }, we cannot prove Q3 (1, 0, 0, 0, 1, 0, 0, 0).

But then not Q3 (1, 0, 0, 0, 1, 0, 0, 0) is true and we obtain P3 (1, 0, 0, 0, 1, 0, 0, 0) as well as Q2 (1, 1, 0, 0, 0, 1, 0, 0). Given that Q2 (1, 1, 0, 0, 0, 1, 0, 0) holds, we cannot prove P2 (1, 1, 0, 0, 0, 1, 0, 0) and thus, as also P2 (1, 1, 0, 0, 1, 1, 0, 0) failed, we cannot prove Q1 (1, 1, 1, 1, 0, 0, 0, 1). But then not Q1 (1, 1, 1, 1, 0, 0, 0, 1) is true and we obtain P1 (1, 1, 1, 1, 0, 0, 0, 1) and wadm(0, 0, 0, 1).

Notice that our DATALOG encoding is indeed non-recursive and thus can be solved in PSPACE.

5 Conclusion

In this paper, we investigated the computational complexity of the standard reasoning problems for weakly admissible and weakly preferred semantics. More specifically, we examined the verification problem, the problem of deciding whether or not a given AF possesses a non-empty extension, as well as credulous and skeptical acceptance of a given argument. It turns out that all of them except the trivial skeptical acceptance for ad w are PSPACE-complete in general.

The lower bound was proved by a suitable adjustment of the well-known standard translation from propositional formulas to AFs, with some noteworthy novel features: i) the argument φ representing whether or not the formula evaluates to true is not attacked by the arguments Ci representing the clauses, but only by two of the variables, ii) the arguments representing the variables occurring in the given formula attack each other, forming several layers in order to implement the quantifier alternation, iii) auxiliary arguments pj are required to guide the simulation of the aforementioned alternation, and iv) none of the arguments corresponding to variables in the QBF at hand are contained in a w-admissible extension of the constructed AF; the important part of the construction is the interaction of arguments which are not accepted.

The construction demonstrates that the "look ahead" incorporated in w-admissible semantics, which we hinted at in Examples 2.7 and 3.1, is as expressive as any reasoning problem in PSPACE. It is quite surprising that a simple definition of an argumentation semantics (after all, only the reduct and conflict-free semantics are mentioned) with a natural motivation such as reducing the damage caused by self-defeating arguments (as opposed to tailoring artificial semantics in order to reach this expressive power) is PSPACE-hard. However, this high computational complexity also calls for the investigation of suitable algorithms. As a first step towards this direction, we provided a DATALOG encoding for w-admissible semantics. Implementing and evaluating it is part of future work.

In light of the results obtained in this paper, several other reasoning problems are now also expected to be PSPACE-complete: most notably, the standard reasoning problems for weakly complete and weakly grounded extensions (see (Baumann, Brewka, and Ulbricht 2020)), but also more sophisticated ones like enforcement (Wallner, Niskanen, and Järvisalo 2017) or computing repairs (Baumann and Ulbricht 2019; Brewka, Thimm, and Ulbricht 2019). A possible future work direction is to formally prove these conjectures.

The most natural strategy to handle the considerable computational complexity is the investigation of suitable subclasses of AFs, which has already proven beneficial for the classical Dung-style semantics (see e.g. (Dvořák and Dunne 2018, Section 3.4)). This might also provide the basis for potential fixed-parameter tractable algorithms (Dvořák, Ordyniak, and Szeider 2012; Dvořák, Pichler, and Woltran 2012).

Acknowledgments

This research has been supported by WWTF through project ICT19-065, FWF through project P30168, and DFG through project BR 1817/7-2.
References
Baroni, P.; Caminada, M.; and Giacomin, M. 2011. An
introduction to argumentation semantics. The Knowledge
Engineering Review 26:365–410.
Baumann, R., and Ulbricht, M. 2019. If nothing is accepted–
repairing argumentation frameworks. Journal of Artificial
Intelligence Research 66:1099–1145.
Baumann, R.; Brewka, G.; and Ulbricht, M. 2020. Revisiting the foundations of abstract argumentation: Semantics
based on weak admissibility and weak defense. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2742–2749. AAAI Press.
Brewka, G.; Thimm, M.; and Ulbricht, M. 2019. Strong
inconsistency. Artificial Intelligence 267:78–117.
Cadoli, M., and Schaerf, M. 1993. A survey of complexity results for nonmonotonic logics. J. Log. Program.
17(2/3&4):127–160.
Cadoli, M.; Eiter, T.; and Gottlob, G. 2005. Complexity of
propositional nested circumscription and nested abnormality
theories. ACM Trans. Comput. Log. 6(2):232–272.
Dantsin, E.; Eiter, T.; Gottlob, G.; and Voronkov, A. 2001.
Complexity and expressive power of logic programming.
ACM Comput. Surv. 33(3):374–425.
Dung, P. M. 1995. On the acceptability of arguments
and its fundamental role in nonmonotonic reasoning, logic
programming and n-person games. Artificial Intelligence
77(2):321–357.
Dvořák, W., and Dunne, P. E. 2018. Computational problems in formal argumentation and their complexity. In Baroni, P.; Gabbay, D.; Giacomin, M.; and van der Torre, L.,
eds., Handbook of Formal Argumentation. College Publications. also appears in IfCoLog Journal of Logics and their
Applications 4(8):2623–2706.
Dvořák, W., and Gaggl, S. A. 2016. Stage semantics and the
scc-recursive schema for argumentation semantics. J. Log.
Comput. 26(4):1149–1202.
Dvořák, W.; Ordyniak, S.; and Szeider, S. 2012. Augmenting tractable fragments of abstract argumentation. Artif. Intell. 186:157–173.
Dvořák, W.; Pichler, R.; and Woltran, S. 2012. Towards
fixed-parameter tractable algorithms for abstract argumentation. Artif. Intell. 186:1–37.
Eiter, T., and Gottlob, G. 1996. The complexity of nested
counterfactuals and iterated knowledge base revisions. J.
Comput. Syst. Sci. 53(3):497–512.
Eiter, T., and Gottlob, G. 2006. Reasoning under minimal
upper bounds in propositional logic. Theor. Comput. Sci.
369(1-3):82–115.
Gaggl, S. A., and Woltran, S. 2013. The cf2 argumentation
semantics revisited. J. Log. Comput. 23(5):925–949.
Papadimitriou, C. H. 1991. On selecting a satisfying truth
assignment (extended abstract). In 32nd Annual Symposium on Foundations of Computer Science, San Juan, Puerto
Rico, 1-4 October 1991, 163–169. IEEE Computer Society.
Thomas, M., and Vollmer, H. 2010. Complexity of nonmonotonic logics. Bulletin of the EATCS 102:53–82.
Wallner, J. P.; Niskanen, A.; and Järvisalo, M. 2017. Complexity results and algorithms for extension enforcement in
abstract argumentation. J. Artif. Intell. Res. 60:1–40.
Cautious Monotonicity in Case-Based Reasoning with Abstract Argumentation
Guilherme Paulino-Passos1 , Francesca Toni1
1
Imperial College London, Department of Computing
{g.passos18, f.toni}@imperial.ac.uk
Abstract

Recently, abstract argumentation-based models of case-based reasoning (AA-CBR in short) have been proposed, originally inspired by the legal domain, but also applicable as classifiers in different scenarios, including image classification, sentiment analysis of text, and predicting the passage of bills in the UK Parliament. However, the formal properties of AA-CBR as a reasoning system remain largely unexplored. In this paper, we focus on analysing the non-monotonicity properties of a regular version of AA-CBR (that we call AA-CBR≽ ). Specifically, we prove that AA-CBR≽ is not cautiously monotonic, a property frequently considered desirable in the literature of non-monotonic reasoning. We then define a variation of AA-CBR≽ which is cautiously monotonic, and provide an algorithm for obtaining it. Further, we prove that such variation is equivalent to using AA-CBR≽ with a restricted casebase consisting of all "surprising" cases in the original casebase.

1 Introduction

Case-based reasoning (CBR) relies upon known solutions for problems (past cases) to infer solutions for unseen problems (new cases), based upon retrieving past cases which are "similar" to the new cases. It is widely used in legal settings (e.g. see (Prakken et al. 2015; Čyras, Satoh, and Toni 2016a)), for classification (e.g. via the k-NN algorithm and, more recently, within the DEAr methodology (Cocarascu et al. 2020)) and for explanation (e.g. see (Nugent and Cunningham 2005; Kenny and Keane 2019; Cocarascu et al. 2020)).

In this paper we focus on a recent approach to CBR based upon an argumentative reading of (past and new) cases (Čyras, Satoh, and Toni 2016a; Čyras, Satoh, and Toni 2016b; Cocarascu, Čyras, and Toni 2018; Čyras et al. 2019; Cocarascu et al. 2020), and using Abstract Argumentation (AA) (Dung 1995) as the underpinning machinery. In this paper, we will refer to all proposed incarnations of this approach in the literature generically as AA-CBR (the acronym used in the original paper (Čyras, Satoh, and Toni 2016a)): they all generate an AA framework from a CBR problem, with attacks from "more specific" past cases to "less specific" past cases or to a "default argument" (embedding a sort of bias), and attacks from new cases to "irrelevant" past cases; then, they all reduce CBR to membership of the "default argument" in the grounded extension (Dung 1995), and use fragments of the AA framework for explanation (e.g. dispute trees as in (Čyras, Satoh, and Toni 2016b; Cocarascu et al. 2020) or excess features in (Čyras et al. 2019)). Different incarnations of AA-CBR use different mechanisms for defining "specificity", "irrelevance" and "default argument": the original version in (Čyras, Satoh, and Toni 2016a) defines all three notions in terms of ⊇ (and is thus referred to in this paper as AA-CBR⊇ ); thus, AA-CBR⊇ is applicable only to cases characterised by sets of features; the version used for classification in (Cocarascu et al. 2020) defines "specificity" in terms of a generic partial order ≽, "irrelevance" in terms of a generic relation ≁ and "default argument" in terms of a generic characterisation δC (and is thus referred to in this paper as AA-CBR≽,≁,δC ). Thus, AA-CBR≽,≁,δC is in principle applicable to cases characterised in any way, as sets of features or unstructured (Cocarascu et al. 2020). Here we will study a special, regular instance of AA-CBR≽,≁,δC (which we refer to as AA-CBR≽ ) in which "irrelevance" and the "default argument" are both defined in terms of "specificity" (and in particular the "default argument" is defined in terms of the "most specific" case). AA-CBR≽ admits AA-CBR⊇ as an instance, obtained by choosing ≽ = ⊇ and by restricting attention to "coherent" casebases (whereby there is no "noise", in that no two cases with different outcomes are characterised by the same set of features).

AA-CBR was originally inspired by the legal domain in (Čyras, Satoh, and Toni 2016a), but some incarnations of AA-CBR, integrating dynamic features, have proven useful in predicting and explaining the passage of bills in the UK Parliament (Čyras et al. 2019), and some instances of AA-CBR≽,≁,δC have also been shown to be fruitfully applicable as classifiers in a number of scenarios, including classification with categorical data, with images and for sentiment analysis of text (Cocarascu et al. 2020).

In this paper we study non-monotonicity properties of AA-CBR≽ , understood at the same time as a reasoning system and as a classifier. These properties, typically considered for logical systems, intuitively characterise in which sense systems may stop inferring some conclusions when more information is made available to them (Makinson
1994). These properties are thus related to modelling inference which is tentative and defeasible, as opposed to the indefeasible form of inference of classical logic. Non-monotonicity properties have already been studied in argumentation systems, such as ABA and ABA+ (Čyras and Toni 2015; Čyras and Toni 2016), ASPIC+ (Dung 2014; Dung 2016) and logic-based argumentation systems (Hunter 2010). In this paper, we study those properties for the application of argumentation to classification, in particular in the form of AA-CBR.

The following example illustrates AA-CBR (and AA-CBR⊇ in particular) as well as its non-monotonicity, in a legal setting.

Example 1. Consider a simplified legal system built by cases and adhering, like most modern legal systems, to the principle by which, unless proven otherwise, no person is to be considered guilty of a crime. This can be represented by a "default argument" (∅, −), indicating that, in the absence of any information about any person, the legal system should infer a negative outcome − (that the person is not guilty). (∅, −) can be understood as an argument, in the AA sense, given that it is merely what is called a relative presumption, since it is open to proof to the contrary, e.g. by proving that the person did indeed commit a crime. Let us consider here one possible crime: homicide¹ (hm). In one case, it was established that the defendant committed homicide, and he was considered guilty, represented as ({hm}, +). Consider now a new case ({hm, sd}, ?), with an unknown outcome, of a defendant who committed homicide, but for whom it was proven that it was in self-defence (sd). In order to predict the new case's outcome by CBR, AA-CBR reduces the prediction problem to that of membership of the default argument in the grounded extension G (Dung 1995) of the AA framework in Figure 1: given that (∅, −) ∉ G, the predicted outcome is positive (i.e. guilty), disregarding sd and, indeed, no matter what other feature this case may have. Thus, up to this point, having the feature hm is a sufficient condition for predicting guilty. If, however, the court decides that for this new case the defendant should be acquitted, the case ({hm, sd}, −) enters our casebase. Now, having the feature hm is no longer a sufficient condition for predicting guilty, and any case with both hm and sd will be predicted a negative outcome (i.e. that the person is innocent). This is the case for predicting the outcome of a new case with again both hm and sd, in AA-CBR using the AA framework in Figure 2. Thus, adding a new case to the casebase removed some conclusions which were inferred from the previous, smaller casebase. This illustrates non-monotonicity.

[Figure: arguments (∅, −), ({hm}, +), ({hm, sd}, ?).] Figure 1: Initial AA framework for Example 1. Past cases (with their outcomes) and the new case (with no outcome, indicated by a question mark) are represented as arguments. AA-CBR predicts outcome + for the new case. (Grounded extension in colour.)

[Figure: arguments (∅, −), ({hm}, +), ({hm, sd}, −), ({hm, sd}, ?).] Figure 2: Revised AA framework for Example 1. Here, the added past case changes the AA-CBR-predicted outcome to − by limiting the applicability of the previous past case. (Again, grounded extension in colour.)

In this paper we prove that the kind of inference underpinning AA-CBR≽ lacks a standard non-monotonicity property, namely cautious monotonicity. Intuitively, this property means that if a conclusion is added to the set of premises (here, the casebase), then no conclusion is lost, that is, everything which was inferable still is so. In terms of a supervised classifier, satisfying cautious monotonicity amounts to being "closed" under self-supervision. That is, augmenting the dataset with conclusions inferred by the classifier itself does not change the classifier.

Then, we make a two-fold contribution: we define (formally and algorithmically) a provably cautiously monotonic variant of AA-CBR≽ , that we call cAA-CBR≽ , and prove that it is equivalent to AA-CBR≽ applied to a restricted casebase consisting of all "surprising" cases in the original casebase. We also show that the property of cautious monotonicity of cAA-CBR≽ leads to the desirable properties of cumulativity and rational monotonicity. All results here presented are restricted to coherent casebases, in which no case characterisation (problem) occurs with more than one outcome (solution).
2
2.1
Background
Abstract argumentation
An abstract argumentation framework (AF) (Dung 1995) is a pair (Args, ⇝), where Args is a set (of arguments) and ⇝ is a binary relation on Args. For α, β ∈ Args, if α ⇝ β, then we say that α attacks β and that α is an attacker of β. For a set of arguments E ⊆ Args and an argument α ∈ Args, E defends α if for all β ⇝ α there exists γ ∈ E such that γ ⇝ β. Then, the grounded extension of (Args, ⇝) can be constructed as G = ⋃_{i≥0} Gi, where G0 is the set of all unattacked arguments and, for all i ≥ 0, Gi+1 is the set of arguments that Gi defends. For any (Args, ⇝), the grounded extension G always exists and is unique and, if (Args, ⇝) is well-founded (Dung 1995), extensions under other semantics (e.g. stable extensions (Dung 1995), where E ⊆ Args is stable if ∄α, β ∈ E such that α ⇝ β and, moreover, ∀α ∈ Args \ E, ∃β ∈ E such that β ⇝ α) are equal to G. In particular, for finite AFs, (Args, ⇝) is well-founded iff it is acyclic.

A casebase D is coherent if there are no two cases (αC, αo), (βC, βo) ∈ D such that αC = βC but αo ≠ βo.

¹ This is merely a hypothetical example, so the terms used do not correspond to a specific jurisdiction.
Given (Args, ⇝), we will sometimes use α ∈ (Args, ⇝) to stand for α ∈ Args.
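As an illustration of the construction above (ours, not part of the paper), the grounded extension of a finite AF can be computed by iterating the defence operator from the unattacked arguments. The sketch below, in Python, uses hypothetical string names for the arguments of an AF shaped like that of Figure 2:

```python
def grounded_extension(args, attacks):
    """Iteratively compute the grounded extension of a finite AF.

    `attacks` is a set of (attacker, attacked) pairs.
    """
    def defended_by(E, a):
        # E defends a if every attacker of a is itself attacked by some member of E
        return all(any((c, b) in attacks for c in E)
                   for (b, t) in attacks if t == a)

    G = set()
    while True:
        # unattacked arguments are vacuously defended, so G grows from G0 upwards
        nxt = {a for a in args if defended_by(G, a)}
        if nxt == G:
            return G
        G = nxt

# A chain like Figure 2's: ({hm, sd}, -) attacks ({hm}, +), which attacks the default
args = ["default", "hm+", "hmsd-"]
attacks = {("hmsd-", "hm+"), ("hm+", "default")}
print(grounded_extension(args, attacks))  # {'hmsd-', 'default'}
```

Note that the loop is a least-fixed-point computation: the first iteration yields exactly G0 (the unattacked arguments), and each further iteration adds the arguments the current set defends.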
2.2 Non-monotonicity properties

We will be interested in the following properties.² An arbitrary inference relation ⊢ (for a language including, in particular, sentences a, b, etc., with negations ¬a and ¬b, etc., and sets of sentences A, B) is said to satisfy:
1. non-monotonicity, iff A ⊢ a and A ⊆ B do not imply that B ⊢ a;
2. cautious monotonicity, iff A ⊢ a and A ⊢ b imply that A ∪ {a} ⊢ b;
3. cut, iff A ⊢ a and A ∪ {a} ⊢ b imply that A ⊢ b;
4. cumulativity, iff ⊢ is both cautiously monotonic and satisfies cut;
5. rational monotonicity, iff A ⊢ a and A ⊬ ¬b imply that A ∪ {b} ⊢ a;
6. completeness, iff either A ⊢ a or A ⊢ ¬a.

3 Setting the ground

In this section we define AA-CBR⪰, adapting definitions from (Cocarascu et al. 2020).

All incarnations of AA-CBR, including AA-CBR⪰, map a database D of examples labelled with an outcome and an unlabelled example (for which the outcome is unknown) into an AF. Here, the database may be understood as a casebase, the labelled examples as past cases and the unlabelled example as a new case: we will use these terminologies interchangeably throughout. In this paper, as in (Cocarascu et al. 2020), examples/cases have a characterisation (e.g., as in (Čyras, Satoh, and Toni 2016a), characterisations may be sets of features), and outcomes are chosen from two available ones, one of which is selected up-front as the default outcome. Finally, in the spirit of (Cocarascu et al. 2020), we assume that the set of characterisations of (past and new) cases is equipped with a partial order ⪯ (whereby α ≺ β holds if α ⪯ β and α ≠ β, and is read "α is less specific than β") and with a relation ≁ (whereby α ≁ β is read as "β is irrelevant to α"). Formally:

Definition 2 (Adapted from (Cocarascu et al. 2020)). Let X be a set of characterisations, equipped with a partial order ≺ and a binary relation ≁. Let Y = {δo, δ̄o} be the set of (all possible) outcomes, with δo the default outcome. Then, a casebase D is a finite set such that D ⊆ X × Y (thus a past case α ∈ D is of the form (αC, αo) for αC ∈ X and αo ∈ Y) and a new case is of the form (NC, ?) for NC ∈ X. We also discriminate a particular element δC ∈ X and define the default argument (δC, δo) ∈ X × Y.

For simplicity of notation, we sometimes extend the definition of ⪯ to X × Y, by setting (αC, αo) ⪯ (βC, βo) iff αC ⪯ βC.³

Definition 3 (Adapted from (Cocarascu et al. 2020)). The AF mined from a dataset D and a new case (NC, ?) is (Args, ⇝), in which:
• Args = D ∪ {(δC, δo)} ∪ {(NC, ?)};
• for (αC, αo), (βC, βo) ∈ D ∪ {(δC, δo)}, it holds that (αC, αo) ⇝ (βC, βo) iff
1. αo ≠ βo,
2. αC ⪰ βC, and
3. ∄(γC, γo) ∈ D ∪ {(δC, δo)} with αC ≻ γC ≻ βC and γo = αo;
• for (βC, βo) ∈ D ∪ {(δC, δo)}, it holds that (NC, ?) ⇝ (βC, βo) iff (NC, ?) ≁ (βC, βo).

The AF mined from a dataset D alone is (Args′, ⇝′), with Args′ = Args \ {(NC, ?)} and ⇝′ = ⇝ ∩ (Args′ × Args′).

Note that if D is coherent, then the "equals" case in item 2 of the definition of attack will never apply. As a result, the AF mined from a coherent D (and any (NC, ?)) is guaranteed to be well-founded.

Definition 4 (Adapted from (Cocarascu et al. 2020)). Let G be the grounded extension of the AF mined from D and (NC, ?), with default argument (δC, δo). The outcome for NC is δo if (δC, δo) is in G, and δ̄o otherwise.

In this paper we focus on a particular case of this scenario:

Definition 5. The AF mined from D alone and the AF mined from D and (NC, ?), with default argument (δC, δo), are regular when the following requirements are satisfied:
1. the irrelevance relation ≁ is defined as: x1 ≁ x2 iff x1 ⋡ x2, and
2. δC is the least element of X.⁴

This restriction connects the treatment of a characterisation αC as a new case and as a past case. We will see below that these conditions are necessary in order to satisfy desirable properties, such as Theorem 7.

In the remainder, we will restrict attention to regular mined AFs. We will refer to the (regular) AF mined from D and (NC, ?), with default argument (δC, δo), as AF(D, NC), and to the (regular) AF mined from D alone as AF(D). Also, for short, given AF(D, NC), with default argument (δC, δo), we will refer to the outcome for NC

² We are mostly following the treatment of Makinson (1994).
³ In (Cocarascu et al. 2020), ⪯ was directly given over X × Y. Note that, in X × Y, anti-symmetry may fail for two cases with different outcomes but the same characterisation, if D is not coherent, and thus ⪯ is merely a preorder on X × Y. When we are restricted to a coherent D, we can guarantee it is a partial order.
⁴ Indeed this is not a strong condition, since it can be proved that if αC ⋡ δC then all cases (αC, αo) in the casebase could be removed, as they would never change an outcome. On the other hand, assuming also the first condition in Definition 5, if (αC, ?) is the new case and αC ⋡ δC, then the outcome is δ̄o necessarily.
as AA-CBR⪰(D, NC).⁵ In the remainder of the paper we assume as given arbitrary X, Y, D, (NC, ?), (δC, δo) (satisfying the previously defined constraints), unless otherwise stated.

In the remainder of this section we will identify some properties of AA-CBR⪰, concerning its behaviour as a form of CBR.

Agreement with nearest cases. Our first property regards the predictions of AA-CBR⪰ in relation to the "most similar" (or nearest) cases to the new case, when these nearest cases all agree on an outcome. This property generalises (Čyras, Satoh, and Toni 2016a, Proposition 2) in two ways: by considering the entire set of nearest cases, instead of requiring a unique nearest case, and for AA-CBR⪰, instead of its instance AA-CBR⊇. As in (Čyras, Satoh, and Toni 2016a), we prove this property for coherent casebases. We first define the notion of nearest case.

Definition 6. A case (αC, αo) ∈ D is nearest to NC iff αC ⪯ NC and it is maximally so, that is, there is no (βC, βo) ∈ D such that αC ≺ βC ⪯ NC.

Theorem 7. If D is coherent and every nearest case to NC is of the form (αC, o) for some outcome o ∈ Y (that is, all nearest cases to the new case agree on the same outcome), then AA-CBR⪰(D, NC) = o (that is, the outcome for NC is o).

Proof. Let G be the grounded extension of AF(D, NC). An outline of the proof is as follows:
1. We will first prove that each argument in G is either (NC, ?) or of the form (βC, o) (that is, agreeing in outcome with all nearest cases).
2. Then we will prove that if o = δ̄o (that is, o is the non-default outcome), then (δC, δo) ∉ G (and thus AA-CBR⪰(D, NC) = δ̄o, as envisaged by the theorem).
3. Finally, by using the fact that AF(D, NC) is well-founded (given that D is coherent), and thus G is also stable, we will prove that if o = δo (that is, o is the default outcome), then (δC, δo) ∈ G (and thus AA-CBR⪰(D, NC) = δo, as envisaged by the theorem).

We will now prove 1-3.
1. By definition G = ⋃_{i≥0} Gi. We prove by induction that, for every i, each argument in Gi is either (NC, ?) or of the form (βC, o). Then, given that each element of G belongs to some Gi, the property holds for G.
(a) For the base case, consider G0. (NC, ?) and all nearest cases are unattacked, and thus in G0 (notice how this requires the AF to be regular, otherwise nearest cases could be irrelevant). G0 may however contain further unattacked cases. Let β = (βC, βo) be such a case. If NC ⋡ βC, then (NC, ?) ≁ β and thus (NC, ?) attacks β, contradicting that β is unattacked. So βC ⪯ NC. As β is not a nearest case, there is a nearest case α = (αC, αo) such that βC ≺ αC. By contradiction, assume βo ≠ o. Let Γ = {γ ∈ Args | γ = (γC, γo), βC ≺ γC ⪯ αC and γo = o}. Notice that Γ is non-empty, as α ∈ Γ. Γ is the set of "potential attackers" of β, but only ⪯-minimal arguments in Γ do actually attack β. Let η be such a ⪯-minimal element of Γ.⁶ By construction, η attacks β. Thus β is attacked and not in G0, a contradiction. Hence, βo = o, as required.
(b) For the inductive step, let us assume that the property holds for a generic Gi, and let us prove it for Gi+1. Let β = (βC, βo) ∈ Gi+1 \ Gi (if β ∈ Gi, the property holds by the induction hypothesis). (NC, ?) does not attack β, as otherwise β would not be defended by Gi, as Gi is conflict-free. Thus, once again, as β is not a nearest case, there is a nearest case α = (αC, αo) such that βC ≺ αC. Again, assume that βo ≠ o. Then let Γ = {γ ∈ Args | γ = (γC, γo), βC ≺ γC ⪯ αC and γo = o}, with η a ⪯-minimal element of Γ. Then η attacks β. However, as Gi defends β, there is then θ ∈ Gi such that θ attacks η. By inductive hypothesis, θ is either (NC, ?) or θ = (θC, o). The first option is not possible, as η ∈ Γ, and thus ηC ⪯ αC, and of course αC ⪯ NC. Thus, ηC ⪯ NC and η is thus not attacked by (NC, ?). This means that (θC, o) attacks η = (ηC, ηo). But this is absurd as well, as η ∈ Γ and thus ηo = o = θo. Therefore, our assumption that βo ≠ o was false, that is, βo = o, as required.
2. If o = δ̄o, the default argument (δC, δo) is not in G, since we have just proven that all arguments in G other than (NC, ?) have outcome o.
3. If o = δo, then let β be an attacker of (δC, δo), and thus of the form β = (βC, δ̄o) (again see how regularity is necessary, since otherwise (NC, ?) could be the attacker). β is not in G and, since G is also a stable extension, some argument in G attacks β. This is true for any attacker β of the default argument, and thus the default argument is defended by G. As G contains every argument it defends, the default argument is in the grounded extension, confirming that the outcome for NC is δo.

Addition of new cases. The next result characterises the set of past cases/arguments attacked when the dataset is extended with a new labelled case/argument. In particular, this result compares the effect of predicting the outcome of some N2 from D alone and from D extended with (N1, o1), when there is no case in D with characterisation N1 already and moreover D is coherent.
This result will be used later in the paper and is interesting in its own right, as it shows that any argument attacked by the "newly added" case (N1, o1) is easily identified in the sets G0 and G1 in the grounded extension G, it being sufficient to check those rather than the entire casebase D.

Lemma 8. Let D be coherent, N1, N2 ∈ X, o1 ∈ Y, and suppose that there is no case in D with characterisation N1. Consider AF1 = AF(D, N1) and AF2 = AF(D ∪ {(N1, o1)}, N2). Finally, let G(AF1) and G(AF2) be the respective grounded extensions. Let β ∈ D be such that (N1, o1) ⇝ β in AF2. Then,
1. for every γ that attacks β in AF1, N1 ≁ γ (that is, γ is irrelevant to N1 and, by regularity, N1 ⋡ γ);
2. in AF1, (N1, ?) defends β;
3. β ∈ G(AF1) and, for G(AF1) = ⋃_{i≥0} Gi, β is either in G0 (that is, it is unattacked) or in G1;
4. for every θ = (θC, θo) ∈ D such that (N1, ?) defends θ in AF1, if θo ≠ o1, then, in AF2, (N1, o1) ⇝ θ.

Proof. 1. Let β = (βC, βo). From the definition of attack: (i) N1 ≻ βC, (ii) o1 ≠ βo, and (iii) there is no (αC, xo) such that xo = o1 and N1 ≻ αC ≻ βC. Consider η = (ηC, ηo) such that η attacks β in AF1 (if there is no such η then the result trivially holds).
Assume by contradiction that η is relevant to N1. Then by regularity N1 ⪰ ηC. But since D is coherent and (N1, o1) ∉ D, η and N1 are distinct, and thus N1 ≻ ηC. As η attacks β, ηo ≠ βo, but this in turn implies that ηo = o1, since (N1, o1) also attacks β, in AF2. But then N1 ≻ ηC ≻ βC, with ηo = o1. This contradicts requirement 3 in the second bullet of Definition 3 of the attack between (N1, o1) and β. Therefore, η is not relevant to N1, as we wanted to prove.
2. Trivially true, by 1 (as, if η is an attacker of β, then N1 ≁ η; but then (N1, ?) ⇝ η).
3. Trivially true, by 2.
4. Since (N1, ?) defends θ in AF1, any attacker η of θ is irrelevant to N1, and by regularity, N1 ⋡ η. Thus requirement 3 in the second bullet of Definition 3 is satisfied. Requirement 1 is the hypothesis, and requirement 2 is satisfied since (N1, ?) defends θ in AF1.

Coinciding predictions. The last result (also used later in the paper) identifies a "core" in the casebase for the purposes of outcome prediction: this amounts to all past cases that are less (or equally) specific than the new case for which the prediction is sought. In other words, irrelevant cases in the casebase do not affect the prediction in regular AFs.

Lemma 9. Let D1 and D2 be two datasets. Let NC ∈ X be a characterisation, and Di^NC = {α ∈ Di | α ⪯ NC} for i = 1, 2. If D1^NC = D2^NC, then AA-CBR⪰(D1, NC) = AA-CBR⪰(D2, NC) (that is, AA-CBR⪰ predicts the same outcome for NC given the two datasets).

Proof. For i = 1, 2, let AFi = AF(Di, NC) and the grounded extensions be Gi = ⋃_{j≥0} G^i_j. We will prove that ∀j: G^1_j ⊆ G^2_{j+1} and G^2_j ⊆ G^1_{j+1}, and this allows us to prove that G1 = G2, which in turn implies the outcomes are the same. Here we consider only G^1_j ⊆ G^2_{j+1}, as the other case is entirely symmetric. By induction on j:
• For the base case j = 0: If G^1_0 ⊆ G^2_0, we are done, since we always have that G^i_j ⊆ G^i_{j+1}. If not, there is an α ∈ G^1_0 \ G^2_0. Since α ∈ G^1_0, it is relevant to NC, and thus α ⪯ NC, which in turn implies that α ∈ D2, since D1^NC = D2^NC.
On the other hand, as α ∉ G^2_0, there is a case β ∈ AF2 such that β ⇝ α. However, β ∉ AF1, otherwise α would be attacked in AF1 and thus not in G^1_0. But then, since D1^NC = D2^NC, this means that β ⋠ NC. Finally, this means that (NC, ?) ⇝ β, and thus G^2_0 defends α. Therefore, α ∈ G^2_1, which is what we wanted to prove.
• For the induction step, from j to j + 1: Again, if G^1_{j+1} ⊆ G^2_{j+1}, we are done. If not, there is an α ∈ G^1_{j+1} \ G^2_{j+1}. Again we can check that this implies that α ∈ D2. Now, since α ∈ G^1_{j+1}, G^1_j defends it. But, by the inductive hypothesis, G^1_j ⊆ G^2_{j+1}. Therefore, G^2_{j+1} also defends α, which implies that α ∈ G^2_{j+2}, as we wanted.⁷ This concludes the induction.

To conclude, we can now see that G1 = G2, since, once more without loss of generality, if we consider α ∈ G1, by definition of G1 there is a j such that α ∈ G^1_j. But since G^1_j ⊆ G^2_{j+1}, α ∈ G2. This proves that G1 ⊆ G2. The converse can be proven analogously.

⁵ Note that we omit to indicate in the notations the default argument (δC, δo), and leave it implicit instead for readability.
⁶ Note that η is guaranteed to exist, as Γ is non-empty; otherwise we would be able to build an arbitrarily long chain of (distinct) arguments, decreasing w.r.t. ≺. However, this would allow a chain with more elements than the cardinality of Γ, which is absurd.
⁷ In abstract argumentation it can be verified that, if E ⊆ Args defends an argument γ, and E ⊆ E′, then E′ also defends γ.
⁸ Notice that this understanding relies upon the assumption that classifiers are deterministic. Of course this is not the case for many machine learning models, e.g. artificial neural networks trained using stochastic gradient descent and randomised hyperparameter search. This understanding is however in line with recent work using decision functions as approximations of classifiers whose output needs explaining (e.g. see (Shih, Choi, and Darwiche 2019)). Moreover, it works well when analysing AA-CBR⪰.

4 Non-monotonicity analysis of classifiers

In this section we provide a generic analysis of the non-monotonicity properties of data-driven classifiers, using D, X and Y to denote generic inputs and outputs of classifiers, admitting our casebases, characterisations and outcomes as special instances. Later in the paper, we will apply this analysis to AA-CBR⪰ and our modification thereof. Typically, a classifier can be understood as a function from an input set X to an output set Y. In machine learning, classifiers are obtained by training with an initial, finite D ⊆ (X × Y), called the training set. In (any form of) AA-CBR, D can also be seen as a training set of sorts. Thus, we will characterise a classifier as a two-argument function C that maps a dataset D ⊆ (X × Y) and a new input x ∈ X to a prediction y ∈ Y.⁸ Notice that this function is total, in line with the common assumption that classifiers generalise beyond their training dataset.

Let us model directly the relationship between the dataset D and the predictions it makes via the classifier as an inference system in the following way:

Definition 10. Given a classifier C: 2^(X×Y) × X → Y, let L = L⁺ ∪ L⁻ be a language consisting of atoms L⁺ = X × Y and negative sentences L⁻ = {¬(x, y) | (x, y) ∈ X × Y}. Then, ⊢C is an inference relation from 2^(L⁺) to L such that
• D ⊢C (x, y), iff C(D, x) = y;
• D ⊢C ¬(x, y), iff there is a y′ such that C(D, x) = y′ and y′ ≠ y.⁹

Intuitively, C defines a simple language L consisting of atoms (representing labelled examples) and their negations, and ⊢C applies a sort of closed world assumption around C. Then, we can study the non-monotonicity properties from Section 2.2 of ⊢C.

Theorem 11. 1. ⊢C is complete, i.e. for every (x, y) ∈ (X × Y), either D ⊢C (x, y) or D ⊢C ¬(x, y).
2. ⊢C is consistent, i.e. for every (x, y) ∈ (X × Y), it does not hold that both D ⊢C (x, y) and D ⊢C ¬(x, y).
3. ⊢C is cautiously monotonic iff it satisfies cut.
4. ⊢C is cautiously monotonic iff it is cumulative.
5. ⊢C is cautiously monotonic iff it satisfies rational monotonicity.

Proof. 1. By definition of ⊢C, directly from the totality of C.
2. By definition of ⊢C, since C is a function.
3. Let ⊢C be cautiously monotonic, D ⊢C p and D ∪ {p} ⊢C q, for p, q ∈ L. By completeness, either D ⊢C q or D ⊢C ¬q (here ¬q = r if q = ¬r, and ¬r if q = r). In the first case we are done. Suppose the second case holds. Since D ⊢C p, by cautious monotonicity D ∪ {p} ⊢C ¬q. But then D ∪ {p} ⊢C q and D ∪ {p} ⊢C ¬q, which is absurd since ⊢C is consistent. Therefore D ⊬C ¬q, and then D ⊢C q. The converse can be proven analogously.
4. Trivial from 3.
5. Since ⊢C is complete, D ⊬C ¬p implies D ⊢C p, and thus rational monotonicity reduces to cautious monotonicity.

5 Cautious monotonicity in AA-CBR⪰

Our first main result is about (lack of) cautious monotonicity of the inference relation drawn from the classifier AA-CBR⪰(D, NC).

Theorem 12. ⊢AA-CBR⪰ is not cautiously monotonic.

Proof. We will show a counterexample, instantiating in the following way: X = 2^{a,b,c,z} (the powerset of {a, b, c, z}), Y = {−, +}, and ⪰ = ⊇. Define D = {({a}, +), ({c}, +), ({a, b}, −), ({c, z}, −)} and (δC, δo) = (∅, −), from which AF(D) in Figure 3 is obtained, and two new cases: N1 = {a, b, c} and N2 = {a, b, c, z}.
Let us now consider AA-CBR⪰(D, N1) and AA-CBR⪰(D, N2). We can see in Figure 4 that D ⊢AA-CBR⪰ (N1, +) and in Figure 5 that D ⊢AA-CBR⪰ (N2, −).
Now, finally, let us consider AF(D ∪ {(N1, +)}, N2) in Figure 6. We can then conclude that D ∪ {(N1, +)} ⊢AA-CBR⪰ (N2, +) even though D ⊢AA-CBR⪰ (N1, +) and D ⊢AA-CBR⪰ (N2, −), as required.

Figure 3: AF(D), with arguments (∅, −), ({a}, +), ({c}, +), ({a, b}, −), ({c, z}, −), given (δC, δo) = (∅, −), for the proof of Theorem 12.

Figure 4: AF(D, N1), with arguments (∅, −), ({a}, +), ({c}, +), ({a, b}, −), ({c, z}, −) and new case ({a, b, c}, ?), for the proof of Theorem 12, with the grounded extension coloured.

Figure 5: AF(D, N2), with arguments (∅, −), ({a}, +), ({c}, +), ({a, b}, −), ({c, z}, −) and new case ({a, b, c, z}, ?), for the proof of Theorem 12, with the grounded extension coloured.

⁹ We could equivalently have defined D ⊢C ¬(x, y) iff C(D, x) ≠ y. We have not done so as the used definition can be generalised for a scenario in which C is not necessarily a total function. This scenario is left for future work.
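To make the counterexample concrete, the following sketch (ours, not from the paper) implements the AA-CBR⊇ instance of Definitions 3 and 4 for set-valued characterisations, with ⪰ taken as ⊇, the default argument (∅, '-'), and ASCII '+'/'-' standing for the outcomes; it then replays the dataset and new cases from the proof of Theorem 12:

```python
def aacbr_predict(D, nc):
    """Sketch of AA-CBR with set characterisations ordered by superset."""
    default = (frozenset(), '-')
    past = list(D) + [default]

    def attacks(a, b):
        # (aC, ao) attacks (bC, bo): differing outcomes, aC ⊇ bC, and no
        # same-outcome case strictly in between (Definition 3)
        (aC, ao), (bC, bo) = a, b
        return (ao != bo and aC >= bC and
                not any(go == ao and aC > gC > bC for (gC, go) in past))

    atts = {(a, b) for a in past for b in past if attacks(a, b)}
    # the new case attacks every irrelevant past case (regularity)
    atts |= {('NC', b) for b in past if not nc >= b[0]}

    args = past + ['NC']
    G = set()
    while True:  # grounded extension: iterate the defence operator
        nxt = {a for a in args
               if all(any((c, b) in atts for c in G)
                      for (b, t) in atts if t == a)}
        if nxt == G:
            break
        G = nxt
    return '-' if default in G else '+'

# Dataset and new cases from the proof of Theorem 12
D = {(frozenset('a'), '+'), (frozenset('c'), '+'),
     (frozenset('ab'), '-'), (frozenset('cz'), '-')}
n1, n2 = frozenset('abc'), frozenset('abcz')
print(aacbr_predict(D, n1))                # +
print(aacbr_predict(D, n2))                # -
print(aacbr_predict(D | {(n1, '+')}, n2))  # +
```

The last two calls exhibit the failure of cautious monotonicity: adding the inferred case (N1, +) flips the prediction for N2 from − to +.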
Note that the proof of Theorem 12 shows that the inference relation drawn from the original form of AA-CBR (that is, AA-CBR⊇) is also not cautiously monotonic, given that the counterexample in the proof is also obtained by using AA-CBR⊇. This counterexample amounts to an expansion of Example 1, as follows.

Example 13 (Example 1 continued). Consider now that a different type of crime happened: publicly offending someone's honour, which we will call defamation (df). In one case, it was established that the defendant did publicly damage someone's honour, and was considered guilty ({df}, +). In a subsequent case, even if proven that the defendant did hurt someone's honour, it was established that this was done by a true allegation (the truth defence), and thus the case was dismissed, represented as ({df, td}, −).
What happens, then, if a same defendant is:
1. simultaneously proven guilty of homicide and of defamation, but shown to have committed the homicide in self-defence (({hm, df, sd}, ?))?
2. simultaneously proven guilty of homicide and of defamation, shown to have committed the homicide in self-defence, and also shown to have committed defamation by a true allegation (({hm, df, sd, td}, ?))?

We can map this to our counterexample in Theorem 12 by setting a = hm, b = sd, c = df, and z = td. The first question is answered by the AF represented in Figure 4, with outcome +, that is, the defendant is considered guilty.

What we show in the proof of Theorem 12, given this interpretation of the counterexample, is that the answer to the second question in AA-CBR would depend on whether the case in the first question was already judged or not. If not, then the cases ({hm, sd}, −) and ({df, td}, −) would be the nearest cases, and the outcome would be −, that is, not guilty. However, if the case in the first question was already judged and incorporated into the case law, it would serve as a counterargument for ({hm, sd}, −), and guarantee that the outcome is +, that is, guilty. Intuitively this seems strange, and we focus on one reason for that: the case in the first question was judged as expected by the case law, and it may seem strange that the order in which it happens may affect the case in the second question.

The example above aims only to illustrate an interpretation in which the way AA-CBR operates does not seem appropriate. Whether this behaviour of AA-CBR⊇ in particular is desirable or not depends on other elements, such as the interrelation between features (in general, for AA-CBR⪰, between the characterisations and the partial order).

Figure 6: AF(D ∪ {(N1, +)}, N2), with arguments (∅, −), ({a}, +), ({c}, +), ({a, b}, −), ({c, z}, −), ({a, b, c}, +) and new case ({a, b, c, z}, ?), for the proof of Theorem 12, with the grounded extension coloured.

6 A cumulative AA-CBR⪰

We will now present cAA-CBR⪰, a novel, cumulative incarnation of AA-CBR which satisfies cautious monotonicity.

Preliminaries. Firstly, let us present some general notions, defined in terms of the ⊢C inference relation from an arbitrary classifier C.

Intuitively, we are after a relation ⊢′C such that if D ⊢C c and D ⊢C d, then D ∪ {c} ⊢′C d (in our concrete setting, ⊢C = ⊢AA-CBR⪰ and ⊢′C = ⊢cAA-CBR⪰). We also want the property that, whenever D is "well-behaved" (in a sense to be made precise later), D ⊢C s iff D ⊢′C s. In this way, given that D ⊢′C c and D ⊢′C d, we would conclude D ∪ {c} ⊢′C d, making ⊢′C a cautiously monotonic relation.

We will define ⊢′C by building a subset of the original dataset in such a way that cautious monotonicity is preserved. We start with the following notion of (un)surprising examples:

Definition 14. An example (x, y) ∈ X × Y is unsurprising (or not surprising) w.r.t. D iff D \ {(x, y)} ⊢C (x, y). Otherwise, (x, y) is called surprising.

We then define the notion of a concise (subset of the) dataset, amounting to surprising cases only w.r.t. the dataset:

Definition 15. Let S ⊆ X × Y be a dataset, S′ ⊆ S, and let ϕ(S′) = {(x, y) ∈ S | (x, y) is surprising w.r.t. S′}. Then S′ is concise w.r.t. S whenever it is a fixed point of ϕ, that is, ϕ(S′) = S′.

To illustrate this notion in the context of AA-CBR, consider the dataset S from which the AF in Figure 6 is drawn. S is not concise w.r.t. itself, since ({a, b, c}, +) is unsurprising w.r.t. S (indeed, S \ {({a, b, c}, +)} ⊢AA-CBR⪰ ({a, b, c}, +), see Figure 4). Also, S′ = S \ {({a, b}, −), ({a, b, c}, +)} is not concise either (w.r.t. S), as ({a, b}, −) is surprising w.r.t. S′ (the predicted outcome being +), but not an element of S′. The only concise subset of S in this example is thus S′′ = S \ {({a, b, c}, +)}.

Let us now consider D′ ⊆ D, for D the dataset underpinning our ⊢C. If D′ is concise w.r.t. D, (x, y) ∈ (X × Y) \ D is an example not in D already and D′ ⊢C (x, y), then (x, y) is unsurprising w.r.t. D′, and thus D′ is still concise w.r.t. D ∪ {(x, y)}. Now, suppose that there is exactly one such concise D′ ⊆ D w.r.t. D (let us refer to this subset simply as concise(D)). Then, it seems attractive to define ⊢′C as: D ⊢′C (x, y) iff concise(D) ⊢C (x, y). Such ⊢′C
inference relation would then be cautiously monotonic if concise(D) = concise(D ∪ {(x, y)}). This identity is indeed guaranteed, given that a concise subset of D is still a concise subset of D ∪ {(x, y)}, and given our assumption that there is a unique concise subset of D. In the remainder of this section we will prove uniqueness and (constructively) existence of concise(D) in the case of AA-CBR⪰.

Uniqueness of concise subsets in AA-CBR⪰.

Theorem 16. Given a coherent dataset D, if there exists a concise D′ ⊆ D w.r.t. D, then D′ is unique.

Proof. By contradiction, let D′′ be a concise subset of D distinct from D′. Let then (x, y) ∈ (D′ \ D′′) ∪ (D′′ \ D′) be such that (x, y) is ⪯-minimal in this set. Then the sets {(x′, y′) ∈ D′ | (x′, y′) ≺ (x, y)} and {(x′, y′) ∈ D′′ | (x′, y′) ≺ (x, y)} are equal, otherwise (x, y) would not be minimal. But then, since D is coherent, by Lemma 9 we can conclude that D′ \ {(x, y)} ⊢AA-CBR⪰ (x, y) iff D′′ \ {(x, y)} ⊢AA-CBR⪰ (x, y). Thus, (x, y) is surprising w.r.t. both D′ and D′′, or w.r.t. neither. But since it is an element of one but not the other, one of them is either missing a surprising element or contains a non-surprising element. Such a set is not concise, contradicting our initial assumption.

Existence of concise subsets in AA-CBR⪰. We have proven that concise(D) is unique, if it exists. Here we prove that existence is guaranteed too. We do so constructively, and by doing so we also prove that our approach is practical, giving as we do a (reasonable) algorithm that finds the concise subset of D.

The main idea behind the algorithm is simple: we start with the default argument, and progressively build the argumentation framework by adding cases from D following the partial order ⪯. Before adding a past case, we test whether it is surprising or not w.r.t. the dataset underpinning the current AF: if it is, then it is added; otherwise, it is not. More specifically, the algorithm works with strata over D, along ⪯. In the simplest setting where each stratum is a singleton, the algorithm works as follows: starting with D0 = {(δC, δo)} and the entire dataset D = {di}i∈{1,...,|D|} unprocessed, at each step i + 1, we obtain either Di+1 = Di ∪ {di+1}, if di+1 is surprising w.r.t. Di, or Di+1 = Di, otherwise. Then D̂ = D|D| ⊆ D is the result of the algorithm. In the general case, each example of the current stratum is tested for "surprise", and only the surprising examples are added to Di. The procedure is formally stated in Algorithm 2, using in turn Algorithm 1. We illustrate the application of the algorithms next.

Example 17. Once more consider the dataset D = {({a}, +), ({c}, +), ({a, b}, −), ({c, z}, −), ({a, b, c}, +)} in Figure 6, as well as the definitions used in that example for X, Y, (δC, δo) and ⪯. Let us examine the application of Algorithm 2 to it. We start with an AF consisting only of (δC, δo), that is, D0 = ∅, AF0 = AF(D0) = AF(∅) = ({(∅, −)}, ∅). The first stratum is stratum1 = {({a}, +), ({c}, +)}. Of course, then, we have AA-CBR⪰({(∅, −)}, ({a}, ?)) = −, and similarly for ({c}, ?). Thus, every argument in stratum1 is surprising, and all are thus included in the next AF, resulting in D1 = {({a}, +), ({c}, +)} and AF1 = AF(D1).
Now, the second stratum is stratum2 = {({a, b}, −), ({c, z}, −)}. We can verify that AA-CBR⪰(D1, ({a, b}, ?)) = + and AA-CBR⪰(D1, ({c, z}, ?)) = +. Thus ({a, b}, −) and ({c, z}, −) are both surprising, and are then included in the next step, that is, D2 = D1 ∪ {({a, b}, −), ({c, z}, −)}, and AF2 = AF(D2).
Finally, stratum3 = {({a, b, c}, +)}. Now we verify that AA-CBR⪰(D2, ({a, b, c}, ?)) = +, which means that ({a, b, c}, +) is unsurprising. Therefore it is not added to the argumentation framework, that is, D3 = D2 and thus AF3 = AF(D3) = AF(D2) = AF2. Now unprocessed = ∅, the selected subset is D3, with corresponding AF(D3) = AF3, and we are done. We can check that, using cAA-CBR⪰, the counterexample in the proof of Theorem 12 would fail, since ({a, b, c}, +) would not have been added to the AF.

Notice that we could have defined the algorithm equivalently by looking at cases one by one rather than grouping them in strata. However, using strata has the advantage of allowing for parallel testing of new cases.

Theorem 18 (Convergence). Algorithm 2 converges.

Proof. Obvious: at each iteration of the while loop, the variable stratum is assigned a non-empty set, due to the fact that unprocessed is always a finite set, and thus there is always at least one minimal element. Thus, the cardinality of unprocessed is reduced by at least 1 at each loop iteration, which guarantees that it will eventually become empty.

Theorem 19 (Correctness of Algorithm 1). Every execution of simple add((Args, ⇝), next case) (Algorithm 1) in Algorithm 2 correctly returns AF(Args ∪ {next case}).

Proof (sketch). This is essentially a consequence of Lemma 8. We know that there will never be an argument in Args with the same characterisation as next case, since they would occur in the same stratum, thus the lemma applies. The lemma guarantees that Algorithm 1 adds all attacks that need to be added and only those. Finally, we need to check that it will never be necessary to remove an attack. This is true due to requirement 3 in the second bullet of Definition 3, and since arguments are added following the partial order. Therefore the only modifications on the set of attacks are the ones in simple add.
Theorem 20 (Correctness of Algorithm 2). If the input dataset is coherent, then the dataset underpinning the AF resulting from Algorithm 2 is concise.

Proof (sketch). In order to prove that, for the returned Argscurrent, Argscurrent \ {(δC, δo)} is concise, we just need to prove that, at the end of each loop iteration, Argscurrent \ {(δC, δo)} is concise w.r.t. the set of all seen examples.
Algorithm 1: simple_add algorithm for AA-CBR.
Input: An AA-CBR framework (Args, ⇝) and a case n = (nc, no)
Output: A new AA-CBR framework (Args′, ⇝′)
DEF ← {(x, y) ∈ ⇝AF(Args, nc) | (x, y) ≠ (nc, ?) and (nc, ?) defends (x, y) in AF(Args, nc)};
Args′ ← Args ∪ {n};
⇝′ ← ⇝ ∪ {(n, a) | a = (ac, ao), a ∈ DEF, and ao ≠ no};
return (Args′, ⇝′)
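A schematic rendering of Algorithm 1 in Python may help fix ideas. This is not the authors' implementation: the argument/attack representations and the helper `defended_by_query`, which stands in for the computation of the set DEF, are illustrative assumptions.

```python
# Illustrative sketch of Algorithm 1 (simple_add), not the authors' code.
# Arguments are (characterisation, outcome) pairs; attacks are pairs of
# arguments. `defended_by_query` is a hypothetical helper returning the
# arguments playing the role of the set DEF, i.e. those that the query
# argument (nc, ?) defends in AF(Args, nc).

def simple_add(args, attacks, case, defended_by_query):
    nc, no = case
    defended = defended_by_query(args, attacks, nc)  # stands in for DEF
    # the new case attacks exactly the defended arguments with a
    # different outcome; nothing is ever removed
    new_attacks = {(case, a) for a in defended if a[1] != no}
    return args | {case}, attacks | new_attacks
```

Note that, in line with Theorem 19, the sketch only ever adds attacks; the existing relation is never modified.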
Algorithm 2: Setup/learning algorithm for cAA-CBR.
Input: A dataset D
Output: An AF cAA-CBR(D)
unprocessed ← D;
Argscurrent ← {(δC, δo)};
⇝current ← ∅;
while unprocessed ≠ ∅ do
  stratum ← {(x, y) ∈ unprocessed | (x, y) is ≤-minimal in unprocessed};
  unprocessed ← unprocessed \ stratum;
  to_add ← ∅;
  for next_case ∈ stratum do
    (case_characterisation, case_outcome) ← next_case;
    if the outcome for case_characterisation w.r.t. (Argscurrent, ⇝current) is not case_outcome then
      to_add ← to_add ∪ {next_case};
    end
  end
  for next_case ∈ to_add do
    (Argscurrent, ⇝current) ← simple_add((Argscurrent, ⇝current), next_case);
  end
end
return (Argscurrent, ⇝current)
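As a concrete illustration of the learning loop, here is a minimal Python sketch, not the authors' implementation: `predict` and `simple_add` are hypothetical stand-ins for the outcome computation and for Algorithm 1, and `leq` stands in for the partial order ≤.

```python
# Illustrative sketch of Algorithm 2: cases are processed stratum by
# stratum, and a case is added only if it is "surprising", i.e. the
# current AF predicts the wrong outcome for it.

def learn(dataset, leq, predict, simple_add, default):
    """Build an AF from `dataset` (a set of (characterisation, outcome)
    pairs), given a partial order `leq` on characterisations."""
    unprocessed = set(dataset)
    args = {default}          # arguments of the AF, starting from default
    attacks = set()           # attack relation
    while unprocessed:
        # minimal unprocessed cases: nothing strictly below them remains
        stratum = {c for c in unprocessed
                   if not any(leq(d[0], c[0]) and not leq(c[0], d[0])
                              for d in unprocessed)}
        unprocessed -= stratum
        # test every case of the stratum against the *current* AF first...
        to_add = {c for c in stratum
                  if predict(args, attacks, c[0]) != c[1]}
        # ...then add the surprising ones (order within a stratum is irrelevant)
        for case in to_add:
            args, attacks = simple_add(args, attacks, case)
    return args, attacks
```

The two-phase handling of each stratum (test all, then add all) mirrors the two inner loops of the pseudocode and is what makes the order within a stratum irrelevant.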
checking whether the next case is surprising or not, thus we
could optimise its implementation with the use of caching.
Besides, the subset of minimal cases (that is, the stratum)
can be extracted efficiently by representing the partial order
as a directed acyclic graph and traversing this graph. Finally,
as mentioned before, the order in which the cases in the same
stratum are added does not affect the outcome. Thus, each
case in the same stratum can be safely tested for surprise in
parallel.
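The paragraph above mentions representing the partial order as a directed acyclic graph and traversing it to extract each stratum. A Kahn-style sketch of that idea, with an illustrative edge-set representation:

```python
# Sketch of the optimisation mentioned above: store the partial order as
# a DAG (an edge u -> v meaning u < v) and read off each stratum as the
# nodes with no remaining predecessor, in the style of Kahn's topological
# sort. The representation is an illustrative assumption.

from collections import defaultdict

def strata(nodes, edges):
    """Yield successive sets of minimal nodes of the DAG (nodes, edges)."""
    preds = defaultdict(set)
    for u, v in edges:
        preds[v].add(u)
    remaining = set(nodes)
    while remaining:
        stratum = {n for n in remaining if not (preds[n] & remaining)}
        yield stratum
        remaining -= stratum
```

Each stratum is independent of the others, so its members can then be tested for surprise in parallel, as noted above.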
As the base case, before the loop is entered, this is clearly
the case, as the only seen argument is the default.
As the induction step, we know that every case previously
added is still surprising, since the new cases added are not
smaller than them according to the partial order, and thus by
Lemma 9 their prediction is not changed, that is, they keep
being surprising. The same is true for every case previously
not added: adding more cases afterwards does not change
their prediction. For the cases added at this new iteration, by
definition the surprising ones are added and the unsurprising
ones are not. Regarding the order in which cases of the same
stratum are added: each surprising case is included and each unsurprising one is not, and the order is irrelevant since, the cases being all ≤-minimal and the dataset coherent, they are pairwise incomparable, so adding one does not affect another. Thus,
for every case seen until this point, it is in the AF iff it is
surprising. As this is true for every iteration, it is true for the
final, returned AF.
cAA-CBR . All theorems in this section so far lead to
the following corollary:
Corollary 21. Given a coherent dataset D, the dataset underpinning the AF resulting from Algorithm 2 is the unique
concise D′ ⊆ D, w.r.t. D.
To conclude, we can then define inference in
cAA-CBR , the classifier yielded by the strategy described until now:
Definition 22. Let D be a coherent dataset and let
concise(D) be the unique concise subset of D, w.r.t. D.
Let cAF (D, NC ) be the AF mined from concise(D)
and (NC , ?), with default argument (δC , δo ). Then,
cAA-CBR (D, NC ) stands for the outcome for NC , given
cAF (D, NC ).
A full complexity analysis of the algorithm is outside the
scope of this paper. However, notice here that the algorithm
refrains from building the AF from scratch each time a new
case is considered, as seen in Theorem 19. Still regarding Algorithm 1, notice that it is easy to compute the set DEF while
Thus, we directly obtain the inference relation
⊢AA-CBR .
Then, cAA-CBR amounts to the form of AA-CBR using this inference relation. It is easy to see, in line with the
discussion before Theorem 16, and using the results in Section 11, that cAA-CBR satisfies several non-monotonicity
properties, as follows:
Theorem 23. ⊢cAA-CBR is cautiously monotonic and also
satisfies cut, cumulativity, and rational monotonicity.
European Conference on Artificial Intelligence, 18-22 August 2014, Prague, Czech Republic - Including Prestigious
Applications of Intelligent Systems (PAIS 2014), volume 263
of Frontiers in Artificial Intelligence and Applications, 267–
272. IOS Press.
Dung, P. M. 2016. An axiomatic analysis of structured argumentation with priorities. Artificial Intelligence
231:107–150.
Hunter, A. 2010. Base logics in argumentation. In Baroni, P.;
Cerutti, F.; Giacomin, M.; and Simari, G. R., eds., Computational Models of Argument: Proceedings of COMMA 2010,
Desenzano del Garda, Italy, September 8-10, 2010, volume
216 of Frontiers in Artificial Intelligence and Applications,
275–286. IOS Press.
Kenny, E. M., and Keane, M. T. 2019. Twin-systems
to explain artificial neural networks using case-based reasoning: Comparative tests of feature-weighting methods in
ANN-CBR twins for XAI. In Kraus, S., ed., Proceedings of
the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16,
2019, 2708–2715. ijcai.org.
Makinson, D. 1994. General patterns in nonmonotonic reasoning. In Handbook of Logic in Artificial Intelligence and Logic Programming, volume 3, 35–110. Oxford University Press.
Nugent, C., and Cunningham, P. 2005. A case-based explanation system for black-box systems. Artif. Intell. Rev.
24(2):163–178.
Prakken, H.; Wyner, A. Z.; Bench-Capon, T. J. M.; and
Atkinson, K. 2015. A formalization of argumentation
schemes for legal case-based reasoning in ASPIC+. J. Log.
Comput. 25(5):1141–1166.
Shih, A.; Choi, A.; and Darwiche, A. 2019. Compiling Bayesian network classifiers into decision graphs. In
The Thirty-Third AAAI Conference on Artificial Intelligence,
AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI
Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 – February 1, 2019, 7966–7974.
Čyras, K., and Toni, F. 2015. Non-monotonic inference
properties for assumption-based argumentation. In Black,
E.; Modgil, S.; and Oren, N., eds., Theory and Applications
of Formal Argumentation - Third International Workshop,
TAFA 2015, Buenos Aires, Argentina, July 25-26, 2015, Revised Selected Papers, volume 9524 of Lecture Notes in
Computer Science, 92–111. Springer.
Čyras, K., and Toni, F. 2016. Properties of ABA+ for nonmonotonic reasoning. CoRR abs/1603.08714.
Čyras, K.; Birch, D.; Guo, Y.; Toni, F.; Dulay, R.; Turvey,
S.; Greenberg, D.; and Hapuarachchi, T. 2019. Explanations by arbitrated argumentative dispute. Expert Syst. Appl.
127:141–156.
Čyras, K.; Satoh, K.; and Toni, F. 2016a. Abstract argumentation for case-based reasoning. In KR 2016, 549–552.
Čyras, K.; Satoh, K.; and Toni, F. 2016b. Explanation for
case-based reasoning via abstract argumentation. In Proceedings of COMMA 2016, 243–254.
Conclusion
In this paper we study regular AA-CBR frameworks, and
propose a new form of AA-CBR, denoted cAA-CBR ,
which is cautiously monotonic, as well as, as a byproduct, cumulative and rationally monotonic. Given that
AA-CBR admits the original AA-CBR⊇ (Čyras, Satoh,
and Toni 2016a) as an instance, we have (implicitly) also
defined a cautiously monotonic version thereof.
(Some incarnations of) AA-CBR have been shown empirically successful in a number of settings (see (Cocarascu
et al. 2020)). The formal properties we have considered in
this paper do not necessarily imply better empirical results at the tasks in which AA-CBR has been applied. We
thus leave for future work an empirical comparison between
AA-CBR and cAA-CBR . Other issues open for future
work are comparisons w.r.t. learnability (such as model performance in the presence of noise), as well as a full complexity analysis of the new model. Also, we conjecture that the
reduced size of the AF our method generates could possibly
have advantages in terms of time and space complexity: we
leave investigation of this issue to future work.
8 Acknowledgements
We are very grateful to Kristijonas Čyras for very valuable discussions, as well as to Alexandre Augusto Abreu
Almeida, Victor Luis Barroso Nascimento and Matheus de
Elias Muller for reviewing initial drafts of this paper. The
first author was supported by Capes (Brazil, Ph.D. Scholarship 88881.174481/2018-01).
References
Cocarascu, O.; Stylianou, A.; Čyras, K.; and Toni, F. 2020.
Data-empowered argumentation for dialectically explainable predictions. In ECAI 2020 - 24th European Conference
on Artificial Intelligence, Santiago de Compostela, Spain,
10-12 June 2020.
Cocarascu, O.; Čyras, K.; and Toni, F. 2018. Explanatory
predictions with artificial neural networks and argumentation. In 2nd Workshop on XAI at the 27th IJCAI and the
23rd ECAI.
Dung, P. M. 1995. On the acceptability of arguments
and its fundamental role in nonmonotonic reasoning, logic
programming and n-person games. Artificial Intelligence
77(2):321 – 357.
Dung, P. M. 2014. An axiomatic analysis of structured argumentation for prioritized default reasoning. In Schaub,
T.; Friedrich, G.; and O’Sullivan, B., eds., ECAI 2014 - 21st
A Preference-Based Approach for Representing Defaults in First-Order Logic
James Delgrande, Christos Rantsoudis
Simon Fraser University, Canada
first last@sfu.ca
Abstract
over possible worlds. While properties such as specificity
followed directly from the semantics, other properties, such
as handling irrelevant properties, were not obtained. Arguably, at present there is no generally-accepted approach
that adequately handles inference of default properties, reasoning in the presence of irrelevant information, and reasoning about default properties of an individual known to be
exceptional with respect to another default property.
In this paper, we present a new account of defaults.
Consider the default assertions “birds fly” and “birds build
nests”. The usual interpretation is that a normal bird flies
and it builds nests. Our interpretation is that, with regards to
flying, a normal bird flies, and with regards to nest building,
a normal bird builds nests. That is, normality is given with
respect to some property. Consequently, “birds fly” would
be interpreted as saying that, with respect to the property of
flight, an individual that is a bird in fact flies. Similarly, a
penguin, as concerns flight, does not fly.
Semantically, for each n-ary relation in the domain,
we associate a total preorder over n-tuples of individuals,
where the preorder gives the relative normality of a tuple
with respect to that relation. Syntactically, we introduce a
“predicate-forming construct” into the language of FOL that
lets us identify those individuals that satisfy a certain condition (like Bird) and that are minimal in a given ordering
(like that corresponding to Fly); one can then state assertions regarding such (minimal-in-the-ordering) individuals,
for example that they indeed satisfy Fly. Notably, an individual abnormal in one respect (like flight) may be normal in another respect (like nest building). These orderings
allow us to naturally specify a wide class of default assertions, including on predicates of arity > 1. Default inference,
in which an individual is concluded to have a given property “by default”, is specified via a preference ordering over
models. Then inferences that follow by default are just those
that obtain in the minimal models. In the approach we avoid
a modal semantics on the one hand and fixed-point constructions on the other. We also show how a “predicate-forming
construct” can be translated into a standard first-order theory
and argue that the approach presents various advantages: it
satisfies a set of broadly-desirable properties; it is perspicuous, and presents a more nuanced and expressive account of
defaults than previous approaches; and it is couched within
classical FOL.
A major area of knowledge representation concerns representing and reasoning with defaults such as “birds fly”. In
this paper we introduce a new, preference-based approach for
representing defaults in first-order logic (FOL). Our central
intuition is that an individual (or tuple of individuals) is not
simply normal or not, but rather is normal with respect to a
particular predicate. Thus an individual that satisfies Bird
may be normal with respect to Fly but not BuildNest. Semantically we associate a total preorder over n-tuples with
each n-ary relation in the domain. Syntactically, a predicate-forming construct is introduced into FOL that lets us assert
properties of minimal elements in an ordering that satisfy
a given condition. Default inference is obtained by (informally) asserting that a tuple in an ordering is ranked as “low”
as consistently possible. The approach has appealing properties: specificity of defaults is obtained; irrelevant properties
are handled appropriately; and one can reason about defaults
within FOL. We also suggest that the approach is more expressive than extant approaches and present some preliminary
ideas for its use in Description Logics.
1 Introduction
One of the major challenges of Artificial Intelligence (AI)
has been in representing and reasoning with defaults such
as “birds fly”. Since the early days of AI, researchers in
the field have recognized the importance of intelligent systems being able to draw default assertions, where one would
conclude by default that a bird flies, while allowing for exceptional conditions and non-flying birds. Of the early approaches to nonmonotonic reasoning, default logic (Reiter
1980) and autoepistemic logic (Moore 1985) were based
on the notion of a fixed-point construction in order to expand the set of obtained consequences, while circumscription (McCarthy 1980) was based on the idea of minimizing the extension of a predicate. In these approaches, desirable properties (such as specificity) had to be hand-coded
in a theory (Reiter and Criscuolo 1981; McCarthy 1986).
About a decade later, approaches based on conditional logics (Delgrande 1987; Lamarre 1991; Boutilier 1994; Fariñas
del Cerro, Herzig, and Lang 1994) and nonmonotonic consequence relations (Kraus, Lehmann, and Magidor 1990;
Lehmann and Magidor 1992) represented defaults as objects
(binary modal operators in conditional logics) in a theory.
In such approaches, the semantics was based on an ordering
Syntactically, we introduce a new construct into the language of FOL that, for an ordering associated with a relation, enables us to specify minimal domain elements in
the ordering that satisfy a given condition. This construct
has two parts, a predicate P and a formula φ; it is written
{P (~y ), φ(~y )}. The construct stands for a (new) predicate
that denotes a domain relation which holds for just those
(tuples of) individuals that satisfy φ and that are minimal in
the ordering corresponding to P (i.e., the ordering associated with the denotation of P ).
Given this construct, one can make assertions regarding
individuals that satisfy this relation. For example, to express
“birds normally fly” we use:
∀x {Fly(y), Bird(y)}(x) → Fly(x)    (1)
In the next section we informally describe our framework.
In Section 3 we present the syntax and semantics of our
logic. After presenting some examples in Section 4, we look
at various properties of our logic as well as provide a characterization result in Section 5. We briefly present our treatment of nonmonotonic inferences in Section 6. In Section 7
we compare our approach with related work as well as discuss about future directions. Section 8 concludes.
2 The Approach: Intuitions
A common means for specifying the meaning of a default
is via a preference order over models or possible worlds in
which worlds or models are ordered with respect to their
normality. Then, something holds normally (typically, defeasibly, etc.) just when it holds in the most preferred models or possible worlds. For example, in a conditional logic,
“birds fly” can be represented propositionally as Bird ⇒
Fly. This assertion is true just when, in the minimal Bird-worlds, Fly is also true. In circumscription, “birds fly” can
be represented as ∀x.Bird(x) ∧ ¬Abf(x) → Fly(x), so a
bird that is not abnormal with respect to flight flies. Then
models are ordered based on the extensions of the Ab predicates (with smaller extensions preferred), and a bird a flies
by default just if Fly(a) is satisfied in the minimal models
that satisfy Bird(a).
Our approach belongs to the preference-based paradigm,
but with significant differences from earlier work. Our preferences are expressed within FOL models, and not between
models (or possible worlds, as in a modal framework). Preferences are given by a total preorder over n-tuples of individuals for each n-ary relation in the domain; these orderings give the relative normality of a tuple with respect to the
underlying relation. Defaults are then expressed by making
assertions concerning sets of minimal (tuples of) individuals
in an ordering.
Consider again the assertion that birds normally fly. We
interpret this as: for a bird that is normal with respect to
the unary relation fly,1 that bird flies. In a model, the relative normality of individuals with respect to flight is given
by a total preorder associated with the relation fly. Then
we can say that “birds fly” is true in a model just when, in
the order associated with fly, the minimal bird individuals satisfy fly. Similarly, “penguins do not fly” is true in a
model just when, in the order associated with fly, the minimal penguin individuals do not satisfy fly. The ranking
of an individual with respect to one relation (like fly) is not
related to the ranking associated with another relation (like
build_nest).
These considerations extend to relations of arity > 1.
Consider “elephants normally like their keepers”. Semantically we would express this by having, in the total preorder
associated with the relation likes, that the most normal pairs
of individuals (d1 , d2 ), in which d1 is an elephant and d2 is
a keeper, satisfy likes. Analogously we could go on and
express that “elephants normally do not like (keeper) Fred”.
whereas for “penguins normally do not fly” we use:
∀x {Fly(y), Penguin(y)}(x) → ¬Fly(x)    (2)
So those individuals that satisfy bird and that are minimal in
the ordering associated with fly also satisfy fly, whereas the
minimal elements in the fly ordering that satisfy penguin
do not satisfy fly. That is, “birds fly” and “penguins do not
fly” both concern the property of flight and so are with respect to
the same (fly) ordering.2
The fact that we deal with orderings over individuals
means that our approach is irreducibly first-order. This is
in contrast to most work in default logic, in which default
theories are very often expressed in propositional terms, and
where a rule with variables is treated as the set of corresponding grounded instances. It is also in contrast to work
in conditional logics and nonmonotonic inference relations,
which are nearly always expressed in propositional terms.
As we suggest later, for many domains it may well be that a
first-order framework is essential for an adequate expression
of default assertions.
For our earlier example “elephants (E) normally like (L)
their keepers (K)”, we have the following:
∀x1, x2 {L(y1, y2), E(y1) ∧ K(y2)}(x1, x2) → L(x1, x2)    (3)
whereas for “elephants normally do not like (keeper) Fred”:
∀x1, x2 {L(y1, y2), E(y1) ∧ K(y2) ∧ y2 = Fred}(x1, x2) → ¬L(x1, x2)    (4)
In addition, we suggest that our approach leads to a reconsideration of how some defaults are best expressed. Consider the assertion “adults are normally employed at a company”. In a conditional approach, one might express this as:
Adult(x) ⇒x ∃y EmployedAt(x, y) ∧ Company(y)
where, without worrying about details, ⇒x is a variable-binding connective (Delgrande 1998). But the interpretation
2
One way of viewing this is that a relation such as fly gives a
partition of a domain, into those elements that belong to the relation and those that do not. We also note that the interpretation of
“birds fly” as a default conditional (like Bird ⇒ Fly) is somewhat superficial. A more nuanced approach would assert that for
birds, flight is a default means of locomotion, perhaps along with
others such as bipedal walking. We return to this point later.
1
We use the notation that a lower case string like fly is used
for a relation in a model whereas upper case, like Fly, is used for
a predicate symbol in the language (in this case denoting fly).
N} and a set of variables V = {x, y, z . . . }.3 Predicate symbols and variables may be subscripted, as may other entities
in the language. The constants and variables make up the set
of terms, which are denoted by ti , i ∈ N. A tuple of variables x1 , . . . , xn is denoted by ~x, and similarly for terms.
For any formula φ, the expression φ(~x) indicates that the
free variables of φ are among those in ~x. Our language LN
is given in the following definition, with L given in Items
1-3.
Definition 1. The well-formed formulas (wffs) of LN are
defined inductively as follows:
1. If P is an n-ary predicate symbol and t1 , . . . , tn are terms
then P (t1 , . . . , tn ) is a wff.
2. If t1 and t2 are terms then t1 = t2 is a wff.
3. If φ and ψ are wffs and x is a variable then (¬φ), (φ →
ψ) and (∀x φ) are wffs.
4. If P is an n-ary predicate symbol, ~y is a tuple of n variables, φ(~y ) is a wff and ~t is a tuple of n terms then
{P (~y ), φ(~y )}(~t) is a wff.
Parentheses may be omitted if no confusion results. The
connectives ∧, ∨, ≡ and ∃ are introduced in the usual way.
For a wff of the form {P (~y ), φ(~y )}(~t), the part {P (~y ), φ(~y )}
can be thought of as a self-contained predicate-forming construct (pfc). The first part, P (~y ), specifies that the ordering
is with respect to predicate P ; it also provides names for
the n variables of P , in ~y . The second part φ(~y ) will in
general be true of some substitutions for ~y and false for others. The denotation of {P (~y ), φ(~y )} is just those n-tuples
of domain elements that satisfy φ and are minimal in the ordering corresponding to P . So {P (~y ), φ(~y )} behaves just
like any predicate symbol. Thus {Fly(y), Bird(y)}(x) can
be thought of as analogous to an atomic formula, which in
a model will be true of some individuals (viz. those that belong to bird and that are minimal in the fly ordering) and
false of others. Similarly, {Fly(y), Bird(y)}(Tweety) will
assert that Tweety is a minimal bird element in the fly ordering.
In the wff {P (~y ), φ(~y )}(~t), there is a one-to-one correspondence between the terms in ~t and the variables ~y inside
the pfc; but otherwise they are unrelated. Hence for the expression {Fly(x), Bird(x)}(x) the occurrences of variable
x within {. . . } are distinct from the third occurrence. For
{P (~y ), φ(~y )}, the variables in ~y are local to {. . . }, and can
be thought of as effectively bound within the expression. In
the following, we use the term predicate expression to refer
to both predicate symbols and pfcs.
We next remind the reader of some terminology regarding
orderings. A total preorder ⪯ on a set is a transitive and connected relation on the elements of the set. A well-founded
order is one that has no infinitely-descending chains of elements. Formally, for a set S and a total preorder ⪯ on S,
⪯ is well-founded iff:
(∀T ⊆ S) T ≠ ∅ → (∃x ∈ T)(∀y ∈ T) x ⪯ y
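Since any finite set is trivially well-founded, the minimal elements of a total preorder on it can be computed directly from this definition. A small Python sketch, where the callable `leq` is an illustrative stand-in for ⪯:

```python
# Minimal elements of a total preorder on a finite set, read off directly
# from min(⪯, S) = {x ∈ S | ∀y ∈ S : x ⪯ y}. `leq(x, y)` is an
# illustrative stand-in for x ⪯ y.

def minimal(leq, s):
    """Elements of `s` that are below every element of `s` under `leq`."""
    return {x for x in s if all(leq(x, y) for y in s)}
```

For example, with `leq = lambda x, y: x % 3 <= y % 3` (a total preorder comparing residues mod 3), the minimal elements of {3, 4, 5, 6} are {3, 6}.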
that “the most normal adults are employed at a company” is
unsuitable, since an abnormal adult here would be abnormal
with respect to other normality assertions regarding adults.
As well, it does not seem to make much sense to say that normality now refers to the full consequent. Instead, it seems
that the best way of interpreting this assertion is that we have
a normality ordering associated with employed at, giving
the relative normality of pairs of domain elements with respect to this relation. Then for the most normal pairs (d1 , d2 )
where d1 is an adult, there is some pair (d1 , d3 ) among them
for which d1 is employed at d3 and d3 is a company. Consequently, this suggests that simple conditionals, at least in
a first-order framework, may not be adequate to represent
general default information.
The preceding sketches our intuitions regarding how we
intend to represent and interpret default information. With
regards to inferring default information in a knowledge base
(KB), we define preferences between models in a similar manner to those of other preferential logics (McCarthy
1980; Shoham 1987; Kraus, Lehmann, and Magidor 1990).
Again, what is new in our approach is that we have multiple orderings inside our models and so we can define more
nuanced preferences between models. As we will see, although we only briefly treat nonmonotonic inferring of assertions, our ordering between the models will result in desirable properties with respect to defeasibility. Specifically,
we satisfy the following principles:
1. Specificity: Properties are ascribed on the basis of most
specific applicable information. Hence a penguin will not
fly by default whereas a bird will.
2. Inheritance: Individuals will inherit all typical properties
of the classes to which they belong, except for those we
know are exceptional. Hence, by default, a penguin may
be concluded to not fly, but will be concluded to have
feathers, etc.
3. Irrelevance: Default inference is not affected by irrelevant information. Hence, by default, a yellow bird will be
concluded to fly.
As we have noted, there is no generally-accepted approach
that fully captures these properties. Default logic, autoepistemic logic and circumscription do not satisfy specificity,
while the rational closure mechanism of the KLM framework does not satisfy inheritance.
3 Language and Semantics
As discussed, a first-order setting is required for our investigation. Thus, the language we employ is based on standard
FOL enhanced with the aforementioned minimality operators. We start with some formal preliminaries, including the
syntax of our new logic, and finish the section by presenting
the semantics.
3.1 Formal Preliminaries
We assume that the reader has some familiarity with standard FOL (Enderton 1972; Mendelson 2015). Let L be a
first-order language containing a set of predicate symbols
P = {P, Q, . . . }, a set of constant symbols C = {ci | i ∈
3
For simplicity, except for constants, we exclude function symbols. Note that this does not affect expressiveness, since any n-ary
function can be encoded by an (n + 1)-place predicate.
122
We will work only with well-founded total preorders. Given
well-foundedness, for a total preorder ⪯ we can define the
minimal S-elements of ⪯, as follows:
min(⪯, S) = {x ∈ S | ∀y ∈ S : x ⪯ y}
The clauses of the satisfaction relation |= (stated formally in Definition 5 below) are:
1. M, v |= P(t1, . . . , tn) iff (t1^{I,v}, . . . , tn^{I,v}) ∈ P^{M,v}
2. M, v |= t1 = t2 iff t1^{I,v} = t2^{I,v}
3. M, v |= ¬φ iff M, v ⊭ φ
4. M, v |= φ → ψ iff M, v ⊭ φ or M, v |= ψ
5. M, v |= ∀x φ iff M, v |= φ(x/d) for all d ∈ D
As usual, if M, v |= φ for all M and v, then φ is valid
in LN. If φ is a sentence (i.e., without free variables) then
M, v |= φ iff M, v′ |= φ for all variable maps v, v′; thus we
just write M |= φ. If Φ is a set of sentences then M |= Φ iff
M |= φ for all φ ∈ Φ, and we say that M is a model of Φ.
Finally, we write Φ |= φ when all models of Φ are models
of φ, and we say that Φ logically entails φ.
3.2 Semantics
We next present the formal semantics, which will interpret
the terms and formulas in LN with respect to a model.
Definition 2. A model is a triple M = ⟨D, I, O⟩ where
D ≠ ∅ is the domain, I is the interpretation function, and
O is a set containing, for each n-ary relation r in D, a well-founded total preorder ⪯r on D^n. Specifically:
1. I interprets the predicate and constant symbols into D as
follows:
• P^I ⊆ D^n, for each n-ary predicate symbol P ∈ P
• c^I ∈ D, for each constant symbol c ∈ C
2. O = {⪯r ⊆ D^n × D^n | r ⊆ D^n and ⪯r is a well-founded total preorder on D^n}
We have already seen some wffs in Equations 1–4 of Section 2. We now give some more examples that illustrate
the range and application of our approach. As we have described, the first part of a pfc denotes an ordering associated with a given predicate. The second part is a formula
that specifies minimal (tuples of) individuals in the ordering. The order of the variables in the two parts is important,
as the next two equations illustrate:
∀x1, x2 {L(y1, y2), P(y1, y2)}(x1, x2) → L(x1, x2)    (6)
∀x1, x2 {L(y1, y2), P(y2, y1)}(x1, x2) → L(x1, x2)    (7)
A variable map v : V 7→ D assigns each variable x ∈ V
an element of the domain v(x) ∈ D.
Definition 3. Let M = ⟨D, I, O⟩ be a model and v a variable map. The denotation of a term t, written as t^{I,v}, is
defined as follows:
1. tI,v = tI , if t is a constant
2. tI,v = v(t), if t is a variable
When L abbreviates Likes and P abbreviates ParentOf, it
is easy to see that (6) states that “parents normally like their
children” while (7) states that “children normally like their
parents”. Recall also that the tuples of domain elements belonging to the denotation of a pfc {P(~y), φ(~y)} do not necessarily have to satisfy the predicate P. See, for instance,
Equations 2 and 4.
On another note, predicates in φ may have a higher arity than P in a pfc {P (~y ), φ(~y )}. For instance, consider
the statement “people that trust (T ) themselves are normally
daring (D)”. We would express that using the following wff:
∀x {D(y), T (y, y)}(x) → D(x)
The satisfaction relation |= is defined below. We first give
some preliminary terminology and notation. Assume we
have a model M, a variable map v and a wff φ. When M
satisfies φ under v we write M, v |= φ. When M satisfies
φ under v where the free variable x of φ is assigned to d we
write M, v |= φ(x/d). For ~x a tuple of variables x1, . . . , xn
and ~d a tuple of domain elements d1, . . . , dn, we denote by
~x/~d the one-to-one assignment x1/d1, . . . , xn/dn. Similarly for ~x/~y and ~x/~t. Last, given a tuple of n variables ~y
and a formula φ with free variables among ~y , the values of
~y for which φ can be satisfied are given by the set:
φ(~y)^{M,v} = {~d ∈ D^n | M, v |= φ(~y/~d)}    (5)
4 Examples
Furthermore, we can express statements about specific individuals by directly replacing variables with constants. For
example, for constant John, we can express that “John’s
pets are normally happy (H)” by:
∀x {H(y), HasP et(John, y)}(x) → H(x)
We can now define the denotation of a pfc {P (~y ), φ(~y )},
written {P (~y ), φ(~y )}M,v , as the set of domain tuples that:
1. belong to the denotation of φ, as given in Equation 5 and
The reading of our new wffs can sometimes be a bit cumbersome. Consider the earlier example that “adults (A) are
normally employed (Em) at a company (C)”. According to
the discussion in Section 2, this statement can be expressed
by the following wff:
2. are the minimal such tuples in the ordering associated
with P^I, viz. ⪯P^I.
Definition 4. Let M = ⟨D, I, O⟩ be a model and v a variable map. The denotation of {P(~y), φ(~y)} is defined as the
set:
{P(~y), φ(~y)}^{M,v} = min(⪯P^I, φ(~y)^{M,v})
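For a finite domain, Definition 4 has a direct computational reading. In the sketch below, `phi` and `leq_p` are hypothetical stand-ins for the satisfaction test of φ and the preorder ⪯ attached to P's denotation; the function itself is illustrative, not part of the paper.

```python
# A computational reading of Definition 4 over a finite domain: the
# denotation of {P(~y), φ(~y)} collects the n-tuples that satisfy φ and
# are minimal in the preorder attached to P's denotation.

from itertools import product

def pfc_denotation(domain, n, phi, leq_p):
    """Tuples over `domain` satisfying `phi` that are `leq_p`-minimal
    among all such tuples."""
    candidates = [t for t in product(domain, repeat=n) if phi(t)]
    # minimal candidates under the total preorder leq_p
    return {t for t in candidates if all(leq_p(t, u) for u in candidates)}
```

For instance, over domain {1, 2, 3} with φ holding of elements ≥ 2 and the numeric order as preorder, the denotation is {(2,)}: the minimal φ-tuple.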
∀x1 , x2 {Em(y1 , y2 ), A(y1 )}(x1 , x2 ) → ∃x3
{Em(y1 , y2 ), A(y1 )}(x1 , x3 ) ∧ Em(x1 , x3 ) ∧ C(x3 )
This contains two instances of the pfc {Em(y1 , y2 ), A(y1 )}.
As a possible solution, if we introduce the predicate NAE
(for Normal Adults wrt the Em ordering) we can rewrite the
previous into the more compact and perspicuous formula:
∀x, y NAE(x, y) → ∃z NAE(x, z) ∧ Em(x, z) ∧ C(z)
Finally, for each predicate symbol P ∈ P we define P^{M,v} =
P^I. The satisfaction relation is given as follows (recall that
a predicate expression is either a predicate symbol or a pfc).
Definition 5. Let M = ⟨D, I, O⟩ be a model, v a variable
map and P a predicate expression.
This method of abbreviating pfcs via smaller “predicate names” could be used at the outset in order to make KBs more readable.
A key point is that our approach is highly versatile, and can express nuances that (arguably) other approaches cannot. Consider for example the ambiguous statement “undergraduate students attend undergraduate courses”.[4] Let UGS stand for “undergrad student” and UGC for “undergrad course”. Among other possibilities, we have the following interpretations:

1. “Normally, the things UGSs attend are UGCs”
That is, for the most normal pairs (d1, d2) according to the attend relation, such that d1 is a UGS that attends d2, d2 is a UGC. In LN:
∀x1, x2 {Attend(y1, y2), UGS(y1) ∧ Attend(y1, y2)}(x1, x2) → UGC(x2)

2. “Normal UGSs attend only UGCs”
That is, for the most normal pairs (d1, d2) according to the attend relation, such that d1 is a UGS, everything d1 attends is a UGC. In LN:
∀x1, x2 {Attend(y1, y2), UGS(y1)}(x1, x2) → ∀x3 Attend(x1, x3) → UGC(x3)

3. “Normal UGSs attend some UGC”
That is, for the most normal pairs (d1, d2) according to the attend relation, such that d1 is a UGS, there exists a UGC that d1 attends. In LN:
∀x1, x2 {Attend(y1, y2), UGS(y1)}(x1, x2) → ∃x3 Attend(x1, x3) ∧ UGC(x3)

4. “Normally UGSs attend UGCs”
Analogous to Equation 3, we have:
∀x1, x2 {Attend(y1, y2), UGS(y1) ∧ UGC(y2)}(x1, x2) → Attend(x1, x2)

These examples illustrate the wealth of expressiveness in our logic and present a contrast to the more limited expressiveness of current approaches in the literature.

5 Characterization and Properties

In this section we provide a characterization of our new logic through a translation into standard FOL. As well, we present some notable properties and briefly compare our approach to other well-known systems from the literature.
First, we show how to encode our approach in standard FOL, via the introduction of a new set of predicate symbols representing the preference orderings. Then, we express the pfcs inside the language using these new predicate symbols. This translation then serves as a syntactic counterpart to the more semantic approach of Section 3; and our equivalence result (Theorem 1) provides a counterpart to a standard soundness and completeness result.[5] More precisely:

1. For each n-ary predicate symbol P we introduce a new predicate symbol P_≼ of arity 2n. We use these new predicate symbols to express the preference orderings instead of embedding them directly into the models. That is, each P_≼ will be used in the place of ≼_{P^I}.
2. After having interpreted the new predicate symbols P_≼ in the aforementioned way, we translate each wff {P(~y), φ(~y)}(~t) to the first-order formula:
φ(~y/~t) ∧ ∀~z φ(~y/~z) → P_≼(~t, ~z)

So the variables from ~y that appear free in φ are assigned to the respective terms and variables from ~t and ~z, with the latter being employed in order to ensure the minimality of the former through the new predicate symbols P_≼. The following list shows some of the examples of Sections 2 and 4 expressed in FOL using this translation:

1. Birds (B) normally fly (F)
∀x B(x) ∧ ∀z B(z) → F_≼(x, z) → F(x)
2. Penguins (P) normally do not fly
∀x P(x) ∧ ∀z P(z) → F_≼(x, z) → ¬F(x)
3. Elephants normally like their keepers
∀x1, x2 E(x1) ∧ K(x2) ∧ ∀z1, z2 E(z1) ∧ K(z2) → L_≼(x1, x2, z1, z2) → L(x1, x2)
4. Children normally like their parents
∀x1, x2 P(x2, x1) ∧ ∀z1, z2 P(z2, z1) → L_≼(x1, x2, z1, z2) → L(x1, x2)
5. People that trust themselves are normally daring
∀x T(x, x) ∧ ∀z T(z, z) → D_≼(x, z) → D(x)
6. John’s pets are normally happy
∀x HasPet(John, x) ∧ ∀z HasPet(John, z) → H_≼(x, z) → H(x)

As we see, everything presented so far can be expressed using standard FOL without the need to enhance the models with preference orderings or the syntax with pfcs. Instead, a new set of predicate symbols together with a translation of pfcs suffice. Next, we present the formal translation as well as a characterization theorem.

[4] This example is a type of assertion that might occur unconditionally in a description logic TBox. The fact that there are (at least) four corresponding nonmonotonic (normality) assertions indicates that a fully general approach to defeasibility in description logics may require substantial expressive power. See Section 7 for further discussion.
[5] Alternatively, we could have provided an axiomatisation of our new construct and directly proven a soundness and completeness result. This is done in (Brafman 1997), where a conditional logic is developed based on orderings over individuals, but for each n ∈ N there is a single ordering on n-tuples; see Section 7. We feel that the given translation is at least as informative as an axiomatisation, while being more straightforward to obtain.
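The FOL translations above can be spot-checked on small finite models. Below is an illustrative sketch for translation 1 (“birds normally fly”); the model data (domain, relations, the most-normal bird) are assumptions made up for the example:

```python
# Evaluates  ∀x (B(x) ∧ ∀z (B(z) → F≼(x, z))) → F(x)  on a finite model.
# All model data below are toy assumptions.

def birds_normally_fly(domain, B, F, F_leq):
    def normal_bird(x):  # B(x) holds and x is F≼-minimal among the Bs
        return x in B and all(z not in B or (x, z) in F_leq
                              for z in domain)
    # material implication: ¬normal_bird(x) ∨ F(x), for every x
    return all(not normal_bird(x) or x in F for x in domain)

domain = {"tweety", "opus"}
B = {"tweety", "opus"}          # both are birds
F = {"tweety"}                  # only tweety flies
F_leq = {("tweety", "tweety"),  # tweety is the most normal bird
         ("tweety", "opus"),
         ("opus", "opus")}
print(birds_normally_fly(domain, B, F, F_leq))  # True
```

Opus, not being F≼-minimal, is exempt from the default, so the translated sentence holds even though opus does not fly.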
5.1 Translation into FOL

We first extend P with a new set of predicate symbols P_≼ = {P_≼ | P ∈ P}. Let P+ = P ∪ P_≼ and let L+ be the extension of L with P+.

Definition 6. Given the syntax of LN, the translation τ: LN → L+ is defined as follows:
1. (P(t1, ..., tn))^τ = P(t1, ..., tn)
2. (t1 = t2)^τ = (t1 = t2)
3. (¬φ)^τ = ¬φ^τ
4. (φ → ψ)^τ = φ^τ → ψ^τ
5. (∀xφ)^τ = ∀xφ^τ
6. ({P(~y), φ(~y)}(~t))^τ = φ^τ(~y/~t) ∧ ∀~z φ^τ(~y/~z) → P_≼(~t, ~z)

As for the semantics, the language L+ is interpreted over the usual models of FOL. Regarding the relation between the models of LN and the models of L+, we can define a similar translation τ from the former into the latter.

Definition 7. For a given model M = ⟨D, I, O⟩ of LN, the model M^τ = ⟨D^τ, I^τ⟩ of L+ is defined as follows:
1. D^τ = D
2. for every constant symbol c ∈ C: c^{I^τ} = c^I ∈ D
3. for every n-ary predicate symbol P ∈ P:
• P^{I^τ} = P^I ⊆ D^n
• (P_≼)^{I^τ} = {(~d, ~e) ∈ D^n × D^n | ~d ≼_{P^I} ~e}

It is easy to see that M^τ interprets all new predicate symbols P_≼ as (well-founded) total preorders, in the following sense.

Proposition 1. Let M^τ be a model of L+ according to Definition 7 and P_≼ ∈ P_≼. The following hold:
1. M^τ |= ∀~x P_≼(~x, ~x)
2. M^τ |= ∀~x, ~y, ~z P_≼(~x, ~y) ∧ P_≼(~y, ~z) → P_≼(~x, ~z)
3. M^τ |= ∀~x, ~y P_≼(~x, ~y) ∨ P_≼(~y, ~x)

Given this translation τ on formulas and models, we obtain the following characterization of LN through L+.

Theorem 1. Let φ and M be a wff and a model of LN, respectively, and let v be a variable map. Then:

M, v |= φ iff M^τ, v |= φ^τ

Proof. The proof follows by induction on the construction of φ. We only present the step for pfcs. Consider φ = {P(~y), ψ(~y)}(~t) and assume that the Induction Hypothesis (IH) holds for ψ. We have that M, v |= φ iff:
~t^{I,v} ∈ min(≼_{P^I}, ψ(~y)^{M,v})
Let us also assume that ~y is a tuple of n variables. By definition, then:
1. ~t^{I,v} ∈ {~d ∈ D^n | M, v |= ψ(~y/~d)}
2. ∀~e ∈ D^n, if M, v |= ψ(~y/~e) then ~t^{I,v} ≼_{P^I} ~e
By (IH) and the definition of M^τ we then also have:
3. ~t^{I^τ,v} ∈ {~d ∈ (D^τ)^n | M^τ, v |= ψ^τ(~y/~d)}
4. ∀~e ∈ (D^τ)^n, if M^τ, v |= ψ^τ(~y/~e) then (~t^{I^τ,v}, ~e) ∈ (P_≼)^{I^τ}
From 3. it immediately follows that:
5. M^τ, v |= ψ^τ(~y/~t)
Next, let ~z be an arbitrary tuple of variables and let:
6. M^τ, v |= ψ^τ(~y/~z)
Let us also assume that v(~z) = ~e ∈ (D^τ)^n. It immediately follows that:
7. M^τ, v |= ψ^τ(~y/~e)
From 4., 7. and the fact that ~e = v(~z), we get (~t^{I^τ,v}, v(~z)) ∈ (P_≼)^{I^τ}, which is equivalent to:
8. M^τ, v |= P_≼(~t, ~z)
From 6., 8. and the fact that ~z was arbitrary, we have that:
9. M^τ, v |= ∀~z ψ^τ(~y/~z) → P_≼(~t, ~z)
Finally, 5. and 9. give:
M^τ, v |= ψ^τ(~y/~t) ∧ ∀~z ψ^τ(~y/~z) → P_≼(~t, ~z)
By definition of τ we then get M^τ, v |= ({P(~y), ψ(~y)}(~t))^τ, i.e., M^τ, v |= φ^τ. The reverse procedure gives the other direction as well: if M^τ, v |= φ^τ, we end up with 3. and 4. which, by (IH) and the definition of M^τ, are equivalent to M, v |= φ.

Through this characterization result we can move from LN into L+ and use the known machinery of standard FOL when evaluating formulas in LN.

5.2 Properties

We now examine some properties of our logic, starting with the fact that we can reason about defaults directly within our framework. A representative example of this property is showcased in the next proposition.

Proposition 2. Let Φ = {φ1, φ2, φ3} be a KB where:
1. φ1 = ∀x P(x) → B(x)
“All penguins are birds”
2. φ2 = ∀x {F(y), B(y)}(x) → F(x)
“Birds normally fly”
3. φ3 = ∀x {F(y), P(y)}(x) → ¬F(x)
“Penguins normally do not fly”
Furthermore, consider the following sentence:
4. ψ = ∀x P(x) → ¬{F(y), B(y)}(x)
“Penguins are not normal birds with respect to flying”
Then Φ |= ψ is derivable in LN.
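Clause 6 of the translation in Definition 6 is purely mechanical, which a string-level sketch makes concrete. This is illustrative only: it handles a single variable, and the ASCII connectives and the `_leq` suffix standing in for the subscript ≼ are our assumptions:

```python
# Sketch of clause 6 of Definition 6 for a single variable:
# ({P(y), phi(y)}(t))^τ  =  phi[y/t] ∧ ∀z (phi[y/z] → P≼(t, z))

def translate_pfc(pred, phi, y, t, z="z"):
    """pred: predicate naming the ordering; phi: formula (string) over
    the variable y; t: term the pfc is applied to."""
    subst = lambda term: phi.replace(y, term)
    return f"{subst(t)} & forall {z} ({subst(z)} -> {pred}_leq({t}, {z}))"

# "Birds normally fly", i.e. {F(y), B(y)}(x), in ASCII syntax:
print(translate_pfc("F", "B(y)", "y", "x"))
# B(x) & forall z (B(z) -> F_leq(x, z))
```

The output matches translated example 1 of Section 5: the conjunct ensures the tuple satisfies the body, and the quantified implication enforces its minimality via the fresh ordering predicate.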
This is quite an important characteristic of LN, since reasoning about defaults within the logic is not possible with many other approaches, e.g. default logic or circumscription.
Next, we compare the logic LN to the well-known KLM systems (Kraus, Lehmann, and Magidor 1990; Lehmann and Magidor 1992). We start by noting that, like most of the approaches that employ preference orderings (either between worlds or between elements of a domain), KLM relies on a single ordering. This is in contrast to our multiple orderings and the fact that we can use multiple pfcs, each one associated with a different ordering, inside the same expression. Consider, e.g., the following instance of the KLM postulate of Right Weakening (RW): from |= Fly → Mobile and Bird |∼ Fly infer Bird |∼ Mobile. We would express this instance of RW in LN as follows:

∀x F(x) → M(x) ∧ ∀x {F(y), B(y)}(x) → F(x) → ∀x {M(y), B(y)}(x) → M(x)

where F abbreviates Fly, M abbreviates Mobile and B abbreviates Bird. This formula is not valid in our logic, since the two pfcs refer to two different orderings (corresponding to Fly and Mobile). The same holds for any other KLM postulate apart from Reflexivity. This is because we have not imposed any relationship between the different orderings or attempted to combine them in any way. We could impose, e.g., the following condition between two orderings:

whenever ∀~x (P(~x) → Q(~x)) we also have that ≼_{P^I} ⊆ ≼_{Q^I}

which would make the previous formula valid in our logic. One could propose such restrictions in our models (and more specifically in the set O), but this is not our intention here. However, if we introduce a new predicate symbol G that corresponds to a global ordering, we get the following.

Proposition 3. The KLM postulates articulated using only the (global) ordering associated with predicate G are valid in LN.

It immediately follows that our approach is at least as expressive as the KLM systems.

Corollary 1. Any proof wrt the KLM systems can be transformed into a proof in LN.

We end this section by presenting some properties of pfcs in the next proposition, with the names suggesting similar properties that have appeared in the literature.

Proposition 4. The following formulas are valid in LN:
1. REF: ∀~x {P(~y), φ(~y)}(~x) → φ(~y/~x)
2. RCE: ∀~x (φ(~x) → ψ(~x)) → ∀~x {P(~y), φ(~y)}(~x) → ψ(~y/~x)
3. LLE/RCEC: ∀~x (φ(~x) ≡ ψ(~x)) → ∀~x {P(~y), φ(~y)}(~x) ≡ {P(~y), ψ(~y)}(~x)
4. AND: ∀~x {P(~y), φ(~y)}(~x) ∧ {P(~y), ψ(~y)}(~x) → {P(~y), (φ ∧ ψ)(~y)}(~x)
5. OR: ∀~x {P(~y), (φ ∨ ψ)(~y)}(~x) → {P(~y), φ(~y)}(~x) ∨ {P(~y), ψ(~y)}(~x)

6 Default Inference in LN

To this point, in presenting LN, we have dealt with a monotonic formalism. We now examine nonmonotonic reasoning in LN and explore how default inferences can be obtained. Our investigations are still preliminary and are on the semantic level, i.e., we work with models. Nevertheless, a syntactic approach is also in the works and employs an extension of the Closed World Assumption for nonmonotonic reasoning. The goal then will be to provide a correspondence between the two approaches (syntactic and semantic). We present the latter here, which, similar to (McCarthy 1980; Shoham 1987; Kraus, Lehmann, and Magidor 1990), employs preferences between the models of LN.
We start by restricting our models a bit further so that they only contain orderings without infinitely-ascending chains of elements, i.e., our orderings are “upwards” well-founded as well. We then proceed with the following definitions.

Definition 8. Let M be a model of LN and ≼_r ∈ O. The set min_k(≼_r) is defined inductively as follows:
1. min_1(≼_r) = {~d ∈ D^n | ∀~e ∈ D^n : ~d ≼_r ~e}
2. min_{k+1}(≼_r) = {~d ∈ D^n | ∀~e ∈ D^n \ ⋃_{n=1}^{k} min_n(≼_r) : ~d ≼_r ~e}

Intuitively, the set min_k(≼_r) denotes the k-th least set of ≼_r-equivalent elements in the ordering ≼_r. Using these sets, we can define a preference between orderings on the same relation r as follows.

Definition 9. Let M = ⟨D, I, O⟩ and M′ = ⟨D, I′, O′⟩ be two models of LN with ≼_r ∈ O and ≼′_r ∈ O′. We say that ≼_r is lexicographically preferred to ≼′_r iff ∃n ∈ N such that:
1. min_k(≼′_r) = min_k(≼_r) for all k ∈ {1, ..., n − 1}
2. min_n(≼′_r) ⊂ min_n(≼_r)

Given the lexicographic preference between two orderings, we can now generalize our definition to a preference between models.

Definition 10. Let M = ⟨D, I, O⟩ and M′ = ⟨D, I′, O′⟩ be two models of LN. We say that M is preferred to M′, viz. M < M′, iff for every P ∈ P we have that ≼_{P^I} is lexicographically preferred to ≼′_{P^{I′}}.

Next, we define the minimal models of a KB, which will be our main tool for drawing default inferences.

Definition 11. Let Φ and M be a KB and a model of LN, respectively. M is a minimal model of Φ iff M is a model of Φ and there is no model M′ of Φ such that M′ < M.

Using the above, the next definition shows how to obtain default inferences in LN.
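For finite domains, Definitions 8 and 9 above admit a direct computational sketch. This is illustrative only; the layer-list representation of an ordering and all toy data are our assumptions:

```python
# Definition 8 (sketch): stratify a total preorder over a finite domain
# into its successive minimal layers min_1, min_2, ...
def strata(domain, preceq):
    layers, rest = [], set(domain)
    while rest:
        layer = {d for d in rest if all(preceq(d, e) for e in rest)}
        layers.append(layer)
        rest -= layer
    return layers

# Definition 9 (sketch): an ordering (given as its layer list) is
# lexicographically preferred to another iff their layers agree up to
# some level and its next layer strictly extends the other's.
def lex_preferred(layers_a, layers_b):
    for la, lb in zip(layers_a, layers_b):
        if la != lb:
            return lb < la  # proper-subset test on sets
    return False

rank = {"a": 0, "b": 0, "c": 1}
layers = strata(rank, lambda d, e: rank[d] <= rank[e])
assert layers == [{"a", "b"}, {"c"}]
# Keeping both a and b maximally normal beats demoting b to a later layer:
assert lex_preferred(layers, [{"a"}, {"b"}, {"c"}])
```

Well-foundedness in both directions guarantees that this stratification terminates and exhausts the domain, mirroring Proposition 5.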
Definition 12. Let Φ and φ be a KB and a sentence of LN, respectively. We say that Φ entails φ by default iff M |= φ for all minimal models M of Φ.

Before moving to a final example we note that, since the orderings ≼_r are well-founded in both directions, the following proposition holds.

Proposition 5. Let M be a model of LN and ≼_r ∈ O. Then:
1. either ∀k ∈ N, min_k(≼_r) ≠ ∅,
2. or ∃n ∈ N such that:
• ∀k ∈ {1, ..., n}, min_k(≼_r) ≠ ∅
• ∀k > n, min_k(≼_r) = ∅

This means that the sets min_k(≼_r), the preference between two orderings, and the preference between two models are all well-defined. Furthermore, a KB has a model iff it has a minimal one and, similar to monotonic inferences, inconsistent KBs entail all sentences by default. We conclude this section with a showcase of how specificity, inheritance and irrelevance, the three principles that we highlighted in Section 2, are handled in LN.

Corollary 2. Let Φ = {φi | 1 ≤ i ≤ 6} be a KB where:
1. φ1 = B(Tweety) ∧ Y(Tweety)
“Tweety is a yellow (Y) bird”
2. φ2 = P(Opus)
“Opus is a penguin”
3. φ3 = ∀x P(x) → B(x)
“All penguins are birds”
4. φ4 = ∀x {F(y), B(y)}(x) → F(x)
“Birds normally fly”
5. φ5 = ∀x {W(y), B(y)}(x) → W(x)
“Birds normally have wings (W)”
6. φ6 = ∀x {F(y), P(y)}(x) → ¬F(x)
“Penguins normally do not fly”
Then Φ entails the following sentences by default:
1. ψ1 = ¬F(Opus)
“Opus does not fly”
2. ψ2 = W(Opus)
“Opus has wings”
3. ψ3 = F(Tweety) ∧ W(Tweety)
“Tweety flies and has wings”

So Opus, being both a bird and a penguin, is concluded not to fly by ψ1 (specificity) but to have wings by ψ2 (inheritance), since he is an exceptional bird wrt flying but inherits any other typical property of birds. Then ψ3 (irrelevance) shows that Tweety, being a yellow bird, is still concluded to fly and have wings, since being yellow is irrelevant wrt those two properties.

7 Related and Future Work

7.1 Related Work

Our view that normality is relative to a property such as fly was anticipated by work in circumscription, in particular in its use of Ab predicates (McCarthy 1986). (Otherwise the approaches have little in common.)
Conditional approaches to assertions of normality are generally propositional; first-order approaches include (Delgrande 1998; Kern-Isberner and Thimm 2012). A predecessor to our work, in a full first-order setting, is Brafman’s (1997) approach to conditional statements. There, conditional statements of the form “if φ then normally ψ” are written as “φ →~x ψ”, with the intuition being that the minimal tuples of the domain that make φ true also make ψ true. There are two main differences between the “φ →~x ψ” notation and our corresponding “∀~x {P(~y), φ(~y)}(~x) → ψ” notation that make the latter more expressive.
The first difference comes from the fact that we employ multiple orderings, which gives a more nuanced approach. In (Brafman 1997) it is not possible to have an individual that is normal in some respect (say, nest building) while abnormal in another (like flying).
Secondly, our approach allows more expressive formulas, as we have seen in the sequence of examples in Section 4. As well, consider the “adults are normally employed at a company” example, which Brafman would write as:

Adult(x) →x ∃y EmployedAt(x, y) ∧ Company(y)

Such a conditional does not seem to accurately capture the meaning of the original expression, as argued in Section 2. However, we are able to capture Brafman’s approach in ours, provided the formula φ of “φ →~x ψ” has free variables only among ~x and there are no iterated occurrences of “→~x”. More precisely, we can consider the class of models in our approach in which there is a single ordering for each arity n, say U_n. Then, we can use the formula:

∀~y {U_n(~x), φ(~x)}(~y) → ψ(~x/~y)

to represent Brafman’s assertion φ(~x) →~x ψ. Furthermore, combining this translation with the method described in Section 5.1 implies that the approach of (Brafman 1997) can also be expressed in standard FOL.
More recently there has been work in Description Logic (Baader et al. 2007) that deals with the representation of, and reasoning with, defeasible assertions. The literature on so-called defeasible DLs is large and most of the established approaches to nonmonotonic reasoning (like default logic, circumscription or rational closure) have been adapted to the DL setting; see for instance (Baader and Hollunder 1992; Bonatti, Lutz, and Wolter 2009; Giordano et al. 2015). Nevertheless, there have recently been interesting new proposals that relate to our work here.
First, driven by the need to overcome problems like the inheritance of properties in the presence of exceptions, multiple orderings have also been considered in (Gliozzi 2016; Giordano and Gliozzi 2019) to account for different rankings between individuals, each corresponding to a particular aspect (like Fly or BuildNest). However, although multiple
orderings are considered in the semantics, only one “typicality” (in practice minimality) operator is employed in the syntax, and there is no corresponding syntactic construct like our pfcs. Furthermore, their multiple orderings are employed only among individuals (and not tuples) and the use of typicality operators is limited, being only allowed on the left side of a subsumption axiom. This results in an interesting, but less expressive, representation of defaults than the one developed here.
Similar to the previous approach, but closer to ours, is the work in (Gil 2014), where the author takes into account multiple typicality operators. This work, however, suffers from similar limitations regarding the scope of the orderings and the limited employment of these typicality operators. A further limitation is the lack of any association between its operators/orderings and any relations or aspects.
A third line of work, originating from an approach in (Britz and Varzinczak 2016) to define orderings not only among individuals but also among tuples, culminated in interesting recent developments regarding defeasible reasoning in DLs (Varzinczak 2018; Britz and Varzinczak 2019). An important characteristic of this work is that the orders on individuals are derived from the ones specified by the roles, i.e., they do not correspond to any concepts as in the previously mentioned (and our) work. As a result, (contextual) defeasible subsumption needs specific role names to be subscripted in order to specify the origin of the order that will be employed, something that we do within the language by means of the pfcs.

7.2 Future Work

One goal in our work is to extend the approach to a DL setting. In the following we present some preliminary ideas behind such an extension. Consider again the assertion “birds fly”, which in a (non-defeasible) DL language is expressed by the concept inclusion Bird ⊑ Fly, whereas in a defeasible DL it could be expressed as T(Bird) ⊑ Fly or Bird ⊏∼ Fly, among others. We propose to express the same concept inclusion, perhaps through some extended syntax, in such a way that its structure will invoke the use of the pfcs from our setting. In this specific assertion, e.g., the “new” DL expression would be semantically equivalent to the LN-formula:

∀x {Fly(y), Bird(y)}(x) → Fly(x)    (8)

That is, we are interested in the minimal elements that satisfy the left side of the inclusion, similar to some of the aforementioned DL approaches, while also specifying the preference ordering we want to employ. This means that the domain of any given interpretation would once again be enhanced with preference orderings, and the new syntax would somehow indicate the preference ordering that is used inside a (default) concept inclusion. In other words, whereas the inclusion Bird ⊑ Fly would be interpreted as Bird^I ⊆ Fly^I, its default version would translate into and follow the semantics of Equation 8, being instead interpreted (roughly) as min(≼_{Fly^I}, Bird^I) ⊆ Fly^I.
As for more complex and ambiguous statements, consider the “undergraduate students attend undergraduate courses” example that we saw at the end of Section 4. No approach in the literature can adequately handle the various interpretations we gave in Section 4, especially in a DL setting. Our goal for future work is then to try to express these interpretations in DL terms in a way that would semantically correspond to the intended formulas of LN. Whereas the question of how to syntactically express such assertions is certainly non-trivial, we believe that the current framework could provide a basis for (more elaborately) dealing with defeasibility in DLs. Perhaps the biggest advantage will be that the properties we presented, both the KLM postulates as well as defeasible principles like specificity, inheritance and irrelevance, will continue to hold in any DL language (consider, e.g., Corollary 2 adapted for such a DL). The combination of employing multiple orderings in the domain of any interpretation together with using LN and its pfcs to interpret the new default concept inclusions seems to overcome the difficulties of the established approaches as well as allow a more “informed” representation of defaults in any DL language.
Apart from DLs, we plan to expand on this work in a number of directions. First, a natural extension would be to allow quantifying into a pfc. This would allow an assertion such as “each elephant normally likes its keeper”, which is somewhat different from our previous example. Moreover, by allowing quantifying into a pfc, we would be able to encode nested default assertions, such as “profs that (normally) give good lectures are (normally) liked by their students”. Second, we plan to allow for complex expressions in a pfc, and so allow a predicate expression in place of P in {P(~y), φ(~y)}. Last, as we already mentioned in Section 6, a thorough treatment and examination of nonmonotonic reasoning in LN is also in the works.

8 Conclusion

We have presented a new and well-behaved approach to representing default assertions through an expressive language and novel formalism. This approach takes the position that normality is not an absolute characteristic of an individual, but instead is relative to a property (or, in general, a relation). This is achieved via an extension to the language of FOL, along with an enhancement to models in FOL; a subsequent result, however, shows that the approach may be embedded in standard FOL. The approach allows for a substantially more expressive language for representing default information than previous approaches. Moreover, we show that the approach possesses quite natural and desirable features and satisfies the standard KLM properties. With a variety of future directions and promising possible applications, like the one we briefly discussed for the DL setting, we believe the current framework presents an interesting new approach to representing and reasoning about defaults, as well as to obtaining “well-behaved” nonmonotonic reasoning in general.

Acknowledgements

We thank the reviewers for their helpful comments. Financial support was gratefully received from the Natural Sciences and Engineering Research Council of Canada.
References

Baader, F., and Hollunder, B. 1992. Embedding defaults into terminological knowledge representation formalisms. In Nebel, B.; Rich, C.; and Swartout, W., eds., Proceedings of the Third International Conference on the Principles of Knowledge Representation and Reasoning, 306–317.
Baader, F.; Calvanese, D.; McGuinness, D.; Nardi, D.; and Patel-Schneider, P., eds. 2007. The Description Logic Handbook. Cambridge: Cambridge University Press, second edition.
Bonatti, P. A.; Lutz, C.; and Wolter, F. 2009. The complexity of circumscription in description logic. Journal of Artificial Intelligence Research 35:717–773.
Boutilier, C. 1994. Conditional logics of normality: A modal approach. Artificial Intelligence 68(1):87–154.
Brafman, R. I. 1997. A first-order conditional logic with qualitative statistical semantics. Journal of Logic and Computation 7(6):777–803.
Britz, K., and Varzinczak, I. J. 2016. Introducing role defeasibility in description logics. In Logics in Artificial Intelligence – 15th European Conference, JELIA, 174–189.
Britz, K., and Varzinczak, I. 2019. Contextual rational closure for defeasible ALC. Annals of Mathematics and Artificial Intelligence 87(1-2):83–108.
Delgrande, J. 1987. A first-order conditional logic for prototypical properties. Artificial Intelligence 33(1):105–130.
Delgrande, J. 1998. On first-order conditional logics. Artificial Intelligence 105(1-2):105–137.
Enderton, H. 1972. A Mathematical Introduction to Logic. Academic Press.
Fariñas del Cerro, L.; Herzig, A.; and Lang, J. 1994. From ordering-based nonmonotonic reasoning to conditional logics. Artificial Intelligence 66(2):375–393.
Gil, O. F. 2014. On the non-monotonic description logic ALC+T_min. CoRR abs/1404.6566.
Giordano, L., and Gliozzi, V. 2019. Reasoning about exceptions in ontologies: An approximation of the multipreference semantics. In Kern-Isberner, G., and Ognjanovic, Z., eds., Symbolic and Quantitative Approaches to Reasoning with Uncertainty, ECSQARU, volume 11726, 212–225. Springer.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. L. 2015. Semantic characterization of rational closure: From propositional logic to description logics. Artificial Intelligence 226:1–33.
Gliozzi, V. 2016. Reasoning about multiple aspects in rational closure for DLs. In Adorni, G.; Cagnoni, S.; Gori, M.; and Maratea, M., eds., AI*IA 2016: XVth International Conference of the Italian Association for Artificial Intelligence, volume 10037 of Lecture Notes in Computer Science, 392–405. Genova, Italy: Springer.
Kern-Isberner, G., and Thimm, M. 2012. A ranking semantics for first-order conditionals. In Proceedings of the European Conference on Artificial Intelligence, 456–461. IOS Press.
Kraus, S.; Lehmann, D.; and Magidor, M. 1990. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence 44(1-2):167–207.
Lamarre, P. 1991. S4 as the conditional logic of nonmonotonicity. In Proceedings of the Second International Conference on the Principles of Knowledge Representation and Reasoning, 357–367.
Lehmann, D., and Magidor, M. 1992. What does a conditional knowledge base entail? Artificial Intelligence 55(1):1–60.
McCarthy, J. 1980. Circumscription – a form of nonmonotonic reasoning. Artificial Intelligence 13:27–39.
McCarthy, J. 1986. Applications of circumscription to formalizing common-sense knowledge. Artificial Intelligence 28:89–116.
Mendelson, E. 2015. Introduction to Mathematical Logic. CRC Press, 6th edition.
Moore, R. 1985. Semantical considerations on nonmonotonic logic. Artificial Intelligence 25:75–94.
Reiter, R., and Criscuolo, G. 1981. On interacting defaults. In Proceedings of the International Joint Conference on Artificial Intelligence, 270–276.
Reiter, R. 1980. A logic for default reasoning. Artificial Intelligence 13(1-2):81–132.
Shoham, Y. 1987. A semantical approach to nonmonotonic logics (extended abstract). In Symposium on Logic in Computer Science, 275–279.
Varzinczak, I. 2018. A note on a description logic of concept and role typicality for defeasible reasoning over ontologies. Logica Universalis 12(3-4):297–325.
Probabilistic Belief Fusion at Maximum Entropy by First-Order Embedding
Marco Wilhelm, Gabriele Kern-Isberner
Department of Computer Science, TU Dortmund University, Dortmund, Germany
marco.wilhelm@tu-dortmund.de, gabriele.kern-isberner@cs.tu-dortmund.de
Example 1. Consider a doctor who comes to the conclusion
that a symptom s is an indicator for a disease d with probability 0.9 which she formalizes in a probabilistic conditional
(d|s)[0.9] (“if s holds, then d holds with probability 0.9”)
while her colleague is more skeptical and assigns the probability 0.8 to the same conditional, (d|s)[0.8]. If we confide
in both doctors, we do not want to reject one’s opinion but
exploit both in order to obtain a unified view on this issue.
Obviously, both conditionals cannot be satisfied at the same
time and
Rdoc = {(d|s)[0.9], (d|s)[0.8]}
is inconsistent. Hence, there is a more sophisticated approach needed to combine both doctors’ views than purely
joining them by set union. Instead, it seems to make sense to
derive some kind of mean value of the two probabilities 0.9
and 0.8.
(b) The first-order setting allows one to ask and answer
more complex queries than the initial propositional setting.
And most importantly,
(c) the semantical freedom of first-order conditionals can
be easily used to define different belief fusion operators.
While we represent the beliefs of single reasoners by propositional conditionals (B|A)[p] with the meaning “if A holds,
then B follows with probability p,” we translate them into
first-order conditionals when merging. This enables a (fictive) decision maker to differentiate between the viewpoints
of the single reasoners. Ground instantiated first-order conditionals (B(i)|A(i))[p] express that “B follows from A
with probability p in the view of reasoner ri ,” while open
first-order conditionals (B(X)|A(X))[p] stand for “B follows
from A with probability p in the consolidated view of the
group of reasoners.” By doing so, the semantics of open
conditionals affects (or reflects; depending on your point of
view) the reasoning behavior of the decision maker.
Eventually, the application of the principle of maximum
entropy to the merged belief base completes missing probability values in order to obtain a whole belief state while
adding as less information as possible. In our opinion, this
methodology perfectly fits to the mission of the decision
maker as she should not contribute own beliefs but her task
is to process the beliefs of the reasoners as unbiasedly as
possible.
In summary, the main contribution of this paper is the presentation of a framework for generating belief fusion opera-
Abstract
Belief fusion is the task of combining beliefs of several reasoners such that the outcome reflects the consensual opinions
of the group of reasoners properly. We consider the case
in which the beliefs are formalized by probabilistic conditional statements of the form “if A holds, then B follows with
probability p” where A and B are propositions and present a
formal framework for generating belief fusion operators that
deal with such probabilistic beliefs. For this, we translate the
beliefs of the reasoners into first-order conditionals and apply
the principle of maximum entropy to the merged beliefs in
order to observe a consolidated belief state. By varying the
semantics of first-order conditionals, it is possible to generate
different belief fusion operators. We prove that well-known
belief fusion operations like linear and logarithmic pooling of
maximum entropy distributions can be reproduced with our
approach, while it can also be used to generate novel operators.
1 Introduction
Judgment aggregation (Grossi and Pigozzi 2014) is a rapidly
growing research area which addresses the problem of combining the judgments of several individuals when their common
opinion on a certain issue is in demand. It has practical applications in domains such as economics, philosophy, political science, law, and medicine. In the subfield of probabilistic aggregation, one is interested in a “probability assignment
to a given set of propositions on basis of the group members’
individual probability assignments” (List 2012), which shifts
this research topic into the field of belief fusion (Bloch et al.
2001; Dubois et al. 2016), as uncertain probabilistic beliefs
have to be combined.
In this paper, we present a novel approach for merging
probabilistic belief bases based on a first-order translation of
beliefs, and we apply the principle of maximum entropy (Paris
2006) to the merged belief base in order to infer an aggregated belief state. The first-order embedding brings
three main advantages with it:
(a) While simply merging the belief bases of several
reasoners typically causes conflicts (inconsistencies in the
merged belief base; cf. Example 1), our approach guarantees
consistency by a syntactic separation of the beliefs of different
reasoners.
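Advantage (a) can be made concrete. Example 1 is not reproduced in this excerpt, but the doctors' belief bases given in Example 3 below are Rdoc,1 = {(d|s)[0.9]} and Rdoc,2 = {(d|s)[0.8]}; their plain union has no model, since a model would have to satisfy P(ds) = 0.9 · P(s) and P(ds) = 0.8 · P(s) with P(s) > 0 at the same time. The following brute-force sketch (our own illustration, not part of the paper's formalism) confirms this over a probability grid:

```python
from itertools import product

# Naive union of the doctors' bases (cf. Example 3): {(d|s)[0.9], (d|s)[0.8]}.
# A model P over worlds {ds, d~s, ~ds, ~d~s} must satisfy P(ds) = 0.9*P(s)
# and P(ds) = 0.8*P(s) with P(s) = P(ds) + P(~ds) > 0, which is impossible.
def union_has_model(steps=100, eps=1e-9):
    grid = [i / steps for i in range(steps + 1)]
    for p_ds, p_nds in product(grid, repeat=2):
        if p_ds + p_nds > 1:          # remaining mass goes to the ~s-worlds
            continue
        p_s = p_ds + p_nds
        if (p_s > eps
                and abs(p_ds - 0.9 * p_s) < eps
                and abs(p_ds - 0.8 * p_s) < eps):
            return True
    return False

print(union_has_model())  # False: no probability distribution satisfies both
```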
The paper is organized as follows: First, we give a brief
overview of probabilistic belief fusion in general, followed
by a short discussion of pooling maximum entropy distributions in a propositional setting. Afterwards, we switch to the
first-order level, introduce different semantics for first-order
conditionals, present our approach for belief base merging
based on first-order embedding, and apply the principle
of maximum entropy. We then elaborate on different first-order
semantics and their connection to belief fusion operators,
compare our approach with related work, and conclude.
2 Probabilistic Belief Fusion

Belief fusion (Bloch et al. 2001; Dubois et al. 2016) addresses the task of aggregating the beliefs of several reasoners when their common opinion is in demand. The usual
way of aggregating beliefs in the form of probability assignments is opinion pooling (Dietrich and List 2017; Genest
and Zidek 1986). In opinion pooling one assumes that the
reasoners (say, ri for i = 1, . . . , n) contribute their whole
belief state, which is formalized by a probability distribution Pi over a set of possible worlds Ω. The probability
of a possible world ω ∈ Ω expresses the reasoner's degree
of belief in whether ω formalizes the real world accurately
(in relation to the other possible worlds). For all reasoners,
the set of possible worlds is assumed to be the same in order
to guarantee comparability of the probabilities. As there is
no obvious solution to the opinion pooling problem, i.e., finding
a mapping from the single belief states to a consensual belief
state that reflects the opinions of the group of reasoners best,
various properties have been proposed in order to determine
what makes a 'good' opinion pooling operation (see, e.g.,
(Dietrich and List 2017; Genest and Zidek 1986)). The most
prominent approaches map probabilities to some kind of mean
value: linear pooling (Stone 1961; McConway 1981) maps
probabilities to a (normalized) weighted arithmetic mean,

⊕_{i=1}^n Pi(ω) = Σ_{i=1}^n µi · Pi(ω),   ω ∈ Ω,

while logarithmic pooling (Bacharach 1972) maps probabilities to
a (normalized) weighted geometric mean,

⊕_{i=1}^n Pi(ω) = ( Π_{i=1}^n Pi(ω)^{µi} ) / ( Σ_{ω′∈Ω} Π_{i=1}^n Pi(ω′)^{µi} ),   ω ∈ Ω.

The weights µ1, . . . , µn usually satisfy

µi ≥ 0 for i = 1, . . . , n,   and   Σ_{i=1}^n µi = 1,

and regulate the impact of the single belief states on the aggregated one. They can be understood as a measure of how
much an external decision maker trusts the particular reasoners when fusing their beliefs. If nothing is known about
the reliability (or expertise) of the reasoners, the weights
should equal 1/n.

A more general setting is dealt with in social inference processes (Wilmers and Jensen 2010; Adamcik 2014;
Wilmers 2015). In social inference processes, the reasoners do not contribute their whole belief state but a set
of beliefs which is called a belief base. In general, a belief base does not determine the belief state of the reasoner
completely, and missing information has to be inferred inductively. When starting with a family of belief
bases R1, . . . , Rn instead of belief states P1, . . . , Pn, there are basically two different ways of obtaining a fused
belief state P. On the one hand, it is possible to inductively infer a belief state Pi(Ri) from the belief base Ri
for every reasoner ri independently and then apply an opinion pooling operator to P1(R1), . . . , Pn(Rn). This two-stage
process is called obdurate merging (Adamcik 2014;
Wilmers 2015). On the other hand, it is possible to merge
the belief bases into a unified belief base R first and to infer a
belief state P(R) from R afterwards (cf. Figure 1).

[Figure 1: The two ways of processing social inferences. The belief bases R1, . . . , Rn of the n reasoners are either completed to belief states P(R1), . . . , P(Rn) by inductive inference and then combined by opinion pooling into the fused belief state P(R), or first merged into an aggregated belief base R from which P(R) is inferred inductively.]

As for opinion pooling, for belief base merging
(Konieczny and Pérez 2011) there is not one generally accepted strategy but many competing approaches. For
probabilistic inductive inference, the principle of maximum
entropy (Shannon and Weaver 1949; Paris 2006) provides a
well-founded methodology. The maximum entropy distribution
for a belief base R is the probability distribution which
satisfies all beliefs in R while adding as little information as
possible. In (Paris 1999) it is shown that the maximum entropy
distribution is the only probability distribution which satisfies
a number of fundamental principles of commonsense reasoning.
Accordingly, some effort has been expended on obdurate
merging at maximum entropy (Wilmers and Jensen 2010;
Adamcik 2014; Wilmers 2015), i.e., processing social inferences
following the way "down right" in Figure 1. As opposed
to this, the interaction of merging belief bases first and
applying the maximum entropy principle afterwards (the way
"right down" in Figure 1) has not been investigated satisfactorily yet.

In this paper, we want to provide insights into the second
way of processing social inferences. In detail, we present a
novel approach for merging belief bases which uses a first-order
translation of beliefs. Then, we apply the principle
of maximum entropy to the merged beliefs and draw
inferences which depend on the semantics of first-order conditionals. We show that our approach is expressive enough to
produce well-known belief fusion operators that are defined
via obdurate merging. In addition, our approach allows one
to easily define novel belief fusion operators by varying the
semantics of first-order conditionals. Due to the expressiveness
of first-order logic, we are also able to formulate and
answer more complex queries than in a purely propositional
setting.

Before we present our first-order embedding approach
and highlight its benefits, we briefly recall obdurate merging
at maximum entropy and discuss different semantics for
our first-order setting.

3 Obdurate Merging at Maximum Entropy

We start this section with a discussion of maximum entropy
reasoning for a single reasoner who expresses her beliefs in
the form of probabilistic conditional statements

“if A holds, then B follows with probability p”

over a propositional language L. Afterwards, we recall the
obdurate merging operators OLEP and OSEP (see, e.g., (Dietrich and List 2017; Adamcik 2014)) with which the beliefs
of several reasoners can be fused.

Let Σ = {a, b, c, . . .} be a finite set of propositions which
can either be true or false. A formula in L(Σ) is a proposition
or is inductively defined by ¬A (negation), A ∧ B (conjunction),
or A ∨ B (disjunction), where A, B ∈ L(Σ). The
interpretation of formulas is as usual in propositional logic.
To shorten mathematical expressions, we write A̅ instead of
¬A, AB instead of A ∧ B, and ⊤ for any tautological formula
like A ∨ A̅. Probabilistic conditionals are denoted by
(B|A)[p], where A, B ∈ L(Σ) and p ∈ [0, 1]; such a conditional
formalizes a reasoner's degree of belief in B in the presence of A,
measured by the probability p. Finite sets R of probabilistic
conditionals serve as belief bases.

The semantics of probabilistic conditionals is based on
probability distributions over possible worlds. Here, a possible
world ω is a complete conjunction of literals, i.e., every
proposition from Σ occurs in ω exactly once, either positive
or negated. A probability distribution P over the set of all
possible worlds Ω(Σ) is a model of a belief base R iff

∀(B|A)[p] ∈ R :   P(A) > 0  ∧  P(AB)/P(A) = p,

where

P(A) = Σ_{ω |= A} P(ω)   for A ∈ L(Σ),

and where |= is the classical entailment relation. A belief
base is consistent iff it has at least one model. Note that
P(A) = P(A|⊤) for A ∈ L(Σ). Hence, probabilistic formulas
are subsumed within this framework by identifying
A[p] = (A|⊤)[p].

Due to the vast number of probability distributions
over Ω(Σ), reasoning over all models of a consistent belief
base is often very uninformative.

Example 2. If Rcp = {(c|a)[0.7], (c|b)[0.9]}, i.e., a and b
are evidence for c, nothing can be said about the likelihood
of c in the presence of a and b when reasoning over all models
of Rcp. More precisely, for each probability p ∈ [0, 1],
there is a model of Rcp in which (c|ab)[p] holds. At the
same time, it is reasonable to assume that the probability
p of (c|ab)[p] is at least 0.7 (unless a and b weaken each
other's strength of evidence for c, which is possible but
certainly not to be assumed by default).

Hence, for reasoning tasks, it is useful to select a single
model of a consistent belief base R. Of course, this
model should reflect the belief state of the reasoner with belief
base R appropriately. Here, we rely on the aforementioned
maximum entropy distribution (Paris 2006), which is
formally defined by

ME(R) = arg max_{P |= R} − Σ_ω P(ω) · log P(ω),

where the convention 0 · log 0 = 0 applies. Note that, if
R is consistent, ME(R) exists and is unique. It yields the
non-monotonic inference relation

R |=ME (B|A)[p]   iff   ME(R)(B|A) = p.

For example (cf. Example 2),

Rcp |=ME (c|ab)[p]   with p ≈ 0.908.

Now, assume that there are several reasoners r1, . . . , rn,
each equipped with a belief base Ri, the common opinion
of which is in demand. The obdurate merging operator

OLEP(R1, . . . , Rn)(ω) = (1/n) · Σ_{i=1}^n ME(Ri)(ω),   ω ∈ Ω(Σ),

linearly pools the maximum entropy distributions for
R1, . . . , Rn and is called the obdurate linear entropy process
(Adamcik 2014; Dietrich and List 2017). The operator

OSEP(R1, . . . , Rn)(ω) = ( Π_{i=1}^n ME(Ri)(ω)^{1/n} ) / ( Σ_{ω′∈Ω(Σ)} Π_{i=1}^n ME(Ri)(ω′)^{1/n} ),   ω ∈ Ω(Σ),

logarithmically pools the maximum entropy distributions
and is called the obdurate social entropy process (Adamcik
2014; Dietrich and List 2017). Both operators satisfy a
number of desirable properties from opinion pooling and,
hence, are important representatives of pooling operators
(see (Adamcik 2014) for a comprehensive collection and
comparison of properties).
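The value p ≈ 0.908 in Example 2 can be checked numerically. The sketch below is our own illustration (not the authors' implementation): it uses the standard log-linear form of the maximum entropy solution and fits the Lagrange multipliers by gradient descent on the convex log-partition function, whose gradient is the vector of expected constraint features.

```python
import math
from itertools import product

# Worlds over Sigma = {a, b, c}; a world is a triple of truth values (a, b, c).
worlds = list(product([0, 1], repeat=3))

# A conditional (B|A)[p] induces the linear constraint E_P[1(AB) - p*1(A)] = 0.
# Rcp = {(c|a)[0.7], (c|b)[0.9]}:
features = [
    lambda w: w[0] * (w[2] - 0.7),   # (c|a)[0.7]
    lambda w: w[1] * (w[2] - 0.9),   # (c|b)[0.9]
]

# The ME solution has log-linear form P(w) ~ exp(sum_k lam_k * f_k(w)); fit the
# lam_k by gradient descent on the log-partition function (gradient = E_P[f_k]).
lam = [0.0] * len(features)
P = [1.0 / len(worlds)] * len(worlds)
for _ in range(50000):
    wts = [math.exp(sum(l * f(w) for l, f in zip(lam, features))) for w in worlds]
    Z = sum(wts)
    P = [x / Z for x in wts]
    grad = [sum(p * f(w) for p, w in zip(P, worlds)) for f in features]
    if max(map(abs, grad)) < 1e-12:
        break
    lam = [l - 1.5 * g for l, g in zip(lam, grad)]

prob = dict(zip(worlds, P))
p_c_ab = prob[(1, 1, 1)] / (prob[(1, 1, 1)] + prob[(1, 1, 0)])
print(round(p_c_ab, 3))  # 0.908, matching Rcp |=ME (c|ab)[p]
```

The same routine can serve as the ME(Ri) building block for OLEP and OSEP, which only pool the resulting distributions arithmetically respectively geometrically.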
4
Maximum Entropy and First-Order
Conditionals
In preparation of our belief base merging approach which
uses a first-order translation of beliefs, we briefly discuss
some particularities of maximum entropy reasoning with
first-order conditionals. The main difference to reasoning
with propositional conditionals is that one has to specify
a semantics of first-order conditionals with free variables.
132
P |=grnd r, iff ∀(B′ |A′ )[p] ∈ grnd(r) :
Actually, we profit from this additional requirement when
varying the semantics of first-order conditionals in order to
produce different belief fusion operators later on.
In this section, we consider a function-free first-order
language FOL over the signature (Pred, Const) consisting of finite sets of predicates Pred and constants Const.
Formulas in FOL are built by using the common connectives (∧, ∨, ¬) and quantifiers (∃, ∀). While constants
and predicates are denoted with sans serif lowercase letters, we denote variables with uppercase letters. We use
the same abbreviations for conjunction and negation as in
the propositional case. If p is a predicate of arity n and
c1 , . . . , cn are constants, the formula p(c1 , . . . , cn ) is called
ground atom. The set of all ground atoms is denoted with
Σfo = Σfo (Pred, Const). A ground literal is a ground atom
or its negation. Possible worlds in Ω(Σfo ) are complete conjunctions of ground literals.
Formulas in FOL can be instantiated by substituting
each free variable by a constant (e.g., ∀X r(X, a) is an instance of ∀X r(X, Y)). The set of all instances of a formula A
is denoted by inst(A) and the set of the free variables in A by
var(A). Formulas without free variables are called closed.
In analogy to the propositional case, probabilistic conditionals are expressions of the form (B|A)[p] with a probability p. In this context, however, A and B may be arbitrary
first-order formulas from FOL. In particular, they may contain free variables. While the interpretation of closed conditionals, i.e. conditionals (B|A)[p] with closed formulas A, B,
is obvious (P |= (B|A)[p] iff P(B|A) = p), conditionals
with free variables can be interpreted in different ways.
We recall three popular semantics of first-order conditionals from literature, namely the grounding semantics, the
averaging semantics, and the aggregating semantics (KernIsberner and Thimm 2010; Thimm and Kern-Isberner 2012),
and present two novel semantics as well. Beforehand, we
introduce some further notations that are used in the definitions of the semantics.
Let r = (B|A)[p] be a first-order conditional. Then,
grnd(r) denotes the set of all proper groundings of r. Proper
groundings are obtained by substituting each free variable
that is mentioned in A or B by any constant from Const. For
example, (d(1)|s(1))[p] and (d(2)|s(2))[p] are the proper
groundings of (d(X)|s(X))[p] when Const = {1, 2}. Note
that (d(1)|s(2))[p] is not a proper grounding as free variables that are mentioned in both A and B have to be substituted with the same constant in A and B. With verr (ω)
and appr (ω) we count the numbers of proper groundings of
r = (B|A)[p] that are verified respectively applicable in the
possible world ω:
P |= (B′ |A′ )[p].
The grounding semantics requires that for every proper
grounding of a conditional statement the conditional probability is the same, namely p. Note that this is a rather strong
constraint. The following semantics are more ‘smoothing’
and built upon mean values over all proper groundings instead. For example, the averaging semantics looks at the
arithmetic mean of the probabilities of all groundings of r.
Definition 2 (Averaging Semantics). Let r = (B|A)[p] be a
conditional defined over FOL, and let P be a probability
distribution over Ω(Σfo ). Then, P is an averaging-model
of r, written
X
1
P(B′ |A′ ) = p.
·
P |=avrg r, iff
|grnd(r)|
′
′
(B |A )[p]∈grnd(r)
The idea behind the aggregating semantics is to mimic
statistical probabilities from a subjective point of view: Not
the relative frequency of the sum of the verifications of the
instances of the conditional (spread over all possible worlds)
is measured against the applicability of the conditional, but
the reasoner’s beliefs in the eventuation of these instances
are taken into account.
Definition 3 (Aggregating Semantics). Let r = (B|A)[p] be
a conditional defined over FOL, and let P be a probability
distribution over Ω(Σfo ). Then, P is an aggregating-model
of r, written
X
P(A′ ) > 0
P |=aggr r, iff
(B′ |A′ )[p]∈grnd(r)
P
′ ′
(B′ |A′ )[p]∈grnd(r) P(A B )
= p.
and P
′
(B′ |A′ )[p]∈grnd(r) P(A )
Our two novel semantics are specially constructed for
our belief fusion approach. In the approving semantics,
the probabilities of the possible worlds are weighted with
the relative frequency of the verifications of the conditional
within the respective possible world. That is, probabilities
get a higher weight the more proper groundings of the conditional are verified relative to the number of falsifications.
Definition 4 (Approving Semantics). Let r = (B|A)[p] be
a conditional defined over FOL, and let P be a probability
distribution over Ω(Σfo ). Then, P is an approving-model
of r, written
X
P |=appr r, iff
fr (ω) · P(ω) = p,
ω∈Ω(Σfo )
verr (ω) = |{(B′ |A′ )[p] ∈ grnd(r) | ω |= A′ B′ }|,
where
appr (ω) = |{(B′ |A′ )[p] ∈ grnd(r) | ω |= A′ }|.
It is 0 ≤ verr (ω) ≤ appr (ω) ≤ |grnd(r)| for all conditionals r and possible worlds ω.
Definition 1 (Grounding Semantics). Let r = (B|A)[p] be
a conditional defined over FOL, and let P be a probability distribution over Ω(Σfo ). Then, P is a grounding-model
of r, written
fr (ω) =
(
verr (ω)
appr (ω)
0
iff appr (ω) > 0
.
otherwise
Our last semantics, the uniformity semantics, works on
certain subclasses of first-order conditionals only. For simplicity, we concentrate on a Boolean fragment BOOL of
FOL here. BOOL is the quantifier-free fragment of FOL
133
5
where Pred consists of unary predicates only. Hence, formulas in BOOL are Boolean combinations of unary predicates.
We specify Const = {1, . . . , V
n}. Then, possible worlds ω
n
can be decomposed into ω = i=1 ωi such that ωi contains
those ground literals from ω that are instantiated with i. With
A[i/j] we denote the formula A in which every occurance of
the constant i is replaced by the constant j.
We now present our approach to merging (propositional) belief bases R1 , . . . , Rn of n reasoners. The core idea of our
approach is to lift the background language of the conditionals in Ri to a fragment of first-order logic. By doing so, the
fictive decision maker which has access to the merged belief
base is equipped with a more expressive language than the
reasoners and is able to express statements about the opinions of the reasoners, i.e., statements of the form
Definition 5 (Uniformity Semantics). Let r = (B|A)[p] be a
conditional defined over BOOL, and let P be a probability
distribution over Ω(Σfo ). Then, P is an uniformity-model
of r, written
X
P |=unif r, iff
“if A holds, then B follows with probability p in the
view of reasoner ri ”
P(ω)1/n > 0
and of the form
ω∈Ω(Σbool )
∀i=1,...,n: ωi [i/1]=ω1
ω1 |=A(1)
“if A holds, then B follows with probability p in the
view of the group of reasoners.”
P
and
P(ω)1/n
ω∈Ω(Σbool )
∀i=1,...,n: ωi [i/1]=ω1
ω1 |=A(1)B(1)
P
P(ω)1/n
This is a reasonable and natural extension of the language
that meets the intention of the decision maker appropriately:
The decision maker does not appear as an autonomous, standalone reasoning agent who revises her own beliefs but reflects and processes the opinions of the other reasoners.
In nearly all approaches to belief base merging, the
merged belief base R makes use of the same background
language as the single belief bases that are merged instead.
In our setting this would mean that R was a set of conditionals (B|A)[p] with A, B ∈ L(Σ). With this, the fictive
decision maker with belief base R would be able to express
the same statements as the reasoning agents but should assign aggregated probabilities to the statements in order to
achieve a consensus. She would no longer be able to differentiate between the viewpoints of the reasoners. In particular, statements that involve opposing attitudes of several
reasoners like “if the first doctor believes in disease d but the
second does not, the decision maker/group of reasoners believes in the presence of symptom s with probability p” are
not (directly) expressible. Further, it is a widely accepted but
not an uncontroversial postulate of belief base merging
Sn that
merging should be performed by set union, R = i=1 Ri ,
if the union is consistent (Konieczny and Pérez 2011). This
might be reasonable if, for example, the merged belief base
belongs to a single reasoner who connects information from
several sources that are formalized by the belief bases which
are merged. However, in our setting, merging by set union
is inappropriate, as it disregards the parts of the belief states
of the single reasoners that are given only implicitly by the
inference behaviors of the reasoners.
We now discuss the technical aspects behind our merging approach. We lift the propositional language L(Σ) to a
Boolean fragment of FOL, basically, by translating propositions from L(Σ) to unary predicates. While instantiations
of these predicates correspond to propositions in view of a
single reasoning agent, a predicate with a (free) variable represents a proposition in view of the group. This translation
eventuates in a first-order signature (Const, Pred) consisting of the finite set of constants Const = {1, . . . , n} (the
reasoners’ ids) and the finite set of predicates
= p.
ω∈Ω(Σbool )
∀i=1,...,n: ωi [i/1]=ω1
ω1 |=A(1)
We will further investigate and compare these semantics
later on in the light of social inference processes.
As minimal requirements for developing new semantics of first-order conditionals, one should guarantee that
conditionals are evaluated to probability values and that
closed conditionals are interpreted by conditional probabilities. We call semantics which satisfy these requirements
well-behaved. All five aforementioned semantics are wellbehaved.
In order to refer to an arbitrary semantics of first-order
conditionals, we write |=sem . Hence, the subscript sem
serves as a placeholder for grnd, avrg, and so on. In order to indicate that conditionals are interpreted under a certain semantics, we annotate the respective subscript also to
probability distributions. For example, we write Paggr (B|A)
when the conditional statement (B|A) is evaluated under the
aggregating semantics.
A probability distribution P is a sem-model of a firstorder belief base Rfo , i.e. a finite set of first-order conditionals, if it models all conditionals in Rfo with respect to
the sem-semantics. Once a semantics sem is fixed, the maximum entropy distribution for Rfo is defined by
ME(Rfo ) = arg max −
P|=sem Rfo
X
Belief Base Merging by First-Order
Embedding
P(ω) · log P(ω).
ω∈Ω(Σfo )
It remains to note that the maximum entropy distribution for
arbitrary first-order belief bases and with respect to arbitrary
semantics does neither need to exist nor need to be unique.
However, we will show that in our concrete application the
maximum entropy distribution will exist and will be unique
with respect to all well-behaved semantics.
Pred = {a/1 | a ∈ Σ},
134
fo
fo
separation means that the union Rfo
doc = Rdoc,1 ∪ Rdoc,2 of
the reasoners’ belief bases is no longer necessarily inconsistent as the two conditionals in Rfo
doc deal with the same issue
but from different points of view, implemented by different
syntactic elements. As we will see later on, this consistency
preservation carries over to arbitrary (consistent) prior belief
bases, and belief base merging can be performed by simply
joining their first-order pendants.
Definition 6 (First-Order Merging). The first-order merging
Rfo of the belief bases R1 , . . . , Rn is defined by
where all predicates are of arity 1. For simplicity, we name
the predicates as their corresponding propositions but write
them in sans serif letters. This leads to an easy-to-read translation while it still enables one to distinguish between propositional and first-order expressions. Hence, each proposition a ∈ Σ corresponds to an atom a(X). The set of ground
atoms becomes
Σfo = {a(i) | a ∈ Pred, i ∈ Const}.
One can say that propositions are translated into several
duplicates, one for each reasoner ri . Further, we define
the first-order translation fo(·) of formulas from L(Σ) in a
straightforward, recursive way by
Rfo = Rfo (R1 , . . . , Rn ) :=
n
G
i=1
• fo(a) = a(X) for propositions a ∈ Σ, and
Ri :=
n
[
Rfo
i .
i=1
Although the merging operator ⊔ pretty much looks like
ordinary set union, it differs from joining R1 , . . . , Rn as the
fo
fo
first-order translations
Sn R1 , . . . , Rn are joined instead, even
in the case when i=1 Ri is consistent. This deviance is
intended as already mentioned.
A side product of our approach is that the belief base of
a single reasoner ri can be recovered from the merged belief base Rfo by extracting those conditionals that mention
ground atoms with constant i and by back-translating them
into the propositional language L(Σ).
• fo(¬A) = ¬fo(A), fo(A ∧ B) = fo(A) ∧ fo(B),
and fo(A ∨ B) = fo(A) ∨ fo(B) for A, B ∈ L(Σ).
In plain words, fo(A) is the formula A in which every
proposition is replaced by its corresponding (non-grounded)
atom from first-order logic. Again, with an easy readability in mind, we name the resulting formulas of a first-order
translation with sans serif letters, i.e. fo(A) = A, similarly
as we have done for propositions. The entirety of all possible first-order translations of formulas from L(Σ) forms a
Boolean fragment of the first-order language FOL to which
we refer as BOOL(Σ). Hence, fo(·) is a bijection between
L(Σ) and BOOL(Σ). We denote the set of conditionals
(B|A)[p] with A, B ∈ BOOL(Σ) with CondB (Σ).
In order to merge belief bases R1 , . . . , Rn we compile them so that they fit into our first-order setting. For
this, we substitute every conditional (B|A)[p] ∈ Ri with
(B(i)|A(i))[p], where A(i) and B(i) are the first-order translations of A and B that are instantiated with the constant i.
As such an instantiated formula A(i) means “A in the view
of agent ri ,” the conditional (B(i)|A(i))[p] can be understood as the conditional (B|A)[p] in the view of reasoner ri .
With this, we define the first-order translation of Ri by
6
Social Inference at Maximum Entropy
Based on First-Order Embedding
We now discuss the semantical aspects of our merging approach and define a schema for generating belief fusion operators at maximum entropy
Fn based on our first-order embedding. For this, let Rfo = i=1 Ri be the first-order merging
of the belief bases R1 , . . . , Rn . Without a proper semantics, Rfo is just a collection of the single reasoners’ beliefs
that are marked with the reasoners’ id’s. The essential question when inferring fused beliefs from the belief base Rfo is
how the conditionals in Rfo should be combined in order to
observe a unified view that reflects the opinions of all reasoners. This aggregation is done by relating the ground instantiated conditionals (B(i)|A(i))[p] ∈ Rfo , i = 1, . . . , n,
to the corresponding open conditional (B(X)|A(X))[p] ∈
CondB (Σ). While (B(i)|A(i))[p] expresses a belief of reasoner ri , the open conditional (B(X)|A(X))[p] expresses the
unified view of all reasoners on the conditional event (B|A).
In Example 1, for instance, we are interested in the unified view of both doctors on the influence of symptom s
on disease d which can be formalized by the open conditional (d(X)|s(X))[p]. That is, we answer the query “With
what probability do the doctors assume d in the presence
of s?”, written (d|s)[?], with the probability p of the conditional (d(X)|s(X))[p]. Hence, the answer to the query depends on the semantics of open first-order conditionals.
Once the belief bases R1 , . . . , Rn are merged to Rfo and
a semantics of first-order conditionals is fixed, belief fusion
is straightforward.
Definition 7 (First-Order Belief Fusion). Let R1 , . . . , Rn
be consistent belief bases,Flet P(Rfo ) be a model of the
n
merged belief base Rfo = i=1 Ri , and let sem be a wellfus
behaved first-order semantics. Then, the fo-fusion Psem
of
Rfo
i = {(B(i)|A(i))[p] | (B|A)[p] ∈ Ri }
for i = 1, . . . , n.
Example 3. The first-order translation of the belief base
Rdoc,1 = {(d|s)[0.9]} of the first doctor from Example 1 is
Rfo
doc,1 = {(d(1)|s(1)[0.9]},
and the first-order translation of the belief base of the second
doctor is
Rfo
doc,2 = {(d(2)|s(2))[0.8]}.
When considering only a single reasoner ri , the first-order
translation of the belief base Ri is nothing else than renaming the propositions, since Rfo
i is grounded, and reasoning
about Rfo
works
the
same
as
reasoning about Ri . When
i
comparing the belief bases of several reasoners, the translafo
tion process implies that every two belief bases Rfo
i and Rj
with i 6= j do not share any ground atoms so that they are
syntactically separated. In particular, they are disjoint sets,
fo
i.e., Rfo
i ∩ Rj = ∅, even if this is not the case for Ri and
Rj , i.e., Ri ∩ Rj 6= ∅. In the doctors example this syntactic
135
MEfus
sem (R1 , . . . , Rn ) depends on the semantics of first-order
conditionals.
We conclude this section by illustrating the ME-fusion by
means of an example.
R1 , . . . , Rn with respect to P and sem is defined by
fus
Psem
(R1 , . . . , Rn ) |= (B|A)[p]
iff P(Rfo ) |=sem (B(X)|A(X))[p]
Example 4. We recall Example 1. One has
for propositional conditionals (B|A)[p].
MEfus
avrg (Rdoc,1 , Rdoc,2 )(d|s) =
1
= · ME(Rfo )(d(1)|s(1)) + ME(Rfo )(d(2)|s(2))
2
1
= · (0.9 + 0.8) = 0.85
2
fus
We usually omit the arguments of Psem
and P when they
are clear from the context. In particular, one has
fus
Psem
(ω) = p iff P |=sem (w(X)|⊤)[p],
where w(X) = fo(ω) is the first-order translation of the possible world ω ∈ Ω(Σ). Bear in mind that possible worlds
from Ω(Σ) are not translated to possible worlds in Ω(Σfo )
but to open formulas w(X) ∈ BOOL(Σ).
In fact, Definition 7 is not the definition of a single belief fusion operator but is a schema for generating a whole
family of belief fusion operators which can be observed by
varying the models of Rfo as well as the first-order semantics.
We have already mentioned that there are several semantics of first-order conditionals but it remains to clarify under
which constraints there is a model of Rfo . For this, F
we show
n
that the maximum entropy distribution for Rfo = i=1 Ri
exists if R1 , . . . , Rn are consistent. Hence, Rfo is consistent in this case, too. Recall that the maximum entropy optimization problem, i.e. finding a model of Rfo which has
maximal entropy among all models of Rfo , is mathematically the same in our first-order setting as in the propositional case aside from the fact that one has to replace the belief base of a single reasoner with the merged belief base and
the search space of all probability distributions over Ω(Σ)
with those over Ω(Σfo ). The maximum entropy distribution ME(Rfo ) does not depend on the interpretation of firstorder conditionals with free variables since all conditionals
in Rfo are ground, though. Therefore, the constraints in the
optimization problem are the same for all well-behaved semantics and are linear combinations of the probabilities that
have to be found. According to (Boyd and Vandenberghe
2004), the maximum entropy optimization problem has a
unique solution in this case provided that the belief bases
R1 , . . . , Rn are consistent which guarantees that the search
space is non-empty. Consequently, the maximum entropy
distribution ME(Rfo ) exists and Rfo is consistent. A further consequence is that belief fusion operators according to
Definition 7 and with respect to ME(Rfo ) always exist.
which equals MEfus
appr (Rdoc,1 , Rdoc,2 ) = 0.85. In contrast to
this, one has
MEfus
aggr (Rdoc,1 , Rdoc,2 )(d|s) ≈ 0, 8475
and
MEfus
unif (Rdoc,1 , Rdoc,2 ) ≈ 0, 8571.
We leave the more sophisticated calculations in the latter
cases to the reader. With the grounding semantics, it is not
possible to draw an inference, as for the two proper groundings of (d(X)|s(X))[p] there are stated different probabilities
in the merged belief base.
7 Comparison of First-Order Semantics in the Light of Belief Fusion

In the last section, we have formally defined the notion of ME-fusion. Apart from the input belief bases, the ME-fusion operator also depends on the chosen semantics of first-order conditionals. We reformulate the semantics from Section 4 in the light of belief fusion and prove that the aggregating semantics coincides with the fusion operator OLEP while the uniformity semantics leads to OSEP. Example 4 has proven that the three remaining semantics differ from both OLEP and OSEP.

As our first-order translation of beliefs leads to first-order conditionals in CondB(Σ), the definitions of first-order semantics in Section 4, which take conditionals defined over the entire language FOL into account, are too comprehensive to assess beliefs from the group of reasoners' point of view (at least for many of them; see the end of this section for an extension of the notion of beliefs of the decision maker). Thus, we give characterizations of the semantics in the Boolean context before we discuss the role they play for belief fusion.
Definition 8 (Social Inference at Maximum Entropy). Let R1, . . . , Rn be consistent belief bases, and let sem be a well-behaved first-order semantics. Then, we define a social inference operator at maximum entropy for R1, . . . , Rn and sem, the ME-fusion for short, by

MEfus_sem(R1, . . . , Rn) |= (B|A)[p]   iff   ME(Rfo) |=sem (B(X)|A(X))[p]

for propositional conditionals (B|A)[p]. Note that, in contrast to the definition of the maximum entropy distribution ME(Rfo), the ME-fusion does depend on the chosen semantics sem of first-order conditionals.

Characterization 1 (Grounding Semantics). A probability distribution P over Ω(Σfo) is a grounding-model of a conditional (B(X)|A(X))[p] ∈ CondB(Σ) iff

∀i = 1, . . . , n : P(A(i)) > 0 and P(B(i)|A(i)) = p.

The grounding semantics causes the decision maker to draw only those inferences at maximum entropy that are supported by all reasoners in the same manner:

MEfus_grnd(R1, . . . , Rn) |= (B|A)[p]   iff   ∀i = 1, . . . , n : ME(Ri) |= (B|A)[p].

As we have seen, an external decision maker would not come to a conclusion in Example 1, as the two doctors differ in their appraisal. Belief fusion based on the grounding semantics can thus be seen as the most cautious way of decision making.

Characterization 2 (Averaging Semantics). A probability distribution P over Ω(Σfo) is an averaging-model of a conditional (B(X)|A(X))[p] ∈ CondB(Σ) iff

(1/n) · ∑_{i=1}^{n} P(B(i)|A(i)) = p.

The averaging semantics leads to a linear pooling of conditional probabilities. Note that this is not the same as linear pooling of probabilities. In fact, it is mentioned in (Genest and Zidek 1986) that there is no non-trivial linear pooling operation which also linearly pools conditional probabilities. In Example 1, the decision maker assigns the probability 0.85, i.e. the arithmetic mean of 0.9 and 0.8, to the conditional (d|s), which is a very obvious assignment at first glance.

Characterization 3 (Aggregating Semantics). A probability distribution P over Ω(Σfo) is an aggregating-model of a conditional (B(X)|A(X))[p] ∈ CondB(Σ) iff

∑_{i=1}^{n} P(A(i)B(i)) / ∑_{i=1}^{n} P(A(i)) = p,

which can be reordered to a weighted arithmetic mean of the probabilities P(B(i)|A(i)):

∑_{i=1}^{n} µi · P(B(i)|A(i)) = p   with   µi = P(A(i)) / ∑_{j=1}^{n} P(A(j)).

The aggregating semantics leads to linear pooling of maximum entropy distributions with equal weights of 1/n as in OLEP. This, however, is not so obvious, as the weights µi seem to differ from reasoner to reasoner (because of the probability P(A(i)) in the numerator of µi); this is the case because we deal with conditional statements here. In order to prove that the aggregating semantics in combination with the principle of maximum entropy coincides with OLEP, we recall that each ω ∈ Ω(Σfo) can be written as ω = ⋀_{i=1}^{n} ωi, where ωi is the conjunction of those ground literals in ω that mention the constant i. Further, consider the back-translation prop(ωi) that translates the marginalized world ωi to a possible world from Ω(Σ) by substituting each ground atom a(i) in ωi by the proposition a ∈ Σ. For example, the possible world ω = d(1)s(1)d(2)s(2) ∈ Ω(Σfo) decomposes into the marginalized worlds ω1 = d(1)s(1) and ω2 = d(2)s(2), which leads to prop(ω1) = ds and prop(ω2) = ds.

Proposition 1. Let R1, . . . , Rn be consistent belief bases, and let Rfo = R1 ∪ · · · ∪ Rn. Then, for all ω ∈ Ω(Σfo),

MEaggr(Rfo)(ω) = ∏_{i=1}^{n} MEaggr(Rfo)(ωi) = ∏_{i=1}^{n} ME(Ri)(prop(ωi)).

Proof (Sketch). The first equality holds as MEaggr satisfies Syntax Splitting. By construction, Rfo = Rfo_1 ∪̇ · · · ∪̇ Rfo_n is a union of syntactically independent conditionals whereby conditionals in Rfo_i are defined by using atoms from Σfo_i only, i.e. the set of atoms that mention i. As a consequence, MEaggr(Rfo) factorizes over Σfo_1, . . . , Σfo_n. See (Wilhelm, Kern-Isberner, and Ecke 2017) for the technical details. The second equation is an immediate consequence of the property System Independence. It states that "it should not matter whether one accounts for independent information about independent systems separately in terms of different densities or together in terms of a joint density." (Shore and Johnson 1980)

Proposition 1 shows that MEaggr(Rfo) is the joint distribution of the independently distributed opinions of the single reasoners.

Corollary 1. Let R1, . . . , Rn be consistent belief bases and let (B|A)[p] with A, B ∈ L(Σ) be a conditional. Then,

MEaggr(Rfo)(B(i)|A(i)) = p   iff   ME(Ri)(B|A) = p.

Proof. According to Proposition 1,

MEaggr(Rfo)(B(i)|A(i))
= ∑_{ω∈Ω(Σfo): ω|=A(i)B(i)} MEaggr(Rfo)(ω) / ∑_{ω∈Ω(Σfo): ω|=A(i)} MEaggr(Rfo)(ω)
= ∑_{ω|=A(i)B(i)} ∏_{j=1}^{n} MEaggr(Rfo)(ωj) / ∑_{ω|=A(i)} ∏_{j=1}^{n} MEaggr(Rfo)(ωj)
= ∑_{ωi|=A(i)B(i)} MEaggr(Rfo)(ωi) / ∑_{ωi|=A(i)} MEaggr(Rfo)(ωi)
= ∑_{prop(ωi)|=AB} ME(Ri)(prop(ωi)) / ∑_{prop(ωi)|=A} ME(Ri)(prop(ωi))
= ME(Ri)(B|A).

Corollary 1 states that the maximum entropy probability of the i-th instance of the conditional statement (B(X)|A(X)) indeed corresponds to the probability which the i-th reasoner would assign to the conditional statement (B|A) if she were a maximum entropy reasoner. This is a strong justification for using the aggregating semantics for ME-fusion, and it directly leads to the central connection between OLEP and ME-fusion with respect to the aggregating semantics.

Theorem 1. Let R1, . . . , Rn be consistent belief bases. Then,

MEfus_aggr(R1, . . . , Rn) = OLEP(R1, . . . , Rn).

Proof. For ω ∈ Ω(Σ), one has

MEfus_aggr(R1, . . . , Rn)(ω)
= ∑_{i=1}^{n} MEaggr(Rfo)(ω(i)) / ∑_{i=1}^{n} MEaggr(Rfo)(⊤)
= (1/n) · ∑_{i=1}^{n} ME(Ri)(ω) = OLEP(R1, . . . , Rn)(ω),

where the second equality holds since MEaggr(Rfo)(⊤) = 1 and, by Proposition 1, MEaggr(Rfo)(ω(i)) = ME(Ri)(ω).

From a semantical point of view, the contribution of Theorem 1 is as follows: When using OLEP for belief fusion, one assumes that every single reasoner infers her beliefs according to the principle of maximum entropy, i.e. reasons in a most cautious way, which is at least questionable. Theorem 1 states, however, that it is sufficient to assume that the decision maker is a maximum entropy reasoner to observe the same results as with OLEP. The assumption that the decision maker is a cautious reasoner is much more reasonable and is in accordance with the idea of finding a consensus between the reasoners.

For the approving semantics, there is no simplifying characterization aside from the fact that the numbers of verified and applicable groundings of r = (B|A)[p] in ω reduce to

ver_r(ω) = |{i ∈ {1, . . . , n} | ω |= A(i)B(i)}|,
app_r(ω) = |{i ∈ {1, . . . , n} | ω |= A(i)}|.

Maximum entropy reasoning based on the approving semantics is neither a linear nor a logarithmic pooling operation.

Example 5. We consider Example 1 but with a third doctor who believes in (d|s) with probability 0.6. A calculation of some length shows MEfus_appr(d|s) ≈ 0.908, i.e. MEfus_appr(d|s) is higher than the highest rating of the doctors. In particular, MEfus_appr is not a (weighted) arithmetic or geometric mean.

Example 5 shows that the approving semantics leads to rather credulous decision making.

Characterization 4 (Uniformity Semantics). A probability distribution P over Ω(Σfo) is a uniformity-model of a conditional (B(X)|A(X))[p] ∈ CondB(Σ) iff

∑_{ω∈Ω(Σ), ω|=AB} P(⋀_{i=1}^{n} ω(i))^{1/n} / ∑_{ω∈Ω(Σ), ω|=A} P(⋀_{i=1}^{n} ω(i))^{1/n} = p.

The uniformity semantics assigns positive probabilities only to those possible worlds in Ω(Σfo) that are duplicates of worlds ω ∈ Ω(Σ) for each reasoner ri. By duplicated worlds we mean, for example, s(1)d(1)s(2)d(2) in Example 1, but not a world in which the constants 1 and 2 instantiate different propositional worlds. This restriction causes a unified view on the world for all reasoners ri.

The uniformity semantics in combination with the principle of maximum entropy results in OSEP.

Theorem 2. Let R1, . . . , Rn be consistent belief bases. Then,

MEfus_unif(R1, . . . , Rn) = OSEP(R1, . . . , Rn).

Proof. One has

MEfus_unif(R1, . . . , Rn)(ω)
= MEunif(Rfo)(⋀_{i=1}^{n} ω(i))^{1/n} / ∑_{ω′∈Ω(Σ)} MEunif(Rfo)(⋀_{i=1}^{n} ω′(i))^{1/n}
= ∏_{i=1}^{n} MEunif(Rfo)(ω(i))^{1/n} / ∑_{ω′∈Ω(Σ)} ∏_{i=1}^{n} MEunif(Rfo)(ω′(i))^{1/n}
= ∏_{i=1}^{n} ME(Ri)(ω)^{1/n} / ∑_{ω′∈Ω(Σ)} ∏_{i=1}^{n} ME(Ri)(ω′)^{1/n}
= OSEP(R1, . . . , Rn)(ω).

We close this section with a brief discussion of the benefits owing to the gain in expressivity when translating beliefs into first-order statements. In principle, the translation allows the decision maker to easily express beliefs that mix the viewpoints of the single reasoners. For example, with respect to the scenario in Example 1, one could be interested in the common belief in disease d when the first doctor believes in the presence of symptom s while the second doctor does not, i.e., one queries the conditional statement (d(X)|s(1)s̄(2)). This is, at least under consideration of the aggregating semantics, equivalent to the course of action where the doctors update their beliefs with s resp. s̄ first, before they communicate them to the decision maker.

Further queries of interest could be: With which probability should the decision maker assume that the group of doctors believes in d, that both doctors believe in d, that at least one doctor believes in d, and so on? Clearly, there is still a lot of work to be done in this line of thinking in order to translate these queries properly into first-order conditionals such that the conditionals and the chosen semantics of the conditionals reflect the idea behind the query properly.

8 Related Work

Further approaches to fusing probabilities based on the principle of maximum entropy have been published (Myung, Ramamoorti, and Bailey 1996; Levy and Delic 1994; Mohammad-Djafari 1998; Fassinut-Mombot and Choquel 2000). What comes closest to our view on belief fusion is the fusion operator defined in (Kern-Isberner and Rödder 2004), to which we want to refer with KIR(R1, . . . , Rn). This operator also merges the belief bases R1, . . . , Rn first before applying the maximum entropy principle to the merged belief base. In order to represent beliefs from different reasoners' points of view, fresh propositions Wi are introduced and conditionals (B|A)[p] are translated to (B|AWi)[p]. Hence, the conditional (B|AWi)[p] states that "B holds in the presence of A with probability p in the view of reasoner ri." To separate the different viewpoints, the conditionals (W1 ∨ . . . ∨ Wn|⊤)[1] and (WiWj|⊤)[0] for i ≠ j are added. However, this translation shows certain side effects when applying the principle of maximum entropy.

Example 6. With respect to Example 1, it holds that KIR(Rdoc,1, Rdoc,2)(d|s) ≈ 0.8456, which differs from the results that can be obtained with OLEP and OSEP. In particular, KIR does not compute the arithmetic mean of the probabilities 0.9 and 0.8 but tends to the lower probability 0.8. This is a direct consequence of the cautiousness of maximum entropy reasoning, which also affects the calculation of the mean values in the KIR-approach.

It is an open question whether KIR can be reproduced with our first-order embedding approach. For a further comparison of KIR with OLEP and OSEP, see (Adamcik 2014).
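As a numerical sanity check of the factorization stated in Proposition 1 and of Corollary 1, the following sketch (our own illustration, not code from the paper) builds the product of the two doctors' maximum entropy models, reads it as a distribution over pairs of propositional worlds, and recovers each doctor's conditional probability from the joint.

```python
def me(p):
    """Closed-form ME distribution of the single propositional conditional
    (d|s)[p] over worlds (d, s), obtained from the Lagrange conditions:
    P(w) proportional to (p/(1-p))**c_w with c_ds = 1-p, c_~ds = -p."""
    r = p / (1 - p)
    w = {(True, True): r ** (1 - p), (True, False): 1.0,
         (False, True): r ** (-p), (False, False): 1.0}
    z = sum(w.values())
    return {k: v / z for k, v in w.items()}

P1, P2 = me(0.9), me(0.8)

# Factorization (illustrated, not proved): a first-order world is a pair
# (w1, w2) of propositional worlds with probability P1(w1) * P2(w2).
joint = {(w1, w2): P1[w1] * P2[w2] for w1 in P1 for w2 in P2}

def cond(i):
    """P(d(i) | s(i)) in the joint distribution; Corollary 1 predicts
    exactly the i-th doctor's own conditional probability."""
    marg = [((w1, w2)[i - 1], pr) for (w1, w2), pr in joint.items()]
    num = sum(pr for w, pr in marg if w == (True, True))
    den = sum(pr for w, pr in marg if w[1])  # worlds where s(i) holds
    return num / den

print(round(cond(1), 4), round(cond(2), 4))  # -> 0.9 0.8
```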
9 Conclusion and Future Work

In this paper, we presented a novel approach for merging belief bases that consist of probabilistic conditionals. While the belief bases that were merged used a propositional background language, the merged beliefs were formalized by first-order conditionals. The first-order lifting allowed us to distinguish between statements of the form "if A holds, then B follows with probability p in the view of reasoner ri," and "if A holds, then B follows with probability p in the view of the group of reasoners." The first type of statements was expressed by ground instantiated conditionals (B(i)|A(i))[p] and the second type of statements by open conditionals (B(X)|A(X))[p] with a free variable X. We proceeded with inferring a fused belief state from the merged belief base by applying the principle of maximum entropy. We showed that with our approach it is possible to both reproduce the well-known pooling operators OLEP and OSEP and define novel operators by varying the semantics of first-order conditionals. In addition, the first-order translation of beliefs allows one to ask and answer more complex queries than in the initial propositional setting.

In future work, we want to investigate if it is possible to benefit from the connection between belief fusion operators and semantics of first-order conditionals the other way around: For belief fusion operators, a wide range of postulates has been declared. We plan to reformulate these postulates in terms of first-order conditionals in order to evaluate the quality of first-order semantics.

References

Adamcik, M. 2014. Collective Reasoning under Uncertainty and Inconsistency. Ph.D. Dissertation, University of Manchester.

Bacharach, M. 1972. Scientific disagreement. Unpublished manuscript, Christ Church, Oxford.

Bloch, I.; Hunter, A.; Appriou, A.; Ayoun, A.; Benferhat, S.; Besnard, P.; Cholvy, L.; Cooke, R.; Cuppens, F.; Dubois, D.; Fargier, H.; Grabisch, M.; Kruse, R.; Lang, J.; Moral, S.; Prade, H.; Saffiotti, A.; Smets, P.; and Sossai, C. 2001. Fusion: General concepts and characteristics. International Journal of Intelligent Systems 16(10):1107–1134.

Boyd, S., and Vandenberghe, L. 2004. Convex Optimization. Cambridge University Press.

Dietrich, F., and List, C. 2017. Probabilistic Opinion Pooling. Oxford University Press.

Dubois, D.; Liu, W.; Ma, J.; and Prade, H. 2016. The basic principles of uncertain information fusion. An organised review of merging rules in different representation frameworks. Inf. Fusion 32(PA):12–39.

Fassinut-Mombot, B., and Choquel, J. B. 2000. An entropy method for multisource data fusion. In Proceedings of the Third International Conference on Information Fusion, volume 2.

Genest, C., and Zidek, J. V. 1986. Combining probability distributions: A critique and an annotated bibliography. Statistical Science 1(1):114–135.

Grossi, D., and Pigozzi, G. 2014. Judgment Aggregation: A Primer. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers.

Kern-Isberner, G., and Rödder, W. 2004. Belief revision and information fusion on optimum entropy. Int. J. Intell. Syst. 19(9):837–857.

Kern-Isberner, G., and Thimm, M. 2010. Novel semantical approaches to relational probabilistic conditionals. In Proceedings of the 12th KR Conference, 382–392. AAAI Press.

Konieczny, S., and Pérez, R. P. 2011. Logic based merging. J. Philosophical Logic 40(2):239–270.

Levy, W. B., and Delic, H. 1994. Maximum entropy aggregation of individual opinions. IEEE Transactions on Systems, Man, and Cybernetics 24(4):606–613.

List, C. 2012. The theory of judgment aggregation: an introductory review. Synthese 187(1):179–207.

McConway, K. J. 1981. Marginalization and linear opinion pools. Journal of the American Statistical Association 76:410–414.

Mohammad-Djafari, A. 1998. Probabilistic methods for data fusion. Maximum Entropy and Bayesian Methods 57–69.

Myung, I. J.; Ramamoorti, S.; and Bailey, A. 1996. Maximum entropy aggregation of expert predictions. Management Science 42(10):1420–1436.

Paris, J. B. 1999. Common sense and maximum entropy. Synthese 117(1):75–93.

Paris, J. B. 2006. The Uncertain Reasoner's Companion: A Mathematical Perspective. Cambridge University Press.

Shannon, C. E., and Weaver, W. 1949. The Mathematical Theory of Communication. University of Illinois Press.

Shore, J. E., and Johnson, R. W. 1980. Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Trans. Information Theory 26(1):26–37.

Stone, M. 1961. The opinion pool. Annals of Mathematical Statistics 32(4):1339–1342.

Thimm, M., and Kern-Isberner, G. 2012. On probabilistic inference in relational conditional logics. Logic Journal of the IGPL 20(5):872–908.

Wilhelm, M.; Kern-Isberner, G.; and Ecke, A. 2017. Basic independence results for maximum entropy reasoning based on relational conditionals. In Proceedings of the 3rd Global Conference on Artificial Intelligence (GCAI), 36–50.

Wilmers, G., and Jensen, O. E. 2010. The social entropy process: Axiomatising the aggregation of probabilistic beliefs.

Wilmers, G. 2015. A foundational approach to generalising the maximum entropy inference process to the multi-agent context. Entropy 17:594–645.
Stratified disjunctive logic programs and the infinite-valued semantics
Panos Rondogiannis, Ioanna Symeonidou
National and Kapodistrian University of Athens
{prondo, isymeonidou}@di.uoa.gr
Abstract

Numerous adaptations of the well-founded semantics have been proposed for the class of disjunctive logic programs with negation. The proposals are mostly different in philosophy as well as in the results they produce; surprisingly perhaps, the differences occur even in the case of stratified programs. In this paper we focus on one of the most recent of these approaches, namely the disjunctive infinite-valued semantics, and explore its relationship to stratification. We demonstrate some close connections between the tiered structure of a stratified program and the structure of its minimal infinite-valued models. Most importantly, we show that the number of distinct truth values assigned by any minimal infinite-valued model of a stratified program is bounded by the number of strata. In addition, we present some evidence of the approach's affinity to the disjunctive perfect model semantics, such as the similar properties of the models defined by the two semantics and their identical behavior with respect to selected benchmark programs.

1 Introduction

Disjunctive logic programming has been shown to hold more expressive power than traditional logic programming (Eiter and Gottlob 1993; Eiter, Gottlob, and Mannila 1994). As a result, it has drawn a fair amount of attention and a substantial body of work has been produced over the years pertaining to this paradigm. Among other things, great effort has gone into the search for a disjunctive well-founded semantics. After a plethora of variations has been proposed (Ross 1989; Przymusinski 1990; Baral 1992; Brass and Dix 1995; Przymusinski 1995; Wang 2000; Alcântara, Damásio, and Pereira 2005; Cabalar et al. 2007), it seems a consensus has not yet been reached on a widely accepted approach, or even on the criteria by which such an approach should be selected.

It is also noteworthy that one rarely comes across a version of the disjunctive well-founded semantics which is proven to extend the disjunctive perfect model semantics (Przymusinski 1988) of stratified disjunctive programs. In traditional logic programming, stratified programs have always had a special significance, as they are regarded to hold a clear and indisputable meaning, captured by the perfect model semantics (Apt, Blair, and Walker 1988; van Gelder 1988). The two main and established approaches to the semantics of negation, namely the stable model semantics (Gelfond and Lifschitz 1988) and the well-founded semantics (van Gelder, Ross, and Schlipf 1988), both agree with the perfect model semantics when restricted to the class of stratified programs. Yet, most attempts at defining a disjunctive well-founded semantics do not have the analogous relationship with the disjunctive perfect model semantics. The only exception, to our knowledge, is the Stationary Semantics (Przymusinski 1990). Interestingly, Przymusinski later replaced his proposal with the weaker Static Semantics (Przymusinski 1995) and criticized the disjunctive perfect model semantics, reopening the question of the "correct" interpretation of stratified disjunctive programs.

It is in this light that we examine one of the most recent attempts at defining a well-founded semantics for disjunctive programs, namely the infinite-valued semantics (Cabalar et al. 2007). Under this purely model-theoretic approach, the meaning of a program is captured by the set of its minimal infinite-valued models. These models are defined over an expanded truth domain with infinitely many values, which express different degrees of certainty ranging from certainly true or certainly false to undefined. In this paper, we demonstrate how the structure of these models is closely connected to the stratifications of the program, by proving a number of properties. First, we show that these minimal models retain their minimality for every subset of the program defined by a given stratification. Then we argue that the stratum in which each atom is placed limits the degree of uncertainty of its truth value; the atoms in the lower strata are evaluated with greater certainty than the ones in the higher strata. This sets the number of strata as an upper bound on the number of distinct truth values appearing in the minimal infinite-valued models. At the same time, it ensures that the minimal infinite-valued models of a stratified program never assign the undefined truth value, as it also happens with the perfect models. Moreover, we show that the two-valued interpretation produced from a minimal infinite-valued model of a stratified program, by collapsing all true values to T and all false values to F, is always a classical minimal model of the program, another property shared by the perfect models.

In addition, we discuss the behavior of the infinite-valued semantics with respect to a selection of example stratified disjunctive programs borrowed from the available literature. We find that, for all of these programs, there is a one-to-one correspondence between their minimal infinite-valued models and their perfect models, even when some of the other approaches produce different results. Combined with the aforementioned original results, we believe that these observations serve as evidence of the potential equivalence of the infinite-valued semantics and the perfect model semantics and contribute to a better understanding of the semantics of stratified disjunctive programs.

The rest of the paper is organized as follows. Section 2 gives an overview of the major disjunctive well-founded semantics that have been proposed in the past few years and their relationship to the disjunctive perfect model semantics. Section 3 defines the syntax of disjunctive logic programs, as we consider it in this paper, and gives a detailed presentation of their infinite-valued semantics. In Section 4 we recall the definition of stratification for disjunctive programs and in Section 5 we formally present our main results, i.e. we demonstrate the above-mentioned properties of the minimal infinite-valued models of stratified programs. Section 6 empirically compares the infinite-valued semantics to the perfect model semantics based on a curated set of examples. Finally, Section 7 concludes the paper with a discussion of possible future research directions.

2 Generalizations of the well-founded semantics

The generalization of the well-founded semantics to the class of disjunctive logic programs remains to this day an open problem. Despite the numerous approaches that have been proposed over the last three decades, none has succeeded in finding wide acceptance. Moreover, the completely different characterizations of the approaches and the intuitions behind them make direct comparisons a particularly challenging task. As a result, an additional volume of literature has been built around these proposals, investigating their connections and differences and debating the criteria by which they should be judged.

In this section we discuss some of the most popular semantics for disjunctive programs with negation, which are genuine generalizations of the well-founded semantics. We will make mention of results concerning the relationships of these approaches with the perfect model semantics for stratified disjunctive programs (Przymusinski 1988), as well as the relationships among the approaches themselves, whenever such results are available.

Most of the approaches discussed here have been found to be different from each other and, in most cases, the comparisons are limited to observing their different behavior with respect to one or more example programs. However, there are some instances where two semantics are compared in terms of the amount of information they can extract from a program. In (Brass and Dix 1995) a semantics is defined to be weaker than a second semantics (or the second is stronger than the first) if every disjunction of all positive or all negative ground literals which can be derived under the first semantics can also be derived under the second.

One of the first attempts at defining a well-founded semantics for the disjunctive case was the Strong Well-Founded Semantics (Ross 1989), which features a procedural, top-down characterization. The meaning of the program is given by a unique set comprising disjunctions of ground atoms and negations of such disjunctions, all considered to be true with respect to the program. It is shown in (Brass and Dix 1995) that the Strong Well-Founded Semantics is not a generalization of the perfect model semantics.

At around the same time, the Stationary Semantics (Przymusinski 1990; Przymusinski 1991) was introduced. This semantics is equivalently characterized in terms of program completion, iterated minimal models and the least fixed-point of a minimal model operator. It is the only approach, that we are aware of, which is shown to generalize the disjunctive perfect model semantics. Consequently, it is different from the Strong Well-Founded Semantics.

Shortly after, Baral (1992) proposed a disjunctive well-founded semantics (DWFS) with a fixed-point and a procedural characterization. The fixed-point characterization is a generalization of both Przymusinski's characterization of the classical well-founded semantics (1989) and the fixed-point semantics of negationless disjunctive programs (Lobo, Minker, and Rajasekar 1992). The semantics is different from the Strong Well-Founded and the Stationary Semantics, as shown in (Baral 1992). We are not aware of any results regarding its relationship to the disjunctive perfect model semantics, other than that the two semantics agree for certain benchmark programs discussed by the author.

Another proposal, labeled D-WFS, was given by Brass and Dix (1995). It is defined abstractly as the weakest semantics which remains unchanged under a set of elementary program transformations. It is also defined procedurally, by means of a bottom-up query evaluation method, for programs with a finite ground instantiation. The D-WFS is proven in (Brass and Dix 1995) to be strictly weaker than (and therefore not an extension of) the disjunctive perfect model semantics and different from the Strong Well-Founded Semantics.

Przymusinski revisited the subject of disjunctive well-founded semantics in (1995), where he introduced the Static Semantics. The definition of the semantics is based on translating the program into a belief theory, and a fixed-point characterization is given. This approach is strictly weaker than the perfect model semantics and different from the author's earlier approach, the Stationary Semantics (Przymusinski 1995). On the other hand, it is strictly stronger than the D-WFS unless we restrict attention to a common query language, in which case the two semantics coincide (Brass et al. 2001). Przymusinski argued that the weaker nature of his newer semantics makes it a preferable alternative (see Example 5 in Section 6 for more details).

Interest in the issue of disjunctive well-founded semantics continued in the next decade. In (2000), Wang presented a semantic framework based on argumentation-based abduction, which he used to define a number of different semantics, including a generalization of the well-founded semantics. Wang named his proposal the Well-Founded Disjunctive Semantics (WFDS) and in (2001) he showed that it can be equivalently characterized using several different approaches, such as program transformations (an augmentation of the approach of (Brass and Dix 1995)), argumentation, unfounded sets and a bottom-up procedure. He also showed that the semantics is strictly stronger than the D-WFS (2001), and different from the Static (2001) and Stationary Semantics (2000). Its relationship to the perfect model semantics is not examined, but a comparison of example programs from (Brass and Dix 1995) and (Wang 2000) reveals they are incompatible.

The WFSd is another fixed-point approach, presented in (Alcântara, Damásio, and Pereira 2005). The semantics is compared to most of the previous approaches and found to be different from all of them; in particular, it is different from the Strong Well-Founded Semantics, the Stationary and Static Semantics and Wang's WFDS, while it is strictly stronger than the D-WFS. It is also shown that it does not generalize the perfect model semantics.

In (Cabalar et al. 2006), Partial Equilibrium Logic (PEL) was employed as a framework for providing a purely declarative semantics for disjunctive programs. It was shown that the semantics does not agree with the D-WFS, the Static Semantics and Wang's WFDS; in particular, it is neither stronger nor weaker than the former two approaches. A peculiarity that distinguishes it from many approaches is that it does not guarantee the existence of a model for every program. The authors do not attempt a comparison to the perfect model semantics.

In this paper we will focus on the most recent approach, named the Infinite-Valued Semantics Lmin∞ (Cabalar et al. 2007). To our knowledge, this is the only version of a disjunctive well-founded semantics other than PEL with a purely model-theoretic characterization. Proportionately to the semantics of positive programs, the meaning of a program is captured by the set of its minimal models. However, in this case the models are defined over a new logic of infinite truth values, used to signify the decreasing reliability of information obtained through negation-as-failure. In (Cabalar et al. 2007), the approach is compared to and shown to be different from the D-WFS, the Static Semantics, the WFDS, the WFSd and PEL, but it is not compared to the perfect model semantics. More details on the intuitive ideas behind the infinite-valued semantics, as well as a formal definition of Lmin∞, are given in the next section.

3 Disjunctive Programs and the Infinite-Valued Semantics Lmin∞

In this section we present background on the infinite-valued semantics for disjunctive logic programs with negation. We follow closely the presentation of (Cabalar et al. 2007). The authors focus on the class of disjunctive logic programs:

Definition 1. A disjunctive logic program is a finite set of clauses of the form

p1 ∨ · · · ∨ pn ← q1, . . . , qk, ∼r1, . . . , ∼rm

where n ≥ 1 and k, m ≥ 0.

For the sake of simplicity of notation, the authors consider only finite programs, but they note that the results can be lifted to the more general first-order case. We adhere to this convention.

The key idea of the infinite-valued approach is that, in order to give a logical semantics to negation-as-failure and to distinguish it from ordinary negation, one needs to extend the domain of truth values. For example, consider the program:

p ←
r ← ∼p
s ← ∼q

According to negation-as-failure, both p and s receive the value T. However, p seems "truer" than s because there is a rule which says so, whereas s is true only because we are never obliged to make q true. In a sense, s is true only by default. For this reason, it was proposed in (Rondogiannis and Wadge 2005) to introduce a "default" truth value T1 just below the "real" true T0, and (by symmetry) a weaker false value F1 just above ("not as false as") the real false F0. Then, negation-as-failure is a combination of ordinary negation with a weakening. Thus ∼F0 = T1 and ∼T0 = F1. Since negations can be iterated, the new truth domain requires a sequence . . . , T3, T2, T1 of weaker and weaker truth values below T0 but above the neutral value 0, and a mirror image sequence F1, F2, F3, . . . above F0 and below 0. In fact, in (Rondogiannis and Wadge 2005) a Tα and an Fα are introduced for all countable ordinals α; since in this paper we deal with finite propositional programs, we will not need this generality here. The new truth domain V is ordered as follows:

F0 < F1 < · · · < 0 < · · · < T1 < T0

Every truth value in V is associated with a natural number, called the order of the value:

Definition 2. The order of a truth value is defined as follows: ord(Tn) = n, ord(Fn) = n and ord(0) = +∞.

It is straightforward to generalize the notion of interpretation under the prism of our infinite-valued logic. We use HB_P to denote the set of propositional symbols appearing in a given program P, also called the Herbrand base of P:

Definition 3. An (infinite-valued) interpretation of a disjunctive program P is a function from HB_P to the set V of truth values.

If v ∈ V is a truth value, we will use I ∥ v to denote the set of atoms which are assigned the value v by I.

Definition 4. The meaning of a formula with respect to an interpretation I can be defined as follows:

I(∼A) = Tn+1 if I(A) = Fn,   Fn+1 if I(A) = Tn,   and 0 if I(A) = 0
I(A ∧ B) = min{I(A), I(B)}
I(A ∨ B) = max{I(A), I(B)}
I(A ← B) = T0 if I(A) ≥ I(B),   and I(A) if I(A) < I(B)
simplification in this paper.
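To make the evaluation rules of Definition 4 concrete, here is a small Python sketch. The encoding is our own, not the paper's: Fn is represented as ('F', n), Tn as ('T', n), and the undefined value 0 as ('0', 0).

```python
# Encoding of the truth domain V: ('F', n) for Fn, ('T', n) for Tn, ('0', 0) for 0.

def key(v):
    """Sortable key realizing F0 < F1 < ... < 0 < ... < T1 < T0."""
    tag, n = v
    if tag == 'F':
        return (0, n)        # Fn moves up toward 0 as n grows
    if tag == '0':
        return (1, 0)
    return (2, -n)           # Tn moves down toward 0 as n grows

def neg(v):
    """Definition 4: I(~A) is T(n+1) if I(A) = Fn, F(n+1) if I(A) = Tn, 0 if I(A) = 0."""
    tag, n = v
    if tag == 'F':
        return ('T', n + 1)
    if tag == 'T':
        return ('F', n + 1)
    return v

def conj(v, w):              # I(A ∧ B) = min{I(A), I(B)}
    return min(v, w, key=key)

def disj(v, w):              # I(A ∨ B) = max{I(A), I(B)}
    return max(v, w, key=key)

def impl(head, body):        # I(A <- B) = T0 if I(A) >= I(B), and I(A) otherwise
    return ('T', 0) if key(head) >= key(body) else head
```

For instance, neg(('F', 0)) yields ('T', 1), reflecting that ∼A holds "with less confidence" when A is false with full confidence.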
The notion of satisfiability of a clause can now be defined:
Definition 5. Let P be a program and I an interpretation.
Then, I satisfies a clause
p1 ∨ · · · ∨ pn ← q1 , . . . , qk , ∼r1 , . . . , ∼rm
if I(p1 ∨ · · · ∨ pn ) ≥ I(q1 , . . . , qk , ∼r1 , . . . , ∼rm ). Moreover, I is a model of P if I satisfies all clauses of P.
The semantics Lmin∞ is a minimal model semantics. This implies we need a partial order on the set of interpretations:
Definition 6. Let I, J be interpretations and n < ω. We write I =n J, if for all k ≤ n, I ∥ Tk = J ∥ Tk and I ∥ Fk = J ∥ Fk. We write I ⊑n J, if for all k < n, I =k J and, moreover, I ∥ Tn ⊆ J ∥ Tn and I ∥ Fn ⊇ J ∥ Fn. We write I ⊏n J, if I ⊑n J but I =n J does not hold.
Definition 7. Let I, J be interpretations. We write I ⊏∞ J, if there exists n < ω (that depends on I and J) such that I ⊏n J. We write I ⊑∞ J if I = J or I ⊏∞ J.
It is easy to see that the relation ⊑∞ on the set of interpretations is a partial order (i.e., it is reflexive, transitive and
antisymmetric). On the other hand, for every n < ω, the
relation ⊑n is a preorder (i.e., reflexive and transitive).
In comparing two interpretations I and J we consider first
only those propositional symbols assigned “standard” truth
values (T0 or F0 ) by at least one of the two interpretations.
If I assigns T0 to a particular symbol and J does not, or J
assigns F0 to a particular symbol and I does not, then we
can rule out I ⊑∞ J. Conversely, if J assigns T0 to a particular variable and I does not, or I assigns F0 to a particular
variable and J does not, then we can rule out J ⊑∞ I. If
both these conditions apply, we can immediately conclude
that I and J are incomparable. If exactly one of these conditions holds, we can conclude that I ⊑∞ J or J ⊑∞ I,
as appropriate. However, if neither applies, then I and J are
equal in terms of standard truth values; they both assign T0
to each of one group of propositional symbols and F0 to each
of another. In this case we must now examine the symbols
assigned F1 or T1 . If this examination proves inconclusive,
we move on to T2 and F2 , and so on. Thus ⊑∞ gives the
standard truth values the highest priority, T1 and F1 the next
priority, T2 and F2 the next, and so on.
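The level-by-level scan just described can be sketched in Python. The encoding is our own: an interpretation maps atoms to ('T', n), ('F', n) or ('0', 0), and I and J are assumed to share the same atoms.

```python
def lt_inf(I, J, max_order=10):
    """A sketch of I ⊏∞ J (Definitions 6 and 7): scan orders 0, 1, 2, ...;
    at the first order n where the Tn- or Fn-sets differ, I precedes J iff
    I's Tn-set shrinks and I's Fn-set grows relative to J's."""
    atoms = set(I) | set(J)
    for n in range(max_order + 1):
        Tn_I = {a for a in atoms if I[a] == ('T', n)}
        Tn_J = {a for a in atoms if J[a] == ('T', n)}
        Fn_I = {a for a in atoms if I[a] == ('F', n)}
        Fn_J = {a for a in atoms if J[a] == ('F', n)}
        if Tn_I == Tn_J and Fn_I == Fn_J:
            continue                    # equal at order n: examine order n + 1
        return Tn_I <= Tn_J and Fn_I >= Fn_J
    return False                        # equal at every inspected order

# J assigns T0 to q where I assigns F0, so I is strictly below J:
I = {"p": ("T", 0), "q": ("F", 0)}
J = {"p": ("T", 0), "q": ("T", 0)}
print(lt_inf(I, J))   # True
```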
These ideas are illustrated by the following example:
Example 1. Consider the program:
work ∨ play ← ∼rest
play ∨ rest ←
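The minimal models of this program can be recovered by brute-force search, sketched below in our own encoding. By Theorem 2, truth values of order below |HB_P| = 3 suffice for minimal models, so the search space is finite.

```python
# Brute-force enumeration of the minimal infinite-valued models of the
# program {work ∨ play <- ~rest ; play ∨ rest <-} (a sketch, our encoding).
from itertools import product

def key(v):
    tag, n = v
    return (0, n) if tag == 'F' else ((1, 0) if tag == '0' else (2, -n))

def neg(v):
    tag, n = v
    return ('T', n + 1) if tag == 'F' else (('F', n + 1) if tag == 'T' else v)

atoms = ["work", "play", "rest"]
program = [(["work", "play"], [], ["rest"]),   # work ∨ play <- ~rest
           (["play", "rest"], [], [])]         # play ∨ rest <-

def is_model(I):
    """Definition 5: max over the head >= min over the body, for every clause."""
    for heads, pos, negs in program:
        body = [I[q] for q in pos] + [neg(I[r]) for r in negs]
        body_val = min(body, key=key) if body else ('T', 0)
        if key(max((I[p] for p in heads), key=key)) < key(body_val):
            return False
    return True

def lt_inf(I, J):
    """I ⊏∞ J: decide at the first order whose Tn- or Fn-sets differ."""
    for n in range(len(atoms)):
        TI = {a for a in atoms if I[a] == ('T', n)}
        TJ = {a for a in atoms if J[a] == ('T', n)}
        FI = {a for a in atoms if I[a] == ('F', n)}
        FJ = {a for a in atoms if J[a] == ('F', n)}
        if TI != TJ or FI != FJ:
            return TI <= TJ and FI >= FJ
    return False

values = [('0', 0)] + [(t, n) for n in range(len(atoms)) for t in ('F', 'T')]
models = [dict(zip(atoms, vs)) for vs in product(values, repeat=3)
          if is_model(dict(zip(atoms, vs)))]
minimal = [M for M in models if not any(lt_inf(N, M) for N in models)]
```

Running the search yields exactly the three minimal models listed in the text.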
The minimal models of the above program are:
M1 = {(play, T0), (rest, F0), (work, F0)}
M2 = {(play, F0), (rest, T0), (work, F1)}
M3 = {(play, F1), (rest, T0), (work, F0)}.
From the above discussion it should now be clear that the infinite-valued semantics of a disjunctive logic program with negation is captured by the set of ⊑∞-minimal infinite-valued models of the program. One of the major results of (Cabalar et al. 2007) is that this set is always non-empty:
Theorem 1. Every disjunctive logic program with negation has a non-empty set of minimal infinite-valued models.
Moreover, the set of minimal models of a program is finite. This is an immediate consequence of the following theorem from (Cabalar et al. 2007), which shows that the maximum order of truth values assigned by a minimal model M is bounded by the number of propositional symbols appearing in the program.
Theorem 2. Let P be a program and let M be a minimal infinite-valued model of P. For every propositional symbol p ∈ HB_P, M(p) ∈ {0, F0, T0, . . . , F|HB_P|−1, T|HB_P|−1}.
In Section 5, we will show that in the case of stratified programs, this bound can be improved.
In the special case of traditional non-disjunctive or normal programs, the number of minimal models is reduced to exactly one. This unique minimum model is equivalent to the well-founded model of the program. This is one of the main results of (Rondogiannis and Wadge 2005) and is summarized in Theorem 3 below. The notion of collapsing an infinite-valued interpretation into a three-valued one will be useful in formally stating the theorem:
Definition 8. Let P be a program and let I be an infinite-valued interpretation of P. We denote by Col(I) the three-valued interpretation obtained from I by collapsing each Ti to T and each Fi to F.
We may also say that I collapses to a three-valued interpretation I′, if I′ = Col(I).
Theorem 3. Every normal logic program with negation P has a unique minimum infinite-valued model, which collapses to the well-founded model of P.
Returning to disjunctive programs, another important result of (Cabalar et al. 2007) is that the semantics Lmin∞ extends the minimal model semantics for (negation-less) disjunctive logic programs:
Theorem 4. Let P be a disjunctive program which does not contain negation. If we identify the value T0 with T and the value F0 with F, then
1. If M is a minimal classical model of P, then it is also a minimal infinite-valued model of P.
2. If M is a minimal infinite-valued model of P, then it assigns to every propositional symbol in P a value of order 0 and it is a minimal classical model of P.

4 Stratification

In classical logic programming, stratification was first defined by Apt, Blair and Walker (1988) and, independently, by van Gelder (1988). The definition was generalized to apply to disjunctive programs by Przymusinski (1988).
A stratified program allows us to determine a priority ordering of its propositional symbols, through which we evaluate a symbol only after having evaluated all of its dependencies through negation. In other words, stratified programs do not display circular dependencies through negation.
Definition 9. A program P is stratified, if it is possible to decompose HB_P into disjoint sets S0, S1, . . . , Sr, called strata, so that for every clause p1 ∨ · · · ∨ pn ← q1, . . . , qk, ∼r1, . . . , ∼rm in P we have that:
1. the propositional symbols p1, . . . , pn belong to the same stratum, say Sl;
2. the propositional symbols q1, . . . , qk belong to ∪{Sj : j ≤ l};
3. the propositional symbols r1, . . . , rm belong to ∪{Sj : j < l}.
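The three conditions of Definition 9 are easy to check mechanically for a given decomposition. A sketch in our own encoding: a clause is a triple of head atoms, positive body atoms and negated body atoms, and S maps each propositional symbol to its stratum index.

```python
def is_stratification(program, S):
    """Check the three conditions of Definition 9 for the decomposition S."""
    for heads, pos, neg in program:
        l = S[heads[0]]
        if any(S[p] != l for p in heads):   # 1. head atoms share one stratum
            return False
        if any(S[q] > l for q in pos):      # 2. positive body atoms: stratum <= l
            return False
        if any(S[r] >= l for r in neg):     # 3. negated body atoms: stratum < l
            return False
    return True

# The first program of Example 2 below, {work ∨ play <- ~rest ; work <- workday},
# with S0 = {rest, workday} and S1 = {work, play}:
prog = [(["work", "play"], [], ["rest"]), (["work"], ["workday"], [])]
print(is_stratification(prog, {"rest": 0, "workday": 0, "work": 1, "play": 1}))  # True
```

Trying all decompositions of the (finite) Herbrand base then decides stratifiability of a propositional program.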
Example 2. The program
work ∨ play ← ∼rest
work ← workday
is stratified and a possible stratification is S0 = {rest, workday}, S1 = {work, play}. The program of Example 1 is not stratified, because the first requirement of Definition 9 demands that play and rest are in the same stratum (due to the second clause), while at the same time the third requirement demands that rest be placed at a lower stratum than play (due to the first clause). Finally, the program
work ∨ play ← ∼rest
rest ∨ play ← ∼work
is not stratified, as there exist circular dependencies through negation.
The class of stratified programs is particularly interesting, as they have some nice semantic properties. Most importantly, the two major semantic approaches for general programs, namely the well-founded semantics (van Gelder, Ross, and Schlipf 1988) and the stable model semantics (Gelfond and Lifschitz 1988), coincide for this class of programs, giving a unique two-valued minimum model (also called the perfect model (Apt, Blair, and Walker 1988; van Gelder 1988)) for every stratified program.
When examining first-order programs, the notion of stratification can be extended to define the significantly more general but equally "unproblematic" (from the semantics point of view) class of locally stratified programs (Przymusinski 1988). However, the class of locally stratified programs coincides with that of stratified programs when we restrict attention to finite propositional programs, as we do here, so we need not consider it in this paper.

5 Properties of the Infinite-Valued Models of Stratified Programs

In this section we present the main results of this paper. In particular, we study the relationship between the structure of a stratified program and the structure of its minimal models and show that these models collapse to minimal two-valued models. As in the previous section, we restrict attention to finite programs.
The next definition introduces notation that will be useful in stating the results of this section.
Definition 10. Let P be a stratified program, {S0, . . . , Sr} be a stratification of P and I be an interpretation of P.
• We use Pi, 0 ≤ i ≤ r, to denote the set of clauses whose head consists of propositional symbols in Si. We use P^i to denote the set ∪{Pk : k ≤ i}.
• We use HB^i, 0 ≤ i ≤ r, to denote the set of propositional symbols appearing in the clauses of P^i.
• We use I^i, 0 ≤ i ≤ r, to denote the subset of I that assigns values only to the propositional symbols in HB^i.
• We define a function S, with S(p) = n if p ∈ Sn.
From the above definition and the definition of stratification, it becomes obvious that the clauses in a set P^n cannot contain propositional symbols belonging to the strata above Sn. In this sense, we might say that P^n defines a "self-contained" subset of the program. Moreover, HB^n is the Herbrand base and I^n is a valid interpretation for this subset. The next lemma, which will be useful in proving our main result of Theorem 5, states that every minimal model of a stratified program is also a minimal model of every such subset of the program.
Lemma 1. Let P be a stratified program, {S0, . . . , Sr} be a stratification of P and M be a minimal model of P. For any n, 0 ≤ n ≤ r, M^n is a minimal model of P^n.
Proof. By definition, for each n = 1 . . . r, M^n and M assign the same truth values to all propositional symbols in HB^n, so M^n satisfies a clause in P^n iff M satisfies the same clause. It follows that M^n is a model of P^n for all n = 1 . . . r, so it suffices to show that M^n is minimal.
For the sake of contradiction, assume that M^n is not a minimal model of P^n. Then there is a model N^n of P^n such that N^n ⊏∞ M^n. Assume more specifically that N^n ⊏k M^n for some natural number k. We define the interpretation N of P as follows: for each propositional symbol p ∈ HB_P,

N(p) = N^n(p), if S(p) ≤ n and ord(N^n(p)) ≤ k
       M(p),   if S(p) > n and ord(M(p)) ≤ k
       Tk+1,   otherwise

We will show that:
1. N ⊏k M and
2. N is a model of the program P.
Obviously, this will contradict the assumption that M is a minimal model of P and prove that N^n cannot exist.
First we show statement 1: we have that N^n ⊏k M^n ⇒ N^n =k−1 M^n. So, for all p such that S(p) ≤ n and ord(M(p)) < k (or, equivalently, ord(N^n(p)) < k) we have that M(p) = M^n(p) = N^n(p) = N(p) (from the construction of N). Moreover, for all p such that S(p) > n and ord(M(p)) ≤ k, N(p) = M(p). One can easily see that M =k−1 N and that, for order k, the two interpretations differ exactly for the same atoms as M^n and N^n. Thus we conclude that N ⊏k M.
Now we show statement 2: we need to prove that N satisfies every clause p1 ∨ · · · ∨ pnh ← Q1, . . . , Qnb in P.
If the clause is in P^n, then every atom appearing in the clause is in ∪{Si : i ≤ n}. Assume that, among the atoms in the head of the clause, N^n assigns the maximum value to some pi. Also assume that, among the literals in the body of the clause, N^n assigns the minimum value to some Qj of the form q or the form ∼q. Because N^n is a model of P^n:

N^n(pi) = max{N^n(p1), · · · , N^n(pnh)}
        = N^n(p1 ∨ · · · ∨ pnh)
        ≥ N^n(Q1, . . . , Qnb)
        = min{N^n(Q1), . . . , N^n(Qnb)}
        = N^n(Qj)
Similarly, we have:
N(p1 ∨ · · · ∨ pnh) = max{N(p1), · · · , N(pnh)} ≥ N(pi)
and
N(Qj) ≥ min{N(Q1), . . . , N(Qnb)} = N(Q1, . . . , Qnb)
This suggests that, if we can show N(pi) ≥ N(Qj), then we will have shown that N satisfies the clause. We distinguish the following cases:
A. If ord(N^n(pi)) ≤ k:
In this case, N^n(pi) ≤ Fk or Tk ≤ N^n(pi). Also, by the construction of N, N(pi) = N^n(pi).
A.1. If ord(N^n(q)) ≤ k:
Again, by the construction of N, N^n(q) = N(q). Because N^n is a model of P^n, N(Qj) = N^n(Qj) ≤ N^n(pi) = N(pi), whether Qj = q or Qj = ∼q.
A.2. If ord(N^n(q)) > k:
In this case, Fk < N^n(q) < Tk, N(q) = Tk+1 and N(∼q) = Fk+2. Because N^n is a model of P^n and therefore N^n(Qj) ≤ N^n(pi), it is not possible that N^n(pi) ≤ Fk. It follows that N(Qj) < Tk ≤ N^n(pi) = N(pi), whether Qj = q or Qj = ∼q.
B. If ord(N^n(pi)) > k:
In this case, Fk < N^n(pi) < Tk and N(pi) = Tk+1.
B.1. If ord(N^n(q)) ≤ k:
This assumption makes N^n(q) ≤ Fk or Tk ≤ N^n(q) and, by the construction of N, N(q) = N^n(q). Also, N(∼q) = N^n(∼q) ≤ Fk+1 or Tk+1 ≤ N^n(∼q) = N(∼q). However, if Qj = q, then N^n(Qj) ≤ N^n(pi) and N^n(pi) < Tk imply that only N^n(q) ≤ Fk can hold. Then, N(q) = N^n(q) ≤ Fk < Tk+1 = N(pi). Similarly, if Qj = ∼q, then N^n(Qj) ≤ N^n(pi) and N^n(pi) < Tk imply that either N(∼q) = N^n(∼q) ≤ Fk+1 or N(∼q) = N^n(∼q) = Tk+1. Obviously, N(∼q) ≤ N(pi) for all these values.
B.2. If ord(N^n(q)) > k:
By the construction of N, N(q) = Tk+1; then N(∼q) = Fk+2. Again, N(Qj) ≤ N(pi) holds for either possible form of Qj.
We have shown that N(Qj) ≤ N(pi) in every case and thus, that N satisfies every clause in P^n.
Let us now examine a clause not in P^n. This time we consider the values assigned by M (as opposed to N^n) and follow a similar reasoning as before. So again we assume that, among the atoms in the head of the clause, M assigns the maximum value to some pi. Also we assume that, among the literals in the body of the clause, M assigns the minimum value to some Qj of the form q or the form ∼q. We now have M(Qj) ≤ M(pi) and, as in the previous case, to show that N satisfies the clause, it suffices to show that N(Qj) ≤ N(pi).
Observe that, if S(q) > n then N(Qj) ≤ N(pi) can be shown in exactly the same way as in the previous case, by simply substituting M for N^n. If on the other hand S(q) ≤ n, we need to follow a slightly different line of argument. We distinguish the following cases:
A. If ord(M(pi)) ≤ k:
In this case, M(pi) ≤ Fk or Tk ≤ M(pi). Also, by the construction of N, N(pi) = M(pi).
A.1. If ord(M(q)) < k:
We have constructed N so that N =k−1 M, therefore N(q) = M(q). Because M is a model of P, N(Qj) = M(Qj) ≤ M(pi) = N(pi), whether Qj = q or Qj = ∼q.
A.2. If ord(M(q)) ≥ k:
Because N ⊏k M, if M(q) = Fk it must also be N(q) = Fk. Then we have N(Qj) = M(Qj) ≤ M(pi) = N(pi), whether Qj = q or Qj = ∼q.
If Fk < M(q) ≤ Tk, then Fk+1 ≤ M(∼q) < Tk+1. Consequently, whether Qj = q or Qj = ∼q, the conditions Fk < M(q) ≤ Tk, M(Qj) ≤ M(pi) and M(pi) ≤ Fk cannot all hold at the same time, so it must be Tk ≤ M(pi) = N(pi) in this case. By N ⊏k M and the construction of N, N(q) ∈ {Fk, Tk+1, Tk} and so N(∼q) ∈ {Fk+1, Fk+2, Tk+1}. Observe that N(Qj) ≤ Tk ≤ M(pi) = N(pi) holds for all three possible values of N(q).
B. If ord(M(pi)) > k:
In this case, Fk < M(pi) < Tk and N(pi) = Tk+1.
B.1. If ord(M(q)) < k:
This assumption translates to M(q) < Fk or Tk < M(q) and, by N =k−1 M, N(q) = M(q). Also, N(∼q) = M(∼q) < Fk+1 or Tk+1 < M(∼q) = N(∼q). However, if Qj = q, then M(Qj) ≤ M(pi) and M(pi) < Tk imply that only M(q) < Fk can hold. Then, N(q) = M(q) < Fk < Tk+1 = N(pi). Similarly, if Qj = ∼q, then M(Qj) ≤ M(pi) and M(pi) < Tk imply that either N(∼q) = M(∼q) ≤ Fk+1 or N(∼q) = M(∼q) = Tk+1. Obviously, N(∼q) ≤ N(pi) for all these values.
B.2. If ord(M(q)) = k:
In this case M(q) = Fk or M(q) = Tk, while M(∼q) = Tk+1 or M(∼q) = Fk+1 respectively. If Qj = q, then M(Qj) ≤ M(pi) and M(pi) < Tk imply that only M(q) = Fk can hold. Then, N ⊏k M suggests that N(Qj) = N(q) = Fk < Tk+1 = N(pi). On the other hand, if Qj = ∼q, then M(q) = Fk and M(q) = Tk both allow M to satisfy the clause. By N ⊏k M and the construction of N, N(q) ∈ {Fk, Tk+1, Tk} and so N(∼q) ∈ {Fk+1, Fk+2, Tk+1}. Observe that N(∼q) ≤ Tk+1 = N(pi) holds for all three possible values of N(q).
B.3. If ord(M(q)) > k:
By the construction of N, N(q) = Tk+1; then N(∼q) = Fk+2. Again, N(Qj) ≤ N(pi) holds for either possible form of Qj.
We have shown that N(Qj) ≤ N(pi) in every case and thus, that N satisfies the clauses of P that are not in P^n, as well as those that are in P^n. In conclusion, we have shown that N is a model of P which satisfies N ⊏k M, i.e. we have reached a contradiction. This proves that M^n must be a minimal model of P^n.
The above lemma shows there is a relationship between any arbitrary stratification of a given program and the structure of the program's minimal models. In the following we will explore another, more obvious aspect of this relationship, which highlights the proximity of the core ideas behind stratification and the infinite-valued semantics.
In every clause of a stratified program, the propositional symbols appearing in a negative form have to be placed in a lower stratum than the stratum in which we place the propositional symbols of the clause head. Because of this, P0, as defined by any stratification of a program P, is a positive program. As such, Theorem 4 of Section 3 states that its minimal models assign only truth values of order 0 to all atoms. The next theorem demonstrates that this correlation between the stratum and the order of truth values assigned by the minimal model extends to all strata of the program. In this way, it sets the number of strata as an improved bound (compared to the one set by Theorem 2 for the general case) on the order of truth values in the minimal models of a stratified program.
Theorem 5. Let P be a stratified program, {S0, . . . , Sr} be a stratification of P and M be a minimal infinite-valued model of P. For every propositional symbol p appearing in P, M(p) ∈ {F0, T0, . . . , Fr, Tr}.
Proof. The proof is by induction on the strata of the program.
Base case: For S0, P0 is a positive program and, by Lemma 1, M^0 is a minimal model of P^0. From the second part of Theorem 4, M^0 assigns to the propositional symbols in HB^0 values in the set {F0, T0}.
Induction step: We will show that M^n assigns to the propositional symbols in HB^n values in the set {F0, T0, . . . , Fn, Tn}, assuming that for all m < n, M^m assigns to the propositional symbols in HB^m values in the set {F0, T0, . . . , Fm, Tm}.
Assume that there exists some propositional symbol p ∈ HB^n, such that ord(M^n(p)) > n. We construct an interpretation N^n as follows:

N^n(p) = Fn,     if ord(M^n(p)) > n
         M^n(p), otherwise

It is simple to see that N^n ⊏n M^n and we will show that N^n is a model of P^n. This constitutes a contradiction, since by Lemma 1 M^n is a minimal model of P^n. The contradiction renders impossible the existence of atoms q ∈ HB^n such that ord(M^n(q)) > n.
Every clause in P^n is of the form:
p1 ∨ · · · ∨ pnh ← Q1, . . . , Qnb
We examine the possible truth values that M^n and N^n assign to the propositional symbols in the above arbitrary clause and show that N^n satisfies the clause in every case. Assume that, among the propositional symbols in the head of the clause, M^n assigns the maximum truth value to some pi. We distinguish the following cases:
A. Among the literals in the body of the clause, M^n assigns the minimum value to some Qj of the form ∼q. Observe that, because q appears in a negative literal in a clause of P^n, it must be that S(q) = n′ for some n′ < n. Then, by the induction hypothesis, M^n′(q) ∈ {F0, T0, . . . , Fn′, Tn′}. Also, M^n′(q) = M^n(q) by definition, which eventually makes ord(M^n(q)) < n. In this case, N^n(q) = M^n(q).
Since M^n is a model of P^n and so satisfies the clause, it must be
M^n(pi) ≥ M^n(∼q)    (1)
A.1. If M^n(q) = Fm ⇒ M^n(∼q) = Tm+1, m < n:
M^n(pi) ≥ M^n(∼q) ⇒ M^n(pi) ≥ Tm+1 ⇒ ord(M^n(pi)) ≤ m + 1 ≤ n ⇒ N^n(pi) = M^n(pi)
The latter relation, in conjunction with N^n(q) = M^n(q) and M^n(pi) ≥ M^n(∼q), yields N^n(pi) ≥ N^n(Qj).
A.2. If M^n(q) = Tm ⇒ M^n(∼q) = Fm+1, m < n:
M^n(pi) ≥ M^n(∼q) ⇒ M^n(pi) ≥ Fm+1 ⇒

N^n(pi) = Fn,      if Fn < M^n(pi) < Tn
          M^n(pi), if Fm+1 ≤ M^n(pi) ≤ Fn or Tn ≤ M^n(pi)

The latter relation, in conjunction with N^n(q) = M^n(q) and Fm+1 ≤ Fn, gives us N^n(pi) ≥ N^n(Qj).
B. Among the literals in the body of the clause, M^n assigns the minimum value to some Qj of the form q. Since M^n is a model of the program, it must satisfy the clause, i.e. M^n(pi) ≥ M^n(Qj) = M^n(q). We have:
B.1. If Fn < M^n(q) < Tn, then N^n(q) = Fn. Moreover:
M^n(pi) ≥ M^n(q) ⇒ M^n(pi) > Fn ⇒

N^n(pi) = Fn,      if Fn < M^n(pi) < Tn
          M^n(pi), if M^n(pi) ≥ Tn

This makes N^n(pi) ≥ N^n(q) in every case.
B.2. If M^n(q) = Tm, m ≤ n, then N^n(q) = M^n(q). Moreover:
M^n(pi) ≥ M^n(q) ⇒ M^n(pi) ≥ Tm ≥ Tn ⇒ N^n(pi) = M^n(pi)
Therefore M^n(pi) ≥ M^n(q) ⇒ N^n(pi) ≥ N^n(q) = N^n(Qj).
B.3. If M^n(q) = Fm, m ≤ n, then N^n(q) = M^n(q). Moreover:
M^n(pi) ≥ M^n(q) ⇒ M^n(pi) ≥ Fm ⇒

N^n(pi) = Fn,      if Fn < M^n(pi) < Tn
          M^n(pi), if M^n(pi) ≤ Fn or M^n(pi) ≥ Tn

Therefore N^n(pi) ≥ N^n(q).
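Theorem 5's conclusion is easy to test mechanically for a concrete model. A sketch in our own encoding of truth values as ('T', n), ('F', n) and ('0', 0):

```python
def order(v):
    """Definition 2: ord(Tn) = ord(Fn) = n, ord(0) = infinity."""
    tag, n = v
    return float('inf') if tag == '0' else n

def within_bound(M, r):
    """Theorem 5's conclusion: every assigned value is some Fi or Ti with i <= r
    (in particular, the undefined value 0 never appears)."""
    return all(order(v) <= r for v in M.values())

# Example 3 below has strata S0 = {q, r}, S1 = {p}; one of its minimal models
# uses only values of order <= 1, as the theorem requires:
M2 = {"p": ("T", 1), "q": ("F", 0), "r": ("T", 0)}
print(within_bound(M2, 1))   # True
```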
Naturally, the above theorem also applies to the class of normal (non-disjunctive) logic programs, since this is included in the class of disjunctive programs:
Corollary 1. Let P be a stratified normal program, {S0, . . . , Sr} be a stratification of P and M be the minimum infinite-valued model of P. For every propositional symbol p appearing in P, M(p) ∈ {F0, T0, . . . , Fr, Tr}.
Note that a similar bound on the order, or multitude, of truth values was never given in (Rondogiannis and Wadge 2005), so Corollary 1 is in itself a novel result. On the other hand, the fact that the minimum infinite-valued model of a stratified normal program does not contain the undefined truth value was already implied by Theorem 3, as it is known that the well-founded model coincides with the perfect model for (locally) stratified programs. The exclusion of the undefined truth value from the intended models of a stratified program illustrated by Theorem 5 is a trademark characteristic of the semantics of stratified programs in traditional logic programming, retained by the disjunctive perfect model semantics (but not by some other disjunctive semantics).
The following lemma reveals that the perfect models and minimal infinite-valued models of stratified programs have another quality in common, in stating that the minimal infinite-valued models of stratified programs in fact correspond to traditional minimal models.
Lemma 2. Let P be a stratified disjunctive logic program with negation and let M be a minimal infinite-valued model of P. Then, Col(M) is a minimal classical model of P.
Proof. Assume that Col(M) is not minimal. Then, there exists a model N′ of P with N′ < Col(M). Notice that Col(M) is two-valued by Theorem 5, but N′ need not necessarily be two-valued. We construct the following infinite-valued interpretation:

N(p) = M(p), if Col(M)(p) = N′(p)
       0,    if N′(p) < Col(M)(p)

Notice that N < M and N ⊑∞ M. We claim that N is a model of P. Assume it is not. Then, there exists a clause:
p1 ∨ · · · ∨ pn ← q1, . . . , qk, ∼r1, . . . , ∼rm
such that N(p1 ∨ · · · ∨ pn) < N(q1, . . . , qk, ∼r1, . . . , ∼rm). We distinguish cases based on the values of N(p1 ∨ · · · ∨ pn) and N(q1, . . . , qk, ∼r1, . . . , ∼rm).
A. N(p1 ∨ · · · ∨ pn) = Fi and N(q1, . . . , qk, ∼r1, . . . , ∼rm) = Tj for some i, j. But then, N(pi) < 0 for all i; therefore N′(pi) = F for all i and thus N′(p1 ∨ · · · ∨ pn) = F. Moreover N(qi) > 0 for all i and N(ri) < 0 for all i. Therefore, N′(qi) = T for all i and N′(ri) = F for all i, and consequently N′(q1, . . . , qk, ∼r1, . . . , ∼rm) = T. This means that N′ is not a model of P (contradiction).
B. N(p1 ∨ · · · ∨ pn) = Fi and N(q1, . . . , qk, ∼r1, . . . , ∼rm) = 0. But then, N′(p1 ∨ · · · ∨ pn) = F, and since N′ is a model of P, it must also be N′(q1, . . . , qk, ∼r1, . . . , ∼rm) = F. This means that either for some qi we have N′(qi) = F or for some ri we have N′(ri) = T. The latter case is not possible because it would imply that N(ri) > 0, i.e., N(∼ri) < 0, and therefore N(q1, . . . , qk, ∼r1, . . . , ∼rm) < 0. Therefore, for some qi it is N′(qi) = F while N(qi) > F. This implies that Col(M)(qi) = T and therefore Col(M)(q1, . . . , qk, ∼r1, . . . , ∼rm) = T. However, since N′(p1 ∨ · · · ∨ pn) = F, we also have Col(M)(p1 ∨ · · · ∨ pn) = F, and thus Col(M)(p1 ∨ · · · ∨ pn) < Col(M)(q1, . . . , qk, ∼r1, . . . , ∼rm). This is a contradiction because Col(M) is a model of P.
C. N(p1 ∨ · · · ∨ pn) = Fi and N(q1, . . . , qk, ∼r1, . . . , ∼rm) = Fj with i < j. Then, M(p1 ∨ · · · ∨ pn) = Fi and M(q1, . . . , qk, ∼r1, . . . , ∼rm) = Fj, which is a contradiction because M is a model of P.
D. N(p1 ∨ · · · ∨ pn) = 0 and N(q1, . . . , qk, ∼r1, . . . , ∼rm) = Ti. Since N(p1 ∨ · · · ∨ pn) = 0, there exists some pi such that N(pi) = 0, and therefore N′(pi) < Col(M)(pi) = T. This means that N′(pi) ≤ 0 and thus N′(p1 ∨ · · · ∨ pn) ≤ 0. However, since N(q1, . . . , qk, ∼r1, . . . , ∼rm) = Ti, we get that N′(q1, . . . , qk, ∼r1, . . . , ∼rm) = T. This is a contradiction because N′ is a model of P.
E. N(p1 ∨ · · · ∨ pn) = Ti and N(q1, . . . , qk, ∼r1, . . . , ∼rm) = Tj with i > j. Then, M(p1 ∨ · · · ∨ pn) = Ti and M(q1, . . . , qk, ∼r1, . . . , ∼rm) = Tj, which is a contradiction because M is a model of P.
In conclusion, N is a model of P and N ⊏∞ M. This is a contradiction because we have assumed that M is a ⊑∞-minimal model of P.

6 Comparing the Semantics of Stratified Programs

In this section we revisit some examples of stratified programs presented in the literature for the purpose of comparing various semantics, including the perfect model semantics. We find that the Infinite-Valued Semantics Lmin∞ behaves equivalently to the perfect model semantics in every case. We also give a summary of the more general comparison results mentioned in Section 2, supplemented by a few additional observations drawn from this section.
We begin with this example from (Brass and Dix 1994):
Example 3. Consider the following stratified program:
p ← q
p ← ∼q
q ∨ r ←
The only possible stratification for this program is S0 = {q, r}, S1 = {p}. This program has two minimal infinite-valued models: M1 = {(p, T0), (q, T0), (r, F0)} and M2 = {(p, T1), (q, F0), (r, T0)}. It also has two perfect models, N1 = {p, q} and N2 = {p, r}. The two semantics give equivalent results: M1 collapses to N1 and M2 to N2.
The next example (again from (Brass and Dix 1994)) was used to demonstrate that neither the Strong Well-Founded Semantics (Ross 1989) nor the D-WFS (Brass and Dix 1995) coincide with the perfect model semantics. The same
Static
6=
6=
6=
WFDS
WFSd
PEL
Lmin
∞
Perfect
≶
6=
6=
6=
≈
6=
6=
6=
6=
≶
>
≈
>
>
≶
6=
>
6=
6=
≶
6=
>
6=
6=
6=
6=
6=
6=
6=
6=
Lmin
∞
PEL
6=
WFSd
6=
6
=
WFDS
DWFS
≶
6=
≶
Static
The Strong Well-Founded Semantics, the D-WFS and Wang’s
WFDS all leave r undefined, while the DWFS of (Baral
1992), the Stationary and the WFSd all allow to conclude r. The program has two perfect models, N1 =
{p, r} and N2 = {q, r}, and r is true in both of
them. Again, there exists a one-to-one equivalence between the perfect models and the minimal infinite-valued
models M1 = {(p, T0 ), (q, F0 ), (r, T1 )} and M2 =
{(p, F0 ), (q, T0 ), (r, T1 )}.
As mentioned in Section 2, the Static Semantics is based
on translating the program into a belief theory, i.e. a set of
rules featuring the belief operator B. The following example
is used in (1995) to justify the weak nature of the semantics:
Example 5. Consider the following stratified program:
Strong
Stationary
DWFS
D-WFS
D-WFS
p∨q←
r ← ∼p
r ← ∼q
Stationary
Strong
example is also discussed in (Baral 1992), (Wang 2000)
and (Alcântara, Damásio, and Pereira 2005).
Example 4. Consider the following stratified program:
Table 1: Relationships among disjunctive well-founded semantics
Table 1 summarizes the relationships among the discussed
generalizations of the well-founded semantics, which are either stated in or inferred (by juxtaposing examples) from the
cited references and this section. We use the symbol > to
denote that the semantics that defines the respective row of
the table is stronger than the semantics that defines the respective column. The symbol 6= simply denotes that the two
semantics are different in general, while ≶ is used when we
know that neither semantics is stronger or weaker than the
other. Finally, ≈ denotes that they coincide under certain
conditions, e.g. if restricted to a common language or to
stratified programs.
goto australia ∨ goto europe ←
goto both ← goto australia, goto europe
save ← ∼goto both
cancel reservation ← ∼goto australia
cancel reservation ← ∼goto europe
The propositional symbol cancel reservation is assigned the value T in the two perfect models of the program. Once more, the perfect models coincide with the collapsed infinite-valued models, M1 = {(goto australia,
F0 ), (goto europe, T0 ), (goto both, F0 ), (save, T1 ),
(cancel reservation, T1 )}, M2 = {(goto australia,
T0 ), (goto europe, F0 ), (goto both, F0 ), (save, T1 ),
(cancel reservation, T1 )}. Under the Static Semantics, either ∼goto australia or ∼goto europe must be
believed for cancel reservation to be derived; even
though the Static Semantics derives B(∼goto australia∨
∼goto europe) from the above program, this does
not imply B∼goto australia ∨ B∼goto europe and
cancel reservation is not concluded.
Based on this example, Przymusinski argues that the perfect model semantics is too strong. He makes a point that
cancel reservation is not derivable under his semantics by a conscious choice which could be remedied,
if so desired, by assuming the Disjunctive Belief Axiom,
B(F ∨ G) ≡ BF ∨ BG. The Static Semantics, Przymusinski continues, is a minimal approach that can be adapted to
fit a variety of requirements, by explicitly adding axioms.
This versatility is indeed an attractive quality of the Static
Semantics. However, there is no formal statement in (Przymusinski 1995) that the inclusion of the Disjunctive Belief
Axiom (or any finite set of axioms) would render the semantics equivalent to the perfect model semantics. It is our view
that the perfect model semantics and L^min_∞ do give the intuitive interpretation of this specific program, though we acknowledge that for some applications the outcome of the Static Semantics could be more desirable.
7 Conclusions and future work

In this paper we have singled out the Infinite-Valued Semantics L^min_∞ as an attractive approach to generalizing the well-founded semantics to the class of disjunctive programs, and focused on its behavior in the case of stratified programs. We showed that the minimal infinite-valued models of such programs have a tiered structure that corresponds to that of the stratified program, and close connections to the perfect models. Our most notable contribution is that the rank of the stratum at which a propositional symbol is placed correlates with the maximum order of the truth value this atom is assigned in a minimal model. In this way, we show the number of strata to be an upper bound on the order (and, consequently, on the number) of truth values assigned by the minimal models. We consider this a significant improvement over the upper bound set by Theorem 2 (originally stated in (Cabalar et al. 2007)) for the general case, namely the cardinality of the Herbrand base of the program. Moreover, the undefined truth value was not counted in the bound of Theorem 2; this is not the case for our improved bound, i.e. the minimal models of a stratified program do not assign this value.
We have presented an overview of the existing alternative generalizations of the well-founded semantics for disjunctive programs, especially noting their very different characterizations, as well as any comparative results we have found involving the discussed approaches and the perfect model semantics. As noted by other authors, this great difference in characterizations makes it particularly hard to perform thorough comparisons of the approaches. Therefore, most comparisons are limited to examining example programs, some of which we have discussed in this paper.

Many questions remain open in regard to the infinite-valued semantics of disjunctive programs in general and stratified disjunctive programs in particular. For example, the development of an immediate consequence operator for the disjunctive infinite-valued semantics would hold great value in itself. Additionally, it could contribute greatly to the comparative study of the infinite-valued semantics and other disjunctive semantics that have fixed-point characterizations, including the perfect model semantics (Fernández and Minker 1995). This could help to better understand the place of the disjunctive infinite-valued semantics as a generalization of the well-founded semantics and perhaps the perfect model semantics.

References

Alcântara, J.; Damásio, C. V.; and Pereira, L. M. 2005. A well-founded semantics with disjunction. In Gabbrielli, M., and Gupta, G., eds., ICLP, volume 3668 of Lecture Notes in Computer Science, 341–355. Springer.

Apt, K. R.; Blair, H. A.; and Walker, A. 1988. Towards a theory of declarative knowledge. In Foundations of Deductive Databases and Logic Programming. Morgan Kaufmann. 89–148.

Baral, C. 1992. Generalized negation as failure and semantics of normal disjunctive logic programs. In Voronkov, A., ed., LPAR, volume 624 of Lecture Notes in Computer Science, 309–319. Springer.

Brass, S., and Dix, J. 1994. A disjunctive semantics based on unfolding and bottom-up evaluation. In GI Jahrestagung, 83–91.

Brass, S., and Dix, J. 1995. Disjunctive semantics based upon partial and bottom-up evaluation. In ICLP, 199–213.

Brass, S.; Dix, J.; Niemelä, I.; and Przymusinski, T. C. 2001. On the equivalence of the static and disjunctive well-founded semantics and its computation. Theor. Comput. Sci. 258(1-2):523–553.

Cabalar, P.; Odintsov, S. P.; Pearce, D.; and Valverde, A. 2006. Analysing and extending well-founded and partial stable semantics using partial equilibrium logic. In Etalle, S., and Truszczynski, M., eds., ICLP, volume 4079 of Lecture Notes in Computer Science, 346–360. Springer.

Cabalar, P.; Pearce, D.; Rondogiannis, P.; and Wadge, W. W. 2007. A purely model-theoretic semantics for disjunctive logic programs with negation. In Baral, C.; Brewka, G.; and Schlipf, J. S., eds., LPNMR, volume 4483 of Lecture Notes in Computer Science, 44–57. Springer.

Eiter, T., and Gottlob, G. 1993. Complexity aspects of various semantics for disjunctive databases. In Proceedings of the Twelfth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS '93, 158–167. New York, NY, USA: Association for Computing Machinery.

Eiter, T.; Gottlob, G.; and Mannila, H. 1994. Adding disjunction to datalog (extended abstract). In Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS '94, 267–278. New York, NY, USA: Association for Computing Machinery.

Fernández, J. A., and Minker, J. 1995. Bottom-up computation of perfect models for disjunctive theories. J. Log. Program. 25(1):33–51.

Gelfond, M., and Lifschitz, V. 1988. The stable model semantics for logic programming. In ICLP/SLP, 1070–1080. MIT Press.

Lobo, J.; Minker, J.; and Rajasekar, A. 1992. Foundations of Disjunctive Logic Programming. Cambridge, MA, USA: MIT Press.

Przymusinski, T. C. 1988. On the declarative semantics of deductive databases and logic programs. In Foundations of Deductive Databases and Logic Programming. Morgan Kaufmann. 193–216.

Przymusinski, T. C. 1989. Every logic program has a natural stratification and an iterated least fixed point model. In Silberschatz, A., ed., PODS, 11–21. ACM Press.

Przymusinski, T. C. 1990. Stationary semantics for disjunctive logic programs and deductive databases. In Proceedings of the 1990 North American Conference on Logic Programming, 40–59. Cambridge, MA, USA: MIT Press.

Przymusinski, T. C. 1991. Semantics of disjunctive logic programs and deductive databases. In Delobel, C.; Kifer, M.; and Masunaga, Y., eds., Deductive and Object-Oriented Databases, Second International Conference, DOOD'91, Munich, Germany, December 16-18, 1991, Proceedings, volume 566 of Lecture Notes in Computer Science, 85–107. Springer.

Przymusinski, T. C. 1995. Static semantics for normal and disjunctive logic programs. Ann. Math. Artif. Intell. 14(2-4):323–357.

Rondogiannis, P., and Wadge, W. W. 2005. Minimum model semantics for logic programs with negation-as-failure. ACM Trans. Comput. Log. 6(2):441–467.

Ross, K. A. 1989. The well founded semantics for disjunctive logic programs. In DOOD, 385–402.

van Gelder, A.; Ross, K. A.; and Schlipf, J. S. 1988. Unfounded sets and well-founded semantics for general logic programs. In Edmondson-Yurkanan, C., and Yannakakis, M., eds., PODS, 221–230. ACM.

van Gelder, A. 1988. Negation as failure using tight derivations for general logic programs. In Minker, J., ed., Foundations of Deductive Databases and Logic Programming. Morgan Kaufmann. 149–176.

Wang, K. 2000. Argumentation-based abduction in disjunctive logic programming. J. Log. Program. 45(1-3):105–141.

Wang, K. 2001. A comparative study of well-founded semantics for disjunctive logic programs. In Eiter, T.; Faber, W.; and Truszczynski, M., eds., LPNMR, volume 2173 of Lecture Notes in Computer Science, 133–146. Springer.
Information Revision: The Joint Revision of Belief and Trust
Ammar Yasser (1), Haythem O. Ismail (2,1)
(1) German University in Cairo, Egypt
(2) Cairo University, Egypt
ammar.abbas@guc.edu.eg, haythem.ismail@guc.edu.eg
Abstract

Most of our decisions are guided by trust, specifically decisions about what to believe and what not to believe. We accept information from sources we trust, doubt information from sources we do not trust, and, in general, rely on trust in revising our beliefs. While we may have difficulty defining exactly what trust is, we can, on one hand, rather easily explain why we trust or mistrust someone and, on the other hand, occasionally revise how much we trust them. In this paper, we propose that trust revision and belief revision are inseparable processes. We address issues concerning the formalization of trust in information sources and provide AGM-style postulates for rational joint revision of the two attitudes. In doing so, we attempt to fill a number of gaps in the literature on trust, trust revision, and their relation to belief revision.

1 Introduction

Trust acts, even if we are not aware of it, as an information filter. We are willing to believe information communicated by sources we trust, cautious about information from sources we do not trust, and suspicious about information from sources we mistrust. Trust and mistrust are constantly revised; we gain more trust in information sources the more they prove themselves to be reliable, and our trust in them erodes as they mislead us one time after the other. Such attitudes allow us to be resilient, selective, and astute. If exhibited by logic-based agents, these same attitudes would make them less susceptible to holding false beliefs and, hence, less prone to excessive belief revision. Moreover, by revising trust, these agents will not forever be naively trusting nor cynically mistrusting.

Trust has been thoroughly investigated within multiagent systems (Castelfranchi and Falcone 1998; Falcone and Castelfranchi 2001; Jones and Firozabadi 2001; Jones 2002; Sabater and Sierra 2005; Katz and Golbeck 2006, for instance), psychology (Simpson 2007; Elangovan, Auer-Rizzi, and Szabo 2007; Haselhuhn, Schweitzer, and Wood 2010, for instance), and philosophy (Holton 1994; Hardwig 1991; McLeod 2015, for instance). Crucially, it was also investigated in the logic-based artificial intelligence (AI) literature by several authors (Demolombe 2001; Demolombe and Liau 2001; Liau 2003; Katz and Golbeck 2006; Herzig et al. 2010; Drawel, Bentahar, and Shakshuki 2017; Leturc and Bonnet 2018). Nevertheless, we believe that there are several issues that are left unaddressed by the logical approaches. Intuitively, trust is intimately related to misleading, on one hand, and to belief revision, on the other. While several logical treatments of misleading are to be found in the literature (Sakama, Caminada, and Herzig 2010; van Ditmarsch 2014; Sakama 2015; Ismail and Attia 2017, for instance), the relation of misleading to trust erosion is often not attended to, or is delegated to future work. On the other hand, the extensive literature on belief revision (Alchourrón, Gärdenfors, and Makinson 1985; Hansson 1994; Darwiche and Pearl 1997; Van Benthem 2007, for example), while occasionally addressing trust-based revision of beliefs (Lorini, Jiang, and Perrussel 2014; Rodenhäuser 2014; Booth and Hunter 2018), does not have much to say about the revision of trust (but see (Liau 2003; Lorini, Jiang, and Perrussel 2014) for minimal discussions) and, as far as we know, contains no systematic study of jointly revising belief and trust. The goal of this paper is, hence, twofold: (i) to motivate why belief and trust revision are intertwined and should be carried out together, and (ii) to propose AGM-style postulates for the joint revision of trust and belief.

The paper is structured as follows. Section 2 describes what we mean by trust, information, and information sources. It also highlights the intuitions behind joint trust and belief revision. In Section 3, we present information states, a generic structure representing information, and investigate its properties. Section 4 presents a powerful notion of relevance which information structures give rise to. In Section 5, the formal inter-dependency of belief and trust is explored, culminating in AGM-style postulates for joint belief-trust revision. Finally, Section 6 presents an extended example highlighting some of the key concepts proposed in the paper.1

1 Because of space constraints, we were not able to provide all results in this paper. Hence, some selected proofs are available through this online appendix: proofs.

2 Trust and Belief

2.1 Trust in Information Sources

It is often noted that trust is not a dyadic relation between the truster and the trustee, but a triadic relation involving an object of trust (McLeod 2015). You trust your doctor
with your health, your mechanic with your car, your parents to unconditionally believe you, and your mathematics professor to tell you only true statements of mathematics. Our investigation of the coupling of belief and trust lets us focus only on trust in sources of information. Trust in information sources comes in different forms. Among Demolombe's (Demolombe 2004; Lorini and Demolombe 2008) different types of trust in information sources, we focus on trust in sincerity and competence, since these are the two types relevant to belief revision and realistic information sources.2 A sincere information source is one which (if capable of forming beliefs) only conveys what it believes; a competent source is one which only conveys what is true. In this paper, we consider trust in the reliability of information sources, where a source is reliable if it is both sincere and competent.3 Note that we do not take information sources to only be cognitive agents. For example, a sensor (or perception, in general) is a possible source of information. For information sources which are not cognitive agents, reliability reduces to competence.

2.2 Joint Revision of Trust and Belief

Rational agents constantly receive information, and are faced with the question of whether to believe or not to believe. The question is rather simple when the new information is consistent with the agent's beliefs, since no obvious risk lies in deciding either way. Things become more interesting if the new information is inconsistent with what the agent believes; if the agent decides to accept the new information, it is faced with the problem of deciding which of its old beliefs to give up in order to maintain consistency. Principles for rationally doing this are the focus of the vast literature on belief revision (Alchourrón, Gärdenfors, and Makinson 1985; Hansson 1999a, for example).

It is natural to postulate that deciding whether to believe and how to revise our beliefs (the process of belief revision) are influenced by how much we trust the source of the new piece of information. (Also see (Lorini, Jiang, and Perrussel 2014; Rodenhäuser 2014; Booth and Hunter 2018).) In particular, in case of a conflict with old beliefs, how much we trust in the source's reliability and how much evidence we have accumulated for competing beliefs seem to be the obvious candidates for guiding us in deciding what to do. Thus, rational belief revision depends on trust.

But things are more complex. For example, suppose that information source σ1, whom we trust very much, conveys φ to us. φ is inconsistent with our beliefs but, because we trust σ1, we decide to believe φ and give up ψ which, together with other beliefs, implies ¬φ. In this case, we say that φ is a refutation of ψ. So far, this is just belief revision, albeit one which is based on trust. But, by no longer believing ψ, we may find it rational to revise, and decrease, our trust in σ2 who earlier conveyed ψ to us. Moreover, suppose that φ, together with other beliefs, implies our old belief ξ. We say that φ is a confirmation of ξ. This confirmation may trigger us to revise, and increase, our trust in σ3 who is the source of ξ. Thus, trust revision depends on belief revision. In fact, belief revision may be the sole factor that triggers rational trust revision in information sources.

We need not stop there, though. For, by reducing our trust in σ2's reliability, we are perhaps obliged to stop believing (or reduce our degree of belief in) ψ′, which was conveyed by σ2. It is crucial to note that ψ′ may be totally consistent with φ and we, nevertheless, give it up. While we find such a scenario quite plausible, classical belief revision, with its upholding of the principle of minimal change, would deem it irrational. Likewise, by increasing our trust in σ3, we may start believing (or raise our degree of belief in) ξ′, which was earlier conveyed by σ3. This second round of belief revision can start a second round of trust revision. It is clear that we may keep on doing this for several rounds (perhaps indefinitely) if we are really fanatic about information and its sources. Hence, we contend that belief revision and trust revision are so entangled that they need to be combined into one process of joint belief-trust revision or, as we shall henceforth refer to it, information revision.
3 Information States
In order for an agent A to perform information revision, revising both its beliefs and its trust in sources, it needs to be
able to recall more than just what it believes, or how much
it trusts certain sources, as is most commonly the case in the
literature. Hence, we introduce formal structures for representing information in a way that would facilitate information revision.
Definition 3.1. An information grading structure G is a quintuple (Db, Dt, ≺b, ≺t, δ), where Db and Dt are nonempty, countable sets; ≺b and ≺t are, respectively, total orders over Db and Dt; and δ ∈ Dt.
Db and Dt contain the degrees of belief and trust, respectively. They are not necessarily finite, disjoint, different or
identical.4 Moreover, to be able to distinguish the strength
by which an agent believes a proposition or trusts a source,
the two sets are ordered; here, we assume them to be totally
ordered. δ is interpreted as the default trust degree assigned
to an information source with which the agent has no earlier
experience.
Definition 3.2. An information structure I is a quadruple
(L, C, S, G), where
1. L is a logical language with a Tarskian consequence operator Cn,
2. C is a finite cover of L whose members are referred to as
topics,
3. S is a non-empty finite set of information sources, and
4. G is an information grading structure.
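Definitions 3.1 and 3.2 can be rendered concretely as a small data structure. The following Python sketch is our own hypothetical instantiation (all names, degrees, and topics are ours) and is only meant to make the components tangible:

```python
from dataclasses import dataclass

# Hypothetical rendering of Definitions 3.1-3.2; names and values are ours.
@dataclass(frozen=True)
class GradingStructure:
    belief_degrees: tuple   # D_b, listed in increasing ≺_b order
    trust_degrees: tuple    # D_t, listed in increasing ≺_t order
    default_trust: int      # δ ∈ D_t, for sources with no earlier experience

G = GradingStructure((0, 1, 2, 3), (0, 1, 2, 3), default_trust=1)
assert G.default_trust in G.trust_degrees   # δ must come from D_t

# An information structure I = (L, C, S, G) over a toy atomic language
L = {"p", "not_p", "q"}
C = [{"p", "not_p"}, {"q"}]     # a finite cover of L by topics
S = {"sigma1", "sigma2"}        # information sources
assert set().union(*C) == L     # C really covers L
```

Note how the cover C leaves topics free to overlap or to be non-closed under connectives, exactly as the definition allows.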
2 Trust in completeness, for example, is unrealistic since it requires that the source informs about P whenever P is true.
3 As suggested by (Ismail and Attia 2017), it is perhaps possible that breach of sincerity and competence should have different effects on belief revision; for simplicity, we do not consider this here, though.
4 Db and Dt are usually the same; however, a qualitative account of trust and belief might have different sets for grading the two attitudes.

Information structures comprise our general assumptions about information. S is the set of possible information sources. Possible pieces of information are statements of the language L, with each piece being about one or more, but finitely many, topics, as indicated by the L-cover C. L is only required to have a Tarskian consequence operator (Hansson 1999b). A topic represents the scope of trust. It is a set of statements which may be closed under all connectives, some connectives, or none at all. Topics could also be disjoint or overlapping. Choosing topics to be not necessarily closed under logical connectives allows us to accommodate interesting cases. For example, A may have, for the same source, a different trust value when it conveys φ to when it conveys ¬φ.

Definition 3.3. Let I = (L, C, S, (Db, Dt, ≺b, ≺t, δ)) be an information structure. An information state K over I is a triple (B, T, H), where
1. B : L ↪ Db is a partial function referred to as the belief base,
2. T : S × C ↪ Dt is a partial function referred to as the trust base, and
3. H ⊆ L × S, the history, is a finite set where, for every (φ, σ) ∈ H and T ∈ C, if φ ∈ T then (σ, T, dt) ∈ T, for some dt ∈ Dt.

Trust in information sources is recorded in T(K). This is a generalization to accommodate logics with an explicit account of trust in the object language (Demolombe and Liau 2001; Leturc and Bonnet 2018, for instance) as well as those without (Katz and Golbeck 2006; Jøsang, Ivanovska, and Muller 2015, for example). H(K) acts as a formal device for recording conveyance instances.5 As with T(K), we do not require L to have an explicit account of conveying.6 With this setup, having trust on single propositions, as is most commonly the case in the literature (Demolombe 2004; Leturc and Bonnet 2018, for instance), reduces to restricting all topics to be singletons. On the other hand, we may account for absolute trust in sources by having a single topic to which all propositions belong.

So far, we have defined what information states are. We now define the following abbreviations, of which we will later make use.
• σ(H(K)) = {φ | (φ, σ) ∈ H(K)}
• SK = {σ | (φ, σ) ∈ H(K)}
• For(B(K)) = {φ | (φ, db) ∈ B(K)}
• ΦK = For(B(K)) ∪ {φ | φ ∈ σ(H(K)) for all σ ∈ SK}

Information revision is the process of revising an information state K with the conveyance of a formula φ by a source σ. Every information revision operator is associated with a conveyance inclusion filter F ⊆ L × S which determines the conveyance instances that make it into H(K). Hence, a generic revision operator is denoted by ⋉F, where F is the associated filter. Revising K with a conveyance of φ by σ is denoted by K ⋉F (φ, σ). We require all revision operators ⋉F to have the same effect on the history:

H(K ⋉F (φ, σ)) = H(K) ∪ {(φ, σ)} if (φ, σ) ∈ F, and H(K ⋉F (φ, σ)) = H(K) otherwise.

There are three major filter types. A filter F is non-forgetful if F = L × S; it is forgetful if ∅ ≠ F ⊂ L × S; and it is memory-less if F = ∅. Having filters besides the non-forgetful one is to simulate realistic scenarios where an agent does not always remember every piece of information that was conveyed to it. Henceforth, the subscript F will be dropped from ⋉F whenever this does not lead to ambiguity.

We now turn to what happens to the belief and trust bases of a revised information state. We start with two general definitions.

Definition 3.4. Formula φ is more entrenched in state K2 over state K1, denoted K1 ≺φ K2, if
1. φ ∉ Cn(For(B(K1))) and φ ∈ Cn(For(B(K2))), or
2. (φ, b1) ∈ B(K1), (φ, b2) ∈ B(K2), and b1 ≺b b2.
If K1 ⊀φ K2 and K2 ⊀φ K1, we write K1 ≡φ K2.

Definition 3.5. Source σ is more trusted on topic T in state K2 over state K1, denoted K1 ≺σ,T K2, if (σ, T, t1) ∈ T(K1), (σ, T, t2) ∈ T(K2), and t1 ≺t t2. If K1 ⊀σ,T K2 and K2 ⊀σ,T K1, we write K1 ≡σ,T K2.

Intuitively, a belief changes after revision if it is added to or removed from the belief base, or if its associated grade changes. Similarly, trust in a source regarding a topic changes after revision if the associated trust grade changes.

5 We chose H(K) to be a set for simplicity. However, it could be more beneficial if it were a sequence instead. An agent might need to record multiple conveyances by the same source for the same formula, or distinguish more recent ones, without an explicit representation of time.
6 The transfer of information is sometimes referred to using either "inform" or "communicate". In this paper, we use "convey" as a cover term for any modality of information transfer.

4 Relevant Change

As proposed earlier, the degrees of trust in sources depend on the degrees of belief in formulas conveyed by these sources, and vice versa. Hence, on changing the degree of belief in some formula φ, the degree of trust in a source σ that previously conveyed φ is likely to change. Conversely, when the degree of trust in σ changes, the degrees of belief in formulas conveyed by σ might change as well. To model such behavior, we need to keep track of which formulas and which sources are "relevant" to each other. First, we recall a piece of terminology due to (Hansson 1994): Γ ⊂ L is a φ-kernel (φ ∈ L) if Γ |= φ and, for every ∆ ⊂ Γ, ∆ ⊭ φ.

Definition 4.1. Let K be an information state. The support graph G(K) = (SK ∪ ΦK, E) is such that (u, v) ∈ E if and only if
1. u ∈ SK, v ∈ ΦK, and v ∈ u(H(K));
2. u ∈ ΦK, v ∈ ΦK, u ≠ v, and u ∈ Γ ⊆ ΦK where Γ is a v-kernel; or
3. u ∈ ΦK, v ∈ SK, and (v, u) ∈ E.

A node u supports a node v if there is a simple path from u to v.

Figure 1 shows an example of the support graph for the following information state: Source σ1 conveys φ, which logically implies ψ which, in turn, is conveyed by σ2. Hence,
152
#
B1
B2
B3
B4
B5
B6
B7
B8
Figure 1: The support graph where σ1 conveys φ which logically
implies ψ which is conveyed by σ2 .
K
neither
neither
(φ, b1 ) ∈ B(K)
(φ, b1 ) ∈ B(K)
(¬φ, b1 ) ∈ B(K)
(¬φ, b1 ) ∈ B(K)
(¬φ, b1 ) ∈ B(K)
(¬φ, b1 ) ∈ B(K)
K⋉
(φ, b) ∈ B(K⋉ )
neither
(φ, b2 ) ∈ B(K⋉ )
(φ, b1 ) ∈ B(K⋉ )
(φ, b2 ) ∈ B(K⋉ )
neither
(¬φ, b2 ) ∈ B(K⋉ )
(¬φ, b1 ) ∈ B(K⋉ )
Notes
b1 ≺ b b 2
b2 ≺ b b 1
-
Table 1: The admissible scenarios of belief revision.
there is an edge from σ1 to φ and from σ2 to ψ given the
first clause in the definition of the graph. Also, there is an
edge from φ to ψ given the second clause. Finally, according
to the last clause, there is an edge from both φ and ψ to σ1
and σ2 , respectively. Note, for example, that φ supports ψ
and that σ1 supports σ2 . Intuitively, φ supports ψ directly by
logically implying it; σ1 supports σ2 by virtue of conveying
a formula (φ) which confirms a formula (ψ) conveyed by σ2 .
The support graph allows us to trace back and propagate
changes in trust and belief to relevant beliefs and information sources along support paths. Instances of support may
be classified according to the type of relata.
Observation 4.1. Let K be an information state.
1. φ ∈ ΦK supports ψ ∈ ΦK if and only if φ ≠ ψ and (i) φ ∈ Γ ⊆ ΦK where Γ is a ψ-kernel or (ii) φ supports some σ ∈ SK which supports ψ.
2. φ ∈ ΦK supports σ ∈ SK if and only if ψ ∈ σ(H(K)) and φ ∈ Γ ⊆ ΦK where Γ is a ψ-kernel, or φ supports some σ′ ∈ SK which supports σ.
3. σ ∈ SK supports φ ∈ ΦK if and only if ψ ∈ σ(H(K)) and ψ ∈ Γ ⊆ ΦK where Γ is a φ-kernel, or σ supports some σ′ ∈ SK which supports φ.
4. σ ∈ SK supports σ′ ∈ SK if and only if σ ≠ σ′ and σ supports some φ ∈ ΦK which supports σ′.
Thus, given the first three clauses, the support relation from a formula to a formula, a formula to a source, or a source to a formula may be established in two ways: (i) purely logically, via a path of only formulas, or (ii) with the aid of a trust link, via an intermediate source. A source can only support a source, however, by supporting a formula which supports that other source. Note that self-support is avoided by requiring support paths to be simple.
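The support relation of Definition 4.1 is easy to prototype on the Figure 1 example. The sketch below is our own illustration (node names s1, s2, phi, psi are ours); support is tested as reachability along simple (repetition-free) paths:

```python
# Support graph of Figure 1, with directed edges per Definition 4.1.
edges = {
    "s1": {"phi"},          # sigma1 conveyed phi          (clause 1)
    "s2": {"psi"},          # sigma2 conveyed psi          (clause 1)
    "phi": {"psi", "s1"},   # {phi} is a psi-kernel (2); phi -> its source (3)
    "psi": {"s2"},          # psi -> its source            (clause 3)
}

def supports(u, v, seen=None):
    # u supports v iff there is a simple path from u to v
    seen = (seen or set()) | {u}
    return any(w == v or (w not in seen and supports(w, v, seen))
               for w in edges.get(u, ()))

def relevant(u, v):
    # relevance per Definition 4.2: support in either direction
    return supports(u, v) or supports(v, u)

print(supports("phi", "psi"), supports("s1", "s2"), relevant("psi", "s1"))
# True True True
```

As in the text, φ supports ψ directly, and σ1 supports σ2 via the path σ1 → φ → ψ → σ2; ψ is σ1-relevant even though ψ does not support σ1.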
The support graph provides the basis for constructing an
operator of rational information revision. Traditionally, belief revision is concerned with minimal change (Gärdenfors
and Makinson 1988; Hansson 1999a). In this paper, we
model minimality using relevance. However, our notion of
relevance is not restricted to logical relevance as with classical belief revision; it also accounts for source relevance.
When an information state K is revised with formula φ conveyed by source σ, we want to confine changes in belief and
trust to formulas and sources relevant to φ, ¬φ, and σ.
Definition 4.2. Let K be an information state and u and
v be nodes in G(K). u is v-relevant if u supports v or v
supports u. Further, if φ, ψ ∈ L with Γφ ⊆ ΦK a φ-kernel
and Γψ ⊆ ΦK a ψ-kernel, where u is v-relevant for some
u ∈ Γφ and v ∈ Γψ , then φ is ψ-relevant.
Observation 4.2. Let K be an information state where u is v-relevant. The following are true.
1. v is u-relevant.
2. If v ∈ σ(H(K)) and u ≠ σ, then u is σ-relevant.
3. If v ∈ SK, φ ∈ v(H(K)), and u ≠ φ, then u is φ-relevant.
Hence, relevance is a symmetric relation. Crucially, if
σ conveys φ, then the formulas and sources relevant to φ
(other than σ) are exactly the formulas and sources relevant
to σ (other than φ). For this reason, when revising with a
conveyance of φ by σ it suffices to consider only φ-relevant
(and ¬φ-relevant) formulas and sources.
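Kernels, which underpin both the support graph and this notion of relevance, can be computed by brute force for small propositional belief sets. The sketch below is our own illustration (the belief set, loosely modeled on the paper's later example with S → P and Q → ¬S, is ours), using truth-table entailment:

```python
from itertools import chain, combinations, product

# Formulas are Python boolean expressions over the listed atoms.
atoms = ["s", "p", "q"]
beliefs = ["not s or p", "not q or not s", "s", "q"]  # S->P, Q->~S, S, Q

def holds(formula, valuation):
    return eval(formula, {}, valuation)

def entails(gamma, phi):
    # truth-table entailment: every valuation satisfying gamma satisfies phi
    return all(holds(phi, v)
               for bits in product([False, True], repeat=len(atoms))
               for v in [dict(zip(atoms, bits))]
               if all(holds(g, v) for g in gamma))

def kernels(gamma, phi):
    # phi-kernels (Hansson 1994): subset-minimal Gamma' ⊆ gamma with Gamma' |= phi
    subs = chain.from_iterable(combinations(gamma, r) for r in range(len(gamma) + 1))
    entailing = [set(s) for s in subs if entails(s, phi)]
    return [sorted(k) for k in entailing if not any(o < k for o in entailing)]

print(kernels(beliefs, "p"))
# [['not s or p', 's'], ['not q or not s', 'q', 's']]
```

The second kernel arises because {Q → ¬S, S, Q} is jointly inconsistent and hence entails everything; kernels capture exactly the minimal entailing (or minimally inconsistent) subsets that relevance tracking needs.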
5 Information Revision

Before formalizing the postulates of information revision, we start by presenting the intuitions about changing beliefs and trust that constitute the foundation of said formalization.

5.1 Intuitions
Table 1 shows the possible reasonable effects on B(K) as agent A revises its information state K with (φ, σ); K⋉ is shorthand for K ⋉ (φ, σ). The cases depend on whether φ ∈ Cn(For(B(K))), ¬φ ∈ Cn(For(B(K))), or neither φ nor ¬φ is in Cn(For(B(K))). It is important to note that the "neither" cases are stated this strongly only to simplify introducing the intuitions. For further simplicity, we only consider cases where B(K) (and, of course, B(K⋉)) is consistent, so it is never the case that both φ and ¬φ are believed.
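The admissible transitions of Table 1 can be encoded as a simple lookup. The encoding below is our own reading of the table (the state labels and the inc/dec/same annotations are ours, not the paper's notation):

```python
# Admissible effects of revising with (phi, sigma), from Table 1.
# Each pair is (state of phi before revision, state after); "inc"/"dec"/"same"
# compare the believed formula's grade after revision with its grade before.
ADMISSIBLE = {
    ("neither", "phi"),          # B1: phi newly believed
    ("neither", "neither"),      # B2: evidence balances out
    ("phi", "phi-inc"),          # B3: confirmation raises the grade
    ("phi", "phi-same"),         # B4: e.g. grade already maximal
    ("not-phi", "phi"),          # B5: belief flips to phi
    ("not-phi", "neither"),      # B6: not-phi given up, phi not adopted
    ("not-phi", "not-phi-dec"),  # B7: not-phi doubted but kept
    ("not-phi", "not-phi-same"), # B8: conveyance effectively ignored
}

def admissible(before, after):
    return (before, after) in ADMISSIBLE

print(admissible("not-phi", "neither"),   # case B6
      admissible("neither", "not-phi"))   # forbidden case B9 of Table 2
# True False
```

Every transition outside this set falls into the forbidden scenarios of Table 2.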
In Case B1 , A believes neither φ nor ¬φ. Since A has
no evidence to the contrary, on revising with φ, it is believed with some degree b, as now it is confirmed by a trusted
source σ. Moreover, since ¬φ was neither refuted nor supported, it stays the same (K ≺φ K⋉ and K ≡¬φ K⋉ ). As
with Case B1 , in Case B2 , A is neutral about φ. However, on
revision, A finds that the weight of evidence for and against
φ are comparable so that it cannot accept φ (K ≡φ K⋉ and
K ≡¬φ K⋉ ).
Unlike the previous two scenarios where A believed neither φ nor its negation, in Case B3 , A already believes φ
with some degree b1 . Consequently, revision with φ confirms what is already believed. Since a new source σ now
supports φ, it becomes more entrenched (K ≺φ K⋉ and
K ≡¬φ K⋉ ). On the other hand, in Case B4 , on revising
with φ, despite A’s already believing φ which is now being
confirmed, φ does not become more entrenched (K ≡φ K⋉
153
#
B9
B10
B11
B12
B13
K
neither
(φ, b1 ) ∈ B(K)
(φ, b) ∈ B(K)
(φ, b1 ) ∈ B(K)
(¬φ, b1 ) ∈ B(K)
K⋉
(¬φ, b) ∈ B(K⋉ )
(φ, b2 ) ∈ B(K⋉ )
neither
(¬φ, b2 ) ∈ B(K⋉ )
(¬φ, b2 ) ∈ B(K⋉ )
Notes
b2 ≺ b b1
b1 ≺ b b2
its negation. To further illustrate this concept, consider a
possible example.
Example 5.1. Bob believes that there will be no classes tomorrow (φ) but he is not very certain about that. He meets
T im, who tells him “there will be no classes tomorrow”.
This is a direct confirmation of φ and, in normal circumstance, we should expect that Bob’s belief will become more
entrenched. However, Bob recalls that, time and again, Tim
has viciously lied to him about cancelled classes, thereby
harming his academic status. One may consider it rational
in this case for Bob to lower his degree of belief in φ, stop
believing φ or, in extreme cases, to opt for believing ¬φ.
Table 2: The forbidden scenarios of belief revision.
and K ≡¬φ K⋉ ). An example where this might occur is
when φ is believed with the maximum degree of belief, if
such degree exists, or when φ has only been ever conveyed
by σ, who is now only confirming itself. In this latter case,
A might choose not to increase the degree of belief in φ.
We now consider cases where A already believes ¬φ.
In Case B5 , revising with the conflicting piece of information φ coming from a highly-trusted source σ, A stops believing ¬φ and starts believing φ instead (K ≺φ K⋉ and
K⋉ ≺¬φ K). Similarly, in Case B6 , A is presented with
evidence against ¬φ. After revision, A decides that there is
not enough evidence to keep believing ¬φ. However, there
is also not enough evidence to believe φ (K ≡φ K⋉ and
K⋉ ≺¬φ K). Moreover, in Case B7 , A decides, on revision,
that there is not enough evidence to completely give up ¬φ.
However, there is enough evidence to doubt ¬φ (decrease
¬φ’s degree of belief) making it less entrenched (K ≡φ K⋉
and K⋉ ≺¬φ K). On the contrary, in Case B8 , A decides
that there is not enough evidence to change its beliefs, even
when provided with φ, and hence ¬φ remains unchanged
(K ≡φ K⋉ and K ≡¬φ K⋉ ). A possible scenario for this
is when the source is not trusted and so A decides not to
consider this instance of conveyance.
Other cases, we believe, should be forbidden for a rational
operation of information revision. These cases are presented
in Table 2.
A is neutral about φ in Case B9 . However, when provided
with evidence for φ, A neither believes φ nor does it remain
neutral. Surprisingly, A starts believing ¬φ (K⋉ ≡φ K and
K ≺¬φ K⋉ ). φ is already believed in Case B10 . However,
on getting a confirmation for φ, it becomes less entrenched
(K⋉ ≺φ K and K ≡¬φ K⋉ ). Similarly, in Case B11 , on receiving a confirmation for the already believed φ, A instead
gives up believing φ (K⋉ ≺φ K and K ≡¬φ K⋉ ). An extreme case is that of Case B12 where A, already believing φ,
receives a confirmation thereof and upon revision, ¬φ ends
up being believed (K⋉ ≺φ K and K ≺¬φ K⋉ ). Finally,
in Case B13 , A believes ¬φ; but, when provided with evidence against it, it becomes more entrenched nevertheless
(K ≡φ K⋉ and K ≺¬φ K⋉ ).
The cases in Table 2 may seem far-fetched or even implausible. However, there is a line of reasoning that could
accommodate such cases. Although, in this paper, we do not
pursue this line of reasoning, it is at least worth a brief discussion. If agent A does not trust information source σ, A
may be reluctant to believe what σ conveys, given no further
supporting evidence; this much is perhaps uncontroversial.
But if A not only does not trust σ, but strongly mistrusts
them (given a long history of being misled by the malicious
source), then A may reject what σ conveys and also come to believe its negation.
Intuitions about when and how trust in some information
source should change are very context-sensitive, and we believe it unwise to postulate sufficient conditions for trust
change in a generic information revision operation. For example, one might be tempted to say that, if after revision
with φ, ¬φ is no longer believed, then trust in any source
supporting ¬φ should decrease. Things are not that straightforward, though.
Example 5.2. Let the belief base of agent A be {(S →
P, b1 ), (Q → ¬S, b2 )}. Information source Jordan conveys P and then conveys Q. Since A has no evidence against
either, it believes both. Now, information source Nour, who
is more trusted than Jordan, conveys S. Consequently, A
starts believing S despite having evidence against it. To
maintain consistency, A also stops believing Q (because it
supports ¬S). What should happen to A’s trust in Jordan?
We might, at first glance, think that trust in Jordan should
decrease, as he conveyed Q, which is no longer believed.
However, one could also argue that trust in Jordan should
increase because he conveyed P , which is now being confirmed by Nour.
This example shows that setting general rules for how
trust must change is almost impossible, as it depends on several factors. Whether A ends up trusting Jordan less, more,
or the same appears to depend on how the particular revision operator manipulates grades. The situation becomes more complex if the new conveyance by Nour supports several formulas supporting Jordan and refutes several formulas supported by him. In this case, how trust in
Jordan changes (or not) would also depend on how the effects of all these support relations are aggregated. We contend that such issues should not, and cannot, be settled by
general constraints on information revision.
This non-determinism about how trust changes extends
to similar non-determinism about how belief changes. According to Observation 4.1, a formula φ may support another
formula ψ by transitivity through an intermediate source σ.
Given that, in general, the effect of revising with φ on σ is
non-deterministic, then so is its effect on ψ. Hence, the postulates to follow only provide necessary conditions for different ways belief and trust may change; the general principle being that the scope of change on revising with φ is limited to formulas and sources which are φ- and ¬φ-relevant.
Postulating sufficient conditions is, we believe, ill-advised.
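The source-mediated support just described (a formula supporting another formula through an intermediate source, as in Observation 4.1) can be pictured as reachability in a small directed graph. The following Python sketch is purely illustrative: the class, the edge encoding, and the node names are our own and not the paper's formal support graph.

```python
# Hypothetical sketch of a support graph over formulas and sources.
# Edges point from a supporter to what it supports; a formula may
# support a source (confirmation) and a source may support a formula
# (conveyance), so support can flow formula -> source -> formula.
from collections import defaultdict

class SupportGraph:
    def __init__(self):
        self.edges = defaultdict(set)  # node -> set of supported nodes

    def add_support(self, supporter, supported):
        self.edges[supporter].add(supported)

    def relevant_to(self, node):
        """All nodes reachable from `node` via support edges."""
        seen, stack = set(), [node]
        while stack:
            n = stack.pop()
            for m in self.edges[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        return seen

g = SupportGraph()
g.add_support("phi", "sigma")    # phi confirms source sigma
g.add_support("sigma", "psi")    # sigma conveyed psi
assert "psi" in g.relevant_to("phi")  # phi supports psi through sigma
```

On this reading, the non-determinism discussed above corresponds to the fact that reachability tells us *which* nodes may be affected by a revision, but not *how* their degrees change.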
5.2 Postulates
In the sequel, where φ is a formula and σ is a source, a σ-independent φ-kernel is, intuitively, a φ-kernel that would
still exist if σ did not exist. More precisely, for every ψ ∈
Γ, ψ is supported by some σ′′ ≠ σ, or ψ has no source.
Of course, all formulas are conveyed by sources. However,
given a forgetful filter, the record of sources for some formulas
may be missing from the history.
We believe a rational information revision operator should
observe the following postulates on revising an information
state K with (φ, σ), where φ ∈ T and T is a topic. The postulates are a formalization of the intuitions outlined earlier.
(⋉1 : Closure) K⋉(φ, σ) is an information state.
(⋉2 : Default Attitude) If (σ, T, t) ∉ T(K), then (σ, T, δ) ∈ T(K⋉(φ, σ)).
(⋉3 : Consistency) Cn(For(B(K⋉(φ, σ)))) ≠ L.
(⋉4 : Resilience) If Cn({φ}) = L, then K ⊀σ,T K⋉(φ, σ).
(⋉5 : Supported Entrenchment) K⋉(φ, σ) ≺φ K only if Cn(For(B(K))) = L.
(⋉6 : Opposed Entrenchment) K ⊀¬φ K⋉(φ, σ).
(⋉7 : Positive Relevance) If K ≺σ′,T K⋉(φ, σ) and φ ∈ For(B(K⋉(φ, σ))), then
1. σ′ ≠ σ is supported by φ; or
2. σ′ = σ and there is Γ ⊆ For(B(K)) where Γ is a σ-independent φ-kernel.
(⋉8 : Negative Relevance) If K⋉(φ, σ) ≺σ′,T K, then
1. φ ∈ For(B(K⋉(φ, σ))) and σ′ is ¬φ-relevant; or
2. σ′ = σ, but there is Γ ⊆ For(B(K⋉(φ, σ))) where Γ is a ¬φ-kernel.
(⋉9 : Belief Confirmation) If K ≺ψ K⋉(φ, σ), then ψ ≠ φ is supported by φ.
(⋉10 : Belief Refutation) If K⋉(φ, σ) ≺ψ K, then
1. ψ is ¬φ-relevant, and φ ∈ Cn(For(B(K⋉(φ, σ)))) or K⋉(φ, σ) ≺¬φ K; or
2. ψ is φ-relevant, and φ ∉ Cn(For(B(K⋉(φ, σ)))) or K⋉(φ, σ) ≺φ K.
Information revision should yield an information state
(⋉1). An information source that has no prior degree of trust
is associated with the default degree of trust (⋉2). A revised
information state is consistent even if the revising formula is
itself contradictory (⋉3). If φ is inconsistent, σ should not
become more trusted (⋉4); a specific operator might even choose to decrease trust in a source that conveys contradictions, as this is proof of its unreliability. Abiding by the admissible and
forbidden cases of information revision outlined in Tables
1 and 2, φ cannot become less entrenched unless the belief
base is inconsistent (⋉5) while, even if the belief base is
inconsistent, ¬φ should not become more entrenched (⋉6).
If an information source σ′ is more trusted after revision,
then (i) φ succeeds and (ii) either σ′ is different from σ and
supported by φ, or σ′ is σ and there is independent believed
evidence for φ (⋉7). If σ′ is less trusted after revision, then
it must be either that φ succeeds and σ′ (possibly identical
to σ) is relevant to ¬φ, or that σ′ is σ and there is believed
evidence for ¬φ that leads to rejecting φ (⋉8). ψ is more
entrenched after revision only if it is supported by φ (⋉9).
Finally, ψ is less entrenched after revision only if it is relevant to φ or ¬φ (or both) and the one it is relevant to is not
favored by the revision (⋉10).
5.3 Discussion
The following observations follow from the definition of information states, the support graph, and the postulates.
Observation 5.1. Let K be an information state.
1. Positive Entrenchment. If Cn(For(B(K))) ≠ L, then K⋉(φ, σ) ⊀φ K.
2. Positive Persistence. If Cn(For(B(K))) ≠ L and φ ∈ Cn(For(B(K))), then φ ∈ Cn(For(B(K ⋉ (φ, σ)))).
3. Negative Persistence. If ¬φ ∉ Cn(For(B(K))), then ¬φ ∉ Cn(For(B(K ⋉ (φ, σ)))).
4. Formula Relevance. If K ≢ψ K ⋉ (φ, σ), then ψ is φ- or ¬φ-relevant.
5. Trust Relevance. If K ≢σ′,T K ⋉ (φ, σ), then σ′ is φ- or ¬φ-relevant.
6. No Trust Increase I. If φ ∉ For(B(K ⋉ (φ, σ))), then there is no σ′ ∈ SK such that K ≺σ′,T K ⋉ (φ, σ).
7. Rational Revision. If Cn(For(B(K))) ≠ L, then an operator that observes ⋉5 and ⋉6 allows only the cases in Table 1 to occur.
The first two clauses of Observation 5.1 follow straight
away from the definition of the postulates. On the other
hand, the third and fourth clauses demonstrate how the postulates manage to reflect the intuitions behind information
revision that led us to propose the support graph. As previously discussed, information revision is concerned with
relevant change. Thus, we achieved our goal by ensuring
that if belief in a formula (or trust in a source) is revised,
this formula (or source) is relevant to the formula that triggered the revision (or possibly its negation). The fifth clause
highlights the fact that if the formula of revision is rejected,
no extra support is provided for anyone, and hence no source
will be more trusted. Finally, the last clause shows how the
postulates manage to capture our intuitions about belief revision highlighted in Tables 1 and 2.
Observation 5.2. Let K0 = {{}, {}, {}} be an information
state. For i > 0, let Ki refer to any state resulting from the
revision of Ki−1 using an operator ⋉, with a non-forgetful
conveyance inclusion filter, which observes the postulates in
Section 5.2. The following hold.
1. Single Source Revision. If S = {σ}, then, for any information state Ki where i > 0, the maximum degree in { t | (σ, T, t) ∈ T(Ki) } is δ.
2. No Trust Increase II. If for every σ ∈ SKi there is no source σ′ that is σ-relevant, then there is no σ ∈ SKi such that Ki−1 ≺σ,T Ki.
3. No Trust Increase III. If for every information state Kj, 0 < j ≤ i, and for every source σj ∈ SKj there is no source σj′ that is σj-relevant, then the maximum degree in { t | (σ, T, t) ∈ T(Ki) } is δ.
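The case distinctions of Tables 1 and 2 amount to comparing the entrenchment of φ and of ¬φ before and after revision. A toy Python reading (degrees as plain numbers, with 0 for "not believed"; the ≺φ relation itself is abstract in the paper, so this encoding is our own assumption) might be:

```python
def compare(before, after, formula):
    """Return '<', '>', or '=' comparing the entrenchment of `formula`
    in the prior state `before` and the revised state `after`.
    States are dicts mapping formulas to numeric degrees (0 = not believed)."""
    b, a = before.get(formula, 0), after.get(formula, 0)
    return "<" if b < a else ">" if b > a else "="

def case_signature(before, after, phi, neg_phi):
    # A signature like ('<', '=') means phi became more entrenched
    # while its negation is unchanged -- a confirmation-style case.
    return (compare(before, after, phi), compare(before, after, neg_phi))

K  = {"phi": 3}
K2 = {"phi": 5}                      # revision strengthened phi
assert case_signature(K, K2, "phi", "~phi") == ("<", "=")
```

A forbidden case such as B12 would then show up as the signature (">", "<"): a confirmed φ becomes less entrenched while ¬φ gains ground.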
The first clause in Observation 5.2 represents the case
where, in fact, trust does not matter. When there is a single source, the relevance relation reduces to logical implication between formulas. As trust can only decrease, because there can be no confirmations, information revision
becomes traditional non-prioritized belief revision. The second clause draws upon the same line of reasoning. If sources
are completely independent of each other, only self-support
is present; no source will be more trusted because, intuitively, there is no reliable independent evidence present for
any of them. Last but not least, the third clause further supports our claim that if sources are not relevant to each other,
relevance reduces to logical implication and, in the absence
of source-based support, no source will be more trusted (will
not exceed the default).
An ⋉ operator that observes the postulates in Section 5.2,
by design, fails to observe the following AGM postulates
(Alchourrón, Gärdenfors, and Makinson 1985):
• Success. The success postulate states that, on revising
with φ, agent A should believe φ. The ⋉ operator fails to
observe AGM-success because information revision depends not only on the formula of revision but also on the
source of that formula. Thus, A does not just accept a new
piece of information.
• Vacuity. AGM-vacuity states that expansion with φ, if
the belief base does not derive ¬φ, is a subset of revision
with φ. In order to draw a comparison, we have to first
define expansion of information states. Let K + (φ, σ)
denote the expansion of information state K with φ conveyed by σ. Expansion just adds a formula to the belief base without checking for consistency. Obviously,
φ ∈ Cn(For(B(K + (φ, σ)))). However, since ⋉
does not observe success, it is not always the case that
φ ∈ Cn(For(B(K ⋉ (φ, σ)))), and hence expansion is
not a subset of revision even if ¬φ ∉ Cn(For(B(K))).
Expansion on its own does not exist for information states
as, we believe, adding a piece of information from an information source should be part of revision because: i)
accepting or rejecting φ depends, among other things, on
trust in σ; ii) an expanded belief base is not necessarily a
consistent one.
• Extensionality. Extensionality says that if φ ⇔ ψ, then
revision with φ is equivalent to revision with ψ. There is
no notion of an information source in the traditional AGM
approach. Again, to draw a comparison, we will consider
the case where revision takes place with φ and ψ both
conveyed by the same information source σ. Even then,
⋉ fails to observe extensionality. Since trust in a source
is associated with a topic, and since topics need not be
closed, it is not always the case that φ conveyed by σ is
believed, if believed at all, to the same degree as ψ that
is also conveyed by σ. Hence, revision with (φ, σ), in
general, is not the same as revision with (ψ, σ) even if φ ⇔ ψ.
• Recovery. In our framework of information revision,
there is no operation of “contraction” on its own; it has
to be a part of revision. Contraction is the process of removing a formula from the consequences of a belief base.
AGM-recovery states that expansion with φ after contraction with φ should yield the original belief set before
contraction (if φ already belonged to the belief set). Thus,
modeling recovery in information states amounts to enforcing that
removing a formula φ from B(K) and then expanding
with (φ, σ) will result in the original belief base. As with
the previous cases, ⋉ fails to observe recovery because, if
the contraction of φ occurred, the reintroduction of φ will
affect the resulting degrees of belief and trust differently
depending on the source of φ.
6 Extended Example
Let information structure I = (LV , C, S, G), where
• Language LV is a propositional language with the set
V = {Arr, Inc, Doomed, Kwin, Jwin, Afather,
Lymarried, Lymother} of propositional variables. The
intuitive meaning of the variables is as follows. Arr
means “The army of Dany will arrive”. Inc means “Jon’s
army increased in size”. Doomed means “We are all
doomed”. Kwin denotes “The Knight King wins”, while
Jwin denotes “Jon wins”. Afather means that “Agon is
the father of Jon”. Lymarried denotes that “Agon married Lyanna”, and finally, Lymother represents “Lyanna
is the mother of Jon”.
• C = {{LV }}.
• S = {Tyrion, Sam, Peter, Varys, Jon}.
• G = (N, N, ≺N , ≺N , 1) where ≺N is the natural order on
natural numbers.
Then, we define information state K0 = (B0 , T0 , H0 ) as
follows:
• B0 = {(Arr → Inc, 20), (Inc → Jwin, 20),
(¬Jwin → Kwin, 20), (Kwin → Doomed, 20),
(Afather, 10), (Lymarried, 6),
(Afather ∧ Lymarried → Lymother, 10)}.
• T0 = {(Tyrion, 5), (Sam, 5), (Varys, 4), (Peter, 3),
(Jon, 10)}. The topic attribute was dropped from the tuples because there is only a single topic.
• H0 = {}.
That is, we start revising with a consistent, non-empty belief
base, an empty history, and with an attribution of trust for all
information sources.
Let ⋉G be an information revision operator that will be
used in this example. (⋉G is just an operator created for the purpose of demonstrating interesting cases and is not a generic operator of information revision.) To illustrate how it works, we need to
define what we call the support degree. The support degree
of a formula φ with respect to a source σ is the number of
believed σ-independent φ-kernels (other than {φ}) plus the
number of sources (other than σ) that conveyed φ directly.
Moreover, the support degree of a source σ is the sum of the
support degrees, with respect to σ, of all formulas it conveyed. The intuition is as follows. A source is supported to
the extent formulas conveyed by this source are supported.
However, we take into account source-independent kernels
to eliminate exclusive self-support.
Given an information state K with a support graph G(K),
on revising with (φ, σ), ⋉G operates as follows.
1. If φ is inconsistent, it will be rejected.
2. Otherwise, a degree of belief for φ is derived. For
any formula, in this case φ, the degree of belief bφ =
Max(F, S). F represents the degree to which an agent
believes φ given all φ-kernels, while S represents how
much an agent believes φ given trust in the sources that
conveyed φ. Since a kernel is as strong as its weakest
formula, let Γφ be the set containing, for every φ-kernel, the formula with the lowest degree. Then, F is
the degree of the formula with the maximum degree in
Γφ. Intuitively, the derived degree of belief in φ, given
formulas, is that of its strongest support. Similarly, S is
the degree of the most trusted source among those that
previously conveyed φ, including σ.
3. Add (φ, bφ) to the belief base. If any contradiction arises,
for example between ¬φ and φ (or any two formulas ψ and ¬ψ that
both belong to Cn(For(B(K)))), the derived degree of
belief in ¬φ is compared to that of φ and the one with the
lower degree is contracted. To contract a formula ξ, ⋉G removes, recursively, from every single
ξ-kernel the formula with the lowest degree.
4. Once the beliefs are consistent, the support graph is
reconstructed. Given the new graph, trust in every φ- or
¬φ-relevant source σ′ is, possibly, revised. If (σ′, t1) ∈
T(K) and (σ′, t2) ∈ T(K⋉G), while the support degree
of σ′ in K is d1 and the support degree of σ′ in K⋉G is
d2, then the new degree of trust is t2 = t1 + (d2 − d1). The
proposed trust-update formula ensures that, for any source,
if the support degree increases, trust increases, and if the
support degree decreases, trust decreases.
5. Finally, given the new trust degrees derived in the previous step, for every formula ψ that is φ- or ¬φ-relevant, a
possible new degree of belief is derived in the same way
bφ was derived in step 2.
Observation 6.1. ⋉G observes ⋉3–⋉10 of Section 5.2.
We now follow the changes to the information state of
agent A as it observes the following conveyance instances.
Every information state Ki is the result of revising Ki−1
using ⋉G, starting from K0. The conveyance inclusion filter is non-forgetful, hence every instance of conveyance will
make it into the history.
First Instance: Peter conveys Arr. Since A has no evidence against Arr, A believes Arr. As there is no
evidence for Arr, its degree of belief will be equal to
that of the trust in its source Peter. There were no confirmations or refutations of any formulas, so the trust
base remains unchanged. K1 is as follows: B(K1) =
B(K0) ∪ {(Arr, 3)}, T(K1) = T(K0), and H(K1) =
{(Arr, Peter)}.
Second Instance: Tyrion conveys Doomed. Similar to
the previous case, A believes Doomed. K2 is as follows:
B(K2) = B(K1) ∪ {(Doomed, 5)}, T(K2) = T(K1),
and H(K2) = H(K1) ∪ {(Doomed, Tyrion)}.
Third Instance: Sam conveys Kwin. Kwin confirms
Doomed. Since Doomed had no kernels and was conveyed only by Tyrion, the support degree of Tyrion in
K2 (d1) was 0. This is why Tyrion was not more trusted
in K2. However, after Sam conveyed Kwin, there is a
Tyrion-independent Doomed-kernel {Kwin, Kwin →
Doomed}. Thus, the support degree of Doomed in
K3 with respect to Tyrion is 1, which makes Tyrion’s
new support degree (d2) 1 as well. Tyrion will be
more trusted by a value equal to d2 − d1 = 1. Because Tyrion is more trusted, Tyrion-relevant formulas could become more entrenched. In this particular case,
Doomed will be more entrenched. Hence, K3 is as follows: B(K3) = B(K1) ∪ {(Doomed, 6), (Kwin, 5)},
T(K3) = (T(K1) \ {(Tyrion, 5)}) ∪ {(Tyrion, 6)}, and
H(K3) = H(K2) ∪ {(Kwin, Sam)}.
Fourth Instance: Peter conveys Inc. The newly conveyed Inc is supported by a kernel {Arr, Arr → Inc}.
However, since the only source supporting both Arr and
Inc is Peter himself, Peter’s support degree will not
increase, but Inc will be believed as A has no evidence against it. K4 is as follows: B(K4) = B(K3) ∪
{(Inc, 3)}, T(K4) = T(K3), and H(K4) = H(K3) ∪
{(Inc, Peter)}.
Fifth Instance: Varys conveys Jwin. A has no evidence
against Jwin, thus A believes it. Moreover, both Arr and
Inc are believed propositions supporting Jwin. Now,
there are 2 Varys-independent Jwin-kernels, namely
{Inc, Inc → Jwin} and {Arr, Arr → Inc, Inc →
Jwin}. Hence, the support degree of Varys becomes
2. Thus, Varys will be more trusted by a value of 2.
As with the third instance, since Varys is more trusted,
the formulas that Varys supports could become more entrenched, and hence K5 is as follows: B(K5) = B(K4) ∪
{(Jwin, 6)}, T(K5) = (T(K4) \ {(Varys, 4)}) ∪
{(Varys, 6)}, and H(K5) = H(K4) ∪ {(Jwin, Varys)}.
Sixth Instance: Varys conveys Lymother. A already has
evidence for Lymother. Lymother has a single kernel {Afather, Lymarried, Afather ∧ Lymarried →
Lymother}. Since this kernel is not dependent
on Varys, Varys’s support degree increases by 1
(due to Lymother), resulting in Varys being more
trusted. The weakest formula in the Lymother-kernel
has a degree of 6. However, trust in Varys, a
source who directly conveyed Lymother, is 7, and
hence Lymother will have a degree of belief equal
to 7. As sources become more trusted, belief in
formulas conveyed by these sources could increase.
Hence, belief in Jwin will increase and K6 is as follows: B(K6) = B(K4) ∪ {(Jwin, 7), (Lymother, 7)},
T(K6) = (T(K5) \ {(Varys, 6)}) ∪ {(Varys, 7)}, and H(K6) =
H(K5) ∪ {(Lymother, Varys)}.
Seventh Instance: Jon himself, after the battle, conveys
¬Jwin. Here, ¬Jwin supports Kwin and Doomed.
However, it is a direct refutation of Jwin and provides
evidence against Inc and Arr. This is the first time A has
evidence against a newly conveyed formula. However,
Jon has the highest degree of trust and hence ¬Jwin will
have a higher degree of belief than Jwin. Thus, A will
choose to remove Jwin as follows.
Jwin has three kernels: Γ1 = {Jwin}, Γ2 = {Inc →
Jwin, Inc}, and Γ3 = {Arr, Arr → Inc, Inc →
Jwin}. The operator will remove the formula with the
lowest degree from every kernel. Γ1 has a single formula,
so it is removed and hence A gives up Jwin. Moreover,
in Γ2, Inc has a lower degree than Inc → Jwin, thus
A will give up Inc. Finally, following the same line of
reasoning, Arr will be removed from Γ3.
The support degree of Varys in K7 is 1, as opposed to 3 in
K6. Hence, Varys’s support degree decreased by 2 and,
subsequently, Varys becomes less trusted. Although
Peter is Jwin-relevant, according to the definition
of ⋉G, in this particular case, trust in Peter will not
decrease. Both Tyrion and Sam received a new confirmation and their support degrees increased by 1, resulting
in them being more trusted, which led formulas supported by them to become more entrenched. Jon is not
supported by any sources or formulas, so trust in Jon will
remain unchanged. K7 is as follows: B(K7) = B(K0) ∪
{(Doomed, 7), (Kwin, 6), (Lymother, 6), (¬Jwin, 10)},
T(K7) = {(Tyrion, 7), (Sam, 6), (Varys, 4),
(Peter, 3), (Jon, 10)}, and H(K7) = H(K6) ∪
{(¬Jwin, Jon)}.
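The contraction carried out in this instance can be sketched in Python. This is a single-pass reading of ⋉G's kernel contraction (the operator is described as recursive, but one pass suffices here); the kernel sets and degrees are hard-coded from the running example rather than computed.

```python
def contract(kernels, degrees):
    """Remove from every kernel its lowest-degree formula;
    return the set of formulas given up. Ties are resolved
    arbitrarily by min(); the example has none."""
    removed = set()
    for kernel in kernels:
        removed.add(min(kernel, key=lambda f: degrees[f]))
    return removed

# Degrees as they stand before the seventh revision (illustrative).
degrees = {"Jwin": 6, "Inc": 3, "Inc -> Jwin": 20,
           "Arr": 3, "Arr -> Inc": 20}
kernels = [{"Jwin"},                                  # Gamma_1
           {"Inc", "Inc -> Jwin"},                    # Gamma_2
           {"Arr", "Arr -> Inc", "Inc -> Jwin"}]      # Gamma_3
assert contract(kernels, degrees) == {"Jwin", "Inc", "Arr"}
```

The result matches the text: Jwin, Inc, and Arr are given up, while the implications survive because they carry higher degrees.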
The case of Lymother is particularly interesting.
Lymother became less entrenched after revision with
¬Jwin. In traditional AGM approaches, Lymother
would not have been considered relevant to ¬Jwin, and
hence it would not change, according to the principle of
minimality. However, as we previously argued, belief
in a formula depends on trust in the sources of said formula. Thus, when trust in Varys decreased, formulas conveyed by Varys (including Lymother) were subject to revision, independently of any logical relevance of Lymother to ¬Jwin.
7 Conclusion and Future Work
It is our conviction that belief and trust revision are intertwined processes that should not be separated. Hence, in this
paper, we argued why that is the case and provided a model
for performing joint belief-trust (information) revision
with minimal assumptions on the modelling language. Then,
we introduced the notion of information states, which allows
for the representation of information in a way that facilitates
the revision process. Moreover, we introduced the support
graph, a novel formal structure that highlights the
relevance relations between not only formulas but also information sources. Finally, we proposed the postulates that
we believe any rational information revision operator should
observe.
Future work could go in one or more of the following directions:
1. We intend to prove a representation theorem for the postulates we provided.
2. We intend to further investigate conveyance and information acquisition to further allow agents to trust/mistrust
their own perception(s).
3. Lastly, we would like to add desires, intentions, and other
mental attitudes to create a unified revision theory for all
mental attitudes, giving rise to an explainable AI architecture.
References
Alchourrón, C. E.; Gärdenfors, P.; and Makinson, D. 1985.
On the logic of theory change: Partial meet contraction and
revision functions. The Journal of Symbolic Logic 50(2):510–530.
Booth, R., and Hunter, A. 2018. Trust as a precursor to
belief revision. Journal of Artificial Intelligence Research
61:699–722.
Castelfranchi, C., and Falcone, R. 1998. Principles of trust
for MAS: Cognitive anatomy, social importance, and quantification. In Proceedings International Conference on Multi
Agent Systems, 72–79. IEEE.
Darwiche, A., and Pearl, J. 1997. On the logic of iterated
belief revision. Artificial Intelligence 89(1-2):1–29.
Demolombe, R., and Liau, C.-J. 2001. A logic of graded
trust and belief fusion. In Proceedings of the 4th workshop
on deception, fraud and trust in agent societies, 13–25.
Demolombe, R. 2001. To trust information sources: a proposal for a modal logical framework. In Castelfranchi, C.,
and Tan, Y.-H., eds., Trust and Deception in Virtual Societies. Dordrecht: Springer Netherlands. 111–124.
Demolombe, R. 2004. Reasoning about trust: A formal logical framework. In Jensen, C.; Poslad, S.; and Dimitrakos,
T., eds., Trust Management, 291–303. Berlin, Heidelberg:
Springer Berlin Heidelberg.
Drawel, N.; Bentahar, J.; and Shakshuki, E. 2017. Reasoning about trust and time in a system of agents. Procedia
Computer Science 109:632–639.
Elangovan, A.; Auer-Rizzi, W.; and Szabo, E. 2007. Why
don’t I trust you now? An attributional approach to erosion
of trust. Journal of Managerial Psychology 22(1):4–24.
Falcone, R., and Castelfranchi, C. 2001. Social trust: A
cognitive approach. In Castelfranchi, C., and Tan, Y.-H.,
eds., Trust and Deception in Virtual Societies. Dordrecht:
Springer Netherlands. 55–90.
Gärdenfors, P., and Makinson, D. 1988. Revisions of knowledge systems using epistemic entrenchment. In Proceedings
of the 2nd conference on Theoretical aspects of reasoning
about knowledge, 83–95. Morgan Kaufmann Publishers Inc.
Hansson, S. O. 1994. Kernel contraction. The Journal of
Symbolic Logic 59(3):845–859.
Hansson, S. O. 1999a. A survey of non-prioritized belief
revision. Erkenntnis 50(2-3):413–427.
Hansson, S. O. 1999b. A textbook of belief dynamics - theory
change and database updating, volume 11 of Applied logic
series. Kluwer.
Hardwig, J. 1991. The role of trust in knowledge. The
Journal of Philosophy 88(12):693–708.
Haselhuhn, M. P.; Schweitzer, M. E.; and Wood, A. M.
2010. How implicit beliefs influence trust recovery. Psychological Science 21(5):645–648.
Herzig, A.; Lorini, E.; Hübner, J. F.; and Vercouter, L. 2010.
A logic of trust and reputation. Logic Journal of the IGPL
18(1):214–244.
Holton, R. 1994. Deciding to trust, coming to believe. Australasian Journal of Philosophy 72(1):63–76.
Ismail, H., and Attia, P. 2017. Towards a logical analysis
of misleading and trust erosion. In Gordon, A. S.; Miller,
R.; and Turán, G., eds., Proceedings of the Thirteenth International Symposium on Commonsense Reasoning, COMMONSENSE 2017, London, UK, November 6-8, 2017, volume 2052 of CEUR Workshop Proceedings. CEUR-WS.org.
Jones, A. J., and Firozabadi, B. S. 2001. On the characterisation of a trusting agent—aspects of a formal approach.
In Castelfranchi, C., and Tan, Y.-H., eds., Trust and Deception in Virtual Societies. Dordrecht: Springer Netherlands.
157–168.
Jones, A. J. 2002. On the concept of trust. Decision Support
Systems 33(3):225–232.
Jøsang, A.; Ivanovska, M.; and Muller, T. 2015. Trust
revision for conflicting sources. In The 18th International
Conference on Information Fusion (Fusion 2015), 550–557.
IEEE.
Katz, Y., and Golbeck, J. 2006. Social network-based trust
in prioritized default logic. In Proceedings of the TwentyFirst National Conference on Artificial Intelligence (AAAI
2006), 1345–1350.
Leturc, C., and Bonnet, G. 2018. A normal modal logic
for trust in the sincerity. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 175–183. International Foundation for Autonomous Agents and Multiagent Systems.
Liau, C.-J. 2003. Belief, information acquisition, and trust in
multi-agent systems—a modal logic formulation. Artificial
Intelligence 149(1):31–60.
Lorini, E., and Demolombe, R. 2008. From binary trust to
graded trust in information sources: A logical perspective.
In International Workshop on Trust in Agent Societies, 205–
225. Springer.
Lorini, E.; Jiang, G.; and Perrussel, L. 2014. Trust-based
belief change. In T. Schaub, G. Friedrich, B. O., ed., Proceedings of the 21st European Conference on Artificial Intelligence (ECAI 2014), volume 263 of Frontiers in Artificial
Intelligence and Applications, 549–554. Amsterdam: IOS
Press.
McLeod, C. 2015. Trust. In Zalta, E. N., ed., The Stanford
Encyclopedia of Philosophy. Metaphysics Research Lab,
Stanford University, fall 2015 edition.
Rodenhäuser, L. B. 2014. A matter of trust: Dynamic attitudes in epistemic logic. Universiteit van Amsterdam [Host].
Sabater, J., and Sierra, C. 2005. Review on computational
trust and reputation models. Artificial Intelligence Review
24(1):33–60.
Sakama, C.; Caminada, M.; and Herzig, A. 2010. A logical account of lying. In European Workshop on Logics in
Artificial Intelligence, 286–299. Springer.
Sakama, C. 2015. A formal account of deception. In 2015
AAAI Fall Symposia, Arlington, Virginia, USA, November
12-14, 2015, 34–41. AAAI Press.
Simpson, J. A. 2007. Psychological foundations of trust.
Current Directions in Psychological Science 16(5):264–268.
Van Benthem, J. 2007. Dynamic logic for belief revision.
Journal of Applied Non-Classical Logics 17(2):129–155.
van Ditmarsch, H. 2014. Dynamics of lying. Synthese
191(5):745–777.
Algebraic Foundations for Non-Monotonic Practical Reasoning
Nourhan Ehab¹, Haythem O. Ismail²,¹
¹Department of Computer Science and Engineering, German University in Cairo
²Department of Engineering Mathematics, Cairo University
{nourhan.ehab, haythem.ismail}@guc.edu.eg
Abstract
whenever they conflict with his beliefs and prefers to give up
his desires whenever they conflict with his obligations. What
should Ted do?
Practical reasoning is a hallmark of human intelligence. We
are confronted everyday with situations that require us to
meticulously choose among our possibly conflicting desires,
and we usually do so with ease guided by beliefs which may
be uncertain or even contradictory. The desires we end up
choosing to pursue make up intentions which we seamlessly
revise whenever our beliefs change. Modelling the intricate
process of practical reasoning has attracted a lot of attention
in the KR community giving rise to a wide array of logics
coming in diverse flavours. However, a robust logic of practical reasoning with adequate semantics representing the preferences among the agent’s different mental attitudes, while
capturing the intertwined revision of beliefs and intentions,
remains missing. In this paper, we aspire to fill this gap by introducing general algebraic foundations for practical reasoning. We present an algebraic logic we refer to as LogA PR
capable of modelling preferences among the agent’s different
mental attitudes and capturing their joint revision.
1
Since the days of Aristotle in the twelfth century BC,
modelling practical reasoning has posed a difficult challenge for philosophers, logicians, and computer scientists
alike. Several attempts have been made over the years to
come up with logical theories of practical reasoning; however, a comprehensive and adequate theory remains missing (Thomason 2018). Some endeavours at modelling
practical reasoning have been successful when the problem was viewed as a mean to model rational agency. In
this view, rational agents are thought of as practical reasoners that act based on their beliefs about the world and
driven by their desires (Rao and Wooldridge 1999; Searle
2003). The action-attitudes of the agent representing its
commitment to some motivations, as permissible by its beliefs, are classically referred to as intentions (Bratman 1987;
Cohen and Levesque 1990). In this way, the agent’s intentions are evidently dependent on both its beliefs and desires.
Taking the trinity of beliefs, desires, and intentions to be
the key elements of the agent’s mental state is the approach
taken by the much renowned BDI model of rational agents
(Rao and Georgeff 1995) and its extensions to include other
mental attitudes such as obligations (Broersen et al. 2001;
Broersen et al. 2002).
In practical settings, the agent’s beliefs and desires are
often governed by a system of preferences and are continuously revised. Consequently, the revision of beliefs and
desires must be reflected on the agent’s intentions. The existing logical approaches to modelling preferences within
the BDI architecture are the graded-BDI (g-BDI) model
(Casali, Godo, and Sierra 2008; Casali, Godo, and Sierra
2011) and TEAMLOG (Dunin-Keplicz, Nguyen, and Szalas
2010). While both approaches propose frameworks for joint
reasoning with graded beliefs, desires, and intentions; neither has an account for the joint revision of the three mental
attitudes. Moreover, the g-BDI model lacks precise semantics and TEAMLOG is based on a normal modal logic providing only a third-person account of reasoning about the
mental attitudes. On the other hand, the joint revision of
beliefs and intentions has been attempted in (Shoham 2009;
Icard, Pacuit, and Shoham 2010). These theories, however,
do not account for desire or preferences over beliefs and in-
1 Introduction
“What should I do?” is a question that repeatedly poses itself
in our everyday lives. The process of reflection and deliberation we undergo to answer this question is what we refer to
as practical reasoning (Broome 2002). To demonstrate the
intricate process of practical reasoning, consider the following example.
Example 1. The Weekend Dilemma
Ted needs to decide what to do over the weekend. He has
to work on a long overdue presentation, as his boss will be really mad if Ted does not give the presentation by the beginning of next week at the latest. Ted also had previous plans with
his best friend Marshall to go on a hunting trip during the
weekend. Ted thinks that he can go to the trip and dedicate
some time to work on the presentation there. Barney, Ted’s other best friend, who is currently in a falling-out with Marshall, told Ted that he heard from his fiancée Robin that the trip location has no internet connectivity, so Ted will not be able to
work on the presentation there. Ted trusts that Robin usually tells the truth, but suspects that Barney might be lying
to make him not go on the trip with Marshall. Ted desires
to go to the trip, but he still wishes he desired to not make
his boss mad. As Ted wants to start being more rational and
responsible, he prefers to give up his desires or obligations
there does not exist a formalism for practical reasoning that, like LogA PR, possesses all of the following capabilities.
1. LogA PR is a graded logic. The use of graded propositions in LogA PR allows the representation of preferences among the agent’s beliefs and motivations. This
is useful to represent Ted’s different trust degrees in his
own supposition that he can go to the trip and work on
the presentation and the contradicting assertion attributed
to Robin by Barney. Further, preferences among Ted’s
different motivations can likewise be represented.
2. The nesting of graded propositions in LogA PR admits
the representation of nested graded beliefs and motivations. This naturally facilitates the representation of information acquired by Ted through a chain of sources (Barney and Robin) with different trust degrees. Permitting
the nesting of graded motivations facilitates the representation of higher-order desires, first introduced in (Frankfurt 1988), which is useful for representing Ted’s wish to
desire not to make his boss mad, for instance.
3. Different scales of graded motivations can be represented
in LogA PR. Having separate scales is useful for modelling agents with contradicting motivations and allows us
to circumvent several paradoxes of deontic logic, as suggested in (Ismail 2020). Two scales, personal desire and
obligation, are needed to account for Ted’s desire to go to
the trip and his obligation towards working on the presentation. We also provide an account for modelling the characters of artificial agents as an ordering over their beliefs
and motivation scales. For example, a hedonistic agent
will always prefer to pursue its desires over its obligations
while a selfless agent will always prefer to pursue its obligations over its desires. Whenever contradictions among
the agent’s motivations arise, they are resolved by alluding to the grades of the conflicting motivations in addition
to the agent’s character. In our example, Ted’s character
is represented as his preference to pursue his obligations
whenever they conflict with his desires.
4. The precise semantics of LogA PR account for joint reasoning with, and revision of, graded beliefs and motivations. We follow (Rao and Georgeff 1995) and refer to
the subset of consistent motivations the agent chooses to
pursue as its intentions.
At this point, questions about the grades associated with beliefs and motivations may occur to the reader. The set of
grades in LogA PR can be any totally ordered set of (reified) particulars. The grades may be numeric or may merely
be locations on a qualitative scale. As such, only the order
among the grades is significant and not their actual nature.
The assignment of grades to particular beliefs and motivations is out of the scope of this paper; we briefly remark on
this, however. The grades come from the same knowledge
source that provides the beliefs and motivations themselves.
If the latter are provided by a knowledge engineer, for example, then so must the former be. This should not complicate the task of the knowledge engineer, as several studies have shown that domain experts are often quite good at setting
and subjectively assessing numbers to be used as grades for
the beliefs and motivations (Charniak 1991). Alternatively,
intentions. In this paper, we aspire to address this gap in the
literature. Our contribution is twofold. First, we introduce
general algebraic foundations for first-person practical reasoning with several mental attitudes where preference and
joint revision can be captured. Second, we provide precise
semantics for an algebraic logic we refer to as LogA PR for
joint reasoning with graded beliefs and motivations. “Log”
stands for logic, “A” for algebraic, and “PR” for practical reasoning. The grades associated with the beliefs and
motivations in LogA PR are reified and are taken to represent measures of trust or preference. In LogA PR, we deviate from the BDI model and its extensions in (at least)
two ways: (i) we replace the notion of desire with a more
general notion of motivation to encompass all the different
types of motivational attitudes a rational agent can have including (but not limited to) desires, obligations, and social
norms; and (ii) we follow (Castelfranchi and Paglieri 2007;
Cohen and Levesque 1990) and treat intention as a mental
attitude derived from belief and motivation rather than treating it as a basic attitude.
The rest of the paper is structured as follows. In Section 2,
we present the motivations behind employing a non-classical
logic like LogA PR by highlighting its different capabilities.
Since we are taking the algebraic route, we review in Section 3 foundational concepts of Boolean algebra on which
LogA PR will be based. We also generalize the classical
notion of filters in Boolean algebra into what we will refer
to as multifilters providing a generalized algebraic treatment
of reasoning with multiple mental attitudes. Next, in Section 4, we present the syntax and semantics of LogA PR.
In Section 5, we extend multifilters to accommodate reasoning with graded beliefs and motivations. Additionally, we
present our extended graded consequence relation representing the joint reasoning from graded beliefs and motivations
to intentions. Finally, in Section 6 we outline some concluding remarks.
2 Why LogA PR?
LogA PR is the most recent addition to a growing family
of algebraic logics (Ismail 2012; Ismail 2013; Ismail 2020;
Ehab and Ismail 2020). As such, it is essential for a treatment of practical reasoning within the algebraic framework.
Hence, independent motivations for the algebraic approach
are also motivations for LogA PR. Such motivations do
exist, and are detailed in (Ismail 2012; Ismail 2013; Ismail 2020; Ehab and Ismail 2020). Furthermore, LogA PR
is a generalization of LogA G which is an algebraic logic
we presented earlier for non-monotonic reasoning about
graded beliefs. As proven in (Ehab and Ismail 2018;
Ehab and Ismail 2019), LogA G can capture a wide array
of non-monotonic reasoning formalisms such as possibilistic logic, circumscription, default logic, autoepistemic logic,
and the principle of negation as failure. Thus, LogA G can
be considered a unifying framework for non-monotonicity.
LogA PR naturally inherits all the features of LogA G yielding a very powerful system of practical reasoning.
In what follows, we briefly present the different features
of LogA PR and motivate why they are needed by referring
to the introductory example. To the best of our knowledge,
and (ii) if a ≤ b then ({⊤}^{i−1} × {a} × {⊤}^{k−i}) × (P^{i−1} × {b} × P^{k−i}) ⊆ ⪯_k.
if the beliefs and motivations are learned by some machine
learning procedure, then the, typically numeric, grades can
be learned as well. Several attempts for accomplishing this
are suggested in (Fern 2010; Paccanaro and Hinton 2001;
Richardson and Domingos 2006; Yang et al. 2015; Vovk,
Gammerman, and Shafer 2005). It is also worth pointing out
that any difficulty resulting from the task of assigning grades
is a price one is bound to pay to account for non-monotonic
reasoning. It can be argued that similarly challenging tasks arise in other non-graded non-monotonic formalisms. For instance, how do we set the priorities among
the default rules when using prioritized default logic? It
might even be that using quantitative grades simplifies the
problem as there are several well-defined computational approaches to setting the grades as we previously pointed out.
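Since only the order among grades matters, a qualitative scale can stand in for numbers. The following minimal sketch illustrates this point; the scale labels and the function name are our own illustration, not part of LogA PR:

```python
# Grades as an arbitrary totally ordered set of labels: only their
# relative order on the scale is significant, not their actual nature.
QUALITATIVE_SCALE = ["none", "weak", "moderate", "strong", "certain"]

def grade_leq(g1: str, g2: str, scale=QUALITATIVE_SCALE) -> bool:
    """Compare two grades by their position on the (total) scale."""
    return scale.index(g1) <= scale.index(g2)
```

Swapping the labels for numbers changes nothing, since all comparisons go through scale positions only.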
We will henceforth drop the subscript k in ⪯_k whenever there is no resulting ambiguity.
Definition 3.3. Let ⪯ be a k partial order on A and C ⊆ {1, ..., k}. A ⪯-multifilter of A with respect to C is a tuple F_⪯(C) = ⟨F1, F2, ..., Fk⟩ of subsets of P such that
1. ⊤ ∈ Fi, for 1 ≤ i ≤ k;
2. if i ∈ C, a ∈ Fi, and b ∈ Fi, then a · b ∈ Fi; and
3. if (a1, ..., ak) ⪯ (b1, ..., bk) and (a1, ..., ak) ∈ ×_{i=1}^k Fi, then (b1, ..., bk) ∈ ×_{i=1}^k Fi.
We can observe at this point that the three conditions on
multifilters are just generalizations of the three conditions
on filters. The second condition, though, need not apply to all the sets F1, ..., Fk. The set C specifies the sets that behave classically in observing the second condition.
We next define how multifilters can be generated by a tuple of sets of propositions. The intuition is that each set of
propositions represents a mental attitude and the tuple of sets
represents the collective mental state.
3 Boolean Algebras and Multifilters
In this section we lay the algebraic foundations on which
LogA PR is based. We start by reviewing the algebraic concepts of Boolean algebras and filters underlying classical
logic, then we extend the notion of filters to accommodate a
practical logic of multiple mental attitudes.
A Boolean algebra is a sextuple A = ⟨P, +, ·, −, ⊥, ⊤⟩ where P is a non-empty set with {⊥, ⊤} ⊆ P. A is closed under the two binary operators + and · and the unary operator − with commutativity, associativity, absorption, and complementation properties as detailed in (Sankappanavar and Burris 1981). For the purposes of this paper, we will take the elements of P to be propositions and the operators +, ·, and − to be disjunction, conjunction, and negation, respectively.
The following definition of filters captures an essential notion of Boolean algebras: an algebraic counterpart to logical consequence. Filters are defined in purely algebraic terms, without alluding to the notion of truth, by utilizing the natural lattice order ≤ on the algebra: for p1, p2 ∈ P, p1 ≤ p2 =def p1 · p2 = p1. Henceforth, A is a Boolean algebra ⟨P, +, ·, −, ⊥, ⊤⟩.
Definition 3.4. Let Q1, ..., Qk ⊆ P, ⪯ be a k partial order on A, and C ⊆ {1, ..., k}. The ⪯-multifilter generated by ⟨Q1, ..., Qk⟩ with respect to C, denoted F_⪯(⟨Q1, ..., Qk⟩, C), is a ⪯-multifilter ⟨Q′1, ..., Q′k⟩ with respect to C where Q′i is the smallest set containing Qi, for 1 ≤ i ≤ k.
The following theorem states that, under certain conditions, multifilters can be reduced to classical filters applied
to the different sets of propositions representing the different
mental attitudes.
Theorem 1. Let Q1, ..., Qk ⊆ P, C ⊆ {1, ..., k}, and ⪯ be a k partial order on A which is classical in i for some i ∈ C. If F_⪯(⟨Q1, ..., Qk⟩, C) = ⟨Q′1, ..., Q′k⟩, then Q′i = F(Qi).
In the remainder of the paper, we will be assuming that practical reasoning is based on a tuple of sets of propositions, the first set representing the agent’s beliefs and the rest representing different types of motivations that the agent acts upon. When using multifilters, we will assume that only the set of beliefs behaves classically (C = {1}). We will henceforth use F_⪯(⟨Q1, ..., Qk⟩) as a shorthand for F_⪯(⟨Q1, ..., Qk⟩, {1}).
Definition 3.1. A filter of A is a subset F of P where
1. ⊤ ∈ F ;
2. If a, b ∈ F , then a · b ∈ F ; and
3. If a ∈ F and a ≤ b, then b ∈ F .
The filter generated by Q ⊆ P is the smallest filter F (Q) of
which Q is a subset.
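To make Definition 3.1 concrete, here is a small sketch in the finite powerset algebra, where · is intersection and ≤ is ⊆; for a finite Q, the generated filter F(Q) is just the upward closure of the meet of Q. The function names are ours, not the paper's:

```python
from itertools import combinations

def powerset(universe):
    """All subsets of `universe`, as frozensets: the algebra's elements."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def generated_filter(Q, universe):
    """F(Q) in the powerset algebra: the upward closure of the meet of Q."""
    meet = frozenset(universe)          # start from the top element
    for q in Q:
        meet &= q                       # conjunction is intersection
    return {p for p in powerset(universe) if meet <= p}

U = {1, 2, 3}
F = generated_filter({frozenset({1, 2}), frozenset({2, 3})}, U)
# F is the set of all supersets of {2}, the meet of the two generators
```

One can check the three filter conditions directly on F: it contains ⊤ (= U), is closed under intersections, and is upward closed.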
4 LogA PR Languages
Since practical reasoning typically involves joint reasoning with multiple mental attitudes (beliefs, motivations, intentions, wishes, etc.), we extend the notion of filters giving rise to what we will refer to as multifilters. In contrast
to classical filters that rely on the natural order ≤ on the
Boolean algebra, multifilters will rely on an order on tuples.
(Recall that ≤ is the classical lattice order.)
In this section, we present the syntax and semantics of
LogA PR in addition to defining two logical consequence
relations one for beliefs and the other for motivations. Utilizing the multifilters presented in Section 3, we show that
our logical consequence relations have the distinctive properties of classical Tarskian logical consequence.
Definition 3.2. Let k be a positive integer. A k partial order on A is a partial order ⪯_k on P^k such that (a1, . . . , ak) ⪯_k (b1, . . . , bk) and bi = ⊥, for some 1 ≤ i ≤ k, only if aj = ⊥, for some 1 ≤ j ≤ k. Further, we say that ⪯_k is classical in i just in case (i) if (a1, ..., ak) ⪯_k (b1, ..., bk) then ai ≤ bi
4.1 LogA PR Syntax
LogA PR consists of terms constructed algebraically from
function symbols. There are no sentences; instead, we use
terms of a distinguished syntactic type to denote propositions. Propositions are included as first-class individuals
in the LogA PR ontology and are structured in a Boolean
algebra. Though non-standard, the inclusion of propositions in the ontology has been suggested by several authors
(Church 1950; Bealer 1979; Parsons 1993; Shapiro 1993).
Grades are also taken to be first-class individuals. As a result, propositions about graded beliefs and motivations can
be constructed, which are themselves recursively gradable.
A LogA PR language is a many-sorted language composed of a set of terms partitioned into three base sorts:
σP is a set of terms denoting propositions, σG is a set of
terms denoting grades, and σI is a set of terms denoting anything else. A LogA PR alphabet Ω includes a non-empty,
countable set of constant and function symbols each having
a syntactic sort from the set σ = {σP , σG , σI } ∪ {τ1 −→
τ2 | τ1 ∈ {σP , σG , σI } and τ2 ∈ σ} of syntactic sorts. Intuitively, τ1 −→ τ2 is the syntactic sort of function symbols that take a single argument of sort σP , σG , or σI and
produce a functional term of sort τ2 . Given the restriction of the first argument of function symbols to base sorts,
LogA PR is, in a sense, a first-order language. In addition, an alphabet Ω includes a countably infinite set of variables of the three base sorts; a set of syncategorematic symbols including the comma, various matching pairs of brackets and parentheses, and the symbol ∀; and a set of logical symbols defined as the union of the following sets: (i)
{¬} ⊆ σP −→ σP , (ii) {∧, ∨} ⊆ σP −→ σP −→ σP , (iii)
{⋖, ≐} ⊆ σG −→ σG −→ σP, and (iv) {G} ∪ {Mi}_{i=1}^k ⊆
σP −→ σG −→ σP . G(p, g) denotes a belief that the grade
of p is g and Mi (p, g) denotes that p is a motivation of type i
with a grade of g. Terms involving ⇒ (material implication)
and ∃ are abbreviations defined in the standard way.
A LogA PR language L is the smallest set of terms
formed according to the following rules, where t and ti
(i ∈ N) are terms in L.
The bridge rules serve to “bridge” propositions across the different mental attitudes. A bridge rule B, M1, ..., Mk ↦ B′, M1′, ..., Mk′ means that if B is a subset of the current beliefs and Mi is a subset of the current i motivations, then B′ should be added to the current beliefs and each Mi′ should be added to the current i motivations.
We now go back to Example 1 showing a corresponding
encoding of it as a LogA PR theory.
Example 2. Let “p” denote working on the presentation,
“t” denote going to the trip, and “m” denote the boss’s getting mad. A possible LogA PR theory representing Example
1 is T = hB, (M1 , M2 ), Ri where:
• B is made up of the following terms.
b1. G(p ∧ t, 5)
b2. G(G(p ⇔ ¬t, 10), 2)
• M1 = {M1 (t, 1)}.
• M2 is the set made up of the following terms.
o1. M2 (p, 1)
o2. M2 (M1 (¬m, 2), 3)
• R is the set of instances of the following rule schemas, where φ and g are variables.
r1. {}, {}, {M1(φ, g)} ↦ {}, {M1(φ, g)}, {}.
r2. {}, {M1(¬m, g)}, {} ↦ {}, {M1(p, g)}, {}.
b1 represents Ted’s belief that he can go to the trip and work on the presentation there. He trusts his belief b1 with a degree of 5. b2 represents the information Ted acquired through a chain of sources (Barney and Robin) that he cannot work on the presentation while being on the trip. Since Ted trusts Robin more than Barney, p ⇔ ¬t, which is acquired through Robin, is given the grade 10, and the whole graded belief G(p ⇔ ¬t, 10), acquired through Barney, is given the grade of 2 as Ted trusts Barney the least.
There are two types of motivations in this example: Ted’s
personal desires and his obligations. M1 (φ, g) represents that Ted desires to φ with a degree of g. Likewise,
M2 (φ, g) represents that Ted is obliged to φ with a degree
of g. M1 is made up of Ted’s desire to go to the trip with
a degree of 1. o1 represents Ted’s obligation to work on
the presentation with a degree of 1 as well. o2 represents
his obligation to desire to not make his boss mad.
r1 is a bridge rule motivated by Ted’s character, which prefers to pursue obligations. So whenever Ted is obliged to have a desire, he has it as a desire. r2
represents that if Ted has a desire to make his boss not
mad with a degree g, then he should desire to work on the
presentation with the same degree g.
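The theory of Example 2 can be written down directly as data. The following is a hypothetical encoding sketch; the tagged-tuple representation and all constructor names below are ours, not LogA PR syntax:

```python
p, t, m = "p", "t", "m"              # presentation, trip, boss gets mad

def G(phi, g):  return ("G", phi, g)     # graded belief
def M1(phi, g): return ("M1", phi, g)    # graded personal desire
def M2(phi, g): return ("M2", phi, g)    # graded obligation
def Not(phi):   return ("not", phi)
def And(a, b):  return ("and", a, b)
def Iff(a, b):  return ("iff", a, b)

B = [G(And(p, t), 5),                    # b1: Ted trusts p ∧ t to degree 5
     G(G(Iff(p, Not(t)), 10), 2)]        # b2: nested graded belief via Barney
M1s = [M1(t, 1)]                         # Ted's desire to go to the trip
M2s = [M2(p, 1),                         # o1: obligation to work
       M2(M1(Not(m), 2), 3)]             # o2: obligation to desire ¬m
T = (B, (M1s, M2s))                      # bridge-rule schemas r1, r2 omitted
```

The nesting in b2 and o2 is exactly what the graded-multifilter machinery of Section 5 is designed to reason with.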
• All variables and constants in the alphabet Ω are in L.
• f (t1 , . . . , tm ) ∈ L, where f ∈ Ω is of type τ1 −→
. . . −→ τm −→ τ (m > 0) and ti is of type τi .
• ¬t ∈ L, where t ∈ σP .
• (t1 ⊗ t2 ) ∈ L, where ⊗ ∈ {∧, ∨} and t1 , t2 ∈ σP .
• ∀x(t) ∈ L, where x is a variable in Ω and t ∈ σP .
• t1 ⋖ t2 ∈ L, where t1 , t2 ∈ σG .
• t1 ≐ t2 ∈ L, where t1, t2 ∈ σG.
• G(t1 , t2 ) ∈ L, where t1 ∈ σP and t2 ∈ σG .
• Mi (t1 , t2 ) ∈ L, where t1 ∈ σP and t2 ∈ σG .
In what follows, we consider two distinguished subsets
ΦG and ΦM of σP . ΦG is the set of terms of the form
G(φ, g) and ΦM is the set of terms of the form Mi(ψ, g) with
ψ not containing any occurrence of G.
Definition 4.1. A LogA PR theory T is a triple hB, M, Ri
where:
4.2 From Syntax to Semantics
A key element in the semantics of LogA PR is the notion of
a LogA PR structure.
• B ⊆ σP represents the agent’s beliefs;
• M = (M1 , ..., Mk ) is a k-tuple of subsets of ΦM representing the agent’s k motivation types; and
• R is a set of bridge rules, each of the form B, M1, ..., Mk ↦ B′, M1′, ..., Mk′, where B ⊆ σP, B′ ⊆ ΦG, and M1, ..., Mk, M1′, ..., Mk′ ⊆ ΦM.
Definition 4.2. A LogA PR structure is a sextuple Sk = ⟨D, A, g, Mk, ≪, e⟩, where
• D, the domain of discourse, is a set with two disjoint, nonempty, countable subsets: a set of propositions P, and a
set of grades G.
B′, M1′, ..., Mk′) ∈ R, B′ ≠ {} only if Mi = Mi′ = {} and [[B]]V ≤ [[B′]]V, then ⪯TV is classical in 1.
• A = hP, +, ·, −, ⊥, ⊤i is a complete, non-degenerate
Boolean algebra (Sankappanavar and Burris 1981).
• g : P × G −→ P is a belief-grading function.
• Mk = {mi | 1 ≤ i ≤ k} is a set of k motivation-grading
functions such that each mi : P × G −→ P.
• ≪ : G × G −→ P is an ordering function imposing a total order.
• e : G × G −→ {⊥, ⊤} is an equality function, where
for every g1 , g2 ∈ G: e(g1 , g2 ) = ⊤ if g1 = g2 , and
e(g1 , g2 ) = ⊥ otherwise.
We next utilise a multifilter based on a TV-induced order to define an extended logical consequence relation for beliefs and motivations.
Definition 4.5. Let T = ⟨B, (M1, ..., Mk), R⟩ be a LogA PR theory. For every φ ∈ σP, φ is a belief (or motivation) consequence of T, denoted T |=B φ (or T |=M φ), if, for every valuation V, [[φ]]V ∈ B (or [[φ]]V ∈ Mi for some i, 1 ≤ i ≤ k), where ⟨B, M1, ..., Mk⟩ = F_⪯TV(⟨[[B]]V, [[M1]]V, ..., [[Mk]]V⟩).
A valuation V of a LogA PR language is a triple ⟨S, Vf, Vx⟩, where S is a LogA PR structure, Vf is a function that assigns to each function symbol an appropriate function on D, and Vx is a function mapping each variable to a corresponding element of the appropriate block of D. An interpretation of LogA PR terms is given by a function [[·]]V.
Both |=B and |=M are monotonic and have the distinctive
properties of classical Tarskian logical consequence, with
|=B observing a variant of the deduction theorem.
Theorem 2. Let T = ⟨B, M, R⟩ and T′ = ⟨B′, M′, R′⟩ be LogA PR theories.
Definition 4.3. Let L be a LogA PR language and let V be
a valuation of L. An interpretation of the terms of L is given
by a function [[·]]V :
1. If φ ∈ B, then T |=B φ.
2. If φ ∈ Mi for some Mi ∈ M, then T |=M φ.
3. If T |=B φ, B ⊆ B′ , Mi ⊆ M′i for all 1 ≤ i ≤ k, and
R′ ⊆ R, then T′ |=B φ.
4. If T |=M φ, B ⊆ B′ , Mi ⊆ M′i for all 1 ≤ i ≤ k, and
R′ ⊆ R, then T′ |=M φ.
5. If T |=B ψ and ⟨B ∪ {ψ}, M, R⟩ |=B φ, then T |=B φ.
6. Let M′i = Mi ∪ {ψ}, for some 1 ≤ i ≤ k, and M′j = Mj, for j ≠ i. If T |=M ψ and ⟨B, M′, R⟩ |=M φ, then T |=M φ.
7. If ⟨B ∪ {φ}, M, R⟩ |=M ψ, then T |=B φ ⇒ ψ.
• [[x]]V = Vx(x), for a variable x
• [[c]]V = Vf(c), for a constant c
• [[f(t1, . . . , tm)]]V = Vf(f)([[t1]]V, . . . , [[tm]]V), for an m-adic (m ≥ 1) function symbol f
• [[(t1 ∧ t2)]]V = [[t1]]V · [[t2]]V
• [[(t1 ∨ t2)]]V = [[t1]]V + [[t2]]V
• [[¬t]]V = −[[t]]V
• [[∀x(t)]]V = ∏_{a∈D} [[t]]V[a/x]
• [[t1 ⋖ t2]]V = [[t1]]V ≪ [[t2]]V
• [[t1 ≐ t2]]V = e([[t1]]V, [[t2]]V)
• [[G(t1, t2)]]V = g([[t1]]V, [[t2]]V)
• [[Mi(t1, t2)]]V = mi([[t1]]V, [[t2]]V)
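The propositional clauses of Definition 4.3 can be sketched over the two-element Boolean algebra, where ·, +, and − become Python's `and`, `or`, and `not`. This is a minimal sketch, not the paper's full many-sorted semantics; the tagged-tuple terms and the name `interpret` are our own:

```python
def interpret(term, Vx):
    """Evaluate a propositional term (a tagged tuple) under assignment Vx."""
    if isinstance(term, str):        # a propositional variable
        return Vx[term]
    op = term[0]
    if op == "and":                  # [[t1 ∧ t2]] = [[t1]] · [[t2]]
        return interpret(term[1], Vx) and interpret(term[2], Vx)
    if op == "or":                   # [[t1 ∨ t2]] = [[t1]] + [[t2]]
        return interpret(term[1], Vx) or interpret(term[2], Vx)
    if op == "not":                  # [[¬t]] = −[[t]]
        return not interpret(term[1], Vx)
    raise ValueError(f"unknown operator: {op}")

# [[p ∧ ¬q]] under the assignment p := ⊤, q := ⊥
value = interpret(("and", "p", ("not", "q")), {"p": True, "q": False})
```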
5 Graded Multifilters
1. If b ≤ b′, then (b, ⊤, . . . , ⊤) ⪯TV (b′, ⊤, . . . , ⊤).
2. If (B, . . . , Mk ↦ B′, . . . , Mk′) ∈ R, then ([[B]]V, . . . , [[Mk]]V) ⪯TV ([[B′]]V, . . . , [[Mk′]]V).
Consider the LogA PR theory of the weekend dilemma from
Example 2. Given that Ted believes G(p ∧ t, 5), and does
not believe ¬(p ∧ t), it would make sense for him to accept
p∧t despite his uncertainty about it. (Who is ever absolutely
certain of their beliefs?) Similarly, it would make sense for
Ted to add t to his desires and p to his obligations if they
do not conflict with other motivations or beliefs. However, if we only use multifilters, we will never be able to reason with those nested graded beliefs and motivations, as they are not themselves in the agent’s theory; only propositions grading them are. For this reason, we extend our notion of
multifilters into a more liberal notion of graded multifilters
to enable the agent to conclude, in addition to the consequences of the initial theory, beliefs and motivations graded
by the initial beliefs and motivations (like p ∧ t). Should this
lead to contradictions, the agent’s character and the grades
of the contradictory propositions are used to resolve them.
Due to nested grading, graded multifilters come in degrees
depending on the depth of nesting of the admitted graded
propositions.
The rest of this section is dedicated to formalizing graded
filters and presenting our definition of graded consequence
for beliefs and motivations. We start by introducing some
convenient abbreviations and notational conventions.
Observation 4.1. If T = ⟨B, M, R⟩ is a LogA PR theory and V a valuation, then ⪯TV is a k + 1 partial order on A. Further, if, for every (B, M1, ..., Mk ↦
• Since we are modelling joint reasoning with beliefs and different types of motivations, in the sequel we assume a tuple Q = ⟨Q0, Q1, ..., Qk⟩ where Q0, ..., Qk ⊆ P. Q0
In the rest of the paper, for any Γ ⊆ σP, we will use [[Γ]]V to denote ∏_{p∈Γ} [[p]]V for notational convenience.
4.3 Logical Consequence
In this section, we employ our notion of multifilters from Section 3 to define logical consequence for LogA PR in algebraic terms. In Section 3, we defined multifilters based on an arbitrary partial order ⪯. We start by defining how to construct such an order for the tuples of propositions in P. The intuition is that the order is induced by the bridge rules in a LogA PR theory in addition to the natural order ≤ among the belief propositions.
Definition 4.4. Let T = ⟨B, (M1, . . . , Mk), R⟩ be a LogA PR theory and V a valuation. A TV-induced order, denoted ⪯TV, is a partial order over P^{k+1} with the following properties.
where ⊕ : ⋃_{i=1}^∞ G^i −→ G is commutative and ⟨Ck⟩_{k=1}^n is a permutation of the set of longest grading chains of p in R.
represents a set of believed propositions, and Q1 , ..., Qk
represent sets of motivation propositions where each Qi
represents a different type of motivation. We will refer to
Q as the mental state of the agent.
• For every p ∈ P and g ∈ G, g(p, g) is referred to as a
belief-grading proposition that grades p and p is a graded
belief. Similarly, mi (p, g) is a motivation-grading proposition and p is a graded motivation.
• If g(p, g) ∈ Q0 , then p is graded in Q0 . Similarly, if
mi (p, g) ∈ Qi , then p is graded in Qi .
• If R ⊆ P and p ∈ P, then GB (p, R) = {g(p, g) | g ∈ G
and g(p, g) ∈ R} and GM i (p, R) = {mi (p, g) | g ∈ G
and mi (p, g) ∈ R}, for 1 ≤ i ≤ k.
5.2 Telescoping and Graded Multifilters
The key to defining graded multifilters is the intuition that
the set of consequences of Q = hQ0 , Q1 , ..., Qk i may be
further enriched by telescoping Q and accepting some of the
beliefs and motivations embedded therein. We refer to this
process as “telescoping” as the set of graded multifilters at
increasing depths can be thought of as an inverted telescope.
To this end, we need to define (i) the process of telescoping, which is a step-wise process that considers both beliefs
and motivations at increasing degrees of embedding, and (ii)
a criterion for accepting embedded beliefs and motivations
without introducing inconsistencies. In this section, we will
be formalizing the process of telescoping and the construction of graded multifilters. A first step towards defining
graded multifilters is the notion of telescoping structures.
Definition 5.4. Let Sk be a LogA PR structure with a depth- and fan-out-bounded P. A telescoping structure for Sk is a septuple T = ⟨T, O, ⊗B, ⊕B, ⊗M, ⊕M, C⟩, where
• T = ⟨T0, T1, ..., Tk⟩, where T0, T1, ..., Tk ⊆ P. T0 is referred to as the set of top beliefs, and each Ti, for 1 ≤ i ≤ k, is referred to as a set of top motivations.
• O is an ultrafilter of the subalgebra induced by
Range(≪) (an ultrafilter is a maximal filter with respect
to not including ⊥ (Sankappanavar and Burris 1981));
• ⊗B , ⊕B , ⊗M , and ⊕M are fusion functions from tuples
of grades to grades; ⊕B and ⊕M are commutative.
• C is a partial preorder over the set {0,...,k} representing
the agent’s character.
The telescoping structure provides the top beliefs and motivations that will never be given up together with their consequences. The ultrafilter O provides a total ordering over
grades to enable comparing them. The operators ⊗B , ⊕B
and ⊗M , ⊕M are used to get fused grades for beliefs and
motivations respectively as per Definition 5.3. It is worth
noting that, for simplicity, we opted for fusing the grades
of all types of motivations using the same pair of operators
⊗M and ⊕M . The agent’s character C is defined as an ordering over the set {0, ..., k}. 0 will be taken to represent
the agent’s beliefs and 1, ..., k represent the different types
of motivation. The character of the agent in addition to the
grades of the motivations will be utilised when picking a
consistent set of motivations making up the agent’s intentions. For simplicity, in the sequel, we will be assuming that
the agent’s character is a total order, giving rise to what we
will refer to as a linear character.
We are now ready to define the T-induced telescoping of Q. The process of telescoping Q consists of first getting the multifilter of Q, then extracting the graded propositions embedded at depth 1. This might introduce inconsistencies. We resolve the inconsistencies by getting the tuple of kernel survivors κ(E^1(F_⪯(Q)), T) given the telescoping structure T. The telescoping structure T is useful in getting the survivors as it contains the top beliefs and motivations, fusion operators, and the agent’s character, all of which will be used to decide which propositions to keep and which to
5.1 Embedding and Grading Chains
As a building step towards formalizing graded multifilters,
the structure of graded propositions should be carefully scrutinized.
Definition 5.1. Let R ⊆ P and X be any of B or Mi, for 1 ≤ i ≤ k. The set E_X^n(R) of X-embedded propositions at depth n ∈ N in R is inductively defined as follows.
• E_X^0(R) = R and
• E_X^{i+1}(R) = E_X^i(R) ∪ {p | G_X(p, E_X^i(R)) ≠ {}}.
In the sequel, recalling that Q = ⟨Q0, Q1, . . . , Qk⟩ is the agent’s mental state, we let
E^n(Q) = ⟨E_B^n(Q0), E_M1^n(Q1), ..., E_Mk^n(Qk)⟩.
Having carefully defined the notions of embedding and
the degree of embedding of a graded proposition, we say
that a grading chain of a belief (or motivation) p is a non-empty, finite sequence ⟨q0, q1, . . . , qn⟩ where q0, q1, . . . , qn are belief (or motivation) grading propositions such that qi grades qi+1 for 0 ≤ i < n and qn grades p. We next define
some properties of sets of propositions based on the grading
chains they include.
Definition 5.2. Let R ⊆ P.
1. R is depth-bounded if there is some d ∈ N such that every
belief (or motivation) grading chain in R has at most d
distinct grading propositions.
2. R is fan-out-bounded if there is some fout ∈ N such that every belief (or motivation) grading chain in R grades at most fout propositions.
3. R is fan-in-bounded if there is some fin ∈ N where |GB(p, R)| ≤ fin (|GMi(p, R)| ≤ fin), for every p ∈ R.
Since nested grading is allowed, it is necessary to define the fused grade of a graded proposition p in a chain C.
Moreover, a proposition p might be graded by more than one
grading chain. Accordingly, we also need to fuse the grades
of p across all the chains grading it in some R ⊆ P. The
intuition is to compute the fused grade of p for each chain
that grades it by some operator ⊗, then combine these fused
grades together using another operator ⊕.
Definition 5.3. Let R ⊆ P be fan-in-bounded; then the fused grade of p in R is defined as
f_⊕(p, R) = ⊕⟨f_⊗(p, Ck)⟩_{k=1}^n
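As a concrete instance of this fusion scheme, one may take ⊗ = min along a chain (weakest link) and ⊕ = max across chains, a possibilistic-style choice we assume here purely for illustration; the paper leaves the fusion operators as parameters:

```python
def fused_grade(chains):
    """Fuse p's grades: `chains` lists the grading chains of p, each a
    list of numeric grades; ⊗ = min within a chain, ⊕ = max across."""
    return max(min(c) for c in chains)

# p graded by the chains <10, 2> and <7>: min gives 2 and 7, max picks 7
g = fused_grade([[10, 2], [7]])
```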
of Q where, for all i, j such that i precedes j in the character ordering C, Qi appears before Qj in QC. In the sequel, let QC = ⟨Q′0, ..., Q′k⟩ be a C-ordered Q and QC_⊥ be the set of ⊥-kernels in QC throughout.
give up. Since this process can cause some propositions to be given up, other propositions may lose their support. For this reason, we only retain the tuple of supported propositions ς(κ(E^1(F(Q)), T), T) amongst the kernel survivors.
Definition 5.7. Let X = ⟨X0, ..., Xk⟩ be a ⊥-kernel of QC. p does not survive X given T iff ∃i, 1 ≤ i ≤ k, such that p is a graded proposition in Xi where Xi is the left-most non-empty set in X and ∀q ∈ Xi such that q ∉ Ti′ with F(T) = ⟨T0′, ..., Tk′⟩, (fT(p, Q′i) ≪ fT(q, Q′i)) ∈ O.
We next define what we refer to as a next-best ⊥-kernel in QC⊥. The next-best ⊥-kernel is the ⊥-kernel that must be examined next to pick a proposition to give up from one of its sets to resolve the inconsistency. Our intuition in picking such a ⊥-kernel is that, as a first condition, it has the longest sequence of empty sets from the left. That is, we are forced to give up a proposition from a more preferred set. If there are multiple ⊥-kernels satisfying this first condition, a next-best ⊥-kernel will be a kernel that contains a proposition with the highest grade in the left-most non-empty set. We do this in order to tend first to the ⊥-kernels (satisfying the first condition) that contain the more preferred propositions. It is worth noting here that there might be multiple next-best ⊥-kernels if there is more than one ⊥-kernel with the same number of empty sets from the left where the left-most non-empty set contains propositions with the same highest grade.
Definition 5.8. A next-best ⊥-kernel X∗ = ⟨X0, ..., Xk⟩ ∈ QC⊥ satisfies the following properties.
Definition 5.5. Let T be a telescoping structure for Sk. If Q = ⟨Q0, Q1, ..., Qk⟩ where every item in E^1(F(Q)) is fan-in-bounded, then the T-induced telescoping of Q is given by

τT(Q) = ς(κ(E^1(F(Q)), T), T).
The first two steps of telescoping were already presented in Definitions 3.4 and 5.1, respectively; we only need to define the tuples of kernel survivors and supported propositions. For kernel survival, we generalize the notion of a ⊥-kernel of a belief base (Hansson 1994) to suit reasoning with multiple sets of propositions. The intuition is that a ⊥-kernel is a tuple of sets, one for each mental attitude, where the union of the sets is a subset-minimal inconsistent set. This means that if we remove all occurrences of a single proposition from the sets in the ⊥-kernel, the union becomes consistent. In what follows, we say that a set R ⊆ P is inconsistent whenever the classical filter of R is improper (F(R) = P).
Definition 5.6. Let Q = ⟨Q0, ..., Qk⟩, X = ⟨X0, ..., Xk⟩, and ⟨X0′, ..., Xk′⟩ = F(X). X is a ⊥-kernel of Q iff Xi ⊆ Qi, 0 ≤ i ≤ k, and X0′ ∪ X1′ ∪ ... ∪ Xk′ is a subset-minimal inconsistent set of propositions.
Example 3. We refer back to Example 2. The following are examples of ⊥-kernels.¹ The first set in each ⊥-kernel represents beliefs of Ted's, the second represents desires thereof, and the third obligations.
1. There does not exist another ⊥-kernel in QC⊥ with a longer sequence of empty sets from the left.
2. Let Xi be the left-most non-empty set in X∗ with a proposition p ∈ Xi. If there does not exist another proposition r ∈ Xi such that (fT(p, Q′i) ≪ fT(r, Q′i)) ∈ O, then there does not exist another ⊥-kernel ⟨X0′, ..., Xk′⟩ ∈ QC⊥ satisfying condition (1) with Xi′ containing a proposition q where (fT(p, Q′i) ≪ fT(q, Q′i)) ∈ O.
We are now ready to present the construction of the tuple of kernel survivors. We pick a next-best ⊥-kernel from QC⊥ and get the propositions that do not survive it according to Definition 5.7. There might be more than one proposition that does not survive if they all have the same lowest grade. Such propositions are removed from all the beliefs and motivations in Q to resolve the inconsistency. The intuition behind doing this is that if some propositions in some set in the next-best ⊥-kernel do not survive, then they cannot survive in any other set if we are to guarantee the consistency of the union of the sets in the mental state. We then proceed to getting the kernel survivors from the updated Q until the union of the sets in the mental state becomes consistent.
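The loop implicit in Definitions 5.7–5.9 can be sketched in Python (illustrative only; `find_kernels` and the grade assignment below are hypothetical stand-ins, with the kernels and grades chosen to mirror case 2 of Example 4 later in this section):

```python
# Illustrative sketch of kernel survival (Definitions 5.7-5.9). Q is a list
# of sets ordered least to most preferred; while bottom-kernels remain, pick
# a next-best kernel, drop the lowest-graded proposition(s) of its left-most
# non-empty set, and remove them from every attitude.

def leftmost(X):
    return next(i for i, s in enumerate(X) if s)

def next_best(kernels, grade):
    depth = max(leftmost(X) for X in kernels)           # longest empty prefix
    ties = [X for X in kernels if leftmost(X) == depth]
    return max(ties, key=lambda X: max(grade(p) for p in X[leftmost(X)]))

def kernel_survivors(Q, find_kernels, grade):
    Q = [set(s) for s in Q]
    while True:
        kernels = find_kernels(Q)
        if not kernels:
            return Q
        X = next_best(kernels, grade)
        i = leftmost(X)
        low = min(grade(p) for p in X[i])
        doomed = {p for p in X[i] if grade(p) == low}   # non-survivors
        Q = [s - doomed for s in Q]

# Hypothetical stand-in for bottom-kernel enumeration: the three kernels of
# Example 4 (desires, obligations, beliefs), kept only while still inside Q.
POOL = [[set(), set(), {"p&t", "p<=>~t"}],
        [{"t"}, {"p"}, {"p<=>~t"}],
        [{"p", "t"}, set(), {"p<=>~t"}]]

def find_kernels(Q):
    return [X for X in POOL if all(X[i] <= Q[i] for i in range(3))]

grade = {"p&t": 5, "p<=>~t": 6, "t": 1, "p": 2}.get
Q = [{"p", "t"}, {"p"}, {"p&t", "p<=>~t"}]
print(kernel_survivors(Q, find_kernels, grade))
# [{'p'}, {'p'}, {'p<=>~t'}]
```

With these invented grades, p ∧ t is given up first (grade 5 < 6), then t (grade 1 < 2), matching case 2 of Example 4 and leaving a consistent mental state.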
Definition 5.9. The tuple of kernel survivors of Q given T is κ(Q, T) where κ(Q, T) is defined as follows:
1. if QC⊥ = ∅, then κ(Q, T) = Q; and
2. if X∗ is a next-best ⊥-kernel in QC⊥ and S is the set of propositions that do not survive X∗ given T, then
1. ⟨{p ∧ t, p ⇔ ¬t}, {}, {}⟩.
2. ⟨{p ⇔ ¬t}, {t}, {p}⟩.
3. ⟨{p ⇔ ¬t}, {p, t}, {}⟩.
The first ⊥-kernel shows a contradiction within Ted's beliefs; the second shows a contradiction between a belief, a desire, and an obligation; and the third shows a contradiction between a belief and two desires. Note that the unions resulting from the three ⊥-kernels are subset-minimal.
How do we choose propositions to give up and resolve
inconsistency? The intuition is this: The proposition to be
given up must be from the least preferred set in the mental
state according to the agent’s character. If the least preferred
set contains more than one proposition, then the proposition
to be given up must be the proposition with the least grade
in the set. To make finding the least preferred set easier,
we reorder Q such that its items are ordered from the least
preferred to the most preferred according to the character,
and construct the ⊥-kernels out of the reordered Q. In this way, the least preferred set in a ⊥-kernel will be the left-most non-empty set.
Henceforth, we assume Q = ⟨Q0, ..., Qk⟩ where each set in Q is fan-in-bounded and a telescoping structure T = ⟨T, O, ⊗B, ⊕B, ⊗M, ⊕M, C⟩ with T = ⟨T0, T1, ..., Tk⟩. We say that QC is a C-ordered Q if QC is a permutation
κ(Q, T) = κ(Q′, T)

where Q′ = ⟨Q0 − S, Q1 − S, ..., Qk − S⟩.

¹We use the syntactic ∧ and ⇔ operators rather than their semantic counterparts for readability.
Example 4. Suppose we have the same ⊥-kernels as in Example 3. The agent character, according to Example 1, is C = {0 < 1, 0 < 2, 2 < 1}, with 0 representing the agent's beliefs, 1 representing the desires, and 2 representing the obligations. After getting QC, QC⊥ contains the following three kernels. The sets in the kernels are now ordered according to the agent character, with the first set containing desires, the second set containing obligations, and the third set containing beliefs.
we give up p ∧ t from the first ⊥-kernel. This means that any proposition that was supported by p ∧ t must go away as well, as it loses its support. Therefore, the following definition states that the supported propositions are the propositions in the sets of the multifilter of T, or the propositions that are graded by supported propositions in Q.
Definition 5.10. The tuple of supported propositions in Q = ⟨Q0, Q1, ..., Qk⟩ given T = ⟨T0, T1, ..., Tk⟩, denoted ς(Q, T), is the tuple ⟨S0, S1, ..., Sk⟩ where S0, S1, ..., Sk are the smallest subsets of, respectively, Q0, Q1, ..., Qk such that, for 0 ≤ i ≤ k,
1. p ∈ Si if F(T) = ⟨T0′, T1′, ..., Tk′⟩ and p ∈ Ti′; and
2. p ∈ Si if there is a grading chain ⟨q0, ..., qn⟩ of p in Si and there is a tuple (R0, ..., Rk) where Rj ⊆ Sj, for 0 ≤ j ≤ k, such that q0 ∈ Ri′ where F(R) = ⟨R0′, ..., Rk′⟩.
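Since the Si are the smallest sets closed under the two clauses, they can be read as a least fixpoint. The following Python sketch is illustrative only: `in_filter` approximates "p belongs to the corresponding set of the multifilter F(T)", and `supports(q, p)` is a hypothetical, simplified stand-in for "q heads a grading chain of p inside the supported sets".

```python
# Illustrative fixpoint sketch of Definition 5.10: seed each S_i with the
# propositions of Q_i licensed by the top tuple (clause 1), then close under
# "p is graded by an already-supported proposition" (clause 2, simplified).

def supported(Q, in_filter, supports):
    S = [{p for p in Qi if in_filter(i, p)} for i, Qi in enumerate(Q)]
    changed = True
    while changed:
        changed = False
        have = set().union(*S)
        for i, Qi in enumerate(Q):
            for p in Qi:
                if p not in S[i] and any(supports(q, p) for q in have):
                    S[i].add(p)
                    changed = True
    return S

# Hypothetical data: 'a' survives via the top tuple and grades 'b';
# 'c' is supported by nothing and is dropped.
result = supported([{"a", "b", "c"}],
                   in_filter=lambda i, p: p == "a",
                   supports=lambda q, p: (q, p) == ("a", "b"))
print(sorted(result[0]))  # ['a', 'b']
```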
We can now present an important result. The following theorem states that if the union of the sets in the multifilter of T is consistent, then the union of the sets in the multifilter after getting the tuple of supported propositions in the kernel survivors of any Q ⊆ P given T is consistent. This means that the process of telescoping is consistency-preserving. Accordingly, we can revise the agent's beliefs and motivations while maintaining consistency amongst all the beliefs and motivations.

Theorem 3. Let F(T) = ⟨B, M1, ..., Mk⟩ and F(ς(κ(Q, T), T)) = ⟨B′, M′1, ..., M′k⟩. If F(B ∪ M1 ∪ ... ∪ Mk) is proper, then F(B′ ∪ M′1 ∪ ... ∪ M′k) is proper.
Since graded multifilters come in degrees depending on the nesting level of the telescoped propositions, we need to extend the T-induced telescoping of Q to a generalized notion of T-induced telescoping of Q at degree n as follows.

Definition 5.11. If each set in Q has finitely-many grading propositions, then τT(Q) is defined for every telescoping structure T. In what follows, provided that the right-hand side is defined, let

τ^n_T(Q) = Q if n = 0, and τ^n_T(Q) = τT(τ^{n−1}_T(Q)) otherwise.
1. ⟨{}, {}, {p ∧ t, p ⇔ ¬t}⟩
2. ⟨{t}, {p}, {p ⇔ ¬t}⟩
3. ⟨{p, t}, {}, {p ⇔ ¬t}⟩
The next-best ⊥-kernel is the first kernel as it has the longest sequence of empty sets from the left. While the agent's character prefers to give up desires and obligations rather than beliefs, in the first kernel we are forced to give up beliefs to resolve the contradiction, as the desires and obligations sets are empty. The beliefs that are given up will then be removed from the less preferred obligations and desires. In this way, we treat ⊥-kernels where we have to give up a proposition from a more preferred attitude first, so that the removal of propositions from them affects less preferred attitudes. To decide which propositions to give up, we look at the grades of p ∧ t and p ⇔ ¬t. We consider three cases.
1. If the grade of p ⇔ ¬t is less, it will be removed from the first kernel and from Ted's beliefs and motivations, resulting in a consistent union of the sets in the mental state. In this case, giving up p ⇔ ¬t from the first ⊥-kernel resolves the inconsistency in the second and third ⊥-kernels as well.
2. If the grade of p ∧ t is less, it will be removed from the first kernel and from Ted's beliefs and motivations. However, the inconsistency in the second and third kernels is not resolved by giving up p ∧ t. Ted now believes that he cannot work on the presentation and go to the trip, desires to go to the trip and work on the presentation, and is obliged to work on the presentation. From the updated QC, the second and third kernels are reconstructed. The second and third kernels have the same number of empty sets from the left, so to identify a next-best kernel we look at the kernel whose desires set contains a proposition with the highest grade. Suppose that p has a grade of 2 and t has a grade of 1. The next-best kernel will then be the third kernel. t is removed from the third kernel and from the beliefs and motivations, resulting in a consistent mental state. Giving up t from the third kernel resolves the inconsistency in the second kernel.
3. If the grades of p ∧ t and p ⇔ ¬t are equal, then both beliefs are given up. Both propositions are accordingly removed from the motivations, resulting in a consistent union of the sets in the mental state and resolving the inconsistency in the second and third kernels.
We are now finally ready to define graded multifilters as the multifilter of the T-induced telescoping of the tuple of sets of top propositions T at degree n.

Definition 5.12. Let T be a telescoping structure. We refer to F(τ^n_T(T)) as a degree n (∈ N) graded multifilter of T = ⟨T0, T1, ..., Tk⟩, denoted F^n(T).
The following observation states that there might be several graded multifilters of degree n. This is due to the possible existence of several next-best ⊥-kernels at each step of getting the kernel survivors. The order in which the possible next-best ⊥-kernels are considered will affect the graded multifilter we end up with. It is worth noting, though, that according to Theorem 3 the union of the sets in each of the possible graded multifilters is consistent.

Observation 5.1. Let T be a telescoping structure with T = ⟨T0, T1, ..., Tk⟩. The degree n graded multifilter F^n(T) might not be unique.
After defining the kernel survivors, what remains for us to fully define the process of telescoping is to present the notion of the tuple of supported propositions in Q given a tuple of top sets of propositions T = ⟨T0, T1, ..., Tk⟩. The motivation for defining this is the following. Suppose in Example 4
5.3 Graded Consequence
presented in Example 2. Figure 1 shows the graded belief and motivation consequences of T with respect to a series of canons with ⊗B = mean, ⊕B = max, ⊗M, ⊕M = max, and 0 ≤ n ≤ 2, with the agent character C = {0 < 1, 0 < 2, 2 < 1}.
Level 1: Upon telescoping to level 1, the embedded beliefs, desires, and obligations at level 1 are extracted. The classical consequences of the beliefs are added as well, including p and t. Once M1(¬m, 3) is extracted in the obligations, the bridge rule r1 fires to bridge M1(¬m, 3) to Ted's desires. This fires r2 to add M1(¬p, 3) to Ted's desires as well. There are no contradictions between the extracted beliefs, desires, and obligations, so all the extracted beliefs and motivations survive telescoping and are supported. Hence, at level 1, Ted believes he can work on the presentation and go to the trip, desires to go to the trip, and is obliged to work on the presentation.
Level 2: At level 2, the graded propositions embedded at level 1 in the previous level are extracted, adding p ⇔ ¬t to Ted's beliefs, and ¬m and p to Ted's desires. Note that ¬m is not extracted in the obligations as we only telescope obligations in the set of obligations according to Definition 5.1, and ¬m was in a desire term. No new bridge rules are fired
in Level 2. However, once we do this, we get several contradictions between Ted's beliefs, desires, and obligations. We get the three ⊥-kernels in Example 4. Since p ∧ t has a lower grade (5) than p ⇔ ¬t (with fused grade ⊗(⟨10, 2⟩) = 6, as it is graded in a grading chain), the second scenario explained in Example 4 ensues, resulting in removing p ∧ t from the agent's beliefs and t from the desires. This causes both p and t in the agent's beliefs to go away as they lose their support.
Hence, at level 2, Ted gives up his belief that he can work
on the presentation while being on the trip. He accordingly
gives up his desire to go to the trip and ends up desiring to
not make his boss mad and consequently desiring working
on the presentation. Ted’s obligations to desire to make his
boss not mad and to work on the presentation are retained
at level 2 as Ted’s character prefers to give up desires rather
than obligations. Note that a proposition was removed from the beliefs even though beliefs are the most preferred attitude; this was necessary in order to resolve the contradiction within the beliefs. This revision only happened at level 2, as we look deeper into the nested graded propositions that contradicted Ted's beliefs and motivations at level 1.
In what follows, given a LogA PR theory T = ⟨B, (M1, ..., Mk), R⟩ and a valuation V = ⟨Sk, Vf, Vx⟩, let the valuation of T be denoted as V(T) = ⟨V(B), V(M1), ..., V(Mk)⟩. Just like we used multifilters to define logical consequence in Section 4, we use graded multifilters to define graded consequence as follows.
Definition 5.13. Let T = ⟨B, (M1, ..., Mk), R⟩ be a LogA PR theory and T be a T-induced order. For every φ ∈ σP, valuation V = ⟨Sk, Vf, Vx⟩ where Sk has a set P which is depth- and fan-out-bounded, and grading canon C = ⟨⊗B, ⊕B, ⊗M, ⊕M, C, n⟩, φ is a graded belief (or motivation) consequence of T with respect to C, denoted T |≃^C_B φ (or T |≃^C_M φ), if F^n_T(T) = ⟨B, M1, ..., Mk⟩ is defined and [[φ]]^V ∈ B (or Mi) for every telescoping structure T = ⟨V(T), O, ⊗B, ⊕B, ⊗M, ⊕M, C⟩ where O extends F(V(T) ∩ Range(≪)).²
It is worth pointing out that |≃^C_B and |≃^C_M are non-monotonic and reduce to |=B and |=M, respectively, if n = 0.
The set of belief consequences makes up a consistent set of
beliefs, and the set of motivation consequences makes up a
consistent set of motivations representing the agent’s intentions.
5.4 The Weekend Dilemma in LogA PR
Figure 1: The graded consequences of the LogA PR theory in Example 2. The top portion of each level contains Ted's beliefs, the middle portion contains Ted's desires, and the bottom portion contains Ted's obligations. The newly added terms in each level are shown in red.
In this section we revisit the weekend dilemma, showing how it can be accounted for in LogA PR and illustrating the joint belief and intention revision. Recall the LogA PR theory T = ⟨B, (M1, M2), R⟩ representing the weekend dilemma
6 Conclusion
Despite the abundance of logical theories in the literature for modelling practical reasoning, a robust theory with adequate semantics remains missing. In this paper, we introduced general algebraic foundations for practical reasoning with several mental attitudes. We also provided semantics for an algebraic logic, LogA PR, for joint reasoning with graded beliefs and motivations to decide on sets of consistent beliefs and intentions. The LogA PR semantics also captures the joint revision of the agent's beliefs and intentions all in one framework. We are currently working on a proof theory for LogA PR. Reasons for intentions are to be computed in the same way that reason-maintenance systems compute support for beliefs. The end result would be a proof theory for practical reasoning augmented with the ability to explain the reasons for choosing to adopt particular intentions to achieve an initial set of motivations, giving rise to an explainable AI system.

²An ultrafilter O extends a filter F if F ⊆ O.
References

Frankfurt, H. G. 1988. Freedom of the will and the concept of a person. In What is a person? Springer. 127–144.
Hansson, S. O. 1994. Kernel contraction. The Journal of Symbolic Logic 59(3):845–859.
Icard, T.; Pacuit, E.; and Shoham, Y. 2010. Joint revision of belief and intention. In Proc. of the 12th International Conference on Principles of Knowledge Representation and Reasoning, 572–574.
Ismail, H. O. 2012. LogA B: A first-order, non-paradoxical,
algebraic logic of belief. Logic Journal of the IGPL
20(5):774–795.
Ismail, H. O. 2013. Stability in a commonsense ontology of states. In Proceedings of the Eleventh International Symposium on Logical Formalization of Commonsense Reasoning (COMMONSENSE 2013).
Ismail, H. O. 2020. The good, the bad, and the rational:
Aspects of character in logical agents. In ElBolock, A.; Abdelrahman, Y.; and Abdennadher, S., eds., Character Computing. Springer.
Paccanaro, A., and Hinton, G. E. 2001. Learning distributed
representations of concepts using linear relational embedding. IEEE Transactions on Knowledge and Data Engineering 13(2):232–244.
Parsons, T. 1993. On denoting propositions and facts. Philosophical Perspectives 7:441–460.
Rao, A. S., and Georgeff, M. P. 1995. BDI agents: From
theory to practice. In ICMAS, volume 95, 312–319.
Rao, A. S., and Wooldridge, M. 1999. Foundations of rational agency. In Rao, A. S., and Wooldridge, M., eds., Foundations of rational agency. Springer. 1–10.
Richardson, M., and Domingos, P. 2006. Markov logic
networks. Machine learning 62(1-2):107–136.
Sankappanavar, H., and Burris, S. 1981. A course in universal algebra. Graduate Texts in Mathematics 78.
Searle, J. R. 2003. Rationality in action. MIT press.
Shapiro, S. C. 1993. Belief spaces as sets of propositions. Journal of Experimental & Theoretical Artificial Intelligence 5(2-3):225–235.
Shoham, Y. 2009. Logical theories of intention and
the database perspective. Journal of Philosophical Logic
38(6):633.
Thomason, R. H. 2018. The formalization of practical reasoning: Problems and prospects. In Gabbay, D. M., and Guenthner, F., eds., Handbook of Philosophical Logic: Volume 18. Cham: Springer International Publishing. 105–132.
Vovk, V.; Gammerman, A.; and Shafer, G. 2005. Algorithmic learning in a random world. Springer Science &
Business Media.
Yang, B.; Yih, W.; He, X.; Gao, J.; and Deng, L. 2015. Embedding entities and relations for learning and inference in
knowledge bases. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May
7-9, 2015, Conference Track Proceedings.
Bealer, G. 1979. Theories of properties, relations, and
propositions. The Journal of Philosophy 76(11):634–648.
Bratman, M. 1987. Intention, plans, and practical reason.
Harvard University Press.
Broersen, J.; Dastani, M.; Hulstijn, J.; Huang, Z.; and
van der Torre, L. 2001. The BOID architecture: conflicts
between beliefs, obligations, intentions and desires. In Proceedings of the fifth international conference on Autonomous
agents, 9–16.
Broersen, J.; Dastani, M.; Hulstijn, J.; and van der Torre, L.
2002. Goal generation in the BOID architecture. Cognitive
Science Quarterly 2(3-4):428–447.
Broome, J. 2002. Practical reasoning. In Bermúdez, J. L.,
and Millar, A., eds., Reason and nature: Essays in the theory
of rationality. Oxford: Clarendon Press. 85–111.
Casali, A.; Godo, L.; and Sierra, C. 2008. A logical framework to represent and reason about graded preferences and
intentions. In Brewka, G., and Lang, J., eds., Principles
of Knowledge Representation and Reasoning: Proceedings
of the Eleventh International Conference, KR 2008, Sydney,
Australia, September 16-19, 2008, 27–37. AAAI Press.
Casali, A.; Godo, L.; and Sierra, C. 2011. A graded BDI
agent model to represent and reason about preferences. Artificial Intelligence 175(7-8):1468–1478.
Castelfranchi, C., and Paglieri, F. 2007. The role of beliefs
in goal dynamics: prolegomena to a constructive theory of
intentions. Synthese 155(2):237–263.
Charniak, E. 1991. Bayesian networks without tears. AI
magazine 12(4):50–50.
Church, A. 1950. On Carnap's analysis of statements of assertion and belief. Analysis 10(5):97–99.
Cohen, P. R., and Levesque, H. J. 1990. Intention is choice
with commitment. Artificial intelligence 42(2-3):213–261.
Dunin-Keplicz, B.; Nguyen, L. A.; and Szalas, A. 2010. A
framework for graded beliefs, goals and intentions. Fundam.
Inform. 100(1-4):53–76.
Ehab, N., and Ismail, H. O. 2018. Towards a unified algebraic framework for non-monotonicity. Proceedings of the
KI 2018 Workshop on Formal and Cognitive Reasoning 26–
40.
Ehab, N., and Ismail, H. O. 2019. A unified algebraic framework for non-monotonicity. In Moss, L. S., ed., Proceedings
Seventeenth Conference on Theoretical Aspects of Rationality and Knowledge, TARK 2019, Toulouse, France, 17-19
July 2019, volume 297 of EPTCS, 155–174.
Ehab, N., and Ismail, H. O. 2020. LogA G: An algebraic non-monotonic logic for reasoning with graded propositions. Annals of Mathematics and Artificial Intelligence.
Fern, A. 2010. Weighted logic. Technical report.
BKLM - An expressive logic for defeasible reasoning
Guy Paterson-Jones2 , Giovanni Casini1,2 , Thomas Meyer2
1
ISTI-CNR, Italy
2
CAIR and Univ. of Cape Town, South Africa
guy.paterson.jones@gmail.com, giovanni.casini@isti.cnr.it, tmeyer@cair.org.za
Abstract
class of entailment relations for KLM-style logics (Casini,
Meyer, and Varzinczak 2019), and it is widely agreed upon
that there is no unique best answer. The options can be narrowed down, however, and Lehmann et al. propose Rational Closure (RC) as the minimally acceptable form of rational entailment. Rational closure is based on the principle
of presumption of typicality (Lehmann 1995), which states
that propositions should be considered typical unless there
is reason to believe otherwise. For instance, if we know
that birds typically fly, and all we know about a robin is
that it is a bird, we should tentatively conclude that it flies,
as there is no reason to believe it is atypical. While RC
is not always appropriate, there is fairly general consensus
that interesting forms of conditional reasoning should extend RC from an inferential perspective (Lehmann 1995;
Casini, Meyer, and Varzinczak 2019).
Since KLM-style logics have limited conditional expressivity (see Section 2.1), there has been some work in extending the KLM constructions to more expressive logics.
Perhaps the main question is whether entailment relations
resembling RC can be defined also for more expressive logics. The first investigation in such a direction was proposed
by Booth and Paris (1998) who consider an extension in
which both positive (α |∼ β) and negative (α 6|∼ β) conditionals are allowed. Booth et al. (2013) introduce a significantly more expressive logic called Propositional Typicality
Logic (PTL), in which propositional logic is extended with
a modal-like typicality operator •. This typicality operator
can be used anywhere in a formula, in contrast to KLMstyle logics, where typicality refers only to the antecedent of
conditionals of the form α |∼ β.
The price one pays for this expressiveness is that rational entailment becomes more difficult to pin down. This is
shown by Booth et al. (2015), who prove that several desirable properties of rational closure are mutually inconsistent
for PTL entailment. They interpret this as saying that the
correct form of entailment for PTL is contextual, and depends on which properties are considered more important
for the task at hand.
In this paper we consider a different extension of KLM-style logics, which we refer to as Boolean KLM (BKLM), and in which we allow negative conditionals, as well as arbitrary conjunctions and disjunctions of conditionals. We do not allow the nesting of conditionals, though. We show,
Propositional KLM-style defeasible reasoning involves a core
propositional logic capable of expressing defeasible (or conditional) implications. The semantics for this logic is based
on Kripke-like structures known as ranked interpretations.
KLM-style defeasible entailment is referred to as rational
whenever the defeasible entailment relation under consideration generates a set of defeasible implications all satisfying
a set of rationality postulates known as the KLM postulates.
In a recent paper Booth et al. proposed PTL, a logic that is
more expressive than the core KLM logic. They proved an
impossibility result, showing that defeasible entailment for
PTL fails to satisfy a set of rationality postulates similar in
spirit to the KLM postulates. Their interpretation of the impossibility result is that defeasible entailment for PTL need
not be unique.
In this paper we continue the line of research in which the
expressivity of the core KLM logic is extended. We present
the logic Boolean KLM (BKLM) in which we allow for disjunctions, conjunctions, and negations, but not nesting, of defeasible implications. Our contribution is twofold. Firstly, we
show (perhaps surprisingly) that BKLM is more expressive
than PTL. Our proof is based on the fact that BKLM can
characterise all single ranked interpretations, whereas PTL
cannot. Secondly, given that the PTL impossibility result also
applies to BKLM, we adapt the different forms of PTL entailment proposed by Booth et al. to apply to BKLM.
1 Introduction
Non-monotonic reasoning has been extensively studied in
the AI literature, as it provides a mechanism for making
bold inferences that go beyond what classical methods can
provide, while retaining the possibility of revising these inferences in light of new information. In their seminal paper, Kraus et al. (1990) consider a general framework for
non-monotonic reasoning, phrased in terms of defeasible, or
conditional implications of the form α |∼ β, to be read as
‘If α holds, then typically β holds’. Importantly, they provide a set of rationality conditions, in the form of structural properties, that a reasonable form of entailment for
these conditionals should satisfy, and characterise these semantically. Lehmann and Magidor (1992) also considered
the question of which entailment relations definable in the
KLM framework can be considered to be the correct ones
for non-monotonic reasoning. In general, there is a large
perhaps surprisingly, that BKLM is strictly more expressive
than PTL by exhibiting an explicit translation of PTL knowledge bases into BKLM. We also prove that BKLM entailment is more restrictive than PTL entailment, in the sense
that a stronger class of entailment properties are inconsistent for BKLM. In particular, attempts to extend rational closure to BKLM in the manner of LM-entailment as defined by
Booth et al. (2015), are shown to be untenable.
The rest of the paper is structured as follows. In Section 2 we provide the relevant background on the KLM approach to defeasible reasoning, and discuss various forms of rational entailment. We then define Propositional Typicality Logic, and give a brief overview of the entailment problem for PTL. In Section 3 we define the logic BKLM, an extension of KLM-style logics that allows for arbitrary boolean combinations of conditionals. We investigate the expressiveness of BKLM, and show that it is strictly more expressive than PTL by exhibiting an explicit translation of PTL formulas into BKLM. In Section 4 we turn to the entailment problem for BKLM, and show that BKLM suffers from stronger versions of the known impossibility results for PTL. Section 5 discusses some related work, while Section 6 concludes and points out some future research directions.
Figure 1: A ranked interpretation over P = {p, b, f}.
{u ∈ U : R(u) < ∞}, and for any α ∈ L we define [[α]]^R = {u ∈ U^R : u ⊩ α}.
Every ranked interpretation R determines a total preorder on U in the obvious way, namely u ≤R v iff R(u) ≤ R(v). Writing the strict version of this preorder as ≺R, it is straightforward to show that it is modular:

Proposition 1. ≺R is modular, i.e. for all u, v, w ∈ U, u ≺R v implies that either w ≺R v or u ≺R w.
Lehmann et al. (1992) define ranked interpretations in terms of modular orderings on U. The following straightforward observation proves the equivalence of the two definitions:

Proposition 2. Let R1 and R2 be ranked interpretations. Then R1 = R2 iff ≺R1 = ≺R2.

We define satisfaction with respect to ranked interpretations as follows. Given any α ∈ L, we say R satisfies α (written R ⊩ α) iff [[α]]^R = U^R. Similarly, R satisfies a conditional assertion α |∼ β iff min≤R [[α]]^R ⊆ [[β]]^R, or in other words iff all of the ≤R-minimal valuations satisfying α also satisfy β.
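As an illustration (not the authors' code), a ranked interpretation and the satisfaction test for conditionals can be sketched in Python. The rank function below is our own choice, picked so that the statements of Example 1 (p → b, b |∼ f, and p |∼ ¬f) come out satisfied; it is not copied from Figure 1.

```python
# Sketch: a ranked interpretation over P = {p, b, f} as a map from valuations
# to ranks (None = rank infinity), and the test "R satisfies a |~ b" as:
# every minimal-rank valuation satisfying a also satisfies b.
from itertools import product

P = ("p", "b", "f")
U = [dict(zip(P, bits)) for bits in product([True, False], repeat=3)]

def rank(v):
    if v["p"] and not v["b"]:
        return None                          # valuations violating p -> b are impossible
    if v["p"]:
        return 1 if not v["f"] else 2        # typical p-worlds do not fly
    return 1 if (v["b"] and not v["f"]) else 0

def satisfies(antecedent, consequent):
    worlds = [v for v in U if rank(v) is not None and antecedent(v)]
    if not worlds:
        return True                          # vacuously satisfied
    m = min(rank(v) for v in worlds)
    return all(consequent(v) for v in worlds if rank(v) == m)

print(satisfies(lambda v: v["b"], lambda v: v["f"]))       # True:  b |~ f
print(satisfies(lambda v: v["p"], lambda v: not v["f"]))   # True:  p |~ ~f
print(satisfies(lambda v: v["b"], lambda v: not v["f"]))   # False
```

Note that the ranks used (0, 1, 2, with p ∧ ¬b worlds at rank ∞) satisfy the convexity condition of Definition 1: every finite rank below an occupied rank is occupied.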
Example 1. Let R be the ranked interpretation in Figure 1. Then R satisfies p → b, b |∼ f and p |∼ ¬f. Note that in our figures we omit rank ∞ for brevity, and we represent a valuation as a string of literals, with p̄ indicating the negation of the atom p.
A useful simplification is the fact that classical statements (such as p → b) can be viewed as special cases of conditional assertions:

Proposition 3. (Kraus, Lehmann, and Magidor 1990, p. 174) For all α ∈ L, R ⊩ α iff R ⊩ ¬α |∼ ⊥.
In what follows we define a knowledge base as a finite set
of conditional assertions. We sometimes abuse notation by
including classical statements (of the form α ∈ L) in knowledge bases, but in the context of Proposition 3 this should
be understood to be shorthand for the conditional assertion
¬α |∼ ⊥. For example, the knowledge base {p → b, b |∼ f}
is shorthand for {¬(p → b) |∼ ⊥, b |∼ f}.
We denote the set of all ranked interpretations over P by RI, and we write MOD(K) for the set of ranked models of a knowledge base K. For any U ⊆ RI, we write U ⊩ α to mean R ⊩ α for all R ∈ U. Finally, we write sat(R) for the set of formulas satisfied by the ranked interpretation R.
Even though KLM extends propositional logic, it is still quite restrictive, as it only permits positive conditional assertions. Booth et al. (1998) consider an extension allowing for negative conditionals, i.e. assertions of the form α ̸|∼ β. Such an assertion is satisfied by a ranked interpretation R if and only if R ̸⊩ α |∼ β.
2 Background
Let P be a set of propositional atoms, and let p, q, ... be meta-variables for elements of P. We write L^P for the set of propositional formulas over P, defined by α ::= p | ¬α | α ∧ α | ⊤ | ⊥. Other boolean connectives (∨, →, ↔) are defined as usual in terms of ∧ and ¬. We write U^P for the set of valuations of P, which are functions v : P → {0, 1}. Valuations are extended to L^P in the usual way, and satisfaction of a formula α will be denoted v ⊩ α. For the remainder of this paper we will assume that P is finite and drop superscripts whenever there is no danger of ambiguity.
2.1 The Logic KLM
Kraus et al. (1990) study a conditional logic, which we refer
to as KLM. It is defined by assertions of the form α |∼ β,
which are read “if α, then typically β”. For example, if
P = {b, f} refers to the properties of being a bird and flying
respectively, then b |∼ f states that birds typically fly. There
are various possible semantic structures for this logic, but
in this paper we are interested in the case of rational conditional assertions. The semantics for rational conditionals
is given by ranked interpretations (Lehmann and Magidor
1992). The following is an alternative, but equivalent definition of such a class of interpretations.
Definition 1. A ranked interpretation R is a function from
U to N ∪ {∞} satisfying the following convexity condition:
if R(u) < ∞, then for every 0 ≤ j < R(u), there is some
v ∈ U for which R(v) = j.
Given a ranked interpretation R, we call R(u) the rank of u with respect to R. Valuations with a lower rank are viewed as being more typical than those with a higher rank, whereas valuations with infinite rank are viewed as impossible. We refer to the set of possible valuations as U^R = {u ∈ U : R(u) < ∞}.
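To make the definitions concrete, a ranked interpretation over a finite vocabulary can be represented directly as a map from valuations to ranks. The following sketch is ours, not part of the original presentation: the helper names `is_convex` and `satisfies_conditional` are invented, and `satisfies_conditional` assumes the standard ranked reading of α |∼ β (every minimal possible α-valuation satisfies β). It rebuilds the birds-and-penguins example from above:

```python
from itertools import product

INF = float("inf")

def valuations(atoms):
    """All truth assignments over the given atoms."""
    for bits in product((0, 1), repeat=len(atoms)):
        yield dict(zip(atoms, bits))

def is_convex(R):
    """Definition 1: the finite ranks in use must be exactly 0, 1, ..., k."""
    finite = sorted({r for r in R.values() if r != INF})
    return finite == list(range(len(finite)))

def satisfies_conditional(R, ante, cons):
    """R |= alpha |~ beta: every minimal possible alpha-world satisfies beta."""
    ranks = [r for v, r in R.items() if ante(dict(v)) and r != INF]
    if not ranks:            # alpha impossible: the conditional holds vacuously
        return True
    return all(cons(dict(v)) for v, r in R.items()
               if r == min(ranks) and ante(dict(v)))

# p = penguin, b = bird, f = flies.  Penguins that are not birds are
# impossible (rank ∞); normal birds fly at rank 0; penguins sit higher.
R = {}
for v in valuations(("p", "b", "f")):
    key = tuple(sorted(v.items()))
    if v["p"] and not v["b"]:
        R[key] = INF                      # enforces p -> b classically
    elif v["p"]:
        R[key] = 1 if not v["f"] else 2   # typical penguins don't fly
    else:
        R[key] = 1 if (v["b"] and not v["f"]) else 0

assert is_convex(R)
assert satisfies_conditional(R, lambda v: v["b"], lambda v: v["f"])      # b |~ f
assert satisfies_conditional(R, lambda v: v["p"], lambda v: not v["f"])  # p |~ ¬f
# Proposition 3: p -> b holds classically iff ¬(p -> b) |~ ⊥ is satisfied.
assert satisfies_conditional(R, lambda v: v["p"] and not v["b"], lambda v: False)
```

The convexity check rules out gaps in the finite ranks, matching Definition 1.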
2.2 Rank Entailment

A central question in non-monotonic reasoning is determining what forms of entailment are appropriate in a defeasible setting. Given a knowledge base K, we write K |≈ α |∼ β to mean that K defeasibly entails α |∼ β. In the literature there is a plethora of options available for the entailment relation |≈, each with its own strengths and weaknesses (Casini, Meyer, and Varzinczak 2019). As such, it is useful to understand defeasible entailment relations in terms of their global properties. An obviously desirable property is Inclusion:

(Inclusion) K |≈ α |∼ β for all α |∼ β ∈ K

Kraus et al. (1990) argue that a defeasible entailment relation should satisfy each of the properties given in Figure 2, known as the rationality properties. We will call such relations rational.

(Refl) K |≈ α |∼ α
(LLE)  ⊨ α ↔ β, K |≈ α |∼ γ  /  K |≈ β |∼ γ
(And)  K |≈ α |∼ β, K |≈ α |∼ γ  /  K |≈ α |∼ β ∧ γ
(Or)   K |≈ α |∼ γ, K |≈ β |∼ γ  /  K |≈ α ∨ β |∼ γ
(RW)   ⊨ β → γ, K |≈ α |∼ β  /  K |≈ α |∼ γ
(CM)   K |≈ α |∼ β, K |≈ α |∼ γ  /  K |≈ α ∧ β |∼ γ
(RM)   K ̸|≈ α ∧ β |∼ γ, K ̸|≈ α |∼ ¬β  /  K ̸|≈ α |∼ γ

Figure 2: Rationality properties for defeasible entailment. In each rule, the premises appear to the left of the slash and the conclusion to the right.

The rationality properties are essentially intertwined with the class of ranked interpretations:

Proposition 4 (Lehmann and Magidor 1992). A defeasible entailment relation |≈ is rational iff for each knowledge base K there is a ranked interpretation R_K such that K |≈ α |∼ β iff R_K ⊨ α |∼ β.

The following natural form of entailment, called rank entailment, is not rational in general, as it fails to satisfy the property of rational monotonicity (RM):

Definition 2. A conditional α |∼ β is rank entailed by a knowledge base K (written K |≈_R α |∼ β) iff R ⊨ α |∼ β for every ranked model R of K.

Despite failing to be rational, rank entailment is important, as it can be viewed as the monotonic core of an appropriate defeasible entailment relation. In other words, the following property is desirable:

(KLM-Ampliativity) K |≈ α |∼ β whenever K |≈_R α |∼ β

Note that a rational entailment relation satisfying Inclusion also satisfies KLM-Ampliativity by Proposition 4.

2.3 Rational Closure

A well-known form of rational entailment for KLM is rational closure. Lehmann and Magidor (1992) propose rational closure as the minimum acceptable form of rational defeasible entailment, and give a syntactic characterisation of it in terms of an ordering on KLM knowledge bases. Here we refer to the semantic approach (Giordano et al. 2015) and define rational closure in terms of an ordering on ranked interpretations:

Definition 3. (Giordano et al. 2015, Definition 7) Given two ranked interpretations R1 and R2, we write R1 <_G R2, that is, R1 is preferred to R2, iff R1(u) ≤ R2(u) for every u ∈ U and there is some v ∈ U s.t. R1(v) < R2(v).

Consider the set of models of a KLM knowledge base K. Intuitively, the lower a model is with respect to the ordering ≤_G, the fewer exceptional valuations it has modulo the constraints of K. Thus the ≤_G-minimal models can be thought of as the semantic counterpart of the principle of typicality seen above. This idea of making valuations as typical as possible was first presented by Booth et al. (1998) for the case of KLM knowledge bases with both positive and negative conditionals. For these knowledge bases, it turns out that there is always a unique minimal model:

Proposition 5. Let K ⊆ L^|∼ be a knowledge base. Then if K is consistent, MOD(K) has a unique ≤_G-minimal element, denoted R^RC_K.

The rational closure of a knowledge base can be characterised as the set of formulas satisfied by this minimal model:

Proposition 6. (Giordano et al. 2015, Theorem 2) A conditional α |∼ β is in the rational closure of a knowledge base K ⊆ L^|∼ (written K |≈_RC α |∼ β) iff R^RC_K ⊨ α |∼ β.

A well-known behaviour of rational closure is the so-called drowning effect. To make this concrete, consider the knowledge base K = {b |∼ f, b |∼ w, r → b, p → b, p |∼ ¬f}. This states that birds typically fly and typically have wings, that robins are birds, and that penguins are birds that typically don't fly. Intuitively one would expect to be able to conclude from this that robins typically have wings (r |∼ w), since robins are not exceptional birds. More generally, every subclass that does not show any exceptional behaviour should inherit all the typical properties of a class by default. This is the principle of the Presumption of Typicality mentioned earlier, which rational closure obeys. But what happens with subclasses that are exceptional with respect to some property?

In the above example, since penguins are exceptional only with respect to their ability to fly, the question is whether penguins should inherit the other typical properties of birds, such as having wings (p |∼ w). Rational closure does not sanction this type of conclusion. That is, subclasses that are exceptional with respect to a typical property of a class do not inherit the other typical properties of the class. This is the drowning effect which, while being a desirable form of reasoning in some contexts, is considered a limitation if we are interested in modelling some form of Presumption of Independence (Lehmann 1995), in which a subclass inherits all the typical properties of a class unless there is explicit information to the contrary. So, even though penguins are exceptional birds in the sense of typically not being able to fly, the Presumption of Independence requires us to conclude that penguins typically have wings.

There are several refinements of rational closure, such as lexicographic closure (Lehmann 1995), relevant closure (Casini et al. 2014) and inheritance-based closure (Casini and Straccia 2013), that satisfy both the Presumption of Typicality and the Presumption of Independence. Unlike rational closure, lexicographic closure formalises the presumptive reading of α |∼ β, which states that "α implies β unless there is reason to believe otherwise" (Lehmann 1995; Casini, Meyer, and Varzinczak 2019).
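The semantic definition of rational closure is small enough to verify by brute force on a two-atom vocabulary. The sketch below is ours (all helper names invented): it enumerates every ranked interpretation over {b, f}, filters the models of {b |∼ f}, and confirms Proposition 5 by exhibiting the unique ≤_G-minimal model, which pushes only the exceptional valuation b ∧ ¬f up to rank 1.

```python
from itertools import product

INF = float("inf")

def ranked_interpretations(vals):
    """All rank assignments over the given valuations respecting Definition 1."""
    n = len(vals)
    for ranks in product(list(range(n)) + [INF], repeat=n):
        finite = sorted({r for r in ranks if r != INF})
        if finite == list(range(len(finite))):
            yield dict(zip(vals, ranks))

def sat_cond(R, ante, cons):
    """R |= alpha |~ beta: all minimal possible alpha-worlds satisfy beta."""
    ranks = [r for v, r in R.items() if ante(dict(v)) and r != INF]
    return not ranks or all(cons(dict(v)) for v, r in R.items()
                            if r == min(ranks) and ante(dict(v)))

def preferred(R1, R2):
    """R1 <_G R2, in the sense of Definition 3."""
    return (all(R1[u] <= R2[u] for u in R1) and
            any(R1[u] < R2[u] for u in R1))

vals = [tuple(zip(("b", "f"), bits)) for bits in product((0, 1), repeat=2)]
K = [(lambda v: v["b"], lambda v: v["f"])]          # K = {b |~ f}

models = [R for R in ranked_interpretations(vals)
          if all(sat_cond(R, a, c) for a, c in K)]
minimal = [R for R in models if not any(preferred(S, R) for S in models)]

# Proposition 5: a unique <=_G-minimal model exists.
assert len(minimal) == 1
expected = {v: (1 if (dict(v)["b"] and not dict(v)["f"]) else 0) for v in vals}
assert minimal[0] == expected
```

This is exactly the "as typical as possible" intuition: every valuation gets the lowest rank the constraints of K allow.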
2.4 Propositional Typicality Logic

The present paper investigates whether the notion of rational closure can be extended to more expressive logics. The first investigation in such a direction was proposed by Booth and Paris (1998), who consider an extension of KLM in which both positive (α |∼ β) and negative (α ̸|∼ β) conditionals are allowed. This additional expressiveness introduces some technical issues, as not every such knowledge base has a model (consider K = {α |∼ β, α ̸|∼ β}, for instance). Nevertheless, Booth and Paris show that this is the only limit to the validity of Proposition 5: every consistent knowledge base in this extension has a rational closure.

Another logic that extends KLM is Propositional Typicality Logic (PTL), a logic for defeasible reasoning proposed by Booth et al. (2015), in which propositional logic is enriched with a modal typicality operator (denoted •). Formulas of PTL are defined by α ::= ⊤ | ⊥ | p | •α | ¬α | α ∧ α, where p is any propositional atom. As before, other boolean connectives are defined in terms of ¬, ∧, →, ↔. The intuition behind a formula •α is that it is true for typical instances of α. Note that the typicality operator can be nested, so α may itself contain some •β as a subformula. The set of all PTL formulas is denoted L^•.

Satisfaction for PTL is defined with respect to a ranked interpretation R. Given a valuation u ∈ U and a formula α ∈ L^•, we define u ⊨_R α inductively in the same way as for propositional logic, with an additional rule for the typicality operator: u ⊨_R •α if and only if u ⊨_R α and there is no v ≺_R u such that v ⊨_R α. We then say that R satisfies the formula α, written R ⊨ α, iff u ⊨_R α for all u ∈ U^R.

Given that the typicality operator can be nested and used anywhere within a PTL formula, one would intuitively expect PTL to be at least as expressive as KLM. The following result shows that this is indeed the case:

Proposition 7 (Booth et al. (2013)). A ranked interpretation R satisfies the KLM formula α |∼ β if and only if it satisfies the PTL formula •α → β.

Given two knowledge bases K1 and K2, we say they are equivalent if they have exactly the same set of ranked models, i.e. if MOD(K1) = MOD(K2). Proposition 7 can be rephrased as saying that every KLM knowledge base has an equivalent PTL knowledge base. Note that the converse doesn't hold; there are PTL knowledge bases with no equivalent in KLM:

Proposition 8 (Booth et al. (2013)). For any p ∈ P, the knowledge base K = {•p} has no equivalent KLM knowledge base.

The obvious form of entailment for a PTL knowledge base K is rank entailment (denoted |≈_R), presented earlier in Definition 2. As noted before, rank entailment is monotonic and therefore inappropriate in many contexts. To pin down better forms of PTL entailment, Booth et al. (2015) consider the following properties, modelled after properties of rational closure, where |≈? is a PTL entailment relation and Cn?(K) = {α ∈ L^• : K |≈? α} is its associated consequence operator:

(Cumulativity) For all K1, K2 ⊆ L^•, if K1 ⊆ K2 ⊆ Cn?(K1), then Cn?(K1) = Cn?(K2).

(Ampliativity) For all K ⊆ L^•, Cn_R(K) ⊆ Cn?(K).

(Strict Entailment) For all K ⊆ L^• and α ∈ L, α ∈ Cn?(K) iff α ∈ Cn_R(K).

(Typical Entailment) For all K ⊆ L^• and α ∈ L, •⊤ → α ∈ Cn?(K) iff •⊤ → α ∈ Cn_R(K).

(Single Model) For all K ⊆ L^•, there is some R ∈ MOD(K) such that for all α ∈ L^•, α ∈ Cn?(K) iff R ⊨ α.
Surprisingly, it turns out that an entailment relation cannot
satisfy all of these properties simultaneously:
Proposition 9 (Booth et al. (2015)). There is no PTL entailment relation |≈? satisfying Cumulativity, Ampliativity, Strict Entailment, Typical Entailment and the Single Model property.
Booth et al. suggest that this is best interpreted as an argument for developing more than one form of PTL entailment, which can be compared to the divide between presumptive and prototypical readings for KLM entailment. An example
of PTL entailment is LM-entailment, which is based on the following adaptation of Proposition 5:

Proposition 10 (Booth et al. (2019)). Let K ⊆ L^• be a consistent knowledge base. Then MOD(K) has a unique ≤_G-minimal element, denoted R^LM_K.

Given a knowledge base K ⊆ L^•, we define LM-entailment by writing K |≈_LM α iff either K is inconsistent or R^LM_K ⊨ α. Booth et al. prove that LM-entailment
that are LM-entailed by K but not rank-entailed by it. Other
forms of entailment, such as PT-entailment, can be shown
to satisfy Strict Entailment but fail both Typical Entailment
and the Single Model property.
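On a finite vocabulary the PTL semantics above is directly executable. The sketch below is ours: the tuple encoding of formulas is invented, and it assumes that v ≺_R u abbreviates R(v) < R(u). It spot-checks Proposition 7 (R ⊨ b |∼ f iff R ⊨ •b → f) on a few interpretations:

```python
from itertools import product

INF = float("inf")

# PTL formulas as nested tuples: ("atom", p), ("not", A), ("and", A, B),
# and ("typ", A) for the typicality operator •.  (The encoding is ours.)

def holds(R, u, f):
    """u |=_R f, for a possible valuation u of R (Section 2.4)."""
    tag = f[0]
    if tag == "atom":
        return dict(u)[f[1]] == 1
    if tag == "not":
        return not holds(R, u, f[1])
    if tag == "and":
        return holds(R, u, f[1]) and holds(R, u, f[2])
    if tag == "typ":  # u |=_R •A iff u |=_R A and no lower-ranked v |=_R A
        return holds(R, u, f[1]) and not any(
            holds(R, v, f[1]) for v, r in R.items()
            if r != INF and r < R[u])
    raise ValueError(tag)

def satisfies(R, f):
    """R |= f iff u |=_R f for every possible valuation u."""
    return all(holds(R, u, f) for u, r in R.items() if r != INF)

def sat_cond(R, ante, cons):
    """R |= alpha |~ beta, for comparison with Proposition 7."""
    ranks = [r for v, r in R.items() if ante(dict(v)) and r != INF]
    return not ranks or all(cons(dict(v)) for v, r in R.items()
                            if r == min(ranks) and ante(dict(v)))

# •b -> f, written as ¬(•b ∧ ¬f) since -> is a derived connective:
impl = ("not", ("and", ("typ", ("atom", "b")), ("not", ("atom", "f"))))
b_f = (lambda v: v["b"] == 1, lambda v: v["f"] == 1)

vals = [tuple(zip(("b", "f"), bits)) for bits in product((0, 1), repeat=2)]
# A few hand-picked convex rank assignments over (¬b¬f, ¬bf, b¬f, bf):
for ranks in [(0, 0, 1, 0), (1, 1, 0, 2), (0, INF, 1, 0)]:
    R = dict(zip(vals, ranks))
    assert satisfies(R, impl) == sat_cond(R, *b_f)
```

The second interpretation places b ∧ ¬f at rank 0, so both sides of the equivalence come out false there.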
3 Boolean KLM
In Section 2.1, we noted that the logic KLM is quite restrictive, as it allows only for positive conditional assertions. As mentioned there, Booth and Paris (1998) consider an extension allowing for negative conditionals, i.e. assertions of the form α ̸|∼ β. Here we take that extension further, and propose Boolean KLM (BKLM), which allows for arbitrary boolean combinations of conditionals, but not for nested conditionals. BKLM formulas are defined by A ::= α |∼ β | ¬A | A ∧ A, with other boolean connectives defined as usual. Following Booth and Paris, we will write α ̸|∼ β as a synonym for ¬(α |∼ β) where convenient, and we denote the set of all BKLM formulas by L^b. So, for example, (α |∼ β) ∧ (γ ̸|∼ δ) and ¬((α ̸|∼ β) ∨ (γ |∼ δ)) are BKLM formulas, but α |∼ (β |∼ γ) is not.
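BKLM satisfaction is easy to prototype on top of a KLM conditional checker. The sketch below is ours (tuple encoding and helper names invented). It also reconstructs an interpretation in the spirit of Figure 3, placing p̄q at rank 0 and pq̄ at rank 1 (an assumption: the examples fix only that one possible valuation falsifies p and the other falsifies q), and replays Examples 2 and 3:

```python
from itertools import product

INF = float("inf")

# BKLM formulas as nested tuples: ("cond", a, b) for a |~ b, ("not", A),
# ("and", A, B), with propositional arguments given as Python predicates.

def sat_cond(R, ante, cons):
    """R |= alpha |~ beta: minimal possible alpha-worlds satisfy beta."""
    ranks = [r for v, r in R.items() if ante(dict(v)) and r != INF]
    return not ranks or all(cons(dict(v)) for v, r in R.items()
                            if r == min(ranks) and ante(dict(v)))

def sat(R, A):
    """BKLM satisfaction, extending sat_cond through the boolean connectives."""
    tag = A[0]
    if tag == "cond":
        return sat_cond(R, A[1], A[2])
    if tag == "not":
        return not sat(R, A[1])
    if tag == "and":
        return sat(R, A[1]) and sat(R, A[2])
    raise ValueError(tag)

def Or(A, B):  # derived connective
    return ("not", ("and", ("not", A), ("not", B)))

# An interpretation in the spirit of Figure 3 (our reconstruction):
R = {(("p", 0), ("q", 1)): 0,
     (("p", 1), ("q", 0)): 1,
     (("p", 0), ("q", 0)): INF,
     (("p", 1), ("q", 1)): INF}

# p ∨ q translates to ¬(p ∨ q) |~ ⊥ ...
A = ("cond", lambda v: not (v["p"] or v["q"]), lambda v: False)
# ... and NOT to (¬p |~ ⊥) ∨ (¬q |~ ⊥):
B = Or(("cond", lambda v: not v["p"], lambda v: False),
       ("cond", lambda v: not v["q"], lambda v: False))

assert sat(R, A) and not sat(R, B)                 # Example 2
assert sat(R, ("not", ("cond", lambda v: v["p"],   # Example 3: R |= ¬(p |~ q)
                       lambda v: v["q"])))
```

Both disjuncts of B fail because each of ¬p and ¬q has a possible (finite-rank) world, even though every possible world satisfies p ∨ q.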
Satisfaction for BKLM is defined in terms of ranked interpretations, by extending KLM satisfaction to boolean combinations of conditionals in the obvious fashion, namely R ⊨ ¬A iff R ⊭ A, and R ⊨ A ∧ B iff R ⊨ A and R ⊨ B. This leads to some subtle differences between BKLM satisfaction and the other logics. For instance, care must be taken to apply Proposition 3 correctly when translating between propositional formulas and BKLM formulas. The propositional formula p ∨ q translates to the BKLM formula ¬(p ∨ q) |∼ ⊥, and not to the BKLM formula (¬p |∼ ⊥) ∨ (¬q |∼ ⊥), as the following example illustrates:

Example 2. Consider the propositional formula A = p ∨ q and the BKLM formula B = (¬p |∼ ⊥) ∨ (¬q |∼ ⊥). If R is the ranked interpretation in Figure 3, then R satisfies A but not B, as neither disjunct of the disjunction is satisfied.

Figure 3: A ranked interpretation illustrating the difference between classical disjunction and BKLM disjunction. It has two possible valuations, one falsifying p and the other falsifying q (both satisfying p ∨ q), on ranks 0 and 1.

To prevent possible confusion, we will avoid mixing classical and defeasible assertions in a BKLM knowledge base. For similar reasons, it is also worth noting the difference between boolean connectives in PTL and the corresponding connectives in BKLM. By Proposition 7, one might expect a BKLM formula such as ¬(p |∼ q) to translate into the PTL formula ¬(•p → q). The following example shows that this naïve approach fails:

Example 3. Consider the formulas A = ¬(•p → q) and B = ¬(p |∼ q), and let R be the ranked interpretation in Figure 3. Then A is equivalent to •p ∧ ¬q, which isn't satisfied by R. On the other hand, R satisfies B.

One might ask whether there is a more nuanced way of translating BKLM knowledge bases into PTL. In the next section we answer this question in the negative, by showing that BKLM is in fact strictly more expressive than PTL.

3.1 Expressiveness of BKLM

So far we have been rather vague about what we mean by the expressiveness of a logic. All of the logics we consider in this paper share the same semantic structures, which provides us with a handy definition. We say that a logic can characterise a set of ranked interpretations U ⊆ RI if there is some knowledge base K with U as its set of ranked models. Given this, we say that a logic is more expressive than another logic if it can characterise at least as many sets of interpretations.

Example 4. Let K ⊆ L^|∼ be a KLM knowledge base. Then its PTL translation K′ = {•α → β : α |∼ β ∈ K} has exactly the same ranked models by Proposition 7, and hence PTL is at least as expressive as KLM. Proposition 8 shows that this comparison is strict.

In this section we show that BKLM is maximally expressive, in the sense that it can characterise any set of ranked interpretations. For a valuation u ∈ U, we write û to mean any characteristic formula of u, namely any propositional formula such that v ⊨ û iff v = u. It is easy to see that these always exist, as P is finite, and that all characteristic formulas of u are logically equivalent.

Lemma 1. For any ranked interpretation R and valuations u, v ∈ U, it is straightforward to check that:
1. R ⊨ ⊤ ̸|∼ ¬û iff R(u) = 0.
2. R ⊨ û |∼ ⊥ iff R(u) = ∞.
3. R ⊨ û ∨ v̂ |∼ ¬v̂ iff u ≺_R v or R(u) = R(v) = ∞.

Note that this lemma holds even in the vacuous case where R(u) = ∞ for all u ∈ U. Following Lehmann et al. (1992), we write α < β as shorthand for the defeasible implication α ∨ β |∼ ¬β. We now show that the concept of characteristic formulas can be applied to ranked interpretations as well:

Lemma 2. Let R be any ranked interpretation. Then there exists a formula ch(R) ∈ L^b with R as its unique model.

Proof. Consider the following knowledge bases:
1. K_≺ = {û < v̂ : u ≺_R v} ∪ {û ̸< v̂ : u ̸≺_R v}
2. K_∞ = {û |∼ ⊥ : R(u) = ∞} ∪ {û ̸|∼ ⊥ : R(u) < ∞}
By Lemma 1, R satisfies K = K_≺ ∪ K_∞. To show that it is the unique model of K, consider any R* ∈ MOD(K). Since R* satisfies K_∞, we have R*(u) = ∞ iff R(u) = ∞ for any u ∈ U. Now consider any u, v ∈ U, and suppose that R(u) < ∞. Then u ≺_R v iff K_≺ contains û < v̂. But R* satisfies K_≺, so this is true iff u ≺_{R*} v, as R*(u) < ∞. On the other hand, if R(u) = ∞, then u ̸≺_R v and u ̸≺_{R*} v. Hence ≺_R = ≺_{R*}, which implies that R = R* by Proposition 2. We conclude the proof by letting ch(R) = ⋀_{α∈K} α.

We refer to ch(R) as the characteristic formula of R. A simple application of disjunction allows us to prove the following more general corollary:

Corollary 1. Let U ⊆ RI be a set of ranked interpretations. Then there exists a formula ch(U) ∈ L^b with U as its set of models.

This proves that BKLM is at least as expressive as PTL since, in principle, for every PTL knowledge base there is some BKLM knowledge base with the same set of models. It is not clear, however, whether there is a more natural description of this knowledge base than that provided by characteristic formulas. In the next section we will address this shortcoming by describing an explicit translation from PTL to BKLM knowledge bases.

In fact, BKLM is strictly more expressive than PTL. This is illustrated by the knowledge base K = {(⊤ |∼ p) ∨ (⊤ |∼ ¬p)}, which expresses the "excluded-middle" statement that typically one of p or ¬p is true. There are two distinct ≤_G-minimal ranked models of K, given by R1 and R2 in Figure 4, and hence K cannot have an equivalent PTL knowledge base by Proposition 10.
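The proof of Lemma 2 can be replayed computationally: encode the assertions of K≺ ∪ K∞ as checks on an arbitrary interpretation S (using item 3 of Lemma 1 to evaluate û < v̂), and confirm that only R itself passes all of them. A sketch over a single atom; all helper names are ours:

```python
from itertools import product

INF = float("inf")

def ranked_interpretations(vals):
    """All ranked interpretations over the given valuations (Definition 1)."""
    n = len(vals)
    for ranks in product(list(range(n)) + [INF], repeat=n):
        finite = sorted({r for r in ranks if r != INF})
        if finite == list(range(len(finite))):
            yield dict(zip(vals, ranks))

def sat_cond(R, ante, cons):
    """R |= alpha |~ beta, with antecedent/consequent as predicates on worlds."""
    ranks = [r for v, r in R.items() if ante(v) and r != INF]
    return not ranks or all(cons(v) for v, r in R.items()
                            if r == min(ranks) and ante(v))

def lt_hat(S, u, v):
    """S |= u_hat < v_hat, i.e. S |= u_hat ∨ v_hat |~ ¬v_hat."""
    return sat_cond(S, lambda w: w in (u, v), lambda w: w != v)

vals = [(("p", 0),), (("p", 1),)]
R = {vals[0]: 1, vals[1]: 0}            # p at rank 0, ¬p at rank 1

def respects_constraints(S):
    """The assertions of K_prec ∪ K_inf from Lemma 2, checked against S."""
    for u in R:
        # u_hat |~ ⊥ must hold in S exactly when R(u) = ∞:
        if sat_cond(S, lambda w: w == u, lambda w: False) != (R[u] == INF):
            return False
    for u in R:
        for v in R:
            # By Lemma 1(3): R |= u_hat < v_hat iff u ≺_R v or both ranks are ∞.
            expect = R[u] < R[v] or (R[u] == INF and R[v] == INF)
            if lt_hat(S, u, v) != expect:
                return False
    return True

models = [S for S in ranked_interpretations(vals) if respects_constraints(S)]
assert models == [R]                     # R is the unique model of ch(R)
```

The conjunction of these finitely many assertions is exactly the characteristic formula ch(R) of the lemma.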
3.2 Translating PTL Into BKLM

In Section 2.4, satisfaction for PTL formulas with respect to a ranked interpretation R was defined in terms of the possible valuations of R. In order to define a translation operator between PTL and BKLM, our main idea is to encode satisfaction with respect to a valuation u ∈ U in terms of an appropriate BKLM formula. In other words, we will define an operator tr_u : L^• → L^b such that for each u ∈ U^R, R ⊨ tr_u(α) iff u ⊨_R α.

Definition 4. Given α, β ∈ L^•, p ∈ P and u ∈ U, we define tr_u by structural induction as follows:
1. tr_u(p) := û |∼ p
2. tr_u(⊤) := û |∼ ⊤
3. tr_u(⊥) := û |∼ ⊥
4. tr_u(¬α) := ¬tr_u(α)
5. tr_u(α ∧ β) := tr_u(α) ∧ tr_u(β)
6. tr_u(•α) := tr_u(α) ∧ ⋀_{v∈U} [(v̂ < û) → ¬tr_v(α)]

Note that this is well-defined, as each case is defined in terms of the translation of strict subformulas. The translations can be viewed as a formal version of the definition of PTL satisfaction: case 6, for instance, states that •α is satisfied by a possible valuation u iff u is a minimal valuation satisfying α.

Lemma 3. Let R be a ranked interpretation, and u ∈ U^R a valuation with R(u) < ∞. Then for all α ∈ L^• we have R ⊨ tr_u(α) if and only if u ⊨_R α.

Proof. We prove the result by structural induction on the cases in Definition 4:
1. Suppose that R ⊨ tr_u(p), i.e. R ⊨ û |∼ p. This is true iff u ⊨ p, which is equivalent by definition to u ⊨_R p. Cases 2 and 3 are similar.
4. Suppose that R ⊨ tr_u(¬α), i.e. R ⊨ ¬tr_u(α). This is true iff R ⊭ tr_u(α), which by the induction hypothesis is equivalent to u ⊭_R α. But this is equivalent to u ⊨_R ¬α by definition. Case 5 is similar.
6. Suppose there exists an α ∈ L^• such that R ⊨ tr_u(•α) but u ⊭_R •α. Then either u ⊭_R α, which by the induction hypothesis is a contradiction since R ⊨ tr_u(α), or there is some v ∈ U with v ≺_R u such that v ⊨_R α. But by Lemma 1, v ≺_R u is true only if R ⊨ v̂ < û. We also have, by the induction hypothesis, that R ⊨ tr_v(α), since v ⊨_R α. Hence R ⊨ (v̂ < û) ∧ tr_v(α), which implies that one of the clauses in tr_u(•α) is false. This is a contradiction, so we conclude that R ⊨ tr_u(•α) implies u ⊨_R •α.
Conversely, suppose that u ⊨_R •α. Then u ⊨_R α, and hence R ⊨ tr_u(α) by the induction hypothesis. We also have that if v ≺_R u then v ⊭_R α, which is equivalent to R ⊨ ¬tr_v(α) by the induction hypothesis. But by Lemma 1, v ≺_R u iff R ⊨ v̂ < û. We conclude that R ⊨ (v̂ < û) → ¬tr_v(α) for all v ∈ U, and hence R ⊨ tr_u(•α).

A formula α ∈ L^• is satisfied by a ranked interpretation R iff it is satisfied by every possible valuation of R. We can combine the translation operators of Definition 4 to formalise this statement as follows:

Definition 5. tr(α) := ⋀_{u∈U} [(û ̸|∼ ⊥) → tr_u(α)]

All that remains is to check that this formula correctly encodes PTL satisfaction:

Lemma 4. For all α ∈ L^•, a ranked model R satisfies α if and only if it satisfies tr(α).

Proof. Suppose R ⊨ α. Then for all u ∈ U, either R(u) = ∞ or u ⊨_R α. The former implies R ⊨ û |∼ ⊥ by Lemma 1, and the latter implies R ⊨ tr_u(α) by Lemma 3. Thus R ⊨ (û ̸|∼ ⊥) → tr_u(α) for all u ∈ U, which proves R ⊨ tr(α) as required.
Conversely, suppose R ⊨ tr(α). Then for any u ∈ U, either R ⊨ û |∼ ⊥, and hence R(u) = ∞ by Lemma 1, or R ⊨ û ̸|∼ ⊥, and hence R ⊨ tr_u(α) by hypothesis. But then R ⊨ α by Lemma 3.

4 Entailment Results for BKLM

We now turn to the question of defeasible entailment for BKLM knowledge bases. As in previous cases, an obvious approach to this is rank entailment, which we define in the usual fashion:

Definition 6. Given any K ⊆ L^b and A ∈ L^b, we say K rank entails A (written K |≈_R A) iff R ⊨ A for all R ∈ MOD(K).

Being monotonic, rank entailment serves as a useful lower bound for defeasible BKLM entailment, but cannot be considered a good solution in its own right. Letting |≈? be an entailment relation and Cn? its associated consequence operator, consider the entailment properties of Section 2.4 in the context of BKLM. Our first observation is that the premises of Proposition 9 can be weakened as a consequence of global disjunction:

Lemma 5. There is no BKLM entailment relation |≈? satisfying Ampliativity, Typical Entailment and the Single Model property.

Proof. Suppose that |≈? is such an entailment relation, and consider the knowledge base K = {(⊤ |∼ p) ∨ (⊤ |∼ ¬p)}. Both interpretations in Figure 4, R1 and R2, are models of K. R1 satisfies ⊤ |∼ p and not ⊤ |∼ ¬p, whereas R2 satisfies ⊤ |∼ ¬p and not ⊤ |∼ p. Thus, by the Typical Entailment property, K ̸|≈? ⊤ |∼ p and K ̸|≈? ⊤ |∼ ¬p. On the other hand, by Ampliativity we get K |≈? (⊤ |∼ p) ∨ (⊤ |∼ ¬p). A single ranked interpretation cannot satisfy all three of these assertions, however, and hence no such entailment relation can exist.
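The counterexample in this proof, and the two incomparable minimal models R1 and R2 it relies on, can be confirmed by enumeration over a single atom. A sketch; all helper names are ours:

```python
from itertools import product

INF = float("inf")

def ranked_interpretations(vals):
    """All ranked interpretations over the given valuations (Definition 1)."""
    n = len(vals)
    for ranks in product(list(range(n)) + [INF], repeat=n):
        finite = sorted({r for r in ranks if r != INF})
        if finite == list(range(len(finite))):
            yield dict(zip(vals, ranks))

def sat_cond(R, ante, cons):
    """R |= alpha |~ beta, with predicates over raw valuation keys."""
    ranks = [r for v, r in R.items() if ante(v) and r != INF]
    return not ranks or all(cons(v) for v, r in R.items()
                            if r == min(ranks) and ante(v))

vals = [(("p", 0),), (("p", 1),)]                       # valuations over {p}
top_p = lambda R: sat_cond(R, lambda v: True, lambda v: v == vals[1])   # T |~ p
top_np = lambda R: sat_cond(R, lambda v: True, lambda v: v == vals[0])  # T |~ ¬p

# Models of K = {(T |~ p) ∨ (T |~ ¬p)}:
models = [R for R in ranked_interpretations(vals) if top_p(R) or top_np(R)]
assert any(top_p(R) and not top_np(R) for R in models)   # interpretations like R1
assert any(top_np(R) and not top_p(R) for R in models)   # interpretations like R2

# K has exactly two incomparable <=_G-minimal models, matching Figure 4:
def strictly_below(R1, R2):
    return all(R1[u] <= R2[u] for u in R1) and R1 != R2

minimal = [R for R in models if not any(strictly_below(S, R) for S in models)]
assert len(minimal) == 2
```

Since the two minimal models disagree on ⊤ |∼ p, no single interpretation can act as the canonical model that the Single Model property demands.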
In the PTL context, LM-entailment satisfies Ampliativity, Typical Entailment and the Single Model property. Thus Lemma 5 is a concrete sense in which BKLM entailment is more constrained than PTL entailment. This raises an interesting question: can we nevertheless define a notion of entailment for BKLM, in the same spirit as rational closure and LM-entailment, by giving up one of the above properties? In order to guarantee a rational entailment relation, it is desirable to keep the Single Model property in view of Proposition 4. For the rest of this section we will investigate the consequences of this choice, and show that while it is possible to satisfy the Single Model property for BKLM entailment, the resulting entailment relations are heavily restricted.

Figure 4: Ranked models of K = {(⊤ |∼ p) ∨ (⊤ |∼ ¬p)}. In R1 (a), the valuation p has rank 0 and p̄ has rank 1; in R2 (b), p̄ has rank 0 and p has rank 1.

4.1 Order Entailments

As we have seen in Section 2.3, rational closure can be modelled as a form of minimal model entailment. In other words, given a knowledge base K, we can construct the rational closure of K by placing an appropriate ordering on its set of ranked models (in this case ≤_G), and picking out the consequences of the minimal ones. In this section we formalise this notion of entailment, with a view towards understanding the Single Model property for BKLM.

Definition 7. Let < be a strict partial order on RI. Then for all knowledge bases K and formulas α, we define K |≈_< α iff R ⊨ α for all <-minimal elements of MOD(K).

The relation |≈_< will be referred to as the order entailment relation of <. Note that we have been deliberately vague about which logic we are dealing with, as the construction works identically for KLM, PTL and BKLM. It is also worth mentioning that the set of models of a consistent knowledge base always has <-minimal elements, as we have assumed finiteness of P, which implies a finite set of ranked interpretations.

Example 5. By Proposition 6, the rational closure of any KLM knowledge base K is the set of formulas satisfied by the (unique) ≤_G-minimal element of MOD(K). Thus rational closure is the order entailment relation of <_G over KLM.

In general, order entailment relations satisfy all of the rationality properties in Figure 2 except for rational monotonicity (RM). Rational monotonicity holds, for instance, if MOD(K) has a unique <-minimal model for every knowledge base K. This is the case for rational closure and LM-entailment, which both satisfy the Single Model property. The following proposition follows easily from the definitions, and shows that this is typical:

Proposition 11. An order entailment relation |≈_< satisfies the Single Model property iff MOD(K) has a unique <-minimal model for every knowledge base K.

A class of order entailment relations for which the Single Model property always holds are the total order entailment relations, i.e. those |≈_< corresponding to a total order <. Intuitively, this is a strong restriction, as an a priori total ordering over all possible ranked interpretations is unnatural in the context of an agent's knowledge. For BKLM entailment, it turns out that there is a partial converse to this observation, which we will prove in the next section.

4.2 The Single Model Property

In this section we prove that, under some mild assumptions, a BKLM entailment relation satisfying the Single Model property is always equivalent to a total order entailment relation.

Theorem 1. Suppose |≈? is a BKLM entailment relation satisfying Cumulativity, Ampliativity and the Single Model property. Then |≈? = |≈_<, where |≈_< is a total order entailment relation.

For the remainder of the section, consider a fixed BKLM entailment relation |≈? (with associated consequence operator Cn?), and suppose that |≈? satisfies Cumulativity, Ampliativity and the Single Model property. In what follows, we will move between the entailment relation and consequence operator notations freely as convenient. To begin with, we note the following straightforward lemma:

Lemma 6. For any knowledge base K ⊆ L^b, Cn?(K) = Cn_R(Cn?(K)) = Cn?(Cn_R(K)).

Our approach to proving Theorem 1 is to assign a unique index ind(R) ∈ N to each ranked interpretation R ∈ RI, and then show that Cn?(K) corresponds to minimisation of index in MOD(K). To construct this indexing scheme, consider the following algorithm:
1. Set M0 := RI, i := 0.
2. If Mi = ∅, terminate.
3. By Corollary 1, there is some knowledge base Ki ⊆ L^b such that MOD(Ki) = Mi.
4. By the Single Model property, there is some Ri ∈ Mi such that Cn?(Ki) = sat(Ri).
5. Set Mi+1 := Mi \ {Ri}, i := i + 1.
6. Go to step 2, and iterate until termination.

This algorithm is guaranteed to terminate, as M0 is finite and 0 ≤ |Mi+1| < |Mi|. Note that once the algorithm terminates, for each R ∈ RI there will have been a unique i ∈ N such that R = Ri. We will call this i the index of R, and denote it by ind(R). Given a knowledge base K, we define ind(K) = min{ind(R) : R ∈ MOD(K)} to be the minimum index of the knowledge base.

For clarity, when we write Rn, Kn and Mn in the following lemmas, we mean the ranked interpretations, knowledge bases and sets of models constructed in steps 3 to 5 of the algorithm when i = n.

Lemma 7. Given any knowledge base K ⊆ L^b, MOD(K) ⊆ Mn, where n = ind(K).

Proof. An easy induction on step 5 of the algorithm proves that Mn = {R ∈ RI : ind(R) ≥ n}. By hypothesis, ind(R) ≥ n for all R ∈ MOD(K), and hence MOD(K) ⊆ Mn.
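The indexing construction in steps 1-6 can be simulated when the oracle Cn? is itself generated by some fixed total order, which is the situation Theorem 1 says is unavoidable. In the sketch below (all names ours), `pick` plays the role of the Single Model oracle applied to the knowledge base characterising the remaining set, and the final loop checks on random "model sets" that the oracle's choice always has minimal index:

```python
import random
from itertools import product

INF = float("inf")

def ranked_interpretations(vals):
    """All ranked interpretations over the given valuations, as hashable tuples."""
    n = len(vals)
    for ranks in product(list(range(n)) + [INF], repeat=n):
        finite = sorted({r for r in ranks if r != INF})
        if finite == list(range(len(finite))):
            yield ranks

def build_index(interps, pick):
    """Steps 1-6: repeatedly apply the Single Model oracle to the remaining
    set M_i (which is MOD(K_i) for some K_i, by Corollary 1), record the
    chosen interpretation's index, and remove it."""
    remaining = list(interps)
    index = {}
    while remaining:
        chosen = pick(remaining)      # stands in for Cn?(K_i) = sat(R_i)
        index[chosen] = len(index)    # ind(R_i) = i
        remaining.remove(chosen)
    return index

vals = list(product((0, 1), repeat=2))    # valuations over two atoms
interps = list(ranked_interpretations(vals))
pick = min                                # an arbitrary fixed total order as oracle
index = build_index(interps, pick)

# In this toy setting, the oracle's choice on any model set is always its
# index-minimal element.
random.seed(0)
for _ in range(100):
    mod_k = random.sample(interps, random.randint(1, len(interps)))
    assert pick(mod_k) == min(mod_k, key=index.get)
```

The proof below shows the converse direction: any oracle satisfying Cumulativity, Ampliativity and the Single Model property must behave like such a total-order minimisation.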
The following lemma proves that entailment under |≈?
corresponds to minimisation of index:
176
Lemma 8. Given any knowledge base K ⊆ Lb , Cn? (K) =
sat(Rn ), where n = ind(K).
operator T (·) that does not go beyond the expressivity of a
K LM-style conditional language, but revised, of course, for
the expressivity of description logics.
In the context of adaptive logics, Straßer (2014) defines the logic R+ as an extension of K LM in which arbitrary boolean combinations of defeasible implications are allowed, and the set of propositional atoms has been extended
to include the symbols {li : i ∈ N}. Semantically, these
symbols encode rank in the object language, in the sense that
u li in a ranked interpretation R iff R(u) ≥ i. Straßer’s
interest in R+ is to define an adaptive logic ALC S that provides a dynamic proof theory for rational closure, whereas
our interest in B KLM is to generalise rational closure to more
expressive extensions of K LM. Nevertheless, the Minimal
Abnormality Strategy (see the work of Batens (2007), for
instance) for ALC S is closely related to LM -entailment as
defined in this paper.
Proof. For all A, Kn |≈R A iff R
A for all R ∈
M OD(Kn ) = Mn . But by lemma 7, M OD(K) ⊆ Mn
and hence CnR (Kn ) ⊆ CnR (K). On the other hand,
Rn ∈ M OD(K) by hypothesis and hence Rn
A
for all A ∈ K. By the definition of step 4 of the algorithm we have sat(Rn ) = Cn? (Kn ), and thus K ⊆
Cn? (Kn ). Applying CnR to each side of this inclusion (using the monotonicity of rank entailment), we get CnR (K) ⊆
CnR (Cn? (Kn )) = Cn? (Kn ), with the last equality following from lemma 6. Putting it all together, we have
CnR (Kn ) ⊆ CnR (K) ⊆ Cn? (Kn ), and hence by Cumulativity we conclude Cn? (K) = Cn? (Kn ) = sat(Rn ).
Consider the strict partial order on RI defined by R1 <
R2 iff ind(R1 ) < ind(R2 ). By construction, the index of a
ranked interpretation is unique, and hence < is total. It follows from lemma 8 that |≈? =|≈< , and hence |≈? is equivalent to a total order entailment relation. This completes the
proof of theorem 1.
5
6
Conclusion
The main focus of this paper is exploring the connection between expressiveness and entailment for extensions of the
core logic K LM. Accordingly, we introduce the logic B KLM,
an extension of K LM that allows for arbitrary boolean combinations of defeasible implications. We take an abstract approach to the analysis of B KLM, and show that it is strictly
more expressive than existing extensions of K LM such as
P TL (Booth, Meyer, and Varzinczak 2013) and K LM with
negation (Booth and Paris 1998). Our primary conclusion is
that a logic as expressive as B KLM has to give up several
desirable properties for defeasible entailment, most notably
the Single Model property, and thus appealing forms of entailment for P TL such as LM-entailment (Booth et al. 2015)
cannot be lifted to the B KLM case.
For future work, an obvious question is what forms of defeasible entailment are appropriate for B KLM. For instance,
is it possible to skirt the impossibility results proven in this
paper while still retaining the K LM rationality properties?
Other forms of entailment for P TL, such as PT-entailment,
have also yet to be analysed in the context of B KLM and
may be better suited to such an expressive logic.
Another line of research to be explored is whether there is a more natural translation of PTL formulas into BKLM than the one defined in this paper. Our translation is based on a direct encoding of PTL semantics, and consequently results in an exponential blow-up in the size of the formulas being translated. It is clear that there are much more efficient ways to translate specific PTL formulas, but we leave it as an open problem whether this can be done in general. In a similar vein, it is interesting to ask how PTL could be extended in order to make it equiexpressive with BKLM.
Finally, it may be interesting to compare BKLM with an extension of KLM that allows for nested defeasible implications, i.e. formulas such as α |∼ (β |∼ γ). While such an extension cannot be more expressive than BKLM, at least for a semantics given by ranked interpretations, it may provide more natural encodings of various kinds of typicality, and thus be easier to work with from a pragmatic point of view.
Related Work

The most relevant work w.r.t. the present paper is that of Booth and Paris (1998), in which they define rational closure for the extended version of KLM in which negated conditionals are allowed, and the work on PTL (Booth et al. 2015; Booth et al. 2019). The relation of this work to BKLM was investigated in detail throughout the paper.
Delgrande (1987) proposes a logic that is as expressive as BKLM. The entailment relation he proposes is different from the minimal entailment relations we consider here and, given the strong links between our constructions and the KLM approach, the remarks in the comparison made by Lehmann and Magidor (1992, Section 3.7) are also applicable here.
Boutilier (1994) defines a family of conditional logics using preferential and ranked interpretations. His logic is closer to ours and even more expressive, since nesting of conditionals is allowed, but he too does not consider minimal constructions. That is, both Delgrande's and Boutilier's approaches adopt a Tarskian-style notion of consequence, in line with rank entailment. The move towards a non-monotonic notion of defeasible entailment was precisely our motivation in the present work.
Giordano et al. (2010) propose the system Pmin, which is based on a language that is as expressive as PTL. However, they end up using a constrained form of such a language that goes only slightly beyond the expressivity of the language of KLM-style conditionals (their well-behaved knowledge bases). Also, the system Pmin relies on preferential models and a notion of minimality that is closer to circumscription (McCarthy 1980).
In the context of description logics, Giordano et al. (2007; 2015) propose to extend the conditional language with an explicit typicality operator T(·), with a meaning that is closely related to the PTL operator •. It is worth pointing out, though, that most of the analysis in the work of Giordano et al. is dedicated to a constrained use of the typicality
References

Batens, D. 2007. A universal logic approach to adaptive logics. Logica Universalis 1:221–242.

Booth, R., and Paris, J. 1998. A note on the rational closure of knowledge bases with both positive and negative knowledge. Journal of Logic, Language and Information 7(2):165–190.

Booth, R.; Casini, G.; Meyer, T.; and Varzinczak, I. 2015. On the entailment problem for a logic of typicality. In IJCAI 2015, 2805–2811.

Booth, R.; Casini, G.; Meyer, T.; and Varzinczak, I. 2019. On rational entailment for propositional typicality logic. Artificial Intelligence 277.

Booth, R.; Meyer, T.; and Varzinczak, I. 2013. A propositional typicality logic for extending rational consequence. In Fermé, E.; Gabbay, D.; and Simari, G., eds., Trends in Belief Revision and Argumentation Dynamics, volume 48 of Studies in Logic – Logic and Cognitive Systems. King's College Publications. 123–154.

Boutilier, C. 1994. Conditional logics of normality: A modal approach. Artificial Intelligence 68(1):87–154.

Casini, G., and Straccia, U. 2013. Defeasible inheritance-based description logics. JAIR 48:415–473.

Casini, G.; Meyer, T.; Moodley, K.; and Nortje, R. 2014. Relevant closure: A new form of defeasible reasoning for description logics. In JELIA 2014, 92–106.

Casini, G.; Meyer, T.; and Varzinczak, I. 2019. Taking defeasible entailment beyond rational closure. In Calimeri, F.; Leone, N.; and Manna, M., eds., Logics in Artificial Intelligence – 16th European Conference, JELIA 2019, Rende, Italy, May 7–11, 2019, Proceedings, volume 11468 of Lecture Notes in Computer Science, 182–197. Springer.

Delgrande, J. 1987. A first-order logic for prototypical properties. Artificial Intelligence 33:105–130.

Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2007. Preferential description logics. In Dershowitz, N., and Voronkov, A., eds., Logic for Programming, Artificial Intelligence, and Reasoning (LPAR), number 4790 in LNAI, 257–272. Springer.

Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. L. 2010. A nonmonotonic extension of KLM preferential logic P. In Logic for Programming, Artificial Intelligence, and Reasoning – 17th International Conference, LPAR-17, Yogyakarta, Indonesia, October 10–15, 2010, Proceedings, 317–332.

Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2015. Semantic characterization of rational closure: From propositional logic to description logics. Artificial Intelligence 226:1–33.

Kraus, S.; Lehmann, D.; and Magidor, M. 1990. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence 44:167–207.

Lehmann, D., and Magidor, M. 1992. What does a conditional knowledge base entail? Artificial Intelligence 55:1–60.

Lehmann, D. 1995. Another perspective on default reasoning. Annals of Mathematics and Artificial Intelligence 15(1):61–82.

McCarthy, J. 1980. Circumscription, a form of non-monotonic reasoning. Artificial Intelligence 13(1–2):27–39.

Straßer, C. 2014. An adaptive logic for rational closure. In Adaptive Logics for Defeasible Reasoning, volume 38 of Trends in Logic. Springer International Publishing. 181–206.
Towards Efficient Reasoning with Intensional Concepts

Jesse Heyninck1, Ricardo Gonçalves2, Matthias Knorr2, João Leite2
1 Technische Universität Dortmund
2 NOVA LINCS, Departamento de Informática, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa
jesse.heyninck@tu-dortmund.de, {rjrg,mkn,jleite}@fct.unl.pt
Abstract

Recent developments triggered by initiatives such as the Semantic Web, Linked Open Data, the Web of Things, and geographic information systems have resulted in the wide and increasing availability of machine-processable data and knowledge in the form of data streams and knowledge bases. Applications building on such knowledge require reasoning with modal and intensional concepts, such as time, space, and obligations, that are defeasible. For example, in the presence of data streams, conclusions may have to be revised due to newly arriving information. The current literature features a variety of domain-specific formalisms that allow for defeasible reasoning using specific intensional concepts. However, many of these formalisms are computationally intractable and limited to one of the mentioned application domains. In this paper, we define a general method for obtaining defeasible inferences over intensional concepts, and we study conditions under which these inferences are efficiently computable.

1 INTRODUCTION

In this paper, we develop a solution that allows us to efficiently reason with intensional concepts, such as time, space, and obligations, providing defeasible/non-monotonic inferences in the presence of large quantities of data.

Initiatives such as the Semantic Web, Linked Open Data, and the Web of Things, as well as modern Geographic Information Systems, have resulted in the wide and increasing availability of machine-processable data and knowledge in the form of data streams and knowledge bases. To truly take advantage of this kind of knowledge, it is paramount to be able to reason in the presence of intensional or modal concepts, which has resulted in an increased interest in formalisms, often based on rules with defeasible inferences, that allow for reasoning with time (Anicic et al. 2012; Gonçalves, Knorr, and Leite 2014; Brandt et al. 2018; Beck, Dao-Tran, and Eiter 2018; Brewka et al. 2018; Walega, Kaminski, and Grau 2019), space (Brenton, Faber, and Batsakis 2016; Walega, Schultz, and Bhatt 2017; Izmirlioglu and Erdem 2018; Suchan et al. 2018), and possibility or obligations (Panagiotidi, Nieves, and Vázquez-Salceda 2009; Gonçalves and Alferes 2012; Governatori, Rotolo, and Riveret 2018; Beirlaen, Heyninck, and Straßer 2019). Examples of such concepts may be found in applications with data referring, for example, to time (e.g., operators such as "next", "at time", "during interval T") or space (e.g., "at place P", "within a given radius", "connected to"), but also to legal reasoning (e.g., "is obliged to", "is permitted").

Example 1. Consider an airport that constantly receives data from sensors, cameras, etc. for monitoring, which, in combination with, e.g., facial recognition algorithms, allows one to automate and optimize relevant tasks, such as boarding, checking in, and giving or denying access to certain parts of the airport. For example, checked-in passengers that are missing for boarding can be traced and alerted, and, in the case of non-compliance in proceeding to the gate given the constraints on time and location, be subject to penalties. Also, passengers that comply with relevant security procedures can be automatically boarded, and irregularities can be communicated to the security personnel for possible intervention, allowing for a more efficient allocation of (human) resources, and a better-functioning and safer airport.

In this context, efficient reasoning with non-monotonic rules over intensional concepts is indeed mandatory, since a) rules allow us to encode monitoring and intervention guidelines and policies in a user-friendly and declarative manner; b) conclusions may have to be revised in the presence of newly arriving information; c) different intensional concepts need to be incorporated in the reasoning process; and d) timely decisions are required, even in the presence of large amounts of data, as in streams. However, relevant existing work usually deals with only one kind of intensional concept (as detailed before), and, in general, the computational complexity of the proposed formalisms is too high, usually due to both the adopted underlying formalism and the unrestricted reasoning with expressive intensional concepts.

In this paper, we introduce a formalism that allows us to reason with defeasible knowledge over intensional concepts. We build on so-called intensional logic programs (Orgun and Wadge 1992), extended with non-monotonic default negation, and equip them with a novel three-valued semantics with favorable properties. In particular, we define a well-founded model in the line of the well-founded semantics for logic programs (Gelder, Ross, and Schlipf 1991). Provided the adopted intensional operators satisfy certain properties, which turn out to be aligned with practical applications such as the one outlined in Ex. 1, the well-founded model is unique, minimal among the three-valued models, in the sense of only providing derivable consequences, and, crucially, its computation is tractable. Our approach allows us to add to relevant related work in the sense of providing a well-founded semantics to formalisms that did not have one, which we illustrate on a relevant fragment of LARS programs (Beck, Dao-Tran, and Eiter 2018).
The remainder of the paper is structured as follows. We
introduce intensional logic programs in Sec. 2, define our
three-valued semantics in Sec. 3, show how to compute the
well-founded model in Sec. 4, discuss the complexity and
related work in Secs. 5 and 6, respectively, before we conclude.
2 INTENSIONAL LOGIC PROGRAMS

In this section, building on previous work by Orgun and Wadge [1992], we introduce intensional logic programs, a very expressive framework that allows us to reason with intensional concepts, such as time, space, and obligations, in the presence of large quantities of data, including streams of data. Such intensional logic programs are based on rules, as used in normal logic programs, enriched with atoms that introduce the desired intensional concepts. The usage of default negation in the rules is a distinctive feature compared to the original work (Orgun and Wadge 1992); it is particularly well-suited to model non-monotonic and defeasible reasoning (Gelfond 2008) and allows us to capture many other forms of non-monotonic reasoning, see, e.g., (Caminada et al. 2015; Chen et al. 2010).

To assign meaning to intensional programs, we rely on the framework of neighborhood semantics (Pacuit 2017), a generalization of Kripke semantics that easily allows us to capture a wide variety of intensional operators. In this section, we introduce neighborhood frames to assign semantics to intensional operators, and leave the definition of our novel three-valued semantics for such programs to the next section.

We start by defining the basic elements of our language. We consider a function-free first-order signature Σ = ⟨P, C⟩, a set X of variables, and a set of operation symbols O, such that the sets P (of predicates), C (of constants), X, and O are mutually disjoint. The set of atoms over Σ and X is defined in the usual way. We say that an atom is ground if it does not contain variables, and we denote by AΣ the set of all ground atoms over Σ. In what follows, and without loss of generality, we leave the signature Σ implicit and consider only the set of ground atoms over Σ, denoted by A.

The set O contains the symbols representing the various intensional operators ∇. Based on these, we introduce intensional atoms.

Definition 1. Given a set of atoms A and a set of operation symbols O, the set I^A_O of intensional atoms over A and O is defined as I^A_O = {∇p | p ∈ A and ∇ ∈ O}, and the set of program atoms L^A_O is defined as L^A_O = A ∪ I^A_O.

We can define intensional logic programs as sets of rules with default negation, denoted by ∼, over program atoms.

Definition 2. Given a set of atoms A and a set of operation symbols O, an intensional logic program P over A and O is a finite set of rules of the form:

A ← A1, . . . , An, ∼B1, . . . , ∼Bm    (1)

where A, A1, . . . , An, B1, . . . , Bm ∈ L^A_O. We call A the head of the rule, and A1, . . . , An, ∼B1, . . . , ∼Bm its body.

We also call P simply a program when this does not cause confusion, and positive if it does not contain default negation. Intensional logic programs are highly expressive, as intensional operators can appear arbitrarily anywhere in the rules, in particular in rule heads and in the scope of default negation.

Example 2. Consider a fragment of the setting in Ex. 1 with two gateways: a and b. The area before gateway a is α, the area between gateways a and b is β, and the area behind gateway b is γ. α and β are transit zones where one is not allowed to wait. For simplicity we assume a finite timeline T = {1, 2}.¹ Consider the set of operators O1:

O1 = {O, □t, @t,ℓ, @ℓ, ⊳t | t ∈ T, ℓ ∈ {α, β, γ}}

where O expresses that "something is obligatory", □t means "something is the case at time t and every place ℓ", @ℓ means "something is the case at location ℓ", @t,ℓ means "something is the case at time t and location ℓ", and ⊳t means "something is the case at or before time t". We use a signature ⟨P, C⟩ where the set C of constants contains identifiers representing persons, including p representing Petra, and the set P of predicates is composed of the following unary predicates: passeda, passedb, move, is, and called. They express that x passed through gate a or b, x moves, x is at a spatio-temporal point, and x is called, respectively.
Consider the program P composed of the following rules:²

called(x) ← Omove(x), ∼move(x)    (2)
Omove(x) ← ∼@γ is(x)    (3)
□t move(x) ← @t+1,β is(x), @t,α is(x)    (4)
□t move(x) ← @t+1,γ is(x), @t,β is(x)    (5)
@t,β is(x) ← ⊳t passeda(x), ∼⊳t passedb(x)    (6)
@t,γ is(x) ← ⊳t passedb(x)    (7)
□1 passeda(p) ←    (8)
Rule (2) encodes that if a person should move, but does
not, she will be called. Rule (3) encodes that a person ought
to move if she is not at γ. In the case of rules (4) and (5),
a person moved if she was at two different locations at two
subsequent time points. Rule (6) encodes that if a person
passed through gate a, but not through gate b, she is at
β, whereas rule (7) imposes that she is at γ if she passed
through gate b. Finally, rule (8) asserts that Petra passed
through gate a at time 1.
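To make the rule format of Definitions 1 and 2 concrete, ground instances of the rules above can be mirrored in a small data structure. The following Python sketch is purely illustrative and not part of the paper's formalism: the names Atom, Rule, and the operator labels (e.g. "box_1" for the lost box glyph) are our own.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Atom:
    """A program atom (Definition 1): plain if op is None,
    intensional if op carries an operator label such as 'O'."""
    op: Optional[str]
    name: str

@dataclass(frozen=True)
class Rule:
    """A rule of form (1): head <- positive atoms, default-negated atoms."""
    head: Atom
    pos: Tuple[Atom, ...]
    neg: Tuple[Atom, ...]

# Ground instance of rule (2): called(p) <- O move(p), ~ move(p)
r2 = Rule(head=Atom(None, "called(p)"),
          pos=(Atom("O", "move(p)"),),
          neg=(Atom(None, "move(p)"),))

# Rule (8) is a fact: its body is empty.
r8 = Rule(head=Atom("box_1", "passed_a(p)"), pos=(), neg=())

program = {r2, r8}
# A program is *positive* iff every rule has an empty negated part.
is_positive = all(r.neg == () for r in program)
```

Note that the default negation ∼ lives in the rule structure (the neg component), not in the atoms themselves, matching the syntax of Definition 2.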
¹ Note that for applications such as the one described here, in practice, considering an arbitrarily large, but finite, timeline does indeed suffice.
² In the course of this example, we use variables to ease the presentation. They represent the ground instantiation of such rules with all possible constants in the usual way. The time variable t also represents all possible values.
In order to give semantics to intensional operators, we follow the same ideas as employed by Orgun and Wadge [1992] and consider the neighborhood semantics, a strict generalization of Kripke-style semantics that allows capturing intensional operators (Pacuit 2017) such as temporal, spatial, or deontic operators, even those that do not satisfy the normality property imposed by Kripke frames (Chellas 1980). We start by recalling neighborhood frames.

Definition 3. Given a set of operation symbols O, a neighborhood frame (over O) is a pair F = ⟨W, N⟩ where W is a non-empty set (of worlds) and N = {θ∇ | ∇ ∈ O} is a set of neighborhood functions θ∇ : W → ℘(℘(W)).³

Thus, in comparison to Kripke frames, instead of a relation over W, neighborhood frames have functions for each operator that map worlds to a set of sets of worlds. These sets intuitively represent the atoms necessary (according to the corresponding intensional operator) at that world.

Example 3. The operators from Ex. 2 are given semantics using a neighborhood frame where the set of worlds W1 is composed of triples (t, ℓ, ⋆), where t ∈ T is a time point, ℓ ∈ {α, β, γ} is a location, and ⋆ ∈ {I, A} indicates whether the world is the actual world A or the ideal world I (postulating ideal worlds is a standard technique for giving semantics to modal operators (McNamara 2019)). The neighborhoods of O1 are defined, for t, t′ ∈ T, ℓ, ℓ′ ∈ {α, β, γ}, w ∈ W1, and ⋆ ∈ {I, A}, as:

• θO((t, ℓ, ⋆)) = {W′ ⊆ W1 | (t, ℓ, I) ∈ W′};
• θ□t(w) = {W′ ⊆ W1 | (t, ℓ, A) ∈ W′ for every ℓ ∈ {α, β, γ}};
• θ@ℓ((t, ℓ′, ⋆)) = {W′ ⊆ W1 | (t, ℓ, A) ∈ W′};
• θ@t,ℓ(w) = {W′ ⊆ W1 | (t, ℓ, A) ∈ W′};
• θ⊳t(w) = {W′ ⊆ W1 | (t′, ℓ, A) ∈ W′ for some t′ ≤ t and some ℓ ∈ {α, β, γ}}.

Intuitively, θO((t, ℓ, ⋆)) consists of all the sets of worlds which include the ideal counterpart (t, ℓ, I) of (t, ℓ, ⋆); θ□t(w) consists of all the sets of worlds that include all actual worlds with time component t; θ@ℓ(w) contains all sets of worlds that contain at least one actual world with space component ℓ; a set of worlds is in θ@t,ℓ(w) if it contains (t, ℓ, A); finally, a set of worlds is in θ⊳t(w) if it contains at least one actual world with a time stamp t′ earlier than or equal to t.

Thus, neighborhood functions θ can either be invariant under the input w, i.e., θ(w) = θ(w′) for any w, w′ ∈ W (e.g., θ□t and θ@t,ℓ), or vary depending on w (e.g., θO and θ@ℓ). This is why the definitions above of neighborhood functions that depend on w need to make the components of the world w explicit, i.e., (t, ℓ, ⋆).

3 THREE-VALUED SEMANTICS

In this section, we define a three-valued semantics for intensional logic programs as an extension of the well-founded semantics for logic programs (Gelder, Ross, and Schlipf 1991) that incorporates reasoning over intensional concepts. The benefit of this approach over the more commonly used two-valued models is that, although there are usually several such three-valued models, we can determine a unique minimal one – intuitively, the one which contains all the minimally necessary consequences of a program – which can be efficiently computed. In fact, even for programs without intensional concepts, a unique two-valued minimal model does not usually exist (Gelfond and Lifschitz 1991).

We consider three truth values, "true", "false", and "undefined", where the latter corresponds to neither true nor false. Given a neighborhood frame, we start by defining interpretations that contain a valuation function which indicates in which worlds (of the frame) an atom from A is true (W⊤), and in which ones it is true or undefined (Wu), i.e., not false.⁴

Definition 4. Given a set of atoms A and a frame F = ⟨W, N⟩, an interpretation I over A and F is a tuple ⟨W, N, V⟩ with a valuation function V : A → ℘(W) × ℘(W) s.t., for every p ∈ A, V(p) = (W⊤, Wu) with W⊤ ⊆ Wu. If, for every p ∈ A, W⊤ = Wu, then we call I total.

The subset inclusion on the worlds ensures that no p ∈ A can be simultaneously true and false in some world. This intuition is made precise with the denotation of program atoms, for which we use the three truth values. We denote the truth values true, undefined, and false by ⊤, u, and ⊥, respectively, and we assume that the language L^A_O contains a special atom u (associated to u).

Definition 5. Given a set of atoms A, a frame F, and an interpretation I = ⟨W, N, V⟩, we define the denotation of A ∈ L^A_O in I:

• ∥p∥^†_I = W† if A = p ∈ A, with V(p) = (W⊤, Wu) and † ∈ {⊤, u};
• ∥u∥^u = W and ∥u∥^⊤ = ∅, if A = u;
• ∥∇p∥^†_I = {w ∈ W | ∥p∥^†_I ∈ θ∇(w)} if A = ∇p ∈ I^A_O and † ∈ {⊤, u};
• ∥A∥^⊥_I = W \ ∥A∥^u_I for A ∈ L^A_O.

For a formula A ∈ L^A_O and an interpretation I, ∥A∥^⊤_I is the set of worlds in which A is true, ∥A∥^u_I is the set of worlds in which A is not false, i.e., undefined or true, and ∥A∥^⊥_I is the set of worlds in which A is false. For atoms p ∈ A, the denotation is straightforwardly derived from the interpretation I, i.e., from the valuation function V, and for the special atom u it is defined as expected (undefined in all worlds). For an intensional atom ∇p, w is in the denotation ∥∇p∥^†_I of ∇p if the denotation of p (according to I) is a neighborhood of ∇ for w, i.e., ∥p∥^†_I ∈ θ∇(w).

³ Note that we often leave O implicit, as N allows us to uniquely determine all elements of O. Also, to ease the presentation, we only consider unary intensional operators. Others can then often be represented using rules (cf. also (Orgun and Wadge 1992)).
⁴ We follow the usual notation in modal logic, where interpretations explicitly include the corresponding frame.
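Because each θ in Ex. 3 is specified by a membership condition, a finite frame can be implemented by testing that condition directly, instead of materializing the exponentially many neighborhood sets. The following sketch is our own illustrative encoding (the operator labels such as ("box", t) and all names are assumptions, not the paper's notation):

```python
from itertools import product

T = [1, 2]                          # the finite timeline of Ex. 2
LOC = ["alpha", "beta", "gamma"]
# Worlds are triples (t, location, star) with star in {A: actual, I: ideal}.
W1 = [(t, l, s) for t, l, s in product(T, LOC, ["A", "I"])]

def in_theta(op, w, S):
    """True iff S is a neighborhood of world w for operator op,
    following the membership conditions of Ex. 3."""
    t, l, star = w
    kind = op[0]
    if kind == "O":        # obligation: S must contain the ideal counterpart
        return (t, l, "I") in S
    if kind == "box":      # box_t: all actual worlds at time op[1]
        return all((op[1], loc, "A") in S for loc in LOC)
    if kind == "at":       # @_{t,loc}: the actual world (op[1], op[2], A)
        return (op[1], op[2], "A") in S
    if kind == "before":   # <=_t: some actual world at a time <= op[1]
        return any((tp, loc, "A") in S for tp in T if tp <= op[1] for loc in LOC)
    raise ValueError(op)

# Denotation step of Definition 5: when p holds exactly at the actual
# worlds of time 1, box_1 p holds in every world of the frame.
den_p = {(1, loc, "A") for loc in LOC}
holds = [w for w in W1 if in_theta(("box", 1), w, den_p)]
```

Since θ□t is invariant under its world argument, the comprehension above yields all of W1, illustrating the invariant/varying distinction discussed after the example.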
We often leave the subscript I in ∥A∥^†_I, as well as the reference to A and F for interpretations and programs, implicit.
Example 4. Consider ⟨W1, {θO, θ@1,α, θ@β}⟩ as in Ex. 3 and I1 = ⟨W1, {θO, θ@1,α, θ@β}, V⟩ where:

V(passeda(p)) = ({(1, α, A)}, {(1, α, A)})
V(move(p)) = ({(1, α, I)}, {(1, α, I), (2, β, A)})

Then the following are examples of denotations of intensional atoms:

∥Omove(p)∥^⊤_I1 = {(1, α, A), (1, α, I)}
∥@1,α passeda(p)∥^⊤_I1 = W1
∥@β move(p)∥^u_I1 = {(2, ℓ, ⋆) | ℓ ∈ {α, β, γ}, ⋆ ∈ {A, I}}

We explain the first denotation ∥Omove(p)∥^⊤_I1: since ∥move(p)∥ ∋ (1, α, I), {(1, α, I)} ∈ θO((1, α, I)), and {(1, α, I)} ∈ θO((1, α, A)), we get the denotation ∥Omove(p)∥^⊤_I1 as stated above.

Based on the denotation, we can now define our model notion, which is inspired by partial stable models (Przymusinski 1991); these come with two favorable properties, minimality and support. The former captures the idea of minimal assumption, the latter provides traceable inferences from rules. We adapt this notion here by defining a reduct that, given an interpretation, transforms programs into positive ones, for which a satisfaction relation and a minimal model notion are defined.

We start by adapting two orders for interpretations, the truth ordering, ⊑, and the knowledge ordering, ⊑k. The former prefers higher truth values in the order ⊥ < u < ⊤, the latter more knowledge (i.e., less undefined knowledge). Formally, for interpretations I and I′, and every p ∈ A:

• I ⊑ I′ iff ∥p∥^†_I ⊆ ∥p∥^†_I′ for every † ∈ {⊤, u};
• I ⊑k I′ iff ∥p∥^⊤_I ⊆ ∥p∥^⊤_I′ and ∥p∥^⊥_I ⊆ ∥p∥^⊥_I′.

We write I ≺ I′ if I ⪯ I′ and I′ ⋠ I, for ⪯ ∈ {⊑, ⊑k}.

We proceed with a generalization of the notion of reduct to programs with intensional atoms.

Definition 6. Let A be a set of atoms, and F = ⟨W, N⟩ a frame. The reduct of a program P at w ∈ W w.r.t. an interpretation I, P/I_w, contains for each r ∈ P of the form (1):

• A ← A1, . . . , An if w ∉ ⋃_{i≤m} ∥Bi∥^u;
• A ← A1, . . . , An, u if w ∈ ⋃_{i≤m} ∥Bi∥^u \ ⋃_{i≤m} ∥Bi∥^⊤.

Intuitively, for each rule r of P, the reduct P/I_w contains either a rule of the first form, if all negated program atoms in the body of r are false at w (or the body has no negated atoms), or a rule of the second form, if none of the negated program atoms in the body of r are true at w but some of them are undefined at w, or neither, otherwise. This also explains why the reduct is defined at w: truth and undefinedness vary for different worlds. The special atom u is applied to ensure that rules of the second kind cannot impose the truth of the head in the notion of satisfaction for positive programs.

Note that the reduct of a program is a positive program, for which we can define a notion of satisfaction as follows.

Definition 7. Let A be a set of atoms, and F = ⟨W, N⟩ a frame. An interpretation I satisfies a positive program P at w ∈ W iff, for each r ∈ P of the form (1), we have that w ∈ ⋂_{i≤n} ∥Ai∥^† implies w ∈ ∥A∥^† (for any † ∈ {⊤, u}).⁵

Stable models can now be defined by imposing minimality w.r.t. the truth ordering on the corresponding reduct.

Definition 8. Let A be a set of atoms, and F = ⟨W, N⟩ a frame. An interpretation I is a stable model of a program P if:
• for every w ∈ W, I satisfies P/I_w at w, and
• there is no interpretation I′ such that I′ ⊏ I and, for each w ∈ W, I′ satisfies P/I_w at w.

Example 5. Recall P from Ex. 2. For simplicity of presentation, suppose that the set of constants C only contains p, resulting in the following grounded program.⁶

called(p) ← Omove(p), ∼move(p)
Omove(p) ← ∼@γ is(p)
□t move(p) ← @t+1,β is(p), @t,α is(p)
□t move(p) ← @t+1,γ is(p), @t,β is(p)
@t,β is(p) ← ⊳t passeda(p), ∼⊳t passedb(p)
@t,γ is(p) ← ⊳t passedb(p)
□1 passeda(p) ←

Consider F = ⟨W1, O1⟩ as in Ex. 3 and the total interpretation I1 defined by:

∥passeda(p)∥^⊤_I1 = {(1, ℓ, A) | ℓ ∈ {α, β, γ}}
∥passedb(p)∥^⊤_I1 = {}
∥is(p)∥^⊤_I1 = {(1, β, A)}
∥move(p)∥^⊤_I1 = {(t, ℓ, I) | t ∈ T; ℓ ∈ {α, β, γ}}
∥called(p)∥^⊤_I1 = {(t, ℓ, A) | t ∈ T; ℓ ∈ {α, β, γ}}

We see that for any (t′, ℓ, A) ∈ W1, P/(I1)_w consists of the following rules occurring in P:

called(p) ← Omove(p)
Omove(p) ←
□t move(p) ← @t+1,β is(p), @t,α is(p)
□t move(p) ← @t+1,γ is(p), @t,β is(p)
@t,β is(p) ← ⊳t passeda(p)
@t,γ is(p) ← ⊳t passedb(p)
□1 passeda(p) ←

⁵ Since the intersection of an empty sequence of subsets of a set is the entire set, for n = 0, i.e., when the body of the rule is empty, the satisfaction condition is just w ∈ ∥A∥^† for any † ∈ {⊤, u}.
⁶ To ease the presentation, we still use t to represent all possible values.
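Over finite frames, the denotation of Definition 5 and the reduct of Definition 6 are directly computable. The sketch below is our own illustration under simplifying assumptions: rules are plain (head, pos, neg) tuples, neighborhoods are given via a membership test, and 'u' is a bare marker for the special atom u.

```python
def make_den(W, V, in_theta):
    """Denotation (Definition 5) for plain and intensional atoms.
    V maps atom names to (W_top, W_u) with W_top a subset of W_u
    (Definition 4); in_theta(op, w, S) tests whether S is a
    neighborhood of w for operator op."""
    def den(atom, dagger):                 # dagger in {"T", "u"}
        op, p = atom                       # op is None for plain atoms
        w_top, w_u = V[p]
        base = w_top if dagger == "T" else w_u
        if op is None:
            return set(base)
        return {w for w in W if in_theta(op, w, frozenset(base))}
    return den

def reduct(program, den, w):
    """Reduct P/I_w (Definition 6). Rules with a negated atom that is
    true at w are dropped; if some remaining negated atom is undefined
    at w, the special atom 'u' is appended to the body."""
    out = []
    for head, pos, neg in program:
        if any(w in den(b, "T") for b in neg):
            continue                       # first case fails definitely
        undef = any(w in den(b, "u") for b in neg)
        out.append((head, pos + ("u",) if undef else pos))
    return out
```

On the resulting positive program, the satisfaction test of Definition 7 would then treat 'u' as undefined in every world, which is exactly what blocks rules of the second form from forcing their heads to be true.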
Whereas for any (t′ , ℓ, I) ∈ W1 , P/(I1 )w consists of:
θ|1,2|γ (w) for any w ∈ W1 , k|1, 2|γ is(p)k⊤
I2 = ∅,
which means that P/I2 = {@1,γ is(p) ←; @2,γ is(p) ←
; @3,γ is(p) ←}. Clearly, I2 is the ⊏-minimal interpretation that satisfies P/I2 . However, I1 ⊏ I2 and thus, I2 is
not a truth-minimal stable model.
To counter that, we consider monotonic operators. Formally, given a set of atoms A and a frame F, an intensional operator ∇ is said to be monotonic in F if, for any
two interpretations I and I ′ such that I ⊑ I ′ , we have that
k∇pk†I ⊆ k∇pk†I ′ for every p ∈ A and † ∈ {⊤, u}.
If all intensional operators in a frame are monotonic, then
truth-minimality of stable models is guaranteed.
Proposition 2. Let A be set of atoms, and F a frame in
which all intensional operators are monotonic. If I is a stable model of P, then there is no stable model I ′ of P such
that I ′ ⊏ I.
Regarding support, recall that the stable models semantics
of normal logic programs satisfies the support property, in
the sense that for every atom of a stable model there is a
rule that justifies it. In other words, if we remove an atom p
from a stable model some rule becomes false in the resulting
model. Such rule can be seen as a justification for p being
true at the stable model. In the case of intensional logic
programs we say that an interpretation I = hW, N, V i is
supported for a program P if, for every p ∈ A and w ∈ W ,
if w ∈ kpk⊤ , then there is a rule r ∈ P/Iw that is not
satisfied by I ′ at w, where I ′ = hW, N, V ′ i is such that
V ′ (q) = V (q) for q 6= p, and V ′ (p) = hW ⊤ \ {w}, W u i
where V (p) = hW ⊤ , W u i.
This notion of supportedness is desirable for intensional
logic programs since we also want a justification why each
atom is true at each world in a stable model. The following
results show that this is indeed the case.
Proposition 3. Let A be set of atoms, and F a frame. Then,
every stable model of a program P is supported.
In general, the existence and uniqueness of stable models of a program is not guaranteed, not even for positive
programs and/or under the restriction of all operators being
monotonic.
Example 7. Let O = {⊕}, A = {p} and F =
h{1, 2}, {θ⊕ }i where θ⊕ (1) = θ⊕ (2) = {{1}, {2}}. Let
P = {⊕p ←}. This program has two stable models:
• I1 with V1 (p) = ({1}, {1});
• I2 with V2 (p) = ({2}, {2}).
The existence of two stable models of the above positive
program is caused by the non-determinism introduced by
the intensional operator in the head of the rule. Formally,
an
T operator θ of a frame F = hW, N i is deterministic if
θ(w) ∈ θ(w) for every w ∈ W . A program P is deterministic in the head if, for every rule r ∈ P of the form (1),
if A = ∇p, then θ∇ is deterministic.
We can show that every positive program that is deterministic in the head and only considers monotonic operators has
a single minimal model.
Proposition 4. Given a set of atoms A and a frame F, if P
is a positive program that is deterministic in the head and
Omove(p) ←
t move(p) ← @t+1,β is(p), @t,α is(p)
t move(p) ← @t+1,γ is(p), @t,β is(p)
@t,β is(p) ← ⊳t passeda (p)
@t,γ is(p) ← ⊳t passedb (p)
1 passeda (p) ←
It can be checked that I1 satisfies minimality and is therefore
a stable model of P.
Consider now the total interpetation I2 identical to I1 except for kpassedb (p)k⊤
I2 = {(2, ℓ, A) | ℓ ∈ {α, β, γ}}.
Then for e.g. (2, α, A), the reduct P/(I2 )(2,α,A) is:
called(p) ← Omove(p)
Omove(p) ←
t move(p) ← @t+1,β is(p), @t,α is(p)
t move(p) ← @t+1,γ is(p), @t,β is(p)
@t,γ is(p) ← ⊳t passedb (p)
1 passeda (p) ←
Since (2, α, A) 6∈ k@2,γ is(p)k = ∅ even though
@2,γ is(p) ← ⊳2 passedb (p) ∈ P/(I2 )(2,α,A) and
(2, α, A) ∈ k ⊳2 passedb (p)k⊤
I2 = W1 , we see that I2 does
not satisfy P/(I2 )(2,α,A) and therefore is not a stable model.
We can show that our model notion is faithful to partial stable models of normal logic programs (Przymusinski
1991), i.e., if we consider a program without intensional
atoms, then its semantics corresponds to that of partial stable
models.
Proposition 1. Let A be set of atoms, F a frame, and P a
program with no intensional atoms. Then, there is a one-toone correspondence between the stable models of P and the
partial stable models of the normal logic program P.
While partial stable models are indeed truth-minimal, this
turns out not to be the case for intensional programs, due to
non-monotonic intensional operators.
Example 6. Consider the operator |j, k|γ representing that
an atom is true at γ at all time points in [j, k], and not in any
interval properly containing [j, k]. This operator has the following neighborhood (given W1 from Ex. 3): θ|j,k| (w) =
{W ′ ⊆ W1 | {(j, γ, A), (j + 1, γ, A), . . . , (k, γ, A)} ⊆
W ′ and (j − 1, γ, A), (k + 1, γ, A) 6∈ W ′ }. Consider the
following program P consisting of:
@1,γ is(p) ←
@2,γ is(p) ←
@3,γ is(p) ←∼ |1, 2|γ is(p)
This program has two stable models, of which one is not minimal. In more detail, the following interpretations are stable: I1 with ‖is(p)‖⊤_{I1} = ‖is(p)‖u_{I1} = {(1, γ, A), (2, γ, A)} and I2 with ‖is(p)‖⊤_{I2} = ‖is(p)‖u_{I2} = {(1, γ, A), (2, γ, A), (3, γ, A)}. To see that I2 is stable, observe first that since {(1, γ, A), (2, γ, A), (3, γ, A)} ∉
every ∇ ∈ O is monotonic in F, then it has a unique stable model.

Due to this result, in what follows, we focus on monotonic operators and programs that are deterministic in the head, as this is important for several of the results we obtain subsequently. This does not mean that non-monotonic intensional operators cannot be used in our framework. In fact, we can take advantage of the default negation operator ∼ to define non-monotonic formulas on the basis of monotonic operators and default negation. As an example, consider again the operator |j, k| from Ex. 6. We can use the following rule to define |j, k|p for some atom p ∈ A:

|j, k|p ← @j p, @j+1 p, . . . , @k−1 p, @k p, ∼ @j−1 p, ∼ @k+1 p.

Among the stable models of a program, we can distinguish the well-founded models as those that are minimal in terms of the knowledge order.

Definition 9. Given a set of atoms A and a frame F, an interpretation I = ⟨W, N, V⟩ is a well-founded model of a program P if it is a stable model of P and, for every stable model I′ of P, it holds that I ⊑k I′.

Example 8 (Example 5 continued). Since I2 is in fact the unique stable model, it is therefore the well-founded model.

Given our assumptions about monotonicity and determinism in the head, we can also show that the well-founded model of an intensional program exists and is unique.

Theorem 1. Given a set of atoms A and a frame F, every program P has a unique well-founded model.

4 ALTERNATING FIXPOINT

In this section, we show how the well-founded model can be efficiently computed. Essentially, we extend the idea of the alternating fixpoint developed for logic programs (Gelder 1989), which builds on computing, in an alternating manner, underestimates of what is necessarily true and overestimates of what is not false, with the mechanisms to handle intensional inferences. Namely, we first define an operator for inferring consequences from positive programs, and make it applicable in general using the reduct. Based on an iterative application of these, the alternating fixpoint provides the well-founded model.

First, since different pieces of knowledge are inferable in different worlds, we need a way to distinguish between these. Therefore, we introduce labels referring to worlds and apply them to formulas of a given language as well as to programs. Given a language L, a frame F = ⟨W, N⟩, and a program P, we define the language labelled by W, LW, as {w : A | w ∈ W and A ∈ L}, and the program labelled by W, PW, as {w : r | r ∈ P, w ∈ W}. PW is positive iff P is. This allows us to say that some (program) atom is true (or undefined) at world w, which can, e.g., be used for inferences using rules labelled with w. This way, we can also compute inferences for several worlds simultaneously, since the labels allow us to avoid any potential overlaps.

We now proceed to define an operator for positive labelled programs for computing inferences given a set of labelled atoms. This operator is composed of three operators to incorporate reasoning over rules as well as over intensional atoms. We first define an immediate consequence operator applied to sets of labelled program atoms. To ease notation, here and in the following, we use LW to represent (L^A_O)W.

Definition 10. Given a frame F, a set of LW-formulas ∆, and a positive program PW, we define TPW (∆) as follows:

TPW (∆) = {w : A | w : A ← A1, . . . , An ∈ PW, w : A1, . . . , w : An ∈ ∆}

Example 9. Let PW = {(2, α, A) : @1 passeda (p) ←}. Then TPW (∅) = {(2, α, A) : @1 passeda (p)}.

The result of TPW may contain labelled intensional atoms, such as in Ex. 9, which implies that passeda (p) holds at (1, α, A). The next operator, the intensional extraction operator IE∇, allows us to derive such labelled atoms from labelled intensional atoms.

Definition 11. Given a frame F, a set of LW-formulas ∆, and ∇ ∈ O, we define IE∇ (∆) as follows:

IE∇ (∆) = {w : A | w′ : ∇A ∈ ∆, w ∈ ⋂ θ∇ (w′)}

As IE∇ is intended to be applied to results of TPW, and these only contain intensional atoms occurring in the head of some rule in a given program P, we restrict IE∇ to this set of intensional operators, which we denote by OP. Since P is deterministic in the head, this also ensures that IE∇ has a unique outcome. Also, since programs do not contain nested operators, we can consider the union of IE∇ for all ∇ ∈ OP.

Finally, the intensional consequence operator IC∇ maps atoms to intensional atoms that are implied by the former, i.e., it maps w1 : A, . . . , wn : A to w : ∇A if {w1, . . . , wn} ∈ θ∇ (w).

Definition 12. Given a frame F = ⟨W, N⟩, a set of LW-formulas ∆, and θ∇ ∈ N, we define IC∇ (∆) as follows:

IC∇ (∆) = {w : ∇A ∈ LW | {w′ | w′ : A ∈ ∆} ∈ θ∇ (w), w ∈ W}

Here, we can also simply consider the union for all θ∇ ∈ N.

Example 10. Consider the frame F = ⟨W1, {θ@1, θ⊳1}⟩ as defined in Ex. 3. Let ∆ = {(2, α, A) : @1 passeda (p)}. Since (2, α, A) : @1 passeda (p) ∈ ∆ and (1, ℓ, A) ∈ ⋂ θ@1 ((2, α, A)) for each ℓ ∈ {α, β, γ}, we obtain IE@1 (∆) = {(1, ℓ, A) : passeda (p) | ℓ ∈ {α, β, γ}}. Informally, to believe ∆, and thus to believe that Petra passed gate a at time 1, passeda (p) has to be true at every actual world with time stamp 1. Also, IC⊳1 (IE@1 (∆)) = {w : ⊳1 passeda (p) | w ∈ W1}, since {(1, ℓ, A) | ℓ ∈ {α, β, γ}} ∈ θ⊳1 (w) for any w ∈ W1 and (1, ℓ, A) : passeda (p) ∈ IE@1 (∆) for any ℓ ∈ {α, β, γ}. Informally, since we know that passeda (p) is the case at time 1, we derive that passeda (p) is the case at or before time 1 (formally: ⊳1 passeda (p)).

We are now ready to define the closure operator as the least fixpoint of the composition of TPW, IE∇, and IC∇.
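To make the interplay of the three operators concrete, here is a minimal Python sketch under a toy encoding of our own devising: labelled atoms are (world, formula) pairs, an intensional atom ∇A is encoded as a tuple ('op', A), a labelled program is a set of (world, (head, body)) pairs, and neighborhood functions map a world to a list of world-sets. None of these representation choices are from the paper.

```python
from functools import reduce

def T(program_W, delta):
    """Immediate consequences: fire each labelled rule whose body holds in delta."""
    return {(w, head) for (w, (head, body)) in program_W
            if all((w, b) in delta for b in body)}

def IE(nbh, delta):
    """Intensional extraction: from w1: ('op', A) derive w: A
    for every w in the intersection of theta_op(w1)."""
    out = set()
    for (w1, formula) in delta:
        if isinstance(formula, tuple):        # ('op', A) encodes an intensional atom
            op, atom = formula
            theta = nbh.get(op)
            if theta and theta(w1):
                common = reduce(set.intersection, theta(w1))
                out |= {(w, atom) for w in common}
    return out

def IC(nbh, worlds, delta):
    """Intensional consequences: derive w: ('op', A) whenever
    {w1 | w1: A in delta} is a neighborhood in theta_op(w)."""
    atoms = {f for (_, f) in delta if not isinstance(f, tuple)}
    out = set()
    for op, theta in nbh.items():
        for w in worlds:
            for atom in atoms:
                if {w1 for (w1, f) in delta if f == atom} in theta(w):
                    out.add((w, (op, atom)))
    return out

def Cn(program_W, nbh, worlds):
    """Least fixpoint of the composition of T, IE and IC (cf. Def. 13)."""
    delta = set()
    while True:
        new = T(program_W, delta)
        new |= IE(nbh, delta | new)
        new |= IC(nbh, worlds, delta | new)
        if new <= delta:
            return delta
        delta |= new
```

For instance, with worlds {1, 2}, the fact @1 p at every world (encoded as ('@1', 'p')), and θ@1 returning [{1}] everywhere, Cn derives p at world 1 in addition to the labelled facts, mirroring the extraction step of Ex. 10.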
Definition 13. Given a frame F = ⟨W, N⟩ and a positive program PW, Cn(PW) is defined as the least fixpoint⁷ of

⋃_{∇∈O} IC∇ ( ⋃_{∇∈O_P} IE∇ (TPW ) ).

The consequence operator defined above is adequate in the sense that it calculates the minimal (total) model of a positive program. In more detail, letting Wp = {w ∈ W | w : p ∈ Cn(PW)}, the interpretation I(Cn) = ⟨W, N, V⟩ with V(p) = (Wp, Wp) for every p ∈ A is a minimal model.

Proposition 5. Given a frame F = ⟨W, N⟩ and a positive program P, if every ∇ ∈ O is monotonic in F and P is deterministic in the head, then I(Cn) is the unique (and total) stable model of P.

This result can be generalized to arbitrary programs relying on the reduct and on the alternating fixpoint (Gelder 1989). For the former, we adapt the reduct from Def. 6 to labelled languages, with the benefit that we can define a single reduct for all worlds w ∈ W. Given a frame F = ⟨W, N⟩, a set of LW-formulas ∆, and a program PW, the reduct is PW /∆ = {w : A ← A1, . . . , An | r ∈ P of the form (1) and, ∀i ⩽ m, w : Bi ∉ ∆}.

The idea of the alternating fixpoint can be summarized as follows. We create two sequences that are meant to represent an underestimation of what is true (P i) and an overestimation of what is not false (N i). Each iteration is meant to further increase the elements in P i and further decrease the elements in N i using Cn over reducts obtained from the labelled program and the results from the previous iteration. Given a frame F = ⟨W, N⟩ and a program P, we define:

P 0 = ∅            N 0 = LW
P i+1 = Cn(PW /N i)    N i+1 = Cn(PW /P i)
P ω = ⋃_i P i        N ω = ⋂_i N i

We can show that P i is increasing and that N i is decreasing, and both sequences reach a fixpoint because the operator for determining Cn is monotonic and the reduct is antitonic.

Proposition 6. Given a set of atoms A, a frame F = ⟨W, N⟩, and a positive program P, if every ∇ ∈ O is monotonic in F and P is deterministic in the head, then there are i, j ∈ N s.t. P i = P i+1 and N j = N j+1.

Example 11. Consider the frame F = ⟨W1, {θ@1, θ@1,β}⟩ as defined in Ex. 3. Let PW = {w : @1 passeda (p) ←; w : @1,β is(p) ← passeda (p), ∼ passedb (p) | w ∈ W1}. The alternating fixpoint construction is carried out as follows. We start with P 0 = ∅ and N 0 = LW. Then PW /N 0 = {w : @1 passeda (p) ← | w ∈ W1} and PW /P 0 = {w : @1 passeda (p) ←; w : @1,β is(p) ← passeda (p) | w ∈ W1}, which implies P 1 = {w : @1 passeda (p); (1, ℓ, A) : passeda (p) | w ∈ W1, ℓ ∈ {α, β, γ}} and N 1 = P 1 ∪ {(1, ℓ, A) : @1,β is(p) | ℓ ∈ {α, β, γ}}. From this we derive PW /P 1 = PW /N 1 = {w : @1 passeda (p) ←; w : @1,β is(p) ← passeda (p) | w ∈ W1}, which allows us to calculate P 2 = N 2 = N 1. Notice that a fixpoint is reached and thus P ω = N ω = N 1.

Given a frame F for which any ∇ ∈ O is monotonic in F, the alternating fixpoint construction defined above offers a characterization of the well-founded model for programs that are deterministic in the head. In more detail, given a pair ⟨∆, Θ⟩ of sets of LW-formulas, we define a partial interpretation I(⟨∆, Θ⟩) = ⟨W, N, V⟩ as follows: for every A ∈ A, V(A) = ({w ∈ W | w : A ∈ ∆}, {w ∈ W | w : A ∈ Θ}). We can then show this correspondence.

Theorem 2. Given a frame F = ⟨W, N⟩ and a program P s.t. every ∇ ∈ O is monotonic in F and P is deterministic in the head, I(⟨P ω, N ω⟩) is the well-founded model of P.

Thus the result of the alternating fixpoint operator is a precise representation of the well-founded model of the considered intensional program.

5 COMPUTATIONAL COMPLEXITY

In this section, we study the computational complexity of several of the problems considered. We recall that the problem of satisfiability under neighborhood semantics has been studied for a variety of epistemic structures (Vardi 1989). Here, we consider the problem of determining models for the two notions we established, stable models and the well-founded model, and we focus on the propositional case.⁸ We assume familiarity with standard complexity concepts, including oracles and the polynomial hierarchy.

We first provide a result in the spirit of model checking for programs P. As we do not impose any semantic properties on the neighborhood frames we consider, determining a model for a frame that can be arbitrarily chosen is not meaningful. Thus, in the remainder, we assume a fixed frame F, fixing the worlds and the semantics of the intensional operators.⁹

Proposition 7. Given a program P and an interpretation I, deciding whether I is a stable model of P is in coNP.

This result is due to the minimization of stable models, i.e., we need to check for satisfaction and verify that there is no other interpretation which is smaller (cf. Def. 8). This also impacts the complexity of finding a stable model given a fixed frame.

Theorem 3. Given a program P, deciding whether there is a stable model of P is in Σᴾ₂.

⁷ Recall that, given an operator T over a lattice (L, ⩽) and an ordinal α, T ↑ α is defined as: T ↑ 0 = ∅, T ↑ α = T(T ↑ (α−1)) for successor ordinals, and T ↑ α = ⋃_{β<α} T ↑ β for limit ordinals.

⁸ Corresponding results for the data complexity of this problem for programs with variables can then be achieved in the usual way (Dantsin et al. 2001).

⁹ This also aligns well with related work, e.g., for reasoning with time, such as stream reasoning, where often a finite timeline is assumed, and avoids the exponential explosion in the number of worlds for satisfiability for some epistemic structures (Vardi 1989).
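As an illustration of the alternating fixpoint construction of Section 4, the P i/N i iteration can be sketched in Python; cn and reduct are passed in as callables, and the toy instantiation below (for a plain normal program whose rules are (head, positive body, negative body) triples) is our own simplification, not the labelled intensional machinery of the paper.

```python
def alternating_fixpoint(program, cn, reduct, universe):
    """Iterate P^{i+1} = Cn(P/N^i) and N^{i+1} = Cn(P/P^i)
    from P^0 = empty set, N^0 = universe, until both are stable."""
    p, n = set(), set(universe)
    while True:
        p_next = cn(reduct(program, n))
        n_next = cn(reduct(program, p))
        if p_next == p and n_next == n:
            return p, n                      # (P^omega, N^omega)
        p, n = p_next, n_next

# Toy instantiation for normal programs: a rule is (head, pos_body, neg_body).
def reduct(rules, delta):
    """Drop every rule with a negated body atom that occurs in delta."""
    return [(h, pos) for (h, pos, neg) in rules if not (set(neg) & delta)]

def cn(pos_rules):
    """Minimal model of a positive program by naive fixpoint iteration."""
    derived, changed = set(), True
    while changed:
        changed = False
        for h, pos in pos_rules:
            if set(pos) <= derived and h not in derived:
                derived.add(h)
                changed = True
    return derived
```

For the program {a ←; b ← ∼c; c ← ∼b}, the iteration stabilizes with P ω = {a} and N ω = {a, b, c}: a is true, while b and c remain undefined, matching the well-founded semantics of normal programs.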
Note that these results do not require that intensional operators be monotonic or deterministic in the head. In fact, if intensional operators are monotonic, we obtain the following improved results on Prop. 7 and Thm. 3 from Prop. 2.

Corollary 1. Given a program P such that all operators occurring in P are monotonic, and an interpretation I, deciding whether I is a stable model of P is in P.

Corollary 2. Given a program P, deciding whether there is a stable model of P is in NP.

Thus, if all operators are monotonic, the complexity results coincide with those of normal logic programs (without intensional atoms) (Dantsin et al. 2001), which indicates that monotonic operators do not add an additional burden in terms of computational complexity.

Now, if we in addition consider programs that are deterministic in the head, then we know that there exists the unique well-founded model (cf. Thm. 1). As we have shown, this model can be computed efficiently (cf. Thm. 2), and we obtain the following result in terms of computational complexity.

Theorem 4. Given a program P that is deterministic in the head such that all operators occurring in P are monotonic, computing the well-founded model of P is P-complete.

Note that this result is indeed crucial in contexts where reasoning with a variety of intensional concepts needs to be highly efficient.

6 RELATED WORK

In this section, we discuss related work, establishing relations to relevant formalisms in the literature.

Intensional logic programs were first defined by Orgun and Wadge (1992), focussing on the existence of models as a function of the properties of the intensional operators. Only positive programs are considered, but nesting of intensional operators is allowed. The latter, however, can be covered in our approach by introducing corresponding additional operators that represent each nesting occurring in such a program. This allows us to show that our approach covers the previous work.

Proposition 8. Let P be a program as in (Orgun and Wadge 1992). Then there is a positive intensional program P′ such that there is a one-to-one correspondence between the models of P and the total stable models of P′.

The contrary is not true already for programs without intensional operators. We could use a non-monotonic intensional operator for representing default negation, but these are not considered in (Orgun and Wadge 1992), confirming that our work is indeed an extension of the previous approach. Since (Orgun and Wadge 1992) covers classical approaches for intensional reasoning, such as TempLog (Abadi and Manna 1989) and MoLog (del Cerro 1986), our work applies to these as well.

Our work also relates to more recent work with intensional operators, and we first discuss two prominent approaches in the area of stream reasoning.

LARS (Beck, Dao-Tran, and Eiter 2018) assumes a set of atoms A and a stream S = (T, v), where T is a closed interval of the natural numbers and v is an evaluation function that defines which atoms are true at each time point of T. Several temporal operators are defined, including expressive window operators, and answer streams, a generalization of FLP-semantics, are employed for reasoning. A number of related approaches are covered, including CQL (Arasu, Babu, and Widom 2006), C-SPARQL (Barbieri et al. 2010), and CQELS (Phuoc et al. 2011). Among the implementations exists LASER (Bazoobandi, Beck, and Urbani 2017), which focuses on a considerable fragment, called plain LARS.

We can represent a plain LARS program P for stream S as a program PS, encoding S using the @t operator. This allows us to show the following result relating the answer streams of plain LARS to the total stable models of such a program PS.

Proposition 9. Given a plain LARS program P for stream S and PS, there is a one-to-one correspondence between answer streams of P for S and total stable models of PS.

In addition, for such an encoding of plain LARS programs into intensional programs, we can apply our well-founded semantics, since the operators applied in plain LARS are monotonic and deterministic. Hence, our work also provides a well-founded semantics for plain LARS, i.e., we allow the usage of unrestricted default negation while preserving polynomial reasoning.

ETALIS (Anicic et al. 2012) aims at complex event processing. It assumes as input atomic events with a time stamp and uses complex events, based on Allen's interval algebra (Allen 1990), that are associated with a time interval, and is therefore considerably different from LARS (which considers time points). It contains no negation in the traditional sense, but allows for a negated pattern among the events. Many of the complex event patterns from ETALIS can be captured as neighborhood functions in our framework. However, ETALIS also makes use of some event patterns that would result in a non-monotonic operator, such as the negated pattern not(p)[q, r], which expresses that p is not the case in the interval between the end time of q and the starting time of r. We conjecture that such a negation can be modelled with a combination of the default negation ∼ and an operator [q, r]p, which expresses that p is the case in the interval between the end time of q and the starting time of r, and which in turn can be defined using rules such as: [q, r]p ← [t, t′]p, @t q, @t′ r, ∼ @t+1 q, ∼ @t′−1 r. Defining a transformation that converts a set of ETALIS rules into an intensional logic program is left for future work.

Deontic logic programs of (Gonçalves and Alferes 2012) are similar in spirit to our work as they extend logic programs with deontic logic formulas under stable model semantics. Although complex deontic formulas can appear in the rules, the deontic operators are restricted to those of Standard Deontic Logic (SDL), and computational aspects are not considered.

Answer Set Programming Modulo Theories extended to the Qualitative Spatial Domain (in short, ASPMT(QS))
(Walega, Schultz, and Bhatt 2017) allows for the systematic modelling of dynamic spatial systems. It is based on logic programs over first-order formulas (with function symbols), which are not yet integrated in our approach. On the other hand, this work only considers spatial reasoning. An interesting option for future work would be considering an extension incorporating such formulas.

7 CONCLUSIONS

Building on work by Orgun and Wadge (1992), we have introduced intensional logic programs that allow defeasible reasoning with intensional concepts, such as time, space, and obligations, and with streams of data. Relying on the neighborhood semantics (Pacuit 2017), we have introduced a novel three-valued semantics based on ideas adapted from partial stable models (Przymusinski 1991). Due to the expressivity of the intensional operators, stable models may not be minimal nor deterministic even for programs without default negation. Hence, we have studied the characteristics of our semantics for monotonic intensional operators and programs that only admit deterministic operators in the heads of the rules, and shown that a unique minimal model, the well-founded model, exists and can be computed with an alternating fixpoint construction. We have studied the computational complexity of checking for existence of models and computation of models and established that the well-founded model can be computed in polynomial time. Finally, we have discussed related work and shown that several relevant approaches in the literature can be covered.

In terms of future work, we want to investigate in more detail the exact relations to existing approaches in the literature that are not formally covered in this paper. Furthermore, this work can be generalized in several directions, for example, by allowing for first-order formulas instead of essentially propositional formulas (which is what programs with constants and variables over a finite instantiation domain amount to) and nested, non-deterministic, and non-monotonic intensional operators. Furthermore, we may want to consider intensional operators with multiple minimal neighborhoods, by defining IE∇ as a non-deterministic operator that extracts a minimal neighborhood W′ ∈ θ∇ (w′). In that case, of course, the alternating fixpoint construction as it is defined now might not result in a unique well-founded model. However, the occurrence of non-deterministic operators in the heads of rules is very similar to disjunctive logic programs, where the truth of a head of a rule can also be guaranteed by a choice of different atoms (the disjuncts) being made true. Therefore, we plan to look at techniques from disjunctive logic programming to generate unique well-founded extensions (cf. references in (Knorr and Hitzler 2007)). Finally, the integration with taxonomic knowledge in the form of description logic ontologies (Baader et al. 2007) may also be worth pursuing, as applications sometimes require both (see e.g. (Alberti et al. 2011; Alberti et al. 2012; Kasalica et al. 2019)). Hybrid MKNF knowledge bases (Motik and Rosati 2010) are a prominent approach for combining non-monotonic rules and such ontologies, and the well-founded semantics for these (Knorr, Alferes, and Hitzler 2011), also based on an alternating fixpoint construction, together with its efficient implementation (Kasalica et al. 2020), may prove fruitful for such an endeavour.

Acknowledgments The authors are indebted to the anonymous reviewers of this paper for helpful feedback. The authors were partially supported by FCT project RIVER (PTDC/CCI-COM/30952/2017) and by FCT project NOVA LINCS (UIDB/04516/2020). J. Heyninck was also supported by the German National Science Foundation under the DFG-project CAR (Conditional Argumentative Reasoning) KE-1413/11-1.
References
Abadi, M., and Manna, Z. 1989. Temporal logic programming. J. Symb. Comput. 8(3):277–295.
Alberti, M.; Gomes, A. S.; Gonçalves, R.; Leite, J.; and
Slota, M. 2011. Normative systems represented as hybrid
knowledge bases. In CLIMA, volume 6814 of LNCS, 330–
346. Springer.
Alberti, M.; Knorr, M.; Gomes, A. S.; Leite, J.; Gonçalves,
R.; and Slota, M. 2012. Normative systems require hybrid
knowledge bases. In AAMAS, 1425–1426. IFAAMAS.
Allen, J. F. 1990. Maintaining knowledge about temporal intervals. In Readings in qualitative reasoning about physical
systems. Elsevier. 361–372.
Anicic, D.; Rudolph, S.; Fodor, P.; and Stojanovic, N. 2012.
Stream reasoning and complex event processing in ETALIS.
Semantic Web 3(4):397–407.
Arasu, A.; Babu, S.; and Widom, J. 2006. The CQL continuous query language: semantic foundations and query execution. VLDB J. 15(2):121–142.
Baader, F.; Calvanese, D.; McGuinness, D. L.; Nardi, D.;
and Patel-Schneider, P. F., eds. 2007. The Description
Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2nd edition.
Barbieri, D. F.; Braga, D.; Ceri, S.; Valle, E. D.; and Grossniklaus, M. 2010. C-SPARQL: a continuous query language
for RDF data streams. Int. J. Semantic Computing 4(1):3–
25.
Bazoobandi, H. R.; Beck, H.; and Urbani, J. 2017. Expressive stream reasoning with LASER. In Procs. of ISWC,
volume 10587 of LNCS, 87–103. Springer.
Beck, H.; Dao-Tran, M.; and Eiter, T. 2018. LARS: A logic-based framework for analytic reasoning over streams. Artif.
Intell. 261:16–70.
Beirlaen, M.; Heyninck, J.; and Straßer, C. 2019. Structured
argumentation with prioritized conditional obligations and
permissions. Journal of Logic and Computation 29(2):187–
214.
Brandt, S.; Kalayci, E. G.; Ryzhikov, V.; Xiao, G.; and Zakharyaschev, M. 2018. Querying log data with metric temporal logic. J. Artif. Intell. Res. 62:829–877.
Brenton, C.; Faber, W.; and Batsakis, S. 2016. Answer
set programming for qualitative spatio-temporal reasoning:
Methods and experiments. In Technical Communications of
ICLP, volume 52 of OASICS, 4:1–4:15. Schloss Dagstuhl Leibniz-Zentrum fuer Informatik.
Brewka, G.; Ellmauthaler, S.; Gonçalves, R.; Knorr, M.;
Leite, J.; and Pührer, J. 2018. Reactive multi-context systems: Heterogeneous reasoning in dynamic environments.
Artif. Intell. 256:68–104.
Caminada, M.; Sá, S.; Alcântara, J.; and Dvořák, W. 2015.
On the equivalence between logic programming semantics
and argumentation semantics. International Journal of Approximate Reasoning 58:87–111.
Chellas, B. F. 1980. Modal Logic: An Introduction. Cambridge University Press.
Chen, Y.; Wan, H.; Zhang, Y.; and Zhou, Y. 2010. dl2asp:
implementing default logic via answer set programming.
In European Workshop on Logics in Artificial Intelligence,
104–116. Springer.
Dantsin, E.; Eiter, T.; Gottlob, G.; and Voronkov, A. 2001.
Complexity and expressive power of logic programming.
ACM Comput. Surv. 33(3):374–425.
del Cerro, L. F. 1986. MOLOG: A system that extends PROLOG with modal logic. New Generation Comput. 4(1):35–
50.
Gelder, A. V.; Ross, K. A.; and Schlipf, J. S. 1991. The
well-founded semantics for general logic programs. J. ACM
38(3):620–650.
Gelder, A. V. 1989. The alternating fixpoint of logic
programs with negation. In Procs. of SIGACT-SIGMOD-SIGART, 1–10. ACM Press.
Gelfond, M., and Lifschitz, V. 1991. Classical negation in
logic programs and disjunctive databases. New Generation
Comput. 9(3-4):365–385.
Gelfond, M. 2008. Answer sets. In Handbook of Knowledge Representation, volume 3 of Foundations of Artificial
Intelligence. Elsevier. 285–316.
Gonçalves, R., and Alferes, J. J. 2012. Specifying and reasoning about normative systems in deontic logic programming. In Procs. of AAMAS, 1423–1424. IFAAMAS.
Gonçalves, R.; Knorr, M.; and Leite, J. 2014. Evolving
multi-context systems. In ECAI, volume 263 of Frontiers
in Artificial Intelligence and Applications, 375–380. IOS
Press.
Governatori, G.; Rotolo, A.; and Riveret, R. 2018. A deontic
argumentation framework based on deontic defeasible logic.
In International Conference on Principles and Practice of
Multi-Agent Systems, 484–492. Springer.
Izmirlioglu, Y., and Erdem, E. 2018. Qualitative reasoning
about cardinal directions using answer set programming. In
Procs. of AAAI, 1880–1887. AAAI Press.
Kasalica, V.; Gerochristos, I.; Alferes, J. J.; Gomes, A. S.;
Knorr, M.; and Leite, J. 2019. Telco network inventory
validation with nohr. In LPNMR, volume 11481 of LNCS,
18–31. Springer.
Kasalica, V.; Knorr, M.; Leite, J.; and Lopes, C. 2020.
NoHR: An overview. Künstl Intell.
Knorr, M.; Alferes, J. J.; and Hitzler, P. 2011. Local
closed world reasoning with description logics under the
well-founded semantics. Artif. Intell. 175(9-10):1528–1554.
Knorr, M., and Hitzler, P. 2007. A comparison of disjunctive
well-founded semantics. In FAInt, volume 277 of CEUR
Workshop Proceedings. CEUR-WS.org.
McNamara, P. 2019. Deontic logic. In Zalta, E. N., ed.,
The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, summer 2019 edition.
Motik, B., and Rosati, R. 2010. Reconciling description
logics and rules. J. ACM 57(5):30:1–30:62.
Orgun, M. A., and Wadge, W. W. 1992. Towards a unified
theory of intensional logic programming. The Journal of
Logic Programming 13(4):413–440.
Pacuit, E. 2017. Neighborhood semantics for modal logic.
Springer.
Panagiotidi, S.; Nieves, J. C.; and Vázquez-Salceda, J. 2009.
A framework to model norm dynamics in answer set programming. In MALLOW.
Phuoc, D. L.; Dao-Tran, M.; Parreira, J. X.; and Hauswirth,
M. 2011. A native and adaptive approach for unified processing of linked streams and linked data. In Procs. of ISWC,
volume 7031 of LNCS, 370–388. Springer.
Przymusinski, T. C. 1991. Stable semantics for disjunctive
programs. New Generation Comput. 9(3/4):401–424.
Suchan, J.; Bhatt, M.; Walega, P. A.; and Schultz, C. P. L.
2018. Visual explanation by high-level abduction: On
answer-set programming driven reasoning about moving objects. In Procs. of AAAI, 1965–1972. AAAI Press.
Vardi, M. Y. 1989. On the complexity of epistemic reasoning. In Procs. of LICS, 243–252. IEEE Computer Society.
Walega, P. A.; Kaminski, M.; and Grau, B. C. 2019. Reasoning over streaming data in metric temporal datalog. In
Procs. of AAAI, 3092–3099. AAAI Press.
Walega, P. A.; Schultz, C. P. L.; and Bhatt, M. 2017. Non-monotonic spatial reasoning with answer set programming
modulo theories. TPLP 17(2):205–225.
Obfuscating Knowledge in Modular Answer Set Programming∗
Ricardo Gonçalves1 , Tomi Janhunen2 , Matthias Knorr1 , João Leite1 , Stefan Woltran3
1
Universidade Nova de Lisboa
2
Tampere University
3
Vienna University of Technology
{rjrg,mkn,jleite}@fct.unl.pt, tomi.janhunen@tuni.fi, woltran@dbai.tuwien.ac.at
Abstract
declarative meaning, but also because it may be necessary,
e.g., as a means to deal with privacy and legal issues such as
to eliminate illegally obtained data, or to comply with the recently enacted right to be forgotten (European Union 2016).
Whereas forgetting in the context of classical logic is
essentially a solved problem (Bledsoe and Hines 1980;
Weber 1986; Middeldorp, Okui, and Ida 1996; Lang, Liberatore, and Marquis 2003; Moinard 2007; Gabbay, Schmidt,
and Szalas 2008), new challenging issues arise when it is
considered in the context of a non-monotonic logic based
language such as ASP (Zhang and Foo 2006; Eiter and
Wang 2008; Wong 2009; Wang, Wang, and Zhang 2013;
Knorr and Alferes 2014; Wang et al. 2014; Delgrande and
Wang 2015; Gonçalves, Knorr, and Leite 2016b). According
to (Goncalves, Knorr, and Leite 2016a), forgetting in ASP
is best captured by strong persistence (Knorr and Alferes
2014), a property inspired by strong equivalence, which requires that there be a correspondence between the answer
sets of a program before and after forgetting a set of atoms,
and that such correspondence be preserved in the presence
of additional rules not containing the atoms to be forgotten.
However, it has also been shown that, in ASP, it is not always
possible to forget and satisfy strong persistence (Gonçalves,
Knorr, and Leite 2016b).
What about forgetting in Modular ASP? Do the same negative results hold, and sometimes it is simply impossible to
forget while satisfying strong persistence? Is strong persistence an adequate requirement in the case of Modular ASP?
Can forgetting be reconciled with the module theorem?
Investigating forgetting in the context of Modular ASP is
the central topic of this paper. Our main contributions are:
Modular programming facilitates the creation and reuse of
large software, and has recently gathered considerable interest in the context of Answer Set Programming (ASP). In this
setting, forgetting, or the elimination of middle variables no
longer deemed relevant, is of importance as it allows one to,
e.g., simplify a program, make it more declarative, or even
hide some of its parts without affecting the consequences for
those parts that are relevant. While forgetting in the context
of ASP has been extensively studied, its known limitations
make it unsuitable to be used in Modular ASP. In this paper,
we present a novel class of forgetting operators and show that
such operators can always be successfully applied in Modular
ASP to forget all kinds of atoms – input, output and hidden
– overcoming the impossibility results that exist for general
ASP. Additionally, we investigate conditions under which this
class of operators preserves the module theorem in Modular
ASP, thus ensuring that answer sets of modules can still be
composed, and how the module theorem can always be preserved if we further allow the reconfiguration of modules.
1
Introduction
Modularity in Answer Set Programming (ASP) (Dao-Tran
et al. 2009; Harrison and Lierler 2016; Baral, Dzifcak, and
Takahashi 2006; Janhunen et al. 2009; Oikarinen and Janhunen 2008), just as in many other programming paradigms,
is a fundamental concept to ease the creation and reuse of
large programs. In one of the most significant general approaches to modularity – the so-called programming-in-thelarge – compositional operators are provided for combining
separate and independent modules, i.e., essentially answer
set programs extended with well-defined input/output interfaces, based on standard semantics. The compositionality
of the semantics of individual modules is ensured by the socalled module theorem (Janhunen et al. 2009).
The operation of forgetting, which aims at eliminating
a set of variables from a knowledge base while preserving
all relationships (direct and indirect) between the remaining variables, has recently gained a lot of attention, not only
because it is useful, e.g., as a means to clean up a theory
by eliminating all auxiliary variables that have no relevant
• We argue that, given that the input of a module is just a set of facts, strong persistence is too strong when forgetting in Modular ASP, and that it is more suitable to rely on uniform equivalence (Sagiv 1988; Eiter and Fink 2003) for a weaker form of persistence, say uniform persistence, which has not been considered before.

• We thoroughly investigate forgetting in ASP under uniform equivalence, including formalizing uniform persistence and showing that, unlike with strong persistence, it is always possible to forget under this new property.

• We show that no previously known class of forgetting operators satisfies uniform persistence, which leads us to introduce a new class of forgetting operators that satisfies uniform persistence, and investigate its other properties.

• We employ the newly introduced class of operators to forget in a prominent approach of modular ASP, DLP-functions (Janhunen et al. 2009), and show how it can adequately be used to forget input, output, and hidden atoms from a module, while obeying uniform persistence.

• We also show that, not unexpectedly, the module theorem no longer holds in general after forgetting.

• To overcome the latter problem, we investigate ways to modify modules so that the module theorem can be preserved while forgetting under uniform persistence, i.e., ways to reconfigure ASP modules by merging and splitting modules, so that we can properly forget while preserving the compositionality of stable models of modules.

* This paper has been published in the Proceedings of the Thirty-third AAAI Conference on Artificial Intelligence (AAAI), 2019 (Gonçalves et al. 2019).

2 Preliminaries

We start by recalling some notions about logic programs. An (extended) rule r is an expression of the form

a1 ∨ ... ∨ an ← b1, ..., bm, not c1, ..., not ck, not not d1, ..., not not dl,   (1)

where a1, ..., an, b1, ..., bm, c1, ..., ck, and d1, ..., dl are atoms of a given propositional alphabet A. Note that double negation is standard in the context of forgetting in ASP. We also write such rules as A ← B, not C, not not D where A = {a1, ..., an}, B = {b1, ..., bm}, C = {c1, ..., ck}, and D = {d1, ..., dl}. An (extended) logic program is a finite set of rules. By A(P) we denote the set of atoms appearing in P and by C_e the class of extended programs. We call r disjunctive if D = ∅; normal if, additionally, A has at most one element; Horn if on top of that C = ∅; and fact if also B = ∅. The classes of disjunctive, normal and Horn programs, C_d, C_n, and C_H, are then defined as usual.

Given a program P and an interpretation I, i.e., a set I ⊆ A, the reduct P^I is defined as P^I = {A ← B | A ← B, not C, not not D ∈ P, C ∩ I = ∅, D ⊆ I}. An interpretation I is a model of a rule A ← B if A ∩ I ≠ ∅ whenever B ⊆ I; I is a model of a reduct R if it satisfies every rule of R; I is a minimal model of the reduct R if I is a model of R and there is no model I′ of R s.t. I′ ⊂ I; and I is an answer set of an extended program P if it is a minimal model of the reduct P^I. The set of all answer sets of a program P is denoted by AS(P). Given a set of atoms V, the V-exclusion of a set of sets M, denoted M∥V, is {X\V | X ∈ M}.

Two programs P1 and P2 are said to be equivalent if AS(P1) = AS(P2), strongly equivalent, denoted by P1 ≡ P2, if AS(P1 ∪ R) = AS(P2 ∪ R) for any R ∈ C_e, and uniformly equivalent, denoted by P1 ≡u P2, if AS(P1 ∪ R) = AS(P2 ∪ R) for any set of facts R.

An HT-interpretation is a pair ⟨X, Y⟩ s.t. X ⊆ Y ⊆ A. Given a program P, an HT-interpretation ⟨X, Y⟩ is an HT-model of P if Y |= P and X |= P^Y, where |= stands for the classical satisfaction relation for rules. The set of all HT-models of P is denoted by HT(P). Also, Y ∈ AS(P) iff ⟨Y, Y⟩ ∈ HT(P) and there is no X ⊂ Y s.t. ⟨X, Y⟩ ∈ HT(P). Also, HT(P1) = HT(P2) precisely when P1 ≡ P2 (Lifschitz, Pearce, and Valverde 2001). Given a set of atoms V, the V-exclusion of a set of HT-interpretations M, M∥V, is {⟨X\V, Y\V⟩ | ⟨X, Y⟩ ∈ M}.

A forgetting operator over a class C of programs over A is a partial function f : C × 2^A → C s.t. the result of forgetting about V from P, f(P, V), is a program over A(P)\V, for each P ∈ C and V ⊆ A. We denote the domain of f by C(f) and usually we focus on C = C_e, and leave C implicit. The operator f is called closed for C′ ⊆ C(f) if f(P, V) ∈ C′, for every P ∈ C′ and V ⊆ A. A class F of forgetting operators (over C) is a set of forgetting operators f s.t. C(f) ⊆ C.

We recall notions of modules using ELP-functions, a generalization of DLP-functions (Janhunen et al. 2009).¹ An ELP-function, Π, is a quadruple ⟨P, I, O, H⟩, where I, O, and H are pairwise distinct sets of input atoms, output atoms, and hidden atoms, respectively, and P is a logic program s.t. for each rule A ← B, not C, not not D of P,

1. A ∪ B ∪ C ∪ D ⊆ I ∪ O ∪ H, and
2. if A ≠ ∅, then A ∩ (O ∪ H) ≠ ∅.

Input atoms and output atoms are also called visible atoms.

¹ While we limit our generalization to extended logic programs to the necessary notions for individual modules, we do not foresee major difficulties for other aspects left out of scope of this paper.

An interpretation for an ELP-function Π = ⟨P, I, O, H⟩ is an arbitrary set M ⊆ A(Π), where A(Π) = I ∪ O ∪ H. We denote by A_i(Π), A_o(Π), A_h(Π), and by M_i, M_o, M_h the subsets of A(Π) and M restricted to elements in I, O, and H, respectively. Given ELP-function Π = ⟨P, I, O, H⟩ and interpretation M, the reduct of Π w.r.t. M is the ELP-function Π^M = ⟨P^M, I, O, H⟩, where P^M is the reduct of P w.r.t. M. An interpretation N is a model of Π^M iff N is a model of P^M. A model N of Π^M is I-minimal iff there is no model N′ of Π^M such that N′_i = N_i and N′ ⊂ N. An interpretation M is a stable model² of Π iff M is an I-minimal model of Π^M. The set of all stable models of Π is denoted by SM(Π). We have M ∈ SM(Π) iff M ∈ AS(P ∪ M_i) (Lierler and Truszczynski 2011).

² We reserve the term “answer set” for programs and the term “stable model” for ELP-functions to ease the reading.

Given a program P and a set of atoms S, the set of defining rules for S is Def_P(S) = {A ← B, not C, not not D ∈ P | A ∩ S ≠ ∅}. Two ELP-functions Π1 = ⟨P1, I1, O1, H1⟩ and Π2 = ⟨P2, I2, O2, H2⟩ respect the input/output interfaces of each other iff (1) (I1 ∪ O1 ∪ H1) ∩ H2 = ∅; (2) (I2 ∪ O2 ∪ H2) ∩ H1 = ∅; (3) O1 ∩ O2 = ∅; (4) Def_{P1}(O1) = Def_{P1∪P2}(O1); and (5) Def_{P2}(O2) = Def_{P1∪P2}(O2).

Let Π1 = ⟨P1, I1, O1, H1⟩ and Π2 = ⟨P2, I2, O2, H2⟩ be ELP-functions that respect the input/output interfaces of each other. The composition Π1 ⊕ Π2 is defined as

⟨P1 ∪ P2, (I1 \ O2) ∪ (I2 \ O1), O1 ∪ O2, H1 ∪ H2⟩.

The join ⊔ of modules builds on this composition imposing further restrictions. The positive dependency graph of an ELP-function Π = ⟨P, I, O, H⟩ is the pair DG^+(Π) = ⟨O ∪ H, ≤_1⟩, where b ≤_1 a holds for a, b ∈ (O ∪ H) iff there is a rule A ← B, not C, not not D ∈ P s.t. a ∈ A and b ∈ B. The reflexive and transitive closure of ≤_1 provides
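The reduct-based definitions of answer sets recalled above can be made concrete with a small brute-force interpreter. The following Python sketch (ours, not part of the paper) enumerates all interpretations, so it is only usable for tiny alphabets; an extended rule A ← B, not C, not not D is a tuple of four atom sets:

```python
from itertools import combinations

def rule(head, pos=(), neg=(), dneg=()):
    # extended rule  head <- pos, not neg, not not dneg
    return (frozenset(head), frozenset(pos), frozenset(neg), frozenset(dneg))

def subsets(atoms):
    atoms = sorted(set(atoms))
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def reduct(program, interp):
    # P^I = {A <- B | A <- B, not C, not not D in P, C ∩ I = ∅, D ⊆ I}
    return [(h, b) for (h, b, n, d) in program
            if not (n & interp) and d <= interp]

def is_model(interp, reduced):
    # I satisfies A <- B iff A ∩ I ≠ ∅ whenever B ⊆ I
    return all(h & interp or not b <= interp for (h, b) in reduced)

def answer_sets(program, atoms):
    # I is an answer set iff I is a minimal model of the reduct P^I
    result = []
    for i in subsets(atoms):
        red = reduct(program, i)
        if is_model(i, red) and not any(is_model(j, red)
                                        for j in subsets(i) if j < i):
            result.append(i)
    return result

# Program P of Ex. 1 below: a <- p, b <- q, p <- not q, q <- not p
P = [rule("a", pos="p"), rule("b", pos="q"),
     rule("p", neg="q"), rule("q", neg="p")]
```

For this P the interpreter returns exactly the two answer sets {a, p} and {b, q}; the choice-like rule a ← not not a yields both ∅ and {a}, illustrating why double negation matters here.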
the dependency relation ≤ over output and hidden atoms. A strongly connected component (SCC) S of DG^+(Π) is a maximal set S ⊆ A_o(Π) ∪ A_h(Π) s.t. b ≤ a for all pairs a, b ∈ S. If Π1 ⊕ Π2 is defined, then Π1 and Π2 are mutually dependent iff DG^+(Π1 ⊕ Π2) has an SCC S s.t. S ∩ A_o(Π1) ≠ ∅ and S ∩ A_o(Π2) ≠ ∅, and mutually independent otherwise. Thus, given ELP-functions Π1 and Π2, if the composition Π1 ⊕ Π2 is defined and Π1 and Π2 are mutually independent, then the join Π1 ⊔ Π2 of Π1 and Π2 is defined and coincides with Π1 ⊕ Π2 (Janhunen et al. 2009).
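The mutual-independence test above is mechanical: build the positive dependency graph of the composition, compute its SCCs, and check that no SCC contains output atoms of both modules. A self-contained Python sketch (ours; the rule format, reduced to head/positive-body pairs since negative literals do not contribute edges, and the naive closure-based SCC computation are illustrative choices, not from the paper):

```python
def pos_edges(rules):
    # b <=_1 a for a in the head and b in the positive body
    return {(b, a) for (head, pos) in rules for a in head for b in pos}

def reachable(edges, nodes):
    # reflexive-transitive closure of <=_1 (naive fixpoint iteration)
    r = {(x, x) for x in nodes} | set(edges)
    while True:
        new = {(x, z) for (x, y) in r for (u, z) in r if y == u} - r
        if not new:
            return r
        r |= new

def sccs(edges, nodes):
    # SCC = maximal set of mutually reachable nodes
    r = reachable(edges, nodes)
    return {frozenset(y for y in nodes if (x, y) in r and (y, x) in r)
            for x in nodes}

def mutually_independent(rules1, out1, rules2, out2, hidden):
    # restrict the graph to output and hidden atoms of the composition
    nodes = set(out1) | set(out2) | set(hidden)
    edges = {(b, a) for (b, a) in pos_edges(rules1 + rules2)
             if a in nodes and b in nodes}
    return not any(s & set(out1) and s & set(out2) for s in sccs(edges, nodes))

# Pi1 = <{a <- b}, {b}, {a}, {}> and Pi2 = <{b <- not c}, {c}, {b}, {}>:
independent = mutually_independent([("a", "b")], "a", [("b", "")], "b", "")
```

Here the only edge is b ≤_1 a, both SCCs are singletons, and the two modules are mutually independent, so their join is defined; replacing Π2's rule by b ← a would create the SCC {a, b} and make them mutually dependent.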
3 Forgetting under Uniform Persistence

Arguably, among the many properties for forgetting in ASP, strong persistence is the one that should intuitively hold, since it imposes the preservation of all original direct and indirect dependencies between atoms not to be forgotten. Here and in the sequel, F is a class of forgetting operators.

(SP) F satisfies Strong Persistence if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have AS(f(P, V) ∪ R) = AS(P ∪ R)∥V, for all programs R ∈ C(f) with A(R) ⊆ A\V.

Essentially, (SP) requires that the answer sets of f(P, V) correspond to those of P, no matter what programs R over A\V we add to both, which is closely related to the concept of strong equivalence. However, this property is rather demanding, witnessed by the fact that it cannot always be satisfied (Gonçalves, Knorr, and Leite 2016b). On the other hand, in the case of a module, i.e., an ELP-function, its program P is fixed, and we only vary the input, which is closely related to considering a fixed ASP program, encoding the declarative specification of a problem, and only varying the instances corresponding to the specific problem to be solved. This is captured by the notion of uniform equivalence, which weakens strong equivalence by considering that only facts can be added. To investigate forgetting in such cases, we introduce Uniform Persistence, (UP), obtained from (SP) by restricting the varying programs R to sets of facts.

(UP) F satisfies Uniform Persistence if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have AS(f(P, V) ∪ R) = AS(P ∪ R)∥V, for all sets of facts R with A(R) ⊆ A\V.

Having introduced (UP) as the desired property for forgetting in ELP-functions, we now turn our attention to which forgetting operator to use. Unfortunately, none of the existing classes mentioned in the literature³ satisfy (UP).⁴

Theorem 1. None of the classes F of forgetting operators studied in (Goncalves, Knorr, and Leite 2016a; Gonçalves et al. 2017) satisfy (UP).

³ Cf. the survey on forgetting in ASP (Goncalves, Knorr, and Leite 2016a), (Gonçalves et al. 2017; Gonçalves et al. 2020), and references therein.

⁴ Note that the result in (Goncalves, Knorr, and Leite 2016a) (Fig. 1) indicating that class F_Sas satisfies (SP), the generalization of (UP), is in fact not entirely accurate, since the only known operator in F_Sas is not defined for a class of programs, but rather for instances of forgetting.

Due to this negative result and the fact that it is not always possible to forget while satisfying (SP), the question that arises is whether this is actually different for (UP), given that it is less demanding in its requirements.

Example 1. Consider program P used in the impossibility result for (SP) (Gonçalves, Knorr, and Leite 2016b):

a ← p        b ← q        p ← not q        q ← not p

Adding program R = {a ←; b ←}, it is shown there that any result of forgetting {p, q} from P, f(P, {p, q}), that satisfies (SP) is required to have an HT-model ⟨ab, ab⟩.⁵ At the same time, since {a, b} (modulo {p, q}) is not an answer set of P, we must have ⟨X, ab⟩ ∈ HT(f(P, {p, q})) for at least one X ⊂ {a, b}, to prevent {a, b} from being an answer set of f(P, {p, q}). It is then shown that due to different programs R, ⟨X, ab⟩ ∉ HT(f(P, {p, q})) for any such X, thus causing a contradiction. However, in the case of X = ∅, R = {a ← b; b ← a} is used, which is not a set of facts and thus not relevant w.r.t. (UP). In fact, given the only possible four sets of facts over {a, b} to be considered for R, we can verify that P′ = {a ← not b; a ← not not a, b; b ← not a; b ← not not b, a} is a result of forgetting {p, q} from P for which the condition of (UP) is satisfied.

⁵ We follow a common convention and abbreviate sets in HT-interpretations such as {a, b} with the sequence of its elements, ab.

A naive approach to define a class of forgetting operators that satisfies (UP) would be to use relativized uniform equivalence (Eiter, Fink, and Woltran 2007), which is close in spirit to (UP). However, this would not work, for the same reasons that a similar approach based on relativized strong equivalence fails to capture (SP) (Gonçalves et al. 2017; Gonçalves et al. 2020).

Instead, we define a class of forgetting operators that satisfies (UP), dubbed F_UP, whose more involved definition – that we will gently introduce in an incremental way – builds on the manipulation of HT-models given an input program P and a set of atoms V ⊆ A(P) to forget. To this end, we aim at devising a mapping from HT(P) to the set of HT-models of the result of forgetting, f(P, V), for any operator f ∈ F_UP. This mapping can be illustrated as follows.

Example 2. The program P from Ex. 1 has 15 HT-models:

⟨ap, ap⟩      ⟨bq, bq⟩      ⟨ap, abp⟩     ⟨abp, abp⟩    ⟨bq, abq⟩
⟨abq, abq⟩    ⟨∅, abpq⟩     ⟨a, abpq⟩     ⟨b, abpq⟩     ⟨ab, abpq⟩
⟨ap, abpq⟩    ⟨bq, abpq⟩    ⟨abp, abpq⟩   ⟨abq, abpq⟩   ⟨abpq, abpq⟩

The HT-models for the proposed result P′ of forgetting are ⟨a, a⟩, ⟨b, b⟩, ⟨∅, ab⟩ and ⟨ab, ab⟩.

But how could we determine the latter set of HT-models for any P and V? Given the HT-models listed above, the set HT(P)∥V contains extra tuples such as ⟨a, ab⟩ and ⟨b, ab⟩. Thus, a more involved analysis of HT-models is in order. By the definition of (UP), an answer set Y of f(P, V) ∪ R corresponds to an answer set Y ∪ A of P ∪ R, for some A ⊆ V. We will therefore collect all HT-models ⟨X, Y ∪ A⟩ in HT(P) with the same Y and join them in blocks separated
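The claim at the end of Ex. 1 — that P′ satisfies the (UP) condition for the four possible sets of facts over {a, b} — can be checked by brute force. A Python sketch (ours, not from the paper), using a naive answer-set enumerator over all interpretations:

```python
from itertools import combinations

def rule(head, pos=(), neg=(), dneg=()):
    # extended rule  head <- pos, not neg, not not dneg
    return (frozenset(head), frozenset(pos), frozenset(neg), frozenset(dneg))

def subsets(atoms):
    atoms = sorted(set(atoms))
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def reduct(program, interp):
    return [(h, b) for (h, b, n, d) in program
            if not (n & interp) and d <= interp]

def is_model(interp, reduced):
    return all(h & interp or not b <= interp for (h, b) in reduced)

def answer_sets(program, atoms):
    result = []
    for i in subsets(atoms):
        red = reduct(program, i)
        if is_model(i, red) and not any(is_model(j, red)
                                        for j in subsets(i) if j < i):
            result.append(i)
    return result

P      = [rule("a", pos="p"), rule("b", pos="q"),
          rule("p", neg="q"), rule("q", neg="p")]
Pprime = [rule("a", neg="b"), rule("a", pos="b", dneg="a"),
          rule("b", neg="a"), rule("b", pos="a", dneg="b")]
V = frozenset("pq")

def up_condition(p, fp, v, kept):
    # AS(f(P,V) ∪ R) = AS(P ∪ R)||V for every set of facts R over kept
    for r in subsets(kept):
        facts = [rule([x]) for x in r]
        lhs = set(answer_sets(fp + facts, kept))
        rhs = {m - v for m in answer_sets(p + facts, set(kept) | v)}
        if lhs != rhs:
            return False
    return True
```

Running `up_condition(P, Pprime, V, "ab")` confirms the claim, while a naive candidate such as {a ←; b ←} fails already for R = ∅.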
by the varying A. To this end, we first characterize all the different total HT-models of P, namely, for each Y ⊆ A\V:

Sel^Y_⟨P,V⟩ = {A ⊆ V | ⟨Y ∪ A, Y ∪ A⟩ ∈ HT(P)}.

Example 3. Given the HT-models (Ex. 2) for P of Ex. 1 and V = {p, q}, we obtain Sel^∅_⟨P,V⟩ = ∅, Sel^{a}_⟨P,V⟩ = {{p}}, Sel^{b}_⟨P,V⟩ = {{q}}, and Sel^{a,b}_⟨P,V⟩ = {{p}, {q}, {p, q}}.

Clearly, the total models to be considered in the result of forgetting should be restricted to those Y s.t. Sel^Y_⟨P,V⟩ is non-empty. But not all these sets should be considered.

Example 4. Let P be a program over A = {a, b, p, q} s.t. its HT-models of the form ⟨X, {a, b} ∪ A⟩ with A ⊆ V = {p, q} are ⟨ab, abp⟩, ⟨abp, abp⟩, ⟨abp, abpq⟩, and ⟨abpq, abpq⟩. We have that Sel^{a,b}_⟨P,V⟩ = {{p}, {p, q}}. Nevertheless, the non-total models ⟨ab, abp⟩ and ⟨abp, abpq⟩ do not allow {a, b, p} and {a, b, p, q} to be answer sets of P ∪ R, for any R over A\V = {a, b}. So, although Sel^{a,b}_⟨P,V⟩ ≠ ∅, the set {a, b} should not be a possible answer set of the forgetting result.⁶

⁶ Similar considerations have been used in the context of relativized equivalence (Eiter, Fink, and Woltran 2007) and in forgetting (Gonçalves, Knorr, and Leite 2016b).

Taking this observation into account, we define the set of total models for the result of forgetting V from P:

T_⟨P,V⟩ = {Y ⊆ A\V | there exists A ∈ Sel^Y_⟨P,V⟩ s.t. ⟨Y ∪ A′, Y ∪ A⟩ ∉ HT(P) for every A′ ⊂ A}.

Example 5. Based on the HT-models of P listed in Ex. 2, the sets Sel^Y_⟨P,V⟩ identified in Ex. 3, and V = {p, q}, we observe that T_⟨P,V⟩ = {{a}, {b}, {a, b}}. In each of the three cases, the condition in the definition of T_⟨P,V⟩ is satisfied by some element of Sel^Y_⟨P,V⟩. For Y = {a, b} in particular, the set A can be either {p} or {q}, but not {p, q}. Given T_⟨P,V⟩, we expect three total HT-models for the result of forgetting {p, q} from P, i.e., the ones indicated in Ex. 2 for P′.

The crucial question now is how to extract the non-total HT-models for the result of forgetting in general. For this purpose, for each A ∈ Sel^Y_⟨P,V⟩, we first consider the non-total HT-models of P of the form ⟨X, Y ∪ A⟩:

N^{Y,A}_⟨P,V⟩ = {X\V | ⟨X, Y ∪ A⟩ ∈ HT(P) and X ≠ Y ∪ A}.

Example 6. Continuing Ex. 5, these non-total models, in particular those relevant for the desired result ⟨∅, ab⟩, are: N^{{a,b},{p}}_⟨P,V⟩ = {{a}}, N^{{a,b},{q}}_⟨P,V⟩ = {{b}}, and N^{{a,b},{p,q}}_⟨P,V⟩ = {∅, {a}, {b}, {a, b}}.

Now, since HT-models of facts never include ⟨∅, Y⟩ for any Y, we know that any HT-model ⟨∅, Y⟩ of P will not occur in HT(P ∪ R) for any (non-empty) set of facts R. Hence, either one of the N^{Y,A}_⟨P,V⟩ is empty, in which case P itself has an answer set Y modulo V and the result of forgetting should have an answer set Y, or ∅ ∈ N^{Y,A}_⟨P,V⟩ for any A results in an HT-model ⟨∅, Y⟩ for the result of forgetting, which is why ⟨∅, ab⟩ ∈ HT(P′) in Ex. 2 holds. Generalizing this observation, whenever there is a set X s.t. each N^{Y,A}_⟨P,V⟩ contains an element X′ with X ⊆ X′, then adding X as facts to P cannot result in an answer set of P, and thus, ⟨X, Y⟩ should be part of the forgetting result. In Ex. 6, the only such set X is indeed X = ∅.

We thus collect all sets N^{Y,A}_⟨P,V⟩ for each Y, define tuples over this set of sets, and intersections over these tuples. The latter correspond to the maximal subsets X, which suffices for uniform equivalence (Eiter, Fink, and Woltran 2007).

Definition 1. Let P be a program, V ⊆ A, and Y ⊆ A\V. Consider the indexed family of sets S^Y_⟨P,V⟩ = {N^{Y,i}_⟨P,V⟩}_{i∈I} where I = Sel^Y_⟨P,V⟩. For each tuple (X_i)_{i∈I} such that X_i ∈ N^{Y,i}_⟨P,V⟩, we define the intersection of its sets as ∩_{i∈I} X_i. We denote by SInt^Y_⟨P,V⟩ the set of all such intersections.

The resulting intersections indeed correspond to sets X pointed out in the preceding discussion. Therefore, we obtain the definition of F_UP by combining the total models based on T_⟨P,V⟩ and the non-total ones based on SInt^Y_⟨P,V⟩, but naturally restricted to those cases where the corresponding total model exists.

Definition 2 (UP-Forgetting). Let F_UP be the class of forgetting operators defined as:

{f | HT(f(P, V)) = ({⟨Y, Y⟩ | Y ∈ T_⟨P,V⟩} ∪ {⟨X, Y⟩ | Y ∈ T_⟨P,V⟩ and X ∈ SInt^Y_⟨P,V⟩}) for all P ∈ C(f) and V ⊆ A}.

Example 7. Recall P from Ex. 1. Following the discussion after Ex. 6, we can verify that the result of forgetting about V = {p, q} from P according to F_UP has the expected HT-models (cf. Ex. 2): ⟨a, a⟩, ⟨b, b⟩, ⟨∅, ab⟩, and ⟨ab, ab⟩.

The definition of F_UP characterizes the HT-models of a result of forgetting for any f ∈ F_UP, but not an actual program. This may raise the question whether there actually is such an operator, and we can answer this question positively.

To this end, we recall the necessary notions and results related to countermodels in here-and-there (Cabalar and Ferraris 2007), which have been used previously in a similar manner for computing concrete results of forgetting for classes of forgetting operators based on HT-models (Wang, Wang, and Zhang 2013; Wang et al. 2014; Gonçalves et al. 2020).

Essentially, the HT-interpretations that are not HT-models of P (hence the name countermodels) can be used to determine rules that, if conjoined, result in a program P′ that is strongly equivalent to P.

Let P be a program and X ⊆ Y ⊆ A. An HT-interpretation ⟨X, Y⟩ is an HT-countermodel of P if ⟨X, Y⟩ ⊭ P. We also define the following rules:

r_{X,Y} = (Y\X) ← X, not (A\Y), not not (Y\X)   (2)
r_{Y,Y} = ∅ ← Y, not (A\Y)   (3)

The relation between these rules and HT-countermodels has been established as follows.
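Definitions 1 and 2 can be executed directly on enumerated HT-models. The following Python sketch (ours, not from the paper) computes Sel, T, the sets N and their tuple intersections SInt for a program over a small alphabet, and reproduces the result of Ex. 7:

```python
from itertools import combinations, product

def rule(head, pos=(), neg=(), dneg=()):
    return (frozenset(head), frozenset(pos), frozenset(neg), frozenset(dneg))

def subsets(atoms):
    atoms = sorted(set(atoms))
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def reduct(program, interp):
    return [(h, b) for (h, b, n, d) in program
            if not (n & interp) and d <= interp]

def classical(interp, program):
    # Y |= P, classical satisfaction of extended rules
    return all(h & interp or not (b <= interp and not (n & interp)
                                  and d <= interp)
               for (h, b, n, d) in program)

def ht_models(program, atoms):
    # <X, Y> with Y |= P and X |= P^Y
    out = set()
    for y in subsets(atoms):
        if classical(y, program):
            red = reduct(program, y)
            out |= {(x, y) for x in subsets(y)
                    if all(h & x or not b <= x for (h, b) in red)}
    return out

def forget_ht(program, atoms, v):
    # HT-models of f(P, V) for f in F_UP (Def. 2)
    ht = ht_models(program, atoms)
    kept = frozenset(atoms) - v
    result = set()
    for y in subsets(kept):
        sel = [a for a in subsets(v) if (y | a, y | a) in ht]   # Sel^Y
        in_t = any(all((y | a2, y | a) not in ht                # Y in T?
                       for a2 in subsets(a) if a2 < a)
                   for a in sel)
        if not in_t:
            continue
        result.add((y, y))
        n = {a: [x - v for (x, tot) in ht if tot == y | a and x != y | a]
             for a in sel}                                      # N^{Y,A}
        for tup in product(*(n[a] for a in sel)):               # Def. 1 tuples
            inter = kept
            for xs in tup:
                inter &= xs
            result.add((inter, y))                              # X in SInt^Y
    return result

P = [rule("a", pos="p"), rule("b", pos="q"),
     rule("p", neg="q"), rule("q", neg="p")]
```

For P of Ex. 1 this enumerates the 15 HT-models of Ex. 2, and forgetting V = {p, q} yields exactly ⟨a, a⟩, ⟨b, b⟩, ⟨∅, ab⟩, and ⟨ab, ab⟩.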
Lemma 1 ((Cabalar and Ferraris 2007)). Let X ⊂ Y ⊆ A and U ⊆ V ⊆ A.

(i) ⟨U, V⟩ is an HT-countermodel of r_{X,Y} iff U = X and V = Y.
(ii) ⟨U, V⟩ is an HT-countermodel of r_{Y,Y} iff V = Y.

This allows us to determine a program for a set of HT-models provided such a program exists. Recall that not all sets of HT-interpretations correspond to the set of HT-models of some program. A set of HT-interpretations S is HT-expressible iff ⟨X, Y⟩ ∈ S implies ⟨Y, Y⟩ ∈ S. In this case, we are able to determine a corresponding program.

Proposition 1 ((Cabalar and Ferraris 2007)). Let M be a set of HT-interpretations which is HT-expressible and define the program P_M as

P_M = {r_{X,Y} | ⟨X, Y⟩ ∉ M and ⟨Y, Y⟩ ∈ M} ∪ {r_{Y,Y} | ⟨Y, Y⟩ ∉ M}.

Then, HT(P_M) = M.

Note that according to Def. 2, the HT-models of the forgetting result for any operator in F_UP are HT-expressible. Thus, based on these ideas, we can define a concrete operator that belongs to the class F_UP.

Theorem 2. There exists f such that f ∈ F_UP.

While the definition of UP-Forgetting itself is certainly non-trivial, it turns out that for the case of Horn programs, a considerably simpler definition can be used.

Proposition 2. Let f be in F_UP. Then, for every V ⊆ A: HT(f(P, V)) = HT(P)∥V for each P ∈ C_H.

This result serves as further indication that UP-Forgetting is well-defined, given that essentially all classes of forgetting operators coincide with this definition for the class of Horn programs (Goncalves, Knorr, and Leite 2016a).

We are able to show that F_UP indeed satisfies (UP), which guarantees that, unlike for the property (SP), it is always possible to forget satisfying (UP).

Theorem 3. F_UP satisfies (UP).

Despite (SP) being the property that best captures the essence of forgetting in ASP in general, of which (UP) is the weaker version that is sufficient when dealing with modules, other properties have been investigated in the literature (cf. (Goncalves, Knorr, and Leite 2016a)), which we recall in the following. Let F be a class of forgetting operators.

(sC) F satisfies strengthened Consequence if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have AS(f(P, V)) ⊆ AS(P)∥V.

(wE) F satisfies weak Equivalence if, for each f ∈ F, P, P′ ∈ C(f) and V ⊆ A, we have AS(f(P, V)) = AS(f(P′, V)) whenever AS(P) = AS(P′).

(SE) F satisfies Strong Equivalence if, for each f ∈ F, P, P′ ∈ C(f) and V ⊆ A: if P ≡ P′, then f(P, V) ≡ f(P′, V).

(W) F satisfies Weakening if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have P |=_HT f(P, V).

(PP) F satisfies Positive Persistence if, for each f ∈ F, P ∈ C(f) and V ⊆ A: if P |=_HT P′, with P′ ∈ C(f) and A(P′) ⊆ A\V, then f(P, V) |=_HT P′.

(SI) F satisfies Strong (addition) Invariance if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have f(P, V) ∪ R ≡ f(P ∪ R, V) for all programs R ∈ C(f) with A(R) ⊆ A\V.

(E_C) F satisfies Existence for C, i.e., F is closed for a class of programs C, if there exists f ∈ F s.t. f is closed for C.

(CP) F satisfies Consequence Persistence if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have AS(f(P, V)) = AS(P)∥V.

(wC) F satisfies weakened Consequence if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have AS(P)∥V ⊆ AS(f(P, V)).

Note that P |=_HT P′ holds if HT(P) ⊆ HT(P′), in which case P′ is said to be an HT-consequence of P. We obtain that F_UP satisfies the following properties.

Proposition 3. F_UP satisfies (sC), (wE), (SE), (CP), (wC), (E_Ce), (E_CH), but not (W), (PP), (SI), (SP), (E_Cd), (E_Cn).

Given the close connection between the class F_UP and uniform equivalence (cf. Thm. 3), it is not surprising that some properties of forgetting that are closely connected to strong equivalence are not satisfied by F_UP, notably (PP) and (SI), which are satisfied by the class of forgetting operators defined for forgetting w.r.t. (SP) when forgetting is possible (Gonçalves, Knorr, and Leite 2016b).

Finally, we obtain that deciding whether a program is the result of forgetting for f ∈ F_UP is in Π^P_3.

Theorem 4. Given programs P, Q, and V ⊆ A, deciding whether P ≡ f(Q, V) for f ∈ F_UP is in Π^P_3.

Note that the same problem for the classes of forgetting operators that approximate forgetting under (SP) is Π^P_3-complete (Gonçalves et al. 2017; Gonçalves et al. 2020). Also, by (Wang et al. 2014) and Prop. 2, if Q is Horn, then this problem is in Π^P_1.

4 Forgetting in Modules

We now turn our attention to the use of F_UP to forget in modules, i.e., ELP-functions. Towards characterizing results of forgetting in modules, the notion of equivalence between ELP-functions – modular equivalence (Janhunen et al. 2009) – first needs to be adapted, since it is too strong as it requires the existence of a bijection between stable models of different ELP-functions, which is not possible in general when reducing the language, as illustrated by the next example.

Example 8. Take Π = ⟨{a ←; b ← not not b}, ∅, {a}, {b}⟩ with SM(Π) = {{a}, {a, b}}. Forgetting b should yield, e.g., Π′ = ⟨{a ←}, ∅, {a}, ∅⟩ with SM(Π′) = {{a}}, but then no bijection between SM(Π) and SM(Π′) is possible.

Therefore, we introduce a novel notion of equivalence for program modules according to which two modules are V-equivalent if they coincide on I and O ignoring V, and if their stable models coincide ignoring V.
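Proposition 1 gives a concrete recipe: every missing HT-interpretation contributes one rule. A Python sketch (ours, not from the paper) that builds P_M from the four target HT-models of P′ in Ex. 2 and checks HT(P_M) = M, using a naive HT-model enumerator:

```python
from itertools import combinations

def rule(head, pos=(), neg=(), dneg=()):
    return (frozenset(head), frozenset(pos), frozenset(neg), frozenset(dneg))

def subsets(atoms):
    atoms = sorted(set(atoms))
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def reduct(program, interp):
    return [(h, b) for (h, b, n, d) in program
            if not (n & interp) and d <= interp]

def classical(interp, program):
    return all(h & interp or not (b <= interp and not (n & interp)
                                  and d <= interp)
               for (h, b, n, d) in program)

def ht_models(program, atoms):
    out = set()
    for y in subsets(atoms):
        if classical(y, program):
            red = reduct(program, y)
            out |= {(x, y) for x in subsets(y)
                    if all(h & x or not b <= x for (h, b) in red)}
    return out

def program_from_ht(m, atoms):
    # P_M of Prop. 1: one rule per HT-countermodel
    a = frozenset(atoms)
    prog = []
    for y in subsets(a):
        if (y, y) not in m:
            prog.append(rule((), pos=y, neg=a - y))                 # r_{Y,Y}
        else:
            prog.extend(rule(y - x, pos=x, neg=a - y, dneg=y - x)   # r_{X,Y}
                        for x in subsets(y) if x < y and (x, y) not in m)
    return prog

fs = frozenset
M = {(fs("a"), fs("a")), (fs("b"), fs("b")),
     (fs(""), fs("ab")), (fs("ab"), fs("ab"))}
PM = program_from_ht(M, "ab")
```

Here P_M consists of five rules (one constraint for the countermodel ⟨∅, ∅⟩ and four r_{X,Y} rules), and its HT-models are exactly M, giving a concrete witness for Thm. 2 on this instance.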
Definition 3 (V-Equivalence). Let Π1 and Π2 be ELP-functions, and V a set of atoms. Then, Π1 and Π2 are V-equivalent, denoted by Π1 ≡V Π2, iff

1. A_i(Π1)\V = A_i(Π2)\V and A_o(Π1)\V = A_o(Π2)\V;
2. SM(Π1)∥V = SM(Π2)∥V.

Forgetting from each of the pairwise disjoint sets of atoms considered in a module – input, output and hidden – needs to be characterised in turn. Additionally, in the case of input and output atoms, we also consider hiding them – useful when atoms are not declaratively meaningful outside the module, or should not be shown – and discuss its difference with respect to forgetting them.

We start by showing that the hidden atoms of an ELP-function can be forgotten without affecting its behavior perceived in terms of visible atoms, ensuring that we can deal with cases when we are not allowed to express a certain piece of information in terms of our hidden atoms, or do not want to show it to someone who wants to visualize the program of a module.

Theorem 5 (Forgetting hidden atoms). Given a set V ⊆ H of hidden atoms to forget, an ELP-function Π = ⟨P, I, O, H⟩ is V-equivalent to any ELP-function Π′ = ⟨f(P, V), I, O, H\V⟩ based on a uniformly persistent forgetting operator f ∈ F_UP.

But forgetting is also applicable to the visible elements of a module. For instance, whenever output atoms are no longer used by other modules, they can effectively be removed without affecting the behavior of the module.

Theorem 6 (Forgetting output atoms). Given a set V ⊆ O of output atoms to forget, an ELP-function Π = ⟨P, I, O, H⟩ is V-equivalent to any ELP-function Π′ = ⟨f(P, V), I, O\V, H⟩ based on a uniformly persistent forgetting operator f ∈ F_UP.

An alternative to forgetting output atoms is hiding them. Given an ELP-function Π = ⟨P, I, O, H⟩ and a set V ⊆ O of output atoms, we could create an ELP-function ⟨P, I, O\V, H ∪ V⟩ where the atoms of V are simply hidden. This would be computationally cheap since P would not change, but could be regarded insufficient under the strict interpretation of forgetting V, i.e., the elements of V should not appear in the result at all. Nevertheless, we derive the following counterpart to Thm. 6.

Theorem 7 (Hiding output atoms). Given a set V ⊆ O of output atoms to hide, an ELP-function Π = ⟨P, I, O, H⟩ is V-equivalent to the ELP-function Π′ = ⟨P, I, O\V, H ∪ V⟩.

Thus, both hiding and forgetting output atoms yield V-equivalent ELP-functions.

Turning to forgetting (or hiding) of input atoms, no analogous result exists without making changes to the program.

Example 9. Take Π = ⟨{a ← b}, {b}, {a}, ∅⟩. Then, SM(Π) = {∅, {a, b}}, but moving b from I to H yields Π′ with SM(Π′) = {∅}, which is not {b}-equivalent.

Nevertheless, if we allow programs to change, such V-equivalent ELP-functions can be constructed using the idea of an input generator (cf. (Oikarinen and Janhunen 2006, Thm. 4)), easily encodable with extended rules, and we can forget about input atoms from ELP-functions as follows.

Theorem 8 (Forgetting input atoms). Given a set V ⊆ I of input atoms to forget, an ELP-function Π = ⟨P, I, O, H⟩ is V-equivalent to any

Π′ = ⟨f(P ∪ {a ← not not a | a ∈ V}, V), I\V, O, H⟩

based on a uniformly persistent forgetting operator f ∈ F_UP.

This construction of Π′ can also be used to hide input atoms.

Theorem 9 (Hiding input atoms). Given a set V ⊆ I of input atoms to hide, an ELP-function Π = ⟨P, I, O, H⟩ is V-equivalent to Π′ = ⟨P ∪ {a ← not not a | a ∈ V}, I\V, O, H ∪ V⟩.

Combining these results, we can now define a general notion of a module resulting from forgetting elements of single parts of a module's interface. From now on, we assume that some forgetting operator f ∈ F_UP has been fixed.

Definition 4. Given an ELP-function Π = ⟨P, I, O, H⟩ and a set V of atoms to forget, the ELP-function resulting from forgetting V, also denoted Π\V, is defined as follows:

⟨f(P ∪ {a ← not not a | a ∈ I ∩ V}, V), I\V, O\V, H\V⟩.

We can show that this notion indeed fits the expectations.

Corollary 1. For an ELP-function Π and a set of atoms V ⊆ A(Π), we have SM(Π\V) = SM(Π)∥V.

And it follows that we can forget sets of atoms iteratively.

Proposition 4. Let Π be an ELP-function and V ⊆ A(Π). Then, if V1 ∪ V2 = V and V1 ∩ V2 = ∅, we have

SM(Π\V) = SM((Π\V1)\V2) = SM((Π\V2)\V1).

In (Janhunen et al. 2009), it is shown, through the module theorem, that the stable model semantics of modules is fully compositional, which should be preserved under forgetting. In the case of two modules Π1 = ⟨P1, I1, O1, H1⟩ and Π2 = ⟨P2, I2, O2, H2⟩ that do not mention each other's hidden atoms and whose join Π1 ⊔ Π2 is defined (and coincides with the composition Π1 ⊕ Π2), the module theorem states that SM(Π) = SM(Π1) ⋈ SM(Π2), where the join of sets of stable models captured by the operator ⋈ contains M1 ∪ M2 whenever M1 ∈ SM(Π1), M2 ∈ SM(Π2), and M1 and M2 are compatible, i.e., M1 ∩ (I2 ∪ O2) = M2 ∩ (I1 ∪ O1), so that M1 and M2 coincide on visible atoms.

Limited to forgetting atoms that are not shared by two modules, if we consider two modules whose join is defined, then the module theorem can be preserved while forgetting.

Theorem 10. If Π is an ELP-function obtained as a join of two ELP-functions Π1 and Π2, and V ⊆ A(Π) is a set of atoms to forget s.t. V ∩ (I1 ∪ O1) ∩ (I2 ∪ O2) = ∅, then SM(Π\V) = SM(Π1\V) ⋈ SM(Π2\V).

We can generalize this result to deal with cases where atoms to be forgotten appear in more than two modules.

Theorem 11. If Π is an ELP-function obtained as a join of n ELP-functions Π1, ..., Πn, and V ⊆ A(Π) is a set of atoms to forget s.t., for all i, j ∈ {1, ..., n}, i ≠ j, V ∩ (Ii ∪ Oi) ∩ (Ij ∪ Oj) = ∅, then SM(Π\V) = ⋈_{i=1}^n SM(Πi\V).
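The characterization SM(Π) = {M | M ∈ AS(P ∪ M_i)} recalled in the preliminaries makes the input-generator construction of Thms. 8 and 9 easy to test. A Python sketch (ours, not from the paper) reproducing Ex. 9 and the repair via Thm. 9:

```python
from itertools import combinations

def rule(head, pos=(), neg=(), dneg=()):
    return (frozenset(head), frozenset(pos), frozenset(neg), frozenset(dneg))

def subsets(atoms):
    atoms = sorted(set(atoms))
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def reduct(program, interp):
    return [(h, b) for (h, b, n, d) in program
            if not (n & interp) and d <= interp]

def is_model(interp, reduced):
    return all(h & interp or not b <= interp for (h, b) in reduced)

def answer_sets(program, atoms):
    result = []
    for i in subsets(atoms):
        red = reduct(program, i)
        if is_model(i, red) and not any(is_model(j, red)
                                        for j in subsets(i) if j < i):
            result.append(i)
    return result

def stable_models(prog, inp, out, hid):
    # M in SM(Pi) iff M in AS(P ∪ M_i)  (Lierler and Truszczynski 2011)
    atoms = set(inp) | set(out) | set(hid)
    return [m for m in subsets(atoms)
            if m in answer_sets(prog + [rule([x]) for x in m & frozenset(inp)],
                                atoms)]

fs = frozenset
# Ex. 9: Pi = <{a <- b}, {b}, {a}, {}>
sm_pi = stable_models([rule("a", pos="b")], "b", "a", "")
# moving b from I to H without changing P loses the stable model {a, b}:
sm_hidden = stable_models([rule("a", pos="b")], "", "a", "b")
# Thm. 9: add the input generator b <- not not b, then hide b:
sm_gen = stable_models([rule("a", pos="b"), rule("b", dneg="b")], "", "a", "b")
```

The input generator restores the lost model: SM of the repaired module is {∅, {a, b}}, which coincides with SM(Π) modulo {b}, exactly the {b}-equivalence required by Def. 3.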
i ∼V j iff V ∩ (Ii ∪ Oi ) ∩ (Ij ∪ Oj ) 6= ∅. This relation identifies those ELP-functions that share atoms to forget, i.e.,
that can cause problems with the module theorem. We denote by ∼∗V the reflexive and transitive closure of ∼V on N .
Since ∼V is clearly a symmetric relation, its reflexive and
transitive closure, ∼∗V , is an equivalence relation on N . We
can therefore consider the quotient set N \∼∗V , i.e., the set of
equivalence classes defined by ∼∗V on N . We then
F consider,
for each e ∈ N \∼∗V , the ELP-function Πe = i∈e Πi , the
join of those ELP-functions corresponding to the considered
equivalence class. This allows us to prove a relaxed version
of the module theorem.
Yet, if we lift the restrictions on where the atoms to forget
appear, we lose a full correspondent to the module theorem.
Theorem 12. If Π is an ELP-function obtained as a join of
two ELP-functions Π1 and Π2 , and V ⊆ A(Π) is a set of
atoms to forget, then SM(Π\V ) ⊆ ⊲⊳ni=1 SM(Πi \V ).
Only one of the two inclusions one would expect actually
holds, and this is not by chance. In general, it is possible that
modules Π1 \V and Π2 \V possess compatible stable models
M1 and M2 such that M = M1 ∪ M2 ∈ SM(Π1 \V ) ⊲⊳
SM(Π2 \V ) but M 6∈ SM(Π\V ) as illustrated next.
Example 10. Let us consider ELP-functions Π1 = h{a ←
b}, {b}, {a}, ∅i and Π2 = h{b ← not c}, {c}, {b}, ∅i and
their join Π = hP, {c}, {a, b}, ∅i with P = P1 ∪ P2 for the
respective sets of rules P1 and P2 of Π1 and Π2 .
As regards forgetting V = {b}, we have Π1 \V =
h{a ← not not a}, ∅, {a}, ∅i, Π2 \V = h∅, {c}, ∅, ∅i, and
Π\V = h{a ← not c}, {c}, {a}, ∅i. It remains to observe
that M1 = ∅ ∈ SM(Π1 \V ), M2 = ∅ ∈ SM(Π2 \V ), and
M1 ∪ M2 6∈ SM(Π\V ) = {{a}, {c}} although M1 and
M2 are (trivially) compatible.
Theorem 13. Let Π be an ELP-function obtained as a join
of n ELP-functions Π1 , . . . , Πn , and V ⊆ A(Π) a set of
atoms to forget. Let N = {1, . . . , n}, and consider ∼∗V
the equivalence relation on N as defined previously, and
N \∼∗V = {e1 , . . . , ek } the respective quotient set. Then,
SM(Π\V ) = ⊲⊳ki=1 SM(Πei \V ).
This shows that joining those modules that share atoms to
be forgotten allows for the preservation of the module theorem.
Joining entire modules is not ideal. However, it may happen that only part of a module is relevant to the shared
atom to be forgotten, in which case we can use the operation of decomposing (or splitting) modules to do a more
fine-grained recomposition of modules that still preserves
the module theorem. Towards this end, we adapt the necessary notions to introduce module decomposition (Janhunen
et al. 2009). Given an ELP-function Π = hP, I, O, Hi, let
SCC + (Π) denote the set of strongly connected components
of DG+ (Π). The dependency relation ≤ can be lifted to
SCC + (Π) by setting S1 ≤ S2 iff there are atoms a1 ∈ S1
and a2 ∈ S2 s.t. a1 ≤ a2 . It is easy to check that ≤ is well-defined over SCC + (Π), i.e., it does not depend on the chosen a1 ∈ S1 and a2 ∈ S2 , and that ⟨SCC + (Π), ≤⟩ is a partially ordered set, i.e., ≤ is reflexive, transitive, and antisymmetric. For each S ∈ SCC + (Π) we consider the ELP-function ΠS = ⟨Def P (S), A(Def P (S)) \ S, S ∩ O, S ∩ H⟩.
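Computing the strongly connected components of a positive dependency graph is standard; a sketch using Tarjan's algorithm, where the adjacency-dict graph encoding (an edge a → b when a depends on b) is our own illustrative assumption:

```python
# Sketch: strongly connected components of a positive dependency
# graph (Tarjan's algorithm), as needed for SCC+(Π).

def sccs(graph):
    index, low, stack, on_stack = {}, {}, [], set()
    result, counter = [], [0]

    def visit(v):
        index[v] = low[v] = counter[0]; counter[0] += 1
        stack.append(v); on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of an SCC
            comp = set()
            while True:
                w = stack.pop(); on_stack.discard(w); comp.add(w)
                if w == v:
                    break
            result.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return result

print(sorted(map(sorted, sccs({"a": ["b"], "b": ["a"], "c": ["a"]}))))
# two components: {a, b} (mutually recursive) and {c}
```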
Some of these modules ΠS , however, may share hidden
atoms, and therefore cannot be joined. To overcome this,
such components of SCC + (Π) need to be identified.
The example suggests that it is not safe to use f ∈ FUP to
forget shared atoms that inherently change the I/O interface
between the modules. The same is also true for hiding.
Example 11. Consider again Ex. 10. We obtain the three modules in each of which b has been hidden as follows: Π′1 = ⟨{a ← b; b ← not not b}, ∅, {a}, {b}⟩, Π′2 = ⟨{b ← not c}, {c}, ∅, {b}⟩ and (Π1 ⊔ Π2 )′ = ⟨{a ← b; b ← not c}, {c}, {a}, {b}⟩. But then Π′1 and Π′2 do not respect the input/output interfaces of each other. We could circumvent this by renaming one of the occurrences of b, but we would also lose the prior dependency of a on c.
5 Module Reconfiguration
Preserving the compositionality of stable models of modules while forgetting is desirable given the very idea of modular
ASP: we want users to define ASP modules that can be composed into larger programs/modules. However, as we have
seen, the module theorem no longer works entirely whenever some atom to be forgotten is shared by two modules.
In such cases, one alternative is to somehow modify the
modules so that these atoms cease to occur in the visible components of different modules, i.e., reconfigure ASP
modules by merging and splitting modules, so that we can
forget while preserving the compositionality of stable models of modules. Of course, for this to be feasible, we must
have access to the modules in question (by communication,
or because we own the modules). This may require sharing some information about some module, which may not
always be desirable, but, arguably, whenever possible, this
is a reasonable trade-off for being able to forget atoms from
modules while preserving (UP) and the module theorem.
One way to address the problem, provided all involved
modules are mutually independent and their composition is
defined, is to join all the modules that contain such atoms.
Let Π be an ELP-function obtained as a join of n ELP-functions Π1 , . . . , Πn , and V ⊆ A(Π) a set of atoms to forget. Consider the following relation on N = {1, . . . , n}:
Definition 5. Given an ELP-function Π = hP, I, O, Hi,
components S1 , S2 ∈ SCC + (Π) do not respect the hidden
atoms of each other, denoted by S1 !h S2 , if and only if
S1 ≠ S2 and (at least) one of the following conditions holds:
1. there is h ∈ Ah (ΠS1 ) such that h ∈ Ai (ΠS2 ),
2. there is h ∈ Ah (ΠS2 ) such that h ∈ Ai (ΠS1 ),
3. there are h1 ∈ Ah (ΠS1 ) and h2 ∈ Ah (ΠS2 ) such that
both occur in some integrity constraint of Π.
It is clear that the relation !h is irreflexive and symmetric on SCC + (Π) for every ELP-function Π. If we consider the reflexive and transitive closure of !h , denoted by !∗h , we obtain an equivalence relation. A repartition of SCC + (Π) can then be obtained by considering the quotient set SCC + (Π)/!∗h , i.e., the set of equivalence classes of !∗h over SCC + (Π), which can be used to decompose Π.
Definition 6. Given an ELP-function Π = ⟨P, I, O, H⟩, the decomposition induced by SCC + (Π) and !∗h includes an ELP-function Π0 = ⟨IC0 (P ), A(IC0 (P )) ∪ (I \ A(P )), ∅, ∅⟩, where IC0 (P ) = {← B, not C, not not D ∈ P | (B ∪ C ∪ D) ∩ H = ∅}, and, for each S ∈ SCC + (Π)/!∗h , an ELP-function ΠS = ⟨Def P (S̄) ∪ ICS (P ), A(Def P (S̄) ∪ ICS (P )) \ S̄, S̄ ∩ O, S̄ ∩ H⟩, where S̄ = ⋃S and ICS (P ) = {← B, not C, not not D ∈ P | (B ∪ C ∪ D) ∩ (S̄ ∩ H) ≠ ∅}.
The module Π0 keeps track of integrity constraints as well as input atoms that are not mentioned by the rules of P . It can be shown, adapting (Janhunen et al. 2009), that this decomposition of an ELP-function is valid.
Proposition 5. Given an ELP-function Π = ⟨P, I, O, H⟩, then Π = Π0 ⊔ (⊔S∈SCC + (Π)/!∗h ΠS ).
We now show that this decomposition can be used to allow forgetting while still preserving the module theorem. Let Π1 = ⟨P1 , I1 , O1 , H1 ⟩ and Π2 = ⟨P2 , I2 , O2 , H2 ⟩ be two ELP-functions such that their join is defined. Since Prop. 4 shows that we can forget a set of atoms by forgetting iteratively every atom in the set, we focus on forgetting a single atom p. Suppose that p is shared by the two modules, i.e., p ∈ (I1 ∪ O1 ) ∩ (I2 ∪ O2 ), and recall that we cannot guarantee that forgetting p separately in Π1 and Π2 preserves the module theorem. We first consider the set of components of the decomposition of Π1 that are relevant for atom p, i.e., R(Π1 , p) = {S ∈ SCC + (Π1 )/!∗h | p ∈ Ao (ΠS ) ∪ Ai (ΠS )}. We denote by Πp1 the union of the ELP-functions in R(Π1 , p), i.e., Πp1 = ⊔R(Π1 , p), by R̄(Π1 , p) the set of components of the decomposition of Π1 that are not relevant for p, i.e., R̄(Π1 , p) = {S ∈ SCC + (Π1 )/!∗h | S ∉ R(Π1 , p)}, and by Πp̄1 the union of the ELP-functions in R̄(Π1 , p), i.e., Πp̄1 = ⊔R̄(Π1 , p).
The decomposition of Π1 can then be used to obtain a restricted version of the module theorem.
Theorem 14 (Reconfiguration). Let Π be an ELP-function obtained as a join of two ELP-functions Π1 and Π2 , and let p ∈ (Ai (Π1 ) ∪ Ao (Π1 )) ∩ (Ai (Π2 ) ∪ Ao (Π2 )). Then, SM(Π\{p}) = SM(Πp̄1 \{p}) ⋈ SM((Π2 ⊔ Πp1 )\{p}).
Thus, to allow forgetting in modules and preserve the module theorem, we can essentially decompose certain modules and reconfigure them in such a way that all rules on the considered shared atom occur in a single module.
6 Conclusions
In this paper, we thoroughly investigated the operation of forgetting in the context of modular ASP.
We began by observing that strong persistence (SP) – the property usually taken to best characterize forgetting in ASP, which cannot always be guaranteed – is too strong when we consider modular ASP. Given the structure of modules in the context of modular ASP, namely their restricted interface, a weaker notion of persistence based on uniform equivalence is sufficient to properly characterise forgetting in this case, which led us to introduce uniform persistence (UP).
We showed that, unlike with (SP), it is always possible to forget under (UP). Perhaps surprisingly, we also showed that, in general, none of the operators defined in the literature satisfies this weaker form of persistence, which led us to introduce the class of forgetting operators FUP that we proved to obey (UP), as well as a set of other properties commonly discussed in the literature.
We then turned our attention to the application of this class of forgetting operators to forget input, output, and hidden atoms from modules, and related it with the operation of hiding. Despite showing that we can always forget atoms from modules under uniform persistence, we also showed that the important module theorem no longer holds in general, with negative consequences for the compositionality of stable models. Subsequently, after pinpointing the conditions under which the module theorem holds, we proceeded by investigating how the theorem could be “recovered” through a reconfiguration of the modules obtained by suitable decomposition and composition operations.
Possible avenues for future work include investigating forgetting in other existing ways to view modular ASP, such as (Dao-Tran et al. 2009; Harrison and Lierler 2016), the precise relationship of (UP) and UP-Forgetting to the notion of relativized uniform equivalence (Eiter, Fink, and Woltran 2007), and obtaining syntactic operators for UP-Forgetting in the line of (Berthold et al. 2019).
Acknowledgments Authors R. Gonçalves, M. Knorr, and J. Leite were partially supported by FCT project FORGET (PTDC/CCI-INF/32219/2017) and by FCT project NOVA LINCS (UIDB/04516/2020). T. Janhunen was partially supported by the Academy of Finland project 251170. S. Woltran was supported by the Austrian Science Fund (FWF): Y698, P25521.
References
Baral, C.; Dzifcak, J.; and Takahashi, H. 2006. Macros, macro calls and use of ensembles in modular answer set programming. In Etalle, S., and Truszczynski, M., eds., Procs. of ICLP, volume 4079 of LNCS, 376–390. Springer.
Berthold, M.; Gonçalves, R.; Knorr, M.; and Leite, J. 2019. A syntactic operator for forgetting that satisfies strong persistence. Theory Pract. Log. Program. 19(5-6):1038–1055.
Bledsoe, W. W., and Hines, L. M. 1980. Variable elimination and chaining in a resolution-based prover for inequalities. In Bibel, W., and Kowalski, R. A., eds., Procs. of CADE, volume 87 of LNCS, 70–87. Springer.
Cabalar, P., and Ferraris, P. 2007. Propositional theories are strongly equivalent to logic programs. TPLP 7(6):745–759.
Dao-Tran, M.; Eiter, T.; Fink, M.; and Krennwallner, T. 2009. Modular nonmonotonic logic programming revisited. In Hill, P. M., and Warren, D. S., eds., Procs. of ICLP, volume 5649 of LNCS, 145–159. Springer.
Delgrande, J. P., and Wang, K. 2015. A syntax-independent approach to forgetting in disjunctive logic programs. In Bonet, B., and Koenig, S., eds., Procs. of AAAI, 1482–1488. AAAI Press.
Eiter, T., and Fink, M. 2003. Uniform equivalence of logic
programs under the stable model semantics. In Palamidessi,
C., ed., Procs. of ICLP, volume 2916 of LNCS, 224–238.
Springer.
Eiter, T., and Wang, K. 2008. Semantic forgetting in answer
set programming. Artif. Intell. 172(14):1644–1672.
Eiter, T.; Fink, M.; and Woltran, S. 2007. Semantical characterizations and complexity of equivalences in answer set
programming. ACM Trans. Comput. Log. 8(3).
European Union. 2016. General Data Protection Regulation.
Official Journal of the European Union L119:1–88.
Gabbay, D. M.; Schmidt, R. A.; and Szalas, A. 2008. Second
Order Quantifier Elimination: Foundations, Computational
Aspects and Applications. College Publications.
Gonçalves, R.; Knorr, M.; Leite, J.; and Woltran, S. 2017.
When you must forget: Beyond strong persistence when forgetting in answer set programming. TPLP 17(5-6):837–854.
Gonçalves, R.; Janhunen, T.; Knorr, M.; Leite, J.; and
Woltran, S. 2019. Forgetting in modular answer set programming. In AAAI, 2843–2850. AAAI Press.
Gonçalves, R.; Knorr, M.; Leite, J.; and Woltran, S. 2020.
On the limits of forgetting in answer set programming. Artif.
Intell. 286:103307.
Gonçalves, R.; Knorr, M.; and Leite, J. 2016a. The ultimate
guide to forgetting in answer set programming. In Baral, C.;
Delgrande, J.; and Wolter, F., eds., Procs. of KR, 135–144.
AAAI Press.
Gonçalves, R.; Knorr, M.; and Leite, J. 2016b. You can’t always forget what you want: on the limits of forgetting in answer set programming. In Fox, M. S., and Kaminka, G. A.,
eds., Procs. of ECAI, 957–965. IOS Press.
Harrison, A., and Lierler, Y. 2016. First-order modular logic programs and their conservative extensions. TPLP 16(5-6):755–770.
Janhunen, T.; Oikarinen, E.; Tompits, H.; and Woltran, S.
2009. Modularity aspects of disjunctive stable models. J.
Artif. Intell. Res. (JAIR) 35:813–857.
Knorr, M., and Alferes, J. J. 2014. Preserving strong equivalence while forgetting. In Fermé, E., and Leite, J., eds.,
Procs. of JELIA, volume 8761 of LNCS, 412–425. Springer.
Lang, J.; Liberatore, P.; and Marquis, P. 2003. Propositional
independence: Formula-variable independence and forgetting. J. Artif. Intell. Res. (JAIR) 18:391–443.
Lierler, Y., and Truszczynski, M. 2011. Transition systems for model generators - A unifying approach. TPLP 11(4-5):629–646.
Lifschitz, V.; Pearce, D.; and Valverde, A. 2001. Strongly
equivalent logic programs. ACM Trans. Comput. Log.
2(4):526–541.
Middeldorp, A.; Okui, S.; and Ida, T. 1996. Lazy narrowing:
Strong completeness and eager variable elimination. Theor.
Comput. Sci. 167(1&2):95–130.
Moinard, Y. 2007. Forgetting literals with varying propositional symbols. J. Log. Comput. 17(5):955–982.
Oikarinen, E., and Janhunen, T. 2006. Modular equivalence
for normal logic programs. In Brewka, G.; Coradeschi, S.;
Perini, A.; and Traverso, P., eds., Procs. of ECAI, 412–416.
Oikarinen, E., and Janhunen, T. 2008. Achieving compositionality of the stable model semantics for smodels programs. TPLP 8(5-6):717–761.
Sagiv, Y. 1988. Optimizing datalog programs. In Minker,
J., ed., Foundations of Deductive Databases and Logic Programming. Morgan Kaufmann. 659–698.
Wang, Y.; Zhang, Y.; Zhou, Y.; and Zhang, M. 2014. Knowledge forgetting in answer set programming. J. Artif. Intell.
Res. (JAIR) 50:31–70.
Wang, Y.; Wang, K.; and Zhang, M. 2013. Forgetting for
answer set programs revisited. In Rossi, F., ed., Procs. of
IJCAI, 1162–1168. IJCAI/AAAI.
Weber, A. 1986. Updating propositional formulas. In Expert
Database Conf., 487–500.
Wong, K.-S. 2009. Forgetting in Logic Programs. Ph.D.
Dissertation, The University of New South Wales.
Zhang, Y., and Foo, N. Y. 2006. Solving logic program
conflict through strong and weak forgettings. Artif. Intell.
170(8-9):739–778.
A framework for a modular multi-concept
lexicographic closure semantics
Laura Giordano , Daniele Theseider Dupré
DISIT - Università del Piemonte Orientale, Italy
{laura.giordano, dtd}@uniupo.it
Abstract
We define a modular multi-concept extension of the lexicographic closure semantics for defeasible description logics with typicality. The idea is that of distributing the defeasible properties of concepts into different modules, according to their subject, and of defining a notion of preference for each module based on the lexicographic closure semantics. The preferential semantics of the knowledge base can then be defined as a combination of the preferences of the single modules. The range of possibilities, from fine-grained to coarse-grained modules, provides a spectrum of alternative semantics.
1 Introduction
Kraus, Lehmann and Magidor’s preferential logics for nonmonotonic reasoning (Kraus, Lehmann, and Magidor 1990; Lehmann and Magidor 1992) have been extended to description logics, to deal with inheritance with exceptions in ontologies, allowing for non-strict forms of inclusions, called typicality or defeasible inclusions, with different preferential and ranked semantics (Giordano et al. 2007; Britz, Heidema, and Meyer 2008) as well as different closure constructions such as the rational closure (Casini and Straccia 2010; Casini et al. 2013; Giordano et al. 2013b; Giordano et al. 2015), the lexicographic closure (Casini and Straccia 2012), the relevant closure (Casini et al. 2014), and the MP-closure (Giordano and Gliozzi 2019).
In this paper we define a modular multi-concept extension of the lexicographic closure for reasoning about exceptions in ontologies. The idea is very simple: different modules can be defined starting from a defeasible knowledge base, containing a set D of typicality inclusions (or defeasible inclusions) describing the prototypical properties of classes in the knowledge base. We will represent such defeasible inclusions as T(C) ⊑ D (Giordano et al. 2007), meaning that “typical C’s are D’s” or “normally C’s are D’s”, corresponding to conditionals C |∼ D in the KLM framework. A set of modules m1 , . . . , mn is introduced, each one concerning a subject, and defeasible inclusions belong to a module if they are related to its subject. By subject, here, we mean any concept of the knowledge base. Module mi with subject Ci does not need to contain just typicality inclusions of the form T(Ci ) ⊑ D, but all defeasible inclusions in D which are concerned with subject Ci are admitted in mi . We call a collection of such modules a modular multi-concept knowledge base.
This modularization of the defeasible part of the knowledge base does not define a partition of the set D of defeasible inclusions, as an inclusion may belong to more than one module. For instance, the typical properties of employed students are relevant both for the module with subject Student and for the module with subject Employee. The granularity of modularization has to be chosen by the knowledge engineer, who can fix how large or narrow the scope of a module is, and how many modules are to be included in the knowledge base (for instance, whether the properties of employees and students are to be defined in the same module with subject Person or in two different modules). At one extreme, all the defeasible inclusions in D can be put together in a module associated with subject ⊤ (Thing). At the other extreme, which has been studied in (Giordano and Theseider Dupré 2020), a module mi is a defeasible TBox containing only the defeasible inclusions of the form T(Ci ) ⊑ D for its subject concept Ci . In this paper we remove this restriction, considering general modules containing arbitrary sets of defeasible inclusions, intuitively pertaining to some subject.
In (Giordano and Theseider Dupré 2020), following Gerard Brewka’s framework of Basic Preference Descriptions for ranked knowledge bases (Brewka 2004), we have assumed that a specification of the relative importance of typicality inclusions for a concept Ci is given by assigning ranks to typicality inclusions. However, for a large module, a specification by hand of the ranking of the defeasible inclusions in the module would be awkward. In particular, a module may include all properties of a class as well as properties of its exceptional subclasses (for instance, the typical properties of penguins, ostriches, etc. might all be included in a module with subject Bird ). A natural choice is then to consider, for each module, a lexicographic semantics which builds on the rational closure ranking to define a preference ordering on domain elements. This preference relation corresponds, in the propositional case, to the lexicographic order on worlds in Lehmann’s model theoretic semantics of the lexicographic closure (Lehmann 1995). This semantics already accounts for the specificity relations among concepts inside the module, as the lexicographic closure deals with
extended to complex concepts as follows:
specificity, based on ranking of concepts computed by the
rational closure of the knowledge base.
Based on the ranked semantics of the single modules,
a compositional (preferential) semantics of the knowledge
base is defined by combining the multiple preference relations into a single global preference relation <. This gives
rise to a modular multi-concept extension of Lehmann’s
preference semantics for the lexicographic closure. When
there is a single module, containing all the typicality inclusions in the knowledge base, the semantics collapses to a natural extension to DLs of Lehmann’s semantics, which corresponds to Lehmann’s semantics for the fragment of ALC
without universal and existential restrictions.
We introduce a notion of entailment for modular multiconcept knowledge bases, based on the proposed semantics, which satisfies the KLM properties of a preferential
consequence relation. This notion of entailment has good
properties inherited from lexicographic closure: it deals
properly with irrelevance and specificity, and it is not subject to the “blockage of property inheritance” problem, i.e.,
the problem that property inheritance from classes to subclasses is not guaranteed, which affects the rational closure
(Pearl 1990). In addition, separating defeasible inclusions in
different modules provides a simple solution to another problem of the rational closure and its refinements (including the
lexicographic closure), that was recognized by Geffner and
Pearl (1992), namely, that “conflicts among defaults that
should remain unresolved, are resolved anomalously”, giving rise to too strong conclusions. The preferential (not necessarily ranked) nature of the global preference relation <
provides a simple way out to this problem, when defeasible
inclusions are suitably separated in different modules.
⊤I = ∆
⊥I = ∅
(¬C)I = ∆\C I
(C ⊓ D)I = C I ∩ DI
(C ⊔ D)I = C I ∪ DI
(∀R.C)I = {x ∈ ∆ | ∀y.(x, y) ∈ RI → y ∈ C I }
(∃R.C)I = {x ∈ ∆ | ∃y.(x, y) ∈ RI & y ∈ C I }.
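The semantic clauses above can be prototyped directly over a finite interpretation. A small sketch in which the nested-tuple concept encoding and all concrete names are our own illustrative assumptions:

```python
# Sketch: evaluating ALC concepts over a finite interpretation,
# following the semantic clauses above.

def ext(concept, domain, ext_c, ext_r):
    """Extension of an ALC concept. concept is 'Top', 'Bot', a concept
    name, or a tuple ('not', C), ('and', C, D), ('or', C, D),
    ('all', R, C), ('some', R, C)."""
    if concept == 'Top':
        return set(domain)
    if concept == 'Bot':
        return set()
    if isinstance(concept, str):          # a concept name
        return set(ext_c[concept])
    op = concept[0]
    if op == 'not':
        return set(domain) - ext(concept[1], domain, ext_c, ext_r)
    if op == 'and':
        return (ext(concept[1], domain, ext_c, ext_r)
                & ext(concept[2], domain, ext_c, ext_r))
    if op == 'or':
        return (ext(concept[1], domain, ext_c, ext_r)
                | ext(concept[2], domain, ext_c, ext_r))
    r = ext_r[concept[1]]
    c = ext(concept[2], domain, ext_c, ext_r)
    if op == 'all':   # ∀R.C
        return {x for x in domain
                if all(y in c for (x2, y) in r if x2 == x)}
    if op == 'some':  # ∃R.C
        return {x for x in domain
                if any(y in c for (x2, y) in r if x2 == x)}

dom = {1, 2, 3}
ec = {'Student': {1, 2}, 'Young': {1, 3}}
er = {'knows': {(1, 2), (2, 3)}}
print(ext(('some', 'knows', 'Student'), dom, ec, er))  # {1}
```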
The notion of satisfiability of a KB in an interpretation and
the notion of entailment are defined as follows:
Definition 1 (Satisfiability and entailment). Given an ALC interpretation I = ⟨∆, ·I ⟩:
- I satisfies an inclusion C ⊑ D if C I ⊆ DI ;
- I satisfies an assertion C(a) if aI ∈ C I ;
- I satisfies an assertion R(a, b) if (aI , bI ) ∈ RI .
Given a KB K = (T , A), an interpretation I satisfies T
(resp., A) if I satisfies all inclusions in T (resp., all assertions in A). I is an ALC model of K = (T , A) if I satisfies
T and A.
Letting a query F be either an inclusion C ⊑ D (where C and D are concepts) or an assertion (C(a) or R(a, b)), F is entailed by K, written K |=ALC F , if for all ALC models I = ⟨∆, ·I ⟩ of K, I satisfies F .
Given a knowledge base K, the subsumption problem is
the problem of deciding whether an inclusion C ⊑ D is entailed by K. The instance checking problem is the problem
of deciding whether an assertion C(a) is entailed by K. The
concept satisfiability problem is the problem of deciding, for
a concept C, whether C is consistent with K (i.e., whether there exists a model I of K such that C I ≠ ∅).
In the following we will refer to an extension of ALC
with typicality inclusions, that we will call ALC + T as
in (Giordano et al. 2007), and to the rational closure of
ALC + T knowledge bases (T , A) (Giordano et al. 2013b;
Giordano et al. 2015). In addition to standard ALC inclusions C ⊑ D (called strict inclusions in the following), in
ALC + T the TBox T also contains typicality inclusions
of the form T(C) ⊑ D, where C and D are ALC concepts. Among all rational closure constructions for ALC
mentioned in the introduction, we will refer to the one in
(Giordano et al. 2013b), and to its minimal canonical model
semantics. Let us recall the notions of preferential, ranked
and canonical model of a defeasible knowledge base (T , A),
that will be useful in the following.
Definition 2 (Interpretations for ALC + T). A preferential interpretation N is any structure ⟨∆, <, ·I ⟩ where: ∆ is a domain; < is an irreflexive, transitive and well-founded relation over ∆; ·I is a function that maps all concept names, role names and individual names as defined above for ALC interpretations, and provides an interpretation to all ALC concepts as above, and to typicality concepts as follows: (T(C))I = min< (C I ), where min< (S) = {u : u ∈ S and ∄z ∈ S s.t. z < u}.
When relation < is required to be also modular (i.e., for all
x, y, z ∈ ∆, if x < y then x < z or z < y), N is called a
ranked interpretation.
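The typicality clause of Definition 2, (T(C))I = min< (C I ), is easy to prototype on a finite domain. A sketch, where encoding the strict preference as a set of pairs (x, y) meaning x < y, and the concrete elements, are our own illustrative assumptions:

```python
# Sketch: computing min_< of a concept extension, i.e. the elements
# with no strictly preferred element in the same extension.

def min_pref(extension, less):
    """Elements u of `extension` with no z in `extension` s.t. z < u;
    `less` is a set of pairs (x, y) meaning x < y."""
    return {u for u in extension
            if not any((z, u) in less for z in extension)}

birds = {"tweety", "opus"}
less = {("tweety", "opus")}   # tweety is more normal than opus
print(min_pref(birds, less))  # {'tweety'}: the typical birds
```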
2 Preliminaries: The description logic ALC and its extension with typicality inclusions
Let NC be a set of concept names, NR a set of role names
and NI a set of individual names. The set of ALC concepts
(or, simply, concepts) can be defined inductively as follows:
• A ∈ NC , ⊤ and ⊥ are concepts;
• if C and D are concepts and R ∈ NR , then C ⊓ D, C ⊔
D, ¬C, ∀R.C, ∃R.C are concepts.
A knowledge base (KB) K is a pair (T , A), where T is a
TBox and A is an ABox. The TBox T is a set of concept
inclusions (or subsumptions) C ⊑ D, where C, D are concepts. The ABox A is a set of assertions of the form C(a)
and R(a, b) where C is a concept, R ∈ NR , and a, b ∈ NI .
An ALC interpretation (Baader et al. 2007) is a pair I =
h∆, ·I i where: ∆ is a domain—a set whose elements are
denoted by x, y, z, . . . —and ·I is an extension function that
maps each concept name C ∈ NC to a set C I ⊆ ∆, each
role name R ∈ NR to a binary relation RI ⊆ ∆ × ∆, and
each individual name a ∈ NI to an element aI ∈ ∆. It is
included in mi , and, if T(Bird ) ⊑ FlyingAnimal and
T(FlyingAnimal ) ⊑ BigWings are defeasible inclusions
in the knowledge base, they both may be relevant properties
of birds to be included in mi . For this reason we will not put
restrictions on the typicality inclusions that can belong to a
module. We will see later that the semantic construction for
a module mi will be able to ignore the typicality inclusions
which are not relevant for subject Ci and that there are cases
when not even the inclusions T(C) ⊑ D with C subsumed
by Ci are admitted in mi .
The modularization m1 , . . . , mk of the defeasible part D
of the knowledge base does not define a partition of D, as
the same inclusion may belong to more than one module
mi . For instance, the typical properties of employed students are relevant for both concept Student and concept
Employee and should belong to their related modules (if
any). Also, a granularity of modularization has to be chosen and, as we will see, this choice may have an impact
on the global semantics of the knowledge base. At one extreme, all the defeasible inclusions in D are put together
in the same module, e.g., the module associated with concept ⊤. At the other extreme, which has been studied
in (Giordano and Theseider Dupré 2020), a module mi contains only the defeasible inclusions of the form T(Ci ) ⊑ D,
where Ci is the subject of mi (and in this case, the inclusions T(C) ⊑ D with C subsumed by Ci are not admitted in mi ). In this regard, the framework proposed in
this paper could be seen as an extension of the proposal
in (Giordano and Theseider Dupré 2020) to allow coarser-grained modules, while here we do not allow for user-defined preferences among defaults.
Let us consider an example of multi-concept knowledge
base.
Example 5. Let K be the knowledge base ⟨T , D, m1 , m2 , m3 , A, s⟩, where A = ∅ and T contains the strict inclusions:
Preferential interpretations for description logics were
first studied in (Giordano et al. 2007), while ranked interpretations (i.e., modular preferential interpretations) were first
introduced for ALC in (Britz, Heidema, and Meyer 2008).
A preferential (ranked) model of an ALC + T knowledge base K is a preferential (ranked) ALC + T interpretation N = ⟨∆, <, ·I ⟩ that satisfies all inclusions in K, where: a strict inclusion or an assertion is satisfied in N if it is satisfied in the ALC model ⟨∆, ·I ⟩, and a typicality inclusion T(C) ⊑ D is satisfied in N if (T(C))I ⊆ DI .
Preferential entailment in ALC + T is defined in the usual
way: for a knowledge base K and a query F (a strict or
defeasible inclusion or an assertion), F is preferentially entailed by K (K |=ALC+T F ) if F is satisfied in all preferential models of K.
A canonical model for K is a preferential (ranked) model
containing, roughly speaking, as many domain elements as
consistent with the knowledge base specification K. Given
an ALC + T knowledge base K = (T , A) and a query F ,
let us define SK as the set of all ALC concepts (and subconcepts) occurring in K or in F , together with their complements. We consider all the sets of concepts {C1 , C2 , . . . , Cn } ⊆ SK consistent with K, i.e., s.t. K ⊭ALC+T C1 ⊓ C2 ⊓ · · · ⊓ Cn ⊑ ⊥.
Definition 3 (Canonical model). A preferential model M = ⟨∆, <, ·I ⟩ of K is canonical with respect to SK if it contains at least a domain element x ∈ ∆ s.t. x ∈ (C1 ⊓ C2 ⊓ · · · ⊓ Cn )I , for each set {C1 , C2 , . . . , Cn } ⊆ SK consistent with K.
For finite, consistent ALC + T knowledge bases, existence of finite (ranked) canonical models has been proved in
(Giordano et al. 2015) (Theorem 1). In the following, as we
will only consider finite ALC + T knowledge bases, we can
restrict our consideration to finite preferential models.
3 Modular multi-concept knowledge bases
Employee ⊑ Adult
Adult ⊑ ∃has SSN .⊤
PhDStudent ⊑ Student
PhDStudent ⊑ Adult
Has no Scolarship ≡ ¬∃hasScolarship.⊤
PrimarySchoolStudent ⊑ Children
PrimarySchoolStudent ⊑ HasNoClasses
Driver ⊑ Adult
Driver ⊑ ∃has DrivingLicence.⊤
In this section we introduce a notion of a multi-concept
knowledge base, starting from a set of strict inclusions T ,
a set of assertions A, and a set of typicality inclusions D,
each one of the form T(C) ⊑ D, where C and D are ALC
concepts.
Definition 4. A modular multi-concept knowledge base K is a tuple ⟨T , D, m1 , . . . , mk , A, s⟩, where T is an ALC TBox, D is a set of typicality inclusions such that m1 ∪ . . . ∪ mk = D, A is an ABox, and s is a function associating each module mi with a concept, s(mi ) = Ci , the subject of mi .
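The shape of Definition 4 can be mirrored in plain data, including the coverage condition m1 ∪ . . . ∪ mk = D. A sketch, where the string encoding of inclusions and all names are our own illustrative assumptions:

```python
# Sketch: checking the coverage condition of Definition 4,
# m1 ∪ ... ∪ mk = D, plus that every module has a subject.

def is_modular_kb(defeasible, modules, subject):
    """defeasible: set of typicality inclusions (as strings);
    modules: dict module name -> set of inclusions;
    subject: dict module name -> subject concept."""
    covered = set().union(*modules.values()) if modules else set()
    return covered == set(defeasible) and set(modules) == set(subject)

D = {"T(Employee) ⊑ ¬Young", "T(Student) ⊑ Young"}
mods = {"m1": {"T(Employee) ⊑ ¬Young"}, "m2": {"T(Student) ⊑ Young"}}
subj = {"m1": "Employee", "m2": "Student"}
print(is_modular_kb(D, mods, subj))  # True: the modules cover D
```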
and the defeasible inclusions in D are distributed in the modules m1 , m2 , m3 as follows.
Module m1 has subject Employee, and contains the defeasible inclusions:
(d1 ) T(Employee) ⊑ ¬Young
(d2 ) T(Employee) ⊑ ∃has boss.Employee
(d3 ) T(ForeignerEmployee) ⊑ ∃has Visa.⊤
(d4 ) T(Employee ⊓ Student ) ⊑ Busy
(d5 ) T(Employee ⊓ Student ) ⊑ ¬Young
Module m2 has subject Student, and contains the defeasible inclusions:
(d6 ) T(Student ) ⊑ ∃has classes.⊤
(d7 ) T(Student ) ⊑ Young
The idea is that each mi is a module defining the typical
properties of the instances of some concept Ci . The defeasible inclusions belonging to a module mi with subject Ci
are the inclusions that intuitively pertain to Ci . We expect
that all the typicality inclusions T(C) ⊑ D, such that C is
a subclass of Ci , belong to mi , but not only. For instance,
for a module mi with subject Ci = Bird , the typicality inclusion T(Bird ⊓ Live at SouthPole) ⊑ Penguin, meaning that the birds living at the south pole are normally
penguins, is clearly to be included in mi . As penguins
are birds, also inclusion T(Penguin) ⊑ Black is to be
we use <i , instead of <, for the preference relation in Ni ,
for i = 1, . . . , k).
In his seminal work on the lexicographic closure,
Lehmann (1995) defines a model theoretic semantics of the
lexicographic closure construction by introducing an order
relation among propositional models, considering which defaults are violated in each model, and introducing a seriousness ordering ≺ among sets of violated defaults. For two
propositional models w and w′ , w ≺ w′ (w is preferred to
w′ ) is defined in (Lehmann 1995) as follows:
(d8 ) T(Student ) ⊑ Has no Scolarship
(d9 ) T(HighSchoolStudent ) ⊑ Teenager
(d10 ) T(PhDStudent ) ⊑ ∃hasScolarship.Amount
(d11 ) T(PhDStudent ) ⊑ Bright
(d4 ) T(Employee ⊓ Student ) ⊑ Busy
(d5 ) T(Employee ⊓ Student ) ⊑ ¬Young
Module m3 has subject V ehicle, and contains the defeasible inclusions:
(d12 ) T(Vehicle) ⊑ ∃has owner .Driver
(d13 ) T(Car ) ⊑ ¬SportsCar
(d14 ) T(SportsCar ) ⊑ RunFast
(d15 ) T(Truck ) ⊑ Heavy
(d16 ) T(Bicycle) ⊑ ¬RunFast
w ≺ w′ iff V (w) ≺ V (w′ )    (1)
w is preferred to w′ when the defaults V (w) violated by w are less serious than the defaults V (w′ ) violated by w′ . As we will recall below, the seriousness ordering depends on the number of defaults violated by w and by w′ for each rank.
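In the lexicographic closure, this seriousness comparison amounts to comparing, from the most specific (highest) rank down, how many defaults of each rank are violated. A sketch of that comparison, where the encoding of violation profiles as rank-to-count dictionaries is our own illustrative assumption:

```python
# Sketch: lexicographic comparison of violation profiles, from the
# highest rank down; fewer violations at the first differing rank wins.

def less_serious(v1, v2, max_rank):
    """True iff profile v1 (rank -> #violated defaults) is strictly
    less serious than v2."""
    for h in range(max_rank, -1, -1):   # higher ranks matter first
        n1, n2 = v1.get(h, 0), v2.get(h, 0)
        if n1 != n2:
            return n1 < n2
    return False                        # equal profiles

# w violates one rank-0 default; w' violates one rank-1 default:
print(less_serious({0: 1}, {1: 1}, max_rank=1))  # True
print(less_serious({1: 1}, {0: 1}, max_rank=1))  # False
```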
In a similar way, in the following, we introduce a ranked relation <i on the domain ∆ of a model of Ki . Let us first define, for a preferential model Ni = ⟨∆, <i , ·I ⟩ of Ki , what it means that an element x ∈ ∆ violates a typicality inclusion T(C) ⊑ D in mi .
Observe that, in the previous example, (d4 ) and (d5 ) belong
to both modules m1 and m2 . An additional module might
be added containing the prototypical properties of Adults.
4 A lexicographic semantics of modular
multi-concept knowledge bases
In this section, we define a semantics of modular multiconcept knowledge bases, based on Lehmann’s lexicographic closure semantics (1995). The idea is that, for each
module mi , a semantics can be defined using lexicographic
closure semantics, with some minor modification.
Given a modular multi-concept knowledge base K = ⟨T , D, m1 , . . . , mk , A, s⟩, we let rank (C ) be the rank of concept C in the rational closure ranking of the knowledge base (T ∪ D, A), according to the rational closure construction in (Giordano et al. 2013b). In the rational closure ranking, concepts with higher ranks are more specific than concepts with lower ranks. While we will not recall the rational closure construction, let us consider again Example 5. In
closure construction, let us consider again Example 5. In
Example 5, the rational closure ranking assigns to concepts
Adult, Employee, ForeignEmployee, Driver , Student,
HighSchoolStudent , PrimarySchoolStudent the rank 0,
while to concepts PhDStudent and Employee ⊓ Student
the rank 1. In fact, PhDStudent are exceptional students, as
they have a scholarship, while employed students are exceptional students, as they are not young. Their rank is higher
than the rank of concept Student as they are exceptional
subclasses of class Student.
Based on the concept ranking, the rational closure assigns
a rank to typicality inclusions: the rank of T(C) ⊑ D is
equal to the rank of concept C. For each module mi of a
knowledge base K = hT , D, m1 , . . . , mk , A, si, we aim
to define a canonical model, using the lexicographic order
based on the rank of typicality inclusions in mi . In the following we will assume that the knowledge base hT ∪ D, Ai
is consistent in the logic ALC + T, that is, it has a preferential model. This also guarantees the existence of (finite)
canonical models (Giordano et al. 2015). In the following,
as the knowledge base K is finite, we will restrict our consideration to finite preferential and ranked models.
Let us define the projection of the knowledge base K on
module mi as the knowledge base Ki = hT ∪ mi , Ai. Ki is
an ALC + T knowledge base. Hence a preferential model
Ni = h∆, <i , ·I i of Ki is defined as in Section 2 (but now
Definition 6. Given a module mi of K, with s(mi ) = Ci ,
and a preferential model Ni = h∆, <i , ·I i of Ki , an element
x ∈ ∆ violates a typicality inclusion T(C) ⊑ D in mi if
x ∈ C I and x 6∈ DI .
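To make the violation test of Definition 6 and the rank-based comparison that follows concrete, here is a minimal Python sketch. It is not from the paper: concept membership is flattened into sets of concept names, existential restrictions are abbreviated by plain concept names, and all identifiers are hypothetical.

```python
# Hypothetical encoding (not from the paper): an element's concept
# memberships are flattened into a set of concept names, and a module is a
# list of defeasible inclusions T(C) ⊑ D given as triples (C, D, rank),
# with ranks taken from the rational closure construction.

def violated(element_concepts, module):
    """Defeasible inclusions of the module violated by the element:
    T(C) ⊑ D is violated iff the element is a C but not a D (Definition 6)."""
    return {(c, d, r) for (c, d, r) in module
            if c in element_concepts and d not in element_concepts}

def seriousness_tuple(element_concepts, module):
    """The tuple of violation counts per rank, highest (most specific)
    rank first, mirroring ⟨|V^(r-1)(x)|, ..., |V^0(x)|⟩."""
    vs = violated(element_concepts, module)
    top = max((r for (_, _, r) in module), default=-1)
    return tuple(sum(1 for (_, _, r) in vs if r == h)
                 for h in range(top, -1, -1))

def preferred(x_concepts, y_concepts, module):
    """x <_i y iff the violation tuple of x lexicographically precedes
    that of y (Python compares tuples lexicographically)."""
    return (seriousness_tuple(x_concepts, module)
            < seriousness_tuple(y_concepts, module))

# Module m2, simplified: d8 with rank 0, d10 with rank 1 (the existential
# restriction of d10 is abbreviated by the concept name Scholarship).
m2 = [("Student", "Has_no_Scholarship", 0), ("PhDStudent", "Scholarship", 1)]
phd_with = {"Student", "PhDStudent", "Scholarship"}            # violates only d8
phd_without = {"Student", "PhDStudent", "Has_no_Scholarship"}  # violates only d10
print(preferred(phd_with, phd_without, m2))  # True: (0, 1) < (1, 0)
```

Violating the more specific default d10 is thus more serious than violating d8, which matches the specificity behaviour of the lexicographic closure.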
Notice that the set of typicality inclusions violated by a domain element x in a model only depends on the interpretation ·^I of ALC concepts and on the defeasible inclusions in mi. Let Vi(x) be the set of the defeasible inclusions of mi violated by domain element x, and let Vi^h(x) be the set of all defeasible inclusions in mi with rank h which are violated by domain element x.
In order to compare alternative sets of defaults, in (Lehmann 1995) the seriousness ordering ≺ among sets of defaults is defined by associating with each set of defaults D ⊆ K a tuple of numbers ⟨n0, n1, ..., nr⟩, where r is the order of K, i.e., the least finite rank such that there is no default with finite rank r or higher (but there is at least one default with rank r − 1). The tuple is constructed considering the ranks of defaults in the rational closure: n0 is the number of defaults in D with rank ∞ and, for 1 ≤ i ≤ r, ni is the number of defaults in D with rank r − i (in particular, nr is the number of defaults in D with rank 0). Lehmann defines the strict modular order ≺ among sets of defaults from the natural lexicographic order over the tuples ⟨n0, n1, ..., nr⟩. This order gives preference to those sets of defaults containing a larger number of more specific defaults. As we have seen from equation (1), ≺ is used by Lehmann to compare sets of violated defaults and to prefer the propositional models whose violations are less serious.
We use the same criterion for comparing domain elements, introducing a seriousness ordering ≺i for each module mi. Considering that the defaults with infinite rank must be satisfied by all domain elements, we will not need to consider their violation in our definition (that is, we will not consider n0 in the following).
The set Vi(x) of defaults from module mi which are violated by x can be associated with a tuple of numbers
ti,x = ⟨|Vi^(r−1)(x)|, ..., |Vi^0(x)|⟩. Following Lehmann, we let Vi(x) ≺i Vi(y) iff ti,x comes before ti,y in the natural lexicographic order on tuples (restricted to the violations of defaults in mi), that is:

Vi(x) ≺i Vi(y) iff ∃l such that |Vi^l(x)| < |Vi^l(y)| and, ∀h > l, |Vi^h(x)| = |Vi^h(y)|

Definition 7. A preferential model Ni = ⟨∆, <i, ·^I⟩ of Ki = ⟨T ∪ mi, A⟩ is a lexicographic model of Ki if ⟨∆, ·^I⟩ is an ALC model of ⟨T, A⟩ and <i satisfies the following condition:

x <i y iff Vi(x) ≺i Vi(y).    (2)

Informally, <i gives higher preference to domain elements violating fewer typicality inclusions of mi with higher rank. In particular, for all x, y ∉ Ci^I, x ∼i y; i.e., all ¬Ci elements are assigned the same preference wrt <i, the least one, as they trivially satisfy all the typicality properties in mi. As in Lehmann's semantics, in a lexicographic model Ni = ⟨∆, <i, ·^I⟩ of Ki, the preference relation <i is a strict modular partial order, i.e., an irreflexive, transitive and modular relation. As well-foundedness trivially holds for finite interpretations, a lexicographic model Ni of Ki is a ranked model of Ki.
Proposition 8. A lexicographic model Ni = ⟨∆, <i, ·^I⟩ of Ki = ⟨T ∪ mi, A⟩ is a ranked model of Ki.
A multi-concept model for K can be defined as a multi-preference interpretation with a preference relation <i for each module mi.
Definition 9 (Multi-concept interpretation). Let K = ⟨T, D, m1, ..., mk, A, s⟩ be a multi-concept knowledge base. A multi-concept interpretation M for K is a tuple ⟨∆, <1, ..., <k, ·^I⟩ such that, for all i = 1, ..., k, ⟨∆, <i, ·^I⟩ is a ranked ALC + T interpretation, as defined in Section 2.
Definition 10 (Multi-concept lexicographic model). Let K = ⟨T, D, m1, ..., mk, A, s⟩ be a multi-concept knowledge base. A multi-concept lexicographic model M = ⟨∆, <1, ..., <k, ·^I⟩ of K is a multi-concept interpretation for K such that, for all i = 1, ..., k, Ni = ⟨∆, <i, ·^I⟩ is a lexicographic model of Ki = ⟨T ∪ mi, A⟩.
A canonical multi-concept lexicographic model of K is a multi-concept lexicographic model of K such that ∆ and ·^I are the domain and interpretation function of some canonical preferential model of ⟨T ∪ D, A⟩, according to Definition 3.
Definition 11 (Canonical multi-concept lexicographic model). Given a multi-concept knowledge base K = ⟨T, D, m1, ..., mk, A, s⟩, a canonical multi-concept lexicographic model of K, M = ⟨∆, <1, ..., <k, ·^I⟩, is a multi-concept lexicographic model of K such that there is a canonical ALC + T model ⟨∆, <∗, ·^I⟩ of ⟨T ∪ D, A⟩, for some <∗.
Observe that, restricting to the propositional fragment of the language (which allows neither universal and existential restrictions nor assertions), for a knowledge base K without strict inclusions and with a single module m1, with subject ⊤, containing all the typicality inclusions in K, the preference relation <1 corresponds to Lehmann's lexicographic closure semantics, as its definition is based on the set of all defeasible inclusions in the knowledge base.

5 The combined lexicographic model of a KB

For multiple modules, each <i determines a ranked preference relation which can be used to answer queries over module mi (i.e., queries whose subject is Ci). If we want to evaluate the query T(C) ⊑ D (are all typical C elements also D elements?) in module mi (assuming that C concerns subject Ci), we can answer the query using the <i relation, by checking whether min<i(C^I) ⊆ D^I. For instance, in Example 5, the query "are all typical PhD students young?" can be evaluated in module m2. The answer would be positive, as the property of students of being normally young is inherited by PhDStudent. The evaluation of a query in a specific module is something that is considered in context-based formalisms, such as the CKR framework (Bozzato, Eiter, and Serafini 2014), where there is a language construct eval(X, c) for evaluating a concept (or role) X in context c.
The lexicographic orders <i and <j (for i ≠ j) need not agree. For instance, in Example 5, for two domain elements x and y, we might have that x <1 y and y <2 x, as x is more typical than y as an employee, but less typical than y as a student. To answer a query T(C) ⊑ D, where C is a concept which is concerned with more than one subject in the knowledge base (e.g., are typical employed students young?), we need to combine the relations <i.
A simple way of combining the modular partial order relations <i is to use Pareto combination. Let ≤i be defined as follows: x ≤i y iff y ≮i x. As <i is a modular partial order, ≤i is a total preorder. Given a canonical multi-concept lexicographic model M = ⟨∆, <1, ..., <k, ·^I⟩ of K, we define a global preference relation < on ∆ as follows:

x < y iff (i) for some i = 1, ..., k, x <i y, and
          (ii) for all j = 1, ..., k, x ≤j y.    (∗)

The resulting relation < is a partial order but, in general, modularity does not hold for <.
Definition 12. Given a canonical multi-concept lexicographic model M = ⟨∆, <1, ..., <k, ·^I⟩ of K, the combined lexicographic interpretation of M is the triple MP = ⟨∆, <, ·^I⟩, where < is the global preference relation defined by (∗). We call MP a combined lexicographic model of K (shortly, an mcl-model of K).
Proposition 13. A combined lexicographic model MP of K is a preferential interpretation satisfying all the strict inclusions and assertions in K.
A combined lexicographic model MP of K is a preferential interpretation like those defined for ALC + T in Definition 2 (and, in general, it is not a ranked interpretation). However, the preference relation < in MP is not an arbitrary irreflexive, transitive and well-founded relation. It is obtained by first computing the lexicographic preference relations <i
for the modules, and then by combining them into <. As MP satisfies all strict inclusions and assertions in K but is not required to satisfy all typicality inclusions T(C) ⊑ D in K, MP is not a preferential ALC + T model of K as defined in Section 2.
Consider a situation in which there are two concepts, Student and YoungPerson, that are closely related, in that students are normally young persons and young persons are normally students (i.e., T(Student) ⊑ YoungPerson and T(YoungPerson) ⊑ Student), and suppose there are two modules m1 and m2 such that s(m1) = Student and s(m2) = YoungPerson. The two classes may have different (and even contradictory) prototypical properties; for instance, normally students are quiet (e.g., when they are in their classrooms), T(Student) ⊑ Quiet, but normally young persons are not quiet, T(YoungPerson) ⊑ ¬Quiet. Considering the preference relations <1 and <2 associated with the two modules in a canonical multi-concept lexicographic model, we may have that, for two young persons Bob and John, who are also students, bob <1 john and john <2 bob, as Bob is quiet and John is not. Then John and Bob are incomparable in the global relation <. Both of them, depending on the other prototypical properties of students and young persons, might be minimal among students wrt the global preference relation <. Hence, the set min<(Student^I) is not necessarily a subset of min<1(Student^I). That is, typical students in the global relation may include instances (e.g., john) which do not satisfy all the typicality inclusions for Student, as they are (globally) incomparable with the elements in min<1(Student^I). This implies that the notion of mcl-entailment (defined below) cannot be stronger than preferential entailment in Section 2. However, given the correspondence of mcl-models with the lexicographic closure in the case of a single module with subject ⊤ containing all the typicality inclusions in D, mcl-entailment cannot be weaker than preferential entailment either.
In general, for a knowledge base K and a module mi, with s(mi) = Ci, the inclusion min<(Ci^I) ⊆ min<i(Ci^I) may not hold and, for this reason, a combined lexicographic interpretation may fail to satisfy all typicality inclusions. In this respect, canonical multi-concept lexicographic models are more liberal than KLM-style preferential models for typicality logics (Giordano et al. 2009), where all the typicality inclusions are required to be satisfied and, in the previous example, min<(Student^I) ⊆ Quiet^I must hold for the typicality inclusion to be satisfied. In fact, the knowledge base above is inconsistent in the preferential semantics and has no preferential model: from T(Student) ⊑ YoungPerson and T(YoungPerson) ⊑ Student, it follows that T(Student) = T(YoungPerson) should hold in all preferential models of the knowledge base, which is impossible given the conflicting typicality inclusions T(Student) ⊑ Quiet and T(YoungPerson) ⊑ ¬Quiet.
To require that all typicality inclusions in K are satisfied in MP, the notion of mcl-model of K can be strengthened as follows.
Definition 14. A T-compliant mcl-model (or mclT-model) MP = ⟨∆, <, ·^I⟩ of K is an mcl-model of K such that all the typicality inclusions in K are satisfied in MP, i.e., for all T(C) ⊑ D ∈ D, min<(C^I) ⊆ D^I.
Observe that an mclT-model MP = ⟨∆, <, ·^I⟩ of K = ⟨T, D, m1, ..., mk, A, s⟩ is a KLM-style preferential model for the ALC + T knowledge base ⟨T ∪ D, A⟩, as defined in Section 2. The difference is that the preference relation < in an mclT-model is not an arbitrary irreflexive, transitive and well-founded relation, but is defined from the lexicographic preference relations <i according to equation (∗).
We define a notion of multi-concept lexicographic entailment (mcl-entailment) in the obvious way: a query F is mcl-entailed by K (K |=mcl F) if, for all mcl-models MP = ⟨∆, <, ·^I⟩ of K, F is satisfied in MP. Notice that a query T(C) ⊑ D is satisfied in MP when min<(C^I) ⊆ D^I. Similarly, a notion of mclT-entailment can be defined: K |=mclT F if, for all mclT-models MP = ⟨∆, <, ·^I⟩ of K, F is satisfied in MP.
As, for any multi-concept knowledge base K, the set of mclT-models of K is a subset of the set of mcl-models of K, and there is some K for which the inclusion is proper (see, for instance, the student and young person example above), mclT-entailment is stronger than mcl-entailment. It can be proved that both notions of entailment satisfy the KLM postulates of preferential consequence relations, which can be reformulated for a typicality logic, considering that typicality inclusions T(C) ⊑ D (Giordano et al. 2007) stand for conditionals C |∼ D in KLM preferential logics (Kraus, Lehmann, and Magidor 1990; Lehmann and Magidor 1992). See also (Booth et al. 2019) for the formulation of the KLM postulates in Propositional Typicality Logic (PTL).
In the following proposition, we let "T(C) ⊑ D" mean that T(C) ⊑ D is mcl-entailed by a given knowledge base K.
Proposition 15. mcl-entailment satisfies the KLM postulates of preferential consequence relations, namely:
(REFL) T(C) ⊑ C
(LLE) If A ≡ B and T(A) ⊑ C, then T(B) ⊑ C
(RW) If C ⊑ D and T(A) ⊑ C, then T(A) ⊑ D
(AND) If T(A) ⊑ C and T(A) ⊑ D, then T(A) ⊑ C ⊓ D
(OR) If T(A) ⊑ C and T(B) ⊑ C, then T(A ⊔ B) ⊑ C
(CM) If T(A) ⊑ D and T(A) ⊑ C, then T(A ⊓ D) ⊑ C
Stated differently, the set of the typicality inclusions T(C) ⊑ D that are mcl-entailed by a given knowledge base K is closed under conditions (REFL)-(CM) above. For instance, (LLE) means that if A and B are equivalent concepts in ALC and T(A) ⊑ C is mcl-entailed by a given knowledge base K, then T(B) ⊑ C is also mcl-entailed by K; similarly for the other conditions (where the inclusion C ⊑ D is entailed by K in ALC). It can be proved that mclT-entailment also satisfies the KLM postulates of preferential consequence relations.
It can be shown that neither mcl-entailment nor mclT-entailment is stronger than Lehmann's lexicographic closure in the propositional case. Let us consider again Example 5.
Example 16. Let us add another module m4 with subject
Citizen to the knowledge base K, plus the following additional axioms in T :
Italian ⊑ Citizen
French ⊑ Citizen
Canadian ⊑ Citizen
Module m4 has subject Citizen, and contains the defeasible
inclusions:
(d17 ) T(Italian) ⊑ DriveFast
(d18 ) T(Italian) ⊑ HomeOwner
Suppose the following typicality inclusion is also added to
module m2 :
(d19 ) T(PhDStudent ) ⊑ ¬HomeOwner
What can we conclude about typical Italian PhD students? We can see that neither the inclusion T(PhDStudent ⊓ Italian) ⊑ HomeOwner nor the inclusion T(PhDStudent ⊓ Italian) ⊑ ¬HomeOwner is mcl-entailed by K.
In fact, in all canonical multi-concept lexicographic models M = ⟨∆, <1, ..., <4, ·^I⟩ of K, all elements in min<2((PhDStudent ⊓ Italian)^I) (the minimal Italian PhD students wrt <2) have a scholarship, are bright and are not home owners (which are typical properties of PhD students), and have classes and are young (which are properties of students not overridden for PhD students). On the other hand, all elements in min<4((PhDStudent ⊓ Italian)^I) (i.e., the minimal Italian PhD students wrt <4) drive fast and are home owners.
As the <2-minimal and the <4-minimal PhDStudent ⊓ Italian elements are incomparable wrt <, the <-minimal Italian PhD students will include them all. Hence,

min<((PhDStudent ⊓ Italian)^I) ⊈ HomeOwner^I and
min<((PhDStudent ⊓ Italian)^I) ⊈ (¬HomeOwner)^I.
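The incomparability at work in this example can be illustrated with a small Python sketch of the Pareto combination (∗). The element and module names are hypothetical; each module's ranked preference is abstracted as a strict comparison function.

```python
# Hedged sketch (names are hypothetical): each module's ranked preference
# <_i is modelled as a strict comparison function, and the global relation
# of (*) is their Pareto combination:
# x < y iff x <_i y for some i, and y <_j x for no j (i.e., x ≤_j y for all j).

def globally_preferred(x, y, strict_orders):
    return (any(lt(x, y) for lt in strict_orders)
            and not any(lt(y, x) for lt in strict_orders))

# Two Italian PhD students, as in Example 16: one minimal wrt <2 (a typical
# PhD student, hence not a home owner), one minimal wrt <4 (a typical
# Italian citizen, hence a home owner).
lt2 = lambda a, b: (a, b) == ("typical_phd", "typical_italian")   # <_2
lt4 = lambda a, b: (a, b) == ("typical_italian", "typical_phd")   # <_4
orders = [lt2, lt4]

print(globally_preferred("typical_phd", "typical_italian", orders))  # False
print(globally_preferred("typical_italian", "typical_phd", orders))  # False
# Neither element is globally preferred: both remain <-minimal, so neither
# HomeOwner nor ¬HomeOwner holds for all <-minimal Italian PhD students.
```

Each element is strictly better in one module but strictly worse in the other, so condition (ii) of (∗) fails in both directions and the two elements are <-incomparable.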
The home owner example is a reformulation of the example used by Geffner and Pearl to show that the rational closure of conditional knowledge bases sometimes gives too strong conclusions, as "conflicts among defaults that should remain unresolved, are resolved anomalously" (Geffner and Pearl 1992). Informally, if defaults (d18) and (d19) are conflicting for Italian PhD students before the addition of any default which makes PhD students exceptional wrt students (in our formalization, default (d10)), they should remain conflicting after this addition. Instead, in the propositional case, both the rational closure (Lehmann and Magidor 1992) and Lehmann's lexicographic closure (1995) would entail that normally Italian PhD students are not home owners. This conclusion is unwanted, and is based on the fact that (d18) has rank 0, while (d19) has rank 1 in the rational closure ranking. On the other hand, T(PhDStudent ⊓ Italian) ⊑ ¬HomeOwner is neither mcl-entailed nor mclT-entailed by K. Both notions of entailment, when restricted to the propositional case, cannot be stronger than Lehmann's lexicographic closure.
Geffner and Pearl's Conditional Entailment (1992) does not suffer from the above mentioned problem, as it is based on (non-ranked) preferential models. The same problem, which is related to the representation of preferences as levels of reliability, was also recognized by Brewka (1989) in his logical framework for default reasoning, leading to a generalization of the approach that allows a partial ordering between premises. The example above shows that our approach, which uses ranked preferences for the single modules but a non-ranked global preference relation < for their combination, does not suffer from this problem, provided a suitable modularization is chosen (in the example above, obtained by separating the typical properties of Italians and those of students in different modules).

6 Further issues: Reasoning with a hierarchy of modules and user-defined preferences

The approach considered in Section 4 does not allow reasoning with a hierarchy of modules: it considers a flat collection of modules m1, ..., mk, each module concerning some subject Ci. As we have seen, a module mi may contain defeasible inclusions referring to subclasses of Ci, such as PhDStudent in the case of module m2 with subject Student. When defining the preference relation <i, the lexicographic closure semantics already takes into account the specificity relation among concepts within the module (e.g., the fact that PhDStudent is more specific than Student).
However, nothing prevents us from defining two modules mi (with subject Ci) and mj (with subject Cj) such that concept Cj is more specific than concept Ci. For instance, as a variant of Example 5, we might have introduced two different modules: m2 with subject Student and m5 with subject PhDStudent. As concept PhDStudent is more specific than concept Student (in particular, PhDStudent ⊑ Student is entailed in ALC from the strict part T of the knowledge base), the specificity information should be taken into account when combining the preference relations. More precisely, preference <5 should override preference <2 when comparing PhDStudent instances. This is the principle followed by Giordano and Theseider Dupré (2020) to define a global preference relation, in the case when each module with subject Ci only contains typicality inclusions of the form T(Ci) ⊑ D. There, a more sophisticated way of combining the preference relations <i into a global relation <, refining the Pareto combination by exploiting the specificity relation among concepts, is used to deal with this case. While we refer to that work for a detailed description of this more sophisticated notion of preference combination, let us observe that this solution could be applied as well to the modular multi-concept knowledge bases considered in this paper, provided an irreflexive and transitive notion of specificity among modules is defined.
Another aspect that has been considered in the previously mentioned paper is the possibility of assigning ranks to the defeasible inclusions associated with a given concept. While assigning a rank to all typicality inclusions in the knowledge base may be awkward, people often have a clear idea about the relative importance of the properties of some specific concept. For instance, we may know that the defeasible property that students are normally young is more important than the property that students normally do not have a scholarship. For small modules, which only contain typicality inclusions T(Ci) ⊑ D for a concept Ci, the specification of user-defined ranks of the Ci's typical properties is a feasible option, and a ranked modular preference relation can be defined
from it, by using Brewka’s # strategy from his framework of
Basic Preference Descriptions for ranked knowledge bases
(Brewka 2004). This alternative may coexist with the use
of the lexicographic closure semantics built from the rational closure ranking for larger modules. A mixed approach,
integrating user-specified preferences with the rational closure ranking for the same module, might be an interesting
alternative. This integration, however, does not necessarily
provide a total preorder among typicality inclusions, which
is our starting point for defining the modular preferences <i
and their combination. Alternative semantic constructions
should be considered for dealing with this case.
Depending on the choice of fine-grained or coarse-grained modules, on the choice of the preferential semantics for each module (e.g., based on a user-specified ranking, on Lehmann's lexicographic closure, or on the rational closure), and on the presence of a specificity relation among modules, alternative preferential semantics for modularized multi-concept knowledge bases can emerge.
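The specificity-based overriding discussed above can be sketched in Python. This is a hedged illustration, not the construction of Giordano and Theseider Dupré (2020): modules are represented as hypothetical (subject instances, strict order) pairs, specificity is stood in for by the proper-subset relation on instance sets, and the surviving preferences are Pareto-combined as in (∗).

```python
# Hedged sketch, not the actual construction of Giordano and Theseider
# Dupré (2020): when the subject of one module is strictly more specific
# than that of another (e.g., PhDStudent vs Student), the more specific
# module's preference overrides the more general one on instances of the
# more specific subject; the surviving preferences are then Pareto-combined.

def combine_with_specificity(x, y, modules):
    votes = []
    for i, (inst_i, lt_i) in enumerate(modules):
        # Module i is overridden on (x, y) if some strictly more specific
        # module (proper subset of instances) also covers both elements.
        overridden = any(inst_j < inst_i and x in inst_j and y in inst_j
                         for j, (inst_j, _) in enumerate(modules) if j != i)
        if not overridden:
            votes.append(lt_i)
    return (any(lt(x, y) for lt in votes)
            and not any(lt(y, x) for lt in votes))

students = {"ann", "bob", "carl"}   # instances of the subject of m2: Student
phds = {"ann", "bob"}               # instances of the subject of m5: PhDStudent
lt_student = lambda u, v: (u, v) == ("bob", "ann")   # <_2 prefers bob
lt_phd = lambda u, v: (u, v) == ("ann", "bob")       # <_5 prefers ann

modules = [(students, lt_student), (phds, lt_phd)]
print(combine_with_specificity("ann", "bob", modules))  # True: <_5 overrides <_2
```

Without the overriding step, the two modules would disagree on ann and bob and leave them incomparable; with it, the more specific module m5 settles the comparison.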
7 Conclusions and related work

In this paper, we have proposed a modular multi-concept extension of the lexicographic closure semantics, based on the idea that the defeasible properties in the knowledge base can be distributed over different modules, for which alternative preference relations can be computed. Combining the multiple preferences into a single global preference allows a new preferential semantics and a notion of multi-concept lexicographic entailment (mcl-entailment) which, in the propositional case, is not stronger than the lexicographic closure.
mcl-entailment satisfies the KLM postulates of a preferential consequence relation. It retains some good properties of the lexicographic closure, being able to deal with irrelevance and with specificity within the single modules, and not being subject to the "blockage of property inheritance" problem. The combination of different preference relations provides a simple solution to a problem, recognized by Geffner and Pearl, that the rational closure of conditional knowledge bases sometimes gives too strong conclusions, as "conflicts among defaults that should remain unresolved, are resolved anomalously" (Geffner and Pearl 1992). This problem also affects the lexicographic closure, which is stronger than the rational closure. Our approach, which uses ranked preferences for the single modules but a non-ranked preference < for their combination, does not suffer from this problem, provided a suitable modularization is chosen. Like Geffner and Pearl's Conditional Entailment (Geffner and Pearl 1992), some non-monotonic DLs, such as ALC + Tmin, a typicality DL with a minimal model preferential semantics (Giordano et al. 2013a), and the non-monotonic description logic DLN (Bonatti et al. 2015), which supports normality concepts based on a notion of overriding, do not suffer from the problem above.
Reasoning about exceptions in ontologies has led to the development of many non-monotonic extensions of Description Logics (DLs), incorporating non-monotonic features from most of the NMR formalisms in the literature. In addition to those already mentioned in the introduction, let us recall the work by Straccia on inheritance reasoning in hybrid KL-One style logics (1993), the work on defaults in DLs (Baader and Hollunder 1995), on description logics of minimal knowledge and negation as failure (Donini, Nardi, and Rosati 2002), on circumscriptive DLs (Bonatti, Lutz, and Wolter 2009; Bonatti, Faella, and Sauro 2011), the generalization of rational closure to all description logics (Bonatti 2019), as well as the combination of description logics and rule-based languages (Eiter et al. 2008; Eiter et al. 2011; Motik and Rosati 2010; Knorr, Hitzler, and Maier 2012; Gottlob et al. 2014; Giordano and Theseider Dupré 2016; Bozzato, Eiter, and Serafini 2018).
Our multi-preference semantics is related to the multi-preference semantics for ALC developed by Gliozzi (2016), which is based on the idea of refining the rational closure construction by considering the preference relations <Ai associated with different aspects; however, we follow a different route concerning the definition of the preference relations associated with modules, and the way of combining them into a single preference relation. In particular, defining a refinement of the rational closure semantics is not our aim in this paper, as we prefer to avoid some unwanted conclusions of the rational and lexicographic closure while exploiting their good inference properties.
The idea of having different preference relations, associated with different typicality operators, has been studied by Gil (2014) to define a multipreference formulation of the typicality DL ALC + Tmin, mentioned above. In contrast, in this proposal we associate preferences with modules and their subjects, and we combine the different preferences into a single global one. An extension of DLs with multiple preferences has also been developed by Britz and Varzinczak (2018; 2019) to define defeasible role quantifiers and defeasible role inclusions, by associating multiple preference relations with roles.
The relation of our semantics to the lexicographic closure for ALC by Casini and Straccia (2010; 2013) should be investigated. A major difference is in the choice of the rational closure ranking for ALC, but it would be interesting to check whether their construction corresponds to our semantics in the case of a single module m1 with subject ⊤, when the same rational closure ranking is used.
Bozzato, Eiter, and Serafini (2014; 2018) present extensions of the CKR (Contextualized Knowledge Repositories) framework in which defeasible axioms are allowed in the global context and exceptions can be handled by overriding and have to be justified in terms of semantic consequence, considering sets of clashing assumptions for each defeasible axiom. An extension of this approach to deal with general contextual hierarchies has been studied by the same authors (Bozzato, Eiter, and Serafini 2019), by introducing a coverage relation among contexts and defining a notion of preference among clashing assumptions, which is used to define a preference relation among justified CAS models, based on which CKR models are selected. An ASP-based reasoning procedure, which is complete for instance checking, is developed for SROIQ-RL.
For the lightweight description logic EL+⊥, an Answer Set Programming (ASP) approach has been proposed (Giordano and Theseider Dupré 2020) for defeasible inference in a multipreference extension of EL+⊥, in the specific case in which each module only contains the defeasible inclusions T(Ci) ⊑ D for a single concept Ci, where the ranking of defeasible inclusions is specified in the knowledge base, following the approach by Gerhard Brewka in his framework of Basic Preference Descriptions for ranked knowledge bases (Brewka 2004). A specificity relation among concepts is also considered. The ASP encoding exploits asprin (Brewka et al. 2015), by formulating multipreference entailment as a problem of computing preferred answer sets, which is proved to be Π^p_2-complete. For EL+⊥ knowledge bases, we aim at extending this ASP encoding to deal with the modular multi-concept lexicographic closure semantics proposed in this paper, as well as with a more general framework allowing for different choices of preferential semantics for the single modules and for different specificity relations for combining them. For lightweight description logics of the EL family (Baader, Brandt, and Lutz 2005), the ranking of concepts determined by the rational closure construction can be computed in polynomial time in the size of the knowledge base (Giordano and Theseider Dupré 2018; Casini, Straccia, and Meyer 2019). This suggests that we may expect a Π^p_2 upper bound on the complexity of multi-concept lexicographic entailment.
Booth, R.; Casini, G.; Meyer, T.; and Varzinczak, I. 2019.
On rational entailment for propositional typicality logic. Artif. Intell. 277.
Bozzato, L.; Eiter, T.; and Serafini, L. 2014. Contextualized
knowledge repositories with justifiable exceptions. In DL
2014, volume 1193 of CEUR Workshop Proceedings, 112–
123.
Acknowledgement: We thank the anonymous referees
for their helpful comments and suggestions. This research
is partially supported by INDAM-GNCS Project 2019.
An Approximate Model Counter for ASP∗
Flavio Everardo¹,², Markus Hecher¹,³, Ankit Shukla⁴
¹University of Potsdam, Germany
²Tecnológico de Monterrey Puebla Campus, Mexico
³TU Wien, Vienna, Austria
⁴JKU, Linz, Austria
flavio.everardo@cs.uni-potsdam.de, mhecher@gmail.com, ankit.shukla@jku.at
Abstract

Answer Set Programming (ASP) is a declarative framework that is well-suited for problems in KR, AI, and other areas, with plenty of practical applications in both academia and industry. While modern ASP solvers not only compute one solution (answer set) but support different (reasoning) problems, the problem of counting answer sets has not been the subject of intense study. This is in contrast to the neighboring area of propositional satisfiability (SAT), where several applications and problems related to quantitative reasoning trace back to model counting. However, due to the high computational complexity, and depending on the actual application, approximate counting might be sufficient. Indeed, there are plenty of applications where approximate counting for SAT is well-suited. This work establishes approximate model counting for ASP, thereby lifting ideas from SAT to ASP. We present the first approximate counter for ASP by extending the clingo-based system xorro, and we show preliminary experiments for several problems. While we do not have specific guarantees in terms of accuracy, our preliminary results look promising.

1 Introduction

Answer Set Programming (ASP) (Lifschitz 1999; Brewka, Eiter, and Truszczyński 2011; Gebser et al. 2012) is a problem modeling and solving framework that is well-known in the areas of knowledge representation and reasoning and artificial intelligence. The framework has been applied in practice to several problems in both academia and industry (Balduccini, Gelfond, and Nogueira 2006; Niemelä, Simons, and Soininen 1999; Nogueira et al. 2001; Guziolowski et al. 2013; Schaub and Woltran 2018).1

Recently, there has been growing interest in counting solutions to problems. Indeed, counting solutions is a well-known task not only in mathematics and computer science, but also in other areas (Chakraborty, Meel, and Vardi 2016; Domshlak and Hoffmann 2007; Gomes, Sabharwal, and Selman 2009; Sang, Beame, and Kautz 2005). Examples also cover applications in machine learning and probabilistic inference (Chavira and Darwiche 2008). In terms of computational complexity, counting has been well-studied since the late 70s (Durand, Hermann, and Kolaitis 2005; Hemaspaandra and Vollmer 1995; Valiant 1979b; 1979a). There are also results for counting involving projection, where one wants to count only with respect to a given set of projected atoms; such results have been established for logic (Aziz et al. 2015; Capelli and Mengel 2019; Fichte et al. 2018; Lagniez and Marquis 2019; Gupta et al. 2019; Sharma et al. 2019), for reliability estimation (Dueñas-Osorio et al. 2017), as well as for ASP (Gebser, Kaufmann, and Schaub 2009; Aziz 2015; Fichte and Hecher 2019).

Given that counting answer sets is rather hard in general, namely # · coNP-complete (Fichte et al. 2017; Durand, Hermann, and Kolaitis 2005), and that the complexity further increases to # · Σ^P_2-completeness (Fichte and Hecher 2019) when counting with respect to a projection, an approach different from exact counting seems to be required in practice. Indeed, such approaches have been successful for propositional logic (SAT), whose counting problem is # · P-complete and where reasoning modes like sampling (near-uniform generation) and (approximate) model counting (Gomes, Sabharwal, and Selman 2007a; Chakraborty, Meel, and Vardi 2013a; 2013b; Sharma et al. 2019) have been studied. For this purpose, so-called parity (XOR) constraints are used to partition the search space into parts that are preferably of roughly the same size.

Parity constraints have recently been accommodated in ASP as the fundamental part of the clingo-based system xorro (Everardo et al. 2019). Across the different solving approaches of xorro, these constraints amount to the classical XOR operator, following an aggregate-like syntax based on theory atoms. The constraints are interpreted as directives solved on top of an ASP program, acting as filters on answer sets that do not satisfy the parity constraint in question.

With most applications of XOR constraints residing in the neighboring area of SAT (Meel 2018), only little attention has been paid to parity constraints, and to reasoning modes like sampling or approximate model counting, for ASP. To this end, we present an extension of xorro towards approximate answer set counting, following the work of Chakraborty, Meel, and Vardi (2013b) and benefiting from the advanced interfaces of the ASP solver clingo (Gebser et al. 2016) and the sophisticated solving techniques developed in SAT (e.g., the award-winning solver crypto-minisat (Soos, Nohl, and Castelluccia 2009)). While we do not yet have theoretical guarantees on the accuracy of our approach in general, the results look promising and we hope that this will foster further research in approximate answer set counting.

∗ The work has been supported by the Austrian Science Fund (FWF), Grants Y698 and P32830, and the Vienna Science and Technology Fund, Grant WWTF ICT19-065. It is also accepted for presentation at the ASPOCP'20 workshop (Everardo, Hecher, and Shukla 2020).
1 An incomplete but vast list of ASP applications: https://www.dropbox.com/s/pe261e4qi6bcyyh/aspAppTable.pdf
2 Preliminaries

Computational Complexity. We assume familiarity with standard notions in computational complexity (Papadimitriou 1994) and use counting complexity classes of the form # · C as defined in the literature (Toda and Watanabe 1992; Durand, Hermann, and Kolaitis 2005; Hemaspaandra and Vollmer 1995). Let Σ and Σ′ be finite alphabets, I ∈ Σ∗ an instance, and let ‖I‖ denote the size of I. A witness function W : Σ∗ → 2^(Σ′∗) maps an instance I ∈ Σ∗ to its witnesses. A counting problem L : Σ∗ → N₀ is a function that maps a given instance I ∈ Σ∗ to the cardinality of its witnesses, |W(I)|. Let C be a decision complexity class, e.g., P. Then # · C denotes the class of all counting problems whose witness function W satisfies: (i) there is a function f : N₀ → N₀ such that for every instance I ∈ Σ∗ and every W ∈ W(I) we have |W| ≤ f(‖I‖), and f is computable in time O(‖I‖^c) for some constant c; and (ii) for every instance I ∈ Σ∗ and every candidate witness W ∈ Σ′∗, the problem of deciding whether W ∈ W(I) holds is in the complexity class C. As a result, # · P is the complexity class consisting of all counting problems associated with decision problems in NP.

Answer Set Programming (ASP). We assume familiarity with propositional satisfiability (SAT) (Kleine Büning and Lettman 1999) and follow standard definitions of propositional ASP (Brewka, Eiter, and Truszczyński 2011). Let m, n, ℓ be non-negative integers such that m ≤ n ≤ ℓ, and let a1, . . ., aℓ be distinct propositional atoms. Moreover, a literal is an atom or the negation thereof. A (logic) program Π is a set of rules of the form a1 ∨ · · · ∨ am ← am+1, . . ., an, ¬an+1, . . ., ¬aℓ. For brevity, we use choice rules (Simons, Niemelä, and Soininen 2002) of the form {a} ←, which is a shortcut for the two rules a ← ¬a′ and a′ ← ¬a, where a′ is a fresh atom. For a rule r, we let Hr = {a1, . . ., am}, Br+ = {am+1, . . ., an}, and Br− = {an+1, . . ., aℓ}. We denote the sets of atoms occurring in a rule r or in a program Π by at(r) = Hr ∪ Br+ ∪ Br− and at(Π) = ⋃_{r∈Π} at(r).

An interpretation I is a set of atoms. I satisfies a rule r if (Hr ∪ Br−) ∩ I ≠ ∅ or Br+ \ I ≠ ∅. I is a model of Π if it satisfies all rules of Π, in symbols I |= Π. The Gelfond-Lifschitz (GL) reduct of Π under I is the program Π^I obtained from Π by first removing all rules r with Br− ∩ I ≠ ∅ and then removing all ¬z with z ∈ Br− from every remaining rule r (Gelfond and Lifschitz 1991). I is an answer set of a program Π if I is a minimal model of Π^I. We denote the set of all answer sets of program Π by Sol(Π). The problem of deciding whether an ASP program has an answer set is called consistency, which is Σ^P_2-complete (Eiter and Gottlob 1995). If no rule uses disjunction, the complexity drops to NP-completeness (Bidoît and Froidevaux 1991; Marek and Truszczyński 1991).

Example 1 Assume a graph G consisting of vertices V = {a, b, c, d} and edges E = {{a, b}, {a, c}, {b, c}, {c, d}}. Then the vertex cover problem asks for a set S ⊆ V of vertices such that for each edge e ∈ E we have S ∩ e ≠ ∅. An extension is the subset-minimal vertex cover problem, where we ask only for sets S such that no subset S′ ⊊ S is a vertex cover of G. We elegantly encode the computation of subset-minimal vertex covers into an ASP program Π as follows: for each edge {u, v} ∈ E, program Π contains the rule u ∨ v ←. Observe that the resulting program Π indeed precisely characterizes the subset-minimal vertex covers of G, which are {a, c}, {b, c}, and {a, b, d}.

Answer Set Counting (#ASP). The problem #ASP asks, for a given program Π, to compute the number of answer sets of Π. In general, #ASP is # · coNP-complete (Fichte et al. 2017). If we restrict #ASP to normal programs without disjunction, the complexity drops to # · P-completeness, which is easy to see via standard reductions from and to propositional satisfiability (SAT) that preserve the number of answer sets; see, e.g., (Janhunen 2006).

Example 2 Recall graph G and the example program Π above. Since G has 3 subset-minimal vertex covers, the solution to #ASP for the program Π is 3.
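The definitions above can be made concrete with a small brute-force sketch of the GL reduct and the minimality check, run on the program of Example 1. This is an illustrative toy, not part of xorro; the rule encoding and the helper names (`gl_reduct`, `answer_sets`) are ours:

```python
from itertools import chain, combinations

# A rule is (head, pos_body, neg_body), each a frozenset of atom names.
# The program of Example 1: one disjunctive fact  u | v <-  per edge.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
rules = [(frozenset(e), frozenset(), frozenset()) for e in edges]

def gl_reduct(rules, I):
    """GL reduct: drop rules whose negative body intersects I,
    then drop the negative bodies of the remaining rules."""
    return [(h, pb) for h, pb, nb in rules if not (nb & I)]

def is_model(pos_rules, I):
    # I satisfies  h <- pb  iff  pb subset of I  implies  h meets I.
    return all(not pb <= I or h & I for h, pb in pos_rules)

def answer_sets(rules):
    atoms = sorted(set(chain.from_iterable(h | pb | nb for h, pb, nb in rules)))
    subsets = [frozenset(c) for r in range(len(atoms) + 1)
               for c in combinations(atoms, r)]
    result = []
    for I in subsets:
        red = gl_reduct(rules, I)
        # I is an answer set iff I is a minimal model of the reduct.
        if is_model(red, I) and not any(J < I and is_model(red, J)
                                        for J in subsets):
            result.append(set(I))
    return result

sols = answer_sets(rules)  # the 3 subset-minimal vertex covers of G
```

Running it yields exactly the answer sets named in Example 1, so |Sol(Π)| = 3, matching Example 2.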
3 Parity Constraints, xorro, and Search Space Partition
Towards the definition of parity constraints, let ⊤ and ⊥ stand for the Boolean constants true and false, respectively. Given atoms a1 and a2, the exclusive or (XOR, for short) of a1 and a2 is denoted by a1 ⊕ a2, and it is satisfied if either a1 or a2 is true (but not both). Generalizing this idea to n distinct atoms a1, . . ., an, we obtain an n-ary XOR constraint (((a1 ⊕ a2) . . .) ⊕ an) by multiple applications of ⊕. Since it is satisfied iff an odd number of atoms among a1, . . ., an are true, we refer to it as an odd parity constraint, and by associativity it can be written simply as a1 ⊕ . . . ⊕ an. Analogously, an even parity (or XOR) constraint is defined as a1 ⊕ . . . ⊕ an ⊕ ⊤, as it is satisfied iff an even number of atoms among a1, . . ., an hold. For example, a1 ⊕ a2 ⊕ ⊤ is satisfied iff none or both of a1 and a2 hold. Similarly, an even parity constraint can be represented in terms of an odd parity constraint with an uneven number of negated literals. For instance, ¬a1 ⊕ a2 is equivalent to a1 ⊕ a2 ⊕ ⊤, and pairs of negated literals cancel the parity inversion; for example, ¬a1 ⊕ ¬a2 is equivalent to a1 ⊕ a2. Finally, XOR constraints of the forms a ⊕ ⊥ and a ⊕ ⊤ are called unary.
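These equivalences are easy to sanity-check mechanically. The following tiny sketch is our own illustration (the helper `parity_holds` is hypothetical and is not xorro's implementation); it verifies both equivalences over all four interpretations of {a1, a2}:

```python
def parity_holds(literals, interp, kind="odd"):
    """literals: (atom, positive?) pairs; interp: set of true atoms.
    An odd (even) parity constraint holds iff an odd (even) number
    of its literals are true under interp."""
    true_literals = sum((atom in interp) == positive
                       for atom, positive in literals)
    return true_literals % 2 == (1 if kind == "odd" else 0)

all_interps = [set(), {"a1"}, {"a2"}, {"a1", "a2"}]

# not a1 XOR a2  is equivalent to the even parity constraint  a1 XOR a2 XOR true:
for interp in all_interps:
    lhs = parity_holds([("a1", False), ("a2", True)], interp, "odd")
    rhs = parity_holds([("a1", True), ("a2", True)], interp, "even")
    assert lhs == rhs

# not a1 XOR not a2  is equivalent to  a1 XOR a2  (pairs of negations cancel):
for interp in all_interps:
    assert parity_holds([("a1", False), ("a2", False)], interp, "odd") == \
           parity_holds([("a1", True), ("a2", True)], interp, "odd")
```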
To accommodate parity constraints in ASP’s input language, we rely on clingo’s theory language extension (Gebser
et al. 2016) following the common syntax of aggregates (Gebser et al. 2015):
&even{1:p(1);4: not p(4);5:p(5)}.
&odd{2:not p(2);5: not p(5);6:p(6)}.
&odd{X:p(X), X > 2}.
&even{X:p(X), X < 3}.
&odd{5:p(5)}.

That is, xorro extends the input language of clingo by the aggregate names &even and &odd, which are followed by a set whose elements are terms conditioned by conjunctions of literals separated by commas.2 For a search space of 64 answer sets induced by the choice rule {p(1..6)}., the parity constraints shown above amount to the XOR operations p(1) ⊕ p(4) ⊕ p(5), p(2) ⊕ p(5) ⊕ p(6), p(3) ⊕ p(4) ⊕ p(5) ⊕ p(6), p(1) ⊕ p(2) ⊕ ⊤, and p(5) ⊕ ⊥, yielding the answer sets {p(5)} and {p(1),p(2),p(4),p(5),p(6)}. This means that each parity constraint divides the search space (roughly) in half, and for this symmetric search space example with five parity constraints (m = 5), we obtain 2^m cells, each with two answer sets.

Currently, these constraints are interpreted as directives, filtering out answer sets that do not satisfy the parity constraint in question.3 Hence, the first two constraints hold for any combination of an uneven number of literals from p(1), p(4), p(5) and from p(2), p(5), p(6), respectively. The third constraint holds if an odd number of the atoms from p(3) to p(6) are true, while the fourth constraint requires that either none or both of the atoms p(1) and p(2) are included. The last parity constraint discards any answer set not containing the atom p(5).

The solver xorro handles parity constraints in six different ways, switching between ASP encodings of parity constraints (eager approaches) and the use of theory propagators within clingo's Python interface (lazy approaches).4

4 Universal Hashing and XORs

One approach to solving #ASP is to count all the answer sets by (almost complete) enumeration, known as exact counting. We focus instead on the problem of approximate model counting. In applications of model counting such as probabilistic reasoning or planning with uncertainty, it may be sufficient to approximate the solution count and avoid the overhead of exact model counting. Approximate counting tries to estimate the number of answer sets using a probabilistic algorithm ApproxASP(Π, ε, δ). The algorithm takes as input a program Π, a tolerance ε > 0, and a confidence 0 < δ ≤ 1. The output is an estimate mc based on the parameters ε and δ. Proving theoretical bounds and adapting the algorithm accordingly is ongoing work.

The central idea of the approximate model counting approach is to use hash functions to partition the set Sol(Π) of answer sets of a program Π into roughly equal, small cells. One then picks a random cell and scales its size by the number of cells to obtain an ε-approximate estimate of the model count. Note that a priori we do not know the distribution of the solution set Sol(Π). We have to perform good hashing without knowledge of the distribution of the answer sets, i.e., partition Sol(Π) into cells with a roughly equal number of answer sets. This difficulty is resolved by the use of universal hashing (Carter and Wegman 1977).

Universal hashing selects a hash function randomly from a family of hash functions with a low probability of collision between any two keys. By choosing a random hash function from a family, we randomize over arbitrary input distributions. This guarantees a low number of collisions in expectation, irrespective of how the data is chosen.

Definition 1 A family of functions H = {h : U → [m]} is called a universal family if, for all x, y ∈ U with x ≠ y,
Pr_{h∈H} [h(x) = h(y)] ≤ 1/m.

That is, if the hash function h is drawn randomly from H, the probability that any two keys of the universe U collide is at most 1/m. This collision probability is what we would expect if the hash function assigned a truly random hash code h(x) to every key x ∈ U.

For model counting with strong guarantees this is not enough: we need every input not only to be hashed uniformly but also independently. We need a family of k-wise independent hash functions (Wegman and Carter 1979). A random hash function h is k-wise independent if for all choices of distinct x1, . . ., xk the values h(x1), . . ., h(xk) are independent.

Definition 2 Let H = {h : U → [m]} be a family of hash functions. We call H k-wise independent if for any distinct x1, . . ., xk ∈ U and any y1, . . ., yk ∈ [m] we have
Pr_{h∈H} [h(x1) = y1 ∧ . . . ∧ h(xk) = yk] = 1/m^k.

Intuitively, a family H = {h : U → [m]} of hash functions is k-wise independent if for any distinct keys x1, . . ., xk ∈ U the hash codes h(x1), . . ., h(xk) are independent random variables and, for any fixed x, h(x) is uniformly distributed in [m]. Let n, m, and k be positive integers; we use H(n, m, k) to denote a family of k-wise independent hash functions mapping {0, 1}^n to {0, 1}^m.

The canonical construction of a k-wise independent family is based on polynomials of degree k − 1. Let p ≥ |U| be prime. Picking random a0, . . ., ak−1 ∈ {0, . . ., p − 1}, the hash function is defined by:
h(x) = ((ak−1 x^{k−1} + . . . + a1 x + a0) mod p) mod m.
For p ≫ m, the hash function is statistically close to k-wise independent. The 2-wise independent hash function is:
h(x) = ((a1 x + a0) mod p) mod m.
The higher the k, the stronger the guarantee on the range of cell sizes. Encoding k-wise independence requires a polynomial of degree k − 1, but the higher the k, the harder it is to solve the formula extended with these constraints. To balance this trade-off, we use 3-wise independent hash functions. With full independence, all cells would be small and we would obtain uniform generation; with 3-wise independence, a random cell is small with high probability, which is known as almost-uniform generation.
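The polynomial construction above is easy to experiment with. The sketch below (the function name `draw_poly_hash` is ours, and this is the polynomial family, not the XOR-based family Hxor used later) draws fresh 2-wise independent hash functions and checks the universal-family collision bound of Definition 1 empirically:

```python
import random

def draw_poly_hash(p, m, k, rng):
    """h(x) = ((a_{k-1} x^{k-1} + ... + a_1 x + a_0) mod p) mod m,
    with a_0, ..., a_{k-1} drawn uniformly from {0, ..., p-1}."""
    coeffs = [rng.randrange(p) for _ in range(k)]
    def h(x):
        acc = 0
        for a in reversed(coeffs):  # Horner's scheme, mod p
            acc = (acc * x + a) % p
        return acc % m
    return h

rng = random.Random(0)
p, m, trials = 10007, 8, 20000   # p >= |U| prime, m cells
collisions = 0
for _ in range(trials):
    h = draw_poly_hash(p, m, 2, rng)  # a fresh 2-wise independent function
    collisions += h(3) == h(7)        # do two fixed distinct keys collide?
rate = collisions / trials            # should be close to 1/m = 0.125
```

The observed collision rate for the two fixed keys stays near 1/m, as Definition 1 requires for a universal family.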
2 In turn, multiple conditional terms within an aggregate are separated by semicolons.
3 XOR constraints can occur neither in the bodies nor in the heads of rules.
4 The distinction between eager and lazy approaches follows the methodology of Satisfiability Modulo Theories (Barrett et al. 2009).
Algorithm 1: ApproxASP
1  ApproxASP(Π, ε, δ)
   Result: Approximate number of answer sets or ⊥
2  counter = 0; C = [ ]
3  pivot = 2 × ⌈3e^{1/2} (1 + 1/ε)^2⌉
4  S = xorro(Π, pivot + 1)
5  if |S| ≤ pivot then
6      return |S|
7  else
8      iter = ⌈27 log2(3/δ)⌉
9      while counter < iter do
10         c = CountAS(Π, S, pivot)
11         counter = counter + 1
12         if c ≠ ⊥ then
13             AddToList(C, c)
14 return FindMean(C)

Algorithm 2: CountAS(Π, S, pivot)
1  /* Assume z1, . . ., zn are the atoms of Π */
2  i, l = ⌊log2(pivot) − 1⌋
3  while ¬(1 ≤ |S| ≤ pivot) ∧ i < n do
4      i = i + 1
5      h ← Hxor(n, i − l, 3)
6      α ← {0, 1}^{i−l}
7      S = xorro(Π ∧ (h(z1, . . ., zn) = α), pivot + 1)
8  if |S| > pivot or |S| = 0 then
9      return ⊥
10 else
11     return |S| · 2^{i−l}

5 Algorithm

We adapt an algorithm already evolved in the setting of propositional satisfiability (Chakraborty, Meel, and Vardi 2013b) and lift it to ASP. For our work we use a specific family of hash functions, denoted by Hxor(n, m, 3), to partition the set of models of an input formula into "small" cells. This family of hash functions has been used in (Gomes, Sabharwal, and Selman 2006; Chakraborty, Meel, and Vardi 2013b) and is shown to be 3-wise independent in (Gomes, Sabharwal, and Selman 2007b). Please refer to (Gomes, Sabharwal, and Selman 2006; 2007b; Chakraborty, Meel, and Vardi 2013b) for details. Our resulting system extends xorro (Everardo et al. 2019) and is called xampler.5

We assume that xampler has access to xorro (Everardo et al. 2019), which takes as input an ASP program Π′, possibly in conjunction with XOR constraints, as well as a bound b ≥ 0. The function xorro(Π′, b) returns a set S of models of Π′ such that |S| = min(b, #Π′).

The system xampler implements algorithm ApproxASP, which takes as input an ASP program Π, a tolerance ε (0 < ε ≤ 1), and a confidence δ (0 < δ ≤ 1). It computes a threshold pivot that depends on ε and determines the chosen size of a small cell. Then it checks whether the input program Π has more than pivot answer sets, using xorro to ask for at least b = pivot + 1 of them. If the returned set S of answer sets of Π satisfies |S| ≤ pivot, the algorithm returns the answer set count |S|. Otherwise, the algorithm continues and calculates a parameter iter (≥ 1) that determines the number of times CountAS is invoked; note that iter depends only on δ. Next, at most iter calls are made to CountAS. The resulting non-⊥ estimates of the ASP model count c returned by CountAS are appended to the list C. The final estimate of the model count returned by ApproxASP is the mean of the estimates stored in C, computed using FindMean(C).

Algorithm CountAS takes as input an ASP program Π and a pivot. It returns an ε-approximate estimate of the model count of the program Π. The algorithm uses random hash functions from Hxor(n, i − l, 3) to partition the model space of the program Π. This is done by choosing a random hash function h (line 5) and drawing the bit vector α uniformly at random (line 6). Then it conjoins the chosen XORs with the program Π and uses xorro to check whether the result has at most pivot + 1 models. This process repeats (lines 5-7), and the loop terminates if either a randomly chosen cell is found to be small (|S| ≤ pivot) and non-empty, or if the number of cells generated exceeds 2^{n+1}/pivot.

We scale the size of S by the number of cells generated by the corresponding hash function to compute an estimate of the model count. If all randomly chosen cells were either empty or not small, we return ⊥ and report a counting error.

6 Experiments

To test the algorithms above, we benchmarked our resulting approximate counter xampler, which extends xorro by these algorithms. For now, we focus only on the quality of the counting, leaving scalability and performance for future work. To test the quality of the counting, we generated ten random instances from different ASP problem classes, where we aim at counting graph colorings, subset-minimal vertex covers, solutions (witnesses) to the Schur decision problem, Hamiltonian paths, and subset-minimal independent dominating sets, as well as solving projected model counting on 2-QBFs (Durand, Hermann, and Kolaitis 2005; Kleine Büning and Lettman 1999).6 Note that projected model counting on 2-QBFs is proven to be # · coNP-complete (Durand, Hermann, and Kolaitis 2005). Further, we suspect that both counting all subset-minimal vertex covers and counting all subset-minimal independent dominating sets are hard for this complexity class. At least, there are no known polynomial encodings for SAT

5 The system xampler is open-source and available at https://github.com/flavioeverardo/xampler/.
6 The encodings and instances can be found at: https://tinyurl.com/approx-asp
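To make the interplay of Algorithms 1 and 2 concrete, the following self-contained sketch replaces xorro by a brute-force filter over an explicitly given model set, and simple random XOR constraints over bit vectors stand in for the Hxor family; the names (`random_xor`, `count_cell`, `approx_count`) are ours, while the constants follow lines 3 and 8 of Algorithm 1:

```python
import math
import random

def random_xor(n, rng):
    """A random parity constraint over n bits: the parity of a random
    subset of bit positions, flipped by a random constant."""
    mask, flip = rng.getrandbits(n), rng.getrandbits(1)
    return lambda x: (bin(x & mask).count("1") + flip) % 2

def count_cell(models, n, pivot, rng):
    """CountAS analogue: add random XORs until a cell is small and
    non-empty, then scale its size by the number of cells."""
    i = l = int(math.log2(pivot) - 1)      # line 2 of Algorithm 2
    cell = list(models)
    while not (1 <= len(cell) <= pivot) and i < n:
        i += 1
        xors = [random_xor(n, rng) for _ in range(i - l)]
        cell = [m for m in models if all(h(m) == 0 for h in xors)]
    if 0 < len(cell) <= pivot:
        return len(cell) * 2 ** (i - l)
    return None                            # counting error (bottom)

def approx_count(models, n, eps, delta, rng):
    pivot = 2 * math.ceil(3 * math.e ** 0.5 * (1 + 1 / eps) ** 2)
    if len(models) <= pivot:               # few enough models: exact count
        return len(models)
    iters = math.ceil(27 * math.log2(3 / delta))
    estimates = [c for c in (count_cell(models, n, pivot, rng)
                             for _ in range(iters)) if c is not None]
    return sum(estimates) / len(estimates) if estimates else None

rng = random.Random(1)
n = 10
models = range(2 ** n)  # pretend every 10-bit vector encodes an answer set
est = approx_count(models, n, eps=1.0, delta=0.5, rng=rng)
```

On this toy search space of 1024 "answer sets", the returned mean of the per-cell estimates lands close to the true count, illustrating the scaling step described above.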
init | #Answer Sets | Best Median | Q | Best Mean | Q | Worst Median | Q | Worst Mean | Q
1 | 262,080 | 230,400 | 0.88 | 253,097 | 0.97 | 170,752 | 0.65 | 295,872 | 1.13
2 | 90,000 | 77,568 | 0.86 | 92,529 | 1.03 | 54,272 | 0.6 | 95,603 | 1.06
3 | 62,400 | 49,408 | 0.79 | 64,201 | 1.03 | 36,608 | 0.59 | 50,749 | 0.81
4 | 37,680 | 33,536 | 0.89 | 38,524 | 1.02 | 20,864 | 0.55 | 28,517 | 0.76
5 | 70,560 | 64,768 | 0.92 | 72,083 | 1.02 | 41,216 | 0.58 | 89,509 | 1.27
6 | 4,800 | 4,240 | 0.88 | 4,869 | 1.01 | 3,200 | 0.67 | 5,344 | 1.11
7 | 20,880 | 17,600 | 0.84 | 20,917 | 1 | 13,440 | 0.64 | 23,239 | 1.11
8 | 9,959,040 | 5,636,096 | 0.57 | 10,146,567 | 1.02 | 4,259,840 | 0.43 | 12,793,030 | 1.28
9 | 13,996,920 | 10,289,152 | 0.74 | 13,373,132 | 0.96 | 4,816,896 | 0.34 | 17,169,132 | 1.23
10 | 5,569,560 | 4,407,296 | 0.79 | 5,840,620 | 1.05 | 2,564,096 | 0.46 | 4,992,504 | 0.9
Average | | | 0.82 | | 1.01 | | 0.55 | | 1.07

Table 1: Approximate answer set count over random instances of the Graph Coloring problem.
init | #Answer Sets | Best Median | Q | Best Mean | Q | Worst Median | Q | Worst Mean | Q
1 | 104,640 | 90,112 | 0.86 | 104,866 | 1 | 54,528 | 0.52 | 92,076 | 0.88
2 | 23,856 | 14,912 | 0.63 | 24,143 | 1.01 | 5,344 | 0.22 | 16,867 | 0.71
3 | 71,136 | 69,888 | 0.98 | 74,317 | 1.04 | 55,296 | 0.78 | 79,189 | 1.11
4 | 1,537,680 | 1,327,104 | 0.86 | 1,592,656 | 1.04 | 1,048,576 | 0.68 | 1,378,273 | 0.9
5 | 104,640 | 89,088 | 0.85 | 107,800 | 1.03 | 59,392 | 0.57 | 83,326 | 0.8
6 | 608,400 | 552,960 | 0.91 | 585,332 | 0.96 | 312,320 | 0.51 | 740,732 | 1.22
7 | 2,530,080 | 1,941,504 | 0.77 | 2,592,145 | 1.02 | 1,314,816 | 0.52 | 2,122,486 | 0.84
8 | 1,261,008 | 686,080 | 0.54 | 1,268,456 | 1.01 | 332,800 | 0.26 | 1,081,334 | 0.86
9 | 23,756,544 | 12,713,984 | 0.54 | 23,627,783 | 0.99 | 8,650,752 | 0.36 | 25,093,716 | 1.06
10 | 8,406,048 | 4,554,752 | 0.54 | 8,339,475 | 0.99 | 2,314,240 | 0.28 | 7,244,022 | 0.86
Average | | | 0.75 | | 1.01 | | 0.47 | | 0.92

Table 2: Approximate answer set count over random instances of the Schur decision problem.
init | #Answer Sets | Best Median | Q | Best Mean | Q | Worst Median | Q | Worst Mean | Q
1 | 480,163 | 304,128 | 0.63 | 478,946 | 1 | 202,752 | 0.42 | 584,700 | 1.22
2 | 14,439 | 10,192 | 0.71 | 14,944 | 1.03 | 6,176 | 0.43 | 15,377 | 1.06
3 | 362,880 | 221,184 | 0.61 | 395,350 | 1.09 | 89,088 | 0.25 | 652,665 | 1.8
4 | 74,156 | 39,424 | 0.53 | 77,848 | 1.05 | 25,408 | 0.34 | 98,129 | 1.32
5 | 63,861 | 44,800 | 0.7 | 63,243 | 0.99 | 30,528 | 0.48 | 95,465 | 1.49
6 | 20,705 | 14,816 | 0.72 | 21,980 | 1.06 | 6,088 | 0.29 | 15,884 | 0.77
7 | 19,020 | 15,488 | 0.81 | 22,224 | 1.17 | 10,592 | 0.56 | 30,914 | 1.63
8 | 653,487 | 443,392 | 0.68 | 659,262 | 1.01 | 234,496 | 0.36 | 874,308 | 1.34
9 | 49,837 | 49,408 | 0.99 | 47,274 | 0.95 | 18,816 | 0.38 | 65,238 | 1.31
10 | 1,271,017 | 872,448 | 0.69 | 1,196,522 | 0.94 | 453,632 | 0.36 | 2,055,348 | 1.62
Average | | | 0.71 | | 1.03 | | 0.39 | | 1.36

Table 3: Approximate answer set count over random instances of the Hamiltonian Path problem.
init | #Answer Sets | Best Median | Q | Best Mean | Q | Worst Median | Q | Worst Mean | Q
1 | 46,737 | 46,592 | 1 | 45,171 | 0.97 | 36,224 | 0.78 | 50,838 | 1.09
2 | 58,538 | 55,296 | 0.94 | 60,324 | 1.03 | 40,448 | 0.69 | 49,216 | 0.84
3 | 6,405,610 | 6,193,152 | 0.97 | 6,449,549 | 1.01 | 5,701,632 | 0.89 | 7,413,423 | 1.16
4 | 5,330,500 | 5,210,112 | 0.98 | 5,174,303 | 0.97 | 3,604,480 | 0.68 | 4,619,493 | 0.87
5 | 7,460,775 | 7,536,640 | 1.01 | 7,446,894 | 1 | 6,717,440 | 0.9 | 6,787,805 | 0.91
6 | 5,733,125 | 5,734,400 | 1 | 5,811,551 | 1.01 | 4,620,288 | 0.81 | 5,966,710 | 1.04
7 | 187,928 | 183,296 | 0.98 | 188,198 | 1 | 155,648 | 0.83 | 195,252 | 1.04
8 | 919,808 | 909,312 | 0.99 | 916,914 | 1 | 962,560 | 1.05 | 963,723 | 1.05
9 | 493,431,189 | 492,830,720 | 1 | 494,853,532 | 1 | 353,370,112 | 0.72 | 440,323,668 | 0.89
10 | 5,785,344 | 5,799,936 | 1 | 5,868,650 | 1.01 | 6,307,840 | 1.09 | 6,407,616 | 1.11
Average | | | 0.99 | | 1 | | 0.84 | | 1

Table 4: Approximate answer set count over random instances of the Subset-Minimal Vertex Cover problem.
init | #Answer Sets | Best Median | Q | Best Mean | Q | Worst Median | Q | Worst Mean | Q
1 | 47,730 | 51,200 | 1.07 | 87,210 | 1.83 | 40,960 | 0.86 | 104,170 | 2.18
2 | 41,113 | 40,960 | 1 | 40,715 | 0.99 | 24,320 | 0.59 | 60,945 | 1.48
3 | 118,985 | 137,216 | 1.15 | 237,918 | 2 | 147,456 | 1.24 | 363,503 | 3.06
4 | 133,564 | 143,360 | 1.07 | 232,852 | 1.74 | 154,112 | 1.15 | 287,719 | 2.15
5 | 33,792 | 57,344 | 1.7 | 79,616 | 2.36 | 65,536 | 1.94 | 80,630 | 2.39
6 | 12,800 | 20,480 | 1.6 | 20,293 | 1.59 | 22,528 | 1.76 | 39,082 | 3.05
7 | 99,215 | 71,680 | 0.72 | 143,568 | 1.45 | 60,416 | 0.61 | 201,910 | 2.04
8 | 103,471 | 85,632 | 0.83 | 79,759 | 0.77 | 25,984 | 0.25 | 167,701 | 1.62
9 | 48,970 | 49,152 | 1 | 72,946 | 1.49 | 38,912 | 0.79 | 85,419 | 1.74
10 | 57,266 | 61,440 | 1.07 | 103,509 | 1.81 | 63,488 | 1.11 | 117,394 | 2.05
Average | | | 1.12 | | 1.6 | | 1.03 | | 2.18

Table 5: Approximate answer set count over random instances of the Subset-Minimal Independent Dominating Set problem.
Inst.  #Answer Sets   Best Median   Q     Best Mean    Q     Worst Median   Q     Worst Mean    Q
1      32,716         32,768        1     53,620       1.64  65,536         2     55,050        1.68
2      65,320         65,536        1     96,483       1.48  65,536         1     102,221       1.56
3      130,854        131,072       1     179,361      1.37  131,072        1     283,704       2.17
4      260,463        262,144       1.01  400,777      1.54  522,240        2.01  522,240       2.01
5      928,467        1,638,400     1.76  3,167,573    3.41  3,014,656      3.25  4,969,813     5.35
6      1,045,622      1,048,576     1     1,530,013    1.46  1,048,576      1     1,643,081     1.57
7      4,168,467      6,291,456     1.51  7,689,557    1.81  8,323,072      2     7,542,101     1.84
8      8,294,512      8,388,608     1.01  4,882,538    0.59  131,072        0.02  12,558,336    1.51
9      8,346,882      8,388,608     1     12,582,912   1.51  8,388,608      1     14,198,285    1.7
10     522,316        524,288       1     664,462      1.27  786,432        1.51  785,521       1.5
Avg.                                1.13               1.61                 1.48                2.09
Table 6: Approximate answer set count over random instances of Projected Model Counting on 2-QBFs.
that precisely capture the solutions to these problems. Hence, it is unlikely that one can easily approximate the number of solutions by means of approximate SAT counting. Also, to track the counting, we made sure that these instances were “easy to solve” for clingo, meaning that clingo must enumerate all answer sets within a 600-second timeout (without printing).
To get a feeling for our initial counting experiments, we tried different values for both the tolerance and the confidence, seeking different cluster sizes and numbers of iterations, as shown in lines 3 and 8 of Algorithm 1, respectively. It is worth recalling that these parameters directly affect the density of the parity constraints (lines 5 and 6 of Algorithm 2). These constraints follow the syntax and principles discussed in Sections 3 and 4, respectively. As part of the experimental setup and for comparison, we asked xorro (Everardo et al. 2019) to estimate the count also by calculating the median, taken from the original ApproxMC algorithm in (Chakraborty, Meel, and Vardi 2013b).
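The median-based counting scheme just described can be sketched compactly. The following is our own toy illustration (answer sets are simulated as bit-vectors and all names are invented for the example; this is not xorro's implementation): each iteration draws k random parity constraints, keeps the answer sets satisfying all of them (one cluster), scales the cluster size by 2^k, and the median over the iterations is reported.

```python
import random
import statistics
from itertools import product

def xor_holds(assignment, vars_, parity):
    # A parity constraint over vars_ holds iff the selected
    # truth values sum to `parity` modulo 2.
    return sum(assignment[v] for v in vars_) % 2 == parity

def approx_count(answer_sets, n_vars, k, iterations=9, seed=0):
    """ApproxMC-style estimate: draw k random XOR constraints,
    count the answer sets in the surviving cluster, scale by 2**k,
    and take the median over several iterations."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(iterations):
        cons = [([v for v in range(n_vars) if rng.random() < 0.5],
                 rng.randint(0, 1)) for _ in range(k)]
        cluster = [a for a in answer_sets
                   if all(xor_holds(a, vs, p) for vs, p in cons)]
        estimates.append(len(cluster) * 2 ** k)
    return statistics.median(estimates)

# Toy instance: the 512 assignments over 10 atoms with an even
# number of true atoms play the role of the answer sets.
answer_sets = [a for a in product([0, 1], repeat=10) if sum(a) % 2 == 0]
# The exact count is 512; the estimate should land in its vicinity.
print(len(answer_sets), approx_count(answer_sets, n_vars=10, k=3))
```

Every estimate is a multiple of 2^k, which is why too coarse a choice of k limits the achievable precision.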
The experiments were run sequentially under the Ubuntu-based Elementary OS, on a laptop with 16 GB of memory and a 2.60 GHz Dual-Core Intel Core i7 processor, using Python 3.7.6. Each benchmark instance (in smodels output format, generated offline with the grounder gringo that is part of clingo (Gebser et al. 2016)) was run five times without any time restriction. As shown in Algorithm 1, a run finishes in one of two possible situations: either xorro returns the approximate answer set count or it reports unsatisfiability.
The results of our experiments are summarized in Tables 1-6, listing for each problem class the instances and their number of answer sets in the first two columns. The remainder of each table is divided into the best and the worst of the five runs. For both the median and the mean counts, we add a quality factor (Q) estimating the closeness to the total number of answer sets. The last row of each table displays the average Q for each count.
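For illustration, a table row of this kind can be computed as follows (a minimal sketch with made-up numbers, not data from the actual tables): the quality factor is simply the ratio between the approximate and the exact count.

```python
from statistics import mean, median

def quality(estimate, exact):
    """Quality factor Q: approximate count divided by the exact
    number of answer sets; Q = 1 means a perfect estimate."""
    return round(estimate / exact, 2)

# Exact count and five per-run estimates for one hypothetical instance.
exact = 480163
estimates = [304128, 478946, 250000, 500000, 430000]

print(quality(median(estimates), exact))  # → 0.9
print(quality(mean(estimates), exact))    # → 0.82
```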
In the first three tables, we can see the pattern that the mean count obtained better results, even in its worst case. On the other hand, the medians under-approximate the counts. For instance, in the Schur problem, the last three instances were under-approximated by almost 50%, lowering the average in the bottom line. However, for the subset-minimal vertex covers, both counts were almost exact on average; in this example, even the worst cases are close to the exact count. For the most complex problems, shown in Tables 5 and 6, the average counts over-approximate the number of solutions. However, the median's best case comes close to the exact count; evidence of this is in Table 6, where a Q of 1 was obtained in six instances out of ten. In these problems, the mean count over-approximates the number of answer sets, giving no proper estimations: the best-case scenario is 60 percent above the desired number. It is also noticeable that in most cases the median count under-approximates the number of answer sets, while the opposite happens with the mean (over-approximation).
The large deviations between the best and the worst cases correspond to one of two possible scenarios. If the count is an under-approximation, the partition was not well distributed, and some clusters had too few or too many answer sets. In the opposite case, where the count is an over-approximation, our set of XORs contains linear combinations, i.e., linearly dependent equations, meaning that the partitioning is not performed according to the number of XORs. For instance, the conjunction of the XOR constraints a ⊕ ⊤ ∧ b ⊕ ⊤ ∧ a ⊕ b ⊕ ⊤ can be equivalently reduced to a ⊕ ⊤ ∧ b ⊕ ⊤. Back to our example in Section 3, instead of counting |S| · 2⁵ with S = 2, one linear combination doubles the number of answer sets in the resulting cluster, so in this case S = 4, and the approximate count is 1024 instead of 64.
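Dependent parity constraints like these can be detected with Gaussian elimination over GF(2). The sketch below is our own illustration (a real implementation would operate on the solver's internal representation): each constraint becomes a bitmask, and only constraints that introduce a new pivot are kept.

```python
def independent_xors(constraints):
    """Keep only linearly independent parity constraints.
    Each constraint is (list_of_vars, parity), read as the GF(2)
    equation  x_v1 ⊕ ... ⊕ x_vn = parity.  A constraint that
    reduces to zero under the current basis is implied by the rest."""
    pivots = {}   # most significant bit -> reduced row
    kept = []
    for vars_, parity in constraints:
        row = parity                      # parity lives in bit 0
        for v in vars_:
            row |= 1 << (v + 1)           # variable v lives in bit v+1
        while row > 1:
            msb = row.bit_length() - 1
            if msb in pivots:
                row ^= pivots[msb]        # eliminate the leading variable
            else:
                pivots[msb] = row
                kept.append((vars_, parity))
                break
        # here row == 0 means implied, row == 1 means inconsistent
    return kept

# The example from the text, read as equations: a ⊕ ⊤, b ⊕ ⊤ and
# a ⊕ b ⊕ ⊤ force a = 0, b = 0 and a ⊕ b = 0 (a is variable 0, b is 1).
cons = [([0], 0), ([1], 0), ([0, 1], 0)]
print(independent_xors(cons))  # → [([0], 0), ([1], 0)]
```

The third constraint is the GF(2) sum of the first two, so it is dropped, matching the reduction shown above.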
As mentioned above, performance was not examined in this paper, so it is worth considering in further experiments testing all the different approaches from xorro. For the experiments above, we ran xorro with the lazy counting approach, which obtained the highest overall performance score among all six implementations. However, the random parity constraints generated during each counting iteration were quite small, meaning that other approaches, like the Unit Propagation approach (Everardo et al. 2019), would benefit more from these XOR densities.
7 Conclusion and Future Work
This paper discusses an extension of the ASP system xorro towards approximate answer set counting. It is established by lifting ideas from existing SAT techniques to the formalism of ASP. While our preliminary results are promising and show that approximate counting indeed works for ASP, there is still potential for future work. On the one hand, we highly recommend studying and establishing proper guarantees proving that our results are accurate with high probability and do not deviate far from the actual count. Further, we highly encourage additional tuning and improvements to our preliminary implementation concerning several aspects, such as the solving of parity constraints, scalability, and the elimination of linear combinations. Regarding scalability, we need to perform more experiments and algorithm revisions in order to perform better than clingo's (clasp's) enumeration. We hope that this work fosters applications and further research on quantitative reasoning for ASP, e.g., (Kimmig et al. 2011; Tsamoura, Gutiérrez-Basulto, and Kimmig 2020).
References
Aziz, R. A.; Chu, G.; Muise, C.; and Stuckey, P. 2015.
#(∃)SAT: Projected Model Counting. In Heule, M., and
Weaver, S., eds., Proceedings of the 18th International Conference on Theory and Applications of Satisfiability Testing
(SAT’15), 121–137. Austin, TX, USA: Springer.
Aziz, R. A. 2015. Answer Set Programming: Founded
Bounds and Model Counting. Ph.D. Dissertation, Department
of Computing and Information Systems, The University of
Melbourne.
Balduccini, M.; Gelfond, M.; and Nogueira, M. 2006. Answer set based design of knowledge systems. Ann. Math.
Artif. Intell. 47(1-2):183–219.
Barrett, C.; Sebastiani, R.; Seshia, S.; and Tinelli, C. 2009.
Satisfiability modulo theories. In Biere, A.; Heule, M.; van
Maaren, H.; and Walsh, T., eds., Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and
Applications. IOS Press. chapter 26, 825–885.
Bidoit, N., and Froidevaux, C. 1991. Negation by default and
unstratifiable logic programs. Theoretical Computer Science
78(1):85–112.
Brewka, G.; Eiter, T.; and Truszczyński, M. 2011. Answer
set programming at a glance. Communications of the ACM
54(12):92–103.
Capelli, F., and Mengel, S. 2019. Tractable QBF by knowledge compilation. In Niedermeier, R., and Paul, C., eds.,
STACS 2019, volume 126 of LIPIcs, 18:1–18:16. Schloss
Dagstuhl - Leibniz-Zentrum fuer Informatik.
Carter, L., and Wegman, M. N. 1977. Universal classes
of hash functions (extended abstract). In Hopcroft, J. E.;
Friedman, E. P.; and Harrison, M. A., eds., Proceedings of
the 9th Annual ACM Symposium on Theory of Computing,
May 4-6, 1977, Boulder, Colorado, USA, 106–112. ACM.
Chakraborty, S.; Meel, K.; and Vardi, M. 2013a. A scalable
and nearly uniform generator of SAT witnesses. In Sharygina, N., and Veith, H., eds., Proceedings of the Twenty-fifth
International Conference on Computer Aided Verification
(CAV’13), volume 8044 of LNCS, 608–623. Springer.
Chakraborty, S.; Meel, K. S.; and Vardi, M. Y. 2013b. A
scalable approximate model counter. In Schulte, C., ed., Principles and Practice of Constraint Programming - 19th International Conference, CP 2013, Uppsala, Sweden, September
16-20, 2013. Proceedings, volume 8124 of Lecture Notes in
Computer Science, 200–216. Springer.
Chakraborty, S.; Meel, K. S.; and Vardi, M. Y. 2016. Improving approximate counting for probabilistic inference: From
linear to logarithmic SAT solver calls. In Kambhampati, S.,
ed., Proceedings of 25th International Joint Conference on
Artificial Intelligence (IJCAI’16), 3569–3576. New York
City, NY, USA: The AAAI Press.
Chavira, M., and Darwiche, A. 2008. On probabilistic
inference by weighted model counting. Artificial Intelligence
172(6–7):772–799.
Domshlak, C., and Hoffmann, J. 2007. Probabilistic planning
via heuristic forward search and weighted model counting. J.
Artif. Intell. Res. 30.
Dueñas-Osorio, L.; Meel, K. S.; Paredes, R.; and Vardi,
M. Y. 2017. Counting-based reliability estimation for power-transmission grids. In Singh, S. P., and Markovitch, S., eds.,
Proceedings of the Thirty-First AAAI Conference on Artificial
Intelligence (AAAI’17), 4488–4494. San Francisco, CA,
USA: The AAAI Press.
Durand, A.; Hermann, M.; and Kolaitis, P. G. 2005. Subtractive reductions and complete problems for counting complexity classes. Theoretical Computer Science 340(3):496–513.
Eiter, T., and Gottlob, G. 1995. On the computational cost
of disjunctive logic programming: Propositional case. Ann.
Math. Artif. Intell. 15(3–4):289–323.
Everardo, F.; Janhunen, T.; Kaminski, R.; and Schaub, T.
2019. The return of xorro. In Balduccini, M.; Lierler, Y.; and
Woltran, S., eds., Logic Programming and Nonmonotonic
Reasoning - 15th International Conference, LPNMR 2019,
Philadelphia, PA, USA, June 3-7, 2019, Proceedings, volume 11481 of Lecture Notes in Computer Science, 284–297.
Springer.
Everardo, F.; Hecher, M.; and Shukla, A. 2020. Extending xorro with Approximate Model Counting. In ASPOCP@ICLP.
Fichte, J. K., and Hecher, M. 2019. Treewidth and counting
projected answer sets. In LPNMR’19, volume 11481 of
LNCS, 105–119. Springer.
Fichte, J. K.; Hecher, M.; Morak, M.; and Woltran, S. 2017.
Answer set solving with bounded treewidth revisited. In LPNMR, volume 10377 of Lecture Notes in Computer Science,
132–145. Springer.
Fichte, J. K.; Hecher, M.; Morak, M.; and Woltran, S. 2018.
Exploiting treewidth for projected model counting and its limits. In SAT’18, volume 10929 of LNCS, 165–184. Springer.
Gebser, M.; Kaminski, R.; Kaufmann, B.; and Schaub, T.
2012. Answer Set Solving in Practice. Morgan & Claypool.
Gebser, M.; Harrison, A.; Kaminski, R.; Lifschitz, V.; and
Schaub, T. 2015. Abstract gringo. TPLP 15(4-5):449–463.
Gebser, M.; Kaminski, R.; Kaufmann, B.; Ostrowski, M.;
Schaub, T.; and Wanko, P. 2016. Theory solving made easy
with clingo 5. In Carro, M.; King, A.; Saeedloei, N.; and
Vos, M. D., eds., Technical Communications of the 32nd
International Conference on Logic Programming, ICLP 2016
TCs, October 16-21, 2016, New York City, USA, volume 52
of OASICS, 2:1–2:15. Schloss Dagstuhl - Leibniz-Zentrum
fuer Informatik.
Gebser, M.; Kaufmann, B.; and Schaub, T. 2009. Solution
enumeration for projected boolean search problems. In van
Hoeve, W.-J., and Hooker, J. N., eds., Proceedings of the
6th International Conference on Integration of AI and OR
Techniques in Constraint Programming for Combinatorial
Optimization Problems (CPAIOR’09), volume 5547 of LNCS,
71–86. Berlin: Springer.
Gelfond, M., and Lifschitz, V. 1991. Classical negation in
logic programs and disjunctive databases. New Generation
Comput. 9(3/4):365–386.
Gomes, C. P.; Sabharwal, A.; and Selman, B. 2006. Model
counting: A new strategy for obtaining good bounds. In
AAAI, 54–61.
Gomes, C.; Sabharwal, A.; and Selman, B. 2007a. Near-uniform sampling of combinatorial spaces using XOR constraints. In Schölkopf, B.; Platt, J.; and Hofmann, T., eds.,
Proceedings of the Twentieth Annual Conference on Neural
Information Processing Systems (NIPS’06), 481–488. MIT
Press.
Gomes, C. P.; Sabharwal, A.; and Selman, B. 2007b. Near-uniform sampling of combinatorial spaces using XOR constraints. In Advances in Neural Information Processing Systems, 481–488.
Gomes, C. P.; Sabharwal, A.; and Selman, B. 2009. Chapter
20: Model counting. In Biere, A.; Heule, M.; van Maaren,
H.; and Walsh, T., eds., Handbook of Satisfiability, volume
185 of Frontiers in Artificial Intelligence and Applications.
Amsterdam, Netherlands: IOS Press. 633–654.
Gupta, R.; Sharma, S.; Roy, S.; and Meel, K. S. 2019. Waps:
Weighted and projected sampling. In Vojnar, T., and Zhang,
L., eds., Proceedings of the 25th International Conference
on Tools and Algorithms for the Construction and Analysis
of Systems (TACAS’19), 59–76. Prague, Czech Republic:
Springer. Held as Part of the European Joint Conferences on
Theory and Practice of Software.
Guziolowski, C.; Videla, S.; Eduati, F.; Thiele, S.; Cokelaer,
T.; Siegel, A.; and Saez-Rodriguez, J. 2013. Exhaustively
characterizing feasible logic models of a signaling network
using answer set programming. Bioinformatics 29(18):2320–
2326. Erratum see Bioinformatics 30, 13, 1942.
Hemaspaandra, L. A., and Vollmer, H. 1995. The satanic
notations: Counting classes beyond #P and other definitional
adventures. SIGACT News 26(1):2–13.
Janhunen, T. 2006. Some (in)translatability results for normal
logic programs and propositional theories. Journal of Applied
Non-Classical Logics 16(1-2):35–86.
Kimmig, A.; Demoen, B.; Raedt, L. D.; Costa, V. S.; and
Rocha, R. 2011. On the implementation of the probabilistic
logic programming language problog. Theory Pract. Log.
Program. 11(2-3):235–262.
Kleine Büning, H., and Lettman, T. 1999. Propositional
logic: deduction and algorithms. Cambridge University
Press, Cambridge.
Lagniez, J.-M., and Marquis, P. 2019. A recursive algorithm
for projected model counting. In Hentenryck, P. V., and
Zhou, Z.-H., eds., Proceedings of the 33rd AAAI Conference
on Artificial Intelligence (AAAI’19).
Lifschitz, V. 1999. Answer set planning. In ICLP, 23–37.
MIT Press.
Marek, W., and Truszczyński, M. 1991. Autoepistemic logic.
J. of the ACM 38(3):588–619.
Meel, K. S. 2018. Constrained counting and sampling:
Bridging the gap between theory and practice. CoRR
abs/1806.02239.
Niemelä, I.; Simons, P.; and Soininen, T. 1999. Stable model
semantics of weight constraint rules. In LPNMR’99, volume
1730 of LNCS, 317–331. Springer.
Nogueira, M.; Balduccini, M.; Gelfond, M.; Watson, R.; and
Barry, M. 2001. An A-Prolog decision support system for the
Space Shuttle. In PADL’01, volume 1990 of LNCS, 169–183.
Springer.
Papadimitriou, C. H. 1994. Computational Complexity.
Addison-Wesley.
Sang, T.; Beame, P.; and Kautz, H. 2005. Performing
bayesian inference by weighted model counting. In Veloso,
M. M., and Kambhampati, S., eds., Proceedings of the 29th
National Conference on Artificial Intelligence (AAAI’05).
The AAAI Press.
Schaub, T., and Woltran, S. 2018. Special issue on answer
set programming. KI 32(2-3):101–103.
Sharma, S.; Roy, S.; Soos, M.; and Meel, K. S. 2019. Ganak:
A scalable probabilistic exact model counter. In Kraus, S.,
ed., Proceedings of the 28th International Joint Conference
on Artificial Intelligence, IJCAI-19, 1169–1176. IJCAI.
Simons, P.; Niemelä, I.; and Soininen, T. 2002. Extending
and implementing the stable model semantics. Artif. Intell.
138(1-2):181–234.
Soos, M.; Nohl, K.; and Castelluccia, C. 2009. Extending
SAT solvers to cryptographic problems. In Kullmann, O.,
ed., Theory and Applications of Satisfiability Testing - SAT
2009, 12th International Conference, SAT 2009, Swansea,
UK, June 30 - July 3, 2009. Proceedings, volume 5584 of
Lecture Notes in Computer Science, 244–257. Springer.
Toda, S., and Watanabe, O. 1992. Polynomial time 1-Turing reductions from #PH to #P. Theor. Comput. Sci. 100(1):205–
221.
Tsamoura, E.; Gutiérrez-Basulto, V.; and Kimmig, A. 2020.
Beyond the grounding bottleneck: Datalog techniques for
inference in probabilistic logic programs. In AAAI, 10284–
10291. AAAI Press.
Valiant, L. G. 1979a. The complexity of computing the
permanent. Theoretical Computer Science 8(2):189–201.
Valiant, L. 1979b. The complexity of enumeration and
reliability problems. SIAM J. Comput. 8(3):410–421.
Wegman, M. N., and Carter, L. 1979. New classes and
applications of hash functions. In 20th Annual Symposium
on Foundations of Computer Science, San Juan, Puerto Rico,
29-31 October 1979, 175–182. IEEE Computer Society.
A Survey on Multiple Revision
Fillipe Resina∗ , Renata Wassermann
Universidade de São Paulo
{fmresina, renata}@ime.usp.br
Abstract

Belief Revision deals with the problem of how a rational agent should proceed in the face of new information. A revision occurs when an agent receives new information possibly inconsistent with its epistemic state and has to change it in order to accommodate the new belief in a consistent way. However, this new information may come as a set of beliefs (instead of a single one), a problem known as Multiple Revision. Unlike Iterated Revision, in Multiple Revision all information is processed simultaneously. The purpose of this survey is to gather and organize the state of the art in the area, showing the different approaches developed since 1988 and the open problems that still exist.

1 Introduction

According to Gärdenfors (1988), it is not very useful to know how to represent knowledge if at the same time we do not know how to change it when we receive new information. The motivation for this idea is that knowledge is not static, which means that we need to be able to deal with its dynamics. That is the context of the studies in the area of Belief Revision, which aims to handle the problem of adding information to, or removing it from, a knowledge base in a consistent way. Most of the literature on Belief Revision is based on the AGM paradigm, named after the authors of the seminal paper (Alchourrón, Gärdenfors, and Makinson 1985).
In the AGM paradigm, given a set of beliefs, there are three possible changes in relation to a new belief: expansion, contraction and revision. Expansion occurs when the base simply absorbs the information without loss. A contraction consists in retracting beliefs from the base until the specified information is not derivable. Finally, revision happens when the new belief is added in a consistent way, possibly demanding a repair in order to eliminate inconsistency. In this survey, we are going to focus on this last operation.
In the original framework, the new belief is assumed to be represented by a single formula. Nevertheless, there are situations in which the information by which we are going to revise comes as a block, that is, a concurrent acceptance of a (possibly infinite) set of beliefs. Besides, when we deal with a multi-agent context it may be necessary to revise a belief state by another belief state, as pointed out in (Falappa et al. 2012). Another possible scenario occurs when an agent receives a set of new beliefs and, based on its previous knowledge, selects the most reliable ones to incorporate.
In some situations, it may be possible to reduce multiple revision to single revision, for example by taking the conjunction of all new sentences, but this is not always feasible. Thus, a framework for multiple changes is needed. In addition, multiple revision is not the same as applying revision in an iterated way (Darwiche and Pearl 1997), taking the input set and revising sequentially, one sentence at a time. In Multiple Revision, it is assumed that there is no preference over the input sentences, i.e., all of them have equal priority and should be processed at the same time. Besides that, since the order in which the sentences are processed can make a difference in the final result, working with iterated revision may cause an asymmetry. In (Delgrande and Jin 2012) it is also observed that, in many approaches to iterated revision, if an agent revises by a sentence and then by a sentence that is inconsistent with the previous one, then the agent's beliefs are precisely the same as if only the second revision had been performed. We are also not going to address belief merging, a kind of change operation in which preceding and new beliefs play symmetric roles; for more information refer to (Fuhrmann 1997; Konieczny and Pérez 2002; Falappa et al. 2012).
Given the importance of Multiple Revision, the purpose of this paper is to summarize the literature in the field, providing unified terminology and notation for readers looking for an overview. We also identify some limitations of the models and present some comparisons between them.

Notation  In this article, we assume a formal language L and use Cn to represent an operator that returns the set of logical consequences of the input set. For atomic formulas we use lowercase Greek letters (α, β, ...) and for sets of formulas, uppercase Latin letters (A, B, C, ...). K is reserved to represent a belief set (i.e. K = Cn(K)). We use ⊥ for the falsity constant, ⋀A stands for the conjunction of the elements of A, and we denote the power set of S by P(S).

Organization of the article  Section 2 presents singleton revision. Section 3 shows the first works developed on Multiple Revision. Section 4 explores the context where the input is infinite, while Section 5 studies further approaches using systems of spheres. Section 6 gathers approaches using direct construction in package revision and Section 7 gathers approaches in non-prioritized revision. Section 8 synthesizes alternative constructions based on core beliefs of a belief state. Finally, some conclusions and open problems are discussed.

∗ Supported by the Brazilian funding agency CAPES.
2 The Revision Operation

When an agent needs to accept new information inconsistent with its previous beliefs, it may be necessary to give up one or more beliefs to avoid the inconsistency. Aiming at information economy, only what is needed should be changed. This kind of change is known as revision. AGM revision (∗) receives a set K of beliefs and a new sentence α and returns a new set K ∗ α in which α was consistently added.
The Levi Identity (Gärdenfors 1988) relates revision to contraction (−) and expansion (+): K ∗ α = (K − ¬α) + α.
Hansson (1993) called this operation internal revision, and proposed external revision as a reversed operation: K ± α = (K + α) − ¬α.
External revision does not usually work for belief sets: if α is inconsistent with K, the expansion of K by α trivializes the set. Therefore, this kind of operation was introduced for belief bases (sets of sentences not necessarily closed under logical consequence). In (Hansson 1999b) the difference between internal and external revision is fully explored.
The AGM theory defines some rationality postulates that a revision should obey:
(K*1) K ∗ α is a belief set
(K*2) α ∈ K ∗ α
(K*3) K ∗ α ⊆ Cn(K ∪ {α})
(K*4) If ¬α ∉ Cn(K), then K ∗ α = Cn(K ∪ {α})
(K*5) If ¬α ∉ Cn(∅), then K ∗ α is consistent under Cn
(K*6) If Cn(α) = Cn(β), then K ∗ α = K ∗ β
(K*7) K ∗ (α ∧ β) ⊆ Cn((K ∗ α) ∪ {β})
(K*8) Cn((K ∗ α) ∪ {β}) ⊆ K ∗ (α ∧ β), provided that ¬β ∉ K ∗ α
These properties were later generalized for belief bases, as will be seen in Section 3.3.

2.1 Constructions

We are going to quickly recall three different constructions for revision.

Partial Meet Revision  This operation was first suggested in (Alchourrón and Makinson 1982), explored in more detail in (Alchourrón, Gärdenfors, and Makinson 1985) and later generalized for belief bases, as can be seen in (Hansson 1999b). Given a set K and a formula α, the remainder set of K in relation to α (K⊥α) is formed by the maximal subsets of K that do not imply α. A selection function γ selects at least one element of K⊥α if it is not empty; otherwise, γ selects {K}. Finally, the partial meet contraction on K generated by γ is defined as K −γ α = ⋂γ(K⊥α).
Partial meet revision is obtained by applying the Levi Identity to partial meet contraction:
Definition 1. (Alchourrón, Gärdenfors, and Makinson 1985) Let γ be a selection function for K. The operator ∗γ of (internal) partial meet revision for K is defined as: K ∗γ α = (K −γ ¬α) + α
The following representation theorem connects the construction to the rationality postulates:
Theorem 1. (Alchourrón, Gärdenfors, and Makinson 1985) Let K be a belief set. An operator ∗ on K is a partial meet revision function iff ∗ satisfies (K*1)-(K*6).
For belief bases, the same construction of internal partial meet revision can be used (Hansson 1993).

Kernel Revision  Hansson (1994) proposed kernel contraction as an alternative construction in which we remove from a belief base B at least one element of each minimal subset of B that implies α (B ⊥⊥ α), obtaining a belief base that does not imply α. To perform these removals, we use an incision function σ, i.e., a function that selects at least one sentence from each kernel.
From the definition of kernel contraction one can obtain a kernel revision:
Definition 2. The kernel revision on B based on an incision function σ is the operator ∗σ such that for all sentences α: B ∗σ α = (B \ σ(B ⊥⊥ ¬α)) ∪ {α}

Systems of Spheres  Grove (1988) proposed a construction for revision based on sets of possible worlds, defined as maximal consistent subsets of L. In what follows, the set of all possible worlds is represented by ML. For any set R ⊆ ML, [R] denotes the set of possible worlds that contain R, i.e., [R] = {M ∈ ML : R ⊆ M}. If R is inconsistent, this is the empty set. The elements of [R] are the R-worlds. For a sentence ϕ ∈ L, [ϕ] is an abbreviation of [{ϕ}]. The elements of [ϕ] are the ϕ-worlds.
Definition 3. (Grove 1988) Let X be a subset of ML. A system of spheres centered on X is a collection S ⊆ P(ML) that satisfies the following conditions:
(S1) S is totally ordered with respect to set inclusion; that is, if U, V ∈ S, then U ⊆ V or V ⊆ U.
(S2) X ∈ S, and if U ∈ S then X ⊆ U.
(S3) ML ∈ S (and so it is the largest element of S).
(S4) For every ϕ ∈ L, if there is any element in S intersecting [ϕ], then there is also a smallest element in S intersecting [ϕ].
The elements of S are called spheres.
For any consistent sentence ϕ ∈ L, the smallest sphere in S intersecting [ϕ] is denoted by CS(ϕ), and fS(ϕ) denotes the set consisting of the ϕ-worlds closest to X, i.e., fS(ϕ) = [ϕ] ∩ CS(ϕ).
A revision based on a system of spheres S is defined via the intersection of the ϕ-worlds closest to K:
Definition 4. (Grove 1988) Let S be a system of spheres centered on [K]. Then K ∗S ϕ = ⋂fS(ϕ) if [ϕ] ≠ ∅, and K ∗S ϕ = L otherwise.
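To make Grove's construction concrete, the following toy sketch (our own illustration, not part of the survey; all names are invented) represents worlds as truth assignments and a formula as the set of worlds where it holds; revision keeps the ϕ-worlds inside the smallest sphere that intersects [ϕ].

```python
from itertools import product

# Worlds are assignments over two atoms (p, q); a "formula" is the
# set of worlds where it holds.
worlds = set(product([0, 1], repeat=2))
p = {w for w in worlds if w[0]}
q = {w for w in worlds if w[1]}

# A system of spheres centered on [K] (K believes p ∧ q), listed from
# the innermost sphere to the outermost sphere M_L (conditions S1-S3).
spheres = [{(1, 1)}, {(1, 1), (1, 0)}, worlds]

def sphere_revision(spheres, phi_worlds):
    """Definition 4: the revised state is f_S(ϕ) = [ϕ] ∩ C_S(ϕ),
    the ϕ-worlds in the smallest sphere intersecting [ϕ]."""
    for s in spheres:                # smallest sphere first (S4)
        hit = s & phi_worlds
        if hit:
            return hit
    return set()                     # [ϕ] empty: revision yields L

# Revising by ¬p ∧ q: no inner sphere meets it, so the outermost
# sphere supplies the closest world.
not_p = worlds - p
print(sphere_revision(spheres, not_p & q))  # → {(0, 1)}
```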
2.2
Non-prioritized Revision
Non-prioritized revision is a class of revision operations
where the input is not always accepted, i.e., success is not
a necessary property. The idea is that a new belief should
218
Package revision can be defined from package contraction
via a generalized version of the Levi Identity:
not always have primacy over previous beliefs, i.e., it may
be rejected if it conflicts with more valuable previous beliefs.
Hansson (1997) proposed semi-revision as an operation
of non-prioritized revision that can be based on the idea of
consolidation, i.e, extracting a consistent subset of an inconsistent belief base (essentially a contraction by ⊥). Semirevision (denoted by ?) lies in the expansion + consolidation
variety of non-prioritized belief revision (Hansson 1999a).
For possibilities of interdefinability between semi-revision
and consolidation, see (Hansson 1997).
Fermé and Hansson (1999) explored a new possibility of
non-prioritized revision which incorporates only a part of the
input belief, calling it selective revision. Many models were
developed to work with two options: complete acceptance of
the input information or full rejection of it. So the selective
revision model aimed to be an in-between approach.
In order to achieve the partial acceptance, they propose
the use of a transformation function f to the input α aiming
to extract, roughly speaking, the most trustworthy part of it.
Given an AGM revision ∗ and a transformation function f ,
the selective revision ◦ is given by K ◦ α = K ∗ f (α).
Selective revision lies in the decision + revision variety of
non-prioritized belief revision (Hansson 1999a).
3
K ∗p A = (K −p ¬A) + A
Theorem 2. (Fuhrmann 1988) If the operation of package
revision is defined via Levi Identity from the operation of
package contraction, then the following conditions hold:
(closure) K ∗p A is a theory;
(success) A ⊆ K ∗p A;
(inclusion) K ∗p A ⊆ K + A;
(consistency) if A 6⊢ ⊥, then K ∗p A is consistent.
(vacuity) if K ∩ ¬A = ∅ then K + A ⊆ K ∗p A;
(extensionality) if Cn(A) = Cn(A′ ), K ∗p A = K ∗p A′ ;
(conjunctive inclusion) K ∗p (A′ ∪ A) ⊆ (K ∗p A′ ) + A;
(conjunctive vacuity) if K∩¬A = ∅, then (K∗p A′ )+A ⊆
K ∗p (A′ ∪ A).
Furhmann claims that the operation of choice revision is
less intuitive than package revision, making it hard to find
practical applications. If in a choice revision K ∗c A an
agent has to add at least one sentence from A, which ones
should he choose?
If [K −c ¬A] represents the choice contraction of K by
those elements of ¬A that are “easiest to retract”, using the
Levi Identity, we have:
K ∗c A = [K −c ¬A] + A ∩ {α : ¬α ∈
/ [K −c ¬A]}
Early Steps on Multiple Operations
In (Fuhrmann 1988) we have a first picture of a change operation that is not necessarily by a single input. The author
claims that sometimes we need to withdraw more than one
proposition of a belief set at the same time, proposing the
name Multiple Contraction for this case.
Let K be a belief set and A a set of sentences to be retracted from K. If an agent wants no sentence of A to be
implied by K − A, i.e., Cn(K − A) ∩ A = ∅, we have
a (multiple) package contraction1 (denoted by −p ). On the
other hand, if an agent simply does not want A to be a in the
consequences of K − A, i.e., A * Cn(K − A), we have a
(multiple) choice contraction (here denoted by −c ). For further information about multiple contraction see (Fuhrmann
and Hansson 1994).
3.1 Multiple Revision
Fuhrmann (1988) discussed properties of revision operations that receive as input more than one sentence simultaneously, named Multiple Revision. Analogously to Multiple Contraction, when the result of K ∗ A should imply everything in A (A ⊆ Cn(K ∗ A)) we have (multiple) package revision2 (denoted by ∗p), while when the result of K ∗ A should contain some elements (but not necessarily all) of A (Cn(K ∗ A) ∩ A ≠ ∅), we have (multiple) choice revision3 (here denoted by ∗c).
In order to proceed with the generalization, we need the definition of set negation. Fuhrmann defined it this way: if A is a set of formulas, ¬A = {¬α : α ∈ A}.
3.2 Generalizing Grove's result
Lindström (1991) introduced a set of operations called infinitary belief revision. In his article, he explores non-monotonic inference operations and brings a connection with AGM revision. Nevertheless, in order to achieve total interdefinability between belief revision and non-monotonic inference, it was necessary to support possibly infinite sets of propositions as input.
The axioms proposed are direct generalizations of the basic revision axioms presented in (Gärdenfors 1988), as well as done in (Fuhrmann 1988). The only difference is that Lindström joins the inclusion and vacuity postulates to form a new one called expansion:
(expansion) if K ∪ A is consistent, then K ∗ A = K + A.
Compared to AGM, Lindström makes weaker assumptions about the underlying logic. He only assumes that it is a deductive logic, while in the AGM framework, supraclassicality and satisfaction of the deduction theorem4 are also required.
The following theorem shows the definition of infinitary belief revision in terms of Grove's systems of spheres (Grove 1988):
Theorem 3. (Lindström 1991) Let K be a belief set and ∗ be any (multiple) belief revision operation on K. Then, for all A, ∗ is a system of spheres-based revision iff it satisfies closure, success, extensionality, inclusion, vacuity, consistency and conjunctive vacuity.
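On a finite propositional language, the sphere semantics behind Theorem 3 can be prototyped directly: worlds are truth assignments, a system of spheres is a nested sequence of world-sets centered on the models of K, and revising by a set Γ keeps the most plausible Γ-worlds. The encoding below (worlds as (p, q) pairs, belief sets represented by their sets of models) is our own simplification, not Grove's or Lindström's formal apparatus.

```python
from itertools import product

# Worlds over the two atoms p and q, encoded as (p, q) truth-value pairs.
WORLDS = list(product([False, True], repeat=2))

def mods(gamma):
    """[Γ]: the worlds satisfying every formula of Γ."""
    return {w for w in WORLDS
            if all(eval(f, {}, {"p": w[0], "q": w[1]}) for f in gamma)}

def sphere_revision(spheres, gamma):
    """Keep the Γ-worlds of the smallest sphere intersecting [Γ].
    Returns the revised belief set represented by its set of models;
    the empty set plays the role of the inconsistent belief set."""
    target = mods(gamma)
    for sphere in spheres:          # listed from smallest to largest
        hit = sphere & target
        if hit:
            return hit
    return set()                    # [Γ] = ∅: revision by an inconsistent input

# A system of spheres centered on [K] for K = Cn({p, q}):
spheres = [mods(["p", "q"]), mods(["p"]), set(WORLDS)]
print(sphere_revision(spheres, ["not q"]))   # {(True, False)}: keep p, give up q
```

Because the input is a set of formulas rather than a single sentence, the same function already implements a multiple revision in the sense discussed above.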
1 In (Fuhrmann 1988), this was called meet contraction, receiving the name package only in (Fuhrmann and Hansson 1994).
2 In (Fuhrmann 1988) this operation was denominated meet revision and in (Rott 2001), bunch revision.
3 Rott (2001) called this one pick revision.
4 β ∈ Cn(A ∪ {α}) iff α → β ∈ Cn(A).
3.3 Internal and External Revision
Hansson (1993), when extending the AGM framework for belief bases, chose to generalize it to the multiple case at the same time. Considering the operation of revision obtained from the Levi Identity, in order to proceed with the generalization we need, for sets, an equivalent way of negating the input:
Definition 5. (Hansson 1999b) Let X be a finite set of sentences. Then neg(X) (the sentential negation of X) is defined as follows: 1. neg(∅) = ⊥; 2. if X = {α}, then neg(X) = ¬α; 3. if X = {α1, ..., αn} for some n > 1, then neg(X) = ¬α1 ∨ ¬α2 ∨ ... ∨ ¬αn.
From this definition of set negation, he defines internal revision as B ∗γ A = (⋂ γB(B⊥{neg(A)})) ∪ A.
Then, it is possible to characterize it axiomatically:
Theorem 4. (Hansson 1993) The operator ∗ is an operator of multiple internal partial meet revision for a belief base B iff it satisfies:
(success) A ⊆ B ∗ A
(inclusion) B ∗ A ⊆ B ∪ A
(consistency) B ∗ A is consistent if A is consistent
(uniformity) If for all B′ ⊆ B, B′ ∪ A is inconsistent iff B′ ∪ C is inconsistent, then B ∩ (B ∗ A) = B ∩ (B ∗ C).
(relevance) If α ∈ B and α ∉ B ∗ A, then there is some B′ such that B ∗ A ⊆ B′ ⊆ B ∪ A, B′ is consistent but B′ ∪ {α} is inconsistent.
The postulates for consistency, inclusion and success are direct generalizations of the corresponding Gärdenfors postulates for singleton revision.
Now we can define multiple external partial meet revision as K ±γ A = ⋂ γK∪A((K ∪ A)⊥{neg(A)}).
Theorem 5. (Hansson 1993) The operator ∗ is an operator of multiple external partial meet revision iff it satisfies consistency, inclusion, relevance, success and, in addition:
(weak uniformity) If A and C are subsets of B and it holds for all B′ ⊆ B that B′ ∪ A is inconsistent iff B′ ∪ C is inconsistent, then B ∗ A = B ∗ C.
(pre-expansion) (B + A) ∗ A = B ∗ A
3.4 Multiple Package Partial Meet Revision
As shown in the second condition of Theorem 2, the generalization of the revision operation inherits an important characteristic already present in the original definition: the input set has priority over the sentences to be revised. Following this approach, we have in (Fuhrmann 1997) a further exploration of the topic, bringing us a construction. Fuhrmann addresses the issue focusing on the package variety and shows how the operations of revision and contraction can be interdefinable.
Following the partial meet approach, when we have arbitrary sets of sentences K and A and need to revise K in order to consistently incorporate all elements of A, we can first find all maximal subsets of K that are consistent with A, which form a generalized version of the remainder set for multiple operations in revision5:
Definition 6. (Fuhrmann 1997) X ∈ K ↓ A iff X ⊆ K, X ∪ A ⊬ ⊥ and ∀X′ s.t. X ⊂ X′ ⊆ K, X′ ∪ A ⊢ ⊥.
After defining the generalized remainder set, a selection function γ selects from K ↓ A the preferred elements6. Multiple package partial meet revision is defined as:
K ∗p A = (⋂ γ(K ↓ A)) ∪ A
The following result was obtained for this package revision:
Theorem 6. (Fuhrmann 1997) The operation ∗p just defined satisfies the following conditions:
(success) A ⊆ K ∗ A
(inclusion) K ∗ A ⊆ K ∪ A
(consistency) If K ∗ A is inconsistent, then A is inconsistent
(congruence) If ¬A ≡K ¬C then K ∗ A = K ∗ C7
(relevance) If α ∈ K \ K ∗ A then ∃X : (K ∗ A) ∩ K ⊆ X ⊆ K and X ⊬ ¬A and X, α ⊢ ¬A
Fuhrmann also explored the relation between this construction and one using remainders:
Theorem 7. (Fuhrmann 1997) If the operation ∗ satisfies the five conditions from Theorem 6, then there exists a selection function γK for K such that: K ∗ A = (⋂ γK(K⊥¬A)) ∪ A.
According to the principle of categorial matching, when applying revision, belief sets should map onto belief sets and belief bases should map onto belief bases. In the way the operation was defined so far, it is not true in general that closedness is preserved. Fuhrmann called the operation defined so far pre-revision and established an operation of matching revision in accordance with the principle of categorial matching:
A ⋆ B = Cn(A ∗ B) if A = Cn(A); A ⋆ B = A ∗ B otherwise.
The following observation indicates that the operation ⋆ is in practice a simple adaptation of pre-revision.
Observation 1. (Fuhrmann 1997) If ∗ satisfies the conditions from Theorem 6, then ⋆ satisfies the five conditions and also closure: if K = Cn(K) then K ⋆ A = Cn(K ⋆ A).
One of the questions that emerge is whether it is possible to reduce multiple operations to operations by singletons. For package revision, Fuhrmann obtained the following result:
Lemma 1. (Fuhrmann 1997) For finite A, K ∗ A = K ∗ ⋀A.
However, there is a logical drawback for this reduction. For comparison purposes, suppose that, instead of revising by a set {α, β} (a collection of items of information), the agent decides to revise by the conjunction α ∧ β (a single item of information). As observed in (Delgrande and Jin 2012), although the two options have the same logical content (since they imply precisely the same formulas), revising by {α, β} should result in a belief state such that, if there is no known link between α and β, then if β were afterwards found out to be not true, α should still be considered as true.
5 In (Fuhrmann 1997), this is called K open for A (K op A).
6 In (Fuhrmann 1997), this function is called choice function.
7 ¬A ≡K ¬C iff ∀X ⊆ K: X ⊢ ¬A iff X ⊢ ¬C
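For a finite X, Hansson's sentential negation (Definition 5) is immediate to compute. A minimal sketch, with sentences as plain strings and the disjuncts sorted only to make the output deterministic:

```python
def neg(X):
    """Sentential negation of a finite set X of sentences (Definition 5):
    neg(∅) = ⊥, neg({α}) = ¬α, otherwise the disjunction of all negations."""
    members = sorted(X)
    if not members:
        return "⊥"
    if len(members) == 1:
        return f"¬{members[0]}"
    return " ∨ ".join(f"¬{a}" for a in members)

print(neg(set()))        # ⊥
print(neg({"α"}))        # ¬α
print(neg({"p", "q"}))   # ¬p ∨ ¬q
```

The intuition is that retracting neg(X) forces at least one ¬αi to fail, i.e., it makes room for all of X, which is exactly what the Levi Identity needs when the input is a set.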
4 Infinitary Belief Change
Zhang (1996) observed that there was still a need for a general framework for belief revision by sets of sentences, especially infinite sets, a topic taken up again in (Zhang et al. 1997a; Zhang and Foo 2001). The Levi identity, for example, does not hold for infinite inputs. So the purpose of the articles was to extend the AGM framework (its axiomatization and modeling) to a more general one in order to include revision by any set of sentences.
4.1 General contraction
In the traditional AGM account of contraction, contracting K by A means removing enough from K so that A is not implied anymore. However, according to Zhang (1996), this would break the connection between revision and contraction when A is not finite. He proposed a new operator called general contraction (denoted by ⊖ and also called set contraction in (Zhang and Foo 2001)) whose purpose is to delete sentences from K so that the remaining set is consistent with A and logically closed. Zhang and Foo (2001) observe that, even though it is different from the initial purpose of contraction, this new operator elucidates a significant intuition about contraction: we only give up our beliefs when they conflict with some new information. General contraction comes down to an auxiliary tool to build revision. It amounts to the first step of internal revision and can be constructed using Definition 6.
Zhang and Foo showed that the relations between AGM revision and contraction are, with suitable adaptations, also valid between multiple revision and general contraction. According to the authors, that is why they focused on set contraction in their article, i.e., multiple revision can be derived through the Levi identity.
Zhang (1996) proposed a structure called total-ordering partition (TOP), which is logically independent. If the partition is rearranged in order to satisfy some logical constraints, we have a nice-ordering partition (NOP)8.
Definition 7. (Zhang 1996) For any belief set K, let P be a partition of K and < a total-ordering relation on P. The triple Σ = (K, P, <) is called a TOP of K. For any p ∈ P, if A ∈ p, p is called the rank of A, denoted by b(A).
A TOP Σ = (K, P, <) is a NOP if it satisfies the following logical constraint: if A1, ..., An ⊢ B, then sup{b(A1), ..., b(An)} ≥ b(B).
Using these new concepts, we are given an explicit construction for multiple contraction functions:
Definition 8. (Zhang 1996) Let Σ = (K, P, <) be a NOP of a belief set K. The NOP contraction ⊖ is such that:
• if A ∪ K is consistent, then K ⊖ A = K;
• otherwise, B ∈ K ⊖ A iff B ∈ K and there exists C ∈ K such that A ⊢ ¬C and ∀D ∈ K (C ⊢ D ∧ A ⊢ ¬D → (b(C ∨ B) < b(D ∨ B) ∨ ⊢ C ∨ B)).
Zhang (1996) provided a set of rationality postulates for NOP contraction, as well as another constructive approach:
Definition 9. (Zhang 1996) A NOP Σ = (K, P, <) of K is called a nice-well-ordering partition (NWOP) of K if < is a well-ordering relation on P.
A contraction generated by a NWOP was presented to establish a more computation-oriented method to deal with general belief revision.
In (Zhang et al. 1997a), the authors claim that in (Zhang 1996) we have a complete extension of AGM's postulates for belief changes but without a representation theorem for the framework proposed. So, in their paper, they provide two representation theorems for general contraction and, in addition, a new property called Limit Postulate in order to specify properties of infinite belief changes. They also developed a Partial Meet model for general contraction. When the input is not finite, the postulates for general contraction are not enough to characterize NOP contraction by infinite sets of sentences.
A plausible idea is to take the contraction by an infinite set as a limit of the contractions by its finite subsets. Let Ā ⊆f A denote that Ā is a finite subset of A. Then the Limit Postulate for general contraction can be stated as follows:
(K ⊖ LP) K ⊖ A = ⋃_{Ā ⊆f A} ⋂_{Ā ⊆ Ā′ ⊆f Cn(A)} K ⊖ Ā′
Theorem 8. (Zhang et al. 1997a) If ⊖ is a NOP contraction function, then ⊖ satisfies (K ⊖ LP).
Theorem 9. (Zhang et al. 1997a) Let ⊖ be a general contraction function over K. If ⊖ satisfies (K ⊖ LP), then there exists a NOP Σ = (K, P, <) of K such that ⊖ is exactly the NOP contraction generated by Σ.
4.2 Constructing revision
General revision (or set revision), denoted by ⊕ as in (Lindström 1991), can be defined in terms of general contraction analogously to the Levi Identity:
(Def ⊕) K ⊕ A = Cn((K ⊖ A) ∪ A)
The eight postulates proposed for general revision are the same as the ones given by Lindström (1991) and Peppas (2004), denoted by (K ⊕ 1)-(K ⊕ 8).
Considering infinite inputs, according to (Zhang et al. 1997a) a corresponding assumption for general revision can be obtained in terms of (Def ⊕):
(K ⊕ LP) K ⊕ A = ⋃_{Ā ⊆f A} ⋂_{Ā ⊆ Ā′ ⊆f Cn(A)} K ⊕ Ā′
In (Zhang and Foo 2001) it is proven that the Limit Postulate is enough to complete the full characterization of general belief change operations.
Proposition 1. (Zhang and Foo 2001) If ⊕ satisfies the postulates (K ⊕ 1)-(K ⊕ 8), then (K ⊕ LP) is equivalent to the following condition:
K ⊕ A = ⋂_{Ā ⊆f Cn(A)} (K ⊕ Ā) + A
8 NOP generalizes epistemic entrenchment (Gärdenfors 1988).
The results presented in (Zhang et al. 1997a) both for general contraction and revision give a groundwork for exploring the link between non-monotonic reasoning and multiple
belief revision, a topic investigated in (Zhang et al. 1997b), where the authors proposed a rational non-monotonic system and provided two representation theorems relating Multiple Revision and their system. According to (Zhang and Foo 2001), the Limit Postulate for revision, due to its equivalence to a property of non-monotonic reasoning called Finite Supracompactness (Zhang et al. 1997b), can be called the compactness of belief revision.
5 More on Systems of Spheres
After the initial work developed by Lindström (1991), Peppas (2004) studied some smoothness conditions on systems of spheres and their connection with multiple revision, giving also a constructive model for multiple revision based on systems of spheres along with a representation result.
A specific smoothness condition satisfied by a total preorder on possible worlds is called the limit assumption which, in the definition of system of spheres (Section 2.1), corresponds to condition (S4). Its central function is to ensure that for every consistent sentence ϕ there is always a 'most-plausible' ϕ-world. The smoothness conditions considered by Peppas in his article are actually variants of the limit assumption.
5.1 Well behaved functions
Peppas analyzed aspects of the well-orderedness of spheres:
Definition 10. (Peppas 2004) A system of spheres S is well ordered with respect to set inclusion iff it satisfies the (SW) property: every nonempty subset of S has a smallest element with respect to ⊆.
(SW) is stronger than (S4). With this definition, it is possible to define a class of revision functions:
Definition 11. (Peppas 2004) A revision ∗ is well behaved iff it can be constructed by well ordered systems of spheres.
The extension of the construction based on systems of spheres to multiple revision can be defined as follows:
Definition 12. (Peppas 2004) Let K be a theory of L and S a system of spheres centered on [K]. The multiple revision of K by Γ is:
(S⊕) K ⊕ Γ = ⋂ fS(Γ) if [Γ] ≠ ∅; K ⊕ Γ = L otherwise
However, Peppas observes that if S is restricted only by conditions (S1)-(S4), we cannot assume that for a set of sentences Γ there is always a smallest sphere intersecting [Γ], even for consistent inputs. Thus, an extra constraint on S is needed:
Definition 13. (Peppas 2004) A system of spheres S is called a well ranked system of spheres if it satisfies the (SM) property: for every consistent set of sentences Γ there exists a smallest sphere in S intersecting [Γ].
For studying multiple revision, the author restricted the systems of spheres considered to the family of well ranked ones. He recalls the postulates for multiple revision as defined by Lindström (1991), calling a function ⊕ that obeys the set of postulates a multiple revision function.
Theorem 10. (Peppas 2004) Let K be a theory of L and ⊕ a function satisfying (K ⊕ 1)-(K ⊕ 8). Then there exists a well ranked system of spheres S centered on [K] such that for all nonempty Γ ⊆ L, condition (S⊕) is satisfied.
On the connection between multiple revision and AGM sentence revision, the author brings the definitions of restriction and extendability:
Definition 14. (Peppas 2004) For a multiple revision ⊕, the restriction of ⊕ to sentences is a function ∗ defined such that, for all theories K and ϕ ∈ L, K ∗ ϕ = K ⊕ {ϕ}. An AGM revision function ∗ is extendable iff there exists a multiple revision function ⊕ whose restriction to sentences is ∗.
Based on the results from (Lindström 1991), Peppas states that the class of extendable revision functions corresponds to the family of revision functions corresponding, by means of (S*), to well ranked systems of spheres. As a consequence, all well behaved revision functions are extendable.
As for the possibility of reducing multiple revision to sentence revision: if the input is finite, the reduction is as already shown in the previous section, i.e., K ⊕ Γ = K ∗ ⋀Γ. If, on the other hand, the input is infinite, a possibility of reduction is proposed in the form of a theorem that works for sets Γ of arbitrary size. Nevertheless, it depends on a boundedness condition. A multiple revision function ⊕ is bounded iff there exists a system of spheres S corresponding to ⊕ by means of (S⊕) that has only finitely many spheres. Let Z(Γ) = {⋀∆ : ∆ ⊆f Γ}. Then:
Theorem 11. (Peppas 2004) Let ⊕ be a bounded multiple revision function and ∗ its restriction to sentences. Then, for any theory K and any set of sentences Γ of L, the condition (K ⊕ F) holds as follows: K ⊕ Γ = ⋂_{ϕ∈Z(Γ)} (K ∗ ϕ) + Γ.
(K ⊕ F) defines a reduction that starts with a revision of K by every finite conjunction ϕ of sentences in Γ. Then, each such revised theory K ∗ ϕ is expanded by Γ and, finally, all expanded theories (K ∗ ϕ) + Γ are intersected.
A set V of consistent complete theories is elementary iff V = [⋂V]. In words, V is elementary if no world outside V is compatible with the theory ⋂V. Consider the following condition on a system of spheres S:
(SF) For every G ⊆ S, ⋃G is elementary.
It is exactly the smoothness condition needed for the reduction of multiple revision to sentence revision in the spirit of (K ⊕ F):
Theorem 12. (Peppas 2004) Let K be a theory of L, ⊕ a multiple revision function and ∗ its restriction to sentences. Moreover, let S be a well ranked system of spheres centered on [K] that corresponds to ⊕ by means of (S⊕). Then, S satisfies (SF) iff ⊕ satisfies (K ⊕ F).
5.2 Extra constraints
Peppas, Koutras and Williams (2012) observed that the Limit Postulate demanded additional constraints on systems of spheres and that its relationship with the condition defined in Proposition 1 was still an open problem.
Theorem 13. (Peppas, Koutras, and Williams 2012) There exists a consistent theory K and a multiple revision function
⊕ satisfying (K ⊕ 1)-(K ⊕ 8) such that ⊕ satisfies (K ⊕ LP) but violates (K ⊕ F) at K.
From the theorem above, we can conclude that (K ⊕ LP) is strictly weaker than (K ⊕ F).
A sphere U ∈ S is said to be proper if it contains at least one world outside all spheres smaller than U. The core of U, denoted by U^c, is the set U^c = ⋃{V ∈ S : V ⊂ U}. So, a sphere U ∈ S is proper iff U ≠ U^c. Considering proper spheres, it is possible to add an extra restriction on them:
(EL) All proper spheres in S are elementary.
There is a dissimilarity between (K ⊕ LP) and (EL):
Lemma 2. (Peppas, Koutras, and Williams 2012) There is a consistent theory K and a well ranked system of spheres S centered on [K] such that S satisfies (EL) and yet the multiple revision function ⊕ induced from S violates (K ⊕ LP).
This last result shows that (EL) is not enough to characterize (K ⊕ LP) and, at the same time, (SF) is too strong. So there is a need for something in the middle. Before that, we need the definition of a finitely reachable sphere:
Definition 15. (Peppas, Koutras, and Williams 2012) Let K be a theory and S a system of spheres centered on [K]. A sphere V is finitely reachable in S iff there exists a consistent sentence ϕ ∈ L such that CS(ϕ) = V.
Consider the following restrictions on a system of spheres S, where V is an arbitrary sphere in S:
(R1) If V is finitely reachable then V is elementary.
(R2) If V is finitely reachable then V^c is elementary.
(R3) If V^c ≠ V then [⋂V^c] ⊆ V.
Theorem 14. (Peppas, Koutras, and Williams 2012) Let K be a theory, S a well-ranked system of spheres centered on [K] and ∗ the multiple revision induced from S on K via (S⊕). Then ∗ satisfies (K ⊕ LP) iff S satisfies (R1)-(R3).
6 Direct Constructions
Fuhrmann (1988), based on the Levi Identity, states that revision is clearly a compound operation, using this argument to defend that, given the not-very-complex nature of expansion operations, a theory of belief change should concentrate on the contraction operation. However, it would be interesting to study the definition of revision operations in a direct way, i.e., without using contraction as an intermediate step. This is one of the main goals of the works developed in (Falappa et al. 2012; Valdez and Falappa 2016) for belief bases.
Let B, A and A′ be sets of sentences and ∗p be a package revision operator on belief bases. Falappa et al. (2012) defined postulates for this operation:
(Inclusion) B ∗ A ⊆ B ∪ A.
(Weak Success) If A is consistent then A ⊆ B ∗ A.
(Consistency) If A is consistent then B ∗ A is consistent.
(Vacuity 1) If A is inconsistent then B ∗ A = B.
(Uniformity 1) Given two consistent sets A and A′, if for every subset X of B, X ∪ A ⊢ ⊥ iff X ∪ A′ ⊢ ⊥, then B \ (B ∗ A) = B \ (B ∗ A′).
(Uniformity 2) Given two consistent sets A and A′, if for every subset X of B, X ∪ A ⊢ ⊥ iff X ∪ A′ ⊢ ⊥, then B ∩ (B ∗ A) = B ∩ (B ∗ A′).
(Relevance) If α ∈ B \ (B ∗ A) then there is a set C such that B ∗ A ⊆ C ⊆ (B ∪ A), C is consistent with A but C ∪ {α} is inconsistent with A.
(Core-retainment) If α ∈ B \ (B ∗ A) then there is a set C such that C ⊆ (B ∪ A), C is consistent with A but C ∪ {α} is inconsistent with A.
Except for Weak Success and Uniformity 1, the postulates above were adapted from similar postulates for singleton revision (Hansson 1999b). By generalizing the techniques from classical belief base revision, the authors defined two kinds of construction: Package Kernel Revision and Package Partial Meet Revision9.
Definition 16. (Falappa et al. 2012) Let B be a belief base and σ an incision function. The package kernel revision on B generated by σ is the operator ∗σ such that, for all sets A:
B ∗σ A = (B \ σ(B ↓↓ A)) ∪ A if A is consistent; B ∗σ A = B otherwise
where B ↓↓ A stands for the set of minimal subsets of B inconsistent with A.
Theorem 15. (Falappa et al. 2012) An operator ∗ is a package kernel revision for B iff it satisfies inclusion, consistency, weak success, vacuity 1, uniformity 1 (and uniformity 2), and core-retainment.
Definition 17. (Falappa et al. 2012) Let B be a set of sentences and γ a selection function. The package partial meet revision on B generated by γ is the operator ∗γ such that, for all sets A:
B ∗γ A = (⋂ γ(B ↓ A)) ∪ A if A is consistent; B ∗γ A = B otherwise
Theorem 16. (Falappa et al. 2012) An operator ∗ is a package partial meet revision for B iff it satisfies inclusion, consistency, weak success, vacuity 1, uniformity 2 (and uniformity 1), and relevance.
The only non-prioritized aspect of these operators is that no change is performed when the incoming information is inconsistent.
In a very similar way to what was done in (Falappa et al. 2012), Valdez and Falappa (2016) proposed two constructions for multiple package revision in Horn logic, one based on kernel and the other based on partial meet. Both of them were characterized axiomatically.
9 Originally, Package Revision was called Prioritized Change.
7 Non-prioritized Multiple Revision
As seen in Section 3.1, multiple revision comes in two flavors. Choice revision is a non-prioritized kind of (multiple) revision, as the input does not have priority with respect to the initial beliefs: the agent can incorporate part of the new beliefs whilst ignoring the rest. (Falappa et al. 2012) and (Zhang 2018) also note that choice revision cannot be reduced to selective revision (Fermé and Hansson 1999).
7.1 The Levi Identity model
Zhang (2018) proposed two types of choice revision, one on the contraction + expansion approach and the other on the expansion + contraction one. Both of them were axiomatically characterized. Due to the usage of set negation to perform the contraction part of the choice operation, this approach depends on negation and on disjunction of sentences.
Before defining the operations, the author introduces an auxiliary operation called partial expansion (∔), which is a generalization of the traditional expansion operation. Given two sets of sentences B and A, B ∔ A contains the whole of B plus some of A.
Zhang also adapted a definition of negation set given in (Hansson 1993):
Definition 18. (Zhang 2018) Let A be some set of sentences. Then the negation set neg(A) of A is defined as follows:
1. neg(∅) = ⊤,
2. neg(A) = ⋃_{n≥1} {¬ϕ1 ∨ ¬ϕ2 ∨ · · · ∨ ¬ϕn | ϕi ∈ A for every i such that 1 ≤ i ≤ n}
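Unlike Definition 5, this negation set is infinite: the union ranges over every n ≥ 1 and the ϕi may repeat. For experimentation one can enumerate it up to a bound on n. The encoding below is our own sketch:

```python
from itertools import product

def neg_bounded(A, max_n):
    """A finite slice of Zhang's neg(A): all disjunctions ¬ϕ1 ∨ ... ∨ ¬ϕn
    with each ϕi drawn from A (repetitions allowed) and n ≤ max_n.
    For A = ∅ the result is ⊤, returned as a singleton set so the
    function always yields a set of sentences."""
    if not A:
        return {"⊤"}
    out = set()
    for n in range(1, max_n + 1):
        for combo in product(sorted(A), repeat=n):
            out.add(" ∨ ".join(f"¬{phi}" for phi in combo))
    return out

print(sorted(neg_bounded({"p", "q"}, 2)))
```

For A = {p, q} and n ≤ 2 this yields six sentences, among them ¬p, ¬q and ¬p ∨ ¬q; raising the bound only adds longer, logically weaker disjunctions.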
Finally, two constructions are shown: one for internal revision and one for external revision, both of them depending
on contraction.
Definition 19. An operator ∗c is an internal choice revision iff there exists a choice contraction −c and a consistency-preserving partial expansion ∔ such that for all sets B and A, B ∗c A = (B −c neg(A)) ∔ A.
Theorem 17. ∗c is an internal choice revision on consistent belief bases with finite inputs iff it satisfies the following conditions: for every consistent B and finite A and A′,
(∗c-inclusion) B ∗c A ⊆ (B ∪ A)
(∗c-success) If A ≠ ∅, then A ∩ (B ∗c A) ≠ ∅
(∗c-iteration) B ∗c A = (B ∩ (B ∗c A)) ∗c A
(∗c-consistency) If A ≢ {⊥}, then B ∗c A ⊬ ⊥
(∗c-coincidence) If A ∩ B ≠ ∅ and A ⊆ A′ ⊆ (A ∪ B), then B ∗c A = B ∗c A′
(∗c-uniformity) If it holds for all B′ ⊆ B that B′ ∪ {ϕ} ⊢ ⊥ for some ϕ ∈ A iff B′ ∪ {ψ} ⊢ ⊥ for some ψ ∈ A′, then B ∩ (B ∗c A) = B ∩ (B ∗c A′)
(∗c-relevance) If ϕ ∈ B \ B ∗c A, then there is some B′ with B ∩ (B ∗c A) ⊆ B′ ⊆ B, such that B′ ∪ {ψ} ⊬ ⊥ for some ψ ∈ A and B′ ∪ {ϕ} ∪ {λ} ⊢ ⊥ for every λ ∈ A
Definition 20. An operator ∗c is an external choice revision iff there exists a package contraction −p and a partial expansion ∔ such that for all B and A, B ∗c A = (B ∔ A) −p neg(A′), where A′ = (B ∔ A) \ B.
Theorem 18. ∗c is an external choice revision iff, for all B, B1, B2, A and A′, it satisfies ∗c-inclusion, success, coincidence and:
(∗c-confirmation) If (A ∩ (B ∗c A)) ⊆ B, then B ∗c A = B
(∗c-consistency) If (B ∗c A) \ B ≠ ∅, then (B ∗c A) \ B ⊬ ⊥
(∗c-uniformity) If B1 ≠ ((B1 ∗c A) ∪ B1) = B = ((B2 ∗c A′) ∪ B2) ≠ B2 and it holds for all B′ ⊆ B that B′ ∪ ((B1 ∗c A) \ B1) ⊬ ⊥ iff B′ ∪ ((B2 ∗c A′) \ B2) ⊬ ⊥, then B1 ∗c A = B2 ∗c A′
(∗c-relevance) If ϕ ∈ B \ B ∗c A, then there is some B′ with B ∗c A ⊆ B′ ⊆ B ∪ (B ∗c A), such that B′ ⊬ ⊥ and B′ ∪ {ϕ} ⊢ ⊥
7.2 The Descriptor Revision approach
Zhang (2019) also proposed two types of choice revision but
based on a different approach to belief change named Descriptor Revision (Hansson 2014). This approach applies a
“select-direct” procedure by considering that there is a set
of belief sets as possible results of belief change and this
change is implemented through a direct choice among these
possible results. Both types were characterized axiomatically through a set of postulates and a representation theorem, with the assumption of a finite language.
It is important to observe that, in this approach, revision
was explored without taking into account its connection with
contraction, i.e., it was defined without using contraction as
an intermediate step.
Zhang (2019) shows that, in general, choice revision by a finite set A cannot be reduced to selective revision by ⋁A and, similarly, it is not possible to perform choice revision by an AGM revision using a disjunction of all the sentences of the input.
7.3 The Multiple Believability Relations approach
Zhang (2019) also proposed a second modeling for choice revision, based on Multiple Believability Relations and without assuming a finite language. Zhang defines a believability relation ⪯ as a binary relation representing that "the subject is at least as prone to believing ϕ as to believing ψ" (ϕ ⪯ ψ). A multiple believability relation ⪯∗ is a binary relation on finite sets of formulas satisfying ϕ ⪯ ψ iff {ϕ} ⪯∗ {ψ}.
One of the ways of proceeding with the generalization described above is defining choice multiple believability relations (⪯c). A ⪯c B indicates that it is easier for an agent to absorb the plausible information in A than that in B.
A construction for this operation was proposed and axiomatically characterized.
7.4 The Semi-Revision approach
The operators defined in (Falappa, Kern-Isberner, and
Simari 2002) work with partial acceptance in the following
way: for a belief set K and an input set A, the incoming set
is initially accepted and, then, all possible inconsistencies of
K ∪ A are removed. So it is a kind of external revision.
However, the input sets considered are explanations. An
explanation contains an explanans (the beliefs that support a
consequence) and an explanandum (the final consequence).
Each explanation is a set of sentences with some restrictions.
The idea is that it does not seem rational for an agent to
absorb any external belief without an explanation to support
the provided belief, especially if the new information is not
consistent with its own set of beliefs.
The authors generalized the framework from (Hansson 1997) to define an operator ◦ that supports sets of sentences
(explanations) as input. Two constructions were proposed,
one based on kernel sets and the other based on remainder
sets, and both were characterized axiomatically.
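Several constructions in this survey rest on kernel sets (minimal subsets inconsistent with the input) and remainder sets (maximal subsets consistent with it), as in Definitions 16 and 17 and the explanation-based operators above. For finite propositional bases both can be computed by brute force. The sketch below instantiates the package revisions with the simplest possible incision function (cut every kernel entirely) and selection function (keep all remainders); the string-based formula encoding and all function names are our own:

```python
from itertools import combinations, product

def consistent(formulas, atoms=("p", "q")):
    """Brute-force satisfiability of a set of formulas over the given atoms."""
    return any(
        all(eval(f, {}, dict(zip(atoms, vals))) for f in formulas)
        for vals in product([False, True], repeat=len(atoms))
    )

def kernels(B, A):
    """B ↓↓ A: the minimal subsets of B that are inconsistent with A."""
    found = []
    for k in range(len(B) + 1):                 # smallest subsets first
        for sub in combinations(sorted(B), k):
            if not consistent(list(sub) + list(A)) and \
               not any(f <= set(sub) for f in found):
                found.append(set(sub))
    return found

def remainders(B, A):
    """B ↓ A: the maximal subsets of B that are consistent with A."""
    found = []
    for k in range(len(B), -1, -1):             # largest subsets first
        for sub in combinations(sorted(B), k):
            if consistent(list(sub) + list(A)) and \
               not any(set(sub) <= f for f in found):
                found.append(set(sub))
    return found

def kernel_revision(B, A):
    """Package kernel revision with the full incision: remove every kernel whole."""
    if not consistent(list(A)):
        return set(B)                           # non-prioritized: no change
    ks = kernels(B, A)
    cut = set().union(*ks) if ks else set()
    return (set(B) - cut) | set(A)

def partial_meet_revision(B, A):
    """Package partial meet revision with the full-meet selection function."""
    if not consistent(list(A)):
        return set(B)
    rems = remainders(B, A)                     # nonempty: ∅ is consistent with A
    return set.intersection(*rems) | set(A)

B, A = {"p", "q", "p and q"}, {"not p"}
print(kernel_revision(B, A))        # q survives; p and "p and q" are cut
print(partial_meet_revision(B, A))
```

Here both operators return {"q", "not p"}; on realistic bases one would replace the exponential consistency check with a SAT or prover call, and choose a less drastic incision or selection function.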
8
The core beliefs approach
Theorem 19. (Yuan, Ju, and Wen 2014) Let (B, A) ∈ B,
A = Cn(∅) ∩ B and ⊲ be an EMR operator for (B, A).
In the literature, we find two approaches for multiple revision which are based on the concept of core beliefs. The
belief set to be revised has a subset taken as core, which is
entirely preserved independently of the new information.
Both approaches are characterized axiomatically and also
receive two constructions: one based on kernels and another
based on remainders. In addition, the approaches use the
concept of belief state defined as follows:
1. If ⊲ is a KEMR operator, then ∗⊲ is a multiple package
kernel revision operator.
2. If ⊲ is a PMEMR operator, then ∗⊲ is a multiple package
partial meet revision operator.
EMR is also similar to selective revision, as in both approaches the input is treated by a separate mechanism before effectively being used to perform revision. Nevertheless, while the transformation function from Selective Revision usually returns logical consequences of the input, the
decision module from EMR produces subsets of the incoming set. In addition, Selective Revision does not protect core
beliefs. Hence, EMR cannot be considered a generalization
of Selective Revision.
Definition 21. (Yuan, Ju, and Wen 2014) A belief state is a
pair S = (B, A) satisfying: A ⊆ B ⊆ L, A is consistent
and A is logically closed within B, i.e., Cn(A) ∩ B ⊆ A.
The set of all belief states is denoted by B. For every (B, A)
∈ B, B is called the belief base and A the set of core beliefs.
As common properties, both operators satisfy three principles: minimal change (the agent should preserve as much
old beliefs as possible), consistency (the resulting belief
state should be consistent after revision) and protection (core
beliefs should always be preserved).
8.1
8.2
Rational Metabolic Revision
There are some contexts where the agent cannot identify,
initially, the implausible part of an incoming information.
One option is to incorporate all the new beliefs (expansion),
which may lead to some conflicts that will be useful to detect the implausible information and consolidate the belief
state. Yuan (2017) proposed a new multiple revision operator along this line, metabolic revision. This operator lies
in the expansion + consolidation variety of non-prioritized
belief revision (Hansson 1999a).
The name metabolic revision is due to a correlation with
body metabolism. If an animal finds some food and considers it as good to eat, the animal will ingest it and later its
body will eliminate some harmful substance or trash by the
digestive system. The idea for the operator is to work in a
similar manner with new information.
The metabolic revision operator is represented by ♦ and
maps a belief state (B, A) and a set of beliefs D to a new
belief state (B ′ , A′ ). Yuan proposed two constructions: one
based on kernel (♦σ ) and the other on partial meet (♦γ ).
Both were characterized axiomatically.
As observed by Yuan, semi-revision (Hansson 1997) is a particular case of metabolic revision in which A is empty and D is a singleton. While it is possible to establish an interrelation between two semi-revisions of different belief bases, metabolic revision is defined for a fixed belief state, i.e., properties for the interrelation between two metabolic revisions on different belief states were not defined.
8.2 Evaluative Multiple Revision
Evaluative Multiple Revision (EMR) is an operation through which the new information, instead of being directly handled, is first pre-processed in an evaluation step that takes into account the core beliefs of the agent; only then is the revision performed. It is therefore a sort of non-prioritized multiple revision, as the new information is not necessarily incorporated as a whole. EMR falls into the decision + revision variety of non-prioritized belief revision (Hansson 1999a), i.e., it is a two-phase revision process.
The new information is first submitted to a decision module which, using the core beliefs as criteria, performs an evaluation and produces two disjoint sets, one of plausible information and one of implausible information:
Definition 22. (Yuan, Ju, and Wen 2014) Given a belief state (B, A), an A-evaluation is a pair of sets of formulas in L, denoted by I|P, satisfying: (i) I ∪ P ≠ ∅, (ii) A ∪ P ⊬ ⊥ and (iii) Cn(A ∪ P) ∩ I = ∅.
I is the set of implausible new information, while P is the set of plausible new information. The set of all A-evaluations is denoted by A.
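As an illustration (not from the paper), the three conditions of Definition 22 can be checked mechanically when A, I and P are finite sets of propositional formulas, approximating Cn by classical entailment over the occurring atoms; the formula encoding and helper names below are our own:

```python
from itertools import product

# Formulas: an atom is a string; compounds are tuples ("not", f), ("and", f, g), ("or", f, g).

def atoms(f):
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def holds(f, v):
    if isinstance(f, str):
        return v[f]
    op = f[0]
    if op == "not":
        return not holds(f[1], v)
    if op == "and":
        return holds(f[1], v) and holds(f[2], v)
    return holds(f[1], v) or holds(f[2], v)  # "or"

def consistent(fs):
    """fs has a model, enumerating valuations of its atoms."""
    vs = sorted(set().union(*(atoms(f) for f in fs))) if fs else []
    return any(all(holds(f, dict(zip(vs, bits))) for f in fs)
               for bits in product([False, True], repeat=len(vs)))

def entails(fs, g):
    """Every model of fs satisfies g."""
    vs = sorted(set().union(*(atoms(f) for f in fs | {g})))
    for bits in product([False, True], repeat=len(vs)):
        v = dict(zip(vs, bits))
        if all(holds(f, v) for f in fs) and not holds(g, v):
            return False
    return True

def is_A_evaluation(A, I, P):
    # (i) I ∪ P ≠ ∅; (ii) A ∪ P consistent; (iii) no formula of I is in Cn(A ∪ P)
    return bool(I | P) and consistent(A | P) and not any(entails(A | P, f) for f in I)
```

For example, with core beliefs A = {p}, the pair I|P with I = {¬q} and P = {q} satisfies all three conditions, while I = {q}, P = {q} violates (iii).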
Unlike in other frameworks, the revision module does not receive a single set of sentences on which to perform the revision operation: it receives the pair of sets produced by the previous module. The idea is to revise the agent's beliefs by the plausible set and, at the same time, contract them by the implausible set. Thus, the EMR operator ⊲ maps a belief state (B, A) and an A-evaluation I|P to a new belief state, that is, the result of (B, A) ⊲ I|P is a pair as well.
The authors proposed two constructions: one based on the
kernel operation (KEMR, denoted by ⊲σ ) and another one
on partial meet operation (PMEMR, denoted by ⊲γ ). Both
of them were characterized axiomatically.
EMR was compared with the operations of multiple package revision defined in (Falappa et al. 2012) (shown in Section 6). Roughly, the operations ∗σ and ∗γ are special cases of ⊲σ and ⊲γ, respectively, when I is empty.
9 Conclusion and Open Problems
We presented a literature overview covering several models of Multiple Revision. We did not include the works on Iterated Revision, since our focus was on models in which the beliefs of the incoming set are processed simultaneously.
One of our aims was to unify the terminology and the symbols used in the area. Another goal was to give an overview of the Multiple Revision literature, shortly describing different works and comparing some of them. When applicable, we observed whether or not a reduction to singleton revision is possible.
The operations described in this paper involve basically two main models of epistemic states: sentential models (belief sets, belief bases and belief states) and possible worlds. Most of the operations work with package revision, but there are also a few for choice revision.
We can now list some open problems. Regarding the operators defined in (Falappa et al. 2012), the interrelations between package kernel and partial meet revision are not clear. From (Yuan, Ju, and Wen 2014), a possible future work is the characterization of non-prioritized multiple revision in a unified way, without dividing it into two modules. From (Yuan 2017), further exploration includes the definition of consolidation based on core beliefs and its relation with metabolic revision. For the choice revision operators defined in (Zhang 2018), it remains to define their respective constructions and also to study and establish the differences and connections between these operators and the one based on Descriptor Revision from (Zhang 2019). Choice Revision could also be defined and constructed without using contraction as an intermediate step, as well as investigated with infinite inputs.
In relation to other approaches, Selective Revision could be generalized to the multiple case, and non-prioritized multiple revision operators could be investigated in their relation with merge operators (Fuhrmann 1997; Falappa et al. 2012). Regarding the underlying logic of each model presented here, as most of them were developed for propositional logic, an important future work is to investigate how they can be adapted to work with other logics.
References
Alchourrón, C., and Makinson, D. 1982. On the logic of theory change: Contraction functions and their associated revision functions. Theoria 48(1):14–37.
Alchourrón, C.; Gärdenfors, P.; and Makinson, D. 1985. On the logic of theory change. Journal of Symbolic Logic 50:510–530.
Darwiche, A., and Pearl, J. 1997. On the logic of iterated belief revision. Artificial Intelligence 89(1–2):1–29.
Delgrande, J., and Jin, Y. 2012. Parallel belief revision. Artificial Intelligence 176(1):2223–2245.
Falappa, M.; Kern-Isberner, G.; Reis, M.; and Simari, G. 2012. Prioritized and non-prioritized multiple change on belief bases. Journal of Philosophical Logic 41(1):77–113.
Falappa, M. A.; Kern-Isberner, G.; and Simari, G. R. 2002. Explanations, belief revision and defeasible reasoning. Artificial Intelligence 141(1–2):1–28.
Fermé, E., and Hansson, S. O. 1999. Selective revision. Studia Logica 63(3):331–342.
Fuhrmann, A., and Hansson, S. O. 1994. A survey of multiple contractions. Journal of Logic, Language and Information 3(1):39–75.
Fuhrmann, A. 1988. Relevant Logics, Modal Logics and Theory Change. Ph.D. Dissertation, Australian National University.
Fuhrmann, A. 1997. An Essay on Contraction. FOLLI.
Gärdenfors, P. 1988. Knowledge in Flux: Modeling the Dynamics of Epistemic States. MIT Press.
Grove, A. 1988. Two modellings for theory change. Journal of Philosophical Logic 17:157–170.
Hansson, S. O. 1993. Reversing the Levi identity. Journal of Philosophical Logic 22(6):637–669.
Hansson, S. O. 1994. Kernel contraction. Journal of Symbolic Logic 59(3):845–859.
Hansson, S. O. 1997. Semi-revision. Journal of Applied Non-Classical Logics 7(1–2):151–175.
Hansson, S. O. 1999a. A survey of non-prioritized belief revision. Erkenntnis 50(2–3):413–427.
Hansson, S. O. 1999b. A Textbook of Belief Dynamics. Kluwer Academic Publishers.
Hansson, S. O. 2014. Descriptor revision. Studia Logica 102(5):955–980.
Konieczny, S., and Pérez, R. P. 2002. Merging information under constraints: a logical framework. Journal of Logic and Computation 12(5):773–808.
Lindström, S. 1991. A semantic approach to nonmonotonic reasoning: inference operations and choice. Uppsala Prints and Preprints in Philosophy 6.
Peppas, P.; Koutras, C. D.; and Williams, M.-A. 2012. Maps in multiple belief change. ACM Transactions on Computational Logic 13(4):30:1–30:23.
Peppas, P. 2004. The limit assumption and multiple revision. Journal of Logic and Computation 14(3):355–371.
Rott, H. 2001. Change, Choice and Inference. Oxford University Press.
Valdez, N. J., and Falappa, M. A. 2016. Multiple revision on Horn belief bases. In XXII Congreso Argentino de Ciencias de la Computación (CACIC 2016).
Yuan, Y.; Ju, S.; and Wen, X. 2014. Evaluative multiple revision based on core beliefs. Journal of Logic and Computation 25(3):781–804.
Yuan, Y. 2017. Rational metabolic revision based on core beliefs. Synthese 194(6):2121–2146.
Zhang, D., and Foo, N. 2001. Infinitary belief revision. Journal of Philosophical Logic 30(6):525–570.
Zhang, D.; Chen, S.; Zhu, W.; and Chen, Z. 1997a. Representation theorems for multiple belief changes. In IJCAI, 89–94.
Zhang, D.; Chen, S.; Zhu, W.; and Li, H. 1997b. Nonmonotonic reasoning and multiple belief revision. In IJCAI, 95–101.
Zhang, D. 1996. Belief revision by sets of sentences. Journal of Computer Science and Technology 11(2):108–125.
Zhang, L. 2018. Choice revision on belief bases. arXiv preprint arXiv:1805.01325.
Zhang, L. 2019. Choice revision. Journal of Logic, Language and Information 28(4):577–599.
A Principle-based Approach to Bipolar Argumentation
Liuwen Yu
University of Luxembourg, Luxembourg, University of Bologna, Italy, University of Turin, Italy
Leendert van der Torre
University of Luxembourg, Luxembourg, Zhejiang University, China
Abstract
Support relations among arguments can induce various kinds of indirect attacks corresponding to deductive, necessary or evidential interpretations. These different kinds of indirect attacks have been used in meta-argumentation, to define reductions of bipolar argumentation frameworks to traditional Dung argumentation frameworks, and to define constraints on the extensions of bipolar argumentation frameworks. In this paper, we give a complete analysis of twenty-eight bipolar argumentation framework semantics and fourteen principles. Twenty-four of these semantics are for deductive and necessary support and defined using a reduction, and four other semantics are defined directly. We consider five principles directly corresponding to the different kinds of indirect attack, three basic principles concerning conflict-freeness and the closure of extensions under support, three dynamic principles, a generalized directionality principle, and two supported argument principles. We show that two principles can be used to distinguish all reductions, and that some principles do not distinguish any reductions. Our results can be used directly to obtain a better understanding of the different kinds of support, to choose an argumentation semantics for a particular application, and to guide the search for new argumentation semantics of bipolar argumentation frameworks. Indirectly, they may be useful for the search for a structured theory of support, and the design of algorithms for bipolar argumentation.
keywords: Abstract argumentation, support, principle-based approach, bipolar argumentation framework
Introduction
In his requirements analysis for formal argumentation, Gordon (2018) proposes the following definition, covering more clearly argumentation in deliberation as well as persuasion dialogues: "Argumentation is a rational process, typically in dialogues, for making and justifying decisions of various kinds of issues, in which arguments pro and con alternative resolutions of the issues (options or positions) are put forward, evaluated, resolved and balanced." At an abstract level, it seems that these pro and con arguments can be represented more easily in so-called bipolar argumentation frameworks (Cayrol and Lagasquie-Schiex 2005; 2009; 2010; 2013), which contain, besides attack, also a support relation among arguments.
The concept of support has attracted quite some attention in the formal argumentation literature, maybe because it remains controversial how to use support relations to compute extensions. Most studies distinguish deductive support, necessary support and evidential support. Deductive support (Boella et al. 2010) captures the intuition that if a supports b, then the acceptance of a implies the acceptance of b, and, as a consequence, the non-acceptance of b implies the non-acceptance of a. Evidential support (Besnard and others 2008; Oren, Luck, and Reed 2010) distinguishes prima-facie from standard arguments, where prima-facie arguments do not require any support from other arguments to stand, while standard arguments must be supported by at least one prima-facie argument. Necessary support (Nouioua and Risch 2010) captures the intuition that if a supports b, then the acceptance of a is necessary to get the acceptance of b, or, equivalently, the acceptance of b implies the acceptance of a.
Despite this diversity, the study of support in abstract argumentation seems to agree on the following three points.
Relation between support and attack The role of support among arguments has often been defined as subordinate to attack, in the sense that in deductive and necessary support, if there are no attacks then there is no effect of support. On the contrary, in the evidential approach, without support there is no accepted argument, even if there is no attack.
Diversity of support Different interpretations of the notion of support can be distinguished, such as deductive (Boella et al. 2010), necessary (Nouioua and Risch 2011; Nouioua 2013) and evidential support (Besnard and others 2008; Oren, Luck, and Reed 2010; Polberg and Oren 2014).
Structuring support Whereas attack has been further structured into rebutting attack, undermining attack and undercutting attack, the different kinds of support have not yet led to a structured argumentation theory for bipolar argumentation frameworks.
Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
The picture that emerges from the literature is that the notion of support is much more diverse than the notion of attack. Whereas there is a general agreement in the formal argumentation community how to interpret attack, even when
different kinds of semantics have been defined, there is much
less consensus on the interpretation of support. Moreover, it
seems that each variant of support can be used for different
applications.
This paper contributes to a further understanding of the concept of support using a principle-based analysis. Some of the fourteen principles we study in this paper turn out to discriminate between the various reduction-based semantics of bipolar argumentation frameworks, and they can therefore be used to choose one semantics over another. Some other principles always hold, or never hold, and can therefore guide the search for new semantics of bipolar argumentation frameworks.
Principles and axioms can be used in many ways. Often, they conceptualize the behavior of a system at a higher level of abstraction. Moreover, in the absence of a standard approach, principles can be used as a guideline for choosing the appropriate definitions and semantics depending on various needs. In formal argumentation, principles are often of a more technical nature: the most discussed principles are admissibility, directionality and SCC-decomposability. In this paper, from these we study a generalized notion of directionality, taking into account not only the directionality of the attacks, but also that of the supports.
The layout of this paper is as follows. In Section 2 we
introduce the four kinds of indirect attack corresponding to the deductive and necessary interpretations discussed in the literature on bipolar argumentation. In Section 3 we introduce
the four atomic reductions corresponding to these four kinds
of indirect attack, and two iterated reductions. In Section 4
we introduce the fragment of bipolar argumentation frameworks with evidential support. In Section 5 we introduce the
new principles and we give an analysis of which properties
are satisfied by which reduction. Section 6 is devoted to the
related work and to some concluding remarks.
Indirect Attacks in Bipolar Argumentation Frameworks
This section gives a brief summary of the concept of indirect attack in bipolar argumentation. Dung's argumentation framework (Dung 1995) consists of a set of arguments and a relation between arguments, which is called attack.
Definition 1 (Argumentation framework (Dung 1995)) An argumentation framework (AF) is a tuple ⟨A, R⟩ where A is a set of arguments and R ⊆ A × A is a binary attack relation over A.
An AF can be represented as a directed graph, where the nodes represent arguments and the edges represent the attack relation: given a, b ∈ A, (a, b) ∈ R stands for a attacks b, noted a → b.
Definition 2 (Conflict-freeness & Defense (Dung 1995)) Let ⟨A, R⟩ be an AF:
• E ⊆ A is conflict-free iff ∄a, b ∈ E such that (a, b) ∈ R.
• E ⊆ A defends c iff ∀b ∈ A with (b, c) ∈ R, ∃a ∈ E such that (a, b) ∈ R.
We distinguish several definitions of extension, each corresponding to an acceptability semantics that formally rules the argument evaluation process.
Definition 3 (Acceptability semantics (Dung 1995)) Let ⟨A, R⟩ be an AF:
• E ⊆ A is admissible iff it is conflict-free and defends all its elements.
• A conflict-free E ⊆ A is a complete extension iff E = {a | E defends a}.
• E ⊆ A is the grounded extension iff it is the smallest (for set inclusion) complete extension.
• E ⊆ A is a preferred extension iff it is a maximal (for set inclusion) complete extension.
• E ⊆ A is a stable extension iff it is a preferred extension that defeats all arguments in A \ E.
Example 1 (Four arguments) The argumentation framework visualized on the left-hand side of Figure 1 is defined by AF = ⟨{a, b, c, d}, {(a, b), (b, a), (c, d), (d, c)}⟩. There are four preferred extensions: {a, c}, {b, c}, {a, d}, {b, d}, and they are all stable extensions.
Figure 1: An argumentation framework (AF) and a bipolar argumentation framework (BAF)
A bipolar argumentation framework is an extension of Dung's framework. It is based on a binary attack relation and a binary support relation over the set of arguments.
Definition 4 (Bipolar argumentation framework (Cayrol and Lagasquie-Schiex 2005)) A bipolar argumentation framework (BAF, for short) is a 3-tuple ⟨A, R, S⟩ where A is a set of arguments, R ⊆ A × A is a binary attack relation and S ⊆ A × A is a binary support relation, with R ∩ S = ∅. Thus, an AF is a special BAF of the form ⟨A, R, ∅⟩.
A BAF can be represented as a directed graph. Given a, b, c ∈ A, (a, b) ∈ R means a attacks b, noted a → b; (b, c) ∈ S means b supports c, noted b ⇢ c.
Example 2 (Four arguments, continued) The bipolar argumentation framework visualized on the right-hand side of Figure 1 extends the argumentation framework in Example 1 such that a supports c.
Support relations only influence the extensions when there are also attacks, which leads to the study of the interactions between attack and support. In the literature, the different kinds of relations between support and attack have been studied as different notions of indirect attack.
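As a sketch (not part of the paper), Definitions 2 and 3 can be checked by brute-force enumeration on small frameworks such as the one in Example 1; the function names are our own:

```python
from itertools import combinations

def conflict_free(E, R):
    return not any((a, b) in R for a in E for b in E)

def defends(E, c, A, R):
    # E defends c iff every attacker of c is attacked by some member of E
    return all(any((a, b) in R for a in E) for b in A if (b, c) in R)

def complete_extensions(A, R):
    # Complete: conflict-free and containing exactly the arguments it defends
    out = []
    for r in range(len(A) + 1):
        for E in combinations(sorted(A), r):
            E = set(E)
            if conflict_free(E, R) and E == {c for c in A if defends(E, c, A, R)}:
                out.append(E)
    return out

def preferred_extensions(A, R):
    cs = complete_extensions(A, R)
    return [E for E in cs if not any(E < F for F in cs)]

def stable_extensions(A, R):
    # Stable: preferred and attacking every argument outside the extension
    return [E for E in preferred_extensions(A, R)
            if all(any((a, b) in R for a in E) for b in A - E)]

# Example 1: a <-> b, c <-> d
A = {"a", "b", "c", "d"}
R = {("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")}
```

On this framework the enumeration yields the four preferred extensions of Example 1, all of which are stable, and the empty set as the grounded extension.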
Definition 5 (Four indirect attacks (Polberg 2017)) Let BAF = ⟨A, R, S⟩ be a BAF and a, b ∈ A. There is:
• a supported attack from a to b in BAF iff there exists an argument c s.t. there is a sequence of supports from a to c and c attacks b, represented as (a, b) ∈ Rsup.
• a secondary attack from a to b in BAF iff there exists an argument c s.t. there is a sequence of supports from c to b and a attacks c, represented as (a, b) ∈ Rsec.
• a mediated attack from a to b in BAF iff there exists an argument c s.t. there is a sequence of supports from b to c and a attacks c, represented as (a, b) ∈ Rmed.
• an extended attack from a to b in BAF iff there exists an argument c s.t. there is a sequence of supports from c to a and c attacks b, represented as (a, b) ∈ Rext.
Figure 2: Four kinds of indirect attack ((a) supported attack, (b) mediated attack, (c) secondary attack, (d) extended attack)
Definition 6 (Super-mediated attack (Cayrol and Lagasquie-Schiex 2013)) Let BAF = ⟨A, R, S⟩ be a BAF and a, b ∈ A. There is a super-mediated attack from a to b in BAF iff there exists an argument c s.t. there is a sequence of supports from b to c and a attacks c directly or by a supported attack, represented as (a, b) ∈ R^med_{Rsup}.
Figure 3: Super-mediated attack
We can obtain various kinds of indirect attacks according to different interpretations of the support relation. These indirect attacks were built from the combination of the direct attacks and the supports; then, from the obtained indirect attacks and the supports, we can build additional indirect attacks, and so on.
Definition 7 (Tiered indirect attacks (Polberg 2017)) Let BF = (A, R, S) be a BAF. The tiered indirect attacks of BF are as follows:
• R^ind_0 = ∅
• R^ind_1 = {R^sup_∅, R^sec_∅, R^med_∅, R^ext_∅}
• R^ind_i = {R^sup_E, R^sec_E, R^med_E, R^ext_E | E ⊆ R^ind_{i−1}} for i > 1, where:
– R^sup_E = {(a, b) | there exists an argument c s.t. there is a sequence of supports from a to c and (c, b) ∈ R ∪ ⋃E}
– R^sec_E = {(a, b) | there exists an argument c s.t. there is a sequence of supports from c to b and (a, c) ∈ R ∪ ⋃E}
– R^med_E = {(a, b) | there exists an argument c s.t. there is a sequence of supports from b to c and (a, c) ∈ R ∪ ⋃E}
– R^ext_E = {(a, b) | there exists an argument c s.t. there is a sequence of supports from c to a and (c, b) ∈ R ∪ ⋃E}
With R^ind we denote the collection of all sets of indirect attacks, ⋃_{i=0}^∞ R^ind_i.
Deductive and necessary support
In this section we rephrase the different kinds of indirect attacks as an intermediate step towards semantics for bipolar argumentation frameworks. The reductions can be used together with Definitions 2 and 3 to define the extensions of a bipolar argumentation framework.
The notion of conflict-freeness does not change, in the sense that the conflict-free principle for bipolar frameworks is defined in the same way as the related principle for Dung's theory, though now also indirect attacks are taken into account. For example, support relations can help arguments to defend against other arguments, and in general support relations can influence the acceptability of arguments in various ways. If we have a bipolar argumentation framework without support relations, we would like to recover Dung's Definitions 2 and 3, such that bipolar argumentation is a proper extension of Dung's argumentation. Moreover, if we have a bipolar argumentation framework without attack relations, then we would like to accept all arguments, for all semantics.
The idea of the atomic reductions is that we interpret all the support relations of the framework according to one of the types of support. This will help us in the analysis of the behavior of the different kinds of support.
Definition 8 (Existing reductions of BAF to AF) Given a BAF = ⟨A, R, S⟩, ∀a, b, c ∈ A:
• SupportedReduction (Cayrol and Lagasquie-Schiex 2013) (RS for short): Rsup, the collection of supported attacks, is given by (a, b) ∈ Rsup iff (a, c) ∈ S and (c, b) ∈ R; RS(BAF) = (A, R ∪ Rsup).
• MediatedReduction (Cayrol and Lagasquie-Schiex 2013) (RM for short): Rmed, the collection of mediated attacks, is given by (a, b) ∈ Rmed iff (b, c) ∈ S and (a, c) ∈ R; RM(BAF) = (A, R ∪ Rmed).
• SecondaryReduction (Cayrol and Lagasquie-Schiex 2013) (R2 for short): Rsec, the collection of secondary attacks, is given by (a, b) ∈ Rsec iff (c, b) ∈ S and (a, c) ∈ R; R2(BAF) = (A, R ∪ Rsec).
• ExtendedReduction (Cayrol and Lagasquie-Schiex 2013) (RE for short): Rext, the collection of extended attacks, is given by (a, b) ∈ Rext iff (c, a) ∈ S and (c, b) ∈ R; RE(BAF) = (A, R ∪ Rext).
• DeductiveReduction (Polberg 2017) (RD for short): Let R′ = {Rsup, R^med_{Rsup}} ⊆ R^ind be the collection of supported and super-mediated attacks in BF; RD(BAF) = (A, R ∪ ⋃R′).
• NecessaryReduction (Polberg 2017) (RN for short): Let R′ = {Rsec, Rext} ⊆ R^ind be the collection of secondary and extended attacks in BF; RN(BAF) = (A, R ∪ ⋃R′).
In general, we write E(BAF) for the extensions of a BAF, which are characterized by a reduction and a Dung semantics. We write ES(BAF) for the extensions of the bipolar framework using Dung semantics S.
Example 3 (Six reductions, continued) The reductions of the bipolar argumentation framework in Example 2 are visualized in Figure 4. They lead to the following extensions.
• After RS, we get the associated AF with the addition of a attacks d; the preferred extensions are: {a, c}, {b, c}, {b, d}.
• After RM, we get the associated AF with the addition of d attacks a; the preferred extensions are: {a, c}, {b, c}, {b, d}.
• After R2, we get the associated AF with the addition of b attacks c; the preferred extensions are: {a, c}, {a, d}, {b, d}.
• After RE, we get the associated AF with the addition of c attacks b; the preferred extensions are: {a, c}, {a, d}, {b, d}.
• After RD, we get the associated AF with the additions of a attacks d and d attacks a; the preferred extensions are: {a, c}, {b, c}, {b, d}.
• After RN, we get the associated AF with the additions of b attacks c and c attacks b; the preferred extensions are: {a, c}, {a, d}, {b, d}.
Figure 4: The initial BAF with the associated AFs after the reductions
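The one-step conditions of Definition 8 translate directly into set comprehensions. The following sketch (our own illustration, not from the paper) computes the four sets of added attacks for the BAF of Example 2:

```python
def one_step_indirect(R, S):
    """One-step supported, secondary, mediated and extended attacks (Definition 8)."""
    Rsup = {(a, b) for (a, c) in S for (c2, b) in R if c2 == c}  # (a,c) in S, (c,b) in R
    Rsec = {(a, b) for (c, b) in S for (a, c2) in R if c2 == c}  # (c,b) in S, (a,c) in R
    Rmed = {(a, b) for (b, c) in S for (a, c2) in R if c2 == c}  # (b,c) in S, (a,c) in R
    Rext = {(a, b) for (c, a) in S for (c2, b) in R if c2 == c}  # (c,a) in S, (c,b) in R
    return Rsup, Rsec, Rmed, Rext

# Example 2: a <-> b, c <-> d, and a supports c
R = {("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")}
S = {("a", "c")}
Rsup, Rsec, Rmed, Rext = one_step_indirect(R, S)
```

Computed directly from the defining conditions, RS adds the attack (a, d), R2 adds (b, c), RM adds (d, a), and RE adds (c, b).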
It should be noted that these atomic reductions can be combined in different ways into more complex notions of reduction. For example, it is common practice to add the indirect attacks of two types at once, and also the order in which attacks are added can have an impact.
Evidential support
Evidential support is usually defined for a more general bipolar framework in which sets of arguments can attack or support other arguments. To keep our presentation uniform and to compare evidential support to deductive and necessary support, we only consider the fragment of bipolar argumentation frameworks where individual arguments attack or support other arguments. This also simplifies the following definitions.
Moreover, evidential support involves special arguments which do not need to be supported by other arguments. Such arguments may have to satisfy other constraints, for example that they cannot be attacked by ordinary arguments, or that they cannot attack ordinary arguments. To keep our analysis uniform, we do not explicitly distinguish such special arguments, but encode them implicitly: if an argument supports itself, then it is such a special argument. This leads to the following definition of an evidential sequence for an argument.
Definition 9 (Evidential sequence) Given a BAF = ⟨A, R, S⟩, a sequence (a0, ..., an) of elements of A is an evidential sequence for argument an iff (a0, a0) ∈ S and, for 0 ≤ i < n, we have (ai, ai+1) ∈ S.
Definition 10 (e-Defense & e-Admissible) Given a BAF = ⟨A, R, S⟩, a set of arguments S ⊆ A e-defends argument a ∈ A iff for every evidential sequence (a0, ..., an) where an attacks a, there is an argument b ∈ S attacking one of the arguments of the sequence. Moreover, a set of arguments S is e-admissible iff
• for every argument a ∈ S there is an evidential sequence (a0, ..., a) such that each ai ∈ S (a is e-supported by S),
• S is conflict-free, and
• S e-defends all its elements.
In line with Dung's definitions, a set of arguments is called an e-complete extension if it is e-admissible and it contains all arguments it e-defends; it is the e-grounded extension iff it is a minimal e-complete extension; and it is e-preferred if it is a maximal e-admissible extension. Moreover, it is e-stable if, for every evidential sequence (a0, ..., an) where an is not in S, there is an argument b ∈ S attacking an element of the sequence. We use REv(BAF) to represent the associated AF of the BAF with evidential support.
The traditional definitions moreover add that elements of evidential support are unique, that support is minimal, and so on. This does not affect the definition of the extensions, and we therefore do not consider it in this paper.
Finally, there is also a kind of reduction of bipolar frameworks to Dung frameworks that does not work by a reduction of support relations to attack relations. Instead, it is based on a kind of meta-argumentation, in which the arguments of the Dung framework consist of sets or sequences of arguments in the bipolar framework. As this reduction is not directly relevant for the concerns of this paper, we refer the reader to the relevant literature (Polberg 2017).
Figure 5: The counterexample of Proof 1
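As a sketch (our own, not from the paper), Definitions 9 and 10 can be checked by brute force; enumerating only repetition-free sequences suffices here, since any evidential sequence contains a repetition-free one over a subset of its elements. The framework below is the counterexample used in Proof 1:

```python
def evidential_sequences(A, S, end):
    """Repetition-free evidential sequences (a0, ..., end) with (a0, a0) in S."""
    out = []
    def back(seq):
        if (seq[0], seq[0]) in S:
            out.append(list(seq))
        for x in A:
            if (x, seq[0]) in S and x not in seq:
                back([x] + seq)
    back([end])
    return out

def e_supported(a, E, A, S):
    return any(all(x in E for x in seq) for seq in evidential_sequences(A, S, a))

def e_defends(E, a, A, R, S):
    # every evidential sequence ending in an attacker of `a` must be attacked by E
    for an in A:
        if (an, a) in R:
            for seq in evidential_sequences(A, S, an):
                if not any((b, x) in R for b in E for x in seq):
                    return False
    return True

def e_admissible(E, A, R, S):
    return (all(e_supported(a, E, A, S) for a in E)
            and not any((a, b) in R for a in E for b in E)
            and all(e_defends(E, a, A, R, S) for a in E))

# Counterexample framework of Proof 1
A = {"a", "b", "c", "d"}
R = {("d", "b")}
S = {("a", "a"), ("a", "b"), ("b", "c"), ("d", "d")}
S2 = S | {("a", "c")}  # after adding the transitive support, as in Proof 1
```

On this framework {a, d} is e-admissible, while {a, c, d} only becomes e-admissible once the transitive support (a, c) is added, illustrating the failure of transitivity under evidential support.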
Table 1: Comparison among the reductions and the proposed principles. We refer to Dung's semantics as follows: Complete (C), Grounded (G), Preferred (P), Stable (S). When a principle is never satisfied by a certain reduction for any semantics, we use the × symbol. P1 refers to Principle 1, and similarly for the others.
Red.   P1     P2     P3     P4     P5
RS     CGPS   CGPS   ×      ×      ×
RM     CGPS   ×      CGPS   ×      ×
R2     CGPS   ×      ×      CGPS   ×
RE     CGPS   ×      ×      ×      CGPS
RD     CGPS   CGPS   ×      CGPS   ×
RN     CGPS   ×      CGPS   ×      CGPS
REv    ×      ×      ×      ×      ×
Principle-based analysis of the different kinds of indirect attack
In this section we introduce principles corresponding to the different notions of indirect attack. They correspond to the constraints TRA, nATT and n+ATT of Cayrol et al. (Cayrol and Lagasquie-Schiex 2015). Basically, these properties correspond to the interpretations underlying the different kinds of support.
Principle 1 (Transitivity) For each BAF = ⟨A, R, S⟩, if aSb and bSc, then E⟨A, R, S⟩ = E⟨A, R, S ∪ {aSc}⟩.
Principle 2 (Supported attack) For each BAF = ⟨A, R, S⟩, if aSc and cRb, then E⟨A, R, S⟩ = E⟨A, R ∪ {aRb}, S⟩.
Principle 3 (Mediated attack) For each BAF = ⟨A, R, S⟩, if bSc and aRc, then E⟨A, R, S⟩ = E⟨A, R ∪ {aRb}, S⟩.
Principle 4 (Secondary attack) For each BAF = ⟨A, R, S⟩, if cSb and aRc, then E⟨A, R, S⟩ = E⟨A, R ∪ {aRb}, S⟩.
Principle 5 (Extended attack) For each BAF = ⟨A, R, S⟩, if cSa and cRb, then E⟨A, R, S⟩ = E⟨A, R ∪ {aRb}, S⟩.
Proposition 1 REv does not satisfy Principle 1 for any of the semantics.
Proof 1 We use a counterexample to prove that REv does not satisfy Principle 1 for e-complete semantics. Assume a BAF = ⟨A, R, S⟩ in which A = {a, b, c, d}, R = {(d, b)} and S = {(a, a), (a, b), (b, c), (d, d)}; the e-complete extension of BAF is {a, d}. Because a supports b and b supports c, transitivity adds that a supports c, so we have BAF′ = ⟨{a, b, c, d}, {(d, b)}, {(a, a), (a, b), (b, c), (d, d), (a, c)}⟩, whose e-complete extension is {a, c, d}; see Figure 5.
The table below shows the correspondence between the reductions and the first five principles. We omit the straightforward proofs.
Basic principles
We start with a basic property from Baroni's classification (Baroni and Giacomin 2007): conflict-freeness. Since we only add attack relations, and all Dung semantics satisfy the conflict-free principle, the property of conflict-freeness is trivially satisfied for all reductions.
Principle 6 (Conflict-freeness) Given a BAF = (A, R, S), if (a, b) ∈ R, then ∄E ∈ E(BAF) s.t. a, b ∈ E.
The important principle of closure of an extension under supported arguments was introduced by Cayrol et al. (Cayrol and Lagasquie-Schiex 2015).
Principle 7 (Closure) Given a BAF = (A, R, S), for all extensions E in E, ∀a, b ∈ A, if aSb and a ∈ E, then b ∈ E.
The following propositions show that closure under supported arguments holds only for some reductions.
Proposition 2 RS and RM satisfy Principle 7 for all the semantics.
Proof 2 We prove Proposition 2 by contradiction. Let E ⊆ A be a complete extension of the AF associated with a BAF after RM. Assume RM does not satisfy Principle 7 for complete semantics, i.e., ∃a ∈ E and b ∈ A\E s.t. (a, b) ∈ S. As b ∉ E, there exists c ∈ A with (c, b) ∈ R such that no d ∈ E attacks c (otherwise E would defend b). Since (a, b) ∈ S and c attacks b, c mediated-attacks a, so in RM(BAF) c attacks a; as no d ∈ E attacks c, E does not defend a, hence E is not admissible. This contradicts E being a complete extension. Therefore, RM satisfies Principle 7 for complete semantics.
Polberg (Polberg 2017) introduces a variant of closure, called inverse closure.
Principle 8 (Inverse closure) Given a BAF = ⟨A, R, S⟩, for all extensions E in E, ∀a, b ∈ A, if aSb and b ∈ E, then a ∈ E.
The following proposition shows that the reductions that do not satisfy closure satisfy the inverse closure principle instead. Consequently, closure and inverse closure are good principles to distinguish the behavior of the reductions.
Proposition 3 R2 and RE satisfy Principle 8 for all the semantics.
Proof 3 We prove Proposition 3 by contradiction. Let E ⊆ A be a complete extension of the AF associated with a BAF after R2. Assume R2 does not satisfy Principle 8, i.e., there are b ∈ E and a ∉ E s.t. (a, b) ∈ S. As a ∉ E, there exists c ∈ A with (c, a) ∈ R such that no d ∈ E attacks c (otherwise E would defend a). Since c attacks a and (a, b) ∈ S, c secondary-attacks b, so in R2(BAF) c attacks b; as no d ∈ E attacks c, E does not defend b, hence E is not admissible. This contradicts E being a complete extension. Therefore, R2 satisfies Principle 8 for complete semantics.
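Proposition 2 can be spot-checked by brute force on the running example. The following sketch (our own, not from the paper) computes the preferred extensions of RM(BAF) for Example 2 and verifies that each is closed under support:

```python
from itertools import combinations

def preferred(A, R):
    # brute-force preferred extensions (maximal complete extensions) of a Dung AF
    def cf(E):
        return not any((x, y) in R for x in E for y in E)
    def defends(E, c):
        return all(any((x, b) in R for x in E) for b in A if (b, c) in R)
    comp = [set(E) for r in range(len(A) + 1) for E in combinations(sorted(A), r)
            if cf(E) and set(E) == {c for c in A if defends(E, c)}]
    return [E for E in comp if not any(E < F for F in comp)]

def mediated_reduction(R, S):
    # MediatedReduction RM: add (a, b) whenever (b, c) in S and (a, c) in R
    return R | {(a, b) for (b, c) in S for (a, c2) in R if c2 == c}

def closed_under_support(E, S):
    return all(b in E for (a, b) in S if a in E)

# Example 2: a <-> b, c <-> d, and a supports c
A = {"a", "b", "c", "d"}
R = {("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")}
S = {("a", "c")}
exts = preferred(A, mediated_reduction(R, S))
```

Here RM adds the attack (d, a); every resulting preferred extension that contains a also contains c, as Principle 7 requires.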
Principle 10 (Addition persistence) Suppose E is an extension of a BAF, and a, b ∈ E. Now BAF′ is the framework with the addition of a support relation from a to b, i.e.,
ES (A, R, S) = ES (A, R, S ∪ {(a, b)}). We have that E is also an
extension of BAF′.
As expected, addition persistence holds for all reductions.
However, Principle 9 only holds for the grounded semantics. Below is the proof that R2 does not satisfy Principle 9
for preferred semantics; we omit the other proofs due to lack of
space.
Proposition 5 All the reductions satisfy Principle 10 for all
the semantics.
Proof 6 Due to lack of space we only provide a proof
sketch. When two arguments are already in E, which is an
extension of a BAF, and we add a support relation between them,
there are three situations: in the first, no new attack
needs to be added; in the second, a new attack is added from an argument inside the extension to an outside argument, which
has no influence on the extension; in the third, a new attack is added
from an outside argument to an inside argument, but the attacked argument is still defended. Thus E is still
an extension of BAF′.
Proposition 4 None of the reductions satisfies Principle 9
for any of the semantics except the grounded semantics.
Proof 4 We use a counterexample to prove that R2 does not satisfy Principle 9 for preferred semantics. Assume a BAF =
⟨A, R, S⟩, in which A = {a, b, c}, R = {(b, a), (a, c)}, S = ∅;
the preferred semantics of BAF is {b, c}. Let c support b;
then we have BAF′ = ⟨{a, b, c}, {(b, a), (a, c)}, {(c, b)}⟩, and the
preferred semantics of BAF′ is {a} and {b, c}.
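Small counterexamples like this one are easy to machine-check by brute force. The sketch below is our own illustration, not the authors' code: it enumerates the preferred extensions of a Dung AF, assuming a one-step R2-style secondary-attack reduction (x attacks y and y supports z yields an attack from x to z), and reproduces the extensions stated in Proof 4.

```python
from itertools import combinations

def preferred(args, attacks):
    """Maximal (w.r.t. set inclusion) admissible sets of a Dung AF."""
    def conflict_free(s):
        return not any((x, y) in attacks for x in s for y in s)

    def defended(s, a):
        # every attacker of a is counter-attacked by some member of s
        return all(any((d, x) in attacks for d in s)
                   for (x, y) in attacks if y == a)

    admissible = [set(s) for r in range(len(args) + 1)
                  for s in combinations(sorted(args), r)
                  if conflict_free(s) and all(defended(set(s), a) for a in s)]
    return {frozenset(s) for s in admissible
            if not any(s < t for t in admissible)}

def add_secondary_attacks(attacks, supports):
    # one-step R2-style secondary attack: x attacks y, y supports z => x attacks z
    return attacks | {(x, z) for (x, y) in attacks
                      for (u, z) in supports if u == y}

args = {"a", "b", "c"}
attacks = {("b", "a"), ("a", "c")}
print(sorted(map(sorted, preferred(args, add_secondary_attacks(attacks, set())))))
# [['b', 'c']]  -- unique preferred extension of the initial BAF
print(sorted(map(sorted, preferred(args, add_secondary_attacks(attacks, {("c", "b")})))))
# [['a'], ['b', 'c']]  -- adding the support (c, b) increases the number of extensions
```

Adding the support relation raises the number of preferred extensions from one to two, which is exactly the violation of Principle 9 described above.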
Along the same lines, the following principle considers
the removal of support relations in specific cases.
Principle 11 (Removal persistence) Suppose E is an extension of a BAF, ∀a, b, c ∈ A, where a supports c and c attacks b, a ∈ E but b ∉ E. Now BAF′ is the framework which
removes the support relation from a to c, ES (A, R, S) =
ES (A, R, S \ {(a, c)}); we have that E is also an extension of
BAF′. Or suppose E is an extension of a BAF, ∀a, b, c ∈
A, where c supports a and c attacks b, a ∈ E but b ∉ E. Now
BAF′ is the framework which removes the support relation
from c to a, ES (A, R, S) = ES (A, R, S \ {(c, a)}); we have that
E is also an extension of BAF′.
A more fine-grained analysis is based on dynamic properties that consider the addition of relations in certain cases.
The following principle considers the addition of support relations among arguments which are both accepted.
Principle 9 (Number of extensions) |ES (A, R, S ∪ S′)| ≤ |ES (A, R, S)|.
Figure 7: The counterexample of Proof 5
Dynamic properties often give insight into the behavior of semantics. Principle 9 says that adding support relations can
only lead to a decrease in the number of extensions, as in Example 3.
Dynamic principles
Figure 6: The counterexample of Proof 4: (a) the initial BAF; (b) BAF′ after the addition of support
Principle 11 only holds for some reductions.
Proposition 6 RM and R2 do not satisfy Principle 11 for
all the semantics.
Proof 5 We use a counterexample to prove that REv does
not satisfy Principle 9 for e-complete semantics. Assume a BAF = ⟨A, R, S⟩, in which A = {a, b, c, d}, R =
{(a, b), (b, a)}, S = {(c, c), (c, a)}; the e-complete semantics of BAF is {a, c}. Let d support itself; then we have
BAF′ = ⟨{a, b, c, d}, {(a, b), (b, a)}, {(c, c), (c, a), (d, d)}⟩, and
the e-complete semantics of BAF′ is {a, c, d} and {b, c, d}.
Proof 7 We use a counterexample to prove that RM and R2 do
not satisfy Principle 11 for preferred semantics. The preferred semantics of Figure 2(b) is {a}; if we delete the support relation from b to c, the preferred semantics turns to
{a, b}. The preferred semantics of Figure 2(c) is {a}; if we
delete the support relation from c to b, the preferred semantics turns to {a, b}.
Directionality
Directionality can be generalized to bipolar argumentation
as follows.
Definition 11 (Unattacked and unsupported arguments
in BAF) Given a BAF = ⟨A, R, S⟩, a set U is unattacked
and unsupported if and only if there exists no a ∈ A \ U such
that a attacks U or a supports U. The set of unattacked and unsupported sets in a BAF is denoted US(BAF) (U for short).
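Since Definition 11 quantifies over subsets, US(BAF) can be enumerated directly on small frameworks. The following sketch is our own illustration (the toy BAF and the function name are not from the paper):

```python
from itertools import combinations

def us_sets(args, attacks, supports):
    """All sets U such that no outside argument attacks or supports a member of U."""
    result = set()
    for r in range(len(args) + 1):
        for u in combinations(sorted(args), r):
            u = set(u)
            outside = args - u
            if not any((x, y) in attacks or (x, y) in supports
                       for x in outside for y in u):
                result.add(frozenset(u))
    return result

# Toy BAF: a attacks b, b supports c.
print(sorted(map(sorted, us_sets({"a", "b", "c"}, {("a", "b")}, {("b", "c")}))))
# [[], ['a'], ['a', 'b'], ['a', 'b', 'c']]
```

Here {b} is excluded because a attacks b from outside, and {a, c} is excluded because b supports c from outside.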
Figure 8: The counterexample for Proof 10
Proposition 10 RM does not satisfy Principle 12 for
grounded, complete and preferred semantics.
Principle 12 (BAF Directionality) A BAF semantics σ
satisfies the BAF directionality principle iff for every BAF,
for every U ∈ US(BAF), it holds that σ (BAF↓U ) = {E ∩
U|E ∈ σ (BAF)}, where for BAF = hA, R, Si, BAF↓U =
(U, R ∩ U × U, S ∩ U × U) is a projection, and σ (BAF↓U )
are the extensions of the projection.
Proof 11 We use a counterexample to prove that RM does not
satisfy Principle 12 for preferred semantics, which is shown
in Figure 9. Due to lack of space, we omit the details.
In (Baroni and Giacomin 2007), the authors showed
that stable semantics violates directionality; we therefore omit the
proof that none of the reductions satisfies Principle 12 for
stable semantics.
Proposition 7 RS satisfies Principle 12 for grounded, complete and preferred semantics.
Figure 9: The counterexample for Proof 11
Proof 8 To prove Proposition 7, we use proof by contradiction. Assume RS does not satisfy Principle 12. Let U1 be an
unattacked and unsupported set, let U2 be A \ U1, and assume
a semantics for AFs that satisfies directionality for grounded,
complete and preferred semantics. From the above, we have
a supports c and c attacks b, so that a supported-attacks b, with a
in U2 and b in U1. If b is in U1, then c must be in U1; if c
is in U1, then a must be in U1. Contradiction.
Supported arguments
Principle 13 (Global support) Given a BAF = ⟨A, R, S⟩,
for all extensions E ∈ ES (A, R, S), if a ∈ E, then there must be an
argument b s.t. b ∈ E and b supports a.
Principle 14 (Grounded support) Given a BAF = ⟨A, R,
S⟩, for all extensions E ∈ ES (A, R, S), if a ∈ E, then there must be an
argument b ∈ E with (b, b) ∈ S (or (a, a) ∈ S), s.t. there is a
support sequence (b, a0, . . . , an, a) with all ai ∈ E.
Proposition 8 R2 satisfies Principle 12 for grounded, complete and preferred semantics.
Proof 9 To prove Proposition 8, we use proof by contradiction. Assume R2 does not satisfy Principle 12. Let U1 be an
unattacked and unsupported set, let U2 be A \ U1, and assume
a semantics for AFs that satisfies directionality for grounded,
complete and preferred semantics. From the above, we have
a supports c and c attacks b, so that a secondary attacks b, with a in
U2 and b in U1. If b is in U1, then c must be in U1; if c is in
U1, then a must be in U1. Contradiction.
Proposition 11 All the reductions except REv do not satisfy Principles 13 and 14 for all the semantics.
Proof 12 We can simply use a counterexample to prove
Proposition 11. Assume we have a BAF = ⟨{a}, ∅, ∅⟩. Then
Ecomplete (RS(BAF)) = Ecomplete (RM(BAF)) = Ecomplete (R2(BAF)) = Ecomplete (RE(BAF)) = Ecomplete (RD(BAF)) =
Ecomplete (RN(BAF)) = {{a}}. However, there is no argument that supports a.
Proposition 9 RE does not satisfy Principle 12 for
grounded, complete and preferred semantics.
The following table summarizes the results of this section.
Proof 10 We use a counterexample to prove that RE does not
satisfy Principle 12 for preferred semantics. Assume we
have the BAF visualized on the left of Figure 8, in which argument
c supports a; then we have the associated AF visualized
in the middle of Figure 8, in which we add an extended attack from a to b and likewise from a to d. From the initial BAF,
we have an unattacked and unsupported set U = {b, c, d};
the right of Figure 8 visualizes BAF↓U. The preferred extensions of BAF are σ(BAF) = σ(AF) = {{a, c}}, and σ(BAF↓U) =
{{c}, {b, d}}, while {E ∩ U | E ∈ σ(BAF)} = {{c}}, so σ(BAF↓U) ≠ {{c}}.
Thus, BAF↓U does not behave as a projection of the whole framework.
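The failure of directionality here is a plain set computation over the stated extensions; a quick sketch (variable names are our own):

```python
# Extensions stated in Proof 10.
sigma_baf = {frozenset({"a", "c"})}                      # preferred extensions of the whole BAF
sigma_proj = {frozenset({"c"}), frozenset({"b", "d"})}   # preferred extensions of BAF restricted to U
u = frozenset({"b", "c", "d"})                           # the unattacked and unsupported set

restricted = {e & u for e in sigma_baf}                  # {E ∩ U | E ∈ σ(BAF)}
print(restricted)                 # {frozenset({'c'})}
print(sigma_proj == restricted)   # False: BAF directionality fails for RE
```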
Concluding remarks, related and future work
There is a gap between the formal analysis of bipolar frameworks, i.e., knowledge reasoning, and the informal
representation, i.e., knowledge representation. In (Cayrol and
Lagasquie-Schiex 2013), the authors give the following example written in natural language: “a bachelor degree supports a scholarship”. The interpretation of this sentence is
subjective. One can either give the necessary-support interpretation: “A bachelor degree is necessary for a scholarship, so if someone does not have a bachelor degree, one
Table 2: Comparison among the reductions and the proposed principles.

Red.  P6    P7    P8    P9  P10   P11   P12   P13   P14
RS    CGPS  GCPS  ×     G   CGPS  GCPS  CGP   ×     ×
RM    CGPS  GCPS  ×     G   CGPS  ×     ×     ×     ×
R2    CGPS  ×     GCPS  G   CGPS  ×     CGP   ×     ×
RE    CGPS  ×     GCPS  G   CGPS  GCPS  ×     ×     ×
RD    CGPS  GCPS  ×     G   CGPS  ×     ×     ×     ×
RN    CGPS  ×     GCPS  G   CGPS  ×     ×     ×     ×
REv   CGPS  ×     ×     G   CGPS  ×     ×     GCPS  GCPS
does not get a scholarship”; or give a deductive interpretation: “A bachelor degree is sufficient for a scholarship, so if
one does not get a scholarship, one does not have a bachelor degree”. This translation from natural language to a formal one is a matter of pragmatics, i.e., whether “A supports B”
means “A implies B” (sufficient reason) or “B implies A”
(necessary reason), or a mix of the two yielding a more complicated
relation. As a result, different agents have different interpretations; formal argumentation may play the role of a meta-dialogue to
settle this issue, for example when adopted for legal interpretation.
However, the considerations above do not invalidate our
work on the principle-based approach for bipolar argumentation. On the contrary, because of the ambiguity at the
pragmatic and semantic levels, a principle-based approach
can be very useful for better understanding the choices of a particular formalization.
In this paper, we have proposed an axiomatic approach to
bipolar argumentation framework semantics, summarized in Tables 1 and 2. We considered
seven reductions from bipolar argumentation frameworks to
a Dung-like abstract argumentation framework, four standard semantics
to compute the set of accepted arguments, and fourteen principles to study the considered reductions. This work can be
extended by considering more reductions, more semantics,
and more principles. Our principles are all independent of
which admissibility-based semantics is used, though some
principles do not hold for semi-stable semantics. Moreover,
they do not hold for some of the naive-based semantics.
Some general insights can be extracted from the tables.
Our principles P6, P11 and P12 can be used to distinguish
among different kinds of reductions, and can be used to
choose a reduction for a particular application. Principles
like P9, which holds only for the grounded semantics, can be used in the further search for
semantics. Also, we can define new semantics directly associating extensions with bipolar argumentation frameworks,
i.e., without using a reduction.
The results of this paper give rise to many new research
questions. We intend to analyze the similarity between reductions for preference-based argumentation frameworks
and for bipolar argumentation frameworks: in both
frameworks, the support relation and the preference relation can be
both added and removed. In this way, the theory of reductions for preference-based argumentation and bipolar argumentation is closely related to dynamic principles for AFs
(Rienstra, Sakama, and van der Torre 2015), which can be
a source of further principles. Similarly, as in preference-based argumentation, symmetric attack can be studied.
Furthermore, the first volume of the handbook of formal
argumentation (Baroni, Gabbay, and Giacomin 2018) surveys the definitions, computation and analysis of abstract
argumentation semantics depending on different criteria to
decide the sets of acceptable arguments, and various extensions of Dung’s framework have been proposed. There
are many topics where bipolar argumentation could be used,
and such uses could inspire new principles. Gordon’s (2018)
requirements analysis for formal argumentation suggests
that attack and support should be treated as equals in
formal argumentation, which is also suggested by applications like DebateGraph. The handbook also discusses many
topics where the theory of bipolar argumentation needs to
be further developed. A structured theory of argumentation
seems to be needed most. For example, maybe the most natural kind of support is a lemma supporting a proof. This
corresponds to the idea of a sub-argument supporting its
super-arguments. In Toulmin’s argument structure, support
arguments could be used as a warrant. Moreover, the role
of support in dialogue needs to be clarified. Prakken argues
that besides argumentation as inference, there is also argumentation as dialogue; several chapters of the handbook are
concerned with this, such as argumentation schemes. The
core of that theory is a set of critical questions, which can
be interpreted as attacks. Maybe the answers to the critical
questions can be modeled as support?
Finally, like Doutre et al. (P. et al. 2017), we believe that
the scope of the “principle-based approach” to argumentation semantics (Baroni and Giacomin 2007) can be widened.
In the manifesto (Gabbay et al. 2018), it is argued that axioms are a way to relate formal argumentation to other areas
of reasoning, e.g. social choice.
Acknowledgement
This project has received funding from the European
Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie ITN EJD grant agreement
No 814177.
We thank Dr. Srdjan Vesic and Dr. Tjitze Rienstra for their valuable advice.
References
Baroni, P., and Giacomin, M. 2007. On principle-based
evaluation of extension-based argumentation semantics. Artificial Intelligence 171(10-15):675–700.
Baroni, P.; Gabbay, D.; and Giacomin, M. 2018. Handbook
of Formal Argumentation. College Publications.
Besnard, P., et al. 2008. Semantics for evidence-based argumentation. Computational Models of Argument: Proceedings of COMMA 2008 172:276.
Boella, G.; Gabbay, D. M.; van der Torre, L.; and Villata, S.
2010. Support in abstract argumentation. In Proceedings of
the Third International Conference on Computational Models of Argument (COMMA’10), 40–51. Frontiers in Artificial
Intelligence and Applications, IOS Press.
Rienstra, T.; Sakama, C.; and van der Torre, L. 2015. Persistence and monotony properties of argumentation semantics.
In International Workshop on Theory and Applications of
Formal Argumentation, 211–225. Springer.
Cayrol, C., and Lagasquie-Schiex, M.-C. 2005. On the
acceptability of arguments in bipolar argumentation frameworks. In European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty, 378–389.
Springer.
Cayrol, C., and Lagasquie-Schiex, M.-C. 2009. Bipolar abstract argumentation systems. In Argumentation in Artificial
Intelligence. Springer. 65–84.
Cayrol, C., and Lagasquie-Schiex, M.-C. 2010. Coalitions
of arguments: A tool for handling bipolar argumentation
frameworks. International Journal of Intelligent Systems
25(1):83–109.
Cayrol, C., and Lagasquie-Schiex, M.-C. 2013. Bipolarity in
argumentation graphs: Towards a better understanding. International Journal of Approximate Reasoning 54(7):876–
899.
Cayrol, C., and Lagasquie-Schiex, M.-C. 2015. An axiomatic approach to support in argumentation. In International Workshop on Theory and Applications of Formal Argumentation, 74–91. Springer.
Dung, P. M. 1995. On the acceptability of arguments
and its fundamental role in non-monotonic reasoning, logic
programming and n-person games. Artificial Intelligence
77:321–357.
Gabbay, D. M.; Giacomin, M.; Liao, B.; and van der Torre,
L. W. N. 2018. Present and future of formal argumentation
(dagstuhl perspectives workshop 15362). Dagstuhl Manifestos 7(1):69–95.
Gordon, T. F. 2018. Towards requirements analysis for
formal argumentation. Handbook of formal argumentation
1:145–156.
Nouioua, F., and Risch, V. 2010. Bipolar argumentation frameworks with specialized supports. In 2010 22nd
IEEE International Conference on Tools with Artificial Intelligence, volume 1, 215–218. IEEE.
Nouioua, F., and Risch, V. 2011. Argumentation frameworks
with necessities. In International Conference on Scalable
Uncertainty Management, 163–176. Springer.
Nouioua, F. 2013. Afs with necessities: further semantics
and labelling characterization. In International Conference
on Scalable Uncertainty Management, 120–133. Springer.
Oren, N.; Luck, M.; and Reed, C. 2010. Moving between
argumentation frameworks. In Proceedings of the 2010 International Conference on Computational Models of Argument. IOS Press.
P., B.; David, V.; S., D.; and D., L. 2017. Subsumption
and incompatibility between principles in ranking-based argumentation. In Proc. of the 29th IEEE International Conference on Tools with Artificial Intelligence ICTAI 2017.
Polberg, S., and Oren, N. 2014. Revisiting support in abstract argumentation systems. In COMMA, 369–376.
Polberg, S. 2017. Intertranslatability of abstract argumentation frameworks. Technical Report DBAI-TR-2017-104, Institute for . . . .
Discursive Input/Output Logic:
Deontic Modals, Norms, and Semantic Unification
Ali Farjami
University of Luxembourg
ali.farjami@uni.lu
Abstract
The so-called modal logic and norm-based paradigms
in deontic logic investigate the logical relations among
normative concepts such as obligation, permission, and
prohibition. The paper unifies these two paradigms by
introducing an algebraic framework, called discursive
input/output logic. In this framework, deontic modals
are evaluated with reference both to a set of possible
worlds and a set of norms. The distinctive feature of
the new framework is the non-adjunctive definition of
input/output operations. This brings us the advantage of
modeling discursive reasoning.

1 Introduction
The paper introduces a new logical framework for normative
reasoning. It is a unification of the two main paradigms for
deontic logic: the “modal logic” and “norm-based” paradigms.
Each paradigm has its advantages. An advantage of the
modal logic paradigm is the capability to be extended with other
modalities, such as epistemic or temporal operators; advantages of the norm-based paradigm include the ability
to explicitly represent normative codes, such as legal systems, and the use of non-monotonic-logic techniques for common-sense reasoning. Unifying these two paradigms
provides us with a framework that has all of these advantages
simultaneously. For example, we can design a normative
temporal system which changes over time: the temporal
reasoning comes from the modal logic part and the change operators (expansion, contraction) from
the norm-based part. There are other frameworks, such as
adaptive logic (Straßer 2013; Straßer, Beirlaen, and van de
Putte 2016), that combine modal logic and norm-based approaches. The novelty of our approach is semantical unification: the unification is based on bringing the core semantical
elements of both approaches into a single unit.
Before introducing the suggested framework, we briefly discuss the paradigms mentioned above in turn. The
classic semantics for deontic modality was developed as a
branch of modal logic in variants by Danielsson (1968),
Hansson (1969), Føllesdal and Hilpinen (1970), van Fraassen
(1973a; 1973b) and Lewis (2013; 1974), among others.
However, its most developed formulation is the Kratzerian
framework (Kratzer 2012). For Kratzer, the semantics of deontic modals has two contextual components: a set of accessible worlds and an ordering of those worlds. In the Kratzerian framework, each contextual component is given as a
set of propositions. Formally these are both functions, called
conversational backgrounds, from evaluation worlds to sets
of propositions. The modal base determines the set of accessible worlds and the ordering source induces the ordering on
worlds (Von Fintel 2012).
The other paradigm, namely norm-based semantics, was
originally offered by David Makinson (Makinson 1999). He
drew attention to a semantics for obligations and permissions in which deontic operators are evaluated not with reference to a set of possible worlds but with reference to a
set of norms. This set of norms cannot meaningfully be
termed true or false. The logic developed by Makinson and
van der Torre (2000) is known as Input/Output (I/O) logic.
I/O logic is a fruitful framework for the theoretical study
of deontic reasoning (Makinson and van der Torre 2001;
Parent and van der Torre 2017b) and has strong connections to nonmonotonic logic (Makinson and van der Torre
2001), the other main approach to normative reasoning
(Nute 2012). More examples of norm-based logics include
the theory of reasons (Horty 2012), which is based on Reiter’s
default logic, and the logic of prioritized conditional imperatives (Hansen 2008).
The question of this paper is: how can we integrate
the norm-based approach in the sense of Makinson (1999)
with the classic semantics in the sense of the Kratzerian framework? To achieve a more uniform semantics (Horty 2014;
Fuhrmann 2017; Parent and van der Torre 2017b) for deontic modals, we build I/O operations on top of Boolean
algebras for deriving permissions and obligations. The approach is close to the work done by Gabbay, Parent, and
van der Torre (2019): a geometrical view of I/O logic. For
defining the I/O framework over an algebraic setting, they
use the algebraic counterpart of the propositional-logic consequence relation (“Cn(A)”) within lattices, namely the upward-closed set of the infimum of A. They have characterized
only the simple-minded output operation. We show that by
choosing the “Up” operator,1 the upward-closed set, as the algebraic counterpart of the “Cn” operator, and by using the reversibility of inference rules in the I/O proof system, we can
characterize all the previously studied I/O systems and find
many more new logical systems. This suggested framework
has a significant difference from other types of input/output
logics. In contrast to the earlier input/output logics, we define non-adjunctive input/output operations. Non-adjunctive
logical systems are those in which deriving the conjunctive
formula ϕ ∧ ψ from the set {ϕ, ψ} fails (Ciuciura 2013;
Costa 2005). These systems are especially suited for modeling discursive reasoning. In fact, the first non-adjunctive
system in the literature was proposed by Jaśkowski (1969)
for discursive systems.
“[...] such a system which cannot be said to include
theses that express opinions in agreement with one another, be termed a discursive system. To bring out the
nature of the theses of such a system it would be proper
to precede each thesis by the reservation: in accordance
with the opinion of one of the participants of the discourse [...]. Hence the joining of a thesis to a discursive
system has a different intuitive meaning than has assertion in an ordinary system.” (Jaśkowski 1969)
We build two groups of I/O operations for deriving permissions and obligations over Boolean algebras. The main
difference between the two groups is similar to the possible-worlds characterization of box and diamond,
where box is closed under AND ((✷A ∧ ✷B) → ✷(A ∧ B))
and diamond is not.2 For each deontic modal of permission
and obligation, a primitive operation3 is defined in the strong
sense (Alchourrón and Bulygin 1971).4
The “Up” operator, for a given set A, sees all the elements
that are above or equal to the elements of A. This operator, unlike the “Cn” operator, is not closed under conjunction, so that we do not have a ∧ ¬a ∈ Up({a, ¬a}). Consequently, the new I/O operations defined with the “Up” operator instead of “Cn” operate on inputs independently and
do not derive joint outputs (they are not closed under AND). Using the reversibility of inference rules in the I/O proof
systems, we show how it is possible to add AND and other
rules, required for obligation (Makinson and van der Torre
2000), to the proof systems and find I/O operations for them.
The introduced I/O operations admit normative conflicts and
could receive technical benefits from the constrained version
of I/O logics (Makinson and van der Torre 2001) for resolving normative conflicts. The introduced framework is a form
of paraconsistent logic for admitting normative conflicts (see
Subsection 6.1, (Goble 2013)). Moreover, we use Stone’s
representation theorem for Boolean algebras for integrating
input/output logic with possible-worlds semantics.5
The I/O operations presented here are Tarskian, or closure, operators over a set of conditional norms, so that they
can be used as logical operators for reasoning about normative systems. The algebraization of the I/O framework shows
more similarity with the theory of joining-systems (Lindahl
and Odelstad 2013), an algebraic approach for studying
normative systems over Boolean algebras. We can say that
norms in the I/O framework play the same role as joinings
(Sun 2018) in the theory of Lindahl and Odelstad (2013).
The article is structured as follows: Section 2 is about integrating the norm-based approach into the Kratzerian framework for deontic modals. Sections 3 and 4 give the soundness and completeness results of I/O operations for deriving
permissions and obligations. Section 5 generalizes the I/O
operations over any abstract logic. Section 6 concludes the
paper.
Copyright © 2020, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
1 Thanks to Majid Alizadeh for this suggestion.
2 Norms and Deontic Modals
Before going to our discussion, consider the following basic logical notions:
• W is a finite set of possible worlds W = {w1 , ..., wn }.
• P (W ) is the set of propositions. A proposition x is true
in a world w if and only if w ∈ x.
• If A is a set of propositions,
  – ⋂A ≠ ∅ means that A is consistent.
  – ⋂A ⊆ x means that x follows from A.
  – ⋂A ∩ x ≠ ∅ means that x is compatible with A (A ∪ {x} is consistent).
• f is a function from W to P (P (W )), which is termed the
modal base. f assigns to every possible world w the set of
propositions f (w), called the premise set, that are
known by us in w. We use the same formal definition for
the ordering source function g.
• A normative system N ⊆ P (W ) × P (W ) denotes a set of
norms (a, x), in which body and head are propositions.
More explicitly, N O denotes a set of obligatory norms and
N P a set of permissive norms. If (a, x) ∈ N O , it means
that “given a, it is obligatory that x”, and if (a, x) ∈ N P ,
it means that “given a, it is permitted that x.”
• x ∈ out(N O , A) means that, given normative system N O
and input set A (the state of affairs), x (an obligation) is in
the output (a similar definition works for permission: x ∈
out(N P , A)). The output operations resemble inferences,
where inputs need not be included among outputs, and
outputs need not be reusable as inputs (Makinson and
van der Torre 2000).
2 In the main literature of input/output logic, developed by
Makinson and van der Torre (2000), Parent and van der Torre
(2014; 2014; 2017a; 2018a), and Stolpe (2008b; 2008a; 2015), at
least one form of the AND inference rule is present. Sun (2016) analyzed the norm-derivation rules of input/output logic in isolation.
Still, it is not clear how we can combine them and build new logical systems, specifically systems that do not admit the rule of AND.
For building a primitive operation for producing permissible propositions, we need to remove the AND rule from the proof system.
3 Von Wright (1951) defined permission as the primitive concept and obligation as its dual. Later, in the central literature of deontic logic, obligation was introduced as the primitive concept
and permission defined as the dual concept, as well as in the earlier input/output logic for permission (Makinson and van der Torre
2003). Moreover, in the I/O literature, permission based on derogation is studied by Stolpe (2010b; 2010a) and permission based on constraints
by Boella and van der Torre (2008).
4 For example, Alchourrón and Bulygin (1971) define strong
permission as: “To say that p is strongly permitted in the case q
by the system α means that a norm to the effect that p is permitted
in q is a consequence of α”.
5 Another possible-worlds semantics of I/O logic is studied by
Bochman (2005) for causal reasoning. It has no direct connection to
the operational semantics (see Subsection 2.4, (Parent and van der
Torre 2013)).
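With propositions modeled as sets of worlds, the detachment idea behind out(N, A) can be sketched in a few lines. This is only a minimal factual-detachment sketch in the style of simple-minded output (our own illustration; the paper's actual operations are defined over Boolean algebras in Sections 3 and 4, and the world set and norms below are invented):

```python
W = {1, 2, 3, 4}  # possible worlds; a proposition is a subset of W

def out(norms, A):
    """Detach the head x of each norm (a, x) whose body follows from the input set A."""
    context = set(W)
    for prop in A:          # compute the intersection of the input propositions
        context &= set(prop)
    return {x for (a, x) in norms if context <= set(a)}

p = frozenset({1, 2})       # e.g. "it is raining"
q = frozenset({1, 3})       # e.g. "the window is closed"
r = frozenset({3, 4})
norms_o = {(p, q), (r, frozenset({4}))}  # given p, obligatory q; given r, obligatory {4}

print(out(norms_o, [p]))    # the context entails p, so q is detached: {frozenset({1, 3})}
```

Note that, as stated above, the input p is not itself in the output, and the detached output q is not fed back in as a new input.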
In the classic semantics, modals are quantifiers over possible worlds. Deontic modals are quantifiers over the best
worlds
in the domain of accessible worlds, represented as
T
f (w) in the Kratzerian framework: ought or have-to are
necessity modals that the prejacent (i.e. the proposition under the modal operator) is true in all of the best worlds and
may or can are possibility modals that the prejacent is true
in some of the best worlds (Von Fintel and Heim 2011;
Von Fintel 2012). We can define deontic modals in the
Kratzerian framework as follows (Von Fintel 2012):
[[be-allowed-to]]w,f,g = λx (Bestg(w) (
\
In this case, deontic modals are evaluated with reference
to a set of propositions given by the modal base and a normative system in each possible world. In the same way, in
the world w, we can detach what we allowed to or have to as
theToutput of what we preferred (as the input set) represented
as g(w) and the corresponding normative systems N . The
modal bases are always factual. Whenever there are possible inconsistencies, we can take the content as an ordering
source (Kratzer, Pires de Oliveira, and Pessotto 2014). If the
set of g(w) is not consistent, we can draw conclusions by
looking at maximal consistent subsets.6
T
Inconsistent premise
sets: Suppose g(w) = ∅,
T
and
T Maxfamily (g(w)) =
{ A|A ⊆ g(w) and A is consistent and maximal}
[[be-allowed-to]]w,g =
T
λN P λx (x ∈ out(N P , Maxfamily (g(w))))
[[have-to]]w,g =
T
λN O λx (x ∈ out(N O , Maxfamily (g(w))))
f (w)) ∩ x 6= ∅)
\
[[have-to]]w,f,g = λx (Bestg(w) ( f (w)) ⊆ x)
T
where Bestg(w) ( f (w)) is given as follows:
\
\
′
′′
{w ∈
f (w) : ¬∃w ∈
f (w) such that
′′
Both introduced modals ([[be-allowed-to]] and
[[have-to]]) are in the strong sense (Alchourrón and
Bulygin 1971). For each one, we can define a weak
sense of modality using the dual operator, which
means x ∈ [[be-allowed-to]]W eak−sense if and only if
¬x ∉ [[have-to]]_{Strong-sense}; x ∈ [[have-to]]_{Weak-sense} if and only if ¬x ∉ [[be-allowed-to]]_{Strong-sense}. We present a family of output operations that derive different sets of permissions in Section 3. In Section 4, we define more complicated output operations for deriving obligations. As their distinctive feature, the output operations for obligations are closed under AND, which means: if x ∈ out(N^O, A) and y ∈ out(N^O, A), then x ∧ y ∈ out(N^O, A).

  ′ ∃y ∈ g(w) : w ∈ y and w ∉ y}

In the definition, the domain of quantification is selected by a modal base and an ordering source for deriving deontic modals. Moreover, there are two ways of quantification: compatibility and entailment. We employ the modal base and ordering source functions from the Kratzerian framework (Kratzer 2012), and the detachment approach (Parent and van der Torre 2013) from the I/O framework instead of quantification. As an advantage of the detachment approach, we can characterize derivation systems that do not admit, for example, weakening of the output (WO) or strengthening of the input (SI). In Section 3 and Section 4, we develop various detachment methods using different I/O operations, in turn, for permission and obligation.

In input/output logic, the main semantical construct for normative propositions is the output operation, which represents the set of normative propositions related to the normative system N, regarding the state of affairs A, namely out(N, A). Detachment is the basic idea of the semantics of input/output logic (Parent and van der Torre 2013). The interpretation of "x is obligatory if a" is that "x can be detached in context a". In a discourse, the context is represented by a modal base or an ordering source in the Kratzerian framework. To unify the norm-based approach with the classic semantics, in each world w, we can detach what we are allowed to or have to do as the output of what we know (as the input set), represented as ⋂f(w), the intersection of the propositions given by the modal base, and the corresponding normative system N:

Consistent premise sets: Suppose ⋂f(w) ≠ ∅.[6]

  [[be-allowed-to]]_{w,f} = λN^P λx (x ∈ out(N^P, {⋂f(w)}))
  [[have-to]]_{w,f} = λN^O λx (x ∈ out(N^O, {⋂f(w)}))

[6] In the original input/output logic we have β ∈ out(N, {γ, ¬γ}) for all (α, β) ∈ deriv(N). So when the input set is inconsistent we have explosion in the original input/output logic. Reasoning from an inconsistent premise set represented as a set of logical formulas is an important issue for deontic modals (see Chapter 1, Kratzer 2012).

3 Permissive Norms: Input/Output Operations

The term "input/output logic" is used broadly for a family of related systems such as simple-minded, basic and reusable (Parent and van der Torre 2018b; Makinson and van der Torre 2000). In this section, we use a similar terminology and introduce some input/output systems for deriving permissions over Boolean algebras. Each derivation system is closed under a set of rules. Moreover, we define systems that are closed only under weakening of the output (WO) or strengthening of the input (SI). We use a bottom-up approach for characterizing the different derivation systems. The rule of AND, for the output, is absent from the derivation systems presented in this section.
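Under the detachment-based truth conditions above, have-to amounts to a membership test in the output of the detached premises. A minimal executable sketch, assuming a set-of-worlds representation of propositions and a simple-minded style output test x ∈ Up(N(Up(A))); the norm set N_O and modal base f below are invented for illustration:

```python
def detached(N, A, x):
    """x ∈ Up(N(Up(A))): some norm (body, head) fires with a ≤ body for an a ∈ A, and head ≤ x."""
    return any(any(a <= body for a in A) and head <= x for body, head in N)

def have_to(w, f, N_O, x):
    """[[have-to]]_{w,f}(N_O)(x): detach x from the premise ⋂f(w)."""
    premise = frozenset.intersection(*f(w))
    return detached(N_O, {premise}, x)

# Invented toy model: propositions are sets of worlds, ≤ is inclusion
know  = frozenset({"w1", "w2"})     # assumed modal base proposition
ought = frozenset({"w1"})           # assumed head of a single norm
f = lambda w: {know}
N_O = {(know, ought)}
print(have_to("w0", f, N_O, ought))              # True: detached
print(have_to("w0", f, N_O, frozenset({"w2"})))  # False: no firing norm has a head below it
```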
Outline of proof for soundness: for the input set A ⊆ B, we show that if (A, x) ∈ derive^B_0(N), then x ∈ out^B_0(N, A). By definition, (A, x) ∈ derive^B_0(N) iff (a, x) ∈ derive^B_0(N) for some a ∈ A. By induction on the length of the derivation and the following theorem we have (a, x) ∈ derive^B_0(N) iff x ∈ out^B_0(N, {a}). Then by definition of out^B_0 we have x ∈ out^B_0(N, A). If A = {}, then by definition (A, x) ∉ derive^B_0(N). The outline works for the soundness of the other systems presented in the paper as well.
Definition 1 (Boolean algebra) A structure B = ⟨B, ∧, ∨, ¬, 0, 1⟩ is a Boolean algebra iff it satisfies the following identities:

• x ∨ y = y ∨ x, x ∧ y = y ∧ x
• x ∨ (y ∨ z) = (x ∨ y) ∨ z, x ∧ (y ∧ z) = (x ∧ y) ∧ z
• x ∨ 0 = x, x ∧ 1 = x
• x ∨ ¬x = 1, x ∧ ¬x = 0
• x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z), x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)

The elements of a Boolean algebra are ordered as a ≤ b iff a ∧ b = a.
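The powerset of any set is the standard concrete instance of Definition 1, with ∨ as union, ∧ as intersection and ¬ as relative complement; a quick machine check of the identities and the induced order (a minimal sketch):

```python
from itertools import combinations

S = {1, 2, 3}
TOP, BOT = frozenset(S), frozenset()

def complement(a):
    return TOP - a

# The 8 elements of the powerset algebra over S
B = [frozenset(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]

# Check the Definition 1 identities with ∨ = union, ∧ = intersection, ¬ = complement
for x in B:
    assert x | BOT == x and x & TOP == x             # identity elements
    assert x | complement(x) == TOP                  # x ∨ ¬x = 1
    assert x & complement(x) == BOT                  # x ∧ ¬x = 0
    for y in B:
        assert x | y == y | x and x & y == y & x     # commutativity
        for z in B:
            assert x | (y | z) == (x | y) | z        # associativity
            assert x & (y | z) == (x & y) | (x & z)  # distributivity

# The induced order a ≤ b iff a ∧ b = a coincides with inclusion here
leq = lambda a, b: a & b == a
assert leq(frozenset({1}), frozenset({1, 2})) and not leq(TOP, BOT)
```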
Definition 2 (Upward-closed set) Given a Boolean algebra B, a set A ⊆ B satisfying the following property is called upward-closed:

  For all x, y ∈ B, if x ≤ y and x ∈ A, then y ∈ A.

We denote the least upward-closed set which includes A by Up(A).[7] The Up operator satisfies the following properties:

• A ⊆ Up(A) (Inclusion)
• A ⊆ B ⇒ Up(A) ⊆ Up(B) (Monotony)
• Up(A) = Up(Up(A)) (Idempotence)

An operator that satisfies these properties is called a closure operator.

Let N(A) = {x | (a, x) ∈ N for some a ∈ A} and Eq(X) = {x | ∃y ∈ X, x = y}.

Zero Boolean I/O operation

Definition 3 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the zero Boolean operation as follows:

  out^B_0(N, A) = Eq(N(Eq(A)))

We put out^B_0(N) = {(A, x) : x ∈ out^B_0(N, A)}.

Definition 4 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive^B_0(N) if and only if (a, x) is derivable from N using the rules {EQI, EQO}:[8]

  EQI: from (a, x) and a = b, infer (b, x)
  EQO: from (a, x) and x = y, infer (a, y)

Given a set A ⊆ B, (A, x) ∈ derive^B_0(N) whenever (a, x) ∈ derive^B_0(N) for some a ∈ A.[9] Put derive^B_0(N, A) = {x : (A, x) ∈ derive^B_0(N)}.

Theorem 1 (Soundness) out^B_0(N) validates EQI and EQO.

Proof 1 − EQI: We need to show that if x ∈ Eq(N(Eq(a))) and a = b, then x ∈ Eq(N(Eq(b))). If x ∈ Eq(N(Eq(a))), then there are t1 and t2 such that t1 = a and t2 = x and (t1, t2) ∈ N. If a = b then t1 = b. Hence, by definition, x ∈ Eq(N(Eq(b))).

− EQO: We need to show that if x ∈ Eq(N(Eq(a))) and x = y, then y ∈ Eq(N(Eq(a))). If x ∈ Eq(N(Eq(a))), then there are t1 and t2 such that t1 = a and t2 = x and (t1, t2) ∈ N. If x = y then t2 = y. Hence, by definition, y ∈ Eq(N(Eq(a))).

Theorem 2 (Completeness) out^B_0(N) ⊆ derive^B_0(N).[10]

Proof 2 We show that if x ∈ Eq(N(Eq(A))), then (A, x) ∈ derive^B_0(N). Suppose x ∈ Eq(N(Eq(A))); then there are t1 and t2 such that t1 = a for some a ∈ A, t2 = x, and (t1, t2) ∈ N. We derive:

  from (t1, t2) and t2 = x, infer (t1, x) by EQO; from (t1, x) and t1 = a, infer (a, x) by EQI.

Thus, x ∈ derive^B_0(N, a) and then x ∈ derive^B_0(N, A).

Two basic subsystems: We can construct two simple subsystems: out^B_R(N, A) = Eq(N(A)) and out^B_L(N, A) = N(Eq(A)). We define (a, x) ∈ derive^B_R(N) ((a, x) ∈ derive^B_L(N)) if and only if (a, x) is derivable from N using the rule {EQO} ({EQI}). By rewriting the definition of out^B_0(N) for out^B_R(N) and out^B_L(N), and the definition of derive^B_0(N) for derive^B_R(N) and derive^B_L(N), we have:

  out^B_R(N) = derive^B_R(N)
  out^B_L(N) = derive^B_L(N)

[7] Sometimes we write Up(a, b, ...) (Eq(a, b, ...)) instead of Up({a, b, ...}) (Eq({a, b, ...})), as well as out(N, a) (derive(N, a)) instead of out(N, {a}) (derive(N, {a})).
[8] EQI stands for equivalence of the input and EQO stands for equivalence of the output.
[9] In the original input/output logic (Makinson and van der Torre 2000), it is for some conjunction a of elements in A.
[10] For the completeness proofs, if A = {}, then by definition of Eq({}) = {} and Up({}) = {} we have x ∉ out^B_i(N, {}) = {}.
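On a concrete algebra where equality is literal identity, Eq(X) is just X itself, so the zero operation reduces to collecting the heads of norms whose bodies occur in the input. A minimal sketch (the norm set is invented):

```python
def N_image(N, A):
    """N(A) = {x : (a, x) ∈ N for some a ∈ A}."""
    return {head for body, head in N if body in A}

def eq(X):
    """Eq(X) = {x : ∃y ∈ X, x = y}; with literal identity, Eq(X) is X itself."""
    return set(X)

def out0(N, A):
    """Zero Boolean I/O operation: out0(N, A) = Eq(N(Eq(A)))."""
    return eq(N_image(N, eq(A)))

# Invented norms over frozensets-as-propositions
p, q, r = frozenset({1}), frozenset({2}), frozenset({3})
N = {(p, q), (q, r)}
print(out0(N, {p}))     # {frozenset({2})}: only the norm whose body is an input fires
print(out0(N, set()))   # set(): the empty input detaches nothing
```

Note that the norm (q, r) does not fire on input {p}: the zero operation is not reusable, so detached heads are not fed back as inputs.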
Simple-I Boolean I/O operation

Definition 5 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the simple-I Boolean operation as follows:

  out^B_I(N, A) = Eq(N(Up(A)))

We put out^B_I(N) = {(A, x) : x ∈ out^B_I(N, A)}.

Definition 6 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive^B_I(N) if and only if (a, x) is derivable from N using the rules {SI, EQO}:

  SI: from (a, x) and b ≤ a, infer (b, x)
  EQO: from (a, x) and x = y, infer (a, y)

Given a set A ⊆ B, (A, x) ∈ derive^B_I(N) whenever (a, x) ∈ derive^B_I(N) for some a ∈ A.[11] Put derive^B_I(N, A) = {x : (A, x) ∈ derive^B_I(N)}.

Theorem 3 (Soundness) out^B_I(N) validates SI and EQO.

Proof 3 The proof is similar to Theorems 1 and 7.

Theorem 4 (Completeness) out^B_I(N) ⊆ derive^B_I(N).

Proof 4 The proof is similar to Theorems 2 and 8.

Example 1: For the conditionals N = {(⊤, g), (g, t)} and the input set A = {} we have out^B_I(N, A) = {}, and for the input set C = {g} we have out^B_I(N, C) = Eq(g, t).

Simple-II Boolean I/O operation

Definition 7 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the simple-II Boolean operation as follows:

  out^B_II(N, A) = Up(N(Eq(A)))

We put out^B_II(N) = {(A, x) : x ∈ out^B_II(N, A)}.

Definition 8 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive^B_II(N) if and only if (a, x) is derivable from N using the rules {WO, EQI}:

  WO: from (a, x) and x ≤ y, infer (a, y)
  EQI: from (a, x) and a = b, infer (b, x)

Given a set A ⊆ B, (A, x) ∈ derive^B_II(N) whenever (a, x) ∈ derive^B_II(N) for some a ∈ A. Put derive^B_II(N, A) = {x : (A, x) ∈ derive^B_II(N)}.

Theorem 5 (Soundness) out^B_II(N) validates WO and EQI.

Proof 5 The proof is similar to Theorems 1 and 7.

Theorem 6 (Completeness) out^B_II(N) ⊆ derive^B_II(N).

Proof 6 The proof is similar to Theorems 2 and 8.

Example 2: For the conditionals N = {(⊤, g), (g, t)} and the input set A = {} we have out^B_II(N, A) = {}, and for the input set C = {g} we have out^B_II(N, C) = Up(t).

Simple-minded Boolean I/O operation

Definition 9 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the simple-minded Boolean operation as follows:

  out^B_1(N, A) = Up(N(Up(A)))

We put out^B_1(N) = {(A, x) : x ∈ out^B_1(N, A)}.

Definition 10 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive^B_1(N) if and only if (a, x) is derivable from N using the rules {SI, WO}.

Given a set A ⊆ B, (A, x) ∈ derive^B_1(N) whenever (a, x) ∈ derive^B_1(N) for some a ∈ A. Put derive^B_1(N, A) = {x : (A, x) ∈ derive^B_1(N)}.

Theorem 7 (Soundness) out^B_1(N) validates SI and WO.

Proof 7 − SI: We need to show that if x ∈ Up(N(Up(a))) and b ≤ a, then x ∈ Up(N(Up(b))). Since b ≤ a we have Up(a) ⊆ Up(b). Hence, N(Up(a)) ⊆ N(Up(b)) and therefore Up(N(Up(a))) ⊆ Up(N(Up(b))).

− WO: We need to show that if x ∈ Up(N(Up(a))) and x ≤ y, then y ∈ Up(N(Up(a))). Since Up(N(Up(a))) is upward-closed and x ≤ y, we have y ∈ Up(N(Up(a))).

Counter-example for AND: We can show that AND is not valid:

  AND: from (a, x) and (a, y), infer (a, x ∧ y)

Consider the normative system N = {(a, x), (a, y)}: we have x ∈ Up(N(Up({a}))) and y ∈ Up(N(Up({a}))), but x ∧ y ∉ Up(N(Up({a}))) by definition of Up(X).

Theorem 8 (Completeness) out^B_1(N) ⊆ derive^B_1(N).

Proof 8 We show that if x ∈ Up(N(Up(A))), then (A, x) ∈ derive^B_1(N). Suppose x ∈ Up(N(Up(A))); then there is y1 such that y1 ∈ N(Up(A)) and y1 ≤ x, and there is t1 such that (t1, y1) ∈ N and a ≤ t1 for some a ∈ A. We derive:

  from (t1, y1) and a ≤ t1, infer (a, y1) by SI; from (a, y1) and y1 ≤ x, infer (a, x) by WO.

Thus, x ∈ derive^B_1(N, a) and then x ∈ derive^B_1(N, A).

Example 3: For the conditionals N = {(g, t), (¬g, ¬t), (a, b)} and the input set A = {g, ¬g} we have out^B_1(N, A) = Up(t, ¬t).

[11] In the original input/output logic (Makinson and van der Torre 2000), it is for some conjunction a of elements in A.
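The simple-minded operation and the counter-example for AND can be checked concretely over a small powerset algebra (a minimal sketch; the norms are invented, and ≤ is set inclusion):

```python
from itertools import combinations

S = {1, 2, 3}

def subsets(s):
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(sorted(s), r)]

B = subsets(S)  # the powerset algebra; a ≤ b iff a ⊆ b

def up(A):
    """Up(A): least upward-closed subset of B that includes A."""
    return {b for b in B if any(a <= b for a in A)}

def out1(N, A):
    """Simple-minded operation: out1(N, A) = Up(N(Up(A)))."""
    heads = {head for body, head in N if body in up(A)}
    return up(heads)

# Counter-example for AND: two incomparable heads detached from the same body
a, x, y = frozenset({1}), frozenset({2}), frozenset({3})
N = {(a, x), (a, y)}
O = out1(N, {a})
print(x in O, y in O)   # True True
print((x & y) in O)     # False: x ∧ y lies below both heads, not above either
```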
Basic Boolean I/O operation

Definition 11 (Saturated set) A set V is saturated in a Boolean algebra B iff
• If a ∈ V and b ≥ a, then b ∈ V;
• If a ∨ b ∈ V, then a ∈ V or b ∈ V.

Definition 12 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the basic Boolean operation as follows:

  out^B_2(N, A) = ⋂{Up(N(V)) : A ⊆ V, V is saturated}

We put out^B_2(N) = {(A, x) : x ∈ out^B_2(N, A)}.

Definition 13 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive^B_2(N) if and only if (a, x) is derivable from N using the rules of derive^B_1(N) along with OR:

  OR: from (a, x) and (b, x), infer (a ∨ b, x)

Given a set A ⊆ B, (A, x) ∈ derive^B_2(N) if (a, x) ∈ derive^B_2(N) for some a ∈ A. Put derive^B_2(N, A) = {x : (A, x) ∈ derive^B_2(N)}.

Theorem 9 (Soundness) out^B_2(N) validates SI, WO and OR.

Proof 9 − OR: We need to show that if x ∈ out^B_2(N, {a}) and x ∈ out^B_2(N, {b}), then x ∈ out^B_2(N, {a ∨ b}). Suppose {a ∨ b} ⊆ V; since V is saturated we have a ∈ V or b ∈ V. Suppose a ∈ V; in this case, since out^B_2(N, {a}) ⊆ Up(N(V)), we have x ∈ out^B_2(N, {a ∨ b}).

Theorem 10 (Completeness) out^B_2(N) ⊆ derive^B_2(N).

Proof 10 Suppose x ∉ derive^B_2(N, A); then by monotony of the derivability operation there is a maximal set V such that A ⊆ V and x ∉ derive^B_2(N, V). V is saturated because:

(a) Suppose a ∈ V and a ≤ b. We have (a, x) ∉ derive^B_2(N). We need to show that x ∉ derive^B_2(N, b), and since V is maximal we then have b ∈ V. Suppose (b, x) ∈ derive^B_2(N). We have

  from (b, x) and a ≤ b, infer (a, x) by SI.

That is a contradiction with (a, x) ∉ derive^B_2(N).

(b) Suppose a ∨ b ∈ V. We have x ∉ derive^B_2(N, a ∨ b). We need to show that x ∉ derive^B_2(N, a) or x ∉ derive^B_2(N, b). Suppose x ∈ derive^B_2(N, a) and x ∈ derive^B_2(N, b); then we have

  from (a, x) and (b, x), infer (a ∨ b, x) by OR.

That is a contradiction with x ∉ derive^B_2(N, a ∨ b).

Therefore, we have x ∉ Up(N(V)) (= out^B_1(N, V)) and so x ∉ out^B_2(N, A).

Reusable Boolean I/O operation

Definition 14 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the reusable Boolean operation as follows:

  out^B_3(N, A) = ⋂{Up(N(V)) : A ⊆ V = Up(V) ⊇ N(V)}

We put out^B_3(N) = {(A, x) : x ∈ out^B_3(N, A)}.

Definition 15 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive^B_3(N) if and only if (a, x) is derivable from N using the rules of derive^B_1(N) along with T:[12]

  T: from (a, x) and (x, y), infer (a, y)

Given a set A ⊆ B, (A, x) ∈ derive^B_3(N) if (a, x) ∈ derive^B_3(N) for some a ∈ A. Put derive^B_3(N, A) = {x : (A, x) ∈ derive^B_3(N)}.

Theorem 11 (Soundness) out^B_3(N) validates SI, WO and T.

Proof 11 − T: We need to show that if x ∈ out^B_3(N, {a}) and y ∈ out^B_3(N, {x}), then y ∈ out^B_3(N, {a}). Suppose that X is the smallest set such that {a} ⊆ X = Up(X) ⊇ N(X). Since x ∈ out^B_3(N, {a}) we have x ∈ X, and from y ∈ out^B_3(N, {x}) we have y ∈ X. Thus, y ∈ out^B_3(N, {a}).

Theorem 12 (Completeness) out^B_3(N) ⊆ derive^B_3(N).

Proof 12 Suppose x ∉ derive^B_3(N, a); we need to find B such that a ∈ B = Up(B) ⊇ N(B) and x ∉ Up(N(B)). Put B = Up({a} ∪ derive^B_3(N, a)). We show that N(B) ⊆ B. Suppose y ∈ N(B); then there is b ∈ B such that (b, y) ∈ N. We show that y ∈ B. Since b ∈ B there are two cases:

• b ≥ a: in this case we have (a, y) ∈ derive^B_3(N), since (b, y) ∈ derive^B_3(N) and

  from (b, y) and a ≤ b, infer (a, y) by SI.

• ∃z ∈ derive^B_3(N, a), b ≥ z: in this case we have

  from (b, y) and z ≤ b, infer (z, y) by SI; from (a, z) and (z, y), infer (a, y) by T.

We only need to show that x ∉ Up(N(B)) = out^B_1(N, {a} ∪ derive^B_3(N, a)). Suppose x ∈ Up(N(B)); then there is y1 such that x ≥ y1, and ∃t1, (t1, y1) ∈ N and t1 ∈ Up({a} ∪ derive^B_3(N, a)). There are two cases:

• t1 ≥ a: in this case we have

  from (t1, y1) and a ≤ t1, infer (a, y1) by SI; from (a, y1) and y1 ≤ x, infer (a, x) by WO.

• ∃z1 ∈ derive^B_3(N, a), z1 ≤ t1: in this case we have

  from (t1, y1) and z1 ≤ t1, infer (z1, y1) by SI; from (a, z1) and (z1, y1), infer (a, y1) by T; from (a, y1) and y1 ≤ x, infer (a, x) by WO.

Thus, in both cases (a, x) ∈ derive^B_3(N) and then x ∈ derive^B_3(N, a), which is a contradiction.

Example 4: For the conditionals N = {(⊤, g), (g, t), (¬g, ¬t), (a, b)} and the input set A = {¬g} we have out^B_3(N, A) = Up(g, t, ¬t).

[12] T stands for transitivity.

4 Obligatory Norms: Input/Output Operations

In this section, we add the rule of AND and cumulative transitivity (CT) to the derivation systems introduced above. We aim to rebuild the derivation systems introduced by Makinson and van der Torre (2000) for deriving obligations.

Definition 16 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive^X_i(N) if and only if (a, x) is derivable from N using EQI, EQO, SI, WO, OR, AND, CT as follows:

  derive^X_i            Rules
  derive^AND_II         {WO, EQI, AND}
  derive^AND_1          {SI, WO, AND}
  derive^AND_2          {SI, WO, OR, AND}
  derive^CT_I           {SI, EQO, CT}
  derive^CT_II          {WO, EQI, CT}
  derive^CT_1           {SI, WO, CT}
  derive^{CT,AND}_1     {SI, WO, CT, AND}

The rules AND and CT are as follows:

  AND: from (a, x) and (a, y), infer (a, x ∧ y)
  CT: from (a, x) and (a ∧ x, y), infer (a, y)

Given a set A ⊆ B, (A, x) ∈ derive^X_i(N) whenever (a, x) ∈ derive^X_i(N) for some a ∈ A. Put derive^X_i(N, A) = {x : (A, x) ∈ derive^X_i(N)}.

Definition 17 (Semantics out^AND_i) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the AND operation as follows:

  out^{AND,0}_i(N, A) = out^B_i(N, A)
  out^{AND,n+1}_i(N, A) = out^{AND,n}_i(N, A) ∪ {y ∧ z : y, z ∈ out^{AND,n}_i(N, {a}), a ∈ A}
  out^AND_i(N, A) = ⋃_{n ∈ ℕ} out^{AND,n}_i(N, A)

We put out^AND_i(N) = {(A, x) : x ∈ out^AND_i(N, A)}.

Definition 18 (Semantics out^CT_i) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the CT operation as follows:

  out^{CT,0}_i(N, A) = out^B_i(N, A)
  out^{CT,n+1}_i(N, A) = out^{CT,n}_i(N, A) ∪ {x : y ∈ out^{CT,n}_i(N, {a}) and x ∈ out^{CT,n}_i(N, {a ∧ y}), a ∈ A}
  out^CT_i(N, A) = ⋃_{n ∈ ℕ} out^{CT,n}_i(N, A)

We put out^CT_i(N) = {(A, x) : x ∈ out^CT_i(N, A)}.

Definition 19 (Semantics out^{CT,AND}_i) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the CT,AND operation as follows:

  out^{CT,AND,0}_i(N, A) = out^CT_i(N, A)
  out^{CT,AND,n+1}_i(N, A) = out^{CT,AND,n}_i(N, A) ∪ {y ∧ z : y, z ∈ out^{CT,AND,n}_i(N, {a}), a ∈ A}
  out^{CT,AND}_i(N, A) = ⋃_{n ∈ ℕ} out^{CT,AND,n}_i(N, A)

We put out^{CT,AND}_i(N) = {(A, x) : x ∈ out^{CT,AND}_i(N, A)}.

Theorem 13 Given a Boolean algebra B, for every normative system N ⊆ B × B we have out^AND_i(N) = derive^AND_i(N), i ∈ {II, 1, 2}; out^CT_i(N) = derive^CT_i(N), i ∈ {I, II, 1}; and out^{CT,AND}_1(N) = derive^{CT,AND}_1(N).

Proof 13 The proof is based on the reversibility of inference rules, studied by Makinson and van der Torre (2000).

Lemma 1 Let D be any derivation using at most EQI, SI, WO, OR, AND, CT; then there is a derivation D′ of the same root from a subset of leaves that applies AND only at the end.

Proof 14 See Observation 18 in (Makinson and van der Torre 2000).

The main point of the observation is that we can reverse the order of rules AND, WO to WO, AND; AND, SI to SI, AND; AND, OR to OR, AND; and finally AND, CT to SI, CT or CT, AND. Also, we can reverse the order of rules AND and EQI as follows:

1. from (a, x) and (a, y), infer (a, x ∧ y) by AND; then from (a, x ∧ y) and a = b, infer (b, x ∧ y) by EQI;

2. from (a, x) and a = b, infer (b, x) by EQI; from (a, y) and a = b, infer (b, y) by EQI; then from (b, x) and (b, y), infer (b, x ∧ y) by AND.

Hence, in each of the systems {WO, EQI, AND}, {SI, WO, AND} and {SI, WO, OR, AND} we can apply the AND rule just at the end. Thus, we can characterize derive^AND_i(N) using the fact derive^B_i(N) = out^B_i(N) and the iteration of AND.
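On a finite algebra the iteration of AND stabilizes, so the AND-closed output can be computed as a fixpoint. A minimal sketch in the spirit of Definition 17, assuming a single-element input set and a toy head-collecting base operation out_heads standing in for out^B_i:

```python
def out_and(out_b, N, A):
    """Iterate the AND closure: close the base output under binary meets (∧ = intersection here)."""
    stage = set(out_b(N, A))
    while True:
        new = {y & z for y in stage for z in stage} - stage
        if not new:            # fixpoint reached: stage is closed under AND
            return stage
        stage |= new

# Toy base operation standing in for out^B_i: collect heads of firing norms
def out_heads(N, A):
    return {head for body, head in N if any(a <= body for a in A)}

a, x, y = frozenset({1}), frozenset({2, 3}), frozenset({3, 4})
N = {(a, x), (a, y)}
closed = out_and(out_heads, N, {a})
print(frozenset({3}) in closed)  # True: x ∧ y is added by the iteration
```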
It is easy to check that we can reverse CT with SI, EQO, WO, and EQI; by this fact we can similarly characterize derive^CT_i(N). Finally, since AND can be reversed with SI, WO and CT, we can characterize derive^{CT,AND}_1(N) by applying the iteration of AND over out^CT_1(N), which means out^{CT,AND}_1(N) = derive^{CT,AND}_1(N).

Similarly, we can define the out^OR_i(N) operation and characterize some other proof systems:

  derive^X_i              Rules
  derive^OR_I             {SI, EQO, OR}
  derive^{CT,OR}_I        {SI, EQO, CT, OR}
  derive^{CT,OR}_1        {SI, WO, CT, OR}
  derive^{CT,OR,AND}_1    {SI, WO, CT, OR, AND}

The four systems derive^AND_1, derive^AND_2 (or derive^{OR,AND}_1), derive^{CT,AND}_1 and derive^{CT,OR,AND}_1 were introduced by Makinson and van der Torre (2000) for reasoning about obligatory norms.

5 I/O Mechanism over Abstract Logic

An abstract logic (Font and Jansana 2017) is a pair A = ⟨L, C⟩ where L = ⟨L, ...⟩ is an algebra and C is a closure operator defined on the power set of its universe, which means that for all X, Y ⊆ L:

• X ⊆ C(X)
• X ⊆ Y ⇒ C(X) ⊆ C(Y)
• C(X) = C(C(X))

The elements of an abstract logic can be ordered as a ≤ b if and only if b ∈ C({a}). Similarly, we can define the operator Up for X ⊆ L.

Definition 20 (Semantics) Given an abstract logic A, a normative system N ⊆ L × L and an input set A ⊆ L, we define the I/O operations as follows:

• out^A_0(N, A) = Eq(N(Eq(A)))
• out^A_I(N, A) = Eq(N(Up(A)))
• out^A_II(N, A) = Up(N(Eq(A)))
• out^A_1(N, A) = Up(N(Up(A)))
• out^A_2(N, A) = ⋂{Up(N(V)) : A ⊆ V, V is saturated}[13]
• out^A_3(N, A) = ⋂{Up(N(V)) : A ⊆ V = Up(V) ⊇ N(V)}

We put out^A_i(N) = {(A, x) : x ∈ out^A_i(N, A)}.

Definition 21 (Proof system) Given an abstract logic A and a normative system N ⊆ L × L, we define (a, x) ∈ derive^A_0(N) (derive^A_I(N), derive^A_II(N), derive^A_1(N), derive^A_2(N), derive^A_3(N)) if and only if (a, x) is derivable from N using the rules {EQI, EQO} ({SI, EQO}, {WO, EQI}, {SI, WO}, {SI, WO, OR}, {SI, WO, T}). Put derive^A_i(N, A) = {x : (A, x) ∈ derive^A_i(N)}.

Theorem 14 (Soundness and Completeness) out^A_i(N) = derive^A_i(N).

Proof 15 The proofs are the same as the soundness and completeness theorems in Section 3.

A logical system L = ⟨L, ⊢_L⟩ straightforwardly provides an equivalent abstract logic ⟨Fm_L, C_{⊢L}⟩. Therefore, we can build the I/O framework over different types of logic, including first-order logic, simple type theory, description logic, and different kinds of modal logics that are expressive for intensional concepts such as belief and time.

Example 5: In the modal logic system KT, for the conditionals N = {(p, ✷q), (q, r), (s, t)} and the input set A = {p} we have out^KT_3(N, A) = Up(✷q, r).

Moreover, we can add other rules such as AND and CT to the systems, as in the last section.

The miners paradox

To illustrate the advantages and differences of the newly proposed semantics with respect to the classic semantics, we focus on the miners paradox, discussed by Kolodny and MacFarlane (2010):

  Ten miners are trapped either in shaft A or in shaft B, but we do not know which. Flood waters threaten to flood the shafts. We have enough sandbags to block one shaft, but not both. If we block one shaft, all the water will go into the other shaft, killing any miners inside it. If we block neither shaft, both shafts will fill halfway with water, and just one miner, the lowest in the shaft, will be killed.

So, in our deliberation, it seems that the following are true:

1. Either the miners are in shaft A or in shaft B.
2. If the miners are in shaft A, we should block shaft A.
3. If the miners are in shaft B, we should block shaft B.
4. We should block neither shaft.

In principle, it would be best to save all the miners by blocking the right shaft (sentences 2 and 3). Sentence 4 is correct since there is a fifty-fifty chance of blocking the right shaft, given that we do not know which shaft the miners are in; it guarantees that we save nine of the ten miners according to the scenario (Von Fintel 2012). The paradox is that the four sentences jointly are inconsistent in classical logic. Moreover, they are inconsistent in the context of the Kratzerian baseline view (Cariani, to appear).

Here, we analyze this paradox in our setting. Suppose that the set of norms N = {(shA, blA), (shB, blB), (⊤, ¬blA ∧ ¬blB)} represents sentences 2–4. We choose one of the output operations for deriving obligation which satisfies the rule of SI. There are two ways of representing sentence 1. For the case f(w) = {shA ∨ shB}, as a set of factual information, we have ¬blA ∧ ¬blB ∈ out(N^O, {shA ∨ shB}). There is another way of representing sentence 1, by means of the ordering source. In the case of g(w) = {shA, shB}, as a set of possibly inconsistent information, we have blA, blB, ¬blA ∧ ¬blB ∈ out(N^O, {shA, shB}).

[13] For this case, the abstract logic A = ⟨L, C⟩ should include ∨, that is, a binary operation symbol, either primitive or defined by a term, and we have a ∨ b, b ∨ a ∈ C({a}).
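The two readings of sentence 1 can be replayed computationally: propositions are modeled as sets of valuations over the four atoms, and x ∈ out(N^O, A) is tested as x ∈ Up(N(Up(A))), the simple-minded operation (a minimal sketch of the analysis, not the paper's full machinery):

```python
from itertools import product

ATOMS = ("shA", "shB", "blA", "blB")
WORLDS = [dict(zip(ATOMS, bits)) for bits in product([False, True], repeat=4)]

def prop(fn):
    """The proposition {w : fn(w)}, as a frozenset of world indices."""
    return frozenset(i for i, w in enumerate(WORLDS) if fn(w))

shA, shB = prop(lambda w: w["shA"]), prop(lambda w: w["shB"])
blA, blB = prop(lambda w: w["blA"]), prop(lambda w: w["blB"])
top = prop(lambda w: True)
neither = prop(lambda w: not w["blA"] and not w["blB"])  # ¬blA ∧ ¬blB

N_O = {(shA, blA), (shB, blB), (top, neither)}  # sentences 2-4 as norms

def detached(N, A, x):
    """x ∈ Up(N(Up(A))): a norm fires (a ≤ body for some a ∈ A) with head ≤ x."""
    return any(any(a <= body for a in A) and head <= x for body, head in N)

# Reading 1: modal base f(w) = {shA ∨ shB}
print(detached(N_O, {shA | shB}, neither))  # True: "block neither" is detached
print(detached(N_O, {shA | shB}, blA))      # False: neither blocking duty fires

# Reading 2: ordering source g(w) = {shA, shB}, possibly inconsistent inputs
print(detached(N_O, {shA, shB}, blA),
      detached(N_O, {shA, shB}, blB),
      detached(N_O, {shA, shB}, neither))   # True True True
```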
6 Conclusion

In summary, we have characterized a class of proof systems over Boolean algebras for a set of explicitly given norms as follows:

  derive^B_i    Rules
  derive^B_R    {EQO}
  derive^B_L    {EQI}
  derive^B_0    {EQI, EQO}
  derive^B_I    {SI, EQO}
  derive^B_II   {WO, EQI}
  derive^B_1    {SI, WO}
  derive^B_2    {SI, WO, OR}
  derive^B_3    {SI, WO, T}

Each proof system is sound and complete for an I/O operation. For each of the introduced I/O operations, we can define a new version that allows inputs to reappear as outputs. Let N+ = N ∪ I, where I is the set of all pairs (a, a) for a ∈ B. We define out^{B+}_i(N, A) = out^B_i(N+, A); the characterization is the same as for out^B_i. It is interesting to compare the introduced systems with Minimal Deontic Logic (Goble 2013) and similar approaches, such as (Ciabattoni, Gulisano, and Lellmann 2018), that do not have deontic aggregation principles. Moreover, we have shown how we can add the two rules AND and CT to the proof systems and find representation theorems for them.

Input/output logic is inspired by a view of logic as a "secretarial assistant" that prepares inputs before they go into the motor engine and unpacks outputs, rather than of logic as an "inference motor". The only input/output logics investigated so far in the literature are built on top of classical propositional logic and intuitionistic propositional logic. The algebraic construction shows how we can build the input/output version of any abstract logic. Furthermore, we can build the I/O framework over posets ⟨A, ≤⟩, where A is a set and ≤ is a reflexive, antisymmetric and transitive binary relation. The monotony property of closure has no role in the proofs, and we can build I/O operations over non-monotonic relations. We pose combining I/O operations with consequence relations that do not satisfy the inclusion or idempotence property (Makinson 2005) as a further research question. For example, combining the original I/O framework with a consequence relation that does not satisfy inclusion, where the consequence relation is an input/output closure (A ⊆ out(N, A) is not necessary), was explored by Sun and van der Torre (2014). Moreover, we could present an embedding of the new I/O operations in HOL (Benzmüller, Farjami, and Parent 2019; Benzmüller et al. 2019; Benzmüller, Farjami, and Parent 2018). Finally, we have introduced a unification of possible-worlds and norm-based semantics for reasoning about permission and obligation; it is worth exploring the philosophical and conceptual advantages of integrating norm-based semantics into the classic semantics (Von Fintel 2012; Horty 2014). As an advantage, we discussed the miners paradox (Kolodny and MacFarlane 2010).

Acknowledgments

I thank two anonymous reviewers for valuable comments. I would like to thank Majid Alizadeh, who during my visit to the University of Tehran in the summer of 2019 dedicated much time and attention to my work, and who gave me essential advice for this work. I thank Leon van der Torre, Dov Gabbay, Xavier Parent, and Seyed Mohammad Yarandi for comments that greatly improved the manuscript.

References

Alchourrón, C., and Bulygin, E. 1971. Normative systems. Springer-Verlag; Wien New York.
Benzmüller, C.; Farjami, A.; Meder, P.; and Parent, X. 2019. I/O logic in HOL. Journal of Applied Logics – IfCoLoG Journal of Logics and their Applications (Special Issue on Reasoning for Legal AI) 6(5):715–732.
Benzmüller, C.; Farjami, A.; and Parent, X. 2018. A dyadic deontic logic in HOL. In Broersen, J.; Condoravdi, C.; Nair, S.; and Pigozzi, G., eds., Deontic Logic and Normative Systems — 14th International Conference, DEON 2018, Utrecht, The Netherlands, 3-6 July, 2018, volume 9706, 33–50. College Publications.
Benzmüller, C.; Farjami, A.; and Parent, X. 2019. Åqvist's dyadic deontic logic E in HOL. Journal of Applied Logics – IfCoLoG Journal of Logics and their Applications (Special Issue on Reasoning for Legal AI) 6(5):733–755.
Bochman, A. 2005. Explanatory nonmonotonic reasoning. World Scientific.
Boella, G., and van der Torre, L. 2008. Institutions with a hierarchy of authorities in distributed dynamic environments. Artificial Intelligence and Law 16(1):53–71.
Cariani, F. To appear. Deontic logic and natural language. In Gabbay, D.; Horty, J.; Parent, X.; van der Meyden, R.; and van der Torre, L., eds., Handbook of Deontic Logic, volume 2. College Publications.
Ciabattoni, A.; Gulisano, F.; and Lellmann, B. 2018. Resolving conflicting obligations in Mīmāṃsā: a sequent-based approach. In Broersen, J.; Condoravdi, C.; Nair, S.; and Pigozzi, G., eds., Deontic Logic and Normative Systems — 14th International Conference, DEON 2018, Utrecht, The Netherlands, 3-6 July, 2018, 91–109.
Ciuciura, J. 2013. Non-adjunctive discursive logic. Bulletin of the Section of Logic 42(3/4):169–181.
Costa, H. A. 2005. Non-adjunctive inference and classical modalities. Journal of Philosophical Logic 34(5-6):581–605.
Danielsson, S. 1968. Preference and obligation, studies in the logic of ethics. Ph.D. Dissertation, Filosofiska Föreningen.
Føllesdal, D., and Hilpinen, R. 1970. Deontic logic: An introduction. In Deontic Logic: Introductory and Systematic Readings. Springer. 1–35.
Font, J. M., and Jansana, R. 2017. A general algebraic semantics for sentential logics. Cambridge University Press.
Fuhrmann, A. 2017. Deontic modals: Why abandon the default approach. Erkenntnis 82(6):1351–1365.
Gabbay, D.; Parent, X.; and van der Torre, L. 2019.
A geometrical view of I/O logic.
arXiv preprint
arXiv:1911.12837.
Goble, L. 2013. Prima facie norms, normative conflicts, and
dilemmas. Handbook of Deontic Logic 1:499–544.
Hansen, J. 2008. Imperatives and deontic logic–on the semantic foundations of deontic logic.
Hansson, B. 1969. An analysis of some deontic logics. Nous
373–398.
Horty, J. F. 2012. Reasons as defaults. Oxford University
Press.
Horty, J. 2014. Deontic modals: why abandon the classical
semantics? Pacific Philosophical Quarterly 95(4):424–460.
Jaśkowski, S. 1969. Propositional calculus for contradictory deductive systems (communicated at the meeting of March 19, 1948). Studia Logica: An International Journal for Symbolic Logic 24:143–160.
Kolodny, N., and MacFarlane, J. 2010. Ifs and oughts. The
Journal of philosophy 107(3):115–143.
Kratzer, A.; Pires de Oliveira, R.; and Pessotto, A. L. 2014. Talking about modality: an interview with Angelika Kratzer. ReVEL, special issue (8).
Kratzer, A. 2012. Modals and conditionals: New and revised
perspectives. Oxford University Press.
Lewis, D. 1974. Semantic analyses for dyadic deontic logic.
In Logical theory and semantic analysis. Springer. 1–14.
Lewis, D. 2013. Counterfactuals. John Wiley & Sons.
Lindahl, L., and Odelstad, J. 2013. The theory of joining-systems. Handbook of Deontic Logic 1:545–634.
Makinson, D., and van der Torre, L. 2000. Input/output
logics. Journal of philosophical logic 29(4):383–408.
Makinson, D., and van der Torre, L. 2001. Constraints
for input/output logics. Journal of Philosophical Logic
30(2):155–185.
Makinson, D., and van der Torre, L. 2003. Permission from
an input/output perspective. Journal of philosophical logic
32(4):391–416.
Makinson, D. 1999. On a fundamental problem of deontic
logic. Norms, logics and information systems. New studies
on deontic logic and computer science 29–54.
Makinson, D. 2005. Bridges from classical to nonmonotonic
logic. King’s College.
Nute, D. 2012. Defeasible deontic logic, volume 263.
Springer Science & Business Media.
Parent, X., and van der Torre, L. 2013. Input/output logic.
Handbook of Deontic Logic 1:499–544.
Parent, X., and van der Torre, L. 2014. Sing and dance!
In International Conference on Deontic Logic in Computer
Science, 149–165. Springer.
Parent, X., and van der Torre, L. 2017a. The pragmatic oddity in norm-based deontic logics. In Proceedings of the 16th edition of the International Conference on Artificial Intelligence and Law, 169–178.
Parent, X., and van der Torre, L. 2017b. Detachment in normative systems: Examples, inference patterns, properties. IfCoLog Journal of Logics and their Applications 4(9):2295–
3039.
Parent, X., and van der Torre, L. W. 2018a. I/O logics with a
consistency check. In Broersen, J.; Condoravdi, C.; Nair, S.;
and Pigozzi, G., eds., Deontic Logic and Normative Systems
— 14th International Conference, DEON 2018, Utrecht, The
Netherlands, 3-6 July, 2018, 285–299. College Publications.
Parent, X., and van der Torre, L. 2018b. Introduction to
deontic logic and normative systems. College Publications.
Parent, X.; Gabbay, D.; and van der Torre, L. 2014. Intuitionistic basis for input/output logic. In David Makinson
on Classical Methods for Non-Classical Problems. Springer.
263–286.
Stolpe, A. 2008a. Norms and norm-system dynamics. Department of Philosophy, University of Bergen, Norway.
Stolpe, A. 2008b. Normative consequence: The problem of
keeping it whilst giving it up. In International Conference
on Deontic Logic in Computer Science, 174–188. Springer.
Stolpe, A. 2010a. Relevance, derogation and permission.
In International Conference on Deontic Logic in Computer
Science, 98–115. Springer.
Stolpe, A. 2010b. A theory of permission based on the
notion of derogation. Journal of Applied Logic 8(1):97–113.
Stolpe, A. 2015. A concept approach to input/output logic.
Journal of Applied Logic 13(3):239–258.
Straßer, C.; Beirlaen, M.; and van de Putte, F. 2016. Adaptive logic characterizations of input/output logic. Studia
Logica 104(5):869–916.
Straßer, C. 2013. Adaptive logics for defeasible reasoning: Applications in argumentation, normative Reasoning
and default Reasoning. Springer Publishing Company, Incorporated.
Sun, X., and van der Torre, L. 2014. Combining constitutive and regulative norms in input/output logic. In International Conference on Deontic Logic in Computer Science,
241–257. Springer.
Sun, X. 2016. Logic and games of norms: a computational
perspective. Ph.D. Dissertation, University of Luxembourg,
Luxembourg.
Sun, X. 2018. Proof theory, semantics and algebra for
normative systems. Journal of logic and computation
28(8):1757–1779.
Van Fraassen, B. C. 1973a. The logic of conditional obligation. In Exact Philosophy. Springer. 151–172.
Van Fraassen, B. C. 1973b. Values and the heart’s command.
The Journal of Philosophy 70(1):5–19.
Von Fintel, K., and Heim, I. 2011. Intensional semantics.
Unpublished Lecture Notes.
Von Fintel, K. 2012. The best we can (expect to) get? Challenges to the classic semantics for deontic modals. In Central Meeting of the American Philosophical Association, February, volume 17.
Von Wright, G. H. 1951. Deontic logic. Mind 60(237):1–15.
Kratzer Style Deontic Logics in Formal Argumentation
Huimin Dong¹, Beishui Liao¹, Leendert van der Torre¹,²
¹ Department of Philosophy, Zhejiang University, China
² Department of Computer Science, University of Luxembourg, Luxembourg
huimin.dong@xixilogic.org, baiseliao@zju.edu.cn, leon.vandertorre@uni.lu
Abstract

Kratzer introduced an ordering source to define obligations, while Poole, Makinson, and others showed how to build non-monotonic logics using a similar approach of ordering default assumptions. In this paper we present three ways to use the ordering source to capture various defeasible deontic logics. We prove representation theorems showing how these defeasible deontic logics can be explained in terms of formal argumentation. We illustrate the logics with various benchmark examples of contrary-to-duty obligation.

1 Motivation

Reasoning with contrary-to-duty obligations has been widely studied in philosophy, linguistics, law, computer science, and artificial intelligence (Chisholm 1963; Prakken and Sergot 1996; Pigozzi and van der Torre 2017). A contrary-to-duty obligation says what should be the case if a violation occurs: "If someone violates the parking regulation, she should pay the compensation." As first discussed by Chisholm (1963), when the statements of a contrary-to-duty obligation and its violation are taken together, either inconsistency or logical dependency is inevitable in standard deontic logic (von Wright 1951). This is called the Chisholm Paradox. It is also called the problem of detachment in follow-up research (Prakken and Sergot 1996; Parent and van der Torre 2017).

Many formal tools in non-monotonic logics have been proposed to represent variants of contrary-to-duty and to represent the paradoxes in a consistent way (Straßer 2014; Governatori and Rotolo 2006). There are two main accounts for avoiding inconsistency in reasoning about contrary-to-duty. The first explicitly represents the paraconsistent structures of normative exceptions and throws them away when facing conflicts. For instance, in adaptive logic (Goble 2014; Straßer 2014) each model has a set of abnormal propositions used to remove deontic conflicts according to certain inferential rules. In contrast, the paraconsistent representation of obligation can also be expressed in terms of a sequential operator in substructural logic (Governatori and Rotolo 2006). The second account replaces paraconsistency with priority to handle normative conflicts, including methods from default logic (Horty 1994), defeasible deontic logic (Governatori 2018), preference deontic logic (van Benthem, Grossi, and Liu 2014), and formal argumentation (Liao et al. 2019).

Yet, considering one instance of the problem of detachment, the warning sign (Prakken and Sergot 1996), the usual non-monotonic tools, for instance abnormality or priority among obligations, do not seem to be explicitly expressed in the deontic modals:

• "There must be no dog."
• "If there is no dog, there must be no warning sign."
• "If there is a dog, there must be a warning sign."
• "There is a dog."

As some linguists argue, the problem of contrary-to-duty concerns whether we can successfully recognize what is ideal from what is actually true (Arregui 2010), and these can be linked in different ways, depending on how they are presumed. The problem of detachment is thus when we may make a hypothesis that corresponds to less-than-ideal circumstances, and why. These questions are answered by the hypothetical reasoning patterns in which the statements occur.

We adopt Kratzer's linguistic theory to build various hypothetical reasoning mechanisms for analyzing contrary-to-duty. To uniformly explore the logical nature of the linguistic expressions of contrary-to-duty, and then solve the paradox, we propose a family of Kratzer style deontic logics (Kratzer 1981; Horty 2014). We take two variants of standard deontic logic as the basic logics for hypothetical reasoning, in a modal language for obligation and ability. The interpretation offered by the Kratzer style inferences is twofold, as hypothetical reasoning is. To derive a conclusion from the above-mentioned premises, we not only have to check whether it comes from hypotheses but also examine whether the rules inferring it are defeasible. "Working hypotheses" are common in normative and legal reasoning (Makinson 1994), for instance the presumption of innocence in criminal law and that of capacity in the civil code. This intuition is captured by the notion of ordering source proposed by Kratzer (1981).

Compared to the formalisms proposed by Poole, Makinson, and others (Poole 1988; Makinson 1994; Freund 1998), our Kratzer ordering source can redefine the priorities among presumptions in compositional semantics. The ordering source can be well deployed as a function to define
preference over presumptions (Horty 2014). As a case study, we connect our deontic logics to the formal argumentation theory ASPIC+. We chose ASPIC+ to transform our defeasible deontic logics into a formal argumentation approach because defeasible knowledge and defeasible rules properly correspond to hypotheses, while preference can be well defined by the ordering source. We prove this connection by several representation theorems.

This paper is structured as follows. We first present two deontic logics motivated by an example of contrary-to-duty in Section 2. We then propose the modeling of the ordering source in Section 3. In Section 4 we define the possible defeasible deontic inferences in Kratzer's sense, and we present the logical properties satisfied by these Kratzer style deontic logics. Section 5 instantiates ASPIC+ in accordance with the ordering source. In Section 6 we prove the representation theorems connecting the defeasible deontic logics to ASPIC+. Related work is discussed in Section 7. We conclude in Section 8.
2 Deontic Logics

(PL) All tautologies
(MP) ϕ, ϕ → ψ / ψ
(NEC∆) ϕ / ∆ϕ
(RE□) ϕ ↔ ψ / □ϕ ↔ □ψ
(Dual□) ◇ϕ ↔ ¬□¬ϕ
(DualO) Pϕ ↔ ¬O¬ϕ
(U) □ϕ → Oϕ
(OiC) Oϕ → ◇ϕ
(D) ¬(Oϕ ∧ O¬ϕ)
(M□) □(ϕ ∧ ψ) → (□ϕ ∧ □ψ)
(C∆) (∆ϕ ∧ ∆ψ) → ∆(ϕ ∧ ψ)
(R) □(ϕ → ψ) → (Oϕ → Oψ)
(T) □ϕ → ϕ
(B) ϕ → □◇ϕ
(4) □ϕ → □□ϕ
(5) ◇ϕ → □◇ϕ

Table 1: Logic D+ of obligation and necessity, where ∆ ∈ {□, O}.

We define the notion of Hilbert-style derivation based on modal logic in the usual way; see e.g. the textbook on modal logic by Blackburn, de Rijke, and Venema (2002). Note that modal logic provides two related kinds of derivation, differing in the application of necessitation and uniform substitution: these rules can only be applied to theorems, not to an arbitrary set of formulas. Our connection from defeasible deontic logic to formal argumentation theory will use both notions.
We present two deontic logics, based on standard deontic logic, which are motivated to capture derivations of contrary-to-duty. The modal language we use is simple: we only have two monadic operators, one for obligation (O) and one for necessity (□), the latter for the agent's ability "seeing to it that." We could introduce indices for agents, but for simplicity we omit them here.

Definition 1 (Deontic Language). Let p be any element of a given (countable) set Prop of atomic propositions. The deontic language L of modal formulas is defined as follows:

ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | Oϕ | □ϕ

The disjunction ∨, the material conditional →, weak permission P, and possibility ◇ are defined as usual: ϕ ∨ ψ := ¬(¬ϕ ∧ ¬ψ), ϕ → ψ := ¬(ϕ ∧ ¬ψ), Pϕ := ¬O¬ϕ, and ◇ϕ := ¬□¬ϕ.
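For concreteness, the grammar of Definition 1 can be transcribed as a small abstract syntax tree. This sketch and its class names (Atom, Not, And, Ob, Box) are ours, not part of the paper; it only mirrors the grammar and the four defined connectives.

```python
from dataclasses import dataclass
from typing import Union

# Formulas of L:  ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | Oϕ | □ϕ
@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Not:
    sub: "Formula"

@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Ob:           # obligation O
    sub: "Formula"

@dataclass(frozen=True)
class Box:          # necessity □ ("seeing to it that")
    sub: "Formula"

Formula = Union[Atom, Not, And, Ob, Box]

# Defined connectives, exactly as in Definition 1:
def Or_(a, b):  return Not(And(Not(a), Not(b)))   # ϕ ∨ ψ := ¬(¬ϕ ∧ ¬ψ)
def Imp(a, b):  return Not(And(a, Not(b)))        # ϕ → ψ := ¬(ϕ ∧ ¬ψ)
def Perm(a):    return Not(Ob(Not(a)))            # Pϕ := ¬O¬ϕ
def Dia(a):     return Not(Box(Not(a)))           # ◇ϕ := ¬□¬ϕ
```

Because the dataclasses are frozen, structural equality comes for free, so defined connectives literally unfold to their definitions.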
The so-called upper-bound logic, presented in Definition 2, is our strongest monotonic deontic logic; we refer to it with the symbol "D," standing for "deontic." The axiomatization in Table 1 is a variant of standard deontic logic. Notice that the axiom MO of obligation expansion can be derived from the rule of necessitation NEC□ and the axiom R of obligation replacement, while the replacement rule REO for obligation is derivable from the axioms M□ and R. As Chellas (1980) shows, instead of having the axiom K, it is possible to replace it with the axioms M and C together with the replacement rule RE. So the logic D+ is a possible representation of obligation and ability suited to uniformly handling non-monotonic reasoning later. The upper-bound logic D+ includes axioms and rules for obligation and for ability of agency, as well as their interactions. The axiom OiC is the principle of "ought implies can," and the axiom U expresses obligation enforcement. The axiom R illustrates this intuition: "When ϕ enforces ψ, if ϕ is obligatory then ψ is also obligatory." The axiom T for agents' abilities indicates that "the enforcement by the agent makes it the case."

Definition 2 (Upper-bound logic D+). The deontic logic D+ is the system including all axioms and rules in Table 1.
Definition 3 (Derivations without Premises). Let D be a deontic logic. A derivation of φ in D is a finite sequence φ1, . . . , φn−1, φn such that φ = φn and every φi (1 ≤ i ≤ n) in this sequence is either an instance of one of the axioms of D or the result of applying one of the rules of D to formulas appearing before φi. We write ⊢D φ if there is a derivation of φ in D, and say that φ is a theorem of D, or that D proves φ. We write Cn(D) for the set of all theorems of D.

Definition 4 (Derivations from Premises). Let D be a deontic logic. Given a set Γ of formulas, a derivation of φ from Γ in D is a finite sequence φ1, . . . , φn−1, φn such that φ = φn and every φi (1 ≤ i ≤ n) in this sequence is either in Cn(D) ∪ Γ or the result of applying one of the rules of D (other than RE and NEC) to formulas appearing before φi. We write Γ ⊢D φ if there is a derivation of φ from Γ in D,¹ and say that φ is derivable in D from Γ. We write CnD(Γ) for the set of formulas derivable in D from Γ.

A system D is consistent iff ⊥ ∉ Cn(D); otherwise it is inconsistent. A set Γ is D-consistent iff ⊥ ∉ CnD(Γ); otherwise it is D-inconsistent. A set Γ′ ⊆ Γ is a maximally D-consistent subset of Γ, denoted Γ′ ∈ MCSD(Γ), iff Γ′ is D-consistent and there is no D-consistent Γ″ ⊆ Γ such that Γ″ ⊋ Γ′.
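The maximally D-consistent subsets MCSD(Γ) of a finite Γ can be enumerated by brute force once a D-consistency check is available. The sketch below is ours: it abstracts D-consistency into a callable and, for illustration only, plugs in a toy oracle over propositional literal strings rather than the modal logics of the paper.

```python
from itertools import combinations

def maximal_consistent_subsets(gamma, consistent):
    """Enumerate the ⊆-maximal consistent subsets of a finite gamma,
    relative to an arbitrary consistency oracle standing in for D-consistency."""
    gamma = list(gamma)
    found = []
    # scan subsets from largest to smallest, keeping only un-dominated hits
    for k in range(len(gamma), -1, -1):
        for combo in combinations(gamma, k):
            s = set(combo)
            if consistent(s) and not any(s < f for f in found):
                found.append(s)
    return found

# Toy oracle (our assumption, not the paper's D− or D+): a set of literal
# strings is consistent iff it contains no pair "x" and "¬x".
def literal_consistent(s):
    return not any(("¬" + x) in s for x in s)

mcs = maximal_consistent_subsets({"m", "¬m", "e"}, literal_consistent)
```

On {m, ¬m, e} this yields the two maximal consistent subsets {m, e} and {¬m, e}, the shape of situation that Example 1 below exploits.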
Now we formalize the reasoning of the scenario Instructions of a Party Host in the logic D+.

¹ Alternatively, Γ ⊢D φ can be seen as the theorem ⊢D ⋀Γ → φ, by the deduction theorem.
Example 1 (Instructions of a Party Host). This scenario is interpreted as a factual detachment paradox, or a contrary-to-duty paradox, proposed by Prakken and Sergot (1996).

(1) "John and Kevin should not meet." O¬m
(2) "If John and Kevin meet, they should be forced to embrace." □(m → Oe)
(3) "John and Kevin meet." m
(4) "John and Kevin cannot be forced to embrace if they do not meet." ¬◇(e ∧ ¬m)

where m stands for "John and Kevin meet" and e for "John and Kevin are forced to embrace." Intuitively, the four statements should be jointly satisfiable. But they are inconsistent in the derivations that take them as premises in D+.

Oe: 2, 3, T
O¬e: 1, 4, R
Om: 2, 3, 4, Dual□, T, R
O(e ∧ m): 2, 3, 4, CO
O(¬e ∧ m): 1, 2, 3, 4, CO
O¬m: 1
O(e ∧ ¬m): 1, 2, 3, CO
O(¬e ∧ ¬m): 1, 4, CO

Table 2: The derived formulas are on the left side of the colon, while the assumptions, axioms, and rules used to derive them are on the right side.

To keep consistency in this case of contrary-to-duty, there are two possible ways to obtain the hypothetical reasoning patterns (Makinson 1994) for defeasible derivations. We can hypothesize certain premises as normal assumptions, which can be challenged when the results derived from them are inconsistent; in this case, derivations with defeasible premises are themselves defeasible. We can also hypothesize that certain axioms or inference rules are applied in a defeasible manner, making their derivations provisional. Given these basic ideas, the reasons for defeasibly deriving the normative statements regarding e and m in Table 2 can be analyzed as follows:

1. As the axiom D indicates, the derived result Om is not consistent with the given premise (1) O¬m; nor are the derived results Oe and O¬e consistent with each other.

2. When the premise (1) O¬m is accepted, to remain consistent we suppose the conclusion Om is defeasible and can thus be defeated by premise (1).

(a) In this case, the defeasibility of Om can be traced back to the premises, if at least one of them is defeasible. Let us presume that premise (2) is defeasible while premises (3) and (4) are not, because the former is a deontic statement while the latter are factual statements.

(b) We can also consider one of the axioms applied for this result, Dual□, T, or R, to be defeasible. Here we only consider the axiom T or the axiom R, because Dual□ is common to most modal logics.

3. Suppose no premises are defeasible, and neither are the derivations necessarily using the axiom T, such as the one deriving Oe. In such a case, by using the aggregation axiom CO, we conclude "John and Kevin should not meet and be forced to embrace," O(e ∧ ¬m). Worse, given the indefeasible axiom OiC, this infers the ability of being forced to embrace while not meeting, ◇(e ∧ ¬m), which is not consistent with the assumption (4). As Prakken and Sergot (1996) point out, the obligations O¬m and Oe are temporally independent, and obligation aggregation should not be applied to them. This analysis thus offers a reason why the application of the axiom CO is defeasible. Of course, we can take a step back and instead consider the axiom T defeasible, because not every act can be achieved.

4. If the premise (1) is defeasible and defeated by the conclusion Om, we next consider whether the application of the axiom R is defeasible. Suppose it is not. In addition, suppose premises (1) and (2) are defeasible while neither (3) nor (4) is. Now we have the conclusions Oe and O¬e, both obtained from defeasible premises. Assume that the application of the axiom T is defeasible. This gives us a reason to say that O¬e is less defeated than Oe. Even if we do not consider the result O(¬e ∧ m) given by the aggregation axiom, we already obtain something far from well behaved: John and Kevin should meet, Om, and they should not be forced to embrace, O¬e. That does not sound like a good guideline.

We can then consider defeasible either the derivations with deontic premises or those necessarily using one of the axioms CO, T, and R. In particular, the three axioms have defeasible readings in natural language. We can read the axiom CO as "If ϕ is obligatory and ψ is obligatory then, normally, ϕ and ψ is obligatory." The axiom T reads as "The enforcement by the agent, normally, makes it the case." Similarly, the axiom R reads as "When ϕ enforces ψ, if ϕ is obligatory then, normally, ψ is also obligatory." Taking different patterns of hypothetical reasoning, either by the premises or by the applications of axioms or rules, we obtain different ways to resolve the inconsistency in Example 1. In Section 4 this method will be used to show how to handle the problem of detachment according to the way hypotheses are made.

Now we define the so-called lower-bound logic D− in Definition 5 as the weakest deontic logic used for non-monotonic reasoning later. As discussed in Example 1, it is obtained by removing some inference rules and axioms from the upper-bound logic. Roughly, the formulas derived using these removed rules and axioms become less prioritized; in terms of formal argumentation, they are the results derived by the defeasible principles of our defeasible deontic logic. For instance, D− removes the axiom R, so it contains neither NECO nor the controversial principle MO of obligation expansion derived by R. Nor does it have obligation aggregation, represented by the axiom CO in Example 1. We also remove the axiom T, to obtain a weaker sense of the agent's "seeing to it that."

Definition 5 (Lower-bound logic D−). The deontic logic D− is the system including all axioms and rules of D+ except R, CO, and T.²

² This deontic logic D− then reduces to a minimal deontic logic in neighborhood models (Chellas 1980).
Although the lower-bound logic does not accept obligation aggregation, it does admit ¬O⊥ by the axiom OiC and ¬(Oϕ ∧ O¬ϕ) by the axiom D. Notice that none of the four aggregated obligations in Table 2 is derivable in D−.

3 Defining Ordering Source

Our Kratzer style ordering source is defined from the perspective of hypotheses. As discussed in Example 1, the hypothetical reasoning of contrary-to-duty can differentiate derivations either by the premises or by the logics:

• Firm Derivation. Premises are firm and certain in the conversational background, and this background information cannot be changed. If we take ¬◇(e ∧ ¬m) as background information, then all derivations from this firm premise are also firm, as is the one deriving ¬O(e ∧ ¬m).

• Plausible Derivation. Premises are hypothetical and plausible as beliefs, and so introduce uncertainty into the conversation. This plausible information can be changed if its contraries are firm. If we consider the premises O¬m and □(m → Oe) as plausible and m as firm, then the derivation of the formula O(e ∧ ¬m) from all these premises is also plausible.

• Strict Derivation. The derivations in the lower-bound logic are considered strict. For instance, the derivation of ¬O(e ∧ ¬m) from the premise ¬◇(e ∧ ¬m) is strict.

• Defeasible Derivation. The derivations in the upper-bound logic but not in the lower-bound logic are considered defeasible. So the derivations of O(e ∧ ¬m) and ◇(e ∧ ¬m) from O¬m, □(m → Oe), m are defeasible.

Therefore, firm and plausible derivations are mutually disjoint, and so are strict and defeasible derivations.

Definition 6 (Types of Formulas). Given two sets of formulas Γf and Γp, where Γf is a set of firm premises and Γp is a set of plausible premises. A formula ϕ is derivable in a firm derivation iff there is Γ ⊆ Γf such that Γ ⊢ ϕ. A formula ϕ is derivable in a plausible derivation iff there is a Γ with Γ ∩ Γp ≠ ∅ such that Γ ⊢ ϕ. A formula ϕ is derivable in a strict derivation iff there is Γ ⊆ L such that Γ ⊢D− ϕ. A formula ϕ is derivable in a defeasible derivation iff there is Γ ⊆ L such that Γ ⊢D+ ϕ but there is no Γ ⊆ L such that Γ ⊢D− ϕ.

As Kratzer (1981) required, the comparison brought by the ordering source should follow the principle of consistency. Simply saying that a formula derivable in a strict derivation is stronger than one derivable in a defeasible derivation easily leads to inconsistency, as the following example shows.

Example 2. It is possible that a strict derivation of ¬O(e ∧ ¬m) is plausible, if the premise ¬◇(e ∧ ¬m) is considered plausible. Notice that the derivation of O(e ∧ ¬m) is defeasible. Can we say that ¬O(e ∧ ¬m), from a strict derivation, is stronger than its contrary from the defeasible one? What if the defeasible derivation is firm?

To keep the comparison between formulas consistent, our ordering source simply stipulates the priority over formulas derivable from the strongest derivations to the weakest ones: firm and strict, denoted Dfs; firm and defeasible, denoted Dfd; plausible and strict, denoted Dps; plausible and defeasible, denoted Dpd. Observe that CnD+(Γf ∪ Γp) = Dfs ∪ Dfd ∪ Dps ∪ Dpd.

We follow Kratzer and define the ordering source as a function g, which maps a pair (Γf, Γp) of assumptions to a set of subsets of all derivation results Dfs ∪ Dfd ∪ Dps ∪ Dpd. Precisely, this intuition stipulates g(Γf, Γp) as follows:

{Dpd, Dpd ∪ Dps, Dpd ∪ Dps ∪ Dfd, Dpd ∪ Dps ∪ Dfd ∪ Dfs}.

Then g(Γf, Γp) consists of elements ranging from the set of formulas derivable in the weakest derivations to the set of those derivable in all derivations, including the strongest. The Kratzer style ordering ≤g(Γf,Γp) induced by g(Γf, Γp) is defined as follows:

ϕ ≤g(Γf,Γp) ψ iff ∀X ∈ g(Γf, Γp). (ψ ∈ X ⇒ ϕ ∈ X).

In other words, ϕ ≤g(Γf,Γp) ψ holds when every element of g(Γf, Γp) that contains ψ also contains ϕ.

The following example shows how this classification works as a guideline for interpreting the reasoning of contrary-to-duty. Simply speaking, our ordering source respects the derived results of the firm derivations first, and then considers the logics used in the derivations.

Example 3 (Instructions of a Party Host, continued). Let Γf = {m, ¬◇(e ∧ ¬m)} be the set of firm premises and Γp = {□(m → Oe), O¬m} the set of plausible premises. Now we know the following results:

• Dfs contains: m, ¬◇(e ∧ ¬m), ¬O(e ∧ ¬m), □(e → m), O(e → m)
• Dfd = ∅
• Dps contains: □(m → Oe), O¬m
• Dpd contains: Om, Oe, O¬e, O(e ∧ m), O(e ∧ ¬m), O(¬e ∧ m), O(¬e ∧ ¬m), ◇(e ∧ ¬m)

Then we observe that the contraries of Om, O(e ∧ m), O(e ∧ ¬m), O(¬e ∧ m), and ◇(e ∧ ¬m) are at higher priorities. Given the stipulation of priority by the ordering source, the Kratzer style deontic consequences will exclude them. Importantly, we can find the reasons in the conversation for each selection. We remove ◇(e ∧ ¬m) because it comes from a derivation with a defeasible rule and a hypothetical assumption, while its contrary is a firm assumption beyond doubt in the conversational background.

Further, the obligation of being forced to embrace, Oe, and the obligation of not being forced to embrace, O¬e, are at the same priority and inconsistent with each other. We shall decide when to choose Oe and when to choose O¬e, as well as O(¬e ∧ ¬m), in constructing the Kratzer style consistent sets.
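The nested family g(Γf, Γp) and the induced relation ≤g can be transcribed directly, treating formulas as opaque strings. This is our illustrative sketch, not the paper's machinery; the sample sets below are a hand-picked fragment of Example 3.

```python
def ordering_source(d_fs, d_fd, d_ps, d_pd):
    """g(Γf, Γp): the nested family {Dpd, Dpd∪Dps, Dpd∪Dps∪Dfd, Dpd∪Dps∪Dfd∪Dfs}."""
    family, acc = [], set()
    for layer in (set(d_pd), set(d_ps), set(d_fd), set(d_fs)):
        acc = acc | layer
        family.append(set(acc))
    return family

def leq(phi, psi, g):
    """ϕ ≤_g ψ  iff  every X ∈ g that contains ψ also contains ϕ."""
    return all(phi in X for X in g if psi in X)

# Fragment of Example 3: ¬◇(e∧¬m) is firm-and-strict, O¬m plausible-and-strict,
# Oe plausible-and-defeasible; Dfd is empty.
g = ordering_source(d_fs={"¬◇(e∧¬m)"}, d_fd=set(), d_ps={"O¬m"}, d_pd={"Oe"})

assert leq("Oe", "O¬m", g) and not leq("O¬m", "Oe", g)
assert leq("Oe", "¬◇(e∧¬m)", g) and not leq("¬◇(e∧¬m)", "Oe", g)
```

The asserts show the intended shape of the ordering: the plausible-and-defeasible Oe sits strictly below both the plausible-and-strict O¬m and the firm-and-strict ¬◇(e ∧ ¬m).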
Now we turn to a further discussion of the Kratzer style deontic inferences.

4 Kratzer Style Deontic Inferences

All Kratzer style deontic inferences in this paper are defined by maximal consistent sets of formulas according to the ordering source. As shown in Section 3, we propose to consider first a maximally consistent subset of the formulas in the strongest Dfs, and then a consistent subset of the weaker Dfd such that it is also consistent with Dfs and maximal; we repeat this process of "keeping consistency with the previous layer as much as possible" for Dps and the weakest Dpd. We call such a set a full layer and define it as follows.

Definition 7 (Full Layers). Given two sets of formulas Γ, Γ′ ⊆ L such that Γ ∩ Γ′ = ∅, we denote S0 = S2 = D− and S1 = S3 = D+. We recursively define a layer L(Σi), where i ∈ {0, 1, 2, 3}, as follows:

• L(Σ0) = CnD−(Σ0);
• L(Σi+1) is defined as follows:
(i) L(Σi+1) ⊆ CnSi+1(Σi+1);
(ii) L(Σi+1) is consistent w.r.t. Si with every ϕ ∈ L(Σi); and
(iii) there is no ∆ ⊋ L(Σi+1) such that ∆ ⊆ CnSi+1(Σi+1) and, for all ϕ ∈ L(Σi), ∆ ∪ {ϕ} is Si-consistent;

where Σ0 ∈ MCSD−(Γ), Σ1 ∈ MCSD+(L(Σ0)), Σ2 ∈ MCSD−(L(Σ1) ∪ Γ′), and Σ3 ∈ MCSD+(L(Σ2)). We then define:

• a full layer F(Σ0) from the pair (Γ, Γ′) starting at Σ0 ∈ MCSD−(Γ) as ⋃i∈{0,1,2,3} L(Σi);
• the set L(Γ, Γ′) of full layers as {F(Σ0) | Σ0 ∈ MCSD−(Γ)}.

A layer is a consistent subset whose elements are consistent with the previous, stronger layer as much as possible. We can see that a layer L(Σ0) is a consistent subset of Dfs, a layer L(Σ1) is a consistent subset of Dfs ∪ Dfd, a layer L(Σ2) is a consistent subset of Dfs ∪ Dfd ∪ Dps, and a layer L(Σ3) is a consistent subset of Dfs ∪ Dfd ∪ Dps ∪ Dpd.

Example 4 (Instructions of a Party Host, continued). Given Γ = {m, ¬◇(e ∧ ¬m)} and Γ′ = {□(m → Oe), O¬m}. Following the classification given by the ordering source in Example 3, for each group we put inconsistent formulas of the same priority into different full layers. So we have two full layers according to Dpd:

• One full layer includes: m, ¬◇(e ∧ ¬m), ¬O(e ∧ ¬m), □(e → m), O(e → m), □(m → Oe), O¬m, Oe;
• Another full layer includes: m, ¬◇(e ∧ ¬m), ¬O(e ∧ ¬m), □(e → m), O(e → m), □(m → Oe), O¬m, O¬e, O(¬e ∧ ¬m).

We see that inconsistency arises in Dpd but not in the other three sets of Example 3. In other words, the inconsistency is brought about by the axioms and rules of the upper-bound logic applied to the plausible premises. Therefore each full layer contains all elements of Dfs ∪ Dfd ∪ Dps; the full layers differ only in Dpd.

According to the Kratzer style full layers, we define two types of defeasible deontic inferences, widely investigated in the literature on non-monotonic reasoning: the so-called skeptical defeasible inference and the so-called credulous defeasible inference (Horty 1994). The former takes the intersection of all full layers to define consequences, while the latter takes their union. As shown in Example 4, fixing the premises, it is possible to have more than one Kratzer style maximal consistent set of formulas, so these two constructions provide different Kratzer style deontic consequences.

Definition 8 (Skeptical Defeasible Inferences). Given two sets of formulas Γ, Γ′ ⊆ L such that Γ ∩ Γ′ = ∅, we define the closure operator

D∀Γ′(Γ) = ⋂ L(Γ, Γ′).

The defeasible inference |∼∀Γ′ corresponding to this closure operator is defined as follows: Γ |∼∀Γ′ ϕ iff ϕ ∈ D∀Γ′(Γ). We write |∼∀Γ′ ϕ when ∅ |∼∀Γ′ ϕ.

Example 5 (Instructions of a Party Host, continued). Given the two sets Γ = {m, ¬◇(e ∧ ¬m)} and Γ′ = {□(m → Oe), O¬m}. Then m, ¬◇(e ∧ ¬m), ¬O(e ∧ ¬m), □(e → m), O(e → m), □(m → Oe), O¬m ∈ D∀Γ′(Γ). Meanwhile, we also have Oe ∨ O¬e ∈ D∀Γ′(Γ).

Definition 9 (Credulous Defeasible Inferences). Given two sets of formulas Γ, Γ′ ⊆ L such that Γ ∩ Γ′ = ∅, we define the closure operator

D∃Γ′(Γ) = ⋃ L(Γ, Γ′).

The defeasible inference |∼∃Γ′ corresponding to this closure operator is defined as follows: Γ |∼∃Γ′ ϕ iff ϕ ∈ D∃Γ′(Γ). We write |∼∃Γ′ ϕ when ∅ |∼∃Γ′ ϕ.

Example 6 (Instructions of a Party Host, continued). Given the two sets Γ = {m, ¬◇(e ∧ ¬m)} and Γ′ = {□(m → Oe), O¬m}. Now we have the result: D∀Γ′(Γ) ∪ {Oe, O¬e, O(¬e ∧ ¬m)} ⊆ D∃Γ′(Γ).

We present an observation about the connection between these two Kratzer style deontic inferences and the classical consequences as follows.

Proposition 1. ⊢D− ⊆ |∼∀Γ′ ⊆ |∼∃Γ′ ⊆ ⊢D+.

Now we consider the third type of Kratzer style deontic inference, which focuses on derivations rather than on the resulting consequences of derivations.

Definition 10 (All-Things-Considered Defeasible Inferences). Given two sets of formulas Γ, Γ′ ⊆ L such that Γ ∩ Γ′ = ∅, we define the closure operator

DΓ′(Γ) = {ϕ | (∆, ϕ) ∈ ⋂ D(Γ, Γ′)},

where the set D(Γ, Γ′) of all derivations from the set of full layers is defined as:

D(Γ, Γ′) = {(∆, ϕ) | ∆ ⊆ Γ ∪ Γ′ and ϕ ∈ F ∈ L(Γ, Γ′)}.

The defeasible inference |∼Γ′ corresponding to this closure operator is defined as follows: Γ |∼Γ′ ϕ iff ϕ ∈ DΓ′(Γ). We write |∼Γ′ ϕ when ∅ |∼Γ′ ϕ.
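At the level of sets, Definitions 8 and 9 reduce to intersecting or uniting the full layers. The following is a minimal sketch of ours, using the two full layers of Example 4 with formulas as opaque strings and ignoring deductive closure.

```python
from functools import reduce

def skeptical(full_layers):
    """D∀: intersection of all full layers (Definition 8)."""
    return reduce(lambda a, b: a & b, full_layers)

def credulous(full_layers):
    """D∃: union of all full layers (Definition 9)."""
    return reduce(lambda a, b: a | b, full_layers)

# The two full layers of Example 4:
common = {"m", "¬◇(e∧¬m)", "¬O(e∧¬m)", "□(e→m)", "O(e→m)", "□(m→Oe)", "O¬m"}
layer1 = common | {"Oe"}
layer2 = common | {"O¬e", "O(¬e∧¬m)"}

assert skeptical([layer1, layer2]) == common          # the agreed, safer part
assert {"Oe", "O¬e"} <= credulous([layer1, layer2])   # both rivals survive credulously
```

The sketch makes the divergence of the two constructions tangible: the skeptical consequence set is exactly the common part of the layers, while the credulous one contains both Oe and O¬e at once.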
Example 7 (Instructions of a Party Host, continued). Recall from Example 5 that Oe ∨ O¬e ∈ D∀Γ′(Γ). This is not the case for the all-things-considered defeasible inference, because this result cannot be derived from the same premises. More precisely, Oe ∨ O¬e ∉ DΓ′(Γ).

Proposition 2. ⊢D− ⊈ |∼Γ′ and |∼Γ′ ⊆ ⊢D+.

Next, we check whether the Kratzer style deontic inferences satisfy some important properties regarding non-monotonicity.

Proposition 3. Let ⊩ ∈ {|∼∀Γ′, |∼∃Γ′, |∼Γ′}, and define:
1. Reflexivity: Γ ⊩ ϕ where ϕ ∈ Γ
2. Cut: If Γ ∪ {ψ} ⊩ χ and Γ ⊩ ψ then Γ ⊩ χ
3. Cautious Monotony: If Γ ⊩ ψ and Γ ⊩ χ then Γ ∪ {ψ} ⊩ χ
4. Left Logical Equivalence: If CnD+(Γ) = CnD+(∆) and Γ ⊩ χ then ∆ ⊩ χ
5. Right Weakening: If ⊢D+ ϕ → ψ and Γ ⊩ ϕ then Γ ⊩ ψ
6. OR: If Γ ⊩ ϕ and ∆ ⊩ ϕ then Γ ∪ ∆ ⊩ ϕ
7. AND: If Γ ⊩ ψ and Γ ⊩ χ then Γ ⊩ ψ ∧ χ
8. Rational Monotony: If Γ ⊩ χ and Γ ⊮ ¬ψ then Γ ∪ {ψ} ⊩ χ

The satisfaction results are shown in Table 3.

Properties                  |∼∀Γ′   |∼∃Γ′   |∼Γ′
Reflexivity                 ✓*      ✓       ✓*
Cut                         ✓       ✓       ✓
Cautious Monotony           ✓       ✓       ✓
Left Logical Equivalence    ✓       ✓       ✓
Right Weakening             ✓       ✓       ✓
OR                          No      No      No
AND                         ✓       ✓       ✓
Rational Monotony           ✓       No      ✓

Table 3: The symbol ✓* indicates that this property is satisfied when the given knowledge base is consistent in D−.

Example 8 (Miss Manners). This counterexample to OR illustrates the so-called Miss Manners case discussed by Horty (2014) for deontic detachment. Let Γ = {O¬f, On}, ∆ = {O(a → f), On}, and Γ′ = {Pa}, where f stands for "eating with fingers", a for "being served asparagus", and n for "putting the napkin". We then have the following result for ⊩ ∈ {|∼∀Γ′, |∼∃Γ′, |∼Γ′}:

• Γ ⊩ Pa and ∆ ⊩ Pa, but Γ ∪ ∆ ⊮ Pa.

Example 9 (Instructions of a Party Host, continued). Now we provide a counterexample to Rational Monotony for |∼∃Γ′. Assume that Γ = {m}, Γ′ = {¬◇(e ∧ ¬m), O¬m}, χ = ¬Oe ∨ O(¬e ∧ ¬m), and ψ = □(m → Oe). We then have the following results:

• Γ |∼∃Γ′ χ and Γ ⊮∃Γ′ ¬ψ;
• However, Γ ∪ {ψ} ⊮∃Γ′ χ.

5 Instantiating ASPIC+

There are two main ways in the formal argumentation literature to make a monotonic logic like D+ defeasible: using either a defeasible knowledge base (a kind of assumptions) or defeasible rules. ASPIC+ (Modgil and Prakken 2018) is one of the few approaches that combines both. This has been criticized by proponents of other approaches, who suggest that one of these ways can be reduced to the other. We do not take a stance in this discussion; we just observe that the availability of both defeasible knowledge and defeasible rules, as well as the possibility of defining preferences over arguments, can be used to capture deontic arguments via the Kratzer style deontic inferences.

In the following, we define deontic arguments based on the lower-bound and upper-bound logics. Following the spirit of the ordering source in Section 3, ASPIC+ style arguments (Modgil and Prakken 2018) are defined in terms of strict and defeasible rules, and knowledge. The knowledge base can be defeasible or not, and this does affect the definition of arguments.

Our instantiation of ASPIC+ is twofold in terms of Kratzer's deontic modals (Kratzer 1981; Horty 2014). The knowledge base is divided into strict and defeasible knowledge, i.e. Ks and Kd, corresponding to the firm and plausible derivations, while the inference rules are categorized into strict and defeasible rules, i.e. Rs and Rd, corresponding to the strict and defeasible derivations. Preference in ASPIC+ is defined by the ordering source as discussed in Section 3. In addition, in many approaches to formal argumentation arguments are similar to derivations, but in our approach they are not the same: although each argument corresponds to a derivation via its top rule, the argument explicitly represents each step of this derivation as a finite sequence.

Definition 11 (Inference Rules and Arguments). Let K = Ks ∪ Kd ⊆ L be a knowledge base such that Ks ∩ Kd = ∅, and let R = Rs ∪ Rd be a set of rules such that

• Rs = {φ1, . . . , φn ↦ φ | {φ1, . . . , φn} ⊢D− φ} is the set of strict rules, and
• Rd = {φ1, . . . , φn ⇒ φ | {φ1, . . . , φn} ⊢D+ φ and {φ1, . . . , φn} ⊬D− φ} is the set of defeasible rules.

For each n ∈ ℕ, the set An is defined by induction as follows:

A0 = K
An+1 = An ∪ {B1, . . . , Bm ↝ ψ | Bi ∈ An for all i ∈ {1, . . . , m} and ψ ∈ L}

where for an element B = B1, . . . , Bm ↝ ψ:

• If B ∈ K, then Prem(B) = {ψ}, Conc(B) = ψ, Sub(B) = {ψ}, Rulesd(B) = ∅, and TopRule(B) = undefined, where ψ ∈ K.
• If B = B1, . . . , Bm ↝ ψ where ↝ is ↦, then Conc(B1), . . . , Conc(Bm) ↦ ψ ∈ Rs with
Prem(B) = Prem(B1) ∪ . . . ∪ Prem(Bm),
Conc(B) = ψ,
Sub(B) = Sub(B1) ∪ . . . ∪ Sub(Bm) ∪ {B},
Rulesd(B) = Rulesd(B1) ∪ . . . ∪ Rulesd(Bm),
TopRule(B) = Conc(B1), . . . , Conc(Bm) ↦ ψ.
• $A \in \mathcal{A}$ is acceptable w.r.t. $E$ iff whenever $B \in \mathcal{A}$ is such that $(B, A) \in D$, there exists $C \in E$ such that $(C, B) \in D$.
• $E$ is an admissible set iff $E$ is conflict-free and every $A \in E$ is acceptable w.r.t. $E$.
• $E$ is a complete extension iff $E$ is admissible and every $A \in \mathcal{A}$ that is acceptable w.r.t. $E$ belongs to $E$.
• $E$ is a stable extension iff $E$ is conflict-free and for every $B \notin E$ there exists $A \in E$ such that $(A, B) \in D$.
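Because the frameworks in our running examples are finite, these definitions can be checked by brute force. The following sketch (illustrative only; the function and argument names are ours, not the paper's) enumerates the stable extensions of a finite framework:

```python
from itertools import combinations

def stable_extensions(args, defeats):
    """Enumerate the stable extensions of a finite framework (args, defeats):
    E is stable iff E is conflict-free and every argument outside E
    is defeated by some member of E."""
    exts = []
    for r in range(len(args) + 1):
        for cand in combinations(sorted(args), r):
            e = set(cand)
            conflict_free = all((a, b) not in defeats for a in e for b in e)
            defeats_rest = all(any((a, b) in defeats for a in e)
                               for b in set(args) - e)
            if conflict_free and defeats_rest:
                exts.append(e)
    return exts

# Two mutually defeating arguments yield two stable extensions.
print(stable_extensions({"A", "B"}, {("A", "B"), ("B", "A")}))  # [{'A'}, {'B'}]
```

The exponential enumeration is of course only a specification-level check, not an efficient procedure.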
• If $B = B_1, \ldots, B_m \rhd \psi$ where $\rhd$ is $\Rightarrow$, then the conditions are as in the previous item, except that the rule is defeasible and $Rules_d(B) = Rules_d(B_1) \cup \ldots \cup Rules_d(B_m) \cup \{Conc(B_1), \ldots, Conc(B_m) \Rightarrow \psi\}$.
We define $\mathcal{A} = \bigcup_{n \in \mathbb{N}} A_n$ as the set of arguments on the basis of $K$, and define $Conc(E) = \{Conc(A) \mid A \in E\}$ for $E \subseteq \mathcal{A}$. We define the set of formulas of a given argument as $F(D) = Prem(D) \cup \{Conc(D)\}$ for $D \in \mathcal{A}$, and let $F(E) = \bigcup \{F(D) \mid D \in E\}$ for $E \subseteq \mathcal{A}$.
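The staged construction $A_0, A_1, \ldots$ stabilizes once no rule adds a new conclusion. A minimal sketch (assumed names; rules are taken to be ground, with a flag marking defeasibility) computes the conclusions reachable from a knowledge base:

```python
def derivable_conclusions(knowledge, rules):
    """Iterate the A_{n+1} construction at the level of conclusions:
    starting from the formulas in K, fire every rule whose antecedents
    are all already derived, until a fixed point is reached."""
    concs = set(knowledge)
    changed = True
    while changed:
        changed = False
        for antecedents, (head, _defeasible) in rules.items():
            if set(antecedents) <= concs and head not in concs:
                concs.add(head)
                changed = True
    return concs

# From K = {m} and the defeasible rule m => Oe we derive Oe.
print(sorted(derivable_conclusions({"m"}, {("m",): ("Oe", True)})))  # ['Oe', 'm']
```

The full construction also tracks the argument structure ($Prem$, $Sub$, $TopRule$); this sketch only follows $Conc$.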
The following example illustrates the arguments in the running example.

Example 10 (Instructions of a Party Host, continued). We take the knowledge base given by $K_s = \{m, \Diamond(e \wedge \neg m)\}$ and $K_d = \{\Box(m \to Oe), O\neg m\}$. The knowledge base yields four atomic arguments: the strict knowledge $A = m$ and $B = \Diamond(e \wedge \neg m)$, as well as the defeasible knowledge $C = \Box(m \to Oe)$ and $D = O\neg m$. The arguments leading to the conclusions $O(\neg e \wedge \neg m)$, $Oe$, $O\neg e$ are as follows:
1. The arguments whose top rules are strict rules:
• $D_1 = B \mapsto O(\neg e \wedge \neg m)$. The premise of argument $D_1$ is strict knowledge.
2. The arguments whose top rules are defeasible rules:
• $D_2 = A, C \Rightarrow Oe$
• $D_3 = B, D \Rightarrow O\neg e$
• $D_4 = A, C, D \Rightarrow \neg O(\neg e \wedge \neg m)$
• $D_5 = A, C, D \Rightarrow \neg\Diamond(e \wedge \neg m)$
All four of these arguments have at least one piece of defeasible knowledge among their premises.

Example 11 (Instructions of a Party Host, continued). In Example 10, the arguments are ordered as $D_2, D_3, D_4, D_5 \leq C, D \leq D_1, A, B$. The defeats include $(B, D_5)$, $(D_1, D_4)$, $(D_2, D_3)$, $(D_3, D_2)$, as illustrated in Figure 1.
[Figure 1 appears here.]
Figure 1: The closed circle represents the argument with a strict rule as its top rule, and the dashed circles represent the arguments with defeasible rules as their top rules. The arrows represent defeat relations.
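Under assumed string names for the six depicted arguments, one can check directly that the defeat relation of Figure 1 admits $\{B, D_1, D_2\}$ (and, symmetrically, $\{B, D_1, D_3\}$) as a stable extension:

```python
args = {"B", "D1", "D2", "D3", "D4", "D5"}
defeats = {("B", "D5"), ("D1", "D4"), ("D2", "D3"), ("D3", "D2")}

E = {"B", "D1", "D2"}
# Conflict-free: no defeat holds between two members of E.
conflict_free = all((a, b) not in defeats for a in E for b in E)
# Stability: every outsider (D3, D4, D5) is defeated by some member of E.
defeats_rest = all(any((a, b) in defeats for a in E) for b in args - E)
print(conflict_free and defeats_rest)  # True: E is a stable extension
```

$B$ and $D_1$ are undefeated and therefore belong to every stable extension, while the mutual defeat between $D_2$ and $D_3$ splits the framework into the two extensions above.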
6 From Deontic Logic to ASPIC⁺
One contribution of this paper is to instantiate ASPIC⁺ in terms of Kratzer-style consistency. We prove that the notion of full layer in Section 4 characterizes the notion of stable extension in Section 5. This is no coincidence. The compositional idea proposed by Kratzer provides a process that captures the intuition behind stability: "all outsiders are defeated." The ordering source, as a function, induces a preference, and a full layer then recursively defines a mechanism that removes all inconsistent and less prioritized results. This is exactly what a stable extension requires. This connection is established by the following representation theorem for stable extensions. We first define such a preference among arguments by the ordering source.
Definition 12 (Argument properties). Let $A, B$ be arguments. Then $A$ is strict if $Rules_d(A) = \emptyset$, and defeasible if $Rules_d(A) \neq \emptyset$; $A$ is firm iff $Prem(A) \subseteq K_s$, and plausible iff $Prem(A) \cap K_d \neq \emptyset$. The preference $\leq$ is defined as follows: $A \leq B$ iff
• $A$ and $B$ are both firm or both plausible, and $A$ is defeasible; or
• $A$ is plausible and $B$ is firm.
The preference thus divides arguments into four groups: firm and strict, firm and defeasible, plausible and strict, and plausible and defeasible.
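Definition 12 can be sketched directly in code (the record and field names are ours): an argument carries the two Boolean properties, and the comparison returns whether $A \leq B$.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arg:
    strict: bool  # Rules_d(A) is empty
    firm: bool    # Prem(A) is contained in K_s; otherwise A is plausible

def leq(a: Arg, b: Arg) -> bool:
    """A <= B per Definition 12: within the same firmness class the
    defeasible argument is below; across classes, plausible is below firm."""
    if a.firm == b.firm:              # both firm or both plausible
        return not a.strict           # A must be defeasible
    return (not a.firm) and b.firm    # A plausible, B firm

# A firm, defeasible argument sits below a firm, strict one ...
print(leq(Arg(strict=False, firm=True), Arg(strict=True, firm=True)))   # True
# ... and any plausible argument sits below any firm one.
print(leq(Arg(strict=True, firm=False), Arg(strict=False, firm=True)))  # True
```

This reproduces the four-group stratification of the running example: firm-and-strict arguments at the top, plausible-and-defeasible at the bottom.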
Proposition 4. Let $K = K_s \cup K_d$ be a knowledge base and $\mathcal{A}$ be the set of arguments on the basis of $K$. Let $\mathcal{F} \in L(K_s, K_d)$ be as defined in Definition 7, and define $E = \{D \in \mathcal{A} \mid F(D) \subseteq \mathcal{F}\}$. Then $E$ is a stable extension.
Definition 13 (Defeats). We define $D$ as the set of pairs of arguments such that argument $A$ defeats argument $B$ iff:
Proof. First, we prove that $E$ is conflict-free. Suppose otherwise: there are $A, B \in E$ such that $(A, B) \in D$. This implies that $F(A) \cup F(B) \vdash \bot$, where $\vdash$ is either $\vdash_{D^-}$ or $\vdash_{D^+}$. Because $\mathcal{F} \in L(K_s, K_d)$, we may suppose there is some $\Sigma_0 \in MCS_{D^-}(K_s)$ such that $\mathcal{F} = \mathcal{F}(\Sigma_0) = \bigcup_{0 \leq i \leq 3} L(\Sigma_i)$ is a full layer from the pair $(K_s, K_d)$ starting at $\Sigma_0$, such that each layer $L(\Sigma_i)$ satisfies Definition 7. Because each $L(\Sigma_i)$ ($0 \leq i \leq 3$) is constructed from a consistent set, $A$ and $B$ must be elements of different layers $L(\Sigma_i)$, $i \in \{0, 1, 2, 3\}$. Assume that $F(A) \subseteq L(\Sigma_1)$ and
• either $Conc(A) = \neg\varphi$ for some $B' \in Sub(B)$ with $TopRule(B') \in R_d$ and $Conc(B') = \varphi$, and $A \not< B'$;
• or $Conc(A) = \neg\varphi$ for some defeasible knowledge $\varphi \in Prem(B) \cap K_d$ of $B$, and $A \not< \varphi$.
Definition 14 (Dung extensions). Let $\mathcal{A}$ be a set of arguments, $D$ a set of defeats, and $E \subseteq \mathcal{A}$ a set of arguments. Then
• $E$ is conflict-free iff for all $A, B \in E$ we have $(A, B) \notin D$.
$F(B) \subseteq L(\Sigma_2)$. But this contradicts condition (ii); the other cases are similar. So $E$ is conflict-free.
Now we prove that $E$ is a stable extension. For each $B \notin E$ we need to find an $A \in E$ such that $(A, B) \in D$. Given $B \notin E$, we know that $F(B)$ fails some condition of the $L(\Sigma_i)$. There are four cases to consider.
• $\Gamma \mid\sim^\forall_{\Gamma'} \varphi$ iff every stable extension on the basis of $K$ contains an argument $A$ with $Conc(A) = \varphi$.
Proposition 6. Let $K = K_s \cup K_d$ be a knowledge base and $\mathcal{A}$ be the set of arguments on the basis of $K$, such that $K_s = \Gamma$ and $K_d = \Gamma'$. We have:
• $\Gamma \mid\sim^\exists_{\Gamma'} \varphi$ iff some stable extension on the basis of $K$ contains an argument $A$ with $Conc(A) = \varphi$.
1. If $F(A) \subseteq L(\Sigma_0)$: because $B \notin E$, $F(B)$ cannot be contained in $L(\Sigma_0)$, so $F(A) \cup F(B) \vdash_{D^-} \bot$. Because $TopRule(A) \in R_s$ and $F(A) \subseteq L(\Sigma_0)$, according to the priority defined by the ordering source, $A \geq B$. So there is an $A \in E$ such that $(A, B) \in D$.
2. If $F(A) \subseteq L(\Sigma_1)$: we only consider the case $F(A) \cup F(B) \vdash_{D^+} \bot$; otherwise, as in the previous step, there is an argument in $L(\Sigma_0)$ whose set of formulas is not consistent with $F(B)$.
• If $A < B$, then $B$ is strict and firm. So $F(B)$ is not $D^-$-consistent with the elements of $L(\Sigma_0)$, for otherwise $F(B) \subseteq L(\Sigma_0)$. As in the previous step, there is an $A'$ such that $F(A') \subseteq L(\Sigma_0)$ and $A' \geq B$. Thus there is an $A'$ such that $(A', B) \in D$ and $A' \in E$.
• If $A \geq B$, then likewise there is an $A \in E$ such that $(A, B) \in D$.
3. If $F(A) \subseteq L(\Sigma_2)$: we only consider the case $F(A) \cup F(B) \vdash_{D^-} \bot$; otherwise the proof is similar to the previous two steps.
• If $A < B$, then either $B$ is strict and firm or $B$ is defeasible and firm. It follows that either $B$ is strict and firm and $F(B)$ is not $D^-$-consistent with $L(\Sigma_0)$, or $B$ is defeasible and firm and $F(B)$ is not $D^+$-consistent with $L(\Sigma_1)$; for otherwise $F(B) \subseteq L(\Sigma_0)$ or $F(B) \subseteq L(\Sigma_1)$. Either case implies that there is an $A'$ such that $F(A') \subseteq L(\Sigma_i)$, $F(A') \cup F(B) \vdash_{S_i} \bot$, and $A' \geq B$ ($i \in \{0, 1\}$). So there is an $A' \in E$ such that $(A', B) \in D$.
• If $A \geq B$, then clearly $(A, B) \in D$.
4. If $F(A) \subseteq L(\Sigma_3)$: we only consider the case $F(A) \cup F(B) \vdash_{D^+} \bot$; otherwise the proof is similar to the previous three steps.
• If $A < B$, then $B$ is either strict and firm, defeasible and firm, or strict and plausible. By the strategy used previously, we can always find an $A'$ such that $F(A') \subseteq L(\Sigma_i)$, $F(A') \cup F(B) \vdash_{S_i} \bot$, and $A' \geq B$ ($i \in \{0, 1, 2\}$). So there is an $A' \in E$ such that $(A', B) \in D$.
• If $A \geq B$, then clearly $(A, B) \in D$.
Proposition 7. Let $K = K_s \cup K_d$ be a knowledge base and $\mathcal{A}$ be the set of arguments on the basis of $K$, such that $K_s = \Gamma$ and $K_d = \Gamma'$. We have:
• $\Gamma \mid\sim_{\Gamma'} \varphi$ iff there is an argument $A$ contained in every stable extension on the basis of $K$ such that $Conc(A) = \varphi$.
7 Related Work
A well-known approach for hypothetical reasoning in the
form of maximally consistent subsets has been studied
in nonmonotonic reasoning (Poole 1988; Makinson 1994;
Freund 1998; Makinson and van der Torre 2001). One of
the early proposals was introduced by Poole (1988), in which an assumption-based consequence relation is constructed on the basis of default logic, following certain principles of consistency.
Along the same line, Freund (1998) proposes preferential inferences for hypothetical reasoning, in which the preference is induced by the hypotheses: a proposition is preferred to another iff it is more likely than the other, in the sense that it satisfies more of the presumed premises than the other does. In contrast, the preference in Kratzer's account is induced by the ordering source; in our case, it takes two kinds of premises as well as two categories of inferential rules into consideration. Makinson and van der Torre (2001), in turn, study a variety of constraints on input/output pairs to model the inferential relation between premises and conclusions. These constraints, which decide which conclusions are accepted, can, we conjecture, be represented by the Kratzer ordering source; we leave this to future research.
The study of logic-based instantiations of argumentation frameworks (da Costa Pereira et al. 2017; Beirlaen, Heyninck, and Straßer 2018) is an active area connecting logics to formal argumentation. It usually investigates how to apply the standard method of constructing maximal consistent sets to argumentation systems (Amgoud and Besnard 2013; Arieli, Borg, and Straßer 2018). This basic idea can be traced back to the work of Benferhat et al. and Cayrol (Benferhat, Dubois, and Prade 1995; Cayrol 1995). Benferhat et al. propose the concept of a "level of paraconsistency" to characterize preference in argumentation theory. Cayrol (1995) links the construction of stable extensions to maximally consistent sets in classical logics. Recently, this research area has put emphasis on modal logics (Beirlaen, Heyninck, and Straßer 2018; Liao et al. 2019). Beirlaen et al. (2018) define argumentation systems of conditional obligation in which preference is indexed by the modal language. In contrast, the rule-based argumentation systems developed by Liao et al. (2019) provide total orderings in order to prioritize norms in the semantics. In both accounts, preferences are given but not induced. Dong et al. (2019) instantiate ASPIC⁺ on
Now we can conclude that $E$ is a stable extension.
Given the above representation theorem, we then have the following representation theorems for the Kratzer-style deontic inferences in terms of argumentation.
Proposition 5. Let $K = K_s \cup K_d$ be a knowledge base and $\mathcal{A}$ be the set of arguments on the basis of $K$, such that $K_s = \Gamma$ and $K_d = \Gamma'$. We have:
agents’ abilities. A contrary-to-duty obligation can be understood under different intuitions, for instance factual detachment (Straßer 2014) or deontic detachment (Prakken and Sergot 1996) about norm violations, or a compensational norm linked in a computational way (Governatori and Rotolo 2006). Several formal systems have been proposed to deal with the various variants of contrary-to-duty obligations (Prakken and Sergot 1996; Makinson and van der Torre 2001; Parent and van der Torre 2014; Beirlaen, Heyninck, and Straßer 2018). As discussed by Prakken and Sergot (1996) and recently by Pigozzi and van der Torre (2017), the challenge of representing norm violations as distinct from norm exceptions remains open. Whether there are other linguistic features that distinguish a violation from an exception is a question we leave to future work.
the basis of a modal deontic logic for obligation and strong
permission. They define preferences either by the language
types of premises or by the inference rules but, contrary to
our work, have not considered both at the same time.
Now we turn to the fruitful work on defeasible deontic logic (Nute 1997). The main idea is to define defeasibility either by consistency over a set of formulas combined with a set of inference rules (Goble 2014; Straßer 2014; Governatori and Rotolo 2006), or by a priority that overrides less normal conclusions (Horty 1994; Governatori 2018). For instance, Goble (2014) provides an adaptive logic to handle different kinds of normative conflicts via the notion of abnormality: a formula follows from a set of formulas iff it is satisfied in every reliable and normal model. This inference relation depends heavily on the sets of abnormalities and the inferential rules over them. Straßer (2014) follows Goble's work and investigates the dynamics in adaptive reasoning, while Governatori and Rotolo (2006) propose that the multi-layered consistency for conditional obligations is captured by sequential operators that compute norms and their violations. In contrast, Horty (1994) and Governatori (2018) define the defeasible consequences by priorities over default rules. They both define priorities among default rules rather than over arguments. We can therefore apply Kratzer's method to define another kind of ordering source to model these preferences and their default constructions.
Acknowledgments
The authors thank the two anonymous NMR 2020 reviewers for their useful comments. Huimin Dong is supported by the China Postdoctoral Science Foundation funded project [No. 2018M632494] and the National Science Centre of Poland [No. UMO-2017/26/M/HS1/01092]. Beishui Liao and Huimin Dong are supported by the Convergence Research Project for Brain Research and Artificial Intelligence, Zhejiang University. Leendert van der Torre and Beishui Liao have received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 690974 for the project "MIREL: MIning and REasoning with Legal texts".
8 Conclusion
This paper employs Kratzer's compositional method, the ordering source, to define preferences associated with hypothetical reasoning. By doing so, we can explicitly interpret why a conclusion is drawn: by the linguistic types of the premises or by the inferential rules, defeasible or not. This is established by the four representation theorems in Section 6, which connect defeasible deontic logics to ASPIC⁺. It clarifies the core of constructing preferences and arguments by Kratzer's maximal-consistency method. We believe that this Kratzer method can be used to redefine other nonmonotonic formal tools, and we leave this to future research.
We observe that the ordering source provided in this paper highlights the principle of "Premises first": it always considers the derivations with firm premises first and then categorizes derivations according to the inferential rules they use. In other words, it regards background information as more important than the logical rules. This is reflected in the notion of full layers. In future work, we can explore an alternative ordering source associated with the principle of "Rules first," and examine the logical structure behind it. Furthermore, given full consideration of "Premises first" and "Rules first", it becomes possible to answer the linguistic question in a general way: whether factual detachment or deontic detachment is satisfied or valid (Arregui 2010) depends on what kind of logical structure is used for the hypothetical reasoning.
We can also investigate alternative formats of contrary-to-duty obligation. We have studied an example of contrary-to-duty regarding obligation violation interacting with the
References
Amgoud, L., and Besnard, P. 2013. Logical limits of abstract argumentation frameworks. Journal of Applied Non-Classical Logics 23(3):229–267.
Arieli, O.; Borg, A.; and Straßer, C. 2018. Reasoning with
maximal consistency by argumentative approaches. Journal
of Logic and Computation 28(7):1523–1563.
Arregui, A. 2010. Detaching if-clauses from should. Natural
Language Semantics 18(3):241–293.
Beirlaen, M.; Heyninck, J.; and Straßer, C. 2018. Structured
argumentation with prioritized conditional obligations and
permissions. Journal of Logic and Computation 29(2):187–
214.
Benferhat, S.; Dubois, D.; and Prade, H. 1995. A local approach to reasoning under inconsistency in stratified knowledge bases. In European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty, volume
946, 36–43. Springer.
Blackburn, P.; De Rijke, M.; and Venema, Y. 2002. Modal
Logic, volume 53. Cambridge University Press.
Cayrol, C. 1995. On the relation between argumentation and non-monotonic coherence-based entailment. In International Joint Conference on Artificial Intelligence, volume 95, 1443–1448.
Chellas, B. F. 1980. Modal logic: an introduction. Cambridge University Press.
Chisholm, R. M. 1963. Contrary-to-duty imperatives and
deontic logic. Analysis 24(2):33–36.
da Costa Pereira, C.; Liao, B.; Malerba, A.; Rotolo, A.; Tettamanzi, A. G. B.; van der Torre, L.; and Villata, S. 2017.
Handling norms in multi-agent systems by means of formal
argumentation. IfCoLog Journal of Logics and Their Applications 4(9):3039–3073. Also in Handbook of Normative
Multi-agent Systems.
Dong, H.; Liao, B.; Markovich, R.; and van der Torre, L.
2019. From classical to non-monotonic deontic logic using
ASPIC⁺. In International Workshop on Logic, Rationality
and Interaction, 71–85. Springer.
Freund, M. 1998. Preferential reasoning in the perspective of Poole default logic. Artificial Intelligence 98(1-2):209–235.
Goble, L. 2014. Deontic logic (adapted) for normative conflicts. Logic Journal of the IGPL 22(2):206–235.
Governatori, G., and Rotolo, A. 2006. Logic of violations: A Gentzen system for reasoning with contrary-to-duty obligations. The Australasian Journal of Logic 4.
Governatori, G. 2018. Practical normative reasoning with
defeasible deontic logic. In Reasoning Web International
Summer School, 1–25. Springer.
Horty, J. F. 1994. Moral dilemmas and nonmonotonic logic.
Journal of Philosophical Logic 23(1):35–65.
Horty, J. 2014. Deontic modals: why abandon the classical
semantics? Pacific Philosophical Quarterly 95(4):424–460.
Kratzer, A. 1981. The notional category of modality. Words,
Worlds, and Contexts: New Approaches in Word Semantics
6:38.
Liao, B.; Oren, N.; van der Torre, L.; and Villata, S. 2019.
Prioritized norms in formal argumentation. Journal of Logic
and Computation 29(2):215–240.
Makinson, D., and van der Torre, L. 2001. Constraints for
input/output logics. Journal of Philosophical Logic 30:155–
185.
Makinson, D. 1994. General patterns in nonmonotonic reasoning. In Handbook of logic in artificial intelligence and
logic programming (vol. 3). 35–110.
Modgil, S., and Prakken, H. 2018. Abstract rule-based argumentation. In Baroni, P.; Gabbay, D.; Giacomin, M.; and
van der Torre, L., eds., Handbook of formal argumentation.
College Publication. 287–364.
Nute, D., ed. 1997. Defeasible deontic logic. Kluwer Academic Publishers.
Parent, X., and van der Torre, L. 2014. Sing and dance!
In International Conference on Deontic Logic in Computer
Science, 149–165. Springer.
Parent, X., and van der Torre, L. 2017. Detachment in
normative systems: Examples, inference patterns, properties. IfCoLog Journal of Logics and Their Applications
4(9):2995–3038. Also in Handbook of Normative Multiagent Systems.
Pigozzi, G., and van der Torre, L. 2017. Multiagent deontic logic and its challenges from a normative systems perspective. IfCoLog Journal of Logics and Their Applications
4(9):2929–2993. Also in Handbook of Normative Multiagent Systems.
Poole, D. 1988. A logical framework for default reasoning.
Artificial Intelligence 36(1):27–47.
Prakken, H., and Sergot, M. 1996. Contrary-to-duty obligations. Studia Logica 57(1):91–115.
Straßer, C. 2014. A deontic logic framework allowing for
factual detachment. In Adaptive Logics for Defeasible Reasoning. Springer. 297–333.
van Benthem, J.; Grossi, D.; and Liu, F. 2014. Priority
structures in deontic logic. Theoria 80(2):116–152.
von Wright, G. H. 1951. Deontic logic. Mind 60:1–15.