

KR 2020: 17th International Conference on Principles of Knowledge Representation and Reasoning
NMR 2020: 18th International Workshop on Non-Monotonic Reasoning

Workshop Notes

Maria Vanina Martínez (Universidad de Buenos Aires and CONICET, Argentina)
Ivan Varzinczak (CRIL, Univ. Artois & CNRS, France)

Preface

NMR is the premier forum for results in the area of non-monotonic reasoning. Its aim is to bring together active researchers in this broad field within knowledge representation and reasoning (KR), including belief revision, uncertain reasoning, reasoning about actions, planning, logic programming, preferences, argumentation, causality, and many other related topics, including systems and applications. NMR has a long history: it started in 1984 and has been held every two years since then. The present edition is the 18th workshop in the series, and it aims at fostering connections between the different subareas of non-monotonic reasoning and at providing a forum for emerging topics.

This volume contains the papers accepted for presentation at NMR 2020, the 18th International Workshop on Non-Monotonic Reasoning, held virtually on September 12-14, 2020, and co-located with the 17th International Conference on Principles of Knowledge Representation and Reasoning (KR 2020). There were 26 submissions, each of which was reviewed by two program-committee members. The committee decided to accept all 26 papers. The program also includes two invited talks, by Francesca Toni (Imperial College London) and Andreas Herzig (IRIT CNRS, Toulouse); the latter was part of a joint session with the workshop on Description Logics (DL 2020).

12 September 2020
Buenos Aires and Lens
Maria Vanina Martínez
Ivan Varzinczak

Contents

Counting with Bounded Treewidth: Meta Algorithm and Runtime Guarantees. J.K. Fichte and M. Hecher ... 9
Paraconsistent Logics for Knowledge Representation and Reasoning: advances and perspectives. W. Carnielli and R. Testa ... 19
Towards Interactive Conflict Resolution in ASP Programs. A. Thevapalan and G. Kern-Isberner ... 29
Towards Conditional Inference under Disjunctive Rationality. R. Booth and I. Varzinczak ... 37
Treewidth-Aware Complexity in ASP: Not all Positive Cycles are Equally Hard. J. Fandinno and M. Hecher ... 48
Towards Lightweight Completion Formulas for Lazy Grounding in Answer Set Programming. B. Bogaerts, S. Marynissen, and A. Weinzierl ... 58
Splitting a Logic Program Efficiently. R. Ben-Eliyahu-Zohary ... 67
Interpreting Conditionals in Argumentative Environments. J. Heyninck, G. Kern-Isberner, M. Thimm, and K. Skiba ... 73
Inductive Reasoning with Difference-making Conditionals. M. Sezgin, G. Kern-Isberner, and H. Rott ... 83
Stability in Abstract Argumentation. J.-G. Mailly and J. Rossit ... 93
Weak Admissibility is PSPACE-complete. W. Dvořák, M. Ulbricht, and S. Woltran ... 100
Cautious Monotonicity in Case-Based Reasoning with Abstract Argumentation. G. Paulino-Passos and F. Toni ... 110
A Preference-Based Approach for Representing Defaults in First-Order Logic. J. Delgrande and C. Rantsoudis ... 120
Probabilistic Belief Fusion at Maximum Entropy by First-Order Embedding. M. Wilhelm and G. Kern-Isberner ... 130
Stratified disjunctive logic programs and the infinite-valued semantics. P. Rondogiannis and I. Symeonidou ... 140
Information Revision: The Joint Revision of Belief and Trust. A. Yasser and H. Ismail ... 150
Algebraic Foundations for Non-Monotonic Practical Reasoning. N. Ehab and H. Ismail ... 160
BKLM - An expressive logic for defeasible reasoning. G. Casini, T. Meyer, and G. Paterson-Jones ... 170
Towards Efficient Reasoning with Intensional Concepts. J. Heyninck, R. Gonçalves, M. Knorr, and J. Leite ... 179
Obfuscating Knowledge in Modular Answer Set Programming. R. Gonçalves, T. Janhunen, M. Knorr, J. Leite, and S. Woltran ... 189
A framework for a modular multi-concept lexicographic closure semantics. L. Giordano and D. Theseider Dupré ... 198
An Approximate Model Counter for Answer Set Programming. F. Everardo, M. Hecher, and A. Shukla ... 208
A Survey on Multiple Revision. F. Resina and R. Wassermann ... 217
A Principle-based Approach to Bipolar Argumentation. L. Yu and L. van der Torre ... 227
Discursive Input/Output Logic: Deontic Modals, Norms, and Semantic Unification. A. Farjami ... 236
Kratzer Style Deontic Logics in Formal Argumentation. H. Dong, B. Liao, and L. van der Torre ... 246

Program Committee

Ofer Arieli, Academic College of Tel-Aviv, Israel
Christoph Beierle, FernUniversitaet Hagen, Germany
Alexander Bochman, Holon Institute of Technology, Israel
Richard Booth, Cardiff University, United Kingdom
Arina Britz, Stellenbosch University, South Africa
Giovanni Casini, Université du Luxembourg
James Delgrande, Simon Fraser University, Canada
Juergen Dix, Clausthal University of Technology, Germany
Wolfgang Faber, Alpen-Adria-Universität Klagenfurt, Austria
Jorge Fandinno, Potsdam University, Germany
Bettina Fazzinga, Advanced Analytics on Complex Data - ICAR CNR, Italy
Eduardo Fermé, Universidade da Madeira, Portugal
Martin Gebser, University of Potsdam, Germany
Laura Giordano, Università del Piemonte Orientale, Italy
Lluis Godo Lacasa, IIIA - CSIC, Spain
Andreas Herzig, IRIT-CNRS, France
Aaron Hunter, British Columbia Institute of Technology, Canada
Anthony Hunter, University College London, United Kingdom
Katsumi Inoue, National Institute of Informatics, Japan
Tomi Janhunen, Aalto University, Finland
Souhila Kaci, Université Montpellier 2, France
Antonis Kakas, University of Cyprus
Gabriele Kern-Isberner, Technische Universitaet Dortmund, Germany
Sébastien Konieczny, CRIL-CNRS, France
Thomas Lukasiewicz, University of Oxford, United Kingdom
Marco Maratea, DIBRIS, University of Genova, Italy
Thomas Meyer, University of Cape Town, South Africa
Nir Oren, University of Aberdeen, United Kingdom
Odile Papini, Aix-Marseille Université, France
Xavier Parent, Université du Luxembourg
Ramon Pino Perez, Universidad de Los Andes, Venezuela
Laurent Perrussel, Université de Toulouse, France
Ricardo O. Rodriguez, Universidad de Buenos Aires, Argentina
Ken Satoh, National Institute of Informatics and Sokendai, Japan
Gerardo Simari, Universidad Nacional del Sur and CONICET, Argentina
Guillermo R. Simari, Universidad del Sur in Bahia Blanca, Argentina
Christian Straßer, Ruhr-Universitaet Bochum, Germany
Matthias Thimm, Universität Koblenz-Landau, Germany
Leon van der Torre, Université du Luxembourg
Renata Wassermann, Universidade de São Paulo, Brazil
Emil Weydert, Université du Luxembourg
Stefan Woltran, Vienna University of Technology, Austria

Additional Reviewers

Flavio Everardo, University of Potsdam, Germany
Pedro Cabalar, Corunna University, Spain
Igor Câmara, Universidade de São Paulo, Brazil

Counting with Bounded Treewidth: Meta Algorithm and Runtime Guarantees∗

Johannes K. Fichte (1), Markus Hecher (2)
(1) Faculty of Computer Science, TU Dresden, 01062 Dresden, Germany
(2) Institute of Logic and Computation, TU Wien, Favoritenstraße 9-11, 1040 Wien, Austria
johannes.fichte@tu-dresden.de, hecher@dbai.tuwien.ac.at

Abstract

The computational complexity of counting has been studied since the late 70s (Durand, Hermann, and Kolaitis 2005; Hemaspaandra and Vollmer 1995; Valiant 1979). Unsurprisingly, counting is at least as hard as solving the corresponding decision problem: one can trivially solve the decision problem by counting and checking whether the count differs from zero (Hemaspaandra and Vollmer 1995). While it often suffices to count the number of solutions, many applications employ combinatorial solvers in practice by encoding the application into ASP, SAT, QBF, or ILP (Gaggl et al. 2015; Dueñas-Osorio et al. 2017). There, the encodings often need auxiliary constructions (variables) that are not necessarily in a functional dependency with the variables of interest. If we are interested in the solutions with respect to certain variables, the standard concept is projection, which is extensively used in the area of databases (Abiteboul, Hull, and Vianu 1995) as well as in declarative problem specifications (Dueñas-Osorio et al. 2017; Gebser, Kaufmann, and Schaub 2009).
Projected Solution Counting (PSC) then asks for the number of solutions after restricting each solution to the parts of interest (the projection set). In other words, multiple solutions that are identical with respect to the projection set count as a single projected solution. Recently, there has been growing interest in PSC, as witnessed by a variety of results in areas such as logic (Aziz 2015; Aziz et al. 2015; Capelli and Mengel 2019; Fichte et al. 2018; Lagniez and Marquis 2019; Sharma et al. 2019), reliability estimation (Dueñas-Osorio et al. 2017), answer set programming (Fichte and Hecher 2019), and argumentation (Fichte, Hecher, and Meier 2019). Interestingly, projected solution counting is often harder than the corresponding counting problem, in contrast to decision problems, where projecting the solutions to a projection set obviously does not change the complexity of the decision problem. To deal with the high computational complexity and to design solving algorithms that exploit structure in the input instance, ideas from parameterized algorithmics have proved valuable (Cygan et al. 2015). In particular, treewidth (Bodlaender and Kloks 1996) was successfully applied to solution counting for a range of problems (Curticapean 2018; Fichte et al. 2017; Fioretto et al. 2018; Kangas, Koivisto, and Salonen 2019; Pichler, Rümmele, and Woltran 2010; Samer and Szeider 2010). Some recent results also address projected solution counting when parameterized by treewidth (Capelli and Mengel 2019; Fichte et al. 2018; In this paper, we present a meta result to construct algorithms for solution counting in various formalisms in knowledge representation and reasoning (KRR). Our meta algorithm employs small treewidth of the input instance, which yields polynomial-time solvability in the input size for instances of bounded treewidth when considering various decision problems in graph theory, reasoning, and logic.
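The definition of PSC can be made concrete with a naive brute-force sketch for propositional CNFs. This is purely illustrative (it is not the paper's algorithm, and all names are ours): models that agree on the projection set are collapsed into one projected solution.

```python
from itertools import product

def projected_count(clauses, variables, projection):
    """Naively count projected models of a CNF.

    clauses: list of sets of literals; a literal is a (variable, polarity) pair.
    Distinct models that agree on `projection` count only once.
    """
    projected = set()
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(any(assignment[v] == pol for (v, pol) in c) for c in clauses):
            # restrict the model to the projection set before collecting it
            projected.add(frozenset(v for v in projection if assignment[v]))
    return len(projected)

# (a or b) has 3 models, but only 2 restrictions to {a}: a true, a false
cnf = [{("a", True), ("b", True)}]
print(projected_count(cnf, ["a", "b"], ["a"]))  # -> 2
```

With the unrestricted projection set ["a", "b"] the same call returns 3, the plain model count, matching the remark later in the paper that an unrestricted projection set reduces PSC to simple counting.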
For many results, there are explicit dynamic programming algorithms or results based on the well-known theorem by Courcelle that allow one to decide a problem in time linear in the input size times some function of the treewidth. We follow this line of research but consider a much more elaborate question: counting and projected solution counting (PSC). PSC is a natural generalization of counting all solutions in which multiple indistinguishable solutions are considered as one single solution. Our meta result allows us to extend already existing dynamic programming (DP) algorithms, introducing only a single-exponential blowup in the runtime on top of the existing DP algorithm. The technique is widely applicable to problems in KRR. As an example, we present an application to projected solution counting on QBFs, which often also serves as a canonical counting problem for the polynomial hierarchy. Finally, we present a list of problems on which our result is applicable and where the single-exponential blowup caused by the approach cannot be avoided under the ETH (exponential time hypothesis). This completes the picture of recently obtained results in argumentation, answer set programming, and epistemic logic programming.

Introduction

Counting solutions is a well-known task in mathematics, computer science, and other areas (Domshlak and Hoffmann 2007; Gomes, Sabharwal, and Selman 2009; Sang, Beame, and Kautz 2005). For instance, in mathematical combinatorics one characterizes the number of solutions to combinatorial problems by means of mathematical expressions, e.g., generating functions (Doubilet, Rota, and Stanley 1972). Other examples are applications in machine learning and probabilistic inference (Chavira and Darwiche 2008).

∗ This work has been supported by the Austrian Science Fund (FWF), Grants P32830 and Y698, and the Vienna Science and Technology Fund, Grant WWTF ICT19-065. Markus Hecher is also affiliated with the University of Potsdam, Germany.
Fichte, Hecher, and Meier 2019). A definability-based approach for problems that can be encoded into monadic second-order logic has also been considered, e.g., (Arnborg, Lagergren, and Seese 1991). Still, a generic approach to facilitate the development of algorithms for counting problems of bounded treewidth is missing. We address this research question and present a meta algorithm for solving PSC by utilizing small treewidth of the Gaifman graph (Gaifman 1982). It works for various graph problems and for problems in logic and reasoning, including problems located higher on the polynomial hierarchy such as QBFs. Our meta algorithm allows for extending existing dynamic programming (DP) algorithms by causing only a single-exponential blowup in the treewidth on top of the existing DP. In fact, if we consider all solutions as distinguishable by taking an unrestricted projection set, the considered projected counting question simplifies to simple counting. Hence, our results immediately apply to simple counting.

Preliminaries

Basics and Computational Complexity. We assume familiarity with standard notions in computational complexity and use counting complexity classes as defined by Durand, Hermann, and Kolaitis (2005). For parameterized complexity, we refer to standard texts (Cygan et al. 2015). Let n ∈ N be a natural number (including zero); then [n] := {1, . . . , n}. Further, we define tower : N × N → N by tower(1, n) := 2^n and tower(ℓ + 1, n) := 2^tower(ℓ,n) for all ℓ ∈ N. Given a family of finite sets X1, X2, . . ., Xn, the generalized combinatorial inclusion-exclusion principle (Graham, Grötschel, and Lovász 1995) states that the number of elements in the union is |X1 ∪ · · · ∪ Xn| = Σ_{I ⊆ {1,...,n}, I ≠ ∅} (−1)^(|I|−1) · |⋂_{i∈I} Xi|. For a set X, let 2^X be the power set of X, consisting of all subsets Y with ∅ ⊆ Y ⊆ X. Let ~s be a sequence of elements of X.
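Both preliminary definitions, the tower function and the inclusion-exclusion identity, can be sanity-checked directly in code; this is an illustrative sketch (function names are ours, not the paper's):

```python
from itertools import combinations

def tower(level, n):
    """tower(1, n) = 2^n and tower(level + 1, n) = 2^tower(level, n)."""
    result = n
    for _ in range(level):
        result = 2 ** result
    return result

def union_size_by_inclusion_exclusion(sets):
    """|X_1 ∪ ... ∪ X_n| as the sum over non-empty I ⊆ [n]
    of (-1)^(|I| - 1) * |intersection of the X_i with i in I|."""
    total = 0
    for r in range(1, len(sets) + 1):
        for subset in combinations(sets, r):
            common = set.intersection(*map(set, subset))
            total += (-1) ** (r - 1) * len(common)
    return total

xs = [{1, 2, 3}, {2, 3, 4}, {4, 5}]
print(tower(2, 3))                            # -> 256, i.e., 2^(2^3)
print(union_size_by_inclusion_exclusion(xs))  # -> 5, i.e., |{1, 2, 3, 4, 5}|
```

The second print agrees with computing the union directly, which is exactly what the identity guarantees.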
When we address the i-th element of the sequence ~s for a given positive integer i, we write ~s(i). Similarly, for a set U of sequences, we let U(i) := {~s(i) | ~s ∈ U}.

Contributions. We give the following contributions.
1. We establish a novel meta approach to solve PSC for various problems. We simply assume that the input is given in terms of a finite structure, for which a dynamic programming (DP) algorithm for computing the solutions to the considered problem exists, and build a generic algorithm on top of the DP that solves PSC.

Quantified Boolean Formulas (QBFs). We assume familiarity with notations and problems for quantified Boolean formulas (QBFs), their evaluation, and satisfiability (Biere et al. 2009). Literals are variables or their negations. For a Boolean formula F, we denote by var(F) the set of variables of F. A term is a conjunction of literals and a clause is a disjunction of literals. F is in conjunctive normal form (CNF) if F is a conjunction of clauses. We identify F with its set of sets of literals. From now on, assume that a Boolean formula F is in CNF and each set in F has at most three literals. Let ℓ ≥ 0 be an integer. A quantified Boolean formula Q (Biere et al. 2009) of quantifier depth ℓ is of the form Q1V1.Q2V2. · · · QℓVℓ.F, where quantifier Qi ∈ {∀, ∃} for 1 ≤ i ≤ ℓ and Qj ≠ Qj+1 for 1 ≤ j ≤ ℓ − 1. Further, the sets Vi are disjoint, non-empty sets of Boolean variables, and F is a Boolean formula such that ⋃_{i=1}^{ℓ} Vi = var(F). We let mat(Q) := F be the matrix of Q. An assignment is a mapping ι : X → {0, 1} defined for a set X of variables. Sometimes we compactly denote assignments by {x | x ∈ X, ι(x) = 1}, i.e., the set of variables that are set to true. Given a QBF Q and an assignment ι, Q[ι] is the QBF obtained from Q where every occurrence of any x ∈ X in mat(Q) is replaced by ι(x), and variables that do not occur in the result are removed from the preceding quantifiers accordingly.
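The operation Q[ι] can be sketched in a few lines. The encoding below (clauses as sets of variable/polarity pairs, prefix as a list of quantifier blocks) is our own illustrative choice, not the paper's: satisfied clauses are dropped, falsified literals are removed, and assigned variables disappear from the prefix.

```python
def apply_assignment(prefix, cnf, iota):
    """Compute Q[iota]: substitute, simplify the matrix, prune the prefix."""
    new_cnf = []
    for clause in cnf:
        if any(v in iota and iota[v] == p for v, p in clause):
            continue  # clause already satisfied under iota, drop it
        new_cnf.append({(v, p) for v, p in clause if v not in iota})
    remaining = {v for c in new_cnf for v, _ in c}
    # variables not occurring in the result are removed from the quantifiers
    new_prefix = [(q, [v for v in vs if v in remaining]) for q, vs in prefix]
    return [(q, vs) for q, vs in new_prefix if vs], new_cnf

# Example 1's formula in this encoding, with iota := {c, e} (d false):
cnf = [{("a", False), ("b", True)}, {("c", True), ("e", False)},
       {("b", False), ("d", False), ("e", True)},
       {("b", True), ("d", True), ("e", True)}]
prefix = [("E", ["c", "d", "e"]), ("A", ["a"]), ("E", ["b"])]
new_prefix, new_cnf = apply_assignment(prefix, cnf, {"c": True, "d": False, "e": True})
# the matrix reduces to the single clause (not a or b) under the prefix forall a, exists b
```

Under this assignment only c1 = ¬a ∨ b survives, so Q[ι] is ∀a.∃b.(¬a ∨ b), which indeed evaluates to true, as Example 1 below states.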
QBF Q evaluates to true (or is valid) if ℓ = 0 and the Boolean formula mat(Q) evaluates to true, denoted |= mat(Q). Otherwise (ℓ ≠ 0), we distinguish according to Q1. If Q1 = ∃, then Q evaluates to true if and only if there exists an assignment ι : V1 → {0, 1} such that Q[ι] evaluates to true. If Q1 = ∀, then Q evaluates to true if for every assignment ι : V1 → {0, 1}, Q[ι] evaluates to true. Deciding validity of a given QBF is PSPACE-complete and (believed to be) harder than SAT (Stockmeyer and Meyer 1973).
2. Since not every DP algorithm can be used to also solve PSC, we provide sufficient conditions under which a DP algorithm can be used in our framework for PSC.
3. For various PSC problems, we list complexity upper bounds that can be obtained from our framework, which complete the recently established lower bounds (Fichte, Hecher, and Pfandler 2020) for treewidth when assuming the ETH (exponential time hypothesis).
As a running example, we illustrate the applicability of our framework on PSC for quantified Boolean formulas (QBFs), which spans the canonical counting problems #ΣℓQSAT and #ΠℓQSAT (Durand, Hermann, and Kolaitis 2005) on the polynomial counting hierarchy.

Related Work. Gebser, Kaufmann, and Schaub (2009) considered projected solution enumeration for conflict-driven solvers based on clause learning. Aziz (2015) introduced techniques to modify modern solvers for logic programming in order to count projected solutions. Recently, Fichte et al. (2018) gave DP algorithms for PSC in SAT and showed lower bounds under the ETH. This algorithm was then extended to related formalisms (Fichte and Hecher 2019; Fichte, Hecher, and Meier 2019). Our algorithm also traverses a tree decomposition multiple times and runs in linear time, while being single-exponential in the maximum number of records computed by the DP algorithm.
However, we generalize the results by (i) providing a general framework to solve PSC, (ii) generalizing the PSC algorithm such that it can take a DP algorithm as input to solve various problems, and (iii) establishing necessary conditions for DP algorithms to be employed in our framework. For implementations of decision and counting problems on QBFs, one could adapt existing DP algorithms (Chen 2004) or use alternative approaches based on knowledge compilation (Charwat and Woltran 2019; Capelli and Mengel 2019).

Example 1. Let F := {c1, c2, c3, c4}, where c1 = ¬a ∨ b, c2 = c ∨ ¬e, c3 = ¬b ∨ ¬d ∨ e, c4 = b ∨ d ∨ e, and Q := ∃c, d, e.∀a.∃b.F. Take the assignment ι : {c, d, e} → {0, 1} with ι := {c, e}. Then, formula Q[ι] evaluates to true, because for any assignment ι′ : {a} → {0, 1} there is ι′′ : {b} → {0, 1} with ι′′ := {b} such that ((F[ι])[ι′])[ι′′] evaluates to true. Similarly, for ζ : {c, d, e} → {0, 1} with ζ := ∅, formula Q[ζ] evaluates to true. In total, there are only four assignments over domain {c, d, e} witnessing validity of Q, namely ι, ζ, and the assignments {c} and {c, d, e}.

structures. This then allows us to use these problems for projected solution counting. Formally, a problem (specification) P = ⟨σ, ξ, sol⟩ consists of disjoint vocabularies σ and ξ and a function sol : algs(σ) × algs(ξ) → {0, 1}. We consider a σ-structure I as instance, a ξ-structure S as solution, and sol as the solution checker. The solution checker sol returns 1 if and only if structure S is a solution of instance I. From a problem specification, we define a (meta) problem #PSols(P) for projected solution counting as follows. We define the counting vocabulary ξsc, consisting of only one symbol ṡc of arity 1 that we use for the solution count, i.e., ξsc = {ṡc}. Then, formally, we let #PSols(P) := ⟨σ ∪ ξ, ξsc, psols⟩. Instances are (σ ∪ ξ)-structures, solutions are ξsc-structures, and psols is the solution checker.
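Example 1 can be verified mechanically. The brute-force evaluator below is our own illustrative encoding (clauses as sets of variable/polarity pairs, prefix as quantifier blocks), not the paper's DP algorithm; it recovers the four witnessing assignments over {c, d, e} and, anticipating Example 3, their projection to {d, e}.

```python
from itertools import product

# Example 1's matrix: c1 = not-a or b, c2 = c or not-e,
# c3 = not-b or not-d or e, c4 = b or d or e
CNF = [{("a", False), ("b", True)},
       {("c", True), ("e", False)},
       {("b", False), ("d", False), ("e", True)},
       {("b", True), ("d", True), ("e", True)}]

def holds(cnf, assignment):
    return all(any(assignment[v] == p for v, p in c) for c in cnf)

def valid(prefix, cnf, assignment):
    """Recursively evaluate a QBF given as (quantifier, variables) blocks."""
    if not prefix:
        return holds(cnf, assignment)
    (q, vs), rest = prefix[0], prefix[1:]
    results = (valid(rest, cnf, {**assignment, **dict(zip(vs, bits))})
               for bits in product([False, True], repeat=len(vs)))
    return any(results) if q == "E" else all(results)

prefix = [("E", ["c", "d", "e"]), ("A", ["a"]), ("E", ["b"])]
witnesses = [dict(zip(["c", "d", "e"], bits))
             for bits in product([False, True], repeat=3)
             if valid(prefix[1:], CNF, dict(zip(["c", "d", "e"], bits)))]
print(len(witnesses))                              # -> 4 witnessing assignments
print(len({(w["d"], w["e"]) for w in witnesses}))  # -> 3 after projecting to {d, e}
```

The four witnesses found are exactly ∅, {c}, {c, e}, and {c, d, e}, and their projections to {d, e} collapse to the three projected solutions reported in Example 3.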
Since projected solution counting requires specifying a projection over solutions (ξ-structures) to P, we also give a projection Iξ as input, defined by P := Iξ. Then, the number s of projected solutions is obtained by projecting each solution of input instance Iσ to projection P, i.e., s = |{S′ ⊓ P | S′ ∈ algs(ξ), sol(Iσ, S′) = 1}|. Now we can simply define the solution checker as psols(I, S) := 1 if and only if S is the ξsc-structure containing the relation sc consisting of just s, i.e., S = ⟨{s}, ({s})⟩. For our running example with QBFs, we can now instantiate the definitions from above to specify the problem QSAT := ⟨σQBF, ξT, sol⟩. Recall from above that instances are σQBF-structures Q and solutions are ξT-structures S. Naturally, from the definition of QSAT we set sol as follows: sol(I, S) := 1 if and only if the QBF Q corresponding to instance I = Q evaluates to true under the assignment corresponding to S. If we restrict our σQBF-structures Q such that the corresponding QBFs of the instances are of quantifier depth ℓ, where the first quantifier starts with ∃, we call the resulting problem ΣℓQSAT. Naturally, we define #ΣℓQSAT := #PSols(ΣℓQSAT).

Finite Structures and Projected Solution Counting. A vocabulary σ is a set of relation symbols, where each such symbol Ṙ ∈ σ is of arity ar(Ṙ) ≥ 0. Let D be a finite set of elements, referred to as the domain. By a relation, we mean a set R ⊆ D^ar(Ṙ). A finite σ-structure I = ⟨D, (R)Ṙ∈σ⟩ consists of a domain D and a relation for every symbol Ṙ in σ. We denote by algs(σ) the set of finite σ-structures and refer by ‖σ‖ to the size of σ. Consider finite σ-structures I = ⟨D, R⟩ and I′ = ⟨D′, R′⟩ and a vocabulary ξ ⊆ σ. In order to access the relation R for symbol Ṙ ∈ σ, we let RṘ := R. Then, the structure Iξ restricted to ξ is the structure that consists of the relation symbols from I that occur in ξ, i.e., Iξ := ⟨D, (R)Ṙ∈ξ⟩.
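The restriction Iξ and the intersection ⊓ used above for projecting solutions can be prototyped directly; the class below is a minimal sketch with names of our choosing, not part of the paper:

```python
class FiniteStructure:
    """A finite structure: a domain plus one relation per vocabulary symbol."""

    def __init__(self, domain, relations):
        self.domain = set(domain)          # D
        self.relations = dict(relations)   # symbol -> set of tuples

    def restrict(self, vocabulary):
        """I restricted to xi: keep only relations whose symbol is in xi."""
        return FiniteStructure(
            self.domain,
            {s: r for s, r in self.relations.items() if s in vocabulary})

    def intersect(self, other):
        """Componentwise intersection of domains and matching relations."""
        return FiniteStructure(
            self.domain & other.domain,
            {s: self.relations[s] & other.relations[s]
             for s in self.relations if s in other.relations})

# The solution structure for assignment {c, e} intersected with a
# projection structure over {d, e}, mirroring Example 3 below:
solution = FiniteStructure({"c", "e"}, {"T": {("c",), ("e",)}})
projection = FiniteStructure({"a", "b", "c", "d", "e"}, {"T": {("d",), ("e",)}})
print(solution.intersect(projection).relations["T"])  # -> {('e',)}
```

The intersection keeps only ("e",), i.e., the model {c, e} projected to {d, e} is {e}, which is how identical projected solutions get identified in the count s above.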
Further, we define the intersection I ⊓ I′ of both σ-structures as the structure consisting of an intersection over each relation, i.e., I ⊓ I′ := ⟨D ∩ D′, (RṘ ∩ R′Ṙ)Ṙ∈σ⟩. Assume some integer ℓ ≥ 1. For QBFs, we define the primal vocabulary by σQBF := {DEPTH, FORALL, EXISTS, NEG, POS, INCLAUSE}, containing binary relation symbols only. Given a QBF Q = Q1V1.Q2V2. · · · QℓVℓ.F, the σQBF-structure Q of Q is given by Q := ⟨var(F), (R)Ṙ∈σQBF⟩, where DEPTH = {ℓ}, FORALL = {(v, i) | Qi = ∀, v ∈ Vi}, EXISTS = {(v, i) | Qi = ∃, v ∈ Vi}, NEG = {(v, c) | c ∈ F, ¬v ∈ c}, POS = {(v, c) | c ∈ F, v ∈ c}, and INCLAUSE = {(u, v) | c ∈ F, {u, v} ⊆ var(c)}. Note that in the definition above we place constant symbols for integers 0 ≤ i ≤ ℓ, which we need below when decomposing the input. Instead of the constants that occur in tuples, we could also use multiple relation symbols of the form DEPTHℓ, FORALLi, and EXISTSi. Since this results in a vacuous overhead, we treat them as given above. We say that Q is the corresponding QBF of Q, and we sometimes use Q instead of Q for brevity. Let ι be an assignment for var(F). Then, we define the solution vocabulary ξT := {Ṫ} of arity 1, and the ξT-structure is given by ⟨ι, (ι)⟩.

Example 3. Consider QBF Q and σQBF-structure Q = ⟨D, (R)Ṙ∈σQBF⟩ from Example 1. The problem #ΣℓQSAT = ⟨σQBF ∪ ξT, ξsc, psols⟩ additionally assumes a projection as part of the instances. This projection is given as part of the (σQBF ∪ ξT)-structure. Hence, considering projection T = {d, e}, our instance of #ΣℓQSAT is given by I = ⟨D, (R)Ṙ∈(σQBF∪ξT)⟩. Consequently, projection P = IξT = ⟨D, ({d, e})⟩. Recall the four assignments ∅, {c}, {c, e}, and {c, d, e} from Example 1 under which Q evaluates to true. When we project these assignments to {d, e}, we are left with only three assignments ∅, {e}, and {d, e}. As a result, ⟨{3}, ({3})⟩ is the only solution to instance I of problem #ΣℓQSAT.

Example 2.
Consider QBF Q from Example 1. Then, we construct the σQBF-structure from Q as Q = ⟨mat(Q) ∪ var(mat(Q)), (R)Ṙ∈σQBF⟩, where DEPTH = {3}, EXISTS = {(c, 1), (d, 1), (e, 1), (b, 3)}, FORALL = {(a, 2)}, NEG = {(a, c1), (e, c2), (b, c3), (d, c3)}, and POS = {(b, c1), (c, c2), (e, c3), (b, c4), (d, c4), (e, c4)}. Observe that Q is the corresponding QBF of Q. Further, assignment ι of Example 1 is represented using the ξT-structure ⟨{c, e}, ({c, e})⟩.

Proposition 1 (Hemaspaandra and Vollmer, 1995). The problem #ΣℓQSAT is #·ΣℓP-complete.

Tree Decompositions (TDs) of Finite Structures. For a tree T and a node t of T, we let children(t) be the sequence of all child nodes of t in arbitrary but fixed order. Let I = ⟨D, R⟩ be a σ-structure. A tree decomposition (TD) of I is a pair T = (T, χ) where T = (N, A) is a tree rooted at root(T) and χ is a mapping that assigns to each node t ∈ N

Similar to algorithms and specifications that use logic for verification (Gurevich 1995), as well as in descriptive complexity, we define problems in a very general way using finite
We use nice TDs, where for every node t ∈ N , type(t) ∈ {leaf, join, int, rem} and bags of the root node and leaf nodes are empty, which can be obtained in linear time without increasing the width (Kloks 1994). Listing 1: Algorithm DPA (I, T ): Dynamic programming on TTD T , cf., (Fichte et al. 2017). In: Problem instance I, TTD T = (T, χ, ι) of I such that n is the root of T and children(t) = ht1 , . . . , tℓ i. Out: A-TTD (T, χ, o) with A-table mapping o. 1 o ← empty mapping 2 for iterate t in post-order(T,n) do 3 o(t) ← At (It , ι(t), ho(t1 ), . . . , o(tℓ )i) 4 return (T, χ, o) TD (T, χ) multiple times, where ι is a table mapping from an earlier traversal. Therefore, ι might be empty at the beginning of the first traversal. 2. Run algorithm DPA (see Listing 1). It takes a TTD T = (T, χ, ι) and traverses T in post-order. At each node t of T it computes a new A-table o(t) by executing the algorithm A. The algorithm A has a “local view” on the computation and can access only t, atoms in the bag χ(t), bag-structure It , and child A-table o(t′ ) for child nodes t′ . 3. Output the A-tabled tree decomposition (T, χ, o). Dynamic Programming on TDs of Finite Structures 4. Print the result by interpreting o(n) for root n = root(T ). Usually when giving a dynamic programming algorithm, one only describes algorithm A. Hence, we focus on this algorithm in the following and call A table algorithm. Algorithms that utilize treewidth to solve problems typically proceed by dynamic programming along the TD (in postorder) where at each node of the tree information is gathered (Bodlaender and Kloks 1996) in a table by a table algorithm A. More generally, a table is a set of records, where a record ~u is a sequence of fixed length. The actual length, content, and meaning of the records depend on the algorithm A. 
Since we later traverse the tree decomposition repeatedly running different algorithms, we explicitly state A-record if records of this type are syntactically used for algorithm A and similar A-table for tables. In order to access tables computed at certain nodes after a traversal as well as to provide better readability, we attribute tree decompositions with an additional mapping to store tables. Formally, a tabled tree decomposition (TTD) of graph G is a pair T = (T, χ, τ ) where (T, χ) is a tree decomposition of G and τ is a mapping which maps nodes t of T to tables. When a TTD has been computed using algorithm A, after traversing, we call the decomposition the A-TTD of the input instance. Let T = (T, χ, ·) be a TTD of a σ-structure I = hD, Ri for a problem P = hσ, ξ, sol i and t in T . Then, we define the bagrelations Rt := (R ∩ χ(t)ar(Ṙ) )Ṙ∈σ . The bag-structure It is given by It := hχ(t), Rt i. This allows to define the bagS domain below t by D≤t := t′ in T [t] χ(t′ ), bag-relations bear(Ṙ) )Ṙ∈σ , and bag-structure below t low t by R≤t := (R∩D≤t by I≤t := hD≤t , R≤t i. Table Algorithm for Σℓ QSAT Next, we briefly present table algorithm QALG that allows us to solve problem Σℓ QSAT. To this end, consider a QBF Q = ∃V1 . · · · Qℓ Vℓ .F its σQBF -structure Q and a tabled tree decomposition T = (T, χ, ι) of Q. Then, algorithm DPQALG solves Σℓ QSAT, where algorithm QALG stores in table o(t) (nested) records of the form hI, Ai. The first position of such a record consists of an assignment I restricted to V1 ∩ χ(t). The second position consists of a nested set A of sequences that are of the same form as records in o(t). Intuitively, I is an assignment restricted to variables in V1 . For a nested sequence hI ′ , A′ i in A, assignment I ′ is restricted to variables in V2 ∩ χ(t) and so on. The innermost sequence hI ∗ , ∅i stores assignments restricted to Vℓ ∩ χ(t). 
In other words, the first position ~u(1) of any ~u ∈ o(t) characterizes a bag-relation T for symbol Ṫ ∈ ξT. Before we discuss algorithm QALG in more detail, we introduce some auxiliary notation. In order to evaluate quantifiers, we let checkForall(Q, ⟨τ1⟩) return true if and only if either Q1 = ∃, or for every ~u ∈ τ1 we have that QALGt(Q, ⟨{~u}⟩) outputs something different from ∅, i.e., we need for each record of τ1 a “succeeding record” for the parent node of t. Analogously, we let checkForall(Q, ⟨τ1, τ2⟩) be true if and only if either Q1 = ∃, or for every ~u ∈ τ1 and ~v ∈ τ2 we have QALGt(Q, ⟨{~u}, τ2⟩) ≠ ∅ as well as QALGt(Q, ⟨τ1, {~v}⟩) ≠ ∅. Intuitively, this reports whether records are missing in order to satisfy QBFs having an outermost universal quantifier. Listing 2 presents table algorithm QALG, which works as follows. The algorithm computes the nested records recursively, which are of the same depth as the quantifier

Observation 1. Given a finite structure I = ⟨D, R⟩ over σ and a TD T = (T, χ) of I. Then, for n = root(T), D≤n = D, R≤n = R, and I≤n = I.

Let σ be a vocabulary, P = ⟨σ, ξ, sol⟩ a problem, and A a table algorithm for solving P on instances over σ. Then, dynamic programming (DP) on tree decompositions of finite structures performs the following steps for a σ-structure I:
1. Compute a TTD (T, χ, ι) of I. Later, we traverse a
1 Q1 V1 · · · Qℓ Vℓ .F ← corresponding QBF of Qt 2 if ℓ = 0 then τt ← ∅ 3 else if type(t) = leaf then 4 τt ← {h∅, QALGt (Q2 V2 · · · Qℓ Vℓ .F, hi)i} 5 else if type(t) = int and a ∈ χt is introduced then 6 τt ← {hJ, A′ i | hI, Ai ∈ τ1 , J ∈ {I}∪{Ia+ | a ∈ V1 }, |= mat(Qt [J]), A′ =QALGt (Qt [J], hAi), A′ 6=∅, checkForall(Qt [J], hAi)} 7 else if type(t) = rem and a 6∈ χt is removed then 8 τt ← {hIa− , QALGt (Qt [I], hAi)}i | hI, Ai ∈ τ1 } 9 else if type(t) = join then 10 τt ← {hI, A′ i | hI, A1 i ∈ τ1 , hI, A2 i ∈ τ2 , A′ =QALGt (Qt [I], hA1 , A2 i), A′ 6=∅, checkForall(Qt [I], hA1 , A2 i)} 11 return τt T : ∅ t14 i hI12.i ,A12.i i 1 h∅, {h∅, {{b}}i}i 2 h∅, {h∅, {∅, {b}}i}i τ12 {b} t4 t12 {b} τ11 t11 hI i hI4.i ,A4.i i i t3 11.i ,A11.i i {a, b} {b, d} 1 h∅, {h∅, {∅, {b}}i, h∅, {h∅, {{b}}i}i 1 h∅, {{b}}i}i h∅, {h∅, {∅, {b}}i}i 2 t2 {a} t10 {b, d, e} τ4 h{d}, {h∅, {∅}i}i 3 τ3 h{d}, {h∅, {∅, {b}}i}i 4 ∅ t1 t9 {d, e} i hI3.i ,A3.i i hI10.i , A10.i i i 1 h∅, {h∅, {∅, {b}}i, t8 {e} h∅, {h∅, {{b}}i}i 1 h{a}, {{b}}i}i h{e}, {h∅, {∅, {b}}i}i 2 t7 τ9 τ7 {c, e} h{d}, {h∅, {∅}i}i 3 hI9.i , A9.i i i hI7.i , A7.i i h{d, e},{h∅, {∅, {b}}i}i 4 h∅, {h∅, {∅}i}i 1 h∅, {h∅, {∅}i}i {c} t6 τ10 h{e}, {h∅, {∅}i}i 2 h{c}, {h∅, {∅}i}i hI5.i ,A5.i i i h{d}, {h∅, {∅}i}i 3 h{c, e},{h∅, {∅}i}i τ5 ∅ t5 1 h∅, {h∅, {∅}i}i h{d, e},{h∅, {∅}i}i 4 i hI13.i ,A13.i i τ13 1 h∅, {h∅, {{b}}i}i 2 h∅, {h∅, {∅, {b}}i}i {b} t13 Figure 1: Selected tables of τ obtained by DPQALG on TTD T . ing of the empty assignment for each depth d > 0, and a set of assignments that contain an empty set, recursively constructed with decreasing depth in Line 4. Node t2 is of type int and introduces variable a. Line 6 makes sure that results in τ2 contains only record h∅, {h∅, {∅}i, h{a}, {∅}i}i, thereby guessing on the assignment of a for the second (universal) quantifier of Q. Node t3 is of type int and introduces b. 
Then, bag-relations Rt3 at node t3 contain IN C LAUSE = {(a, b)}, POS = {(b, c1 )}, NEG = {(a, c1 )}, i.e., we need to ensure that clause c1 = ¬a ∨ b is satisfied in t3 . This is done in Line 6, as well as making sure that we keep all assignments of the universally quantified variable a. Node t4 is of type rem. Here, we restrict the records (see Line 8) such that they contain only variables occurring in bag χ(t4 ) = {b}. Basic conditions of a TD ensure that once a variable is removed, it does not occur in any bag at an ancestor node, i.e., we have encountered all clauses for a. Nodes t5 , t6 , t7 , and t8 are symmetric to nodes t1 , t2 , t3 , and t4 . We proceed similarly for nodes t9 –t12 . At node t13 we join tables τ4 and τ12 according to Line 10, where we only match agreeing assignments such that no assignment involving the universally quantified variable a is lost. At the root node t14 , it is then ensured that only those records remain that lead to witnessing assignments. Since τ14 is not empty, formula Q is valid. We can reconstruct witnessing assignment {c, e} by combining parts I of the yellow highlighted records, as shown in Figure 1.

Se+ := S ∪ {e}, Se− := S \ {e}, and R∼hṘ,N i outputs R where the relation RṘ is replaced by the relation N .

depth ℓ. For leaf nodes, i.e., nodes t with type(t) = leaf, we construct a record of the form h∅, {h∅, {. . .}i}i, which is a nested record of depth ℓ, cf. Line 4. Note the recursive call to QALG and the base (termination) case in Line 2. Intuitively, whenever a variable a is introduced (int), we decide whether we assign a to true and only keep records, cf. Line 6, where all clauses of the matrix of the corresponding QBF of Qt are satisfied. Further, we need to guarantee that the universal quantifiers are still satisfiable, which is ensured by checkForall. When removing (rem) a variable a, we remove a from our records accordingly, cf. Line 8.
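The nesting of records to the quantifier depth mirrors plain recursive QBF evaluation. As a point of reference, here is a naive sketch in Python (our own illustration, not the table-based algorithm; `eval_qbf` and its interface are hypothetical names):

```python
def eval_qbf(prefix, matrix, assignment=None):
    """Naive recursive QBF evaluation.

    prefix: list of (quantifier, variable) pairs, quantifier 'E' (exists) or 'A' (forall).
    matrix: a function from an assignment dict {var: bool} to bool.
    """
    assignment = dict(assignment or {})
    if not prefix:
        return matrix(assignment)
    quantifier, var = prefix[0]
    # Branch on both truth values of the outermost variable, then recurse
    # one quantifier level deeper -- the recursion depth equals the prefix length.
    branches = (eval_qbf(prefix[1:], matrix, {**assignment, var: value})
                for value in (False, True))
    return any(branches) if quantifier == 'E' else all(branches)
```

For instance, ∃a∀b. (a ∨ ¬b) evaluates to true (pick a = true), while ∀a. a does not. The table algorithm replaces this exponential-time recursion over all variables by recursion over nested tables of bag-local records.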
If the node is of type join, we combine two records from two different child tables and ensure the satisfiability of the universal quantifiers by means of checkForall. Intuitively, records are required to agree on the assignments I, and on the ones in A, which is established in Line 10.

Example 4. Recall QBF Q from Example 1. Observe that by the construction of Q, we have (u, v) ∈ IN C LAUSE for every two variables u, v of a given clause. Consequently, it is guaranteed (Kloks 1994) that in any TD of Q, we have for each clause at least one bag containing all its variables. In the light of this observation, Figure 1 depicts a TD T = (T, χ) of Q, assuming that clauses are implicitly contained in those bags which contain all of their variables. The figure illustrates a snippet of the tables of the TTD (T, χ, τ ), which we obtain when running DPQALG on instance Q and TTD T according to Listing 2. Note that for ease of presentation, we write X instead of hX, ∅i. Further, for brevity we write τj instead of τ (tj ) and identify records by their node and identifier i in the figure. For example, record ~u9.2 = hI9.2 , A9.2 i ∈ τ9 refers to the second record of table τ9 for node t9 ; similarly we write for table τ3 , e.g., A3.1.2.1 to address {a}. In the following, we briefly discuss selected records of the tables in τ . Node t1 is of type leaf. Therefore, table τ1 equals table τ5 , which both have only one record, consist-

Lemma 1 (Chen, 2004). Given a QBF Q of quantifier depth ℓ and a TTD T = (T, χ, ·) of Q of width k with g nodes. Then, algorithm DPQALG runs in time O(tower(ℓ, k + 5) · g).

A recent result establishes that one cannot significantly improve the running time of the algorithm above, assuming the exponential time hypothesis (ETH) holds. ETH (Impagliazzo, Paturi, and Zane 2001) states that there is some real s > 0 such that satisfiability of a given 3-CNF formula F cannot be decided in time 2^(s·|F |) · kF k^O(1) .

Proposition 2 (Fichte, Hecher, and Pfandler, 2020). Un-
[Figure 2 shows the flow of PSCA : (1) create a TTD T of Iσ ; (2.I) DPA for P: visit the nodes t of T in post-order, apply A on (Iσ )t and store the results in table τt , then purge non-solutions from τ , yielding ν; (2.II) DPAPRJ for #PS OLS(P): visit the nodes t of T in post-order and apply APRJ to νt , storing the results in table πt ; (3) output the projected count.]

Listing 3: Table algorithm APRJ(νt , It , hπ1 , . . .i) for projected solution counting, cf. (Fichte et al. 2018). In: Purged table mapping νt , bag-projection Pt = (It )ξ , sequence hπ1 , . . .i of APRJ-tables of children of t. Out: APRJ-table πt of records hρ, ci, where ρ ⊆ νt , c ∈ N.
1 πt ← {hρ, ipsc(t, ρ, hπ1 , . . .i)i | ρ ∈ sub-bucketsPt (νt )}
2 return πt

pre-order and remove all records from the tables that cannot be extended to a solution (“Purge non-solution records”). Intuitively, these records are those that are not involved in τ (n) when recursively following A-origins(n, τ (n)) from the root n back to the leaf nodes of T . In other words, we keep a record ~u of table τ (t) only if ~u is involved in τ (n), i.e., if ~u participates in constructing a solution to our problem P. We call the table mapping ν the purged table mapping and let the resulting TTD be Tpurged = (T, χ, ν).

Step 2.II forms the main part of PSCA . By DPAPRJ , we traverse Tpurged to count solutions with respect to the projection and obtain Tproj = (T, χ, π). From the table π(n) at the root n of T , we can directly read the projected solution count of I. In the following, we only describe table algorithm APRJ, as the traversal in DPAPRJ is the same as before. Records are of the form hρ, ci ∈ π(t), where ρ ⊆ ν(t) is an A-table and c is a non-negative integer. For a set ρ of records, the intersection projected solution count ipsc is, vaguely speaking, the number of “projected” solutions to Iσ that all records in ρ have in common. That is, the cardinality of the intersection of solutions to I≤t restricted to the given projection P for the set ρ of involved records.
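The intersection count just described admits a tiny reference implementation over explicit solution sets (plain Python over frozensets; the names `project` and `ipsc_naive` are ours, not the paper's):

```python
def project(solutions, projection):
    """Restrict each solution (a set of elements) to the projection set."""
    return {frozenset(s) & frozenset(projection) for s in solutions}

def ipsc_naive(groups, projection):
    """Count the projected solutions that all groups of solutions have in common,
    i.e., the cardinality of the intersection of the projected groups."""
    projected = [project(g, projection) for g in groups]
    return len(set.intersection(*projected))
```

For example, the solutions {a, p} and {b, p} differ, but both project to {p}, so two groups containing one of each intersect in a single projected solution.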
In the end, we are interested in the projected solution count (psc) of ρ, i.e., the cardinality of the union of solutions to I≤t restricted to P for the records ρ. In the remainder, we additionally let ν be the purged table mapping, π be the APRJ-table mapping as used above, and ρ ⊆ ν(t). The relation ≡P ⊆ ρ × ρ considers records equivalent with respect to the projection P: ≡P := {(~u, ~v ) | ~u, ~v ∈ ρ, hD, ~u(1) i ⊓ P = hD, ~v(1) i ⊓ P}. Let bucketsP (ρ) be the set of equivalence classes induced by ≡P on the set ρ of records, i.e., bucketsP (ρ) := (ρ/≡P ) = {[~u]P | ~u ∈ ρ}, where [~u]P = {~v ∈ ρ | ~v ≡P ~u}, and sub-bucketsP (ρ) := {S | ∅ ( S ⊆ B, B ∈ bucketsP (ρ)}.

Figure 2: Algorithm PSCA consists of DPA and DPAPRJ .

der ETH, problems Σℓ QSAT for given QBF Q of quantifier depth ℓ cannot be solved in time tower(ℓ, o(k)) · 2^o(kQk) , using treewidth k of structure Q.

Projected Solution Counting

In this section, we present a generic dynamic programming algorithm (PSCA ) and table algorithm (APRJ) that allow for solving projected solution counting, namely, problem #PS OLS(P). Therefore, we let P = hσ, ξ, ·i be a problem, which we extend to projected counting, I = hD, ·i a σ ∪ ξ-structure, P := Iξ the considered projection, and A a table algorithm that solves P by dynamic programming. Since we reuse the output of algorithm A (tabled tree decomposition) to solve the actual projected solution counting problem, we let T = (T, χ, τ ) be an A-TTD of instance Iσ for problem P. By convention, we take t as a node of T .

Generic Algorithm PSCA Next, we define a new meta algorithm (PSCA ) for a given table algorithm A. Core ideas that lead to algorithm PSCA are based on an algorithm for Boolean satisfiability (Fichte et al. 2018), which we lift to much more general problems in KRR that can be defined using finite structures.
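The interplay of union counts (psc) and intersection counts (ipsc) boils down to the inclusion-exclusion principle. A small self-contained sketch over explicit groups of projected solutions (our own illustration in Python, not the paper's table data structures):

```python
from itertools import combinations

def psc_naive(groups):
    """|g1 ∪ ... ∪ gn| by inclusion-exclusion: sum over every non-empty
    subset O of the groups of (-1)^(|O|-1) * |intersection of O|."""
    total = 0
    for r in range(1, len(groups) + 1):
        for subset in combinations(groups, r):
            common = set.intersection(*(set(g) for g in subset))
            total += (-1) ** (r - 1) * len(common)
    return total
```

For comparison, `len(set().union(*groups))` yields the same number directly; the dynamic-programming algorithm instead assembles it from counts stored in the child tables, where the union itself is no longer available.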
Before we discuss our approach, we provide a notion to reconstruct solutions from a tabled tree decomposition that has been computed using a table algorithm A. This requires determining, for a given record, its predecessor records in the corresponding child tables. Let therefore children(t) = ht1 , . . . , tℓ i. Given a sequence ~s = hs1 , . . . , sℓ i, we let h{~s}i := h{s1 }, . . . , {sℓ }i. For a given A-record ~u, we define the originating A-records of ~u in node t by A-origins(t, ~u) := {~s | ~s ∈ τ (t1 ) × · · · × τ (tℓ ), ~u ∈ At (It , h{~s}i)}. We extend this to an A-table ρ by A-origins(t, ρ) := ∪~u∈ρ A-origins(t, ~u). These origins allow us to collect records that are extendable to solutions of P and combine them accordingly. Given any descendant node t′ of t, we call every record ~v ∈ ι(t′ ) that appears in some A-records, when recursively following every A-origins(t, ρ) back to the leaf nodes, involved in ρ.

Example 5. Recall QBF Q, TTD (T, χ, τ ), and tables τ10 , τ11 from Example 4 and Figure 1. Consider the records ~u11.1 , ~u11.2 ∈ τ11 . Observe that A-origins(t11 , ~u11.1 ) = {~u10.1 }, which is in contrast to A-origins(t11 , ~u11.2 ) = {~u10.2 , ~u10.4 }.

Example 6. Consider again QBF Q, projection P, TTD (T, χ, τ ), and tables τ10 , τ11 from Example 4 and Figure 1. During purging, records ~u10.3 and ~u11.3 are removed, as highlighted in gray. This results in tables ν10 and ν11 . Then, the set ν10 /≡P of equivalence classes is bucketsP (ν10 ) = {{~u10.1 }, {~u10.2 }, {~u10.4 }}, whereas ν11 /≡P = {{~u11.1 , ~u11.2 }, {~u11.4 }}.

Later, we require access to already computed projected counts in tables of children of a given node t. Therefore, we define the stored ipsc of a table ρ ⊆ ν(t) in table π(t) by s-ipsc(π(t), ρ) := Σhρ,ci∈π(t) c. We extend this to a sequence ~s = hπ(t1 ), . . . , π(tℓ )i of tables of length ℓ and a set O = {hρ1 , . . . , ρℓ i, hρ′1 , . . . , ρ′ℓ i, . .
.} of sequences of ℓ tables by s-ipsc(~s, O) := Πi∈{1,...,ℓ} s-ipsc(~s(i) , O(i) ). In other words, we select the i-th position of the sequence together with the sets of i-th positions from the set of sequences.

Figure 2 illustrates an overview of the steps of PSCA . First, we compute a TD T = (T, χ) of Iσ . Then, we traverse the TD a first time by running DPA (Step 2.I), which outputs a TTD Tcons = (T, χ, τ ). Afterwards, we traverse Tcons in

Example 7. Recall QBF Q, TTD T = (T, χ, τ ), and tables τ1 , . . . , τ14 from Example 4 and Figure 1. Recall that for some nodes t, there are records among different QALG-tables that are removed (highlighted gray in Figure 1) during purging, i.e., not contained in the purged table mapping ν. By purging we avoid correcting stored counters (backtracking) whenever a record has no “succeeding” record in the parent table. Next, recall Example 3 and consider Q, projection P, and the resulting instance I of #Σℓ QSAT. We discuss selected tables obtained by DPAPRJ (I, (T, χ, ν)). Figure 3 depicts selected tables of π1 , . . . , π14 obtained after running DPAPRJ for projected counting. We assume that record i in table πt corresponds to ~vt.i = hρt.i , ct.i i where ρt.i ⊆ ν(t). Since type(t1 ) = leaf, we have π1 = {h{~u1.1 }, 1i}. Intuitively, at t1 the record ~u1.1 belongs to one bucket. Similarly for nodes t2 , t3 , t4 , and t5 . Node t6 introduces c, which results in table π6 = {h{~u6.1 }, 1i, h{~u6.2 }, 1i, h{~u6.1 , ~u6.2 }, 1i}, where ~u6.1 = h∅, A6.1 i and ~u6.2 = h{c}, A6.2 i with ~u6.1 , ~u6.2 ∈ τ6 . Consequently, c6.1 = ipsc(t6 , {~u6.1 }) = psc(t6 , {~u6.1 }) = s-ipsc(hπ5 i, {~u5.1 }) = 1; analogously for ~u6.2 . Further, c6.3 = ipsc(t6 , {~u6.1 , ~u6.2 }) = | psc(t6 , {~u6.1 , ~u6.2 }) − ipsc(t6 , {~u6.1 }) − ipsc(t6 , {~u6.2 })| = |1 − 1 − 1| = 1. Similarly for table π7 as given, but ν7 has two buckets, as well as for π8 , π9 of Figure 3. Next, we discuss how to compute table π11 , given table π10 . For record ~v11.1 we compute the count c11.1 = ipsc(t11 , {~u11.1 }) = psc(t11 , {~u11.1 }) = s-ipsc(hπ10 i, {~u10.1 }) = 1. Analogously for record ~v11.2 , where c11.2 = ipsc(t11 , {~u11.2 }) = 1. In order to obtain ~v11.3 , we compute c11.3 = ipsc(t11 , {~u11.1 , ~u11.2 }) = | psc(t11 , {~u11.1 , ~u11.2 }) − ipsc(t11 , {~u11.1 }) − ipsc(t11 , {~u11.2 })| = |2 − 1 − 1| = 0. We continue for tables π12 and π13 . In the end, the projected solution count of Q is given in the root node t14 and corresponds to s-ipsc(hπ14 i, {~u14.1 }) = c13.1 + c13.2 − c13.3 = 3.

Figure 3: Tables of π obtained by DPAPRJ on TTD T and purged table mapping ν of τ . [The figure shows, for selected nodes tj , the tables πj with records hρj.i , cj.i i; the record contents are omitted here.]

Intuitively, when we are at a node t in algorithm DPAPRJ we have already computed π(t′ ) of Tproj for every node t′ below t. Then, the projected solution count of ρ ⊆ ν(t) is obtained by applying the inclusion-exclusion principle to the stored projected solution counts of origins.

Definition 1. For table ρ and node t, the projected solution count psc is psc(t, ρ, hπ(t1 ), . . .i) := Σ∅(O⊆A-origins(t,ρ) (−1)^(|O|−1) · s-ipsc(hπ(t1 ), . . .i, O).

Vaguely speaking, psc determines the A-origins of table ρ, iterates over all subsets of these origins, and looks up the stored counts (s-ipsc) in the APRJ-tables over the children ti of node t. Finally, we provide a definition to compute ipsc, which can be computed at a node t for a given table ρ ⊆ ν(t) by computing the psc for the children ti of t using stored ipsc values from tables π(ti ), subtracting and adding ipsc values for subsets ∅ ( ϕ ( ρ accordingly.

Definition 2. For table ρ and node t, the intersection projected solution count is ipsc(t, ρ, s) := 1 if type(t) = leaf, and ipsc(t, ρ, s) := psc(t, ρ, s) + Σ∅(ϕ(ρ (−1)^|ϕ| · ipsc(t, ϕ, s), where s = hπ(t1 ), . . .i, otherwise.

In other words, if a node is of type leaf the ipsc is one, since bags of leaf nodes are empty. Otherwise, we compute the count of a given table ρ ⊆ ν(t) with respect to P by exploiting the inclusion-exclusion principle on A-origins of ρ such that we count every projected solution only once. Then we have to subtract and add ipsc values (“all-overlapping” counts) for strict subsets ϕ of table ρ. Listing 3 presents the table algorithm APRJ, which stores π(t) consisting of every sub-bucket of the given table ν(t) together with its ipsc. In the end, the solution to #PS OLS(P) is given by s-ipsc(hπ(n)i, ν(n)).

Runtime Analysis Next, we present asymptotic upper bounds on the runtime of our Algorithm DPAPRJ . To this end, we assume γ(n) to be the number of operations that are required to multiply two n-bit integers, which can be achieved in time O(n · log n · log log n) (Knuth 1998). If unit costs for multiplication of numbers are assumed, then γ(n) = 1.

Theorem 1 (⋆1 ). Given an instance I of problem P and a TTD Tpurged = (T, χ, ν) of I of width k with g nodes. Then, DPAPRJ runs in time O(2^(4m) · g · γ(kIk)) where m := max{|ν(t)| | t ∈ N }.

1 Proofs of statements marked with “⋆” will be made available in an author self-archived copy.

Corollary 1 (⋆). Given an instance Q of #Σℓ QSAT of treewidth k.
Then, PSCQALG runs in time O(tower(ℓ + 1, k + 7) · γ(kQk) · g), where ℓ is the quantifier depth of QBF Q. From recent results stated in Proposition 2, we can conclude that one cannot significantly improve the runtime assuming that the exponential time hypothesis (ETH) holds.

Definition 4 (Compatibility). Let children(t) = ht1 , . . . , tℓ i, û = hŜ, . . .i be an A-solution up to t and v̂ = hŜ ′ , . . .i be an A-solution up to ti . Then, û is compatible with v̂ (and vice versa) if v̂(1) |χ≤ (ti ) = û(1) |χ≤ (ti ) .

Corollary 2. Under ETH, the problem #Σℓ QSAT cannot be solved in time tower(ℓ + 1, o(k)) · 2^o(kQk) for an instance Q of treewidth k and quantifier depth ℓ.

Formalization of Suitable Table Algorithms

For a table algorithm that correctly models the solutions to any instance of P, we require both soundness, indicating that computed records are not wrong, and completeness, which ensures that we do not miss records. In order to use our algorithm for various problems P, we need to characterize the suitable table algorithms for P. To this end, we formalize the content of a table at a node t. Therefore, we define a record up to node t as follows: A record û up to t is of the form û = hû1 , . . . , ûq i such that for each i with 1 ≤ i ≤ q, either ûi only contains elements of the sub-tree rooted at t, i.e., ûi ⊆ χ≤ (t), or ûi is a set of records up to t. A set of records up to t is referred to as a table ρ̂ up to t.

Definition 5 (Soundness). Algorithm A is referred to as sound if, for any TTD T ′ of any instance I ′ of P, any node t′ of T ′ with children(t′ ) = ht1 , . . . , tℓ i, any A-record u at t′ , and any A-record solutions vi at ti for 1 ≤ i ≤ ℓ, we have: If hv1 , . . . , vℓ i ∈ A-origins(t′ , u), then u is also an A-record solution at node t′ and u is compatible with vi .

Definition 6 (Completeness).
Algorithm A is referred to as complete if, for any TTD T ′ of any instance I ′ of P, any node t′ of T ′ with children(t′ ) = ht1 , . . . , tℓ i, ℓ ≥ 1, any A-record solution u at node t′ , and any corresponding A-solution û up to t′ (of u), we have: For every 1 ≤ i ≤ ℓ, there exists s = hv1 , . . . , vℓ i where vi is an A-record solution at ti such that s ∈ A-origins(t′ , u), and v̂i is a corresponding A-solution up to ti (of vi ) that is compatible with û.

Formalizing Tables. Correctness of a table algorithm A for a problem P is typically established using a set C of conditions (Fichte et al. 2017; Jakl, Pichler, and Woltran 2009; Pichler, Rümmele, and Woltran 2010) that hold for every table that is computed using algorithm A. Let therefore ρ̂ be a table up to t, and C be a set of conditions, which depend only on the sub-tree T [t] of T rooted at t. Then, û ∈ ρ̂ is referred to as an A-solution up to t consistent with C if it ensures C. However, we need to restrict this set such that it allows us to characterize the solutions to P. Therefore, we need the definition of sufficient conditions, which, vaguely speaking, make sure that parts of the record, while potentially containing auxiliary data, correspond to the (relations of) solutions. These definitions finally allow us to define table algorithms that capture the solutions to instances of problem P. This ensures, besides soundness and completeness, that checking conditions in C can be done in polynomial time.

Definition 7 (Correctness). Algorithm A is referred to as correct for problem P if A is both sound and complete. Further, for any TD T ′ = (T, χ′ ) of any instance I ′ of P over σ, any node t′ ∈ N ′ , and any resulting A-TTD (·, ·, o), we have: (i) We can verify for every A-record u at node t′ of table o(t′ ) in time |u|^O(1) whether record u is an A-record solution at t′ , by using only A-record solutions in o(·) for children of t′ .
In other words, for every corresponding A-solution û up to t′ of record u, the conditions C hold. (ii) If t′ = root(T ′ ), or type(t′ ) = leaf, then |ν(t′ )| ≤ 1 for the purged table mapping ν of o.

Definition 3. A set C of conditions is called sufficient for A if the set of solutions of any instance I ′ of problem P, and any TD T ′ of I ′ , are characterized as follows: The set {û(1) | û is an A-solution up to root n′ of T ′ consistent with C} corresponds to the set {R | S = h·, ·, Ri, S is a solution to instance I ′ }.

However, these table algorithms A do not store records up to a node. Instead, such algorithms store “local” records that only mention contents restricted to the bag of a node. To this end, we need the following definitions. Given a table ρ̂ = {v̂1 , . . . , v̂s } up to t and a set P ⊆ χ(t), the table ρ̂ restricted to P is given by ρ̂|P := {v̂1 |P , . . . , v̂s |P }, where for v̂i ∈ ρ̂, if v̂i ⊆ χ≤ (t), then v̂i |P := v̂i ∩ P . Otherwise, if v̂i = hû1 , . . . , ûq i, then v̂i |P := hû1 |P , . . . , ûq |P i. This allows us to formalize the table ρ at node t, which is given by ρ := ρ̂|χ(t) .

Remark: Condition (ii) hardly restricts correctness, since bags of the leaves and the root are empty by definition, as we use nice TDs. In fact, reasonable table algorithms that compute solutions to problems P are correct, because the form of the table data structure, the correctness condition, and the monotonicity criterion via the compatibility notion are very weak requirements on top of the existing algorithm. In particular, these conditions still allow for solving monadic second-order logic using TDs (Bliem, Pichler, and Woltran 2013).

Characterization of Correctness. In the following, we let C be such a set of sufficient conditions for A and û be an A-solution up to t consistent with C. Then, û|χ(t) at t is referred to as an A-record solution at node t consistent with C. We say û is a corresponding A-solution up to t of û|χ(t) .
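The restriction operation v̂|P on nested records can be rendered schematically as follows (a sketch under the assumption that records are nested tuples whose atomic parts are frozensets of bag elements; `restrict` is our own name):

```python
def restrict(record, P):
    """Restrict a record to bag P: intersect atomic parts with P,
    and recurse into nested (tuple) parts, mirroring the definition of v̂|P."""
    if isinstance(record, frozenset):
        return record & P
    return tuple(restrict(part, P) for part in record)
```

For instance, restricting the nested record ({a, b}, ({b, c},)) to the bag {b} yields ({b}, ({b},)), keeping the nesting structure intact.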
Intuitively, to characterize correctness, we need some kind of monotonicity among bag relations over ξ, i.e., it is not allowed that some table records defer or change decisions at some descendant node about domain elements in relations that are part of solutions. Therefore, we rely on the notion of compatibility of Definition 4.

Proposition 3. Algorithm QALG is correct.

Proof (Idea). Correctness of QALG can be established by adapting the original proof (Chen 2004) and establishing conditions C similar to invariants for other formalisms (Samer and Szeider 2010).

Results for our Meta Algorithm PSCA . Finally, we state that the new table algorithm APRJ is indeed correct, assuming a correct table algorithm A is given.

Table 1: Runtime upper (N, △) and lower (H, ▽) bounds (ETH) of #PS OLS(P) for selected problems P. I refers to an instance of #PS OLS(P), k to the treewidth of I. (△, ▽) indicates previously known bounds. For problem definitions, we refer to the problem compendium of (Fichte, Hecher, and Pfandler 2020). Due to space reasons, we abbreviate references above as follows: [1]: (Fichte et al. 2018), [2]: (Fichte, Hecher, and Pfandler 2020), [3]: (Chen 2004), [4]: (Fichte, Hecher, and Meier 2019), [5]: (Hecher, Morak, and Woltran 2020). [The table lists runtimes of the form tower(i, Θ(k)) · kIk^O(1) with i ∈ {2, 3, 4, 5} or i = ℓ for the problems V ERTEX C OVER, D OMINATING S ET, I NDEPENDENT S ET, 3-C OLORABILITY (graphs); SAT and Σℓ−1 QSAT, Πℓ−1 QSAT for ℓ ≥ 2 (logic); ASP (logic programs); C ANDIDATE W ORLD V IEWS and W ORLD V IEWS (epistemic LPs); A BDUCTION and C IRCUMSCRIPTION (reasoning); and C REDpreferred , C REDsemi-st , C REDstage (argumentation); the cell-level bounds are omitted here.]

Theorem 2 (⋆). Given a correct algorithm A for problem P and an instance I of P. Then, Algorithm PSCA is correct and s-ipsc(hπ(n)i, ν(n)) outputs, for the TD-root n of the instance consisting of I and any projection P, the solution to problem #PS OLS(P).
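The tower function appearing in the runtime bounds of Table 1 follows the standard definition tower(0, n) = n and tower(h, n) = 2^tower(h−1, n), which can be stated compactly:

```python
def tower(height, n):
    """Iterated exponentiation: tower(0, n) = n, tower(h, n) = 2 ** tower(h - 1, n)."""
    for _ in range(height):
        n = 2 ** n
    return n
```

For example, tower(2, 3) = 2^(2^3) = 256, and each additional level (here, each quantifier alternation) adds one exponential.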
established lower bounds for projected solution counting problems (under ETH) by also providing the corresponding upper bounds that are achieved with our framework. While we did not elaborate in detail, our work still allows different graph representations such as the incidence graph. The presented research opens up a variety of new questions. We believe that an implementation for projected counting can be quite interesting. Another interesting direction for future work is to extend the existing counting framework to projected solution enumeration with linear delay. We believe that projected counting or enumeration can be a promising extension for well-known graph problems, yielding new insights and wider applications, as was already the case for abstract argumentation (Fichte, Hecher, and Meier 2019).

Corollary 3. Algorithm PSCQALG is correct and outputs for any given instance of #Σℓ QSAT its projected solution count.

As a side result we immediately obtain a meta algorithm for counting. This includes problems P where the corresponding DP algorithm A might compute more involved records such that a trivial extension of A to facilitate counters might count duplicate solutions.

Corollary 4. Given an instance hD, (RṘ )Ṙ∈σ i of problem P = hσ, ξ, ·i and a correct table algorithm A for P. If we set RṘ := D^ar(Ṙ) for each Ṙ ∈ ξ and run algorithm PSCA on instance hD, (RṘ )Ṙ∈σ∪ξ i, the value s-ipsc(hπ(n)i, ν(n)) for TD-root n is the number of solutions to P.

References

Abiteboul, S.; Hull, R.; and Vianu, V. 1995. Foundations of Databases: The Logical Level. Addison-Wesley, 1st edition.
Arnborg, S.; Lagergren, J.; and Seese, D. 1991. Easy problems for tree-decomposable graphs. J. Algorithms 12(2):308–340.
Aziz, R. A.; Chu, G.; Muise, C.; and Stuckey, P. 2015. #(∃)SAT: Projected Model Counting. In SAT’15, 121–137. Springer.
Aziz, R. A. 2015. Answer Set Programming: Founded Bounds and Model Counting. Ph.D. Dissertation, University of Melbourne.
Biere, A.; Heule, M.; van Maaren, H.; and Walsh, T., eds. 2009. Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications. IOS Press.
Bliem, B.; Pichler, R.; and Woltran, S. 2013. Declarative dynamic programming as an alternative realization of Courcelle’s theorem. In IPEC’13, volume 8246 of LNCS, 28–40. Springer.
Bodlaender, H. L., and Kloks, T. 1996. Efficient and constructive algorithms for the pathwidth and treewidth of graphs. J. Algorithms 21(2):358–402.

Table 1 gives a brief overview of selected problems P and their respective runtime upper bounds (N) obtained via DPA , as well as lower bounds (H, ▽) under the exponential time hypothesis (ETH).

Conclusion and Future Work

We introduced a novel framework and meta algorithm for counting projected solutions in a variety of domains. We use finite structures for specifying the problem and the graph representations on which we run dynamic programming. With this general tool at hand to describe problems, we employ dynamic programming (DP) on tree decompositions of the Gaifman graph (primal graph) of the input given in the finite structure. Conveniently, we can reuse already established DP algorithms and impose very weak conditions for their usage for projected counting. Interestingly, a very general technique that describes implementations of counting techniques (without projection) in a similar fashion, namely relational algebra, is also useful and competitive in practice (Fichte et al. 2020). Further, we completed the picture of previously

Capelli, F., and Mengel, S. 2019. Tractable QBF by knowledge compilation. In STACS’19, volume 126 of LIPIcs, 18:1–18:16. Dagstuhl.
Charwat, G., and Woltran, S. 2019. Expansion-based QBF solving on tree decompositions. Fundam. Inform. 167(1-2):59–92.
Chavira, M., and Darwiche, A. 2008. On probabilistic inference by weighted model counting. Artificial Intelligence 172(6–7):772–799.
Chen, H. 2004.
Quantified constraint satisfaction and bounded treewidth. In ECAI’04, 161–170. IOS Press.
Curticapean, R. 2018. Counting problems in parameterized complexity. In IPEC’18, volume 115 of LIPIcs, 1:1–1:18. Dagstuhl. 978-3-95977-084-2.
Cygan, M.; Fomin, F. V.; Kowalik, Ł.; Lokshtanov, D.; Marx, D.; Pilipczuk, M.; Pilipczuk, M.; and Saurabh, S. 2015. Parameterized Algorithms. Springer.
Domshlak, C., and Hoffmann, J. 2007. Probabilistic planning via heuristic forward search and weighted model counting. J. Artif. Intell. Res. 30.
Doubilet, P.; Rota, G.-C.; and Stanley, R. 1972. On the foundations of combinatorial theory VI: The idea of generating function. In Berkeley Symposium on Mathematical Statistics and Probability, 2: 267–318.
Dueñas-Osorio, L.; Meel, K. S.; Paredes, R.; and Vardi, M. Y. 2017. Counting-based reliability estimation for power-transmission grids. In AAAI’17, 4488–4494. AAAI Press.
Durand, A.; Hermann, M.; and Kolaitis, P. G. 2005. Subtractive reductions and complete problems for counting complexity classes. Theoretical Computer Science 340(3):496–513.
Fichte, J. K., and Hecher, M. 2019. Treewidth and counting projected answer sets. In LPNMR’19, volume 11481 of LNCS, 105–119. Springer.
Fichte, J. K.; Hecher, M.; Morak, M.; and Woltran, S. 2017. Answer set solving with bounded treewidth revisited. In LPNMR’17, volume 10377 of LNCS, 132–145. Springer.
Fichte, J. K.; Hecher, M.; Morak, M.; and Woltran, S. 2018. Exploiting treewidth for projected model counting and its limits. In SAT’18. Springer.
Fichte, J. K.; Hecher, M.; Thier, P.; and Woltran, S. 2020. Exploiting database management systems and treewidth for counting. In PADL’20, volume 12007 of LNCS, 151–167. Springer.
Fichte, J. K.; Hecher, M.; and Meier, A. 2019. Counting complexity for reasoning in abstract argumentation. In AAAI’19, 2827–2834. AAAI Press.
Fichte, J. K.; Hecher, M.; and Pfandler, A. 2020. Lower bounds for QBFs of bounded treewidth. In LICS’20, 410–424. ACM.
Fioretto, F.; Pontelli, E.; Yeoh, W.; and Dechter, R. 2018. Accelerating exact and approximate inference for (distributed) discrete optimization with GPUs. Constraints 23(1):1–43.
Gaggl, S. A.; Manthey, N.; Ronca, A.; Wallner, J. P.; and Woltran, S. 2015. Improved answer-set programming encodings for abstract argumentation. TPLP 15(4-5):434–448.
Gaifman, H. 1982. On local and nonlocal properties. In Proceedings of the Herbrand Symposium, volume 107 of Studies in Logic and the Foundations of Mathematics, 105–135. Elsevier.
Gebser, M.; Kaufmann, B.; and Schaub, T. 2009. Solution enumeration for projected Boolean search problems. In CPAIOR’09, volume 5547 of LNCS, 71–86. Springer.
Gomes, C. P.; Sabharwal, A.; and Selman, B. 2009. Chapter 20: Model counting. In Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications. IOS Press. 633–654.
Graham, R. L.; Grötschel, M.; and Lovász, L. 1995. Handbook of Combinatorics, volume I. Elsevier.
Gurevich, Y. 1995. Evolving algebras 1993: Lipari guide. In Specification and Validation Methods. OUP. 9–36.
Hecher, M.; Morak, M.; and Woltran, S. 2020. Structural decompositions of epistemic logic programs. In AAAI’20, 2830–2837. AAAI Press.
Hemaspaandra, L. A., and Vollmer, H. 1995. The satanic notations: Counting classes beyond #P and other definitional adventures. SIGACT News 26(1):2–13.
Impagliazzo, R.; Paturi, R.; and Zane, F. 2001. Which problems have strongly exponential complexity? J. of Computer and System Sciences 63(4):512–530.
Jakl, M.; Pichler, R.; and Woltran, S. 2009. Answer-set programming with bounded treewidth. In IJCAI’09, volume 2, 816–822.
Kangas, K.; Koivisto, M.; and Salonen, S. 2019. A faster tree-decomposition based algorithm for counting linear extensions. In IPEC’18, volume 115 of LIPIcs, 5:1–5:13. Dagstuhl.
Kloks, T. 1994. Treewidth. Computations and Approximations, volume 842 of LNCS. Springer.
Knuth, D. E. 1998. How fast can we multiply?
In The Art of Computer Programming, volume 2 of Seminumerical Algorithms. Addison-Wesley, 3rd edition, chapter 4.3.3, 294–318.
Lagniez, J.-M., and Marquis, P. 2019. A recursive algorithm for projected model counting. In AAAI’19, 1536–1543. AAAI Press.
Pichler, R.; Rümmele, S.; and Woltran, S. 2010. Counting and enumeration problems with bounded treewidth. In LPAR’10, volume 6355 of LNCS, 387–404. Springer.
Samer, M., and Szeider, S. 2010. Algorithms for propositional model counting. J. Discrete Algorithms 8(1):50–64.
Sang, T.; Beame, P.; and Kautz, H. 2005. Performing Bayesian inference by weighted model counting. In AAAI’05. AAAI Press.
Sharma, S.; Roy, S.; Soos, M.; and Meel, K. S. 2019. Ganak: A scalable probabilistic exact model counter. In IJCAI’19, 1169–1176. IJCAI.
Stockmeyer, L. J., and Meyer, A. R. 1973. Word problems requiring exponential time. In STOC’73, 1–9. ACM.
Valiant, L. 1979. The complexity of enumeration and reliability problems. SIAM J. Comput. 8(3):410–421.

Paraconsistent Logics for Knowledge Representation and Reasoning: advances and perspectives

Walter Carnielli1,2,3,4 , Rafael Testa1
1 Centre for Logic, Epistemology and the History of Science
2 Institute for Philosophy and Human Sciences
University of Campinas (UNICAMP)
3 Advanced Institute for Artificial Intelligence (AI2)
4 Modal Institute
walterac@unicamp.br, rafaeltesta@gmail.com

Abstract

Indeed, the so-called Bar-Hillel-Carnap paradox already suggested, half a century ago, a collapse between the notions of contradiction and semantic information: the less probable a statement is, the more informative it is, and so contradictions carry the maximum amount of information (Carnap and Bar-Hillel 1952). However, in the light of standard logic, contradictions are “too informative to be true”, as a famous quote by the latter has it.
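One standard way to make the Bar-Hillel-Carnap observation quantitative is the information measure inf(s) = −log2 p(s) (our illustrative rendering; Carnap and Bar-Hillel also work with content measures such as cont(s) = 1 − p(s)):

```python
import math

def inf_measure(p):
    """Semantic information inf(s) = -log2 p(s): the less probable a statement,
    the more informative it is; a contradiction (p = 0) carries maximal
    (infinite) information."""
    if p == 0:
        return math.inf
    return -math.log2(p)
```

A tautology (p = 1) carries zero information, a fair-coin statement (p = 0.5) carries one bit, and a contradiction diverges, which is exactly the "too informative to be true" effect.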
Facing the task of reasoning under contradictions, a field where human agents excel, is a difficult philosophical problem for standard logic, which is forced to equate triviality with contradiction and to regard all contradictions as equivalent. However, skipping all technicalities in favor of a clear intuition (technical details can be found in (Mendonça 2018)), the Bar-Hillel-Carnap observation is not paradoxical for the LFIs.

This paper briefly outlines some advancements in paraconsistent logics for modeling knowledge representation and reasoning. Emphasis is given to the so-called Logics of Formal Inconsistency (LFIs), a class of paraconsistent logics that formally internalize the very concept(s) of consistency and inconsistency. A couple of specialized systems based on the LFIs will be reviewed, including belief revision and probabilistic reasoning. Potential applications of those systems in the AI area of KRR are tackled by illustrating some examples that emphasize the importance of a fine-tuned treatment of consistency in modeling reputation systems, preferences, argumentation, and evidence.

1 Introduction

Non-classical logics find several applications in artificial intelligence, including multi-agent systems, reasoning with vagueness, uncertainty, and contradictions, among others, mostly akin to the area of knowledge representation and reasoning (Thomason 2020). Regarding the latter, there is a plethora of aims and applications in view when representing the knowledge of an agent, including fields beyond AI such as software engineering, databases, and robotics. Several logics have been studied for these purposes, including non-monotonic, epistemic, temporal, many-valued, and fuzzy logics.
This paper highlights the use of paraconsistent logics in some inconsistency-tolerant frameworks, introducing the family of Logics of Formal Inconsistency (LFIs), as advanced in the literature to be presented, for representing reasoning that makes use of the very notions of consistency and inconsistency, suitably formalized within the systems.

2 Reasoning under contradiction

2.1 The informative power of contradictions

Contradictory information is not only frequent, and more so as systems increase in complexity, but it can also have a positive role in human thought, in some cases not being undesirable at all. Finding contradictions in juridical testimonies, in statements from suspects of a crime or from suspects of tax fraud, for instance, can be an effective strategy: contradictions can be very informative in those cases (Carnielli and Coniglio 2016).

2.2 The beginnings of Paraconsistent Logics (modern era)

The idea of a non-Aristotelian logic was advanced in a lecture in 1919 by Nicolai A. Vasiliev, who proposed a kind of reasoning free from the laws of excluded middle and contradiction, called Imaginary Logic by analogy with Lobachevsky's imaginary geometry. Such a logic would be valid, as Vasiliev had it, only for reasoning in "imaginary worlds" (Vasiliev 1912).

A more concrete example of a system for reasoning with contradictions can be found in Discussive Logic (Jaśkowski 1948), advanced as a formal answer to a puzzling situation posed by J. Łukasiewicz: which logic applies when one has to defend some judgment A while also considering not-A for the sake of the argument? Jaśkowski's strategy is to avoid the combination of conflicting information by blocking the rule of adjunction. The idea is to make room for A and ¬A without entailing A ∧ ¬A, since classical explosion still holds in the form A ∧ ¬A ⊢ B. In terms of reasoning, this has a straightforward meaning: each agent must still be consistent! Jaśkowski's intuitions contributed to the proposal of the society semantics and, in the general case, of the possible-translations semantics. A discussion of some conceptual points involving society semantics and their role in collective intelligence can be found in (Carnielli and Lima-Marques 2017; Testa 2020).

Another precursor, with a many-valued approach, is the Logic of Nonsense (Halldén 1949) which, despite its name, captured a meaningful form of reasoning, aiming at studying logical paradoxes by means of 3-valued logical matrices (closely related to the Nonsense Logic introduced in 1938 by A. Bochvar). A similar approach was taken by F. Asenjo, who introduced a 3-valued logic as a formal framework for studying antinomies by means of 3-valued Kleene truth-tables for negation and conjunction, where the third truth-value is distinguished (Asenjo 1966). The same logic has been studied by G. Priest, from the perspective of matrix logics, in the form of the so-called Logic of Paradox (LP) (Priest 1979).

With respect to a constructive approach to intuitionistic negation, D. Nelson proposed an extension of positive intuitionistic logic with a strong negation, a connective designed to capture the notion of "constructible falsity". By eliminating explosion, Nelson obtained a (first-order) paraconsistent logic (Nelson 1959).

Adaptive Logics. Human reasoning can be better understood as endowed with many dynamic consequence relations. Adaptive reasoning recognizes so-called abnormalities and develops formal strategies to deal with them: for instance, an abnormality might be an inconsistency (inconsistency-adaptive logics), or it might be an inductive inference, and a strategy might be excluding a line of a proof (by marking it), or changing an inference rule (Batens 2001).

Focusing on the status of contradictions in mathematical reasoning, N.
da Costa advanced a hierarchy of paraconsistent systems Cn (for n ≥ 1) tolerant to contradictions, where the consistency of a formula A (in his terminology, the 'well-behavior' of A) is defined in C1 by the formula A◦ = ¬(A ∧ ¬A). Let A1 =def A◦ and An+1 =def (An)◦. Then, in Cn, the following holds: (i) well-behavior is denoted by A(n) =def A1 ∧ · · · ∧ An; (ii) A, ¬A ⊬ B in general, but A(n), A, ¬A ⊢ B always holds; and (iii) A(n), B(n) ⊢ (A#B)(n) (for # a binary connective) and A(n) ⊢ (¬A)(n).

Dialetheism. A dialetheia is a sentence A such that both it and its negation ¬A are true. Assuming that falsity is the truth of negation, a dialetheia is then a sentence which is both true and false. Dialetheism, accordingly, is the metaphysical view that there are dialetheias, i.e., that there are true contradictions. As such, dialetheism opposes the Law of Non-Contradiction in the form ¬(A ∧ ¬A) (Priest 1987). A system admitting 'both' as a truth-value, for instance, is the aforementioned Logic of Paradox.

Inconsistent (or rather Contradictory) Formal Systems. The main idea is that there are situations in which contradictions can, at least temporarily, be admissible if their "behavior can be somehow controlled", as da Costa has it (op. cit.). Contemporaneously, (Carnielli and Marcos 2002) extended and further generalized such notions, giving rise to the so-called Logics of Formal Inconsistency, to be presented in the next section.

By concentrating on the non-triviality of the systems rather than on the absence of contradictions, da Costa defined a logic to be paraconsistent with respect to ¬ if it can serve as a basis for ¬-contradictory yet non-trivial theories (da Costa 1974):
Definition 1. ∃Γ∃α∃β(Γ ⊢ α and Γ ⊢ ¬α and Γ ⊬ β)

2.3 Motivations: main approaches

Preservationism. Similarly to the way discussive logic has it, there is a clear distinction between an inconsistent data set, like {A, ¬A} (which is considered tractable), and a contradiction in the form A ∧ ¬A (intractable). Thus, given an inconsistent collection of sentences (in an already defined logic L, usually classical logic), one should not try to reason about that collection as a whole, but rather focus on internally consistent subsets of premises (Schotch, Brown, and Jennings 2009).

Relevant Logics. Relevant logics are mainly concerned with a meaningful connection between the premises and the conclusion of an argument, thus not accepting, for example, inferences like B ⊢ A → B. This strategy induces a paraconsistent character in the resulting deductions, since A and ¬A, as premises, do not necessarily have a meaningful connection with an arbitrary conclusion B (Anderson, Belnap, and Dunn 1992).

3 Logics of Formal Inconsistency - LFIs

3.1 Contradiction, consistency, inconsistency, and triviality

LFIs are a family of paraconsistent logics designed to express the notion(s) of consistency and inconsistency (sometimes defining one another, sometimes taken as primitive, depending on the strength of the axioms) within the object language by employing a connective "◦" (or "•"), in which ◦α means that "α is consistent" (and •α means that "α is inconsistent"), further expanding and generalizing da Costa's hierarchy of C-systems. Accordingly, the principle of explosion is not valid in general, although this law is not abolished but restricted to the so-called "consistent sentences", a feature captured by the following law, which is referred to as the "principle of Gentle Explosion" (PGE):

α, ¬α, ◦α ⊢ β, for every β, but α, ¬α ⊬ β for some β (1)

In formal terms, we have the following (Carnielli and Coniglio 2016):

Definition 2 (A formal definition of LFI). Let L be a Tarskian logic with a negation ¬. The logic L is an LFI if there is a non-empty set ◯(p) of formulas in some language L of L which depends only on the propositional variable p, satisfying the following:
a. ∃α∃β(¬α, α ⊬ β)
b. ∃α∃β(◯(α), α ⊬ β)
c. ∃α∃β(◯(α), ¬α ⊬ β)
d. ∀α∀β(◯(α), α, ¬α ⊢ β)

For any formula α, the set ◯(α) is intended to express, in a specific sense, the consistency of α relative to the logic L. When this set is a singleton, its sole element is denoted by ◦α, thus defining a consistency operator. The connective "◦", as mentioned, is not necessarily a primitive one. Indeed, LFI is an umbrella definition that covers many paraconsistent logics in the literature.

Remark 3 (Some notable LFIs). Following Definition 2, it can easily be proved that some well-known logics in the literature are LFIs, including the aforementioned Jaśkowski's Discussive logic, Halldén's nonsense logic and, as expected, da Costa's C-systems (Carnielli and Coniglio 2016; Carnielli, Coniglio, and Marcos 2007; Carnielli and Marcos 2002).

It is worth observing that each one of the aforementioned logics has its own motivations and particularities; Remark 3 is to be understood as a logico-mathematical reminder that those logics share some common results and properties.

mbC can be characterized in terms of valuations over {0, 1} (also called bivaluations), but cannot be semantically characterized by finite matrices (cf. (Carnielli, Coniglio, and Marcos 2007)). Surprisingly, however, mbC can be characterized by 5-valued non-deterministic matrices, as shown in (Avron 2005) (details also in Example 6.3.3 of (Carnielli and Coniglio 2016)).

Definition 5 (Valuations for mbC).
A function v : L → {0, 1} is a valuation for mbC if it satisfies the following clauses:
(Biv1) v(α ∧ β) = 1 ⟺ v(α) = 1 and v(β) = 1
(Biv2) v(α ∨ β) = 1 ⟺ v(α) = 1 or v(β) = 1
(Biv3) v(α → β) = 1 ⟺ v(α) = 0 or v(β) = 1
(Biv4) v(¬α) = 0 ⟹ v(α) = 1
(Biv5) v(◦α) = 1 ⟹ v(α) = 0 or v(¬α) = 0

The semantic consequence relation associated with valuations for mbC is defined as expected: X ⊨mbC α iff, for every mbC-valuation v, if v(β) = 1 for every β ∈ X then v(α) = 1.

Definition 6 (Extensions of mbC (Carnielli and Marcos 2002; Carnielli, Coniglio, and Marcos 2007; Carnielli and Coniglio 2016)). Consider the following axioms:
(ciw) ◦α ∨ (α ∧ ¬α)
(ci) ¬◦α → (α ∧ ¬α)
(cl) ¬(α ∧ ¬α) → ◦α
(cf) ¬¬α → α
(ce) α → ¬¬α

Some interesting extensions of mbC are the following:
mbCciw = mbC + (ciw)
mbCci = mbC + (ci)
bC = mbC + (cf)
Ci = mbC + (ci) + (cf) = mbCci + (cf)
mbCcl = mbC + (cl)
Cil = mbC + (ci) + (cf) + (cl) = mbCci + (cf) + (cl) = mbCcl + (cf) + (ci) = Ci + (cl)

The semantic characterization by bivaluations for all these extensions of mbC can easily be obtained from the one for mbC (see (Carnielli, Coniglio, and Marcos 2007; Carnielli and Coniglio 2016)). For instance, mbCciw is characterized by mbC-valuations such that v(◦α) = 1 if and only if v(α) = 0 or v(¬α) = 0 (if and only if v(α) ≠ v(¬α)).

Notation 7 (Derived bottom particle and strong negation). ⊥ =def α ∧ ¬α ∧ ◦α and ∼α =def α → ⊥ (for any α).

It is then clear that the LFIs are at the same time subsystems and extensions of CPL. They can be seen as classical logic extended by two connectives: a paraconsistent negation and a consistency connective (or an inconsistency one, dual to it). In formal terms, consider CPL defined over the language L0 generated by the connectives ∧, ∨, →, ¬, where ¬ represents the classical negation instead of the paraconsistent one. If Y ⊆ L0 then ◦(Y) = {◦α : α ∈ Y}. Then, the following result can be obtained:

Observation 8 (Derivability Adjustment Theorem (Carnielli and Marcos 2002)).
Let X ∪ {α} be a set of formulas in L0. Then X ⊢CPL α if and only if ◦(Y), X ⊢mbC α for some Y ⊆ L0.

3.2 A family of LFIs

It should be clear that the notions of consistency and non-contradiction do not coincide in the LFIs, and that the same holds for the notions of inconsistency and contradiction. There is, however, a full-fledged hierarchy of LFIs where consistency is gradually connected to non-contradiction. Starting from positive classical logic plus tertium non datur (α ∨ ¬α), mbC is one of the basic logics intended to comply with Definition 2 in a minimal way: an axiom schema called (bc1) is added solely to capture the aforementioned principle of gentle explosion.

Definition 4 (mbC (Carnielli and Marcos 2002)). The logic mbC is defined over the language L (generated by the connectives ∧, ∨, →, ¬, ◦) by means of a Hilbert system as follows:

Axioms:
(A1) α → (β → α)
(A2) (α → β) → ((α → (β → δ)) → (α → δ))
(A3) α → (β → (α ∧ β))
(A4) (α ∧ β) → α
(A5) (α ∧ β) → β
(A6) α → (α ∨ β)
(A7) β → (α ∨ β)
(A8) (α → δ) → ((β → δ) → ((α ∨ β) → δ))
(A9) α ∨ (α → β)
(A10) α ∨ ¬α
(bc1) ◦α → (α → (¬α → β))

Inference Rule:
(Modus Ponens (MP)) α, α → β ⊢ β

(A1)-(A10) plus (MP) coincide with Batens' paraconsistent logic CLuN; it is worth mentioning that a non-monotonic characterization of the Ci-hierarchy (presented in section 6) can be found in (Batens 2009). Furthermore, (A1)-(A9) plus (MP) define positive classical propositional logic CPL+.

4 Paraconsistent Belief Change

The fact is that there are in the literature several systems that could be understood as endowed with a certain paraconsistent character, each one based on distinct strategies and motivations (see for instance (Fermé and Wassermann 2017) for an Iterated Belief Change perspective). An approach to Belief Change from the perspective of inconsistent formal systems was conceptually suggested by (da Costa and Bueno 1998).
Departing from the technical advances of mbC and its extensions, (Testa, Coniglio, and Ribeiro 2017) goes further in this direction, defining external and semi-revisions for belief sets, as well as consolidation (operations that were originally presented for belief bases (Hansson 1993; 1997)). By considering consistency as an epistemic attitude, and by allowing temporary contradictions, the informational power of the operations is maximized (as argued by (Testa 2015)).

It is worth mentioning that, as proposed by Priest and Tanaka (op. cit.), paraconsistent revision could be understood as a plain expansion. As explained by (Testa et al. 2018), to equate paraconsistent revision with expansion it is necessary to assume that consistency is equivalent to non-triviality in a paraconsistent setting and, furthermore, that no paraconsistent logic endows a bottom particle (primitive or defined). As this paper intends to highlight, neither assumption is true.

Belief Change in a wide sense has been a subject of philosophical reflection since antiquity, including discussions about the mechanisms by which scientific theories develop and proposals of rationality criteria for revisions of probability assignments (Fermé and Hansson 2018). Contemporaneously, there is a strong tendency towards a confluence of the research traditions on the subject from philosophy and from computer science (Hansson 1999). The most influential paradigm in this area of study is the AGM model (Alchourrón, Gärdenfors, and Makinson 1985), in which epistemic states are represented as theories, considered simply as sets of sentences closed under logical consequence.
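The "theories as logically closed sets" picture, and the expansion operation used below, can be prototyped in a few lines on a toy finite language. A minimal sketch, with classical propositional consequence (brute-forced over two atoms) standing in for the LFI consequence relations discussed in this paper, and with formulas written as Python boolean expressions; all names here are ours:

```python
from itertools import product

ATOMS = ("A", "B")
# candidate formulas: the finite 'language' over which closures are computed
LANGUAGE = {"A", "B", "not A", "not B", "A or B", "A and B"}

def entails(K, f):
    # brute-force classical entailment: every assignment satisfying K satisfies f
    envs = [dict(zip(ATOMS, bits)) for bits in product([False, True], repeat=len(ATOMS))]
    models = [e for e in envs if all(eval(g, {}, e) for g in K)]
    return all(eval(f, {}, e) for e in models)

def Cn(K):
    # consequence operator, restricted to the finite candidate language
    return {f for f in LANGUAGE if entails(K, f)}

def expand(K, f):
    # AGM expansion: K + f = Cn(K ∪ {f})
    return Cn(K | {f})
```

For instance, Cn({"A"}) contains "A or B" but not "B"; expanding it by "B" then also brings in "A and B".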
Three types of epistemic changes (or operations) are considered in this model: expansion, the incorporation of a sentence into a given theory; contraction, the retraction of a sentence from a given theory; and revision, the incorporation of a sentence into a given consistent theory while ensuring the consistency of the resulting one. Notably, given the possibility of reasoning with contradictions (as paraconsistent logics have it), as well as the aforementioned scrutiny of the very concept of "consistency", the definition of revision can be refined. Indeed, there are some investigations in the literature along this direction:

Based on the four-valued relevant logic of first-degree entailment, (Restall and Slaney 1995) define an AGM-like contraction that does not satisfy the recovery postulate. Revision is obtained from contraction by the Levi identity (to be introduced).

Also based on first-degree entailment, (Tamminga 2001) advances a system that puts forth a distinction between information and belief. Techniques of expansion, contraction and revision are applied to information (which can be contradictory), while operations of another kind are advanced for extracting beliefs from that information. The demand for consistency (i.e., non-contradictoriness) applies only to those beliefs.

(Mares 2002) proposes a model in which an agent's belief state is represented by a pair of sets: one of these is the belief set, and the other consists of the sentences that the agent rejects. A belief state is coherent if and only if the intersection of these two sets is empty, i.e., if and only if there is no statement that the agent both accepts and rejects. In this model, belief revision preserves coherence but does not necessarily preserve consistency.
Also departing from a distinction between consistency and coherence, (Chopra and Parikh 1999) advance a model based on Belnap and Dunn's logic that preserves an agent's ability to answer contradictory queries in a coherent way, splitting the language so as to distinguish between implicit and explicit beliefs.

In (Priest 2001) and (Tanaka 2005), it is suggested that revision can be performed by just adding sentences without removing anything, i.e., revision can be defined as a simple expansion. Furthermore, Priest first pointed out that, in a paraconsistent framework, revision on belief sets can be performed as external revision, defined by the reversed Levi identity as advanced for belief bases (Hansson 1993).

Remark 9. From now on, let us assume an LFI, namely L = ⟨L, ⊢L⟩, such that L is mbC or some extension as presented above. Since the context is clear, we will omit the subscript, simply denoting ⊢L by ⊢ and, accordingly, the respective closure by Cn.

4.1 Revisions in the LFIs

In (Testa, Coniglio, and Ribeiro 2017) the so-called AGMp system is proposed, in which it is shown that a paraconsistent revision of a belief set K by a belief-representing sentence α (the operation K ∗ α) can be defined not only by the Levi identity as in classical AGM (that is, by a prior contraction by ¬α followed by an expansion by α) but also by the reversed Levi identity and other kinds of constructions where contradictions are temporarily accepted. Formally, we have the following. Let K = Cn(K). The expansion of K by α (K + α) is given by:

Definition 10. K + α = Cn(K ∪ {α})

There are several constructions for defining a contraction operator. The one adopted here is partial meet contraction, constructed as follows (Alchourrón, Gärdenfors, and Makinson 1985):
1. Choose some maximal subsets of K (with respect to inclusion) that do not entail α.
2. Take the intersection of such sets.

The remainder of K and α is the set of all maximal subsets of K that do not entail α.

Definition 11 (Remainder).
The set of all the maximal subsets of K that do not entail α is called the remainder set of K by α and is denoted by K⊥α; that is, K′ ∈ K⊥α iff:
(i) K′ ⊆ K.
(ii) α ∉ Cn(K′).
(iii) If K′ ⊂ K″ ⊆ K then α ∈ Cn(K″).

Typically K⊥α may contain more than one maximal subset. The main idea in constructing a contraction function is to apply a selection function γ which intuitively selects the sets in K⊥α containing the beliefs that the agent holds in higher regard (those beliefs that are more entrenched).

Definition 12 (Selection function). A selection function for K is a function γ such that, for every α:
1. γ(K⊥α) ⊆ K⊥α if K⊥α ≠ ∅.
2. γ(K⊥α) = {K} otherwise.

The partial meet contraction is the intersection of the sets of K⊥α selected by γ.

Definition 13 (Partial meet contraction). Let K be a belief set, and γ a selection function for K. The partial meet contraction on K that is generated by γ is the operation −γ such that, for all sentences α:

K −γ α = ⋂ γ(K⊥α).

Example 1. A man has died in a remote place in which only two other persons, Adam and Bob, were present. Initially, the public prosecutor believes that neither Adam nor Bob has killed him. Thus her belief state contains ¬A (Adam has not killed the deceased) and ¬B (Bob has not killed the deceased). For simplicity, we may assume that her belief state is K0 = Cn({¬A, ¬B}).

Case 1: The prosecutor receives a police report saying (1) that the deceased has been murdered, and that either Adam or Bob must have done it; and (2) that Adam has previously been convicted of murder several times. After receiving the report, she revises her belief set by (A ∨ B) and by the assumption that Bob's innocence is indeed consistent, ◦¬B; i.e., she revises her initial belief set by (A ∨ B) ∧ ◦¬B.

Case 2: This differs from case 1 only in that it is Bob who has previously been convicted of murder. Thus, the new piece of information consists of (A ∨ B) ∧ ◦¬A.
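Definitions 11-13 can be prototyped directly on a finite belief base. A minimal sketch, with classical propositional entailment (brute-forced over two atoms) standing in for the LFI consequence relation, formulas written as Python boolean expressions, and function names of our own choosing:

```python
from itertools import combinations, product

ATOMS = ("A", "B")

def entails(K, f):
    # classical entailment by truth-table: every assignment satisfying K satisfies f
    envs = [dict(zip(ATOMS, bits)) for bits in product([False, True], repeat=len(ATOMS))]
    return all(eval(f, {}, e) for e in envs if all(eval(g, {}, e) for g in K))

def remainder(K, f):
    # K ⊥ f: inclusion-maximal subsets of the (finite) base K not entailing f
    candidates = [set(c) for r in range(len(K) + 1)
                  for c in combinations(sorted(K), r) if not entails(set(c), f)]
    return [s for s in candidates if not any(s < t for t in candidates)]

def partial_meet_contraction(K, f, gamma):
    # Definition 13: intersect the remainders picked by the selection function γ;
    # clause 2 of Definition 12: if K ⊥ f is empty (f is a tautology), keep K
    rem = remainder(K, f)
    chosen = gamma(rem) if rem else [set(K)]
    out = set(K)
    for s in chosen:
        out &= s
    return out
```

With the full-meet choice (γ the identity), contracting the base {"A", "B"} by "A" leaves {"B"}, while contracting by a tautology such as "A or not A" leaves the base untouched.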
The distinct revisions are then defined as follows:

Definition 14.
Internal revision: (K − ¬α) + α
External revision: (K + α) − ¬α
Semi-revision: (K + α)!

The aforementioned operator "!", originally advanced for belief bases (Hansson 1997), is a particular case of contraction, called consolidation. In Hansson's original presentation, this operator is defined as a contraction by "⊥". In the context of the LFIs, it is defined as the contraction by ΩK = {α ∈ K : there exists β ∈ L such that α = β ∧ ¬β}. The technical details of those operations, alongside a presentation through postulates and the respective representation theorems, can be found in the references.

Internal Revision approach: If represented as an internal partial meet revision, when the first suboperation is performed (namely, contraction by ¬((A ∨ B) ∧ ◦¬B) and by ¬((A ∨ B) ∧ ◦¬A), respectively in case 1 and case 2), we have that

K0⊥(¬((A ∨ B) ∧ ◦¬B)) = K0⊥(¬((A ∨ B) ∧ ◦¬A)).

The subsequent expansion does not necessarily add or delete Adam's or Bob's guilt/innocence in either case, since the previous contraction could indiscriminately delete Adam's or Bob's innocence, not taking advantage of the new piece of information as a whole.

External Revision approach: If represented as an external partial meet revision, we have the following. Case 1: The police report brings about the expansion of K to K1 = Cn(K + (A ∨ B) ∧ ◦¬B). Notably, A ∈ K1 (on the grounds that ◦¬B, ¬B, A ∨ B ⊢ ◦¬B, ¬B, A ∨ ¬¬B ⊢ A). In plain English, Adam is now proven to be guilty. Moreover, •¬A ∈ K1 (for A ∧ ¬A ⊢ ¬¬A ∧ ¬A ⊢ •¬A), i.e., the initial assumption about Adam's innocence is logically proven to be inconsistent. The subsequent contraction thus has the means to delete the initial supposition about Adam's innocence. Case 2: Mutatis mutandis.
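The controlled explosion used in Case 1 rests on the principle of gentle explosion from section 3: a contradiction on a formula marked consistent explodes, while a bare contradiction does not. This contrast can be checked mechanically against the bivaluation clauses (Biv1)-(Biv5) of Definition 5 by brute force. A sketch (the formula encoding is ours; only ∧, ∨, ¬ and ◦ are covered, and negation and ◦ are deliberately non-truth-functional, so their values are only constrained, never computed):

```python
from itertools import product

def subf(f):
    # formulas: an atom (string), or ('not', x), ('circ', x), ('and', x, y), ('or', x, y)
    out = {f}
    if isinstance(f, tuple):
        for x in f[1:]:
            out |= subf(x)
        if f[0] == 'circ':
            out |= subf(('not', f[1]))  # (Biv5) also constrains v(¬α)
    return out

def _clause_ok(f, v):
    if not isinstance(f, tuple):
        return True
    op, args = f[0], f[1:]
    if op == 'and':
        return v[f] == min(v[args[0]], v[args[1]])
    if op == 'or':
        return v[f] == max(v[args[0]], v[args[1]])
    if op == 'not':    # (Biv4): v(¬α) = 0 implies v(α) = 1
        return not (v[f] == 0 and v[args[0]] == 0)
    if op == 'circ':   # (Biv5): v(◦α) = 1 implies v(α) = 0 or v(¬α) = 0
        return not (v[f] == 1 and v[args[0]] == 1 and v[('not', args[0])] == 1)
    raise ValueError(op)

def mbc_valuations(formulas):
    # all 0/1 assignments to the subformulas satisfying (Biv1)-(Biv5)
    subs = sorted(set().union(*map(subf, formulas)), key=str)
    for bits in product([0, 1], repeat=len(subs)):
        v = dict(zip(subs, bits))
        if all(_clause_ok(f, v) for f in subs):
            yield v

def entails(premises, conclusion):
    # X ⊨mbC α: no mbC-valuation makes all premises 1 and the conclusion 0
    return all(v[conclusion] == 1
               for v in mbc_valuations(list(premises) + [conclusion])
               if all(v[p] == 1 for p in premises))
```

On this checker, entails([('circ','A'), 'A', ('not','A')], 'B') holds while entails(['A', ('not','A')], 'B') fails, which is formula (1). Similarly, disjunctive syllogism from A ∨ B and ¬B fails, but becomes valid once ('circ','B') is added as a premise: the Derivability Adjustment Theorem (Observation 8) in miniature.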
Semi-revision approach: The semi-revision approach is analogous to the external revision, with the distinction that the second suboperation (namely, contraction) does not necessarily delete Adam's or Bob's innocence (respectively in case 1 and case 2) but, rather, leaves open the option of deleting the new piece of information given by the police report.

4.2 Reasoning with consistency and inconsistency

Each of the LFIs in the aforementioned family (recall Definition 6) captures distinct properties regarding the notion of formal consistency. For instance, mbC separates the notion of consistency from that of non-contradictoriness (◦α ⊢ ¬(¬α ∧ α), but the converse does not hold), and also separates the notion of inconsistency from that of contradictoriness (α ∧ ¬α ⊢ ¬◦α, but the converse does not hold). In Ci, inconsistency and contradictoriness are identified (¬◦α ⊣⊢ α ∧ ¬α) and, in Cil, consistency and non-contradictoriness are identified (◦α ⊣⊢ ¬(α ∧ ¬α)). This cautious way of dealing with the formal concept of consistency allows the modeling of significant forms of reasoning, as illustrated by Example 1 above, adapted from (Hansson 1999). In Hansson's original presentation, it was intended to show a case of an external partial meet revision that is not also an internal partial meet revision; indeed, neither one can be subsumed under the other. In our analysis, the same conclusion applies: avoiding contradictions at every step of the reasoning prevents the revision from yielding the significant results just presented. Let ¬◦α =def •α, and let us consider Ci as the underlying logic.

4.3 Formal consistency as an epistemic attitude

An alternative system considered in (Testa, Coniglio, and Ribeiro 2017), called AGM◦, relies heavily on the formal consistency operator. This means that the explicit constructions themselves (and accordingly the postulates) assume that such an operator plays a central role.
In a static paradigm (i.e., when the focus is the logical consequence relation) this is already the case: assuming the consistency of a sentence involved in a contradiction entails trivialization (as elucidated by the gentle explosion principle), which somehow captures and describes the intuition behind expansion.

The main idea of AGM◦ is to also incorporate the notion of consistency into contraction. In this case, a belief being consistent is interpreted as meaning that it is not liable to be removed from the belief set in question, so that contraction satisfies the postulate of failure (namely, if ◦α ∈ K then K − α = K). The strategy is to incorporate the idea of non-revisibility into the selection function: the consistent belief remains in the epistemic state in any situation, unless the agent retracts the very fact that such a belief is consistent.

Definition 15 (Selection function for AGM◦ contraction). A selection function for K is a function γ′ such that, for every α:
1. γ′(K, α) ⊆ K⊥α if α ∉ Cn(∅) and ◦α ∉ K.
2. γ′(K, α) = {K} otherwise.

Contraction, then, is defined as in Definition 13. In short, the seven epistemic attitudes defined in AGM◦ are:

Definition 16 (Possible epistemic attitudes in AGM◦, see Figure 1 (Testa, Coniglio, and Ribeiro 2017; Testa 2014)). Let K be a given belief set. Then, a sentence α is said to be:
Accepted if α ∈ K.
Rejected if ¬α ∈ K.
Under-determined if α ∉ K and ¬α ∉ K.
Over-determined if α ∈ K and ¬α ∈ K.
Consistent if ◦α ∈ K.
Boldly accepted if ◦α ∈ K and α ∈ K.
Boldly rejected if ◦α ∈ K and ¬α ∈ K (i.e., ∼α ∈ K).

2. Ellen, on the other hand, is a believer (G ∈ K). However, it may very well happen that she loses her faith so definitively that she can never become a believer in God again (◦¬G ∈ K).
3. Florence is an inveterate doubter.
Nothing can bring her to a state of firm (irreversible) belief (◦G ∉ K), and neither can she be brought to a state of firm disbelief (◦¬G ∉ K).

Paraconsistent Belief Revision based on the LFIs is an important step towards further advancements in systems for detecting and handling contradictions, mostly if combined with tools for expressing probabilistic reasoning. Some progress in this direction is overviewed in the following sections.

5 Sound probabilistic reasoning under contradiction

This section briefly surveys the research initiative on paraconsistent probability theory based on the LFIs and its consequences, which makes it possible to treat realistic probabilistic reasoning under contradiction. Paraconsistent probabilities can be regarded as degrees of belief that a rational agent attaches to events, even if such degrees of belief might be contradictory. Thus it is not impossible for an agent to believe in the propositions α and ¬α and to be rational, if this belief is justified by the evidence, as argued in (Bueno-Soler and Carnielli 2016).

A quite general notion of probability function can be defined, in such a way that different logics can be combined with probabilistic functions, giving rise to new measures that may reflect some subtle aspects of probabilistic reasoning.

Definition 17. A probability function for a language L of a logic L, or an L-probability function, is a function P : L → ℝ satisfying the following conditions, where ⊢L stands for the syntactic derivability relation of L:
1. Non-negativity: 0 ≤ P(ϕ) ≤ 1 for all ϕ ∈ L
2. Tautologicity: If ⊢L ϕ, then P(ϕ) = 1
3. Anti-tautologicity: If ϕ ⊢L (that is, if ϕ is trivializing), then P(ϕ) = 0
4. Comparison: If ψ ⊢L ϕ, then P(ψ) ≤ P(ϕ)
5. Finite additivity: P(ϕ ∨ ψ) = P(ϕ) + P(ψ) − P(ϕ ∧ ψ)

Figure 1: Epistemic attitudes in AGM◦

This collection of meta-axioms, by assuming an appropriate ⊢L (for instance, by taking the classical, intuitionistic or paraconsistent derivability relation), defines distinct probabilities, each one deserving a full investigation. In particular, for the sake of this project, we have in mind paraconsistent probability theory based on the Logics of Formal Inconsistency, as treated in (Bueno-Soler and Carnielli 2016; 2017). Several central properties of probability are preserved, such as the notion of paraconsistent updating, which is materialized through new versions of Bayes' theorem for conditionalization. Other papers have already proposed connections between non-classical logics and probabilities, even for the paraconsistent case (references can be found in the aforementioned works), recognizing that some non-classical logics are better suited to support uncertain reasoning in particular domains.

The following examples illustrate an important feature of human belief that, in classical AGM, has no room in a model solely based on contractions and revisions: the stubbornness of human belief. Instead of introducing the notions of necessity and possibility in the metalanguage, as suggested by (Hansson 1999), it is possible to capture such notions based on the concept of bold acceptance. Indeed, as interpreted by (Testa 2014), this fact illustrates a well-studied feature regarding the proximity of the LFIs to modal logics.

Example 2 (adapted from (Hansson 1999)).
1. Doris is not religious, but she has religious leanings. She does not believe that God exists (G ∉ K), but it is possible for her to become a believer (∼G ∉ K).

Furthermore, as given in the problem, P(D/A) = 0.98, P(C/¬A) = 0.9 and P(D) = 0.11.
The results of the test have no paraconsistent character, since the events D (‘doping’ ) and C (‘no doping’) exclude each other. Thus, P (D/¬A) = 1 − P (C/¬A) = 0.1 and P (C/A) = 1 − P (D/A) = 0.02. Suppose someone has been tested, and the test is positive (“doping”). What is the probability that the tested individual regularly uses this illegal drug, that is what is P (A/D)? By applying the paraconsistent Bayes’ rule: are better suited to support uncertain reasoning in particular domains. The combinations between probabilities and LFIs deserves to be emphasized, as they offer a quite natural and intuitive extension of standard probabilities which is useful and philosophically meaningful. The following example uses the system Ci, a member of the LFI family with some features that make it reasonably close to classical logic (recall definition 6); it is appropriate, in this way, to define a generalized notion of probability strong enough to enjoy useful properties. Observation 18 (Paraconsistent Bayes’ Conditionalization Rule (PBCR) (Bueno-Soler and Carnielli 2016)). If P (α ∧ ¬α) 6= 0, then: P (α/β) = P (A/D) = P (β/α) · P (α) P (β/α) · P (α) + P (β/¬α) · P (¬α) − δα P (D/A) · P (A) P (D/ A) · P (A) + P (D/¬A) · P (¬A) − δA where δA = P (D/A ∧ ¬A) · P (A ∧ ¬A) since P (A ∧ ¬A) 6= 0. All of the values are known, with the exception of P (D/A ∧ ¬A). Since: where δα = P (β/α∧¬α)·P (α∧¬α) is the ’contradictory residue’ of α. It is clear that this rule generalizes the classical conditionalization rule, as it reduces to the classical case if P (α ∧ ¬α) = 0 or if α is consistent: indeed, in the last case, P (β ∧ ◦α) = P (β ∧ ◦α ∧ α) + P (β ∧ ◦α ∧ ¬α) since P (◦α ∧ α ∧ ¬α) = 0. We can interpret (PBCR) as Bayes’ ruke taking into account the likelihood relative to the contradiction. It is possible, however, to formulate other kinds of conditionalization rules by combining the notions of conditional probability, contradictoriness, consistency and inconsistency. Example 3. 
As an example, suppose that a doping test for an illegal drug is such that it is 98% accurate in the case of a regular user of that drug (i.e., it produces a positive result, showing "doping", with probability 0.98 in the case that the tested individual often uses the drug), and 90% accurate in the case of a non-user of the drug (i.e., it produces a negative result, showing "no doping", with probability 0.9 in the case that the tested individual has never used the drug or does not often use the drug). Suppose, additionally, that: (i) it is known that 10% of the entire population of all athletes often uses this drug; (ii) that 95% of the entire population of all athletes does not often use the drug or has never used it; and (iii) that the test produces a positive result, showing "doping", with probability 0.11 for the whole population, independent of the tested individual. Let the following be some mnemonic abbreviations:
D: the event that the drug test has declared "doping" (positive) for an individual;
C: the event that the drug test has declared "clear" or "no doping" (negative) for an individual;
A: the event that the person tested often uses the drug;
¬A: the event that the person tested does not often use the drug or has never used it.
We know that P(A) = 0.1 and P(¬A) = 0.95. The situation is clearly contradictory with respect to the events A and ¬A, as they are not mutually exclusive. Therefore, by finite additivity, P(A ∨ ¬A) = 1 = (P(A) + P(¬A)) − P(A ∧ ¬A), and thus P(A ∧ ¬A) = (P(A) + P(¬A)) − 1 = 0.05. Since P(D/A ∧ ¬A) = P(D ∧ A ∧ ¬A) / P(A ∧ ¬A), it remains to compute P(D ∧ A ∧ ¬A). It follows directly from some easy properties of probability that P(D ∧ A ∧ ¬A) = P(D ∧ A) + P(D ∧ ¬A) − P(D) = P(D/A) · P(A) + P(D/¬A) · P(¬A) − P(D) = 0.083. Therefore, by plugging in all of the values, it follows that P(A/D) = 51.9%.¹
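The intermediate arithmetic of the example can be checked mechanically. The short script below is a sketch that recomputes the overlap P(A ∧ ¬A) and the joint value P(D ∧ A ∧ ¬A) from the stated inputs (the variable names are ours):

```python
# Recomputing the example's intermediate values (all inputs as stated above).
p_a, p_not_a = 0.10, 0.95        # P(A), P(¬A): overlapping, hence contradictory
p_d_given_a = 0.98               # P(D/A)
p_d_given_not_a = 1 - 0.90       # P(D/¬A) = 1 - P(C/¬A)
p_d = 0.11                       # P(D) over the whole population

# Finite additivity with P(A ∨ ¬A) = 1 gives the contradictory overlap:
p_a_and_not_a = p_a + p_not_a - 1
print(round(p_a_and_not_a, 2))   # 0.05

# P(D ∧ A ∧ ¬A) = P(D/A)·P(A) + P(D/¬A)·P(¬A) - P(D)
p_d_a_not_a = p_d_given_a * p_a + p_d_given_not_a * p_not_a - p_d
print(round(p_d_a_not_a, 3))     # 0.083

# The contradictory residue used in the paraconsistent Bayes' rule:
delta_a = p_d_a_not_a            # equals P(D/A∧¬A)·P(A∧¬A), the factors cancel
```

Note that with these inputs δA coincides with P(D ∧ A ∧ ¬A) itself, since the two P(A ∧ ¬A) factors cancel.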
This example suggests, as argued below, that the paraconsistent Bayes' conditionalization rule is more robust than traditional conditionalization, as it can provide useful results even in cases where the test could be regarded as ineffective due to contradictions. The following table compares the paraconsistent result with the results obtained by trying to remove the contradiction involving the events A (the event that the person tested often uses the drug) and ¬A (the event that the person tested does not often use the drug or has never used it), that is, by trying to make them "classical". Since A and ¬A overlap by 5%, we might consider reviewing the values, by 'removing the contradiction' according to three hypothetical scenarios: an alarming scenario, by lowering the value of ¬A by 5%; a happy scenario, by lowering the value of A by 5%; and a cautious scenario, by dividing the surplus equally between A and ¬A, computing in each case the probability P(A/D) that the tested individual regularly uses this illegal drug.

Table 1: Removing the contradiction

                  Alarming Scenario   Cautious Scenario   Happy Scenario
P(A)              10%                 7.5%                5%
P(¬A)             90%                 92.5%               95%
P(D/A)            98%                 98%                 98%
P(D/¬A)           10%                 10%                 10%
Result: P(A/D)    52%                 44%                 34%

¹ The values correct some miscalculations in (Bueno-Soler and Carnielli 2016).

4. Comparison: If ψ ⊢L ϕ, then N(ψ) ≤ N(ϕ)
5. Conjunction: N(ϕ ∧ ψ) = min{N(ϕ), N(ψ)}
6. Metaconsistency: N(•α) + N(◦α) = 1
A condition N(α) = λ can be understood as expressing that 'α is certain to degree λ' (in all normal states of affairs). Possibilistic measures are also useful when representing preferences expressed as sets of prioritized goals, as e.g. some lattice-valued possibility measures studied in the literature instead of real-valued possibility measures.
The parameter L in the above definition can be Cie, or the three-valued logic LFI1, or XXX (see references for details). Analogously to the necessity function, a generic notion of logic-dependent possibility measure (dual to a necessity function) is defined as follows:
Definition 20. A possibility function (or measure) for the language L of Cie, or a Cie-possibility function, is a function Π : L → R satisfying the following conditions:
1. Non-negativity: 0 ≤ Π(ϕ) ≤ 1 for all ϕ ∈ L
2. Tautologicity: If ⊢L ϕ, then Π(ϕ) = 1
3. Anti-Tautologicity: If ϕ ⊢L, then Π(ϕ) = 0
4. Comparison: If ψ ⊢L ϕ, then Π(ψ) ≤ Π(ϕ)
5. Disjunction: Π(ϕ ∨ ψ) = max{Π(ϕ), Π(ψ)}
6. Metaconsistency: Π(•α) + Π(◦α) = 1
Standard necessity and possibility measures do not cope well with contradictions, since they treat contradictions in a global form (even if in a gradual way). This is the main reason to define new forms of necessity and possibility measures based upon paraconsistent logics; although they lack graduality, LFIs offer a tool for handling contradictions in knowledge bases in a local form, by locating the contradictions on critical sentences. Yet their combination strikes a good balance: the paraconsistent paradigm by itself does not allow for any fine-grained graduality in the treatment of contradictions, which may lead to some loss of information when contradictions appear in a knowledge base. When enriched with possibility and necessity functions, however, a new reasoning tool emerges. It is possible to define a natural non-monotonic consequence relation on databases acting under some logic L as above. Non-monotonic logics are structurally close to the internal reasoning of belief revision, as argued in (Gärdenfors 1990), where it is shown that the formal structures of the two theories are similar. The resulting logic systems have great potential to be used in real-life knowledge representation and reasoning systems.
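As a point of comparison, the characteristic laws in Definitions 19 and 20 (min for necessity over conjunctions, max for possibility over disjunctions) can be illustrated with a classical possibility distribution. The sketch below is ours: the world encoding and the toy weights are assumptions, and none of the LFI-specific machinery (◦, •) is modeled:

```python
# Classical possibility/necessity sketch over a toy possibility distribution.
# Worlds encode truth values of two atoms p, q ("~" marks falsity); the
# numeric weights are assumed toy values, max-normalized so that max = 1.
pi = {"pq": 1.0, "p~q": 0.7, "~pq": 0.4, "~p~q": 0.2}

def poss(event):                 # Π(φ) = max of π over the models of φ
    return max((pi[w] for w in event), default=0.0)

def nec(event):                  # N(φ) = 1 − Π(¬φ), duality with possibility
    return 1.0 - poss({w for w in pi if w not in event})

P = {"pq", "p~q"}                # models of p
Q = {"pq", "~pq"}                # models of q

# Disjunction law for Π and Conjunction law for N:
assert poss(P | Q) == max(poss(P), poss(Q))
assert nec(P & Q) == min(nec(P), nec(Q))
```

Representing a formula directly by its set of models keeps the two laws one-liners; the LFI-based measures of Definitions 19 and 20 would additionally constrain the values of •α and ◦α.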
Another important concept that can be advantageously treated by the paraconsistent paradigm is the concept of evidence. The paper (Rodrigues, Bueno-Soler, and Carnielli 2020) introduces the logic of evidence and truth LETF as an extension of the Belnap-Dunn four-valued logic FDE. LETF is equipped with a classicality operator ◦ and its dual non-classicality operator •. It would be interesting to define possibility and necessity measures over LETF, generalizing the probability measures defined over LETF, and to further investigate the connections between the formal notions of evidence and the graded notions of possibility and necessity.
Using paraconsistent probabilities, one obtains, in the case of this example, a value close (even if a bit inferior) to the "alarming" hypothetical scenario, helping to make a decision even when the contradictory character of the situation would make the test seem ineffective. In other words, the presence of a contradiction does not mean that we need to discard the test, if we have reasoning tools that are sensitive and robust enough.

6 Possibility and necessity measures

Possibility theory is a generalization of (or an alternative to) probability theory devoted to dealing with certain types of uncertainty by means of possibility and necessity measures. As mentioned before, it is well recognized that reasoning with contradictory premises is a critical issue, since large knowledge bases are inexorably prone to incorporate contradictions. Contradictory information comes from the fact that data is provided by different sources, or by a single source that delivers contradictory data as certain. The connections between the possibilistic and the paraconsistent paradigms are complex, and various forms of contradiction can be accommodated into possibilistic logic, defining concepts such as 'paraconsistency degree' and 'paraconsistent completion' (Dubois and Prade 2015).
Paraconsistent logics offer simple and effective models for reasoning in the presence of contradictions, as they avoid collapsing into deductive trivialism by a natural logical machinery. Taking into consideration that it is more natural and effective to reason from a contradictory information scenario than trying to remove the contradictions involved, the investigation of credal calculi concerned with necessity and possibility is naturally justified. On the one hand, possibility theory based on classical logic is able to handle contradictions, but at the cost of expensive maneuvers (Dubois and Prade 2015). On the other hand, paraconsistent logics cannot easily express uncertainty in a gradual way. The blend of both via the LFIs, in view of the operators of consistency and inconsistency, offers a simple and natural qualitative and quantitative tool for reasoning with uncertainty. The idea of defining possibility and necessity models, dubbed credal calculi, based on the Logics of Formal Inconsistency, takes advantage of the flexibility of the notions of consistency "◦" and inconsistency "•". Some basic properties of possibility and necessity functions over the Logics of Formal Inconsistency have been investigated in (Carnielli and Bueno-Soler 2017), making clear that paraconsistent possibility and necessity reasoning can, in general, attain realistic models for artificial judgement. A generic notion of logic-dependent necessity measure is given by the conditions below.
Definition 19 ((Carnielli and Bueno-Soler 2017)). A necessity function (or measure) for a language L in an LFI, called an LFI-necessity function, is a function N : L → R satisfying the following conditions, where ⊢L stands for the syntactic derivability relation of L:
1. Non-negativity: 0 ≤ N(ϕ) ≤ 1 for all ϕ ∈ L
2. Tautologicity: If ⊢L ϕ, then N(ϕ) = 1
3.
Anti-Tautologicity: If ϕ ⊢L, then N(ϕ) = 0

7 Other applications and further work

Description Logics (DLs) play an important role in the semantic web domain and in connection with computational ontologies, and incorporating uncertainty in DL reasoning has been the topic of lively research. DLs can be expanded with paraconsistent, probabilistic and possibilistic tools, or with their combinations (one example of the relevance of paraconsistent reasoning for the Semantic Web can be found in (Zhang, Lin, and Wang 2010)). Enhancing DLs with LFI-probabilities and possibility measures is research in progress, and will represent a considerable step forward for DLs in regard to the representation of more realistic ontologies.
A second problem concerns clarifying the concept of evidence. As mentioned, (Rodrigues, Bueno-Soler, and Carnielli 2020) introduces the logic of evidence and truth LETF, a Logic of Formal Inconsistency and Undeterminedness that extends the Belnap–Dunn four-valued logic and formalizes a notion of evidence as a concept weaker than truth, in the sense that there may be evidence for a proposition α even if α is not true. The paper proposes a probabilistic semantics for LETF taking into account paraconsistent and paracomplete scenarios (where the sum of the probabilities for α and ¬α, P(α) + P(¬α), is respectively greater or less than 1). Classical reasoning can be recovered when consistency and inconsistency behave within normality, that is, when P(◦α) = 1 or P(•α) = 0. In this way it is possible to obtain some new versions of standard results of probability theory. By relating the concepts of evidence and coherence, it may be possible to obtain an enhanced version of the model proposed in (Chopra and Parikh 1999). This may represent an important leap forward in the clarification of the notion of evidence, which is increasingly in demand in AI and KR.
Paraconsistent Bayesian networks is another topic of great interest. Bayesian networks are indispensable tools for expressing the dependency among events and assigning probabilities to them, thus ascertaining the effect of changes of occurrence in one event given the others. A Bayesian network can be (roughly) represented as an annotated acyclic graph (nodes connected by a set of directed edges between variables) that represents a joint (paraconsistent) probability distribution over a finite set of random variables V = {V1, . . . , Vn}. The praxis usually supposes that each variable has only a finite number of possible values (though this is not a mandatory restriction: numeric or continuous variables that take values from a set of continuous numbers can also be used). For such discrete random variables, conditional probabilities are usually represented by a table containing the probability that a child node takes on each of its values for each combination of values of its parents; that is, to each variable Vi with parents {B1, . . . , Bni} there is attached a conditional probability table relating Vi to its parents (regarded as "causes").
Paraconsistent Bayesian networks, notably when combined with paraconsistent belief revision (including (Testa, Coniglio, and Ribeiro 2017)) and with belief maintenance systems, can lead to a new approach to detecting and handling contradictions, and producing explanations for its conclusions. This is naturally relevant, for instance, in medical diagnosis, natural language understanding, forensic sciences and other areas where evidence interpretation is an important issue. Again, this is work in progress, but it seems clear that paraconsistent Bayesian networks may be useful and stimulating in a series of circumstances where contradictions are around.

Acknowledgments
The authors are grateful to the colleagues that participated in advancing the results presented in this paper.
Carnielli acknowledges support from the National Council for Scientific and Technological Development (CNPq), Brazil, under research grant 307376/2018-4, and from Modal Institute, Brasilia. Testa acknowledges support from the São Paulo Research Foundation, under research grants FAPESP 2014/22119-2 (at CLE-Unicamp, Brazil) and FAPESP 2017/10836-0 (at University of Madeira, Portugal).

References
Alchourrón, C. E.; Gärdenfors, P.; and Makinson, D. 1985. On the logic of theory change: Partial meet contraction and revision functions. The Journal of Symbolic Logic 50:510–530.
Anderson, A. R.; Belnap, N. D.; and Dunn, J. M. 1992. Entailment: The Logic of Relevance and Necessity, Volume 2. Princeton: Princeton University Press.
Asenjo, F. G. 1966. A calculus of antinomies. Notre Dame Journal of Formal Logic 7(1):103–105.
Avron, A. 2005. Non-deterministic matrices and modular semantics of rules. In Béziau, J.-Y., ed., Logica Universalis, 149–167. Basel: Birkhäuser Verlag.
Batens, D. 2001. A general characterization of adaptive logics. Logique et Analyse 44(173-175):45–68.
Batens, D. 2009. Adaptive Cn logics. In Carnielli, W.; Coniglio, M.; and D'Ottaviano, I. M. L., eds., The many sides of logic, volume 21, 27–45. London, UK: College Publications.
Bueno-Soler, J., and Carnielli, W. A. 2016. Paraconsistent probabilities: consistency, contradictions and Bayes' theorem. In Stern, J., ed., Statistical Significance and the Logic of Hypothesis Testing, Entropy 18(9). MDPI Publications. Open access at http://www.mdpi.com/1099-4300/18/9/325/htm.
Carnap, R., and Bar-Hillel, Y. 1952. An outline of a theory of semantic information. In Research Laboratory of Electronics Technical Report 247. Massachusetts Institute of Technology.
Carnielli, W. A., and Bueno-Soler, J. 2017. Paraconsistent probabilities, their significance and their uses. In Caleiro, C.; Dionisio, F.; Gouveia, P.; Mateus, P.; and Rasga, J., eds., Essays in Honour of Amilcar Sernadas, volume 10500.
London: College Publications. 197–230.
Carnielli, W., and Coniglio, M. 2016. Paraconsistent Logic: Consistency, Contradiction and Negation. New York: Logic, Epistemology, and the Unity of Science Series, Springer.
Carnielli, W., and Lima-Marques, M. 2017. Society semantics and the logic way to collective intelligence. Journal of Applied Non-Classical Logics 27(3-4):255–268.
Carnielli, W., and Marcos, J. 2002. A taxonomy of C-systems. In Carnielli, W. A.; Coniglio, M. E.; and D'Ottaviano, I. M. L., eds., Paraconsistency: The Logical Way to the Inconsistent, volume 228 of Lecture Notes in Pure and Applied Mathematics, 1–94. Marcel Dekker.
Carnielli, W.; Coniglio, M.; and Marcos, J. 2007. Logics of formal inconsistency. In Gabbay, D. M., and Guenthner, F., eds., Handbook of Philosophical Logic, volume 14, 1–93. Springer.
Chopra, S., and Parikh, R. 1999. An inconsistency tolerant model for belief representation and belief revision. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, IJCAI 99. Stockholm, Sweden.
da Costa, N. C., and Bueno, O. 1998. Belief change and inconsistency. Logique et Analyse 41(161/163):31–56.
da Costa, N. C. A. 1974. On the theory of inconsistent formal systems. Notre Dame Journal of Formal Logic 15(4):497–510.
Dubois, D., and Prade, H. 2015. Inconsistency management from the standpoint of possibilistic logic. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 23:15–30.
Fermé, E., and Hansson, S. O. 2018. Belief Change: Introduction and Overview. Switzerland: Springer Briefs in Intelligent Systems, Springer.
Fermé, E., and Wassermann, R. 2017. Iterated belief change: the case of expansion into inconsistency. In 2017 Brazilian Conference on Intelligent Systems (BRACIS), 420–425.
Gärdenfors, P. 1990. Belief revision and nonmonotonic logic: two sides of the same coin? In European Workshop on Logics in Artificial Intelligence, 52–54. Springer.
Halldén, S. 1949. The Logic of Nonsense.
Uppsala: A.-B. Lundequistska Bokhandeln.
Hansson, S. O. 1993. Reversing the Levi identity. Journal of Philosophical Logic 22(6):637–669.
Hansson, S. 1997. Semi-revision. Journal of Applied Non-Classical Logics 7(1-2):151–175.
Hansson, S. O. 1999. A Textbook of Belief Dynamics. Theory Change and Database Updating. Kluwer.
Jaśkowski, S. 1948. Rachunek zdań dla systemów dedukcyjnych sprzecznych [A propositional calculus for inconsistent deductive systems]. Studia Societatis Scientiarum Torunensis (Sectio A) 1(5):55–77.
Mares, E. D. 2002. A paraconsistent theory of belief revision. Erkenntnis 56(2):229–246.
Mendonça, B. R. 2018. Traditional theory of semantic information without scandal of deduction: a moderately externalist reassessment of the topic based on urn semantics and a paraconsistent application. Ph.D. Dissertation, IFCH, Unicamp.
Nelson, D. 1959. Negation and separation of concepts in constructive systems. In Heyting, A., ed., Constructivity in Mathematics, volume 40, 208–225. Amsterdam: North-Holland.
Priest, G. 1979. The logic of paradox. Journal of Philosophical Logic 8(1):219–241.
Priest, G. 1987. In Contradiction: A Study of the Transconsistent. Dordrecht: Martinus Nijhoff. Second edition, Oxford: Oxford University Press, 2006.
Priest, G. 2001. Paraconsistent belief revision. Theoria 67:214–228.
Restall, G., and Slaney, J. 1995. Realistic belief revision. In De Glas, M., and Pawlak, Z., eds., Proceedings of the Second World Conference in the Fundamentals of Artificial Intelligence, 367–378. Paris: Angkor.
Rodrigues, A.; Bueno-Soler, J.; and Carnielli, W. A. 2020. Measuring evidence: a probabilistic approach to an extension of Belnap-Dunn logic. Synthese. In print.
Schotch, P.; Brown, B.; and Jennings, R. 2009. On Preserving: Essays on Preservationism and Paraconsistent Logic. Toronto: University of Toronto Press.
Tamminga, A. 2001. Belief Dynamics: (Epistemo)logical Investigations. Ph.D. Dissertation, Institute for Logic, Language and Computation, Universiteit van Amsterdam.
Tanaka, K. 2005.
The AGM theory and inconsistent belief change. Logique et Analyse 48:113–150.
Testa, R.; Fermé, E.; Garapa, M.; and Reis, M. 2018. How to construct remainder sets for paraconsistent revisions: Preliminary report. In Proceedings of the 17th International Workshop on Non-Monotonic Reasoning.
Testa, R.; Coniglio, M.; and Ribeiro, M. 2015. Paraconsistent belief revision based on a formal consistency operator. CLE e-prints 15(8).
Testa, R.; Coniglio, M.; and Ribeiro, M. 2017. AGM-like paraconsistent belief change. Logic Journal of the IGPL 25(4):632–672.
Testa, R. R. 2014. Revisão de Crenças Paraconsistente baseada em um operador formal de consistência [Paraconsistent Belief Revision based on a formal consistency operator]. Ph.D. Dissertation, IFCH, Unicamp.
Testa, R. 2015. The cost of consistency: information economy in paraconsistent belief revision. South American Journal of Logic 1(2):461–480.
Testa, R. 2020. Judgment aggregation and paraconsistency. (Working manuscript.)
Thomason, R. 2020. Logic and artificial intelligence. In Zalta, E. N., ed., The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, summer 2020 edition.
Vasiliev, N. 1912. Imaginäre (nichtaristotelische) Logik [Imaginary (non-Aristotelian) logic]. In Zhurnal m–va nar. prosveshcheniya, volume 40, 207–246.
Zhang, X.; Lin, Z.; and Wang, K. 2010. Towards a paradoxical description logic for the semantic web. In Link, S., and Prade, H., eds., Foundations of Information and Knowledge Systems, 306–325. Springer.

Towards Interactive Conflict Resolution in ASP Programs

Andre Thevapalan1, Gabriele Kern-Isberner2
1,2 Department of Computer Science, TU Dortmund University, Dortmund, Germany
1 andre.thevapalan@tu-dortmund.de, 2 gabriele.kern-isberner@cs.tu-dortmund.de

Abstract
of each conflict is done in interaction with the user. For each conflict, suggestions on how to resolve the conflict are generated. Each conflict is presented to the expert together with the matching suggestions, of which the expert can choose one.
The active involvement of the expert guarantees an updated program which represents the knowledge of the expert in the best possible way. Especially in sensitive fields like medical therapies, it is of the utmost importance that the encoded knowledge remains correct. The presented approach ensures this not only by providing transparency, showing the expert where the conflicts are located and what modifications are made, but also by actively involving the expert. The presentation of the interactive update approach is furthermore accompanied by a running example which depicts a scenario where the knowledge about the determination of therapies for cancer patients has to be updated with new, but conflicting, knowledge. Note that instead of the update procedure of (Eiter et al. 2002), any other ASP update system that produces answer sets containing information on conflicting rules could be used for our interactive update approach. The rest of the paper is organised as follows: Section 2 provides some necessary preliminaries regarding answer set programming. In Section 3 we explain the update of extended logic programs by causal rejection as presented in (Eiter et al. 2002). In Sections 4 and 5 we introduce our approach to detect conflicts when updating extended logic programs by using the mechanism mentioned in Section 3, and to resolve the conflicts interactively with an expert. Section 6 deals with related work. This paper ends in Section 7 with a short summary and a discussion of further extensions and improvements regarding the presented approach. For reasons of readability, we moved the larger programs of the running example to the appendix.
Updating knowledge bases often requires topical expertise as to whether prior knowledge should be corrected, simply deleted, or merged with the new information.
In this paper we introduce a formalism to update non-disjunctive ASP programs interactively with the user, generating suitable suggestions on how to solve each conflict, based on an ASP update procedure by Eiter et al. The main goal is the development of a lean method to efficiently update ASP programs by highlighting possible causes for conflicts, generating solution suggestions for the user, and eventually modifying parts of the program in a guided way, so that an updated, conflict-free program results.

1 Introduction

In (Thevapalan et al. 2018) a prototype decision support system for mammary carcinoma therapy plans (MAMMA-DSCS) was introduced. The core of this system is based on answer set programs (ASP) with the extension HEX, which allows answer set programs to connect to external sources (Eiter et al. 2005). MAMMA-DSCS was motivated by the steady growth of knowledge in the medical sector, especially in the oncological field. At a fast pace, new drugs and therapies are developed, and new ways are found to detect specific cancer subtypes (e.g., specific gene markers) which improve the therapy possibilities. Thus an application which has a logic program as its core component was introduced. Rule-based systems like ASP offer a declarative paradigm which allows the extension of a program by simply adding more rules. However, despite the declarative paradigm, updating an existing ASP program with additional rules can be quite complex due to the emergence of contradictions the rules can potentially cause. In this paper, we present a method to update ASP programs interactively by handling all conflicts that arise jointly with the user. Figure 1 gives an overview of the general approach of the method presented in this paper. There, a logic program P1 has to be updated with new information P2. Basically, the original programs P1, P2 are successively modified into programs P̂1, P̂2 by altering the conflict-causing rules.
At the end of the process the update can be realized by simply uniting P̂1, P̂2 because all conflicts have been resolved before. To find all conflicts we modify an approach to update answer set programs presented in (Eiter et al. 2002). The resolution

2 Preliminaries

In this paper we look at non-disjunctive extended logic programs (ELPs) (Gelfond and Lifschitz 1991). An ELP is a finite set of rules over a set A of propositional atoms. A literal L is either an atom A (positive literal) or a negated atom ¬A (negative literal). For a literal L the complementary literal L̄ is ¬A if L = A, and A otherwise. For a set X of literals, X̄ = {L̄ | L ∈ X} is the set of corresponding complementary literals. Then LitA denotes the set A ∪ Ā of
blocking its applicability whenever the newer rule is applicable. In this paper, we will deal only with update sequences of length n = 2 because we focus on situations where a consistent ELP is available, and some new information has to be integrated in such a way that a consistent ELP results that represents the current knowledge. Furthermore, in medical environments like hospitals new information is usually provided and implemented periodically rather than continuously, and after the update, a new consistent view is expected that will be the base for the next update. Therefore, in the following we will briefly describe the translation explicitly for an update sequence of length n = 2. Given an update sequence P = (P1, P2) over A, the set A∗ is the extension of A by pairwise distinct atoms rej(r) and Ai, for each rule r occurring in P, each atom A ∈ A, and each i ∈ {1, 2}. The literal which is created by replacing the atomic formula A of a literal L by Ai will be denoted by Li.
Figure 1: Overview of the whole interactive update procedure ((P1, P2); conflict detection; AS(P1 ◭ P2); interactive conflict resolution with the user; P̂1 ∪ P̂2)
all literals over A. A default-negated literal L is written as not L. A rule r is of the form L0 ← L1, . . .
, Lm, not Lm+1, . . . , not Ln. with literals L0, . . . , Ln and 0 ≤ m ≤ n. The literal L0 is the head of r, denoted by H(r), and {L1, . . . , Lm, not Lm+1, . . . , not Ln} is the body of r, denoted by B(r). Furthermore {L1, . . . , Lm} is denoted by B+(r) and {Lm+1, . . . , Ln} by B−(r). A rule r with B(r) = ∅ is called a fact, and if H(r) = ∅ rule r is called a constraint. A set of literals is inconsistent if it contains complementary literals. A set X of non-complementary literals is called an interpretation. An interpretation X is called a model of an ELP P if for every rule r ∈ P the following holds: H(r) ∈ X whenever B+(r) ⊆ X and B−(r) ∩ X = ∅. The reduct P^X of a program P relative to a set X of literals is defined by P^X = {H(r) ← B+(r). | r ∈ P, B−(r) ∩ X = ∅}. An answer set of an ELP P is an interpretation X which is a ⊆-minimal model of P^X (Gelfond and Lifschitz 1991). The set of all answer sets of a program P will be denoted by AS(P), and P is called consistent iff AS(P) ≠ ∅. We say a literal L is derivable in an ELP P iff L ∈ ⋃AS(P).
Definition 2 (Update program (Eiter et al. 2002)). Given an update sequence P = (P1, P2) over a set of atoms A, the update program P⊳ = P1 ⊳ P2 over A∗ consists of the following items:
(i) all constraints occurring in P1, P2;
(ii-a) for each r ∈ P1: L1 ← B(r), not rej(r). if H(r) = L;
(ii-b) for each r ∈ P2: L2 ← B(r). if H(r) = L;
(iii) for each r ∈ P1: rej(r) ← B(r), L̄2. if H(r) = L;
(iv) for each literal L occurring in P: L1 ← L2.; L ← L1..
Note that transformations of type (ii-a) are only applied to P1 because there is no P3. Consequently, the rules of P2 do not need to be modified (see (ii-b)). The answer set of an update sequence P is the projection of the answer set of the update program P⊳ onto the set of atoms A.

3 Update by causal rejection

Our approach to detect conflicts is based on the update of extended answer set programs by causal rejection (Eiter et al. 2002).
In that approach the extended logic programs are given in a sequence (P1 , . . . , Pn ) of ELPs, where each Pi updates the information encoded in (P1 , . . . , Pi−1 ). Definition 3 (Update Answer Set (Eiter et al. 2002)). Let P = (P1 , P2 ) be an update sequence over a set of atoms A. Then S ⊆ LitA is an update answer set of P iff S = S ′ ∩ A for some answer set S ′ of P⊳ . Definition 1 (Update sequence (Eiter et al. 2002)). An update sequence P = (P1 , . . . , Pn ) is a series of consistent ELPs over A, where A is the set of all atoms occurring in P1 , . . . , Pn and where Pi contains newer information than Pi−1 . To illustrate the update mechanism we present an example which represents a possible scenario in a medical setting. Example 1. Consider the following extended logic program P1 : An update sequence P = (P1 , . . . , Pn ) is translated into a single program P⊳ which encodes the information of P and whose answer sets represent the answer sets of P. Informally the translated program P⊳ merges the information of the programs in P but in case of conflicting rules, P⊳ rejects the rule of the program with the older information by r1 : tnbc met. r2 : pdl pos. r3 : treat. 30 r4 : mono th ← treat, tnbc met, not visc crisis. can not hold which consequently prevents a conflict. Similarly the information that the suggested therapy should be a monotherapy is rejected via the rej(r4 )-literal. From the medical experts’ point of view not only the resulting updated and consistent program is important - for them the information about explicit rule rejections is crucial and have to be further analyzed. Generally it is important to know which rules were rejected and especially why they were rejected. To be precise, from a medical expert’s point of view the following questions are relevant regarding rule rejections: r5 : mono th ← treat, tnbc met, visc crisis. r6 : nab pt ← treat, tnbc met. r7 : carbopl ← treat, tnbc met, visc crisis. r8 : low success ← treat, tnbc met. 
Let P1 encode the following scenario: A patient (let us call her Agent A) has a metastasized triple negative breast cancer (r1), and one of the tests showed that the tumor is also PD-L1-positive (r2). The patient is getting treated at a cancer clinic (r3). According to the clinic's guidelines, Agent A should get treated with a drug called nab-paclitaxel (r6). Usually, the treatment with a single drug (monochemotherapy) is indicated (r4). But if Agent A's cancer is rapidly progressing due to severe organ dysfunction (visceral crisis), a more aggressive approach can be chosen by additionally treating with carboplatin (r7). The use of multiple drugs in a chemotherapy is called polychemotherapy, which in this scenario can be interpreted as the opposite resp. negation of a monochemotherapy (r5). Generally though, the treatment of metastasized tnbc is known to have a low success rate, and the chemotherapy is a palliative treatment (r8). However, recent studies show (cf. (Schmid et al. 2018; Schneeweiss et al. 2019)) that for PD-L1-positive patients who are not in a visceral crisis, the treatment with an additional immunotherapy consisting of the PD-L1-inhibitor atezolizumab is advisable (r9). Such a combination therapy with atezolizumab and nab-paclitaxel (r10) can prolong the life of a tnbc patient significantly (r11). This is encoded in the following program P2:
r9: atzmab ← treat, tnbc met, pdl pos, not visc crisis.
(Q1) Which rule r in P1 was rejected?
(Q2) Which rule r′ in P2 caused the rejection?
(Q3) What are the options to handle the rejection?
(Q3a) Remove r and/or r′ completely?
(Q3b) Modify the body of r? If so, how?
(Q3c) Modify the body of r′? If so, how?
(Q3d) Are there deeper causes of the rejection (which rule led to the applicability of r′, etc.)?
But neither the update answer set S′ nor the answer set S shows information on how these conflicts have arisen. On the basis of the update program P⊳ one is only able to see which rules in P1 were rejected.
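The causal-rejection translation applied above is mechanical enough to sketch in a few lines. In the sketch below, the (head, body) rule encoding, the atom naming, and the reading of r11's head as the complement ¬low_success (which the rejection of r8 discussed in the example requires) are our own assumptions, not the paper's notation:

```python
# Sketch of the Definition-2 translation for an update pair (P1, P2).
# Rules are (head, body) pairs; "-X" encodes classical negation ¬X and
# "not X" default negation. This mirrors items (ii-a), (ii-b), (iii), (iv).

def compl(lit):                          # complementary literal
    return lit[1:] if lit.startswith("-") else "-" + lit

def lvl(lit, i):                         # the indexed literal L_i
    return f"{lit}_{i}"

def update_program(p1, p2):
    out = []
    for name, (h, b) in p1.items():
        # (ii-a): old rule, derivable at level 1 unless rejected
        out.append((lvl(h, 1), b + [f"not rej({name})"]))
        # (iii): reject the old rule if the newer level derives the complement
        out.append((f"rej({name})", b + [lvl(compl(h), 2)]))
    for name, (h, b) in p2.items():
        # (ii-b): new rules are kept, indexed at level 2
        out.append((lvl(h, 2), b))
    # (iv): pass literals down from level 2 to level 1 to the base language
    for l in {h for h, _ in list(p1.values()) + list(p2.values())}:
        out.append((lvl(l, 1), [lvl(l, 2)]))
        out.append((l, [lvl(l, 1)]))
    return out

# r8 from P1, and our reading of r11 from P2 with head ¬low_success:
p1 = {"r8": ("low_success", ["treat", "tnbc_met"])}
p2 = {"r11": ("-low_success", ["treat", "tnbc_met", "atzmab"])}
for rule in update_program(p1, p2):
    print(rule)
```

For this input the translation emits the pair ('rej(r8)', ['treat', 'tnbc_met', '-low_success_2']), matching the rejection rule for r8 discussed in the running example.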
In the following we will describe how we extend the update procedure to use it for an interactive update process. 4 Conflict Detection With (Eiter et al. 2002) it is possible to compute information about syntactical correlations between two programs. We will use this as meta-information to detect conflicts. In this paper we will define conflicts via conflicting rules. r10 : mono th ← treat, tnbc met, nab pt, atzmab. Definition 4 (Conflicting Rules, Conflict). Let P be an ELP and let LitP be the set of all literals derivable in P . Two rules r, r′ ∈ P are conflicting if H(r) and H(r′ ) are complementary and there exists an interpretation X ⊆ LitP such that B(r) and B(r′ ) are true in X. A conflict is a pair (r, r′ ) of rules such that r, r′ are conflicting. r11 : low success ← treat, tnbc met, atzmab, mono th. Which treatment would the clinic recommend to Agent A now, and how can the new information P2 be integrated into the prior knowledge P1 ? Following definition 2 we can generate the update program P⊳ = P1 ⊳ P2 (cf. Appendix A.1) where the only answer set of P⊳ is S ′ = {tnbc met1 , tnbc met, pdl pos1 , pdl pos, treat1 , treat, nab pt1 , nab pt, atzmab2 , atzmab1 , atzmab, mono th2 , mono th1 , mono th, low success2 , low success1 , low success, rej(r4 ), rej(r8 )}. Consequently, we get the update answer set of P = (P1 , P2 ) by S = S ′ ∩ A ={tnbc met, pdl pos, treat, nab pt, atzmab, mono th, low success}. Note that the request for an ELP to be conflict-free is stronger than for it to be consistent as a conflict-free ELP is consistent but not vice versa. Instead of automating the update of programs we will compute suggestions for resolving these conflicts. Therefore we extend the meta-information given in an update program by modifying the update method in (Eiter et al. 2002). In order to use the update program itself to control the update process we add the possibility to recognize the immediate cause of a rejection. 
Therefore in addition to the rej-atoms we will introduce rej cause- and active-atoms to enable the backtracking of rejections. To realize these modifications our generated update program P◭ will be over a set of atoms A∗∗ which is the extension of A∗ by pairwise distinct atoms rej cause(r, r′ ), active(r), for each rule r and for each pair of rules r 6= r′ occurring in P1 , P2 . With low success ∈ S one can see that the therapy’s expected success is updated correctly. Indeed, the new information P2 is crucial for helping Agent A effectively. By the rules low success1 ←treat, tnbc met, not rej(r8 )., rej(r8 ) ←treat, tnbc met, low success2 . in P⊳ it is guaranteed that the update program does not conclude low success if newer information suggests otherwise. Literal rej(r8 ) being in answer set S ′ shows that r8 Definition 5 (Modified update program). Given an update sequence P = (P1 , P2 ) over A the modified update program 31 holds: rej(r) ∈ S through r iff there exists a rule r′ in P2 such that H(r′ ) = L and rej cause(r, r′ ) ∈ S + . For literal rej cause(r, r′ ) we have: rej cause(r, r′ ) ∈ S + iff B(r) is true in S and active(r′ ) ∈ S + . Furthermore, we have active(r′ ) ∈ S + if B(r′ ) is true in S. Hence, the following holds: rej(r) ∈ S iff B(r) is true in S, and there exists a rule r′ in P2 such that H(r′ ) = L and B(r′ ) is true in S. Consequently, we have: L1 ∈ S through r iff B(r) is true in S, and there is no rule r′ in P2 such that H(r′ ) = L and B(r′ ) is true in S. Again, both update procedures employ equivalent strategies to derive literals from A∗ in their answer sets. Furthermore, due to these considerations, it is clear that for each answer set S ∈ AS(P⊳ ), there is an answer set S + ∈ AS(P◭ ) with S = S + ∩ A∗ , and, the other way around, for each S + ∈ AS(P◭ ), S = S + ∩ A∗ is an answer set of AS(P⊳ ). 
(MUP) P◭ = P1 ◭ P2 over A∗∗ consists of the following rules: (m-i) all constraints occurring in P1 , P2 ; (m-ii-a) for each r ∈ P1 : L1 ← B(r), not rej(r). if H(r) = L; (m-ii-b) for each r′ ∈ P2 : L2 ← B(r′ ). if H(r′ ) = L; active(r′ ) ← B(r′ ).; (m-iii) for each r ∈ P1 if there exists a rule r′ ∈ P2 such that H(r), H(r′ ) are complementary: rej cause(r, r′ ) ← B(r), active(r′ ). Corollary 1. Let P = (P1 , P2 ) be an update sequence over a set of atoms A. Then for the set AS(P⊳ ) of answer sets of the update program and the set AS(P◭ ) of answer sets of the modified update program, the following holds: AS(P⊳ ) = {S | S = S + ∩ A∗ , S + ∈ AS(P◭ )} Example 2. The update program P◭ of the modified approach can be found in Appendix A.2. The only answer set of P◭ = P1 ◭ P2 is T ′ = {tnbc met1 , tnbc met, pdl pos1 , pdl pos, treat1 , treat, nab pt1 , nab pt, atzmab2 , atzmab1 , atzmab, active(r9 ), mono th2 , mono th1 , mono th, active(r10 ), low success2 , low success1 , low success, active(r11 ), rej cause(r4 , r10 ), rej(r4 ), rej cause(r8 , r11 ), rej(r8 )}. The update answer set of the update sequence P = (P1 , P2 ) with the modified approach is T = T ′ ∩ A = {tnbc met, pdl pos, treat, nab pt, atzmab, mono th, low success}. We can see that the update answer set S in example 1 and answer set T in example 2 are identical. However the answer sets of a modified update program P◭ (e. g. answer set T ′ in example 2) enables us to analyze the rejections and its causes. With the modified approach we can detect the immediate causes of rejections. In the previous example the rules low success2 ←treat, tnbc met, atzmab, mono th., rej(r) ← rej cause(r, r′ ). (m-iv) for each literal L occurring in P: L1 ← L2 .; L ← L1 .. We can show that our modifications to the original approach in (Eiter et al. 2002) only adds meta-information in form of custom literals to each answer set of the update program without changing the (intended) update answer sets themselves. Proposition 1. 
Let P = (P1 , P2 ) be an update sequence over a set of atoms A. Then for every answer set S ∈ AS(P⊳ ) there exists a corresponding answer set S + ∈ AS(P◭ ) such that S = S + ∩ A∗ , meaning S + is a composition of all literals in S and possibly additional active- and rej cause-literals. Conversely, for answer sets S + ∈ AS(P◭ ) S = S + ∩ A∗ is an answer set of P⊳ . Proof. Let P = (P1 , P2 ) be an update sequence over a set of atoms A. Let r be a rule in P2 and H(r) = L. Then for each answer set S ∈ AS(P⊳ ) we have: L2 ∈ S through r iff B(r) is true in S. Likewise, for every answer set S + ∈ AS(P◭ ) with S = S + ∩ A∗ we have: L2 ∈ S through r iff B(r) is true in S and iff active(r) ∈ S + . This shows that the conditions for L2 to be derived via r on the base of A∗ are identical. Now, let r be a rule in P1 and H(r) = L. Then for each answer set S ∈ AS(P⊳ ) we have: L1 ∈ S through r iff B(r) is true in S and rej(r) ∈ / S. For literal rej(r), the following holds: rej(r) ∈ S iff B(r) is true in S and L2 ∈ S. Consequently, we have: rej(r) ∈ S iff B(r) is true in S and there exists a rule r′ ∈ P2 such that H(r′ ) = L and B(r′ ) is true in S. Altogether, we have L1 ∈ S through r iff B(r) is true in S, and there is no rule r′ ∈ P2 such that H(r′ ) = L and B(r′ ) is true in S. Likewise, for each answer set S + ∈ AS(P◭ ) with S = S + ∩ A∗ we have: L1 ∈ S through r iff B(r) is true in S and rej(r) ∈ / S. For literal rej(r) ∈ S, the following active(r11 ) ←treat, tnbc met, atzmab, mono th., rej cause(r8 , r11 ) ←active(r11 ). make sure that the answer set of P◭ contains the literal rej cause(r8 , r11 ). This tells us that the knowledge about the recommended therapy having a low success rate is ignored due to the statement given in rule r11 ∈ P2 . The other reject-literal rej cause(r4 , r10 ) ∈ T ′ indicates, that rule r4 is also rejected. 
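The conflict test of Definition 4 is purely syntactic and easy to mechanize. The following sketch is our own illustration, not code from the paper; the rule encoding (heads and bodies as strings, classical complement written with a leading "-") is a hypothetical choice:

```python
# Illustrative sketch of the conflict test in Definition 4 (not the
# authors' implementation; the rule encoding is our own). A literal is a
# string; the classical complement of "l" is "-l". Default negation is
# written "not l" inside rule bodies.
def complement(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def conflicting(r, r_prime):
    """A pair (r, r') of rules, each (head, body), conflicts iff the
    heads are complementary and B(r) and B(r') can be jointly true."""
    (h, b), (h_p, b_p) = r, r_prime
    if h_p != complement(h):
        return False
    body = b | b_p
    pos = {l for l in body if not l.startswith("not ")}
    neg = {l[4:] for l in body if l.startswith("not ")}
    # jointly satisfiable: no literal both required and default-negated,
    # and no two complementary literals required positively
    return pos.isdisjoint(neg) and not any(complement(l) in pos for l in pos)

r4  = ("mono_th",  {"treat", "tnbc_met", "not visc_crisis"})
r5  = ("-mono_th", {"treat", "tnbc_met", "visc_crisis"})
r10 = ("-mono_th", {"treat", "tnbc_met", "nab_pt", "atzmab"})
print(conflicting(r4, r10))  # True: the conflict (r4, r10) of Example 1
print(conflicting(r4, r5))   # False: visc_crisis blocks joint truth
```

Note that (r4, r5), despite having complementary heads, is correctly classified as non-conflicting, since no interpretation makes both bodies true.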
The conflict detection therefore provides a way to locate each conflict between two programs of an update sequence P by using the corresponding MUP P◭, as each literal rej_cause(r, r′) in an answer set of P◭ represents a conflict (r, r′). After the determination of all conflicts in P we can now look at how to generate suggestions for resolving each conflict.

5 Interactive Conflict Resolution

As mentioned above, instead of an automated, rule-based update of ELPs according to (Eiter et al. 2002), we propose an interactive conflict resolution approach. It uses the meta-information given in the MUP P◭ of an update sequence P = (P1, P2) to recognize the conflicts which the two programs P1, P2 cause. The goal is to gradually modify P1, P2 such that the resulting programs P̂1 and P̂2 do not contain conflicting rules. Figure 2 shows the components of the interactive part of the update process. For every conflict, suitable suggestions are generated based on P◭.

[Figure 2: Interactive conflict resolution. The cycle: suggestions for conflict c are generated; conflict c and the suggestions are displayed to the user; a suggestion is selected by the user; the rule is modified.]

These suggestions are modifications of the original rules involved in the conflict. The rules involved in the conflict have to be shown to the user, and solutions in the form of possible rule modifications are suggested. The user can choose the most suitable modification, which will then be applied to the corresponding modified logic program (e.g. P̂1). This interaction can be done for each conflict. The original programs of the update sequence are thereby successively modified in such a way that the update can be realized by simply uniting the two modified programs without creating conflicts.

In the previous section we showed how to detect conflicts in an update sequence P = (P1, P2). To detect the conflicts, we have to look at every answer set S+ of the MUP P◭ = P1 ◭ P2. Each rej_cause(r, r′) ∈ S+ represents a conflict. In the proof of the modified approach it is shown that rej_cause(r, r′) ∈ S+ iff the following holds: r ∈ P1, r′ ∈ P2, and B(r), B(r′) are true in S+. This means that, to resolve a conflict, it is necessary to manipulate the rules r and r′ such that the modified rules are no longer conflicting and hence can replace the rules r, r′. As one can see, there can be a large number of possibilities to resolve a single conflict, mainly the addition, removal, or modification of rules. In the remainder of this paper we focus our attention on the most difficult case and generate suggestions for the modification of rules. Therefore, we adhere to the following principle: a conflict between two rules r, r′ is resolved by modifying only r (as it stems from the program with the older knowledge), where B(r) is modified using literals which occur in B(r), B(r′). The actual absence of conflicts will be ensured by following a principle which is also exploited in (Eiter et al. 2002): in step (ii-a) of Definition 2 the original rule r ∈ P1 is extended by not rej(r), which prevents that r and a rule r′ ∈ P2 hold simultaneously.

In our approach a suggestion for a conflict (r, r′) consists of an alternative rule r̂ for r. Similar to the extension of a rule in step (ii-a), we will extend r to r̂ by adding body-literals which ensure that r̂ and r′ are not conflicting. The extension will be realized by adding body-literals which are determined by comparing B(r) and B(r′).

Proposition 2. Let r, r′ be two conflicting rules and let Pot(C) = {C′ | C′ ⊆ C} be the powerset of C = B(r′) − B(r). Furthermore, let r̂ be a possible modification of r with

r̂: H(r) ← B(r), C′not.

where C′ ∈ Pot(C) and C′not = { not c | c ∈ C′ }, with not not c ≡ c. Then for every non-empty set C′ ∈ Pot(C) the rules r̂, r′ are non-conflicting.

Example 3. For the conflict between r4 and r10 we get C = {nab_pt, atzmab}. Consequently, the following modifications are possible:

r̂4: mono_th ← treat, tnbc_met, not visc_crisis, not nab_pt.
r̂4′: mono_th ← treat, tnbc_met, not visc_crisis, not atzmab.
r̂4′′: mono_th ← treat, tnbc_met, not visc_crisis, not nab_pt, not atzmab.

For the conflict between r8 and r11 we get C = {atzmab, ¬mono_th}. This leads to the following possible modifications:

r̂8: low_success ← treat, tnbc_met, not atzmab.
r̂8′: low_success ← treat, tnbc_met, not ¬mono_th.
r̂8′′: low_success ← treat, tnbc_met, not atzmab, not ¬mono_th.

Note that the suggestions r̂4′′ and r̂8′′ contain all literals of their respective set C. If the human expert is not able to choose a suggestion for a conflict, this type of suggestion can be chosen by default. For each conflict (r, r′) the fallback solution would then be the modification of r with

r̂: H(r) ← B(r), Cnot.

where Cnot = { not c | c ∈ (B(r′) − B(r)) }. After resolving all detected conflicts we get two programs P̂1, P̂2 whose union results in a conflict-free ELP.

Example 4. To resolve conflict (r4, r10) the medical expert would choose r̂4′, as, besides a visceral crisis, the drug atezolizumab is primarily relevant for the decision whether the patient should get a monotherapy or not. Likewise, for conflict (r8, r11) the expert would choose r̂8′, as the low success rate of the longer-known monotherapy continues to apply. Therefore one result of modified programs would be P̂2 = P2 and P̂1:

r1: tnbc_met.
r2: pdl_pos.
r3: treat.
r4: mono_th ← treat, tnbc_met, not visc_crisis, not atzmab.
r5: ¬mono_th ← treat, tnbc_met, visc_crisis.
r6: nab_pt ← treat, tnbc_met.
r7: carbopl ← treat, tnbc_met, visc_crisis.
r8: low_success ← treat, tnbc_met, not ¬mono_th.

Given the update sequence P = (P1, P2), the update of P1 with P2 is therefore the conflict-free program P̂ with P̂ = P̂1 ∪ P̂2.

As one can see, the strategy in Proposition 2 can potentially lead to multiple suggestions per conflict. In these cases the human expert, who is informed about each conflict, has to actively step in and choose the suggestion which is, according to the expert's knowledge, the most suitable one. On the one hand, this ensures full transparency of the update sequence to the expert regarding the modifications. On the other hand, the approach creates an updated program whose professional suitability is guaranteed by the expert. In this context we say a program is professionally suitable if the represented knowledge is suitable according to the experts.

Proposition 3. Let P = (P1, P2) be an update sequence. The interactive conflict resolution modifies the rules in P1, P2 such that the union P̂ = P̂1 ∪ P̂2 of the corresponding modified programs P̂1, P̂2 is conflict-free and professionally suitable.

It is important to note that the strategy for the resolution of conflicts defined in Proposition 2 prevents the creation of new conflicts when modifying rules.

Proposition 4. Let (r, r′) be a conflict. After the conflict resolution according to Proposition 2, the resulting rules r̂, r′ cannot be extended (by adding literals to the respective rule bodies) such that they become conflicting again.

Proof. Let (r, r′) be a conflict and r̂, r′ the resulting non-conflicting rules after the extension of r according to Proposition 2. Then there exists a literal L such that (1) L ∈ B+(r̂) and L ∈ B−(r′), or (2) L ∈ B−(r̂) and L ∈ B+(r′). Let P be an ELP with {r̂, r′} ⊆ P, let S be an answer set of P, and let Lh = H(r̂) and L̄h = H(r′), where Lh and L̄h are complementary. In case (1) the following holds: if B(r̂) is true in S, then L ∈ S and consequently B(r′) cannot be true in S. This implies that L̄h ∉ S whenever Lh ∈ S. If B(r′) is true in S, then B(r̂) cannot be true in S. This in turn implies that Lh ∉ S whenever L̄h ∈ S. The line of argumentation holds analogously in case (2). □

This approach therefore provides a way to update an ELP by interactively modifying the programs of the update sequence such that the conflicts between the programs are eliminated while preserving the professional suitability of the updated program.

6 Related Works

In the context of logic programs, instead of program sequences, the connection of different knowledge bases can often be found in multi-agent systems. In (Vos et al. 2005b) a multi-agent architecture is presented which allows deductive reasoning via Ordered Choice Logic Programs (OCLP). OCLP is an extension of ASP which allows choice rules and a preferential order over rule sets. Each agent is encoded as an OCLP and can communicate with other agents. Compared to the approach in this paper, the knowledge update in (Vos et al. 2005b) is realized by the exchange of information between the agents. An agent's knowledge can be updated by the incoming information. Although an extension of ASP is used, negation is not allowed explicitly, and therefore contradictions are not directly possible. But each agent has specific goals in the form of rules and facts. Incoming information is only incorporated into the agent's knowledge if the information does not contradict the agent's goals. The authors mention that negation and contradictory information could be implemented and handled, among others, by removing knowledge or by adding a notion of trust between agents (Vos et al. 2005a). The implementation in (Vos et al. 2005b) allows a human agent. Similar to our approach, the human agent can control the agents' updates if needed. But unlike our approach, the multi-agent platform is designed to run mostly autonomously, while the updating of each agent's knowledge is mainly done in an automated manner using the preferential order and choice rules provided in OCLPs.

7 Conclusion and Future Work

In this paper we presented a method to interactively update the ELPs of an update sequence together with an expert. We used the approach in (Eiter et al. 2002) to find all conflicts between the programs. To resolve each conflict we defined a strategy to generate possible modifications of conflicting rules which resolve the conflict. To ensure professional suitability, for each conflict the expert can choose the suggestion which is the most suitable. This procedure leads to the successive modification of the programs given in the update sequence, resulting in modified ELPs which do not have conflicting rules and can therefore be updated by simply uniting the programs. During the interactive conflict resolution the expert maintains full control over each modification.

Revisiting the questions relevant to the medical experts in Section 3, the modified approach delivers the following improvements: questions (Q1) and (Q2) can be answered by presenting the information of the rej_cause-literals in AS(P◭). Question (Q3a) should be answered and acted upon solely by the expert. In Section 5 we delivered answers to questions (Q3b) and (Q3c). Regarding question (Q3d), further research is conceivable. The presented approach only generates modification suggestions for rules directly involved in a conflict. This purely syntactic procedure can be limiting, considering that, due to the active involvement of the expert, the expert's knowledge is already available. It is possible that the actual conflict lies in other rules which are not part of the conflict pair itself. One possible solution to find deeper causes of a conflict could be the active inclusion of the expert. This enables the search for the cause of a conflict in all rules of the update sequence on a professional level. One can also consider implementing the search for rules involved in a conflict syntactically, together with the computation of matching resolutions. One can also consider various semantical extensions, such as the support of default negation in rule heads.
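The suggestion step of Proposition 2, which underlies our answers to questions (Q3b) and (Q3c), can be mechanized directly. The following sketch is our own illustration with a hypothetical rule encoding, not code from the paper; it enumerates all candidate modifications r̂ for a conflict (r, r′):

```python
from itertools import combinations

# Illustration of Proposition 2 (hypothetical encoding, not the paper's
# code): for a conflict (r, r'), each non-empty subset C' of
# C = B(r') - B(r) yields a candidate r_hat: H(r) <- B(r), {not c | c in C'},
# where "not not c" collapses to c.
def default_negate(lit):
    return lit[4:] if lit.startswith("not ") else "not " + lit

def suggestions(r, r_prime):
    (head, body), (_, body_p) = r, r_prime
    c = sorted(body_p - body)            # the set C, in a fixed order
    for k in range(1, len(c) + 1):       # every non-empty subset of C
        for subset in combinations(c, k):
            yield (head, body | {default_negate(l) for l in subset})

r4  = ("mono_th",  {"treat", "tnbc_met", "not visc_crisis"})
r10 = ("-mono_th", {"treat", "tnbc_met", "nab_pt", "atzmab"})
for head, body in suggestions(r4, r10):
    print(head, "<-", ", ".join(sorted(body)))
```

On the conflict (r4, r10), C = {nab_pt, atzmab}, so the sketch yields exactly the three candidates of Example 3.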
(Slota, Baláž, and Leite 2014) point out that the consideration of both types of negation leads to more fine-grained control over each atom. Especially in medical scenarios, a distinction between positive and negative test results (strong negation) and the absence of symptoms (default negation) is important. By allowing default negation in rule heads, rules can be defined more precisely, and the general conflict potential when updating a program could be mitigated.

Another useful improvement would be the extension of the conflict detection. Currently, the detection of conflicts depends on the facts given in the programs of an update sequence. Looking at the running example of the paper, one can imagine a scenario where a patient with different patient data is given. This results in a different set of facts, and the update then has to be executed specifically for this patient. The detection of conflicts independently of the program's facts would save the time and effort of computing the update for each patient.

Furthermore, as mentioned in the introduction, the method to detect conflicts can be switched out. This also applies to the method of generating possible rule modifications. Possible implementation approaches could be the manual conflict resolution by the expert, the complete removal of rules with older knowledge, or generating modifications of rules with newer knowledge. For larger programs one can consider improving the computation of answer sets, and implicitly the detection of conflicts, by using approaches like multi-shot ASP solving developed by (Gebser et al. 2019). This approach enables more control when grounding and solving an ELP, which would lead to faster and more efficient interaction processes.

Acknowledgements

We would like to thank the anonymous reviewers for their helpful suggestions and comments.

A Update Programs of Example

A.1 Update program P⊳

tnbc_met1 ← not rej(r1).
pdl_pos1 ← not rej(r2).
treat1 ← not rej(r3).
mono_th1 ← treat, tnbc_met, not visc_crisis, not rej(r4).
¬mono_th1 ← treat, tnbc_met, visc_crisis, not rej(r5).
nab_pt1 ← treat, tnbc_met, not rej(r6).
carbopl1 ← treat, tnbc_met, visc_crisis, not rej(r7).
low_success1 ← treat, tnbc_met, not rej(r8).
atzmab2 ← treat, tnbc_met, pdl_pos, not visc_crisis.
¬mono_th2 ← treat, tnbc_met, nab_pt, atzmab.
¬low_success2 ← treat, tnbc_met, atzmab, ¬mono_th.
rej(r1) ← ¬tnbc_met2.
rej(r2) ← ¬pdl_pos2.
rej(r3) ← ¬treat2.
rej(r4) ← treat, tnbc_met, not visc_crisis, ¬mono_th2.
rej(r5) ← treat, tnbc_met, visc_crisis, mono_th2.
rej(r6) ← treat, tnbc_met, ¬nab_pt2.
rej(r7) ← treat, tnbc_met, visc_crisis, ¬carbopl2.
rej(r8) ← treat, tnbc_met, ¬low_success2.
tnbc_met1 ← tnbc_met2.  tnbc_met ← tnbc_met1.
pdl_pos1 ← pdl_pos2.  pdl_pos ← pdl_pos1.
treat1 ← treat2.  treat ← treat1.
mono_th1 ← mono_th2.  mono_th ← mono_th1.
¬mono_th1 ← ¬mono_th2.  ¬mono_th ← ¬mono_th1.
visc_crisis1 ← visc_crisis2.  visc_crisis ← visc_crisis1.
nab_pt1 ← nab_pt2.  nab_pt ← nab_pt1.
carbopl1 ← carbopl2.  carbopl ← carbopl1.
atzmab1 ← atzmab2.  atzmab ← atzmab1.
low_success1 ← low_success2.  low_success ← low_success1.
¬low_success1 ← ¬low_success2.  ¬low_success ← ¬low_success1.

A.2 Modified update program P◭

tnbc_met1 ← not rej(r1).
pdl_pos1 ← not rej(r2).
treat1 ← not rej(r3).
mono_th1 ← treat, tnbc_met, not visc_crisis, not rej(r4).
¬mono_th1 ← treat, tnbc_met, visc_crisis, not rej(r5).
nab_pt1 ← treat, tnbc_met, not rej(r6).
carbopl1 ← treat, tnbc_met, visc_crisis, not rej(r7).
low_success1 ← treat, tnbc_met, not rej(r8).
atzmab2 ← treat, tnbc_met, pdl_pos, not visc_crisis.
active(r9) ← treat, tnbc_met, pdl_pos, not visc_crisis.
¬mono_th2 ← treat, tnbc_met, nab_pt, atzmab.
active(r10) ← treat, tnbc_met, nab_pt, atzmab.
¬low_success2 ← treat, tnbc_met, atzmab, ¬mono_th.
active(r11) ← treat, tnbc_met, atzmab, ¬mono_th.
rej_cause(r4, r10) ← treat, tnbc_met, not visc_crisis, active(r10).
rej(r4) ← rej_cause(r4, r10).
rej_cause(r8, r11) ← treat, tnbc_met, active(r11).
rej(r8) ← rej_cause(r8, r11).

together with the same rules L1 ← L2 and L ← L1, for each literal L occurring in P, as listed in A.1.

References

Eiter, T.; Fink, M.; Sabbatini, G.; and Tompits, H. 2002. On properties of update sequences based on causal rejection. Theory and Practice of Logic Programming 2(6):711–767.

Eiter, T.; Ianni, G.; Schindlauer, R.; and Tompits, H. 2005. A uniform integration of higher-order reasoning and external evaluations in answer-set programming. In Proceedings of the 19th International Joint Conference on Artificial Intelligence, IJCAI'05, 90–96. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.

Gebser, M.; Kaminski, R.; Kaufmann, B.; and Schaub, T. 2019. Multi-shot ASP solving with clingo. Theory and Practice of Logic Programming 19(1):27–82.

Gelfond, M., and Lifschitz, V. 1991. Classical negation in logic programs and disjunctive databases. New Generation Computing 9(3/4):365–386.

Schmid, P.; Adams, S.; Rugo, H. S.; Schneeweiss, A.; Barrios, C. H.; Iwata, H.; Diéras, V.; Hegg, R.; Im, S.-A.; Shaw Wright, G.; Henschel, V.; Molinero, L.; Chui, S. Y.; Funke, R.; Husain, A.; Winer, E. P.; Loi, S.; and Emens, L. A. 2018. Atezolizumab and nab-paclitaxel in advanced triple-negative breast cancer. New England Journal of Medicine 379(22):2108–2121.

Schneeweiss, A.; Denkert, C.; Fasching, P. A.; Fremd, C.; Gluz, O.; Kolberg-Liedtke, C.; Loibl, S.; and Lück, H.-J. 2019. Diagnosis and therapy of triple-negative breast cancer (TNBC): recommendations for daily routine practice. Geburtshilfe und Frauenheilkunde 79(06):605–617.

Slota, M.; Baláž, M.; and Leite, J. 2014. On strong and default negation in logic program updates (extended version). CoRR abs/1404.6784.

Thevapalan, A.; Kern-Isberner, G.; Howey, D.; Beierle, C.; Meyer, R. G.; and Nietzke, M. 2018. Decision support core system for cancer therapies using ASP-HEX. In Brawner, K., and Rus, V., eds., Proceedings of the Thirty-First International Florida Artificial Intelligence Research Society Conference, FLAIRS 2018, Melbourne, Florida, USA, May 21-23 2018, 531–536. AAAI Press.

Vos, M. D.; Cliffe, O.; Watson, R.; Crick, T.; Padget, J. A.; and Needham, J. 2005a. T-LAIMA: Answer set programming for modelling agents with trust. In Gleizes, M.; Kaminka, G. A.; Nowé, A.; Ossowski, S.; Tuyls, K.; and Verbeeck, K., eds., EUMAS 2005 - Proceedings of the Third European Workshop on Multi-Agent Systems, Brussels, Belgium, December 7-8, 2005, 126–136. Koninklijke Vlaamse Academie van België voor Wetenschappen en Kunsten.

Vos, M. D.; Crick, T.; Padget, J. A.; Brain, M.; Cliffe, O.; and Needham, J. 2005b. LAIMA: A multi-agent platform using ordered choice logic programming. In Baldoni, M.; Endriss, U.; Omicini, A.; and Torroni, P., eds., Declarative Agent Languages and Technologies III, Third International Workshop, DALT 2005, Utrecht, The Netherlands, July 25, 2005, Selected and Revised Papers, volume 3904 of Lecture Notes in Computer Science, 72–88. Springer.

Towards Conditional Inference under Disjunctive Rationality

Richard Booth1, Ivan Varzinczak2,3
1 Cardiff University, United Kingdom
2 CRIL, Univ. Artois & CNRS, France
3 CAIR, Computer Science Division, Stellenbosch University, South Africa
boothr2@cardiff.ac.uk, varzinczak@cril.fr

Abstract

The question of conditional inference, i.e., of which conditional sentences of the form "if α then, normally, β" should follow from a set KB of such sentences, has been one of the classic questions of non-monotonic reasoning, with several well-known solutions proposed. Perhaps the most notable is the rational closure construction of Lehmann and Magidor, under which the set of inferred conditionals forms a rational consequence relation, i.e., satisfies all the rules of preferential reasoning, plus Rational Monotonicity. However, this last-named rule is not universally accepted, and other researchers have advocated working within the larger class of disjunctive consequence relations, which satisfy the weaker requirement of Disjunctive Rationality. While there are convincing arguments that the rational closure forms the "simplest" rational consequence relation extending a given set of conditionals, the question of what is the simplest disjunctive consequence relation has not been explored. In this paper, we propose a solution to this question and explore some of its properties.

1 Introduction

The question of conditional inference, i.e., of which conditional sentences of the form "if α then, normally, β" should follow from a set KB of such sentences, has been one of the classic questions of non-monotonic reasoning, with several well-known solutions proposed. Since the work of Lehmann and colleagues in the early '90s, the so-called preferential approach to defeasible reasoning has established itself as one of the most elegant frameworks within which to answer this question. Central to the preferential approach is the notion of rational closure of a conditional knowledge base, under which the set of inferred conditionals forms a rational consequence relation, i.e., satisfies all the rules of preferential reasoning, plus Rational Monotonicity. One of the reasons for accepting rational closure is the fact that it delivers a venturous notion of entailment which is nevertheless conservative enough. Given that, rationality has long been accepted as the core baseline for any appropriate form of non-monotonic entailment. Very few have stood against this position, including Makinson (1994), who considered Rational Monotonicity too strong and briefly advocated the weaker rule of Disjunctive Rationality instead. This rule is implied by Rational Monotonicity and may still be desirable in cases where the latter does not hold. Quite surprisingly, the debate did not catch on, and, for lack of rivals of the same stature, Rational Closure has since reigned alone as a role model in non-monotonic inference. That used to be the case until Rott (2014) reignited interest in Disjunctive Rationality by considering interval models in connection with belief contraction. Inspired by that, here we revisit disjunctive consequence relations and make the first steps in the quest for a suitable notion of disjunctive rational closure of a conditional knowledge base.

The plan of the paper is as follows. First, in Section 2, we give the usual summary of the formal background assumed in the following sections, in particular of the rational closure construction. Then, in Section 3, we make a case for weakening the rationality requirement and propose a semantics with an accompanying representation result for a weaker form of rationality enforcing the rule of Disjunctive Rationality. We move on, and in Section 4, we investigate a notion of closure of (or entailment from) a conditional knowledge base under Disjunctive Rationality. Our analysis is in terms of a set of postulates, all reasonable at first glance, that one can expect a suitable notion of closure to satisfy. Following that, in Section 5, we propose a specific construction for the Disjunctive Rational Closure of a conditional knowledge base and assess its suitability in the light of the postulates previously put forward (Section 6). We conclude with some remarks on future directions of investigation.

2 Formal preliminaries

In this section, we provide the required formal background for the remainder of the paper. In particular, we set up the notation and conventions that shall be followed in the upcoming sections. (The reader conversant with the KLM framework for non-monotonic reasoning can safely skip to Section 3.)
Let P be a finite set of propositional atoms. We use p, q, . . . as meta-variables for atoms. Propositional sentences are denoted by α, β, . . ., and are recursively defined in the usual way: α ::= ⊤ | ⊥ | P | ¬α | α ∧ α | α ∨ α | α → α | α ↔ α We use L to denote the set of all propositional sentences. def With U = {0, 1}P , we denote the set of all propositional valuations, where 1 represents truth and 0 falsity. We use 37 v, u, . . ., possibly with primes, to denote valuations. Whenever it eases the presentation, we shall represent valuations as sequences of atoms (e.g., p) and barred atoms (e.g., p̄), with the understanding that the presence of a non-barred atom indicates that the atom is true (has the value 1) in the valuation, while the presence of a barred atom indicates that the atom is false (has the value 0) in the valuation. Thus, for the logic generated from P = {b, f, p}, where the atoms stand for, respectively, “being a bird”, “being a flying creature”, and “being a penguin”, the valuation in which b is true, f is false, and p is true will be represented as bf̄p. Satisfaction of a sentence α ∈ L by a valuation v ∈ U is defined in the usual truth-functional way and is denoted by v α. The set of models of a sentence α is defined as def JαK = {v ∈ U | v α}. This notion isTextended to a set def of sentences X in the usual way: JXK = α∈X JαK. We say a set of sentences X (classically) entails α ∈ L, denoted X |= α, if JXK ⊆ JαK. α is valid, denoted |= α, if JαK = U . 2.1 Monotonicity rule (Lehmann and Magidor 1992), it is said to be a rational consequence relation: (RM) Rational consequence relations can be given an intuitive semantics in terms of ranked interpretations. Definition 1. A ranked interpretation R is a function from U to N ∪ {∞} such that R(v) = 0 for some v ∈ U, and satisfying the following convexity property: for every i ∈ N, if R(u) = i, then, for every j s.t. 0 ≤ j < i, there is a u′ ∈ U for which R(u′ ) = j. 
In a ranked interpretation, we call R(v) the rank of v w.r.t. R. The intuition is that valuations with a lower rank are deemed more normal (or typical) than those with a higher rank, while those with an infinite rank are regarded as so atypical as to be 'forbidden', e.g. by some background knowledge—see below. Given a ranked interpretation R, we therefore partition the set U into the set of plausible valuations (those with finite rank) and that of implausible ones (those with rank ∞).¹ Figure 1 depicts an example of a ranked interpretation for P = {b, f, p}. (In our graphical representations of ranked interpretations—and of interval-based interpretations later on—we plot the set of valuations in U on the y-axis so that the preference relation reads more naturally across the x-axis, from lower to higher. Moreover, plausible valuations are depicted in blue, whereas implausible ones are depicted in red.)

2.1 KLM-style rational defeasible consequence

Several approaches to non-monotonic reasoning have been proposed in the literature over the past 40 years. The preferential approach, initially put forward by Shoham (1988) and subsequently developed in much depth by Kraus et al. (1990) (the reason why it became known as the KLM approach), has established itself as one of the main references in the area. This stems from at least three of its features: (i) its intuitive semantics and elegant proof-theoretic characterisation; (ii) its generality w.r.t. alternative approaches to non-monotonic reasoning such as circumscription (McCarthy 1980), default logic (Reiter 1980), and many others; and (iii) its formal links with AGM-style belief revision (Gärdenfors and Makinson 1994). The fruitfulness of the preferential approach is also witnessed by the great deal of recent work extending it to languages more expressive than that of propositional logic, such as those of description logics (Bonatti 2019; Britz, Meyer, and Varzinczak 2011; Casini et al.
2015; Britz and Varzinczak 2017; Giordano et al. 2007; Giordano et al. 2015; Pensel and Turhan 2017; Varzinczak 2018), modal logics (Britz and Varzinczak 2018a; Britz and Varzinczak 2018b; Chafik et al. 2020), and others (Booth, Meyer, and Varzinczak 2012).

A defeasible consequence relation |∼ is defined as a binary relation on sentences of our underlying propositional logic, i.e., |∼ ⊆ L × L. We say that |∼ is a preferential consequence relation (Kraus, Lehmann, and Magidor 1990) if it satisfies the following set of (Gentzen-style) rules, written here as "premises / conclusion":

(Ref) α |∼ α
(LLE) |= α ↔ β, α |∼ γ / β |∼ γ
(And) α |∼ β, α |∼ γ / α |∼ β ∧ γ
(Or) α |∼ γ, β |∼ γ / α ∨ β |∼ γ
(RW) α |∼ β, |= β → γ / α |∼ γ
(CM) α |∼ β, α |∼ γ / α ∧ β |∼ γ

If, in addition to the preferential rules, the defeasible consequence relation |∼ also satisfies the following Rational Monotonicity rule (Lehmann and Magidor 1992), it is said to be a rational consequence relation:

(RM) α |∼ β, α |̸∼ ¬γ / α ∧ γ |∼ β

Figure 1: A ranked interpretation for P = {b, f, p}.

Given a ranked interpretation R and α ∈ L, with JαK^R we denote the set of plausible valuations satisfying α (α-valuations, for short) in R. If JαK^R = J⊤K^R, then we say α is true in R and denote it R ⊩ α. With R(α) := min{R(v) | v ∈ JαK^R} we denote the rank of α in R. By convention, if JαK^R = ∅, we let R(α) = ∞. Defeasible consequence of the form α |∼ β is then given a semantics in terms of ranked interpretations in the following way: we say α |∼ β is satisfied in R (denoted R ⊩ α |∼ β) if R(α) < R(α ∧ ¬β). (Here we adopt Jaeger's (1996) convention that ∞ < ∞ always holds.)

¹ In the literature, it is customary to omit implausible valuations from ranked interpretations. Since they are not logically impossible, but rather judged to be irrelevant on the grounds of contingent information (e.g. a knowledge base) which is prone to change, we include them in our semantic definitions. This does not mean that we do anything special with them in this paper; they are rather kept for future use.
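The ranked-model semantics of conditionals is easy to run mechanically. Below is a minimal sketch (our own illustration, not the authors' code) of Definition 1 and of conditional satisfaction; the rank assignment is the one we read off the ranked interpretation of Figure 1, with valuations encoded as triples (b, f, p).

```python
INF = float("inf")

# Ranks as read off Figure 1 (an assumption of this sketch)
R = {
    (1, 1, 0): 0, (0, 1, 0): 0, (0, 0, 0): 0,   # typical non-penguins
    (1, 0, 0): 1, (1, 0, 1): 1,                 # non-flying birds
    (1, 1, 1): 2,                               # flying penguins
    (0, 1, 1): INF, (0, 0, 1): INF,             # penguins that are not birds
}

def rank(alpha):
    """R(alpha): minimum rank of an alpha-valuation, INF if there is none."""
    return min((rk for v, rk in R.items() if alpha(v)), default=INF)

def satisfies(alpha, beta):
    """R ||- alpha |~ beta iff R(alpha) < R(alpha & ~beta),
    adopting Jaeger's convention that INF < INF holds."""
    lo, hi = rank(alpha), rank(lambda v: alpha(v) and not beta(v))
    return lo < hi or (lo == INF and hi == INF)

b = lambda v: v[0] == 1
f = lambda v: v[1] == 1
p = lambda v: v[2] == 1

print(satisfies(b, f))                       # birds normally fly: True
print(satisfies(p, lambda v: not f(v)))      # penguins normally don't: True
print(satisfies(f, b))                       # False
```

Note how the INF < INF convention makes any conditional with an implausible antecedent, such as p ∧ ¬b |∼ b, come out satisfied.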
It is easy to see that for every α ∈ L, R ⊩ α if and only if R ⊩ ¬α |∼ ⊥. If R ⊩ α |∼ β, we say R is a ranked model of α |∼ β. In the example in Figure 1, we have R ⊩ b |∼ f, R ⊩ p → b (and therefore R ⊩ ¬(p → b) |∼ ⊥), R ⊩ p |∼ ¬f, R ⊮ f |∼ b, and R ⊩ p ∧ ¬b |∼ b, all of which accord with the intuitive expectations. That this semantic characterisation of rational defeasible consequence is appropriate is a consequence of a representation result linking the seven rationality rules above to precisely the class of ranked interpretations (Lehmann and Magidor 1992; Gärdenfors and Makinson 1994).

2.2 Rational closure

One can also view defeasible consequence as formalising some form of (defeasible) conditional and bring it down to the level of statements. Such was the stance adopted by Lehmann and Magidor (1992). A conditional knowledge base KB is thus a finite set of statements of the form α |∼ β, with α, β ∈ L, possibly containing classical statements. As an example, let KB = {b |∼ f, p → b, p |∼ ¬f}. Given a conditional knowledge base KB, a ranked model of KB is a ranked interpretation satisfying all statements in KB. As it turns out, the ranked interpretation in Figure 1 is a ranked model of the above KB. It is not hard to see that, in every ranked model of KB, the valuations b̄f̄p and b̄fp are deemed implausible—note, however, that they are still logically possible, which is the reason why they feature in all ranked interpretations.

An important reasoning task in this setting is that of determining which conditionals follow from a conditional knowledge base. Of course, even when interpreted as a conditional in (and under) a given knowledge base KB, |∼ is expected to adhere to the rules of Section 2.1. Intuitively, that means whenever appropriate instantiations of the premises in a rule are sanctioned by KB, so should be the suitable instantiation of its conclusion. To be more precise, we can take the defeasible conditionals in KB as the core elements of a defeasible consequence relation |∼_KB. By closing the latter under the preferential rules (in the sense of exhaustively applying them), we get a preferential extension of |∼_KB. Since there can be more than one such extension, the most cautious approach consists in taking their intersection. The resulting set, which also happens to be closed under the preferential rules, is the preferential closure of |∼_KB, which we denote by |∼_PC^KB. When interpreted again as a conditional knowledge base, the preferential closure of |∼_KB contains all the conditionals entailed by KB. (Hence, the notions of closure of and entailment from a conditional knowledge base are two sides of the same coin.) The same process and definitions carry over when one requires the defeasible consequence relations also to be closed under the rule RM, in which case we talk of rational extensions of |∼_KB. Nevertheless, as pointed out by Lehmann and Magidor (1992, Section 4.2), the intersection of all such rational extensions is not, in general, a rational consequence relation: it coincides with the preferential closure and may therefore fail RM. Among other things, this means that the corresponding entailment relation, called rank entailment and defined as KB |=_R α |∼ β if every ranked model of KB also satisfies α |∼ β, is monotonic and therefore falls short of being a suitable form of entailment in a defeasible-reasoning setting. As a result, several alternative notions of entailment from conditional knowledge bases have been explored in the literature on non-monotonic reasoning (Booth and Paris 1998; Booth et al. 2019; Casini, Meyer, and Varzinczak 2019; Giordano et al. 2012; Giordano et al. 2015; Lehmann 1995; Weydert 2003), with rational closure (Lehmann and Magidor 1992) commonly acknowledged as the gold standard in the matter.

Rational closure (RC) is a form of inferential closure extending the notion of rank entailment above. It formalises the principle of presumption of typicality (Lehmann 1995, p. 63), which, informally, specifies that a situation (in our case, a valuation) should be assumed to be as typical as possible (w.r.t. background information in a knowledge base).

Assume an ordering ⪯_KB on all ranked models of a knowledge base KB, defined as follows: R1 ⪯_KB R2 if, for every v ∈ U, R1(v) ≤ R2(v). Intuitively, ranked models lower down in the ordering are more typical. It is easy to see that ⪯_KB is a weak partial order. Giordano et al. (2015) showed that there is a unique ⪯_KB-minimal element. The rational closure of KB is defined in terms of this minimum ranked model of KB.

Definition 2. Let KB be a conditional knowledge base, and let R_RC^KB be the minimum element of the ordering ⪯_KB on ranked models of KB. The rational closure of KB is the defeasible consequence relation |∼_RC^KB := {α |∼ β | R_RC^KB ⊩ α |∼ β}.

As an example, Figure 1 shows the minimum ranked model of KB = {b |∼ f, p → b, p |∼ ¬f} w.r.t. ⪯_KB. Hence we have that ¬f |∼ ¬b is in the rational closure of KB. Observe that there are two levels of typicality at work in rational closure: within ranked models of KB, where valuations lower down are viewed as more typical, but also between ranked models of KB, where ranked models lower down in the ordering are viewed as more typical. The most typical ranked model R_RC^KB is the one in which valuations are as typical as KB allows them to be (the principle of presumption of typicality we alluded to above).

Rational closure is commonly viewed as the basic (although certainly not the only acceptable) form of non-monotonic entailment, on which other, more venturous forms of entailment can be and have been constructed (Booth et al. 2019; Casini et al. 2014; Casini, Meyer, and Varzinczak 2019; Lehmann 1995).
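The minimum ranked model of Definition 2 can be computed by a simple fixpoint that pushes each valuation only as high as the conditionals it violates force it to be, which is one way of operationalising the presumption of typicality. The sketch below is our own illustration (not the authors' algorithm): it assumes finite ranks never exceed |U| and uses an iteration cap to flag valuations that keep climbing as implausible. The KB is the running example {b |∼ f, p → b, p |∼ ¬f}, with the classical statement p → b read as ¬(p → b) |∼ ⊥.

```python
from itertools import product

INF = float("inf")
U = list(product([0, 1], repeat=3))          # valuations (b, f, p)
b = lambda v: v[0] == 1
f = lambda v: v[1] == 1
p = lambda v: v[2] == 1
KB = [
    (b, f),                                  # b |~ f
    (p, lambda v: not f(v)),                 # p |~ ~f
    (lambda v: p(v) and not b(v), lambda v: False),  # ~(p -> b) |~ _|_
]

def minimum_ranked_model(KB):
    rank = {v: 0 for v in U}
    for _ in range(2 * len(U)):              # enough rounds to stabilise
        r = lambda a: min((rank[v] for v in U if a(v)), default=INF)
        # a valuation violating alpha |~ beta must sit strictly above the
        # best alpha-valuation of the current assignment
        rank = {v: max([r(a) + 1 for (a, c) in KB if a(v) and not c(v)],
                       default=0)
                for v in U}
    # valuations still climbing have no finite rank: they are implausible
    return {v: (rk if rk <= len(U) else INF) for v, rk in rank.items()}

R = minimum_ranked_model(KB)
print(R[(1, 1, 0)], R[(1, 0, 1)], R[(1, 1, 1)], R[(0, 1, 1)])
# 0 1 2 inf
```

The output reproduces the ranks of Figure 1: typical birds fly at rank 0, non-flying birds sit at rank 1, flying penguins at rank 2, and penguins that are not birds are implausible.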
3 Disjunctive rationality and interval-based preferential semantics

One may argue that there are cases in which Rational Monotonicity is too strong a rule to enforce and for which a weaker defeasible consequence relation would suffice (Giordano et al. 2010; Makinson 1994). Nevertheless, doing away completely with rationality (i.e., sticking to the preferential rules only) is not particularly appropriate in a defeasible-reasoning context. Indeed, as is widely known in the literature, preferential systems induce entailment relations that are monotonic. In that respect, here we are interested in defeasible consequence relations (or defeasible conditionals) that do not necessarily satisfy Rational Monotonicity while still encapsulating some form of rationality, i.e., a venturous passage from the premises to the conclusion. A case in point is the Disjunctive Rationality (DR) rule (Kraus, Lehmann, and Magidor 1990) below:

(DR) α ∨ β |∼ γ / α |∼ γ or β |∼ γ

Intuitively, DR says that if one may draw a conclusion from a disjunction of premises, then one should be able to draw this conclusion from at least one of these premises taken alone (Freund 1993). A preferential consequence relation is called disjunctive if it also satisfies DR. As it turns out, every rational consequence relation is also disjunctive, but not the other way round (Lehmann and Magidor 1992). Therefore, DR is a weaker form of rationality, as its name suggests. Given that, Disjunctive Rationality is indeed a suitable candidate for the type of investigation we have in mind.

A semantic characterisation of disjunctive consequence relations was given by Freund (1993) based on a filtering condition on the underlying ordering. Here, we provide an alternative semantics in terms of interval-based interpretations. (We conjecture that Freund's semantic constructions and ours can be shown to be equivalent in the finite case.)

Definition 3. An interval-based interpretation is a tuple I := ⟨L, U⟩, where L and U are functions from U to N ∪ {∞} s.t. (i) L(v) = 0 for some v ∈ U; (ii) if L(u) = i or U(u) = i, then for every 0 ≤ j < i, there is a u′ s.t. either L(u′) = j or U(u′) = j; (iii) L(v) ≤ U(v), for every v ∈ U; and (iv) L(u) = ∞ iff U(u) = ∞.

Given I = ⟨L, U⟩ and v ∈ U, L(v) is the lower rank of v in I, and U(v) is the upper rank of v in I. Hence, for any v, the pair (L(v), U(v)) is the interval of v in I. We say u is more preferred than v in I, denoted u ≺ v, if U(u) < L(v). The preference order ≺ on U defined above via an interval-based interpretation forms an interval order, i.e., it is a strict partial order that additionally satisfies the interval condition: if u ≺ v and u′ ≺ v′, then either u ≺ v′ or u′ ≺ v. Furthermore, every interval order can be defined from an interval-based interpretation in this way. See the work of Fishburn (1985) for a detailed treatise on interval orders.

Figure 2 illustrates an example of an interval-based interpretation for P = {b, f, p}. In our depictions of interval-based interpretations, it will be convenient to see I as a function from U to intervals on the set N ∪ {∞}. Whenever the intervals associated to valuations u and v overlap, the intuition is that the two valuations are incomparable in I; otherwise the leftmost interval is seen as more preferred than the rightmost one.

Figure 2: An interval-based interpretation for P = {b, f, p}.

In Figure 2, the rationale behind the ordering is as follows: situations with flying birds are the most normal ones; situations with non-flying penguins are more normal than the flying-penguin ones, but both are incomparable to non-penguin situations; the situations with penguins that are not birds are the implausible ones; and finally those that are not about birds or penguins are so irrelevant as to be seen as incomparable with any of the plausible ones.

The notions of plausible and implausible valuations, as well as that of α-valuations, carry over to interval-based interpretations, only now the plausible valuations are the ones with finite lower ranks (and hence also finite upper ranks, by part (iv) of the previous definition). With L(α) := min{L(v) | v ∈ JαK^I} and U(α) := min{U(v) | v ∈ JαK^I} we denote, respectively, the lower and the upper rank of α in I. By convention, if JαK^I = ∅, we let L(α) = U(α) = ∞. We say α |∼ β is satisfied in I (denoted I ⊩ α |∼ β) if U(α) < L(α ∧ ¬β). (Recall the convention that ∞ < ∞.) As an example, in the interval-based interpretation of Figure 2, we have I ⊩ b |∼ f, I ⊩ p |∼ ¬f, and I ⊮ ¬f |∼ ¬p (contrary to the ranked interpretation R in Figure 1, which endorses the latter statement).

In the tradition of the KLM approach to defeasible reasoning, we define the defeasible consequence relation induced by an interval-based interpretation: |∼_I := {α |∼ β | I ⊩ α |∼ β}. We can now state a KLM-style representation result establishing that our interval-based semantics is suitable for characterising the class of disjunctive defeasible consequence relations; it is a variant of Freund's (1993) result:

Theorem 1. A defeasible consequence relation is a disjunctive consequence relation if and only if it is defined by some interval-based interpretation, i.e., |∼ is disjunctive if and only if there is an I such that |∼ = |∼_I.
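The interval-based satisfaction clause runs just as easily as the ranked one. The sketch below is our own illustration (not the authors' code); the intervals are assumed explicitly and correspond to the Figure 2 model as described in the text, with valuations encoded as triples (b, f, p).

```python
INF = float("inf")

# Intervals (L(v), U(v)) of the interval-based interpretation of Figure 2
I = {
    (1, 1, 0): (0, 0),                     # flying non-penguin birds
    (0, 1, 0): (0, 2), (0, 0, 0): (0, 2),  # non-bird, non-penguin situations
    (1, 0, 1): (1, 1),                     # non-flying penguins
    (1, 0, 0): (1, 2),                     # non-flying non-penguin birds
    (1, 1, 1): (2, 2),                     # flying penguins
    (0, 1, 1): (INF, INF), (0, 0, 1): (INF, INF),  # penguins, not birds
}

def L(alpha):                              # lower rank of a sentence
    return min((lo for v, (lo, up) in I.items() if alpha(v)), default=INF)

def Uof(alpha):                            # upper rank of a sentence
    return min((up for v, (lo, up) in I.items() if alpha(v)), default=INF)

def satisfies(alpha, beta):
    """I ||- alpha |~ beta iff U(alpha) < L(alpha & ~beta) (with INF < INF)."""
    u, l = Uof(alpha), L(lambda v: alpha(v) and not beta(v))
    return u < l or (u == INF and l == INF)

b = lambda v: v[0] == 1
f = lambda v: v[1] == 1
p = lambda v: v[2] == 1

print(satisfies(b, f))                                    # True
print(satisfies(p, lambda v: not f(v)))                   # True
print(satisfies(lambda v: not f(v), lambda v: not p(v)))  # False
```

The last call shows the key difference from Figure 1: the overlapping intervals make non-flying situations incomparable with penguin situations, so ¬f |∼ ¬p is no longer endorsed.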
4 Towards disjunctive rational closure

Given a conditional knowledge base KB, the obvious definition of closure under Disjunctive Rationality consists in taking the intersection of all disjunctive extensions of |∼_KB (cf. Section 2.2). Let us call it the disjunctive closure of |∼_KB; its semantic counterpart is interval-based entailment, defined as KB |=_I α |∼ β if every interval-based model of KB also satisfies α |∼ β. The following result shows that the notion of disjunctive closure is stillborn, i.e., it does not even satisfy Disjunctive Rationality.

Theorem 2. Given a conditional knowledge base KB: (i) the disjunctive closure of KB coincides with its preferential closure |∼_PC^KB; (ii) there exists a KB such that |∼_PC^KB does not satisfy Disjunctive Rationality.

For a simple counterexample showing that |∼_PC^KB need not satisfy Disjunctive Rationality, consider KB = {⊤ |∼ b}. Clearly we have p ∨ ¬p |∼_PC^KB b, but one can easily construct interval-based interpretations I1, I2 whose corresponding consequence relations both satisfy KB but for which p |̸∼_I1 b and ¬p |̸∼_I2 b. This result suggests that the quest for a suitable definition of entailment under disjunctive rationality should follow the footprints in the road which led to the definition of rational closure. Such is our contention here, and our research question is now: 'Is there a single best disjunctive relation extending the one induced by a given conditional knowledge base KB?'

Let us denote by |∼_*^KB the special defeasible consequence relation that we are looking for. In the remainder of this section, we consider some desirable properties for the mapping from KB to |∼_*^KB and consider some simple examples in order to build intuitions. In the following section, we will offer a concrete construction: the Disjunctive Rational Closure of KB.

4.1 Basic postulates

Starting with our most basic requirements, we put forward the following two postulates:

Inclusion If α |∼ β ∈ KB, then α |∼_*^KB β.

D-Rationality |∼_*^KB is a disjunctive consequence relation.

Note that, given Theorem 1, D-Rationality is equivalent to saying that there is an interval-based interpretation I such that |∼_*^KB = |∼_I. If we replace "disjunctive consequence" in the statement by "rational consequence", then that is the postulate usually considered in the area.

Another reasonable property to require from an induced consequence relation is for two equivalent knowledge bases to yield exactly the same set of inferences. This prompts the question of what it means to say that two conditional knowledge bases are equivalent. One weak notion of equivalence can be defined as follows.

Definition 4. Let α1, α2, β1, β2 ∈ L. We say α1 |∼ β1 is equivalent to α2 |∼ β2 if |= (α1 ↔ α2) ∧ (β1 ↔ β2). We say two knowledge bases KB1, KB2 are equivalent, written KB1 ≡ KB2, if there is a bijection f : KB1 −→ KB2 s.t. each α |∼ β ∈ KB1 is equivalent to f(α |∼ β).

Given this, we can express a weak form of syntax independence:

Equivalence If KB1 ≡ KB2, then |∼_*^KB1 = |∼_*^KB2.

Finally, the last of our basic postulates requires rational closure to be the upper bound on how venturous our consequence relation should be.

Infra-Rationality |∼_*^KB ⊆ |∼_RC^KB.

4.2 Minimality postulates

Echoing a fundamental principle of reasoning in general and of non-monotonic reasoning in particular is a property requiring |∼_*^KB to contain only conditionals whose inferences can be justified on the basis of KB. The first idea to achieve this would be to set |∼_*^KB to be a set-theoretically minimal disjunctive consequence relation that extends KB.

Example 1. Suppose the only knowledge we have is a single conditional saying "birds normally fly", i.e., KB = {b |∼ f}. Assuming just two variables, we have a unique ⊆-minimal disjunctive consequence relation extending this knowledge base, which is given by the interval-based interpretation I in Figure 3. Indeed, the conditional b |∼ f is saying precisely that bf ≺ bf̄, but is telling us nothing with regard to the relative typicality of the other two possible valuations, so any pair of valuations other than this one is incomparable. For this reason, we do not have ¬f |∼_I ¬b here. Note that the rational closure in this example does endorse this latter conclusion, thus providing further evidence that the rational closure arguably delivers some unwarranted conclusions.

Figure 3: Interval-based model of KB = {b |∼ f}.

The next example illustrates the fact that there might be more than one ⊆-minimal extension of a KB-induced consequence relation.

Example 2. Assume a COVID-19 inspired scenario with only two propositions, m and s, standing for, respectively, "you wear a mask" and "you observe social distancing". Let KB = {m |∼ s, ¬m |∼ s}. There are two ⊆-minimal disjunctive consequence relations extending |∼_KB, corresponding to the two interval-based interpretations I1 and I2 (from left to right) in Figure 4. The first conditional is saying ms ≺ ms̄, while the second is saying m̄s ≺ m̄s̄. According to the interval condition (see the paragraph following Definition 3), we must then have either ms ≺ m̄s̄ or m̄s ≺ ms̄. The choice of which gives rise to I1 and I2, respectively.

Figure 4: Interval-based models of the two ⊆-minimal extensions of |∼_KB, for KB = {m |∼ s, ¬m |∼ s}.

In the light of Example 2 above, a question that arises is what to do when one has more than a single ⊆-minimal extension of |∼_KB. Theorem 2 already tells us that we cannot, in general, take the obvious approach of taking their intersection. However, even though returning the disjunctive/preferential closure |∼_PC^KB is not enough to ensure D-Rationality, we might still expect the following postulates to be reasonable.

Vacuity If |∼_PC^KB is a disjunctive consequence relation, then |∼_*^KB = |∼_PC^KB.

Preferential Extension |∼_PC^KB ⊆ |∼_*^KB. (Note that, given Theorem 2, this postulate follows from Inclusion and D-Rationality.)

Justification If α |∼_*^KB β, then α |∼′ β for at least one ⊆-minimal disjunctive relation |∼′ extending |∼_KB.

Going back to Example 2, what should the expected output be in this case? Intuitively, faced with the choice of which of the pairs ms ≺ m̄s̄ or m̄s ≺ ms̄ to include, and in the absence of any reason to prefer either one, it seems the right thing to do is to include both, and thereby let the interval-based interpretation depicted in Figure 5 yield the output. Notice that this will be the same as the rational closure in this case.

Figure 5: Interval-based model of the union of the two ⊆-minimal extensions of |∼_KB, for KB = {m |∼ s, ¬m |∼ s}.

4.3 Representation independence postulates

We can express the desired symmetry requirement in a syntactic form, using the notion of symbol translations (Marquis and Schwind 2014). A symbol translation (on P) is a function σ : P −→ L.² A symbol translation can be extended to a function on L by setting, for each sentence α, σ(α) to be the sentence obtained from α by replacing each atom p occurring in α by its image σ(p) throughout. Similarly, given a conditional knowledge base KB and a symbol translation σ(·), we denote by σ(KB) the knowledge base obtained by replacing each conditional α |∼ β in KB by σ(α) |∼ σ(β).

Representation Independence For any symbol translation σ(·), we have α |∼_*^KB β iff σ(α) |∼_*^σ(KB) σ(β).

Note that Weydert (2003) also considers Representation Independence (RI) in the context of conditional inference, but in a slightly different framework. The idea behind it has also been explored by Jaeger (1996), who, in particular, looked at the property in relation to rational closure. As noted by Marquis and Schwind (2014), the property is a very demanding one that is likely hard to satisfy in its full, unrestricted form above. And indeed this is confirmed in our setting, since it can be shown that Representation Independence is jointly incompatible with two of our basic postulates, namely Inclusion and Infra-Rationality. This motivates the need to focus on specific families of symbol translation. Some examples are the following:

1. σ(·) is a permutation on P, i.e., just a renaming of the propositional variables;
2. σ(p) ∈ {p, ¬p}, for all p ∈ P. Then, instead of using p to denote, say, "it's raining", we use it rather to denote "it's not raining". We call any symbol translation of this type a negation-swapping symbol translation.

Each special subfamily of symbol translations yields a corresponding weakening of RI that applies to just that kind of translation. In particular, we have the following postulate:

Negated Representation Independence For any negation-swapping symbol translation σ(·), we have α |∼_*^KB β iff σ(α) |∼_*^σ(KB) σ(β).

Example 3. Going back to Example 2, when modelling the scenario, instead of using propositional atom m to denote "you wear a mask" we could equally well have used it to denote "you do not wear a mask". Then the statement "if you wear a mask then, normally, you do social distancing" would be modelled by ¬m |∼ s, etc. This boils down to taking a negation-swapping symbol translation such that σ(m) = ¬m and σ(s) = s. Then σ(KB) = {¬m |∼ s, ¬¬m |∼ s}, and if we inferred, say, m ↔ s |∼ s from KB, then we would expect to infer ¬m ↔ s |∼ s from σ(KB).

² Marquis and Schwind (2014) consider much more general settings, but this is all we need in the present paper.

4.4 Cumulativity postulates

The idea behind a notion of Cumulativity in our setting is that adding to the knowledge base a conditional that was already inferred should not change anything in terms of its consequences. We can split this into two 'halves':

Cautious Monotonicity If α |∼_*^KB β and KB′ = KB ∪ {α |∼ β}, then |∼_*^KB ⊆ |∼_*^KB′.

Cut If α |∼_*^KB β and KB′ = KB ∪ {α |∼ β}, then |∼_*^KB′ ⊆ |∼_*^KB.

We conclude this section with an impossibility result concerning a subset of the postulates we have mentioned so far.

Theorem 3. There is no method ∗ simultaneously satisfying all of Inclusion, D-Rationality, Equivalence, Vacuity, Cautious Monotonicity and Negated Representation Independence.

Proof. Assume, for contradiction, that ∗ satisfies all the listed properties. Suppose P = {m, s} and let KB be the knowledge base from Example 2, i.e., {m |∼ s, ¬m |∼ s}. By Inclusion, m |∼_*^KB s and ¬m |∼_*^KB s. By D-Rationality, we know |∼_*^KB satisfies the Or rule, so, from these two, we get m ∨ ¬m |∼_*^KB s which, in turn, yields (m ↔ s) ∨ (¬m ↔ s) |∼_*^KB s, by LLE. Applying DR to this means we have:

(m ↔ s) |∼_*^KB s or (¬m ↔ s) |∼_*^KB s   (1)

Now, let σ(·) be the negation-swapping symbol translation mentioned in Example 3, i.e., σ(m) = ¬m, σ(s) = s, so σ(KB) = {¬m |∼ s, ¬¬m |∼ s}. Then, by Negated Representation Independence, we have (m ↔ s) |∼_*^KB s iff (¬m ↔ s) |∼_*^σ(KB) s. But clearly we have KB ≡ σ(KB), so, by Equivalence, we obtain from this:

(m ↔ s) |∼_*^KB s iff (¬m ↔ s) |∼_*^KB s   (2)

Putting (1) and (2) together gives us both (m ↔ s) |∼_*^KB s and (¬m ↔ s) |∼_*^KB s. Now, let KB′ = KB ∪ {(m ↔ s) |∼ s}. By Cautious Monotonicity, |∼_*^KB ⊆ |∼_*^KB′. In particular, (¬m ↔ s) |∼_*^KB′ s. It can be checked that the disjunctive/preferential closure of KB′ is itself a disjunctive consequence relation. In fact, it corresponds to the interval-based interpretation on the left of Figure 4. Hence, by Vacuity, this particular interval-based interpretation corresponds also to |∼_*^KB′. But, by inspecting this picture, we see (¬m ↔ s) |̸∼_*^KB′ s, which leads to a contradiction.

Theorem 3 is both surprising and disappointing, since all of the properties mentioned seem rather intuitive and desirable. Note that a close inspection of the proof shows that even just Vacuity and Cautious Monotonicity together place some quite severe restrictions on the behaviour of ∗.

Corollary 1. Let P = {p, q} and KB = {p |∼ q, ¬p |∼ q}. There is no operator ∗ satisfying Vacuity and Cautious Monotonicity that infers both (p ↔ q) |∼_*^KB q and (¬p ↔ q) |∼_*^KB q.

What can we do in the face of these results? Our strategy will be to seek to construct a method that satisfies as many of these properties as possible. We now provide our candidate for such a method: the disjunctive rational closure.

5 A construction for disjunctive rational closure

In order to satisfy D-Rationality, we can focus on constructing a special interval-based interpretation from KB and then take all conditionals holding in this interpretation as the consequences of KB. In this section, we give our construction of the interpretation I_DC^KB that gives us the disjunctive rational closure of a conditional knowledge base.

To specify I_DC^KB, we will construct the pair ⟨L_DC^KB, U_DC^KB⟩ of functions specifying the lower and upper ranks for each valuation. Since we aim to satisfy Infra-Rationality, our construction method takes the rational closure R_RC^KB of KB as a point of departure. Starting with the lower ranks, we simply set, for all v ∈ U:

L_DC^KB(v) := R_RC^KB(v).

That is, the lower ranks are given by the rational closure. For the upper ranks U_DC^KB, if we happen to have L_DC^KB(v) = R_RC^KB(v) = ∞, then, to conform with the definition of interval-based interpretation, it is clear that we must set U_DC^KB(v) = ∞ also. If L_DC^KB(v) ≠ ∞, then the construction of U_DC^KB(v) becomes a little more involved. We first require the following definition.

Definition 5. Given a ranked interpretation R and a conditional α |∼ β such that R ⊩ α |∼ β, we say a valuation v verifies α |∼ β in R if v ∈ JαK^R and R(v) = R(α).

Now, assuming L_DC^KB(v) ≠ ∞, our construction of U_DC^KB(v) splits into two cases, according to whether v verifies any of the conditionals from KB in R_RC^KB or not.

Case 1: v does not verify any of the conditionals in KB in R_RC^KB. In this case, we set:

U_DC^KB(v) := max{R_RC^KB(u) | R_RC^KB(u) ≠ ∞}.

Case 2: v verifies at least one conditional from KB in R_RC^KB. In this case, the idea is to extend the upper rank of v as much as possible while still ensuring that the constraints represented by KB are respected in the resulting I_DC^KB. If v verifies α |∼ β in R_RC^KB, then this is achieved by setting U_DC^KB(v) = R_RC^KB(α ∧ ¬β) − 1; or, if R_RC^KB(α ∧ ¬β) = ∞, then again just set U_DC^KB(v) = max{R_RC^KB(u) | R_RC^KB(u) ≠ ∞}, as in Case 1. (This takes care of 'redundant' conditionals that might occur in KB, like α |∼ α.)

We now introduce the following notation. Given sentences α, β:

t_RC^KB(α, β) := R_RC^KB(α ∧ ¬β) − 1, if R_RC^KB(α ∧ ¬β) ≠ ∞;
t_RC^KB(α, β) := max{R_RC^KB(u) | R_RC^KB(u) ≠ ∞}, otherwise.

But we need to take care of the situation in which v possibly verifies more than one conditional from KB in R_RC^KB. In order to ensure that all conditionals in KB will still be satisfied, we need to take:

U_DC^KB(v) := min{t_RC^KB(α, β) | (α |∼ β) ∈ KB and v verifies α |∼ β in R_RC^KB}.

So, summarising the two cases, we arrive at our final definition of U_DC^KB for plausible v (implausible valuations receive U_DC^KB(v) = ∞, as stipulated above):

U_DC^KB(v) := min{t_RC^KB(α, β) | α |∼ β ∈ KB and v verifies α |∼ β in R_RC^KB}, if v verifies at least one conditional from KB in R_RC^KB;
U_DC^KB(v) := max{R_RC^KB(u) | R_RC^KB(u) ≠ ∞}, otherwise.

Note that if v verifies α |∼ β ∈ KB in R_RC^KB, then R_RC^KB(v) = R_RC^KB(α) ≤ R_RC^KB(α ∧ ¬β) − 1 = t_RC^KB(α, β). Thus, in both cases above, we have L_DC^KB(v) ≤ U_DC^KB(v), and so the pair L_DC^KB and U_DC^KB forms a legitimate interval-based interpretation. We thus arrive at our final definition of the disjunctive rational closure of a conditional knowledge base.
Definition 6. Let I_DC^KB := ⟨L_DC^KB, U_DC^KB⟩ be the interval-based interpretation specified by L_DC^KB and U_DC^KB as above. The disjunctive rational closure of KB is the defeasible consequence relation |∼_DC^KB := {α |∼ β | I_DC^KB ⊩ α |∼ β}.

In the remainder of this section, we revisit the examples we have seen throughout the paper, to see what answer the disjunctive rational closure gives.

Example 4. Going back to Example 1, with KB = {b |∼ f}, the rational closure yields R_RC^KB(bf) = R_RC^KB(b̄f) = R_RC^KB(b̄f̄) = 0 and R_RC^KB(bf̄) = 1. Since L_DC^KB = R_RC^KB, this gives us the lower ranks for each valuation in I_DC^KB. Turning to the upper ranks, the only valuation that verifies the single conditional b |∼ f in KB is bf, thus U_DC^KB(bf) = t_RC^KB(b, f) = R_RC^KB(b ∧ ¬f) − 1 = 1 − 1 = 0, meaning that the interval assigned to bf is (0, 0). The other three valuations are all assigned the same upper rank, which is just the maximum finite rank occurring in R_RC^KB, namely 1. Thus the interval assigned to bf̄ is (1, 1), while both the valuations in J¬bK are assigned (0, 1). So I_DC^KB outputs exactly the interval-based interpretation depicted in Figure 3 which, recall, gives the unique ⊆-minimal disjunctive consequence relation extending KB in this case.

Example 5. Returning to Example 2, with KB = {m |∼ s, ¬m |∼ s}, the rational closure yields R_RC^KB(ms) = R_RC^KB(m̄s) = 0 and R_RC^KB(ms̄) = R_RC^KB(m̄s̄) = 1, which gives us the lower ranks. The valuation ms verifies only the conditional m |∼ s, and so U_DC^KB(ms) = t_RC^KB(m, s) = R_RC^KB(m ∧ ¬s) − 1 = 1 − 1 = 0. Similarly, the valuation m̄s verifies only the conditional ¬m |∼ s and so, by analogous reasoning, U_DC^KB(m̄s) = t_RC^KB(¬m, s) = 0. So both of these valuations are assigned the interval (0, 0) by I_DC^KB. The other two valuations, which verify neither conditional in KB, are assigned (1, 1). Thus, in this case, I_DC^KB returns just the rational closure of KB, as pictured in Figure 5.
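As a sanity check on Example 5, the construction of Section 5 can be sketched directly. The code below is our own illustration (not the authors' implementation): the rational-closure ranks are those stated in Example 5, valuations are pairs (m, s), and "v verifies α |∼ β" is read, per Definition 5, as v satisfying α with R(v) = R(α).

```python
INF = float("inf")

# Rational-closure ranks for KB = {m |~ s, ~m |~ s}, from Example 5
R_RC = {(1, 1): 0, (0, 1): 0, (1, 0): 1, (0, 0): 1}

m = lambda v: v[0] == 1
s = lambda v: v[1] == 1
KB = [(m, s), (lambda v: not m(v), s)]    # m |~ s  and  ~m |~ s

def r(alpha):
    """Rank of a sentence: minimum rank of a valuation satisfying it."""
    return min((rk for v, rk in R_RC.items() if alpha(v)), default=INF)

max_fin = max(rk for rk in R_RC.values() if rk != INF)

def t(alpha, beta):                       # t_RC(alpha, beta) of Section 5
    x = r(lambda v: alpha(v) and not beta(v))
    return x - 1 if x != INF else max_fin

def upper(v):                             # U_DC(v): Cases 1 and 2
    if R_RC[v] == INF:
        return INF
    ts = [t(a, c) for (a, c) in KB if a(v) and R_RC[v] == r(a)]  # verified
    return min(ts) if ts else max_fin

I_DC = {v: (R_RC[v], upper(v)) for v in R_RC}   # lower rank = RC rank
print(I_DC)
# {(1, 1): (0, 0), (0, 1): (0, 0), (1, 0): (1, 1), (0, 0): (1, 1)}
```

The output matches Example 5: ms and m̄s get interval (0, 0), while ms̄ and m̄s̄ get (1, 1), i.e., the rational closure is recovered in this case.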
In both the above examples, the disjunctive rational closure returns arguably the right answers.

Example 6. Consider KB = {b |∼ f, p → b, p |∼ ¬f}. As previously mentioned, the rational closure R^KB_RC for this KB is depicted in Figure 1. Since both of the valuations in Jp ∧ ¬bK (in red at the top of the picture) are deemed implausible (i.e., have rank ∞), they are both assigned the interval (∞, ∞). Focusing then on just the plausible valuations, the only valuation verifying b |∼ f in R^KB_RC is bfp̄ (which verifies no other conditional in KB), so U^KB_DC(bfp̄) = R^KB_RC(b ∧ ¬f) − 1 = 1 − 1 = 0. The only valuation verifying p |∼ ¬f is bf̄p, so U^KB_DC(bf̄p) = R^KB_RC(p ∧ f) − 1 = 2 − 1 = 1. All other plausible valuations get assigned as their upper rank the maximum finite rank, which is 2. The resulting I^KB_DC is the interval-based interpretation depicted in Figure 2.

We end this section by considering our construction from the standpoint of complexity. The construction method above runs in time that grows (singly) exponentially with the size of the input, even if the rational closure of the knowledge base has been computed offline. To see why, let the input be a set of propositional atoms P together with a conditional knowledge base KB, and let |KB| = n. (For simplicity, we assume the size of KB to be the number of conditionals therein.) We know that |U| = 2^|P|. Now, for each valuation v ∈ U, one has to check whether v verifies at least one conditional α |∼ β in KB (cf. Definition 5). In the worst case, (i) all conditionals in KB will be checked against v, i.e., we will have n checks per valuation. Each such check amounts to comparing R(v) with R(α), where α is the antecedent of the conditional under inspection. While R(v) is already known, R(α) has to be computed (unless, of course, we also assume it has been done offline in the computation of the rational closure). Computing R(α) is done by searching for the lowest valuations in R^KB_RC satisfying α.
In the worst case, (ii) 2^|P| valuations have to be inspected. Each such inspection amounts to a propositional verification, which is a polynomial-time task. Every time v verifies a conditional α |∼ β, the computation of t^KB_RC(·) also requires that of R^KB_RC(α ∧ ¬β). In the worst case, the latter requires 2^|P| propositional verifications. So, the computation of t^KB_RC(·) takes at most (iii) n × 2^|P| checks. From (i), (ii) and (iii), it follows that n^2 × 2^{2|P|} propositional verifications are required. This has to be done for each of the 2^|P| valuations, and therefore we have a total of n^2 × 2^{3|P|} verifications in the worst case, from which the result follows.

Let us now take a look at the complexity of entailment checking, i.e., that of checking whether a conditional α |∼ β is satisfied by I^KB_DC. This task amounts to computing U^KB_DC(α) and L^KB_DC(α ∧ ¬β) and comparing them. It is easy to see that in the worst-case scenario both require 2^|P| propositional verifications.

6 Properties of the Disjunctive Rational Closure

We now turn to the question of which of the postulates from Section 4 are satisfied by the disjunctive rational closure. We start by observing that we obtain all of the basic postulates proposed in Section 4.1:

Proposition 1. The disjunctive rational closure satisfies Inclusion, D-Rationality, Equivalence and Infra-Rationality.

Proof. (Outline) D-Rationality is immediate since we construct an interval-based interpretation. Equivalence is also straightforward. For Infra-Rationality, first recall that α |∼^KB_DC β iff U^KB_DC(α) < L^KB_DC(α ∧ ¬β). Since L^KB_DC(α) ≤ U^KB_DC(α) (which follows by the definition of interval-based interpretation) and L^KB_DC(α ∧ ¬β) = R^KB_RC(α ∧ ¬β) (by construction), we have that U^KB_DC(α) < L^KB_DC(α ∧ ¬β) implies R^KB_RC(α) = L^KB_DC(α) < R^KB_RC(α ∧ ¬β), giving α |∼^KB_RC β, as required for Infra-Rationality. For Inclusion, suppose α |∼ β ∈ KB.
If R^KB_RC(α) = ∞, then L^KB_DC(α) = U^KB_DC(α) = ∞ by construction and so α |∼^KB_DC β. So assume R^KB_RC(α) ≠ ∞. Then, to show α |∼^KB_DC β, it suffices to show U^KB_DC(v) < L^KB_DC(α ∧ ¬β) = R^KB_RC(α ∧ ¬β) for at least one v ∈ JαK. Since rational closure satisfies Inclusion, we know α |∼^KB_RC β and so, since R^KB_RC(α) ≠ ∞, there must exist at least one v′ verifying α |∼ β in R^KB_RC. By construction of U^KB_DC, we have U^KB_DC(v′) ≤ t^KB_RC(α, β) = R^KB_RC(α ∧ ¬β) − 1, as required.

We remind the reader that, since Inclusion and D-Rationality hold, disjunctive rational closure also satisfies Preferential Extension.

Now we look at the Cumulativity properties. It is known from the work by Lehmann and Magidor (1992) that rational closure satisfies both Cautious Monotonicity and Cut, and, in fact, if α |∼^KB_RC β and KB′ = KB ∪ {α |∼ β}, then R^KB_RC = R^KB′_RC. We can show the following for disjunctive rational closure.

Proposition 2. The disjunctive rational closure does not satisfy Cut.

Proof. Assume P = {b, f}, and KB is again the knowledge base from Example 1, i.e., {b |∼ f}. We have seen in Example 4 that I^KB_DC is given by the interval-based interpretation depicted in Figure 3. By inspecting this picture, we see I^KB_DC ⊩ ⊤ |∼ (b → f). Now let KB′ = KB ∪ {⊤ |∼ (b → f)}. Then I^KB′_DC is given by the model in Figure 6. We now have I^KB′_DC ⊩ ¬f |∼ ¬b, whereas before we had I^KB_DC ⊮ ¬f |∼ ¬b.

Figure 6: Output for KB′ = {b |∼ f, ⊤ |∼ (b → f)}.

Essentially, the reason for the failure of Cut is that by adding a new conditional α |∼ β to the knowledge base, even when that conditional is already inferred by the disjunctive rational closure, we give certain valuations (namely those in JαK) the opportunity to verify one more conditional from the knowledge base in R^KB′_RC. (See, e.g., the two valuations in J¬bK in the above counterexample.) This leads, potentially, to a corresponding decrease in their upper ranks, leading in turn to more inferences being made available. This behaviour reveals that disjunctive rational closure can be termed a base-driven approach, since the conditionals that are included explicitly in the knowledge base have more influence compared to those that are merely derived. However, adding an inferred conditional will never lead to an increase in the upper ranks, which means the disjunctive rational closure does satisfy Cautious Monotonicity.

Proposition 3. The disjunctive rational closure satisfies Cautious Monotonicity.

Proof. (Outline) Suppose α |∼^KB_DC β and let KB′ = KB ∪ {α |∼ β}. Since disjunctive rational closure satisfies Infra-Rationality, we know α |∼^KB_RC β, and so, since rational closure satisfies Cautious Monotonicity, R^KB′_RC = R^KB_RC, i.e., the lower ranks of all valuations in I^KB′_DC are unchanged from I^KB_DC. To show |∼^KB_DC ⊆ |∼^KB′_DC, it thus suffices to show U^KB′_DC(v) ≤ U^KB_DC(v) for all valuations v. If v does not verify α |∼ β in R^KB′_RC, then U^KB′_DC(v) = U^KB_DC(v) (since all terms and cases in the definition of U^KB′_DC depend only on R^KB′_RC = R^KB_RC), while if v does verify α |∼ β in R^KB′_RC, then U^KB′_DC(v) = min{U^KB_DC(v), t^KB′_RC(α, β)} ≤ U^KB_DC(v), as required.

As we have seen in Corollary 1 in Section 4.4, the satisfaction of Cautious Monotonicity, plus the seemingly very reasonable behaviour displayed by disjunctive rational closure in Example 5, come at the cost of Vacuity, i.e., even if the preferential closure happens to be a disjunctive relation, the output may sanction extra conclusions.

Proposition 4. The disjunctive rational closure does not satisfy Vacuity.

Proof. By Corollary 1, there can be no operator ∗ satisfying Cautious Monotonicity and Vacuity that infers both (¬m ↔ s) |∼^KB_DC s and (m ↔ s) |∼^KB_DC s. We saw in Example 5 that the disjunctive rational closure returns the rational closure for this KB, and so yields both these conditional inferences. We have also just seen that disjunctive rational closure satisfies Cautious Monotonicity. Hence we deduce that disjunctive rational closure cannot satisfy Vacuity.

What about the Representation Independence postulates? Concerning full Representation Independence, we have remarked earlier that this postulate is not compatible with the basic postulates, and so Proposition 1 already tells us that disjunctive rational closure fails it. However, we conjecture that Negated Representation Independence is satisfied, since we can show that if rational closure satisfies it, then the disjunctive rational closure will inherit the property. Although Jaeger (1996) showed that rational closure does indeed conform with his version of Representation Independence, it remains to be proved that his notion coincides precisely with ours.

7 Concluding remarks

In this paper, we have set ourselves the task of reviving interest in weaker alternatives to Rational Monotonicity when reasoning with conditional knowledge bases. We have studied the case of Disjunctive Rationality, a property already known to the community from the work of Kraus et al. and Freund in the early '90s, which we have then coupled with a semantics in terms of interval orders borrowed from more recent work by Rott in belief revision.

In our quest for a suitable form of entailment ensuring Disjunctive Rationality, we started by putting forward a set of postulates, all reasonable at first glance, characterising its expected behaviour. As it turns out, not all of them can be satisfied simultaneously, which suggests there might be more than one answer to our research question.
We have then provided a construction of the disjunctive rational closure of a conditional knowledge base, which infers a set of conditionals intermediate between the preferential closure and the rational closure. Regarding the properties of disjunctive rational closure, the news is somewhat mixed, with several basic postulates satisfied, as well as Cautious Monotonicity, but with neither Cut nor Vacuity holding in general. Regarding Cut, the reason for its failure seems tied to the fact that disjunctive rational closure places special importance on the conditionals that are explicitly written as part of the knowledge base. In this regard it shares commonalities with other base-driven approaches to defeasible inference, such as the lexicographic closure (Lehmann 1995). We conjecture that a weaker version of Cut will still hold for our approach, according to which the newly added conditional α |∼ β is such that α already appears as an antecedent of another conditional in KB. Regarding Vacuity, our impossibility result and the surrounding discussion tell us that its failure is unavoidable given the other, reasonable, behaviour that we have shown disjunctive rational closure to exhibit. Essentially, when trying to devise a method for conditional inference under Disjunctive Rationality, we are faced with a choice between Vacuity and Cautious Monotonicity, with disjunctive rational closure favouring the latter at the expense of the former. It is possible, of course, to tweak the current approach by treating the case when |∼^KB_PC happens to be a disjunctive relation separately, outputting the preferential closure in this case, while returning the disjunctive rational closure otherwise. However, the full ripple effects of this manoeuvre on the other properties of |∼^KB_DC remain to be worked out.
As for future work, we plan to start by checking whether disjunctive rational closure satisfies Negated Representation Independence, as well as the Justification postulate. We also plan to investigate suitable definitions of a preference relation on the set of interval-based interpretations. We hope our construction can be shown to be the most preferred extension of the knowledge base according to some intuitively defined preference relation, as has been done in the rational case. In this work we required the postulate of Infra-Rationality. As a result our construction of disjunctive rational closure took the rational closure as a starting point and then performed a particular modification to it to obtain a special ‘privileged’ subset of it that extends the input knowledge base and forms a disjunctive consequence relation. However it is clear that this modification could just as well be applied to any of the other conditional inference methods that have been suggested in the literature and that output a rational consequence relation, such as the lexicographic closure or System JLZ (Weydert 2003) or those based on c-revisions (Kern-Isberner 2001). It will be interesting to see what kind of properties will be gained or lost in these cases. Finally, given the recent trend in applying defeasible reasoning to formal ontologies in Description Logics (Bonatti et al. 2015; Bonatti and Sauro 2017; Britz, Meyer, and Varzinczak 2011; Britz and Varzinczak 2019; Giordano et al. 2015; Pensel and Turhan 2018), an investigation of our approach beyond the propositional case is also envisaged. Acknowledgments This work is based upon research supported in part by the “Programme de Recherche Commun” Non-Classical Reasoning for Enhanced Ontology-based Semantic Technologies between the CNRS and the Royal Society. Thanks to the anonymous NMR reviewers for some helpful suggestions. References Bonatti, P., and Sauro, L. 2017. 
On the logical properties of the nonmonotonic description logic DL^N. Artificial Intelligence 248:85–111. Bonatti, P.; Faella, M.; Petrova, I.; and Sauro, L. 2015. A new semantics for overriding in description logics. Artificial Intelligence 222:1–48. Bonatti, P. 2019. Rational closure for all description logics. Artificial Intelligence 274:197–223. Booth, R., and Paris, J. 1998. A note on the rational closure of knowledge bases with both positive and negative knowledge. Journal of Logic, Language and Information 7(2):165–190. Booth, R.; Casini, G.; Meyer, T.; and Varzinczak, I. 2019. On rational entailment for propositional typicality logic. Artificial Intelligence 277. Booth, R.; Meyer, T.; and Varzinczak, I. 2012. PTL: A propositional typicality logic. In Fariñas del Cerro, L.; Herzig, A.; and Mengin, J., eds., Proceedings of the 13th European Conference on Logics in Artificial Intelligence (JELIA), number 7519 in LNCS, 107–119. Springer. Britz, K., and Varzinczak, I. 2017. Toward defeasible SROIQ. In Proceedings of the 30th International Workshop on Description Logics. Britz, K., and Varzinczak, I. 2018a. From KLM-style conditionals to defeasible modalities, and back. Journal of Applied Non-Classical Logics (JANCL) 28(1):92–121. Britz, K., and Varzinczak, I. 2018b. Preferential accessibility and preferred worlds. Journal of Logic, Language and Information (JoLLI) 27(2):133–155. Britz, K., and Varzinczak, I. 2019. Contextual rational closure for defeasible ALC. Annals of Mathematics and Artificial Intelligence 87(1-2):83–108. Britz, K.; Meyer, T.; and Varzinczak, I. 2011. Semantic foundation for preferential description logics. In Wang, D., and Reynolds, M., eds., Proceedings of the 24th Australasian Joint Conference on Artificial Intelligence, number 7106 in LNAI, 491–500. Springer. Casini, G.; Meyer, T.; Moodley, K.; and Nortjé, R. 2014. Relevant closure: A new form of defeasible reasoning for description logics.
In Fermé, E., and Leite, J., eds., Proceedings of the 14th European Conference on Logics in Artificial Intelligence (JELIA), number 8761 in LNCS, 92–106. Springer. Casini, G.; Meyer, T.; Moodley, K.; Sattler, U.; and Varzinczak, I. 2015. Introducing defeasibility into OWL ontologies. In Arenas, M.; Corcho, O.; Simperl, E.; Strohmaier, M.; d’Aquin, M.; Srinivas, K.; Groth, P.; Dumontier, M.; Heflin, J.; Thirunarayan, K.; and Staab, S., eds., Proceedings of the 14th International Semantic Web Conference (ISWC), number 9367 in LNCS, 409–426. Springer. Casini, G.; Meyer, T.; and Varzinczak, I. 2019. Taking defeasible entailment beyond rational closure. In Calimeri, F.; Leone, N.; and Manna, M., eds., Proceedings of the 16th European Conference on Logics in Artificial Intelligence (JELIA), number 11468 in LNCS, 182–197. Springer. Chafik, A.; Cheikh, F.; Condotta, J.-F.; and Varzinczak, I. 2020. On the decidability of a fragment of preferential LTL. In Proceedings of the 27th International Symposium on Temporal Representation and Reasoning (TIME). Fishburn, P. 1985. Interval Orders and Interval Graphs: A Study of Partially Ordered Sets. Wiley. Freund, M. 1993. Injective models and disjunctive relations. Journal of Logic and Computation 3(3):231–247. Gärdenfors, P., and Makinson, D. 1994. Nonmonotonic inference based on expectations. Artificial Intelligence 65(2):197–245. Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2007. Preferential description logics. In Dershowitz, N., and Voronkov, A., eds., Logic for Programming, Artificial Intelligence, and Reasoning (LPAR), number 4790 in LNAI, 257–272. Springer. Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2010. Preferential vs rational description logics: which one for reasoning about typicality? In Proceedings of the European Conference on Artificial Intelligence (ECAI), 1069–1070. Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2012. A minimal model semantics for nonmonotonic reasoning. 
In Fariñas del Cerro, L.; Herzig, A.; and Mengin, J., eds., Proceedings of the 13th European Conference on Logics in Artificial Intelligence (JELIA), number 7519 in LNCS, 228–241. Springer. Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2015. Semantic characterization of rational closure: From propositional logic to description logics. Artificial Intelligence 226:1–33. Jaeger, M. 1996. Representation independence of nonmonotonic inference relations. In Aiello, L.; Doyle, J.; and Shapiro, S., eds., Proceedings of the 5th International Conference on Principles of Knowledge Representation and Reasoning (KR), 461–472. Morgan Kaufmann Publishers. Kern-Isberner, G. 2001. Conditionals in Nonmonotonic Reasoning and Belief Revision. Springer, Lecture Notes in Artificial Intelligence. Kraus, S.; Lehmann, D.; and Magidor, M. 1990. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence 44:167–207. Lehmann, D., and Magidor, M. 1992. What does a conditional knowledge base entail? Artificial Intelligence 55:1–60. Lehmann, D. 1995. Another perspective on default reasoning. Annals of Mathematics and Artificial Intelligence 15(1):61–82. Makinson, D. 1994. General patterns in nonmonotonic reasoning. In Gabbay, D.; Hogger, C.; and Robinson, J., eds., Handbook of Logic in Artificial Intelligence and Logic Programming, volume 3. Oxford University Press. 35–110. Marquis, P., and Schwind, N. 2014. Lost in translation: Language independence in propositional logic – application to belief change. Artificial Intelligence 206:1–24. McCarthy, J. 1980. Circumscription, a form of nonmonotonic reasoning. Artificial Intelligence 13(1-2):27–39. Pensel, M., and Turhan, A.-Y. 2017. Including quantification in defeasible reasoning for the description logic EL⊥. In Balduccini, M., and Janhunen, T., eds., Proceedings of the 14th International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR), number 10377 in LNCS, 78–84. Springer.
Pensel, M., and Turhan, A.-Y. 2018. Reasoning in the defeasible description logic EL⊥ – computing standard inferences under rational and relevant semantics. International Journal of Approximate Reasoning 112:28–70. Reiter, R. 1980. A logic for default reasoning. Artificial Intelligence 13(1-2):81–132. Rott, H. 2014. Four floors for the theory of theory change: The case of imperfect discrimination. In Fermé, E., and Leite, J., eds., Proceedings of the 14th European Conference on Logics in Artificial Intelligence (JELIA), number 8761 in LNCS, 368–382. Springer. Shoham, Y. 1988. Reasoning about Change: Time and Causation from the Standpoint of Artificial Intelligence. MIT Press. Varzinczak, I. 2018. A note on a description logic of concept and role typicality for defeasible reasoning over ontologies. Logica Universalis 12(3-4):297–325. Weydert, E. 2003. System JLZ - rational default reasoning by minimal ranking constructions. Journal of Applied Logic 1(3-4):273–308.

Treewidth-Aware Complexity in ASP: Not all Positive Cycles are Equally Hard∗

Jorge Fandinno¹, Markus Hecher¹,²
¹University of Potsdam, Germany  ²TU Wien, Austria
{jorgefandinno, mhecher}@gmail.com

Abstract

It is well-known that deciding consistency for normal answer set programs (ASP) is NP-complete, thus as hard as the satisfiability problem for classical propositional logic (SAT). The best algorithms to solve these problems take exponential time in the worst case. The exponential time hypothesis (ETH) implies that this result is tight for SAT, that is, SAT cannot be solved in subexponential time. This immediately establishes that the result is also tight for the consistency problem for ASP. However, accounting for the treewidth of the problem, the consistency problem for ASP is slightly harder than SAT: while SAT can be solved by an algorithm that runs in exponential time in the treewidth k, it was recently shown that ASP requires exponential time in k · log(k). This extra cost is due to checking that there are no self-supported true atoms caused by positive cycles in the program. In this paper, we refine the above result and show that the consistency problem for ASP can be solved in exponential time in k · log(λ), where λ is the minimum between the treewidth and the size of the largest strongly-connected component in the positive dependency graph of the program. We provide a dynamic programming algorithm that solves the problem and a treewidth-aware reduction from ASP to SAT that adhere to the above limit.

1 Introduction

Answer Set Programming (ASP) (Brewka, Eiter, and Truszczyński 2011; Gebser et al. 2012) is a problem modeling and solving paradigm well-known in the area of knowledge representation and reasoning that is experiencing an increasing number of successful applications (Balduccini, Gelfond, and Nogueira 2006; Nogueira et al. 2001; Guziolowski et al. 2013). The flexibility of ASP comes with a high computational complexity cost: its consistency problem, that is, deciding the existence of a solution (answer set) for a given logic program, is Σ^P_2-complete (Eiter and Gottlob 1995) in general. Fragments with lower complexity are also known. For instance, the consistency problem for normal ASP or head-cycle-free (HCF) ASP is NP-complete. Even for solving this class of programs, the best known algorithms require exponential time with respect to the size of the program. Still, existing solvers (Gebser et al. 2012; Alviano et al. 2017) are able to find solutions for many interesting problems in reasonable time. A way to shed light on this discrepancy is by means of parameterized complexity (Cygan et al. 2015), which conducts more fine-grained complexity analysis in terms of parameters of a problem. For ASP, several results were achieved in this direction (Gottlob, Scarcello, and Sideri 2002; Lonc and Truszczynski 2003; Lin and Zhao 2004; Fichte and Szeider 2015); some insights involve even combinations (Lackner and Pfandler 2012; Fichte, Kronegger, and Woltran 2019) of parameters. More recent studies focus on the influence of the parameter treewidth for solving ASP (Jakl, Pichler, and Woltran 2009; Fichte et al. 2017; Fichte and Hecher 2019; Bichler, Morak, and Woltran 2018; Bliem et al. 2020). These works directly make use of the treewidth of a given logic program in order to solve, e.g., the consistency problem, in polynomial time in the program size, while being exponential only in the treewidth. Recently, it was shown that for normal ASP deciding consistency is expected to be slightly superexponential for treewidth (Hecher 2020). More concretely, a lower bound was established saying that under reasonable assumptions such as the Exponential Time Hypothesis (ETH) (Impagliazzo, Paturi, and Zane 2001), consistency for any normal logic program of treewidth k cannot be decided in time significantly better than 2^{k·⌈log(k)⌉} · poly(n), where n is the number of variables (atoms) of the program. This result matches the known upper bound (Fichte and Hecher 2019) and shows that the consistency of normal ASP is slightly harder than the satisfiability (SAT) of a propositional formula, which under the ETH cannot be decided in time 2^{o(k)} · poly(n).

We address this result and provide a more detailed analysis, where besides treewidth, we also consider the size ℓ of the largest strongly-connected component (SCC) of the positive dependency graph as a parameter. This allows us to obtain runtimes below 2^{k·⌈log(k)⌉} · poly(n) and to show that not all positive cycles of logic programs are equally hard. Then, we also provide a treewidth-aware reduction from head-cycle-free ASP to the fragment of tight ASP, which prohibits cycles in the corresponding positive dependency graph. This reduction maps a given head-cycle-free program of treewidth k to a tight program of treewidth O(k · log(ℓ)), which improves known results (Hecher 2020). Finally, we establish that tight ASP is as hard as SAT in terms of treewidth.

∗The work has been supported by the Austrian Science Fund (FWF), Grants Y698 and P32830, and the Vienna Science and Technology Fund, Grant WWTF ICT19-065. It is also accepted for presentation at the ASPOCP'20 workshop (Fandinno and Hecher 2020).

Let m, n, o be non-negative integers such that m ≤ n ≤ o, and let a1, ..., ao be distinct propositional atoms. Moreover, we refer by literal to an atom or the negation thereof. A (logic) program Π is a set of rules of the form a1 ∨ ··· ∨ am ← am+1, ..., an, ¬an+1, ..., ¬ao. For a rule r, we let Hr := {a1, ..., am}, Br+ := {am+1, ..., an}, and Br− := {an+1, ..., ao}. We denote the sets of atoms occurring in a rule r or in a program Π by at(r) := Hr ∪ Br+ ∪ Br− and at(Π) := ⋃_{r∈Π} at(r). For a set X ⊆ at(Π) of atoms, we let X̄ := {¬x | x ∈ X}. Program Π is normal if |Hr| ≤ 1 for every r ∈ Π. The positive dependency digraph DΠ of Π is the directed graph defined on the set of atoms from ⋃_{r∈Π} Hr ∪ Br+, where there is a directed edge from vertex a to vertex b iff there is a rule r ∈ Π with a ∈ Br+ and b ∈ Hr. A head-cycle of DΠ is an {a, b}-cycle¹ for two distinct atoms a, b ∈ Hr for some rule r ∈ Π. A program Π is head-cycle-free (HCF) if DΠ contains no head-cycle (Ben-Eliyahu and Dechter 1994), and Π is called tight if DΠ contains no cycle at all (Lin and Zhao 2003). The class of tight, normal, and HCF programs is referred to by tight, normal, and HCF ASP, respectively. An interpretation I is a set of atoms. I satisfies a rule r if (Hr ∪ Br−) ∩ I ≠ ∅ or Br+ \ I ≠ ∅. I is a model of Π if it satisfies all rules of Π, in symbols I |= Π. For brevity, we view propositional formulas as sets of clauses that need to be satisfied, and use the notion of interpretations, models, and satisfiability analogously.
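As a minimal illustration of these definitions (our own encoding, not the paper's implementation), rules can be stored as triples (Hr, Br+, Br−) of atom sets, and the satisfaction condition above transcribes directly:

```python
# Rules as (H, B+, B-) triples of atom sets; a tiny program with the three rule
# shapes above: a <- d;  b <- e, not f;  e v f v g <-.
rules = [
    ({"a"}, {"d"}, set()),
    ({"b"}, {"e"}, {"f"}),
    ({"e", "f", "g"}, set(), set()),
]

def satisfies(I, rule):
    """I satisfies r iff (Hr | Br-) & I is non-empty or B+ \\ I is non-empty."""
    h, b_pos, b_neg = rule
    return bool((h | b_neg) & I) or bool(b_pos - I)

def is_model(I, program):
    return all(satisfies(I, r) for r in program)
```

For instance, {b, e} is a model (the disjunctive fact is satisfied by e, and b ← e, ¬f has a true head), while {e} is not, since the body of b ← e, ¬f holds but its head does not.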
The Gelfond-Lifschitz (GL) reduct of Π under I is the program Π^I obtained from Π by first removing all rules r with Br− ∩ I ≠ ∅ and then removing all ¬z with z ∈ Br− from every remaining rule r (Gelfond and Lifschitz 1991). I is an answer set of a program Π if I is a minimal model of Π^I. The problem of deciding whether an ASP program has an answer set is called consistency, which is Σ^P_2-complete (Eiter and Gottlob 1995). If the input is restricted to normal programs, the complexity drops to NP-complete (Bidoit and Froidevaux 1991; Marek and Truszczyński 1991). A head-cycle-free program Π can be translated into a normal program in polynomial time (Ben-Eliyahu and Dechter 1994).

The following characterization of answer sets is often invoked when considering normal programs (Lin and Zhao 2003). Given a set A ⊆ at(Π) of atoms, a function σ: A → {0, ..., |A|−1} is called a level mapping over A. Given a model I of a normal program Π and a level mapping σ over I, an atom a ∈ I is proven if there is a rule r ∈ Π proving a with σ, i.e., a ∈ Hr with (i) Br+ ⊆ I, (ii) I ∩ Br− = ∅ and I ∩ (Hr \ {a}) = ∅, and (iii) σ(b) < σ(a) for every b ∈ Br+. Then, I is an answer set of Π if (i) I is a model of Π, and (ii) I is proven, i.e., every a ∈ I is proven. This characterization vacuously extends to head-cycle-free programs (Ben-Eliyahu and Dechter 1994) and allows for further simplification when considering SCCs of DΠ (Janhunen 2006). To this end, we denote for each atom a ∈ at(Π) the strongly-connected component (SCC) of atom a in DΠ by scc(a). Then, Condition (iii) above can be relaxed to σ(b) < σ(a) for every b ∈ Br+ ∩ C, where C = scc(a) is the SCC of a.

Contributions. More concretely, we present the following. 1.
First, we establish a parameterized algorithm for deciding consistency of any head-cycle-free program Π that runs in time 2^{O(k·log(ℓ))} · poly(|at(Π)|), where k is the treewidth of Π and ℓ is the size of the largest strongly-connected component (SCC) of the dependency graph of Π. Combining this result with results from (Hecher 2020), consistency of any head-cycle-free program can be decided in 2^{O(k·log(λ))} · poly(|at(Π)|), where λ is the minimum of k and ℓ. Besides, our algorithm bijectively preserves answer sets with respect to the atoms of Π and can therefore be easily extended, see, e.g., (Pichler, Rümmele, and Woltran 2010), for counting and enumerating answer sets.

2. Then, we present a treewidth-aware reduction from head-cycle-free ASP to tight ASP. Our reduction takes any head-cycle-free program Π and creates a tight program whose treewidth is at most O(k · log(ℓ)), where k is the treewidth of Π and ℓ is the size of the largest SCC of the dependency graph of Π. In general, the treewidth of the resulting tight program cannot be in o(k · log(k)), unless ETH fails. Our reduction forms a major improvement for the particular case where ℓ ≪ k.

3. Finally, we show a treewidth-aware reduction that takes any tight logic program Π and creates a propositional formula whose treewidth is linear in the treewidth of the program. This reduction cannot be significantly improved under ETH. Our result also establishes that for deciding consistency of tight logic programs of bounded treewidth k, one indeed obtains the same runtime as for SAT, namely 2^{O(k)} · poly(|at(Π)|), which is ETH-tight.

Related Work. While the largest SCC size has already been considered (Janhunen 2006), it has not been studied in combination with treewidth. Also programs where the number of even and/or odd cycles is bounded have been analyzed (Lin and Zhao 2004), which is orthogonal to the size of the largest cycle or the largest SCC size ℓ.
Indeed, in the worst case, each component might have a number of cycles that is exponential in ℓ. Further, the literature distinguishes the so-called feedback width (Gottlob, Scarcello, and Sideri 2002), which involves the number of atoms required to break the positive cycles. There are also related measures, called smallest backdoor size, where the removal of a backdoor, i.e., a set of atoms, from the program results in normal or acyclic programs (Fichte and Szeider 2015; Fichte and Szeider 2017).

2 Background

We assume familiarity with graph terminology. Given a directed graph G = (V, E), a set C ⊆ V of vertices of G is a strongly-connected component (SCC) of G if C is a ⊆-largest set such that for every two distinct vertices u, v in C there is a directed path from u to v in G. A cycle over some vertex v of G is a directed path from v to v.

Answer Set Programming (ASP). Further, we assume familiarity with propositional satisfiability (SAT) and follow standard definitions of propositional ASP (Brewka, Eiter, and Truszczyński 2011).

¹Let G = (V, E) be a digraph and W ⊆ V. Then, a cycle in G is a W-cycle if it contains all vertices from W.

Figure 1: Positive dependency graph DΠ of Π of Example 1.

Figure 2: Graph G (left) and a tree decomposition T of G (right), with bags χ(t1) = {a, b, d}, χ(t2) = {b, c, d}, χ(t3) = {b, d, e}, χ(t4) = {e, f, g}, and χ(t5) = {b, e, f}.

Example 1. Consider the program Π given by the following rules:
r1: a ← d;  r2: b ← a;  r3: b ← d;  r4: b ← e, ¬f;  r5: c ← b;  r6: d ← b, c;  r7: e ∨ f ∨ g ←.
Observe that Π is head-cycle-free. Figure 1 shows the positive dependency graph DΠ, consisting of the SCCs scc(e), scc(f), scc(g), and scc(a) = scc(b) = scc(c) = scc(d).

Example 3. Recall program Π from Example 1 and observe that graph G of Figure 2 is the primal graph of Π. Further, we have Πt1 = {r1, r2, r3}, Πt2 = {r3, r5, r6}, Πt3 = ∅, Πt4 = {r7}, and Πt5 = {r4}.
The set I := {a, b, c, d, e} is an answer set of Π, since I |= Π and we can prove, with the level mapping σ := {e ↦ 0, f ↦ 0, g ↦ 0, b ↦ 0, c ↦ 1, d ↦ 2, a ↦ 3}, atom e by rule r7, atom b by rule r4, atom c by rule r5, atom d by rule r6, and atom a by rule r1. Further answer sets are {f} and {g}.

3 Bounding Treewidth and Positive Cycles

Recently, it was shown that, under reasonable assumptions, namely the exponential time hypothesis (ETH), deciding consistency of normal logic programs is slightly superexponential, and one cannot expect to significantly improve this in the worst case. For a given normal logic program, where k is the treewidth of the primal graph of the program, this implies that one cannot decide consistency in time significantly better than 2^{k·⌈log(k)⌉} · poly(|at(Π)|).

Tree Decompositions (TDs). A tree decomposition (TD) (Robertson and Seymour 1986) of a given graph G = (V, E) is a pair T = (T, χ), where T is a tree rooted at root(T) and χ assigns to each node t of T a set χ(t) ⊆ V, called a bag, such that (i) V = ⋃_{t of T} χ(t), (ii) E ⊆ {{u, v} | t in T, {u, v} ⊆ χ(t)}, and (iii) "connectedness": for each r, s, t of T such that s lies on the path from r to t, we have χ(r) ∩ χ(t) ⊆ χ(s). For every node t of T, we denote by chld(t) the set of child nodes of t in T. The bags χ≤t below t consist of the union of all bags of nodes below t in T, including t. We let width(T) := max_{t of T} |χ(t)| − 1. The treewidth tw(G) of G is the minimum width(T) over all TDs T of G. TDs can be 5-approximated in time single-exponential in the treewidth (Bodlaender et al. 2016). For a node t of T, we say that type(t) is leaf if t has no children and χ(t) = ∅; join if t has children t′ and t′′ with t′ ≠ t′′ and χ(t) = χ(t′) = χ(t′′); int ("introduce") if t has a single child t′, χ(t′) ⊆ χ(t) and |χ(t)| = |χ(t′)| + 1; and forget if t has a single child t′, χ(t′) ⊇ χ(t) and |χ(t′)| = |χ(t)| + 1. If for every node t of T we have type(t) ∈ {leaf, join, int, forget}, the TD is called nice.
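Conditions (i)-(iii) of a tree decomposition can be verified mechanically. The following sketch checks them for the bags of Figure 2; the tree shape used below is our reading of the figure and is an assumption on our part.

```python
def is_tree_decomposition(V, E, bags, tree_edges):
    """Check TD conditions (i)-(iii); return (ok, width) with width = max bag size - 1."""
    adj = {t: set() for t in bags}
    for s, t in tree_edges:
        adj[s].add(t)
        adj[t].add(s)
    # (i) every vertex occurs in some bag
    if set().union(*bags.values()) != set(V):
        return False, None
    # (ii) every edge of G is covered by some bag
    for u, v in E:
        if not any({u, v} <= bag for bag in bags.values()):
            return False, None
    # (iii) connectedness: the nodes whose bags contain v induce a connected subtree
    for v in V:
        hosts = {t for t in bags if v in bags[t]}
        start = next(iter(hosts))
        reached, frontier = {start}, [start]
        while frontier:
            t = frontier.pop()
            for s in adj[t]:
                if s in hosts and s not in reached:
                    reached.add(s)
                    frontier.append(s)
        if reached != hosts:
            return False, None
    return True, max(len(b) for b in bags.values()) - 1

# Primal graph of Example 1 and the TD of Figure 2 (tree shape guessed from the figure).
V = set("abcdefg")
E = [("a", "b"), ("a", "d"), ("b", "c"), ("b", "d"), ("b", "e"), ("b", "f"),
     ("c", "d"), ("e", "f"), ("e", "g"), ("f", "g")]
BAGS = {"t1": {"a", "b", "d"}, "t2": {"b", "c", "d"}, "t3": {"b", "d", "e"},
        "t4": {"e", "f", "g"}, "t5": {"b", "e", "f"}}
TREE = [("t5", "t3"), ("t5", "t4"), ("t3", "t1"), ("t3", "t2")]
```

On this input the check succeeds with width 2, matching Example 2.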
A TD can be turned into a nice TD without increasing the width in linear time (Kloks 1994, Lem. 13.1.3).

Example 2. Figure 2 illustrates a graph G and a TD T of G of width 2, which is also the treewidth of G, since G contains a completely connected subgraph among the vertices b, c, d (Kloks 1994).

In order to use TDs for ASP, we need dedicated graph representations of programs (Jakl, Pichler, and Woltran 2009). The primal graph² GΠ of a program Π has the atoms of Π as vertices and an edge {a, b} whenever there exists a rule r ∈ Π with a, b ∈ at(r). Let T = (T, χ) be a TD of the primal graph GΠ of a program Π, and let t be a node of T. The bag program Πt contains the rules entirely covered by the bag χ(t); formally, Πt := {r | r ∈ Π, at(r) ⊆ χ(t)}.

Proposition 1 (Lower Bound for Treewidth (Hecher 2020)). Let Π be a normal or head-cycle-free logic program, where k is the treewidth of the primal graph of Π. Then, under ETH, one cannot decide consistency of Π in time 2^{o(k·log(k))} · poly(|at(Π)|).

While, according to Proposition 1, we cannot expect to significantly improve the runtime for normal logic programs in the worst case, it is still worth studying the underlying reason that makes the worst case so bad. It is well known that positive cycles are responsible for the hardness (Lifschitz and Razborov 2006; Janhunen 2006) of computing answer sets of normal logic programs. The particular issue with a logic program Π, in combination with treewidth and large cycles, is that in a tree decomposition of GΠ a cycle may spread across the whole decomposition, i.e., the bags contain only parts of such cycles, which prevents viewing these cycles (and dependencies) as a whole. This is also the reason for the hardness stated in Proposition 1 and explains why, under bounded treewidth, evaluating normal logic programs is harder than evaluating propositional formulas.
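To make the objects of this discussion concrete: for tiny programs, answer sets can still be computed by brute force over all 2^n interpretations, via the standard reduct-based characterization (I is an answer set iff I is a minimal model of the reduct of Π with respect to I). The triple encoding of rules below is our own; it is exactly this exponential enumeration that treewidth-based algorithms avoid.

```python
from itertools import combinations

# Rules of Example 1 as (head, positive body, negative body) triples (our encoding).
RULES = [({"a"}, {"d"}, set()),            # r1: a <- d
         ({"b"}, {"a"}, set()),            # r2: b <- a
         ({"b"}, {"d"}, set()),            # r3: b <- d
         ({"b"}, {"e"}, {"f"}),            # r4: b <- e, not f
         ({"c"}, {"b"}, set()),            # r5: c <- b
         ({"d"}, {"b", "c"}, set()),       # r6: d <- b, c
         ({"e", "f", "g"}, set(), set())]  # r7: e v f v g <-
ATOMS = sorted("abcdefg")

def is_model(I, rules):
    # A rule is satisfied unless its body holds (B+ true, B- false) and its head is false.
    return all(h & I or not p <= I or n & I for h, p, n in rules)

def answer_sets(atoms, rules):
    """Brute force: I is an answer set iff I is a minimal model of the reduct w.r.t. I."""
    subsets = [set(c) for r in range(len(atoms) + 1)
               for c in combinations(atoms, r)]
    found = []
    for I in subsets:
        # Gelfond-Lifschitz reduct: drop rules blocked by I, strip negative bodies.
        reduct = [(h, p, set()) for h, p, n in rules if not (n & I)]
        if is_model(I, reduct) and \
           not any(J < I and is_model(J, reduct) for J in subsets):
            found.append(frozenset(I))
    return found
```

On Example 1 this yields exactly the three answer sets stated above: {a, b, c, d, e}, {f}, and {g}.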
However, if a given normal logic program only has positive cycles of length at most 3, and each atom appears in at most one positive cycle, the properties of tree decompositions already ensure that the atoms of each such positive cycle appear in at least one common bag. Indeed, a cycle of length at most 3 forms a completely connected subgraph, and therefore it is guaranteed (Kloks 1994) that the atoms of the cycle appear in one common bag of any tree decomposition of GΠ.

Example 4. Recall program Π of Example 1. Observe that in any TD of GΠ there have to be nodes t, t′ with χ(t) ⊇ {b, c, d} and χ(t′) ⊇ {a, b, d}, since a cycle of length 3 in the positive dependency graph DΠ (cf. Figure 1) forms a completely connected subgraph in the primal graph, cf. Figure 2 (left).

In the following, we generalize this result to cycles of length at most ℓ, where we bound the size of these positive cycles in order to improve the lower bound of Proposition 1 on programs of bounded positive cycle length. This provides not only a significant improvement in the running time on programs where the size of positive cycles is bounded, but also shows that the case of positive cycle lengths up to 3 can indeed be generalized to lengths beyond 3. Consequently, we establish that not all positive cycles are bad, assuming that the maximum size ℓ of the positive cycles is bounded, which improves upon Proposition 1 as long as ℓ ≪ k, where k is the treewidth of GΠ. The overall idea of the algorithm relies on so-called dynamic programming, which we briefly recap next.

Dynamic Programming on Tree Decompositions. Dynamic programming (DP) on TDs, see, e.g., (Bodlaender and Koster 2008), evaluates a given input instance I in parts, along a given TD of a graph representation G of the instance.

²Analogously, the primal graph GF of a propositional formula F uses the variables of F as vertices and adjoins two vertices a, b by an edge if there is a clause in F containing both a and b.
Thereby, for each node t of the TD, intermediate results are stored in a table τt. This is achieved by running a table algorithm, which is designed for a certain graph representation and stores in τt results for the parts of I encountered so far, thereby considering the tables τt′ of the child nodes t′ of t. For many problem instances I, DP works as follows.

1. Construct a graph representation G of I.
2. Compute a TD T = (T, χ) of G. For simplicity and better presentation of the different cases within our table algorithms, we use nice TDs for DP.
3. Traverse the nodes of T in post-order (bottom-up tree traversal of T). At every node t of T during the traversal, execute a table algorithm that takes as input the bag χ(t), a certain bag instance It depending on the problem, as well as the previously computed child tables of t. The results of this execution are stored in table τt.
4. Finally, interpret the table τn for the root node n of T in order to output the solution to the problem for instance I.

Bounding Positive Cycles. In the remainder, we assume an HCF logic program Π, whose treewidth is given by k = tw(GΠ). For each atom a, we let ℓscc(a) be the number of atoms (the size) of the SCC of a in DΠ. Further, we let ℓ := max_{a∈at(Π)} ℓscc(a) be the largest SCC size, which also bounds the lengths of positive cycles. If each atom a appears in at most one positive cycle, then ℓscc(a) is the cycle length of a, and ℓ is the length of the largest cycle in Π. We refer to the class of HCF logic programs whose largest SCC size is bounded by a parameter ℓ as SCC-bounded ASP. Observe that the largest SCC size ℓ is orthogonal to the treewidth.

Example 5. Consider program Π from Example 1. Then, ℓscc(e) = ℓscc(f) = ℓscc(g) = 1, ℓscc(a) = ℓscc(b) = ℓscc(c) = ℓscc(d) = 4, and ℓ = 4. Now, assume a program Π′ whose primal graph equals its dependency graph, which is just one large (positive) cycle.
It is easy to see that this program has treewidth 2, and one can define a TD of GΠ′ whose bags are constructed along the cycle. However, the largest SCC size coincides with the number of atoms. Conversely, there are instances of large treewidth without any positive cycle.

Bounding cycle lengths or SCC sizes may seem similar to the non-parameterized setting, where the consistency of normal logic programs is compiled to a propositional formula (SAT) by a reduction based on level mappings that is applied on an SCC-by-SCC basis (Janhunen 2006). However, that reduction does not preserve the treewidth. On the other hand, while our approach also uses level mappings and proceeds on an SCC-by-SCC basis, the overall evaluation is not SCC-based, since this might completely destroy the treewidth in the worst case. Instead, the evaluation is still guided along a tree decomposition, which is presented in two flavors. First, we show a dedicated parameterized algorithm for the evaluation of logic programs of bounded treewidth, followed by a treewidth-aware reduction to propositional satisfiability.

3.1 Towards Exploiting Treewidth for SCC-bounded ASP

In the course of this and the next section, we establish the following result.

Theorem 1 (Runtime of SCC-bounded ASP). Assume an HCF logic program Π, where the treewidth of the primal graph GΠ of Π is at most k and ℓ is the largest SCC size. Then, there is an algorithm for deciding the consistency of Π, running in time 2^{O(k·log(λ))} · poly(|at(Π)|), where λ = min({k, ℓ}).

Now, the missing ingredient for solving problems via dynamic programming along a given TD is a suitable table algorithm. Such algorithms have already been presented for SAT (Samer and Szeider 2010) and ASP (Jakl, Pichler, and Woltran 2009; Fichte et al. 2017; Fichte and Hecher 2019). We briefly sketch the ideas of a table algorithm using the primal graph that computes the models (not the answer sets) of a given program Π. Each table τt consists of rows storing interpretations over the atoms in the bag χ(t). The table τt for a leaf node t consists of the empty interpretation. For a node t with introduced atom a ∈ χ(t), we store in τt the interpretations of the child table, but for each such interpretation we decide whether a is in the interpretation or not, and we ensure that the interpretation satisfies Πt. When an atom b is forgotten in a forget node t, we store the interpretations of the child table, restricted to the atoms in χ(t). By the properties of a TD, it is then guaranteed that all rules containing b have been processed so far. For join nodes, we store in τt those interpretations that are contained in both child tables of t.

3.2 An Algorithm for SCC-bounded ASP and Treewidth

Similar to the table algorithm sketched above, we next present a table algorithm BndCyc for solving consistency of SCC-bounded ASP. To this end, let Π be a given SCC-bounded program of largest SCC size ℓ, and let T = (T, χ) be a tree decomposition of GΠ. Before we discuss the tables and the algorithm itself, we need to define level mappings similar to related work (Janhunen 2006), but adapted to SCC-bounded programs. Formally, a level mapping σ : A → {0, . . . , ℓ−1} over atoms A ⊆ at(Π) is a function mapping each atom a ∈ A to a level σ(a) that does not exceed the size of a's SCC, i.e., σ(a) < ℓscc(a).
For a function σ mapping x to σ(x), we let σa∼ := σ \ {a ↦ σ(a)} be the function σ without a. Further, for a given set S and an element e, we let Se+ := S ∪ {e} and Se− := S \ {e}. Level mappings are used in the construction of the tables of BndCyc, where each table τt for a node t of TD T consists of rows of the form ⟨I, P, σ⟩, where I ⊆ χ(t) is an interpretation of the atoms χ(t), P ⊆ χ(t) is the set of atoms in χ(t) that are proven, and σ is a level mapping over χ(t).

Before we discuss the table algorithm, we need auxiliary notation. Let proven(I, σ, Πt) be the subset of I containing all atoms a ∈ I for which there is a rule r ∈ Πt proving a with σ. However, σ provides for a only a level number within the SCC of a, i.e., proven requires the relaxed characterization of provability that considers scc(a), as given in Section 2. Then, we denote by levelMaps(σ, I) the set of level mappings σ′ that extend σ by the atoms in I, where for each atom a ∈ I we have a level σ′(a) with σ′(a) < ℓscc(a). Further, we let isMin(σ, Πt) be 0 if σ is not minimal, i.e., if there is an atom a with σ(a) > 0 such that a rule r ∈ Πt proves a with a level mapping ρ that is identical to σ but sets ρ(a) = σ(a) − 1, and 1 otherwise.

Listing 1: Table algorithm BndCyc(t, χ(t), Πt, ⟨τ1, . . . , τo⟩) for nodes of nice TDs.
In: Node t, bag χ(t), bag program Πt, sequence ⟨τ1, . . . , τo⟩ of child tables of t. Out: Table τt.
1 if type(t) = leaf then τt ← {⟨∅, ∅, ∅⟩}
2 else if type(t) = int and a ∈ χ(t) is the introduced atom then
3   τt ← {⟨I′, P′, σ′⟩ | ⟨I, P, σ⟩ ∈ τ1, I′ ∈ {I, Ia+}, I′ |= Πt,
4        σ′ ∈ levelMaps(σ, {a} ∩ I′), isMin(σ′, Πt), P′ = P ∪ proven(I′, σ′, Πt)}
5 else if type(t) = forget and a ∉ χ(t) is the forgotten atom then
6   τt ← {⟨Ia−, Pa−, σa∼⟩ | ⟨I, P, σ⟩ ∈ τ1, a ∈ P ∪ ({a} \ I)}
7 else if type(t) = join /* o = 2 children of t */ then
8   τt ← {⟨I, P1 ∪ P2, σ⟩ | ⟨I, P1, σ⟩ ∈ τ1, ⟨I, P2, σ⟩ ∈ τ2}
9 return τt

Listing 1 depicts the algorithm BndCyc for solving consistency of SCC-bounded ASP. The algorithm is inspired by an approach for HCF logic programs (Fichte and Hecher 2019), whose idea is to evaluate Π in parts, as given by the tree decomposition T. For ease of presentation, BndCyc is presented for nice tree decompositions, where we have a clear case distinction for every node t depending on the node type type(t) ∈ {leaf, int, forget, join}; for arbitrary decompositions the cases are interleaved. If type(t) = leaf, we have χ(t) = ∅ and therefore the interpretation, the set of proven atoms, and the level mapping are all empty, cf. Line 1 of Listing 1. Whenever an atom a ∈ χ(t) is introduced, i.e., if type(t) = int, we construct succeeding rows of the form ⟨I′, P′, σ′⟩ for every row in the table τ1 of the child node of t. We take such a row ⟨I, P, σ⟩ of τ1 and guess whether a is in I, resulting in I′, and ensure that I′ satisfies Πt, as given in Line 3. Consequently, I′ is a model (not necessarily an answer set) of Πt. Then, Line 4 takes succeeding level mappings σ′ of σ, as given by levelMaps, that are minimal (see isMin), and we finally ensure that the proven atoms P′ update P by proven(I′, σ′, Πt). Notably, if duplicate answer sets are not an issue, one can remove the occurrence of isMin in Line 4. Whenever an atom a is forgotten in node t, i.e., if type(t) = forget, we take in Line 6 only those rows of the table τ1 of the child node of t where either a is not in the interpretation or a is proven, and remove a from the row accordingly. By the properties of TDs, it is guaranteed that we have encountered all rules involving a in some node below t. Finally, if t is a join node (type(t) = join), we ensure in Line 8 that we take only rows of both child tables of t that agree on interpretations and level mappings, and that an atom is proven if it is proven in one of the two child rows.

Figure 3: Tables obtained by DP on a TD T′ using algorithm BndCyc of Listing 1. [The figure shows a nice TD T′ with nodes t1, . . . , t16 and selected tables τ4, τ5, τ9, τ10, τ11, τ12, τ14, τ16; it is not reproduced here.]

Example 6. Recall program Π with ℓ = 4 from Example 1. Figure 3 shows a nice TD T′ of GΠ and lists selected tables τ1, . . . , τ16 obtained during DP by using BndCyc (cf. Listing 1) on TD T′. Rows highlighted in gray are discarded and do not lead to an answer set; yellow highlighted rows form one answer set. For brevity, we compactly represent tables by grouping rows according to similar level mappings. We write [ℓ] for any value in {0, . . . , ℓ−1}, and we sloppily write, e.g., σ9.3(b) < σ9.3(c) to indicate any level mapping σ9.3 in row 3 of table τ9 where b has a smaller level than c. Node t1 is a leaf (type(t1) = leaf), and therefore τ1 = {⟨∅, ∅, ∅⟩}, as stated in Line 1. Then, nodes t2, t3, and t4 are introduce nodes. Therefore, table τ4 is the result of Lines 3 and 4 executed for nodes t2, t3, and t4, by introducing a, b, and d, respectively. Table τ4 contains all interpretations restricted to {a, b, d} that satisfy Πt4 = {r1, r2, r3}, cf. Line 3. Further, each row contains a level mapping among the atoms in the interpretation such that the corresponding set of proven atoms is obtained, cf. Line 4. Row 4 of τ4, for example, requires a level mapping σ4.4 with σ4.4(d) < σ4.4(a) for a to be proven. Then, node t5 forgets a, which keeps only those rows where either a is not in the interpretation or a is in the set of proven atoms, and removes a from the result. The result of Line 6 on t5 is displayed in table τ5, where Row 3 of τ4 does not have a successor in τ5 since a is not proven. For leaf node t6 we have τt6 = τt1. Similarly to before, t7, t8, and t9 are introduce nodes, and τ9 depicts the resulting table for t9. Table τ10 does not contain any successor row of Row 2 of τ9, since c is not proven. Node t11 is a join node combining rows of τ5 and τ9, as given by Line 8. Observe that Row 3 of τ5 does not match any row in τ9. Further, combining Row 3 of τ5 with Row 3 of τ9 results in Row 4 of τ11 (since ℓ−1 = 3). The remaining tables can be obtained similarly.
Table τ16 for the root node depicts only (solution) rows where each atom is proven.

Consequences on Correctness and Runtime. Next, we sketch correctness and finally show Theorem 1.

Lemma 1 (Correctness). Let Π be an HCF program, where the treewidth of GΠ is at most k and where every SCC C satisfies |C| ≤ ℓ. Then, for a given tree decomposition T = (T, χ) of primal graph GΠ, algorithm BndCyc executed for each node t of T in post-order is correct.

Proof (Sketch). The proof consists of both soundness, which shows that only correct data is in the tables, and completeness, saying that no row of any table is missing. Soundness is established by showing an invariant for every node t, where the invariant is assumed for every child node of t. For the invariant, we use the auxiliary notions of the program Π<t strictly below t, consisting of Πt′ for any node t′ below t, as well as the program Π≤t below t, where Π≤t := Π<t ∪ Πt. Intuitively, the invariant for t states that every row ⟨I, P, σ⟩ of table τt ensures (1) "satisfiability": I |= Πt; (2) "answer set extendability": I can be extended to an answer set of Π<t; (3) "provability": a ∈ P if and only if there is a rule in Π≤t proving a with σ; and (4) "minimality": there is no a ∈ P, r ∈ Π≤t such that r proves a with σ′, where σ′ coincides with σ but sets σ′(a) = σ(a) − 1. Notably, the invariant for the empty root node n = root(T) ensures that if τn ≠ ∅, there is an answer set of Π. Completeness can be shown by establishing that if τt is complete, then every potential row that fulfills the invariant for any child node t′ of t is indeed present in the corresponding table τt′.

Theorem 1 (Runtime of SCC-bounded ASP). Assume an HCF logic program Π, where the treewidth of the primal graph GΠ of Π is at most k and ℓ is the largest SCC size. Then, there is an algorithm for deciding the consistency of Π, running in time 2^{O(k·log(λ))} · poly(|at(Π)|), where λ = min({k, ℓ}).

Proof. First, we compute (Bodlaender et al. 2016) a tree decomposition T = (T, χ) of GΠ that is a 5-approximation of k = tw(GΠ) and has a linear number of nodes, in time 2^{O(k)} · poly(|at(Π)|). Computing ℓscc(a) for each atom a ∈ at(Π) can be done in polynomial time. If ℓ > k, we directly run an algorithm (Fichte and Hecher 2019) for the consistency of Π. Otherwise, i.e., if ℓ ≤ k, we run Listing 1 on each node t of T in a bottom-up (post-order) traversal. In both cases, we obtain a total runtime of 2^{O(k·log(λ))} · poly(|at(Π)|).

In contrast to existing work (Fichte and Hecher 2019), if the largest SCC size ℓ < k, where k is the treewidth of primal graph GΠ, our algorithm runs in time better than the lower bound given by Proposition 1. Further, existing work (Fichte and Hecher 2019) does not precisely characterize answer sets, whereas algorithm BndCyc of Listing 1 exactly computes all the answer sets of Π. Intuitively, the reason for this is that level mappings for an atom x ∈ at(Π) do not differ between different bags of T; instead, we use the same level (at most ℓscc(x) many possibilities) for x in all bags. Notably, capturing all the answer sets of Π allows BndCyc to be slightly extended to count the answer sets of Π, by extending the rows by an integer accordingly. This can be extended further to answer set enumeration with linear delay, which results in an anytime enumeration algorithm that keeps for each row of a table its predecessor rows.

4 Treewidth-Aware Reductions for SCC-bounded ASP

Next, we present a novel reduction from HCF ASP to tight ASP. Given a head-cycle-free logic program, we present a treewidth-aware reduction that constructs a tight logic program with little overhead in terms of treewidth. Concretely, if each SCC of the given head-cycle-free logic program Π has at most ℓ atoms, the resulting tight program has treewidth O(k · log(ℓ)). In the course of this section, we establish the following theorem.

Theorem 2 (Removing Cyclicity of SCC-bounded ASP). Let Π be an HCF program, where the treewidth of GΠ is at most k and where every SCC C satisfies |C| ≤ ℓ. Then, there is a tight program Π′ with treewidth in O(k · log(ℓ)) such that for each answer set of Π there is exactly one answer set of Π′, and vice versa.

4.1 Reduction to Tight ASP

The overall construction of the reduction is inspired by the idea of treewidth-aware reductions (Hecher 2020). In the following, we assume an SCC-bounded program Π and a tree decomposition T = (T, χ) of GΠ such that the construction of the resulting tight logic program Π′ is heavily guided along T. In contrast to existing work (Hecher 2020), bounding cycles with the largest SCC size additionally allows us to have a "global" level mapping (Janhunen 2006), i.e., we do not have different levels for an atom in different bags. Then, while the overall reduction is still guided along the tree decomposition T in order to take care not to increase the treewidth too much, these global level mappings ensure that the tight program is guaranteed to preserve all answer sets (projected to the atoms of Π), as stated in Theorem 2. Before we discuss the construction in detail, we require auxiliary atoms and notation as follows.

In order to guide the evaluation of the provability of an atom x ∈ at(Π) in a node t of T along the decomposition T, we use atoms pxt and px≤t to indicate that x was proven in node t (with some rule in Πt) and below t, respectively. Further, we require atoms bjx, called level bits, for x ∈ at(Π) and 1 ≤ j ≤ ⌈log(ℓscc(x))⌉, which are used as bits in order to represent the level of x in a level mapping in binary. To this end, for x, a number i with 0 ≤ i < ℓscc(x), and a position number 1 ≤ j ≤ ⌈log(ℓscc(x))⌉, we denote the j-th position of i in binary by [i]j.
Then, we let [[x]]i be the consistent set of literals over the level bits bjx that is used to represent level number i for x in binary. More precisely, for each position number j, [[x]]i contains bjx if [i]j = 1, and ¬bjx otherwise, i.e., if [i]j = 0. Finally, we also use auxiliary atoms of the form x ≺ i (which could be optimized out) to indicate that the level for x represented by the level bits is indeed smaller than i > 0.

Example 7. Recall program Π, level mapping σ, and largest SCC size ℓ = 4 from Example 1. For representing σ in binary, we require ⌈log(ℓ)⌉ = 2 bits per atom a ∈ at(Π), and we assume that bits are ordered from least to most significant. So [σ(e)]0 = [σ(e)]1 = 0, [σ(c)]0 = 1, and [σ(c)]1 = 0. Then, we have [[e]]σ(e) = {¬b0e, ¬b1e}, [[b]]σ(b) = {¬b0b, ¬b1b}, [[c]]σ(c) = {b0c, ¬b1c}, [[d]]σ(d) = {¬b0d, b1d}, and [[a]]σ(a) = {b0a, b1a}.

Next, we are ready to discuss the treewidth-aware reduction from SCC-bounded ASP to tight ASP, which takes Π and T and creates a tight logic program Π′. To this end, let t be any node of T, and for a set S of atoms let ¬S denote the default-negated literals {¬a | a ∈ S}. For each node t of T, we add the following rules.

(1) {x} ←   for each x ∈ χ(t)³
(2) ← B+r, ¬(B−r ∪ Hr)   for each r ∈ Πt
(3) {bjx} ←   for each x ∈ χ(t) and 1 ≤ j ≤ ⌈log(ℓscc(x))⌉³
(4) x ≺ i ← ¬bjx, ¬bj1x, . . . , ¬bjsx   for each x ∈ χ(t), C = scc(x), 1 ≤ i < ℓC, and 1 ≤ j ≤ ⌈log(ℓC)⌉ with [i]j = 1 and {j′ | j < j′ ≤ ⌈log(ℓC)⌉, [i]j′ = 0} = {j1, . . . , js}
(5) pxt ← x, [[x]]i, B+r, ¬(B−r ∪ (Hr \ {x})), (B+r ∩ C) ≺ i   for each r ∈ Πt and x ∈ χ(t) with x ∈ Hr, C = scc(x), 1 ≤ i < ℓC, and B+r ∩ C ≠ ∅
(6) pxt ← x, B+r, ¬(B−r ∪ (Hr \ {x}))   for each r ∈ Πt and x ∈ χ(t) with x ∈ Hr and B+r ∩ scc(x) = ∅
(7) px≤t ← pxt   for each x ∈ χ(t)
(8) px≤t ← px≤t′   for each x ∈ χ(t) and t′ ∈ chld(t) with x ∈ χ(t′)
(9) ← x, ¬px≤t′   for each t′ ∈ chld(t) and x ∈ χ(t′) \ χ(t)
(10) ← x, ¬px≤n   for each x ∈ χ(n), n = root(T)
(11) ← ¬x, bjx   for each x ∈ χ(t) and 1 ≤ j ≤ ⌈log(ℓscc(x))⌉
(12) ← x, [[x]]i, B+r, ¬(B−r ∪ (Hr \ {x})), (B+r ∩ C) ≺ i−1   for each r ∈ Πt and x ∈ χ(t) with x ∈ Hr, C = scc(x), 2 ≤ i < ℓC, and B+r ∩ C ≠ ∅
(13) ← x, [[x]]i, B+r, ¬(B−r ∪ (Hr \ {x}))   for each r ∈ Πt and x ∈ χ(t) with x ∈ Hr, C = scc(x), 1 ≤ i < ℓC, and B+r ∩ C = ∅

First, truth values for each atom x ∈ χ(t) are subject to a guess by Rules (1), and Rules (2) ensure that all rules of Πt are satisfied. Notably, by the definition of tree decompositions, Rules (1) and Rules (2) indeed cover all the atoms of Π and all rules of Π, respectively. The next block of rules, Rules (3)–(10), is used for ensuring provability, and the last block, Rules (11)–(13), is required in order to preserve answer sets, i.e., these rules prevent duplicate answer sets of Π′ for one specific answer set of Π.

For the block of Rules (3)–(10) to ensure provability, we need to guess the level bits for each atom x, as given in Rules (3). Rules (4) ensure that we correctly define x ≺ i, which is the case if there exists a bit [i]j that is set to 1 while we have ¬bjx, and for all larger positions j′ > j with [i]j′ = 0 we also have ¬bj′x. Then, for Rules (5) we slightly abuse the notation x ≺ i and use it also for a set X, where X ≺ i denotes the set of atoms x ≺ i for each x ∈ X. Rules (5) make sure that whenever a rule r ∈ Πt proves x with the level mapping given by the level bits over atoms in χ(t), we have provability pxt for x in t. However, only for the atoms of the positive body B+r that are in the same SCC C = scc(x) as x do we need to check that their levels are smaller than the level of x, since by the definition of SCCs there cannot be a positive cycle among atoms of different SCCs. As a result, if there is a rule where no atom of the positive body is in C, satisfying the rule is enough for proving x, as given by Rules (6). If provability pxt holds, we also have px≤t by Rules (7), and provability is propagated from a node t′ to its parent node t by Rules (8), setting px≤t if px≤t′. Finally, whenever an atom x is forgotten in a node t, we require provability ensured by Rules (9) and, since t might be root(T), by Rules (10).

Preserving answer sets (bijectively): The last block, consisting of Rules (11), (12), and (13), makes sure that atoms that are false, i.e., not in the answer set of Π′, get level 0, and that we prohibit levels for an atom x that could be safely decreased by one without losing provability. This ensures that for each answer set of Π we get exactly one corresponding answer set of Π′, and vice versa.

Example 8. Recall program Π of Example 1 and the TD T = (T, χ) of GΠ given in Figure 2. Rules (1) and Rules (2) are constructed for each atom a ∈ at(Π) and for each rule r ∈ Π, respectively. Similarly, Rules (3) are constructed for each of the ⌈log(ℓscc(a))⌉ many bits of each atom a ∈ at(Π). Rules (4) serve as auxiliary definitions, where, e.g., for atom c we construct c≺1 ← ¬b0c, ¬b1c; c≺2 ← ¬b1c; c≺3 ← ¬b0c; and c≺3 ← ¬b1c. Next, we show Rules (5)–(13) for node t2.

Rules (5): pbt2 ← b, [[b]]1, d≺1, d;  pbt2 ← b, [[b]]2, d≺2, d;  pbt2 ← b, [[b]]3, d≺3, d;  pct2 ← c, [[c]]1, b≺1, b;  pct2 ← c, [[c]]2, b≺2, b;  pct2 ← c, [[c]]3, b≺3, b;  pdt2 ← d, [[d]]1, b≺1, c≺1, b, c;  pdt2 ← d, [[d]]2, b≺2, c≺2, b, c;  pdt2 ← d, [[d]]3, b≺3, c≺3, b, c.

³A choice rule (Simons, Niemelä, and Soininen 2002) is of the form {a} ←, and in an HCF logic program it corresponds to a disjunctive rule a ∨ a′ ←, where a′ is a fresh atom.
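The binary machinery of Example 7 and the Rules (4) of the reduction can be sketched directly; the string naming of level bits below ("b0c" for b0c, prefixed with "not " for negation) is our own convention.

```python
def bit(i, j):
    """[i]_j: the j-th bit of i, least significant first (as read off Example 7)."""
    return (i >> j) & 1

def level_literals(x, i, nbits):
    """[[x]]_i as a set of signed level bits: (True, "b0c") stands for the bit atom
    b0 of atom c, and (False, "b0c") for its negation."""
    return {(bit(i, j) == 1, "b%d%s" % (j, x)) for j in range(nbits)}

def prec_rule_bodies(x, i, nbits):
    """Bodies of the Rules (4) defining x < i: one rule per bit [i]_j = 1, negating
    the bit at position j together with every higher position j' where [i]_j' = 0."""
    bodies = []
    for j in range(nbits):
        if bit(i, j):
            bodies.append({"not b%d%s" % (j, x)} |
                          {"not b%d%s" % (j2, x)
                           for j2 in range(j + 1, nbits) if not bit(i, j2)})
    return bodies
```

For atom c with two bits, this reproduces the level literals of Example 7 and the four auxiliary rules c≺1, c≺2, and c≺3 constructed in Example 8.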
54 (7) pb≤t2 ← pbt2 ; pc≤t2 ← pct2 ; pd≤t2 ← pdt2 (11) ← ¬b, b0b ; ← ¬b, b1b ; ← ¬c, b0c ; ← ¬c, b1c ; ← ¬d, b0d ; ← ¬d, b1d (12) ← b, [[b]]2 , d≺1, d; ← b, [[b]]3 , d≺2, d; ← c, [[c]]2 , d≺1, d; ← c, [[c]]3 , d≺2, d; ← d, [[d]]2 , b≺1, c≺1, b, c; ← d, [[d]]3 , b≺2, c≺2, b, c For root node t5 of T , we obtain the following Rules (5)–(13). No. Rules (6) pbt5 ← b, e, ¬f (7) pb≤t5 ← pbt5 ; pe≤t5 ← pet5 ; pf≤t5 ← pft5 (8) pb≤t5 ← pb≤t3 ; pe≤t5 ← pe≤t3 ; pe≤t5 ← pe≤t4 ; pf≤t5 ← pf≤t4 (9) ← d, ¬pd≤t3 ; ← g, ¬pg≤t4 (10) ← b, ¬pb≤t5 ; ← e, ¬pe≤t5 ; ← f, ¬pf≤t5 (11) ← ¬b, b0b ; ← ¬b, b1b ; ← ¬e, b0e ; ← ¬e, b1e ; ← ¬f, b0f ; ← ¬f, b1f (13) ← b, [[b]]1 , e, ¬f ; ← b, [[b]]2 , e, ¬f ; ← b, [[b]]3 , e, ¬f Lemma 3 (Treewidth-Awareness). Let Π be an HCF program, where every SCC C satisfies |C| ≤ ℓ. Then, the treewidth of tight program Π′ obtained by the reduction above, i.e., Rules (1)–(13), by using Π and a TD T = (T, χ) of primal graph GΠ of width k, is in O(k · log(ℓ)). Proof (Sketch). We take T = (T, χ) and construct a TD T ′ :=(T, χ′ ) of GΠ′ , where χ′ is defined as follows. For every node t of T , whose parent node is t∗ , we let χ′ (t) :=χ(t) ∪ {bjx | x ∈ χ(t), 0 ≤ j ≤ ⌈log(ℓscc(x) )⌉} ∪ {pxt , px≤t , p≤t∗ | x ∈ χ(t)}. It is easy to see that indeed all atoms of every instance of Rules (1)–(13) appear in at least one common bag of χ′ . Further, we have connectedness of T ′ , i.e., T ′ is a TD of GΠ′ and |χ(t)| in O(k · log(ℓ)). Finally, we are in the position to prove Theorem 2. Theorem 2 (Removing Cyclicity of SCC-bounded ASP). Let Π be an HCF program, where the treewidth of GΠ is at most k and where every SCC C satisfies |C| ≤ ℓ. Then, there is a tight program Π′ with treewidth in O(k · log(ℓ)) such that for each answer set of Π there is exactly one answer set of Π′ , and vice versa. Correctness and Treewidth-Awareness. We discuss correctness and treewidth-awareness as follows. Lemma 2 (Correctness). 
Let Π be an HCF program, where the treewidth of GΠ is at most k and where every SCC C satisfies |C| ≤ ℓ. Then, the tight program Π′ obtained by the reduction above on Π and a tree decomposition T = (T, χ) of primal graph GΠ , is correct. Formally, for any answer set I of Π there is exactly one answer set I ′ of Π′ as given by Rules (1)–(13) and vice versa. Proof. First, we compute a tree decomposition T = (T, χ) of GΠ that is a 5-approximation of k = tw (GΠ ) in time 2O(k) · poly(|at(Π)|). Observe that the reduction consisting of Rules (1)–(13) on Π and T runs in polynomial time, precisely in time O(k · log(ℓ) · poly(|at(Π)|)). The claim follows by correctness (Lemma 2) and by treewidth-awareness as given by Lemma 3. Proof. “=⇒”: Given any answer set I of Π. Then, there exists a unique (Janhunen 2006), minimal level mapping σ proving each x ∈ I with 0 ≤ σ(x) < ℓscc(x) . Let P :={pxt , px≤t | r ∈ Πt proves x with σ, x ∈ I, t in T }. From this we construct an interpretation I ′ :=I ∪ {bjx | [σ(x)]j = 1, 0 ≤ j ≤ ⌈log(ℓscc(x) )⌉, x ∈ I} ∪ P ∪ {px≤t | x ∈ I, t′ ∈ T, t′ is below t in T, px≤t′ ∈ P }, which sets atoms as I and additionally encodes σ in binary and sets provability accordingly. It is easy to see that I ′ is an answer set of Π′ . “⇐=”: Given any answer set I ′ of Π′ . From this we construct I :=I ′ ∩ at(Π) as well as level mapping σ :={x 7→ fI (x) | x ∈ at(Π)}, where we define function fI ′ (x) : at(Π) → {0, . . . , ℓ−1} for atom x ∈ at(Π) to return 1 ≤ 0 < ℓscc(x) if {bjx | 0 ≤ j ≤ ⌈log(ℓscc(x) )⌉, [i]j = 1} = {bjx ∈ I ′ | 0 ≤ j ≤ ⌈log(ℓscc(x) )⌉}, i.e., the atoms in answer set I ′ binary-encode i for x. Assume towards a contradiction that I 6|= Π. But then I ′ does not satisfy at least one instance of Rules (1) and (2), contradicting that I ′ is an answer set of Π′ . Again, towards a contradiction assume that I is not an answer set of Π, i.e., at least one x ∈ at(Π) cannot be proven with σ. We still have px≤n ∈ I ′ for n = root(T ), by Rules (9) and (10). 
However, then we either have px≤t ∈ I′ or pxn ∈ I′ by Rules (7) and (8) for at least one child node t of n. Finally, by the connectedness property (iii) of the definition of TDs, there has to be a node t′ that is either n or a descendant of n with pxt′ ∈ I′. Consequently, by Rules (5) and (6), as well as the auxiliary Rules (3) and (4), there is a rule r ∈ Π that proves x with σ, contradicting the assumption. Similarly, Rules (11), (12), and (13) ensure minimality of σ.

Having established Theorem 2, the reduction above also easily allows for an alternative proof of Theorem 1. Instead of Algorithm BndCyc of Listing 1, one could compile the resulting tight program of the reduction above to a propositional formula (SAT) and use an existing algorithm for SAT to decide satisfiability. Indeed, such algorithms run in time single-exponential in the treewidth (Samer and Szeider 2010), and we end up with running times similar to those of Theorem 1.

4.2 Reduction to SAT

Having established the reduction of SCC-bounded ASP to tight ASP, we now present a treewidth-aware reduction of tight ASP to SAT; together, the two reductions allow us to reduce SCC-bounded ASP to SAT. While the step from tight ASP to SAT might seem straightforward for the program Π′ obtained by the reduction above, in general it is not guaranteed that existing reductions, e.g., (Fages 1994; Lin and Zhao 2003; Janhunen 2006), do not cause a significant blowup in the treewidth of the resulting propositional formula. Indeed, one needs to take care and define a treewidth-aware reduction.

Let Π be any given tight logic program and T = (T, χ) be a tree decomposition of GΠ. Similar to the reduction from SCC-bounded ASP to tight ASP, we use as variables, besides the original atoms of Π, also auxiliary variables.
In order to preserve treewidth, we still need to guide the evaluation of the provability of an atom x ∈ at(Π) in a node t along the TD T, whereby we use atoms pxt and px≤t to indicate that x was proven in node t and below t, respectively. However, we do not need any level mappings, since there is no positive cycle in Π; instead, we guide the idea of Clark's completion (Clark 1977) along the TD T. Consequently, we construct the following propositional formula, where for each node t of T we add Formulas (14)–(18):

(14) ⋁_{a ∈ Br+} ¬a ∨ ⋁_{a ∈ Br− ∪ Hr} a   for each r ∈ Πt
(15) x → px≤t′   for each t′ ∈ chld(t) and x ∈ χ(t′) \ χ(t)
(16) x → px≤n   for each x ∈ χ(n), where n = root(T)
(17) pxt ↔ ⋁_{r ∈ Πt, x ∈ Hr} (⋀_{a ∈ Br+} a ∧ x ∧ ⋀_{b ∈ Br− ∪ (Hr \ {x})} ¬b)   for each x ∈ χ(t)
(18) px≤t ↔ pxt ∨ ⋁_{t′ ∈ chld(t), x ∈ χ(t′)} px≤t′   for each x ∈ χ(t)

Intuitively, Formulas (14) ensure that all rules are satisfied, cf. Rules (2). Formulas (15) and (16) take care that ultimately an atom that is set to true is required to be proven, similar to Rules (9) and (10). Finally, Formulas (17) and (18) provide the definition for an atom to be proven in a node and below a node, respectively, which is similar to Rules (5)–(8), but without the level mappings.

Preserving answer sets: Answer sets are already preserved, i.e., we obtain exactly one model of the resulting propositional formula F for each answer set of Π, and vice versa. If the equivalence (↔) in Formulas (17) and (18) is replaced by an implication (→), we might get duplicate models for one answer set while still ensuring preservation of consistency, i.e., the answers to both decision problems coincide.

Knowing that under ETH tight ASP has roughly the same complexity for treewidth as SAT, we can derive the following corollary, which complements the existing lower bound for normal ASP as given by Proposition 1.

Corollary 1. Let Π be any normal logic program, where the treewidth of GΠ is at most k. Then, under ETH, there is no reduction to a tight logic program Π′ running in time 2^o(k·log(k)) · poly(|at(Π)|) such that tw(GΠ′) is in o(k · log(k)).
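To make the role of completion concrete, here is a minimal sketch of the classical (non-treewidth-aware) Clark-style completion of a tight normal program into CNF. The program representation, the `body_i` variables, and the function name are illustrative assumptions, not the paper's construction; the sketch is only meant to show the translation that Formulas (14)–(18) refine.

```python
def completion(rules, atoms):
    """Clark-style completion of a tight normal program (a sketch).

    rules: list of (head, pos_body, neg_body) triples over atom names.
    Returns CNF clauses; a literal is an (atom, polarity) pair, and each
    rule i gets a fresh body variable body_i.
    """
    clauses = []
    support = {a: [] for a in atoms}
    for i, (h, pos, neg) in enumerate(rules):
        bvar = f"body_{i}"
        support[h].append(bvar)
        # the body variable implies each body literal ...
        for a in pos:
            clauses.append([(bvar, False), (a, True)])
        for a in neg:
            clauses.append([(bvar, False), (a, False)])
        # ... the body literals together imply the body variable,
        clauses.append([(a, False) for a in pos]
                       + [(a, True) for a in neg] + [(bvar, True)])
        # and the rule must be satisfied: body -> head.
        clauses.append([(bvar, False), (h, True)])
    # Support clauses: a true atom needs some rule body to be true.
    # Gathering ALL rules of an atom into one clause is exactly the step
    # that can blow up treewidth, which Formulas (14)-(18) avoid by
    # splitting provability along the tree decomposition.
    for a in atoms:
        clauses.append([(a, False)] + [(b, True) for b in support[a]])
    return clauses
```

For the one-rule program {a ← ¬b}, the support clause for b is the unit clause forcing b false, and a becomes equivalent to its rule body.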
Proof (of Proposition 2). First, we reduce SAT to tight ASP, i.e., we capture all models of a given formula F in a tight program Π. Thereby Π consists of a choice rule for each variable of F and a constraint for each clause. Towards a contradiction, assume the contrary of the proposition. Then, we reduce Π back to a propositional formula F′, running in time 2^o(k) · poly(|at(Π)|) with tw(GF′) in o(k). Consequently, we use an algorithm for SAT (Samer and Szeider 2010) on F′ to effectively solve F in time 2^o(k) · poly(n), where F has n variables, which finally contradicts ETH.

5 Conclusion and Future Work

This paper deals with improving algorithms for deciding the consistency of head-cycle-free (HCF) ASP for bounded treewidth. The existing lower bound states that under the exponential time hypothesis (ETH), we cannot solve an HCF program with n atoms and treewidth k in time 2^o(k·log(k)) · poly(n). In this work, in addition to the treewidth, we also consider the size ℓ of the largest strongly-connected component of the positive dependency graph. Considering both parameters, we obtain a more precise characterization of the runtime: 2^O(k·log(λ)) · poly(n), where λ = min({k, ℓ}). This improves the previous result whenever the strongly-connected components are smaller than the treewidth. Further, we provide a treewidth-aware reduction from HCF ASP to tight ASP, where the treewidth increases from k to O(k · log(ℓ)). Finally, we show that under ETH, tight ASP has roughly the same complexity lower bounds as SAT, which implies that there cannot be a reduction from HCF ASP to tight ASP such that the treewidth only increases from k to o(k · log(k)). Currently, we are performing experiments and a practical analysis of our provided reductions. For future work we suggest investigating precise lower bounds by considering extensions of ETH like the strong ETH (Impagliazzo and Paturi 2001).
It might also be interesting to establish lower bounds taking both parameters k and ℓ into account.

Correctness and Treewidth-Awareness. Conceptually, the proofs of Lemmas 4 and 5 proceed similarly to the proofs of Lemmas 2 and 3, respectively, but without level mappings.

Lemma 4 (Correctness). Let Π be a tight logic program, where the treewidth of GΠ is at most k. Then, the propositional formula F obtained by the reduction above on Π and a TD T of the primal graph GΠ, consisting of Formulas (14)–(18), is correct. Formally, for any answer set I of Π there is exactly one satisfying assignment of F, and vice versa.

Lemma 5 (Treewidth-Awareness). Let Π be a tight logic program. Then, the treewidth of the propositional formula F obtained by the reduction above using Π and a TD T of GΠ of width k is in O(k).

Proof. The proof proceeds similarly to that of Lemma 3. However, due to Formulas (18), and without loss of generality, one needs to consider only TDs where every node has constantly many child nodes. Such a TD can easily be obtained from any given TD by adding auxiliary nodes (Kloks 1994).

References

Alviano, M.; Calimeri, F.; Dodaro, C.; Fuscà, D.; Leone, N.; Perri, S.; Ricca, F.; Veltri, P.; and Zangari, J. 2017. The ASP system DLV2. In LPNMR'17, volume 10377 of LNAI, 215–221. Springer.
Balduccini, M.; Gelfond, M.; and Nogueira, M. 2006. Answer set based design of knowledge systems. Ann. Math. Artif. Intell. 47(1-2):183–219.
Ben-Eliyahu, R., and Dechter, R. 1994. Propositional semantics for disjunctive logic programs. Ann. Math. Artif. Intell. 12(1):53–87.

However, we cannot do much better, as shown next.

Proposition 2 (ETH-Tightness). Let Π be a tight logic program, where the treewidth of GΠ is at most k. Then, under ETH, the treewidth of the resulting propositional formula F cannot be significantly improved, i.e., under ETH there is no reduction running in time 2^o(k) · poly(|at(Π)|) such that tw(GF) is in o(k).
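The first step in the argument for Proposition 2, capturing the models of a CNF in a tight program via a choice rule per variable and a constraint per clause, can be sketched as follows. The DIMACS-style input format and the `v{i}` atom names are illustrative assumptions.

```python
def cnf_to_tight_asp(num_vars, clauses):
    """Capture the models of a CNF in a tight program (a sketch):
    one choice rule per variable, one constraint per clause.

    clauses: list of lists of nonzero ints, DIMACS-style literals.
    Returns the program as a list of ASP-Core-2 rule strings.
    """
    prog = [f"{{v{i}}}." for i in range(1, num_vars + 1)]  # choice rules
    for cl in clauses:
        # clause l1 | ... | lk becomes the constraint :- not l1, ..., not lk
        body = ", ".join(f"not v{l}" if l > 0 else f"v{-l}" for l in cl)
        prog.append(f":- {body}.")
    return prog
```

The resulting program is tight (it has no positive cycles at all), so its answer sets are exactly the models of the CNF.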
Bichler, M.; Morak, M.; and Woltran, S. 2018. Single-shot epistemic logic program solving. In IJCAI'18, 1714–1720. ijcai.org.
Bidoit, N., and Froidevaux, C. 1991. Negation by default and unstratifiable logic programs. Theoretical Computer Science 78(1):85–112.
Bliem, B.; Morak, M.; Moldovan, M.; and Woltran, S. 2020. The impact of treewidth on grounding and solving of answer set programs. J. Artif. Intell. Res. 67:35–80.
Bodlaender, H. L., and Koster, A. M. C. A. 2008. Combinatorial optimization on graphs of bounded treewidth. The Computer Journal 51(3):255–269.
Bodlaender, H. L.; Drange, P. G.; Dregi, M. S.; Fomin, F. V.; Lokshtanov, D.; and Pilipczuk, M. 2016. A c^k n 5-approximation algorithm for treewidth. SIAM J. Comput. 45(2):317–378.
Brewka, G.; Eiter, T.; and Truszczyński, M. 2011. Answer set programming at a glance. Communications of the ACM 54(12):92–103.
Clark, K. L. 1977. Negation as failure. In Logic and Data Bases, Advances in Data Base Theory, 293–322. Plenum Press.
Cygan, M.; Fomin, F. V.; Kowalik, Ł.; Lokshtanov, D.; Marx, D.; Pilipczuk, M.; Pilipczuk, M.; and Saurabh, S. 2015. Parameterized Algorithms. Springer.
Eiter, T., and Gottlob, G. 1995. On the computational cost of disjunctive logic programming: Propositional case. Ann. Math. Artif. Intell. 15(3-4):289–323.
Fages, F. 1994. Consistency of Clark's completion and existence of stable models. Methods Log. Comput. Sci. 1(1):51–60.
Fandinno, J., and Hecher, M. 2020. Treewidth-aware complexity in ASP: Not all positive cycles are equally hard. In ASPOCP@ICLP.
Fichte, J. K., and Hecher, M. 2019. Treewidth and counting projected answer sets. In LPNMR'19, volume 11481 of LNCS, 105–119. Springer.
Fichte, J. K., and Szeider, S. 2015. Backdoors to tractable answer-set programming. Artificial Intelligence 220:64–103.
Fichte, J. K., and Szeider, S. 2017. Backdoor trees for answer set programming. In ASPOCP@LPNMR, volume 1868 of CEUR Workshop Proceedings. CEUR-WS.org.
Fichte, J.
K.; Hecher, M.; Morak, M.; and Woltran, S. 2017. Answer set solving with bounded treewidth revisited. In LPNMR'17, volume 10377 of LNCS, 132–145. Springer.
Fichte, J. K.; Kronegger, M.; and Woltran, S. 2019. A multiparametric view on answer set programming. Ann. Math. Artif. Intell. 86(1-3):121–147.
Gebser, M.; Kaminski, R.; Kaufmann, B.; and Schaub, T. 2012. Answer Set Solving in Practice. Morgan & Claypool.
Gelfond, M., and Lifschitz, V. 1991. Classical negation in logic programs and disjunctive databases. New Generation Comput. 9(3/4):365–386.
Gottlob, G.; Scarcello, F.; and Sideri, M. 2002. Fixed-parameter complexity in AI and nonmonotonic reasoning. Artif. Intell. 138(1-2):55–86.
Guziolowski, C.; Videla, S.; Eduati, F.; Thiele, S.; Cokelaer, T.; Siegel, A.; and Saez-Rodriguez, J. 2013. Exhaustively characterizing feasible logic models of a signaling network using answer set programming. Bioinformatics 29(18):2320–2326. Erratum see Bioinformatics 30(13):1942.
Hecher, M. 2020. Treewidth-aware reductions of normal ASP to SAT – Is normal ASP harder than SAT after all? In KR'20. In press.
Impagliazzo, R., and Paturi, R. 2001. On the complexity of k-SAT. J. Comput. Syst. Sci. 62(2):367–375.
Impagliazzo, R.; Paturi, R.; and Zane, F. 2001. Which problems have strongly exponential complexity? J. of Computer and System Sciences 63(4):512–530.
Jakl, M.; Pichler, R.; and Woltran, S. 2009. Answer-set programming with bounded treewidth. In IJCAI'09, volume 2, 816–822.
Janhunen, T. 2006. Some (in)translatability results for normal logic programs and propositional theories. Journal of Applied Non-Classical Logics 16(1-2):35–86.
Kloks, T. 1994. Treewidth. Computations and Approximations, volume 842 of LNCS. Springer.
Lackner, M., and Pfandler, A. 2012. Fixed-parameter algorithms for finding minimal models. In KR'12. AAAI Press.
Lifschitz, V., and Razborov, A. A. 2006. Why are there so many loop formulas? ACM Trans. Comput. Log. 7(2):261–268.
Lin, F., and Zhao, J. 2003.
On tight logic programs and yet another translation from normal logic programs to propositional logic. In IJCAI'03, 853–858. Morgan Kaufmann.
Lin, F., and Zhao, X. 2004. On odd and even cycles in normal logic programs. In AAAI'04, 80–85. AAAI Press / MIT Press.
Lonc, Z., and Truszczynski, M. 2003. Fixed-parameter complexity of semantics for logic programs. ACM Trans. Comput. Log. 4(1):91–119.
Marek, W., and Truszczyński, M. 1991. Autoepistemic logic. J. of the ACM 38(3):588–619.
Nogueira, M.; Balduccini, M.; Gelfond, M.; Watson, R.; and Barry, M. 2001. An A-Prolog decision support system for the Space Shuttle. In PADL'01, volume 1990 of LNCS, 169–183. Springer.
Pichler, R.; Rümmele, S.; and Woltran, S. 2010. Counting and enumeration problems with bounded treewidth. In LPAR'10, volume 6355 of LNCS, 387–404. Springer.
Robertson, N., and Seymour, P. D. 1986. Graph minors II: Algorithmic aspects of tree-width. J. Algorithms 7:309–322.
Samer, M., and Szeider, S. 2010. Algorithms for propositional model counting. J. Discrete Algorithms 8(1):50–64.
Simons, P.; Niemelä, I.; and Soininen, T. 2002. Extending and implementing the stable model semantics. Artif. Intell. 138(1-2):181–234.

Towards Lightweight Completion Formulas for Lazy Grounding in Answer Set Programming

Bart Bogaerts1, Simon Marynissen1,2, Antonius Weinzierl3
1 Vrije Universiteit Brussel  2 KU Leuven  3 TU Wien
bart.bogaerts@vub.be, simon.marynissen@kuleuven.be, antonius.weinzierl@kr.tuwien.ac.at

Abstract

Lazy grounding takes the idea of lazily generating the SAT encoding one step further by also lazily performing the grounding process. That is, ASP rules are only instantiated when some algorithm detects that they are useful for the solver in its current state. The most prominent class of lazy grounding systems for ASP is based on computation sequences (Liu et al. 2007) and includes systems such as Omiga (Dao-Tran et al. 2012), GASP (Dal Palù et al.
2009), ASPeRiX (Lefèvre and Nicolas 2009), and the recently introduced ALPHA (Weinzierl 2017). The latter is the youngest and most modern member of the family and the only one that integrates lazy grounding with a CDCL solver, resulting in superior search performance over its predecessors. Our work extends the ALPHA algorithm.

Contrary to more traditional ASP systems, lazy grounding systems aim at applications in which the full grounding is so large that simply creating it would pose issues (e.g., it does not fit in main memory). This phenomenon is known as the grounding bottleneck (Balduccini, Lierler, and Schüller 2013). Examples of such problems include queries over a large graph; planning problems with a very large number of potential time steps; or problems where the full grounding contains a lot of unnecessary information and the actual search problem is not very hard.

The essential idea underlying lazy grounding is that all parts of the grounding that do not help the solver in its quest to find a satisfying assignment (a stable model) or prove unsatisfiability are better not given to the solver, since they only consume precious time and memory. Unfortunately, it is not easy to detect which parts those are, and a trade-off shows up (Taupe, Weinzierl, and Friedrich 2019): producing larger parts of the grounding will improve search performance (e.g., propagation can prune larger parts of the search space), but grounding too much will, on the type of instances lazy grounding is built for, result in an unmanageable explosion of the ground theory. Lazy grounding systems and ground-and-solve systems reside on the two extremes of this trade-off: the former produce a minimal part of the theory required to ensure correctness, while the latter produce the entire bottom-up grounding. Our work moves lazy grounding a bit more to the eager side of this trade-off.
Specifically, we focus on completion formulas (Clark 1978) that essentially express that when an atom is true, there must be a rule that supports it (a rule with

Lazy grounding is a technique for avoiding the so-called grounding bottleneck in Answer Set Programming (ASP). The core principle of lazy grounding is to only add parts of the grounding when they are needed to guarantee correctness of the underlying ASP solver. One of the main drawbacks of this approach is that a lot of (valuable) propagation is missed. In this work, we take a first step towards solving this problem by developing a theoretical framework for investigating completion formulas in the context of lazy grounding.

1 Introduction

Answer set programming (ASP) (Marek and Truszczyński 1999) is a well-known knowledge representation paradigm in which logic programs under the stable semantics (Gelfond and Lifschitz 1988) are used to encode problems in the complexity class NP and beyond. From a practical perspective, ASP offers users a rich first-order language, ASP-Core-2 (Calimeri et al. 2013), to express knowledge in, and many efficient ASP solvers (Gebser, Maratea, and Ricca 2017) can subsequently be used to solve problems related to knowledge expressed in ASP-Core-2. Traditional ASP systems work in two phases. First, the input program is grounded (variables are eliminated). Second, a solver is used to find the stable models of the resulting ground theory. For a long time, the ASP community has focused strongly on developing efficient solvers, while only a few grounders were developed. Most modern ASP solvers are in essence extensions of satisfiability (SAT) (Marques-Silva, Lynce, and Malik 2009) solvers, building on conflict-driven clause learning (CDCL) (Marques-Silva and Sakallah 1999). In recent years, in many formalisms that build on top of SAT, we have seen a move towards only generating parts of the SAT encoding on the fly, at moments when it is deemed useful for the solver.
This idea lies at the heart of the CDCL(T) algorithm for SAT modulo theories (Barrett et al. 2009) and is embraced under the name lazy clause generation (Stuckey 2010) in constraint programming (Rossi, van Beek, and Walsh 2006). Answer set programming is no exception: the so-called unfounded-set propagator and aggregate propagator are implemented using the same principles; when needed, they generate clauses for the underlying SAT algorithm. Additionally, lazy clause generation forms the basis of recent constraint ASP solvers (Banbara et al. 2017).

The set of all atoms is denoted by A. If a ∈ A, then var(a) denotes the set of variables occurring in a. We say that a is ground if var(a) = ∅. The set of all ground atoms is denoted Agr. A literal is an atom p or its negation ¬p. The former is called a positive literal, the latter a negative literal. Slightly abusing notation, if l is a literal, we use ¬l to denote the literal that is the negation of l, i.e., we use ¬(¬p) to denote p. The set of all literals is denoted L and the set of ground literals Lgr. A clause is a disjunction of literals. A (normal) rule is an expression of the form

true body and that atom in the head). While ground-and-solve systems add these formulas (in the form of clauses) to their ground theory, lazy grounders cannot do this easily; the reason is that the set of ground rules that could derive a certain atom is not known (more instantiations could be found later on). Consider, for example, the atom p(a) and a rule p(X) ← q(X, Y), where the set of ground instantiations of this rule with p(a) in the head depends on the set of atoms over the binary predicate q. Unless those instances over q are fully grounded, a lazy grounder cannot add the corresponding completion formula. In this paper, we develop lightweight algorithms to detect when that set of rules is complete and hence when completion formulas can be added.
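For the example rule p(X) ← q(X, Y) above, once all instances of q are known, the completion clause for p(a) can be assembled as in the following sketch. The string representations and the function name are illustrative assumptions, not ALPHA's internals.

```python
def completion_clause_for_p(const, q_facts):
    """Completion clause for p(const) w.r.t. the rule p(X) <- q(X, Y):
    if p(const) is true, the body of some instantiation deriving it
    must hold. This is sound only once ALL facts q(const, _) are known,
    which is exactly what a lazy grounder cannot assume in general."""
    betas = [f"beta(p({const}) :- q({x},{y}))"
             for (x, y) in q_facts if x == const]
    return [f"not p({const})"] + betas
```

With the facts q(a, b), q(a, c), and q(d, e), the clause for p(a) gets exactly two support literals; a later fact q(a, d) would invalidate the clause, which is why detecting completeness of the instantiation set matters.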
Our hypothesis is that doing this will improve search performance without blowing up the grounding and, as such, result in overall improved performance of lazy-grounding ASP systems, and specifically the ALPHA system. The main contribution of our paper is the development of a novel method to discover completion formulas during lazy grounding. Our method starts from a static analysis of the input program in which we discover functional dependencies between variable occurrences. During the search, this static analysis is then used to figure out the right moment to add the completion formulas, in a manner that is inspired by the two-watched-literal scheme from SAT, to avoid adding the completion constraints at moments when they have no chance of propagating anyway. We do not have an implementation of this idea available yet, but instead focus on the theoretical principles.

The rest of this paper is structured as follows. In Section 2 we recall some preliminaries. Section 3 contains the different methods for discovering completion formulas. In Section 4, we discuss extensions of our work that could be used to find even more completion formulas. We conclude in Section 5.

p ← L

where p is an atom and L a set of literals. If r is such a rule, its head, positive body, negative body, and body are defined as H(r) = p, B+(r) = A ∩ L, B−(r) = {q ∈ A | ¬q ∈ L}, and B(r) = L, respectively. We call r a fact if B(r) = ∅ and ground if p and all literals in L are ground. We use var(r) to denote the set of variables occurring in r, i.e., var(r) = var(p) ∪ ⋃_{q∈L} var(q). A rule r is safe if all variables in r occur in its positive body, i.e., if var(r) ⊆ var(B+(r)). A logic program P is a finite set of safe rules. P is ground if each r ∈ P is. In our examples, logic programs are presented in a more general format, using, e.g., choice rules (see (Calimeri et al. 2020)). These can easily be translated into the format considered here.
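The rule format just introduced can be sketched as a small data structure, together with the safety check var(r) ⊆ var(B+(r)) and a naive full grounding. The representation (atoms as (predicate, args) pairs, uppercase strings as variables) is an illustrative assumption, not ALPHA's implementation.

```python
from dataclasses import dataclass
from itertools import product

def variables(atom):
    # var(a): argument terms that are variables (uppercase by convention)
    return {t for t in atom[1] if t[:1].isupper()}

@dataclass(frozen=True)
class Rule:
    """Normal rule p <- L, split into H(r), B+(r), and B-(r);
    atoms are (predicate, args) pairs."""
    head: tuple
    pos: tuple
    neg: tuple

    def vars(self):
        # var(r) = var(p) united with the variables of all body literals
        return set().union(variables(self.head),
                           *(variables(a) for a in self.pos + self.neg))

    def is_safe(self):
        """Safety: every variable of r occurs in the positive body."""
        pos_vars = set().union(*(variables(a) for a in self.pos))
        return self.vars() <= pos_vars

    def ground(self, constants):
        """gr(r): one ground instance per grounding substitution of var(r)."""
        vs = sorted(self.vars())
        app = lambda a, s: (a[0], tuple(s.get(t, t) for t in a[1]))
        for vals in product(constants, repeat=len(vs)):
            s = dict(zip(vs, vals))
            yield Rule(app(self.head, s),
                       tuple(app(a, s) for a in self.pos),
                       tuple(app(a, s) for a in self.neg))
```

Grounding p(X) ← q(X, Y) over two constants already yields four instances; the quadratic growth in body-only variables is a tiny instance of the grounding bottleneck discussed above.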
If X is a set of variables, a grounding substitution of X is a mapping σ : X → C. The set of all substitutions of X is denoted sub(X). If e is an expression, a grounding substitution for e is a grounding substitution of its variables. We write [c1/X1, . . . , cn/Xn] for the substitution that maps each Xi to ci and each other variable to itself. The result of applying a substitution σ to an expression e is the expression obtained by replacing all variables X by σ(X) and is denoted σ(e). The most general unifier of two substitutions is defined as usual (Martelli and Montanari 1982). A substitution σ extends a substitution τ if σ is equal to τ on the domain of τ. The grounding of a rule is given by gr(r) = {σ(r) | σ is a grounding substitution}, and the (full) grounding of a program P is defined as gr(P) = ⋃_{r∈P} gr(r).

2 Preliminaries

We now introduce some preliminaries related to answer set programming in general and the ALPHA algorithm specifically. This section is based on the preliminaries of (Bogaerts and Weinzierl 2018).

Answer set programming. Let C be a set of constants, V be a set of variables, and Q be a set of predicates, each with an associated arity, i.e., elements of Q are of the form p/k where p is the predicate name and k its arity. We assume the existence of built-in predicates, such as equality, with a fixed interpretation. A (non-ground) term is an element of C ∪ V.¹ The set of all terms is denoted T. Our definition of a term does not allow for nesting. This eases our exposition, but is not essential for our results. For instance, it allows us to view + as a ternary predicate +/3, i.e., +(X, Y, Z) means that X + Y = Z. A (non-ground) atom is an expression of the form p(t1, . . . , tk) where p/k ∈ Q and ti ∈ T for each i.

A (Herbrand) interpretation I is a finite set of ground atoms. The satisfaction relation between interpretations and literals is given by: I |= p if p ∈ I, and I |= ¬p if p ∉ I.
An interpretation satisfies a set L of literals if it satisfies each literal in L. A partial (Herbrand) interpretation I is a consistent set of ground literals (consistent here means that it does not contain both an atom and its negation). The value of a literal l in a partial interpretation I is lI = t if l ∈ I, f if ¬l ∈ I, and u otherwise.

Given a (partial) interpretation I and a ground program P, we inductively define when an atom is justified (Denecker, Brewka, and Strass 2015) as follows. An atom p is justified in I by P if there is a rule r ∈ P with H(r) = p such that each q+ ∈ B+(r) is justified in I by P and each q− ∈ B−(r) is false in I. A built-in atom is justified in I by P if it is true in I. (¹ Following Weinzierl (2017), we omit function symbols to simplify the presentation. All our results still hold in the presence of function symbols, except for termination, for which additional (syntactic) restrictions must be imposed.)

An interpretation I is a model of a ground program P if for each rule r ∈ P with I |= B(r), also I |= H(r). An interpretation I is a stable model (or answer set) of a ground program P (Gelfond and Lifschitz 1988) if it is a model of P and each true atom in I is justified in I by P. This non-standard characterization of stable models coincides with the original reduct-based characterization, as shown by Denecker, Brewka, and Strass (2015), but simplifies the rest of our presentation. If P is non-ground, we say that I is an answer set of P if it is an answer set of gr(P). The set of all answer sets of P is denoted AS(P).

decide Pick (using some heuristics (Taupe, Weinzierl, and Schenner 2017)) one atom p occurring in Pg that is unknown in Iα and add (p, δ) or (¬p, δ) to α.²
justification-conflict If all atoms in Pg are assigned while some atom is true but not justified, learn a new clause that avoids visiting this assignment again.
In the worst case, the learned clause contains the negation of all decisions, but Bogaerts and Weinzierl (2018) developed more optimized analysis methods. After learning this clause, ALPHA backjumps.

3 Deriving Completion Formulas

We now discuss our modifications to the ALPHA algorithm that allow us to add completion formulas. There are two main problems to be tackled here. The first, and most fundamental, is Question 1: how to generate completion clauses, or, stated differently, how to find all the rules that can derive a certain atom without creating the full grounding. The second is Question 2: when to add completion formulas to the solver. The general idea for the generation is that we will develop approximation methods that overapproximate the set of instantiations of rules that can derive a given atom, based on a static analysis of the program. The reason why we look for an overapproximation is that, in general, finding the exact set of such instantiations would require a semantical analysis. Our methods below are designed based on the principle that such an overapproximation should be as tight as possible. Specifically, our methods will be based on functional dependencies and determined predicates. This section starts by providing definitions for bounds. After that, we explain how bounds can be used in ALPHA. The last subsection describes the different types of bounds and how they can be detected and combined.

The ALPHA algorithm. We now recall the formalization of ALPHA of Bogaerts and Weinzierl (2018). This differs from the original presentation of Weinzierl (2017) in that it does not use the truth value MUST-BE-TRUE, but instead makes the justifiedness of atoms explicit.
The state of ALPHA is a tuple ⟨P, Pg, C, α, SJ⟩, where
• P is a logic program,
• Pg ⊆ gr(P) is the so-far grounded program; we use Σg ⊆ Agr to denote the set of ground atoms that occur in Pg,
• C is a set of (learned) clauses,
• α is the trail; this is a sequence of tuples (l, c) with l a literal and c either the symbol δ, a rule in Pg, or a clause in C. α is restricted to not containing two tuples (l, c) and (¬l, c′); in a tuple (l, c) ∈ α, c represents the reason for making l true: either a decision (denoted δ) or propagation because of some rule or clause; α implicitly determines a partial interpretation denoted Iα = {l | (l, c) ∈ α for some c},
• SJ ⊆ A is the set of atoms that are justified by Pg in Iα.

For clause learning and propagation, a rule p ← L is treated as the clause p ∨ ⋁_{l∈L} ¬l. Hence, whenever we refer to "a clause" in the following, we mean any rule in Pg (viewed as a clause) or any clause in C. We refer to rules whenever the rule structure is needed (for determining justified atoms). ALPHA interleaves CDCL and grounding. It performs (iteratively) the following steps (listed by priority).

conflict If a clause in C ∪ Pg is violated, analyze the conflict, learn a new clause (add it to C), and back-jump (undo the changes to α and SJ that happened since a certain point) following the so-called 1UIP schema (Zhang et al. 2001).
(unit) propagate If all literals of a clause c ∈ C ∪ Pg except for l are false in Iα, add (l, c) to α.
justify If there is a rule r such that B+(r) ⊆ SJ and ¬B−(r) ⊆ Iα, add H(r) to SJ.
ground If, for some grounding substitution σ and r ∈ P, B+(σ(r)) ⊆ Iα, add σ(r) to Pg. In practice, when adding this rule, ALPHA introduces a new, intermediate, propositional variable β(σ(r)) to represent the body of the rule, similar to (Anger et al. 2006).

3.1 Bounds

The core concept of our detection mechanism is the notion of bounds. We have already stated that we want to find overapproximations of grounding substitutions.
We now formalize this.

Definition 3.1. Given a rule r in a program P, a grounding substitution σ is relevant in r with respect to P if B+(σ(r)) is justified in some partial interpretation of P.

The following lemma follows immediately from the characterization of stable models in terms of justifications (Denecker, Brewka, and Strass 2015).

Lemma 3.2. Let I be an answer set of P. If I |= p, then there is a rule r in P and a relevant substitution σ in r such that σ(H(r)) = p.

Proof. By the justification characterization of answer sets, we know that p is justified in I. The claim then follows from the definition of justifiedness.

Definition 3.3. Given a rule r and two sets X and Y of variables in r, a function f : sub(X) → 2^sub(Y) is called a bound in r if for all σ ∈ sub(X) it holds that f(σ) is a superset of the elements τ ∈ sub(Y) for which there is a relevant substitution in r that extends both σ and τ. To denote that f is a bound, we write f : X ⇝ Y. If X = ∅, then we say Y is bounded by f in r. If f(σ) contains at most one element for each σ ∈ sub(X), then f is called a functional bound.

(² ALPHA actually only allows deciding on certain atoms (those of the form β(r)); hence our presentation is slightly more general.)

makes sense to add the completion constraint. This method is very lightweight: it does not trigger additional grounding, does not change the fundamental algorithm underlying ALPHA, and only adds very few additional constraints. It does enable better pruning of the search space. The second way is more proactive, but also more invasive. It happens during the justification-conflict reasoning step. If an atom h is true but not justified, then instead of triggering the justification analysis to resolve why this situation happens, we add the completion formula for h, thereby also avoiding the justification-conflict.
However, since certain atoms β(τ(ri)) from Proposition 3.4 are not yet known to the solver, the corresponding rules also need to be grounded. For this reason, the second way is more intrusive into the grounding algorithm.

3.2 How to use bounds

Bounds can be used to calculate overapproximations of completion formulas. To start, assume that a predicate p is defined by only a single rule r. Assume there is a bound f : var(H(r)) ⇝ var(r), and let σ ∈ sub(var(H(r))). Then with σ, we can determine an overapproximation of the completion formula of h = σ(H(r)) as follows:

¬h ∨ ⋁_{τ ∈ f(σ)} β(τ(r)).

3.3 How to find bounds

In the previous subsection, we showed how bounds can be used to improve the lazy grounding algorithm. We now turn our attention to the question of how to find bounds. In particular, the various types of bounds we define in this section can all be found using a static analysis of the program. We present our methods in increasing order of difficulty, illustrating each of them with examples of rules we encountered in practice, in encodings of the 5th ASP competition (Calimeri et al. 2016).
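The single-rule overapproximation ¬h ∨ ⋁_{τ ∈ f(σ)} β(τ(r)) from Section 3.2 can be sketched as follows, with a bound given as a plain function from head substitutions to candidate rule substitutions. All names here are illustrative assumptions.

```python
def completion_overapprox(h, sigma, bound, beta):
    """Overapproximated completion clause for h = sigma(H(r)):
    the literal 'not h' plus one beta(tau(r)) per tau in f(sigma).
    `bound` plays the role of f : var(H(r)) ~> var(r), and `beta`
    names the propositional body variable of an instantiation."""
    return [f"not {h}"] + [beta(tau) for tau in bound(sigma)]

def id_bound(sigma):
    # the functional bound of a non-projective rule (cf. Proposition 3.6)
    return [sigma]
```

With a functional bound such as `id_bound`, the clause has exactly one support literal per rule; a bad overapproximation would instead enumerate many spurious instantiations and yield an unwieldy clause.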
For all answer sets I of P for which I |= ¬h, the clause trivially holds in I. So assume an answer set I for which I |= h. This means there is a rule ri in P that derives h. Hence by Lemma 3.2, there is a relevant substitution ρ in r that extends σi . This means that I |= β(ρ(r)). By the definition of a bound, it holds that ρ ∈ f (σ). Therefore I satisfies the clause, which we needed to show. id : sub(var(H(r))) → sub(var(r)) : σ 7→ {σ}. Proof. Take σ ∈ sub(var(H(r))). Let τ ∈ sub(var(r)) for which there is a relevant substitution ρ in r that extends both σ and τ . Then τ = ρ = σ. Therefore τ ∈ id (σ), which proves that id is a bound. Remark 3.5. By Lemma 3.11 and Lemma 3.12, it is sufficient to have a bound var(H(r)) y var(B(r)) for each rule r. For both the multiple and the single rule case, the generated clause might be unwieldy, in particular if the bounds are bad overapproximations. Therefore, it is crucial that good bounds are detected, which is discussed in the next subsection. Of course, a question that remains unanswered is when such bounds should be added to the solver. We see two ways to do this. The first way is a very lightweight mechanism that happens during the ground reasoning step. The idea is that as soon as all rules that can derive a specific head h have been grounded, then we add the completion formula for h. Keeping track of this can be done very cheaply: the bounds provide us with an upper bound on the number of rules that can derive a given atom; it suffices to keep track of a simple counter for each atom to know when the criterion is satisfied. As soon as this is the case, all the atoms β(τ (ri )) mentioned in Proposition 3.4 are defined in the solver and it In case a predicate has a single non-projective rule, for each ground instance of the rule, the head is in fact equivalent to the body. This is a very specific and restricted case. We mention it here for two reasons. 
First of all, this is the only case for which ALPHA, without our extensions, already adds completion constraints. Secondly, this (restricted) situation does show up in practical problems. For instance, the following rule was taken from the new Knight Tour with Holes encoding used in the 5th ASP competition (Calimeri et al. 2016):

move(X, Y, XX, YY) ← valid(X, Y, XX, YY), ¬other(X, Y, XX, YY).

Of course, if all the rules for a predicate are non-projective, then we can combine the trivial bounds on each rule to find a completion formula; however, this is not yet detected in the existing ALPHA algorithm.

Case 2: Direct functional dependencies

In certain cases, the body of a rule can contain variables the head does not, yet without increasing the number of instantiations that can derive the same head. This happens especially if arithmetic operations are present. To illustrate this, consider the rule

{gt(A, X, U)} ← elem(A, X), comUnit(U), comUnit(U1), U1 = U + 1, rule(A), U < X.

taken from the new Partner Units encoding used in the 5th ASP competition (Calimeri et al. 2016). This type of pattern occurs quite often, for instance also in Tower of Hanoi, in many temporal problems in which a time parameter is incremented by one, and in problems over a grid in which coordinates are incremented by one. We can see that even though the variable U1 occurs only in the body of the rule, for each instantiation of the head there can be at most one grounding substitution of the rule that derives it. Hence, if all rules for gt have this structure, the completion can also be detected here.

We now formalize this idea. If p is a predicate with arity n, by pj (with 1 ≤ j ≤ n) we denote the j-th argument position of p. For any set J of argument positions, denote by sub(J) the set of assignments of constants to the positions in J. A tuple of constants c1, . . . , cn is succinctly denoted by c.
If p(c) is an atom and J a set of argument positions in p, we write c|J to denote the element in sub(J) that maps each pj ∈ J to cj.

Definition 3.7. A ground atom h is relevant in P if there is a rule r in P and a relevant grounding substitution σ in r such that σ(H(r)) = h. A ground built-in atom is relevant in P if it is true.

Definition 3.8. Let J and K be sets of argument positions of a predicate p in P. We say that J → K is a functional dependency if for all σ ∈ sub(J), there exists at most one τ ∈ sub(K) and relevant atom p(c) in P such that c|J = σ and c|K = τ.

For instance, if p is equality, the following are some functional dependencies: {=1} → {=2}, {=2} → {=1}, {=1, =2} → {=1}. Of the ones mentioned here, the last one is the least interesting. Another example is the predicate +/3. It has, among others, the following functional dependencies: {+1, +2} → {+3}, {+1, +3} → {+2}, {+3, +2} → {+1}.

If a built-in predicate p with arity n occurs in the positive body of a rule r, then a functional dependency of p determines a bound in r.

Proposition 3.9. Assume p is a built-in predicate and p(t) ∈ B+(r). A functional dependency J → K of p induces a functional bound (denoted p(t)^{J→K}) in r:

var({ti | pi ∈ J}) y var({ti | pi ∈ K}).

Proof. Let X = var({ti | pi ∈ J}) and Y = var({ti | pi ∈ K}). Let σ ∈ sub(X). Since J → K is a functional dependency, there exists at most one τσ ∈ sub(Y) such that the atom p(t) is satisfied under some extension of both σ and τσ. Define f : sub(X) → 2^sub(Y) mapping a σ to {τσ} if τσ exists, and to ∅ otherwise. We prove that f is a bound; hence take any σ ∈ sub(X). If there is no τ ∈ sub(Y) for which there is a relevant substitution in r that extends both σ and τ, then we are done. So suppose there is such a τ. We prove that τ = τσ. Any relevant extension in r of both τ and σ justifies p(t), hence satisfies p(t). By definition of τσ we have that τ = τσ. Therefore, τ ∈ f(σ). This proves that f is a bound. That f is functional follows directly from its definition.

As we will see later, bounds originating from functional dependencies of built-in predicates will act as a base case for further functional bounds.

Case 3: Determined predicates

Given a program P, we call a predicate determined if it is defined only by facts. The interpretation of determined predicates can be computed efficiently prior to the solving process, and their value can be used to find bounds on the instantiations of other rules. An example can be found in graph coloring, in which a rule

colored(N) ← assign(N, C), color(C)     (1)

expresses that a node is colored if it is assigned a color. The predicate color here is determined, since it is given by facts. Thus, we know that for each node n, there are at most as many instances of the rule that derive colored(n) as there are colors. Notably, the completion constraint that would be added by taking this into account is exactly the redundant constraint that was added manually in the graph coloring experiments of Leutgeb and Weinzierl (2017) to help lazy grounding, i.e.,

¬colored(n) ∨ assign(n, col1) ∨ · · · ∨ assign(n, colk).

Our new methods obtain this constraint automatically, thereby easing the life of the modeler.

Proposition 3.10. Let r be a rule with d(t) ∈ B+(r) and d a determined predicate. Then there exists a bound ∅ y X, where X is the set of variables in t.

Proof. Every fact d(c) for a tuple of constants c corresponds to at most one element σc in sub(X). Since d is given by facts, we can enumerate its interpretation I_d. Let

f : sub(∅) → 2^sub(X) : σ ↦ {σc | c ∈ I_d}.

We prove that f is a bound. Take σ ∈ sub(∅). Note that σ is necessarily the trivial substitution.
Take τ ∈ sub(X) for which there is a relevant substitution in r that extends both σ and τ. We prove that τ ∈ f(σ), i.e., that τ = σc for some c ∈ I_d. By the existence of that relevant substitution in r, we have that d(t) is satisfied under τ; hence τ is equal to σc for some c ∈ I_d. This proves that f is a bound.

Case 4: Combining bounds

Bounds can be obtained from other bounds in several ways. We already found three base cases of bounds, given in Propositions 3.6, 3.9, and 3.10:
1. If Y ⊆ X ⊆ var(r), then id : X y Y is a bound, where id is the function mapping σ to {σ}.
2. The bound p(t)^{J→K} induced by a built-in atom p(t) ∈ B+(r) with functional dependency J → K.
3. The bound induced by an atom d(t) ∈ B+(r) for a determined predicate d.

Additionally, bounds of different types can be altered or combined to obtain new bounds, as shown in the following lemmas.

Lemma 3.11. Let f : X y Y be a bound in r. Then for any X′ with X ⊆ X′ and any Y′ ⊆ Y, the function

f′ : sub(X′) → 2^sub(Y′) : σ ↦ {τ|Y′ | τ ∈ f(σ|X)}

is also a bound. (Here σ|X denotes σ restricted to the variables in X.)

Proof. Take σ ∈ sub(X′). Let τ′ ∈ sub(Y′) for which there exists a relevant substitution ρ in r that extends both σ and τ′. We prove that τ′ ∈ f′(σ), i.e., that there exists a τ ∈ f(σ|X) such that τ′ = τ|Y′. Take τ = ρ|Y. By definition, τ|Y′ = τ′. We know that ρ extends both σ|X and τ. Therefore, since f is a bound, it holds that τ ∈ f(σ|X). This proves that f′ is a bound.

Lemma 3.12. Let f : X y Y be a bound in r and let U ⊆ var(r). Let h denote the function

h : sub(X ∪ U) → 2^sub(Y ∪ U)

where h(σ) = {τ · σ|U\Y | τ ∈ f(σ|X)} and · denotes the combination of two disjoint projected substitutions. The function h is a bound from X ∪ U to Y ∪ U.

Proof. Take σ ∈ sub(X ∪ U). Let τ ∈ sub(Y ∪ U) for which there is a relevant substitution ρ in r that extends both σ and τ. We prove that τ ∈ h(σ). We know that ρ also extends both σ|X and τ|Y. Now, since f is a bound, τ|Y ∈ f(σ|X). Since ρ extends both σ and τ, it holds that σ|U\Y = τ|U\Y, because U \ Y is contained in the domains of both σ and τ. Therefore τ = τ|Y · τ|U\Y = τ′ · σ|U\Y for some τ′ ∈ f(σ|X). This proves that τ ∈ h(σ); hence h is a bound.

Lemma 3.13. If f : X y Y and g : Y y Z are bounds in r, then the following function is a bound:

h : sub(X) → 2^sub(Z) : σ ↦ ⋃_{τ ∈ f(σ)} g(τ).

Proof. Take σ ∈ sub(X). Let υ ∈ sub(Z) for which there is a relevant substitution ρ in r that extends both σ and υ. As usual, we prove that υ ∈ h(σ). Take τ = ρ|Y. Then ρ is a relevant substitution that extends both τ and υ. Therefore, since g is a bound, υ ∈ g(τ). Likewise, ρ is a relevant substitution that extends both σ and τ. Hence, τ ∈ f(σ) since f is a bound. Combining this proves that υ ∈ h(σ); hence h is a bound.

If only functional bounds are considered, then Lemma 3.12 and Lemma 3.13, together with our first base case, form the axiomatic system for functional dependencies developed by Armstrong (1974).

To illustrate the combination of bounds, consider a rule

h(X) ← +(X, 1, Z), =(Z, U).

In this case, X y U is a functional bound in r: using the functional dependency of +, we see that X y Z is a functional bound; using the dependencies of =, we see that Z y U is a functional bound; hence we can combine them, using Lemma 3.13, to obtain the desired bound. Even more is possible. If f : X y Y and g : X y Y are bounds, then the pointwise union and intersection are also bounds. While the union will not be of much benefit for finding good overapproximations of completion formulas, the intersection of two bounds can be useful, since it allows for more precise approximations.

Case 5: Bounds on argument positions

We have shown that if d is a determined predicate, then it induces a bound. However, sometimes bounds by determined predicates are not explicit. For instance, in the graph coloring example it would make perfect sense to drop color(C) from the body of rule (1), since the fact that C is a color should already follow from its occurrence in assign(N, C). Typical ASP encodings of graph coloring indeed do not contain rule (1), but instead use the rule

colored(N) ← assign(N, C).

Even in this case, it is possible to determine that C is bounded by the determined predicate color by inspecting the defining rules of assign; hence the completion constraint can, in principle, still be derived. We now formally show how to do this.

Definition 3.14. Let p be a predicate with arity n in a program P and let J and K be sets of argument positions in p. If f is a function from sub(J) to 2^sub(K) such that for every relevant atom p(c) in P it holds that c|K ∈ f(c|J), then f is said to be a bound in p, which we denote by f : J y K. If J = ∅, then we say K is bounded by f.

Bounds in rules and predicates are not independent: bounds in rules determine bounds on argument positions and vice versa. This is formalized in the following two propositions.

Proposition 3.15. Let p be a predicate symbol and J and K sets of argument positions in p. If for each rule r of the form p(t) ← ϕ in P, fr : var(t|J) y var(t|K) is a bound in r, then the union of these fr induces a bound in p.

Proof. Let A be any set of argument positions in p. Then A corresponds uniquely to a set Vr ⊆ var(H(r)) for each rule r of p, and Vr is the same for each rule r of p; therefore, this set is denoted V. It is straightforward that sub(A) is in a one-to-one relation with sub(V); misusing notation, we assume sub(A) = sub(V). Then we can define f : sub(J) → 2^sub(K) mapping σ to ∪r fr(σ). We now prove that f is a bound in p. Hence, take a relevant atom p(c) in P.
It suffices to prove that c|K ∈ f(c|J). Since p(c) is relevant, there is a rule r of p and a relevant grounding substitution ρ such that ρ(H(r)) = p(c). By the one-to-one correspondence between sub(J) and sub(VJ), and between sub(K) and sub(VK), we know that c|J ∈ sub(VJ) and c|K ∈ sub(VK). Therefore, since fr is a bound, we know that c|K ∈ fr(c|J). Hence, c|K ∈ f(c|J), which proves that f is a bound in p.

A simple example illustrating this proposition is as follows. Suppose we have the following rules for p:

p(X, Y) ← X = Y + 1.
p(X, Y) ← X = Y − 1.

Both rules have functional bounds from X to Y and vice versa. By taking the union of these two bounds, we get the bound p1 y p2 that maps X to {X − 1, X + 1}. This shows that functional bounds on rules do not necessarily give rise to functional bounds on argument positions.

If new bounds in predicates are detected, then these can be used to find new bounds in rules, analogous to Proposition 3.9.

Proposition 3.16. Let p be a predicate with a bound f : J y K in p. If p(t) ∈ B+(r), then there is a bound

var(t|J) y var(t|K)

in r. This bound is functional if f is functional.

Proof. Let X = var(t|J) and Y = var(t|K). Any element τ′ ∈ sub(K) corresponds to a unique element τ ∈ sub(Y). Similarly, any σ ∈ sub(X) corresponds to a unique element σ′ ∈ sub(J). Define

g : sub(X) → 2^sub(Y) : σ ↦ {τ | τ′ ∈ f(σ′)}.

Take σ ∈ sub(X). Let τ ∈ sub(Y) and let ρ be a relevant substitution in r that extends both σ and τ. We prove that τ ∈ g(σ). Since f is a bound in p, for each relevant atom p(c) it holds that c|K ∈ f(c|J). Since ρ is relevant, we know that p(t) is justified; hence p(ρ(t)) is a relevant atom. Therefore, ρ(t)|K ∈ f(ρ(t)|J) because f is a bound. We can see that ρ(t)|J corresponds to σ and ρ(t)|K corresponds to τ, which completes the proof.

In theory, to find bounds we repeat the two steps below until a fixpoint is reached:
1. find all bounds on variables in rules (using a fixpoint procedure, using the base cases and lemmas in Case 4 and Proposition 3.16);
2. find all bounds on argument positions of predicates (using a fixpoint procedure, using Proposition 3.15). Here we can restrict ourselves to the predicates occurring in positive bodies, since those are the only predicates useful for generating completion formulas.

4 Future work

While the cases studied in the previous section allow for adding completion constraints in a wide variety of applications, we see the current work as a stepping stone towards a more extensive theory of approximations that enable adding completion constraints. In this section, we provide several directions in which the current work can be extended. To tackle this problem in its most general form, one could develop methods similar to grounding with bounds (Wittocx, Mariën, and Denecker 2010), which were developed in the context of model expansion (Mitchell and Ternovska 2005) for an extension of first-order logic (Denecker and Ternovska 2008) that closely relates to answer set programming (Denecker et al. 2019).

Dynamic overapproximations The approximations developed and described in the previous section can all be determined statically. However, during solving, more consequences at decision level zero are sometimes derived. Taking these into account as well (instead of just the determined predicates) can result in better approximations and hence more completion constraints.
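The double fixpoint computation described above can be sketched generically. The skeleton below is our own illustrative rendering, not the actual implementation: the two derivation steps are left abstract, the function names are hypothetical, and bounds are represented simply as hashable descriptions.

```python
# Illustrative skeleton of the double fixpoint procedure: alternate the two
# derivation steps until no new bounds appear. Each step maps the current
# set of known bounds to the bounds it can newly derive.

def bound_fixpoint(derive_rule_bounds, derive_predicate_bounds, bounds):
    bounds = set(bounds)  # work on a copy, do not mutate the caller's set
    while True:
        # step 1: bounds on variables in rules;
        # step 2: bounds on argument positions of predicates.
        # Each step may enable the other, hence the alternation.
        new = derive_rule_bounds(bounds) | derive_predicate_bounds(bounds)
        if new <= bounds:          # nothing new: fixpoint reached
            return bounds
        bounds |= new
```

With toy derivation steps where a rule-level bound enables a predicate-level bound, the procedure needs both alternations before it stabilizes, which is exactly why the two steps are interleaved rather than run once each.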
The interaction between Proposition 3.15 and Proposition 3.16 is shown in the following example program:

u(1..3). w(3..5).
p(A, B) ← u(A), w(B).
q(B) ← p(C, B).
r(X, Y) ← q(Y), X = Y.
r(X, Y) ← p(X, Y).
o(a) ← r(X, a).

We know that both u and w are determined predicates. Therefore, in the rule of p, A is bounded by u and B is bounded by w. This indicates that p1 is bounded by u and p2 is bounded by w. Similarly, q1 is bounded by w. In the first rule of r, Y is bounded by w, and by transitivity X is bounded by w as well. In the second rule of r, X is bounded by u and Y is bounded by w. Therefore, r1 is bounded by the union of u and w, while r2 is bounded by w. Finally, we obtain the following completion formula for o:

¬o(a) ∨ r(1, a) ∨ r(2, a) ∨ r(3, a) ∨ r(4, a) ∨ r(5, a).

More bounds in predicates To find new opportunities to add completion formulas, it is necessary that (especially functional) bounds between argument positions are detected, even though they are not directly used in generating the completion formulas. This detection can be done by syntactic means, such as inspecting their defining rules, or by semantic means (De Cat and Bruynooghe 2013). We already supplied Proposition 3.15; however, this is not sufficient to find all useful bounds. For example, in each rule below we have functional bounds {2, 3} y {4, 5} and {4, 5} y {2, 3}, but the complete predicate has the following fundamental functional bounds: {1, 2, 3} y {4, 5} and {1, 4, 5} y {2, 3}. This is because if you know the first argument position, then you know the rule that is used. If, for example, you have neighbor(n, X, Y, XX, YY) in the positive body of a rule, then you know the first rule is applicable: X = XX and Y = YY − 1.

neighbor(D, X, Y, X, YY) ← D = n, Y = YY − 1.
neighbor(D, X, Y, X, YY) ← D = s, Y = YY + 1.
neighbor(D, X, Y, XX, Y) ← D = w, X = XX − 1.
neighbor(D, X, Y, XX, Y) ← D = e, X = XX + 1.

These dependencies are not detected by the double fixpoint procedure. Intuitively, what is going on here is that the first argument of neighbor is inherently linked to which rule is applicable. Depending on that first argument, we can decide which functional dependency can be generalized to the predicate level (but it is not always the same).
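To make the neighbor example concrete: once the first argument is fixed, each rule determines (XX, YY) from (X, Y) uniquely. The following sketch is our own hypothetical rendering of the resulting predicate-level functional bound {1, 2, 3} y {4, 5}; it is not solver code, only an illustration of the dependency the text describes.

```python
# Hypothetical sketch: the predicate-level functional bound {1,2,3} y {4,5}
# of neighbor/5. The first argument selects which rule applies, and each
# rule then determines (XX, YY) from (X, Y) uniquely.

def neighbor_bound(d, x, y):
    """Return the set of (XX, YY) pairs admitted for direction d;
    it contains at most one element, so the bound is functional."""
    if d == "n": return {(x, y + 1)}   # Y = YY - 1, XX = X
    if d == "s": return {(x, y - 1)}   # Y = YY + 1, XX = X
    if d == "w": return {(x + 1, y)}   # X = XX - 1, YY = Y
    if d == "e": return {(x - 1, y)}   # X = XX + 1, YY = Y
    return set()                       # no rule applies
```

Note that the map from (X, Y) to (XX, YY) differs per direction, which is exactly why no single functional dependency {2, 3} → {4, 5} of the whole predicate can be derived without case-splitting on the first argument.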
5 Conclusion

In this paper, we highlighted the issue of missing completion formulas in lazy grounding and provided lightweight solutions for this issue based on static program analysis. In our theoretical analysis, we found that the completion formulas that can now be added are in some cases identical to redundant constraints added to improve search performance; hence, usage of our techniques eliminates this burden for the programmer. Our next step in this research will be implementing the presented ideas and experimenting to find out what their impact is on the runtime of lazy grounders. In Section 4, we identified several directions in which this work can continue that would allow for the detection of even more completion constraints. We intend to evaluate these as well in follow-up research.

References

Anger, C.; Gebser, M.; Janhunen, T.; and Schaub, T. 2006. What's a head without a body? In Brewka, G.; Coradeschi, S.; Perini, A.; and Traverso, P., eds., ECAI, 769–770. IOS Press.
Armstrong, W. W. 1974. Dependency structures of data base relationships. In IFIP Congress, 580–583.
Balduccini, M.; Lierler, Y.; and Schüller, P. 2013. Prolog and ASP inference under one roof. In Cabalar, P., and Son, T. C., eds., LPNMR 2013, volume 8148 of LNCS, 148–160. Springer.
Banbara, M.; Kaufmann, B.; Ostrowski, M.; and Schaub, T. 2017. Clingcon: The next generation. TPLP 17(4):408–461.
Barrett, C. W.; Sebastiani, R.; Seshia, S. A.; and Tinelli, C. 2009. Satisfiability modulo theories. In Biere et al. (2009), 825–885.
Biere, A.; Heule, M.; van Maaren, H.; and Walsh, T., eds. 2009. Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications. IOS Press.
Bogaerts, B., and Weinzierl, A. 2018. Exploiting justifications for lazy grounding of answer set programs. In Lang, J., ed., IJCAI 2018, 1737–1745. ijcai.org.
Calimeri, F.; Faber, W.; Gebser, M.; Ianni, G.; Kaminski, R.; Krennwallner, T.; Leone, N.; Ricca, F.; and Schaub, T. 2013. ASP-Core-2 input language format. Technical report, ASP Standardization Working Group.
Calimeri, F.; Gebser, M.; Maratea, M.; and Ricca, F. 2016. Design and results of the fifth answer set programming competition. Artif. Intell. 231:151–181.
Calimeri, F.; Faber, W.; Gebser, M.; Ianni, G.; Kaminski, R.; Krennwallner, T.; Leone, N.; Maratea, M.; Ricca, F.; and Schaub, T. 2020. ASP-Core-2 input language format. TPLP 20(2):294–309.
Clark, K. L. 1978. Negation as failure. In Logic and Data Bases, 293–322. Plenum Press.
Dal Palù, A.; Dovier, A.; Pontelli, E.; and Rossi, G. 2009. GASP: Answer set programming with lazy grounding. Fundam. Inform. 96(3):297–322.
Dao-Tran, M.; Eiter, T.; Fink, M.; Weidinger, G.; and Weinzierl, A. 2012. Omiga: An open minded grounding on-the-fly answer set solver. In del Cerro, L. F.; Herzig, A.; and Mengin, J., eds., JELIA, volume 7519 of LNCS, 480–483. Springer.
De Cat, B., and Bruynooghe, M. 2013. Detection and exploitation of functional dependencies for model generation. TPLP 13(4–5):471–485.
Denecker, M., and Ternovska, E. 2008. A logic of nonmonotone inductive definitions. ACM Trans. Comput. Log. 9(2):14:1–14:52.
Denecker, M.; Brewka, G.; and Strass, H. 2015. A formal theory of justifications. In Calimeri, F.; Ianni, G.; and Truszczyński, M., eds., LPNMR 2015, volume 9345 of LNCS, 250–264. Springer.
Denecker, M.; Lierler, Y.; Truszczynski, M.; and Vennekens, J. 2019. The informal semantics of answer set programming: A Tarskian perspective. CoRR abs/1901.09125.
Gebser, M.; Maratea, M.; and Ricca, F. 2017. The sixth answer set programming competition. J. Artif. Intell. Res. 60:41–95.
Gelfond, M., and Lifschitz, V. 1988. The stable model semantics for logic programming. In Kowalski, R. A., and Bowen, K. A., eds., ICLP/SLP, 1070–1080. MIT Press.
Lefèvre, C., and Nicolas, P. 2009. The first version of a new ASP solver: ASPeRiX. In Erdem, E.; Lin, F.; and Schaub, T., eds., LPNMR, volume 5753 of LNCS, 522–527. Springer.
Leutgeb, L., and Weinzierl, A. 2017. Techniques for efficient lazy-grounding ASP solving. In Seipel, D.; Hanus, M.; and Abreu, S., eds., Declare 2017 – Conference on Declarative Programming, number 499 in Institut für Informatik technical report, 123–138.
Liu, L.; Pontelli, E.; Son, T. C.; and Truszczynski, M. 2007. Logic programs with abstract constraint atoms: The role of computations. In Dahl, V., and Niemelä, I., eds., ICLP 2007, volume 4670 of LNCS, 286–301. Springer.
Marek, V., and Truszczyński, M. 1999. Stable models and an alternative logic programming paradigm. In Apt, K. R.; Marek, V.; Truszczyński, M.; and Warren, D. S., eds., The Logic Programming Paradigm: A 25-Year Perspective, 375–398. Springer-Verlag.
Marques-Silva, J. P., and Sakallah, K. A. 1999. GRASP: A search algorithm for propositional satisfiability. IEEE Transactions on Computers 48(5):506–521.
Marques Silva, J. P.; Lynce, I.; and Malik, S. 2009. Conflict-driven clause learning SAT solvers. In Biere et al. (2009), 131–153.
Martelli, A., and Montanari, U. 1982. An efficient unification algorithm. ACM Trans. Program. Lang. Syst. 4(2):258–282.
Mitchell, D. G., and Ternovska, E. 2005. A framework for representing and solving NP search problems. In Veloso, M. M., and Kambhampati, S., eds., AAAI, 430–435. AAAI Press / The MIT Press.
Rossi, F.; van Beek, P.; and Walsh, T., eds. 2006. Handbook of Constraint Programming, volume 2 of Foundations of Artificial Intelligence. Elsevier.
Stuckey, P. J. 2010. Lazy clause generation: Combining the power of SAT and CP (and MIP?) solving. In Lodi, A.; Milano, M.; and Toth, P., eds., CPAIOR 2010, volume 6140 of LNCS, 5–9. Springer.
Taupe, R.; Weinzierl, A.; and Friedrich, G. 2019. Degrees of laziness in grounding – effects of lazy-grounding strategies on ASP solving. In Balduccini, M.; Lierler, Y.; and Woltran, S., eds., LPNMR 2019, volume 11481 of LNCS, 298–311. Springer.
Taupe, R.; Weinzierl, A.; and Schenner, G. 2017. Introducing heuristics for lazy-grounding ASP solving. In 1st International Workshop on Practical Aspects of Answer Set Programming.
Weinzierl, A. 2017. Blending lazy-grounding and CDNL search for answer-set solving. In Balduccini, M., and Janhunen, T., eds., LPNMR 2017, volume 10377 of LNCS, 191–204. Springer.
Wittocx, J.; Mariën, M.; and Denecker, M. 2010. Grounding FO and FO(ID) with bounds. J. Artif. Intell. Res. 38:223–269.
Zhang, L.; Madigan, C. F.; Moskewicz, M. W.; and Malik, S. 2001. Efficient conflict driven learning in a Boolean satisfiability solver. In ICCAD, 279–285.
Splitting a Logic Program Efficiently

Rachel Ben-Eliyahu-Zohary
Department of Software Engineering
Azrieli College of Engineering, Jerusalem, Israel
rbz@jce.ac.il

Abstract

Answer Set Programming (ASP) is a successful method for solving a range of real-world applications. Despite the availability of fast ASP solvers, computing answer sets demands a very large computational power, since the problem tackled is in the second level of the polynomial hierarchy. A speed-up in answer set computation may be attained if the program can be split into two disjoint parts, bottom and top. Thus, the bottom part is evaluated independently of the top part, and the results of the bottom part evaluation are used to simplify the top part. Lifschitz and Turner have introduced the concept of a splitting set, i.e., a set of atoms that defines the splitting. In this paper, we address two issues regarding splitting. First, we show that the problem of computing a splitting set with some desirable properties can be reduced to a classic search problem and solved in polynomial time. Second, we show that the definition of splitting sets can be adjusted to allow splitting of a broader class of programs.

1 Introduction

Answer Set Programming (ASP) is a successful method for solving a range of real-world applications. Despite the availability of fast ASP solvers, the task of computing answer sets demands extensive computational power, because the problem tackled is in the second level of the polynomial hierarchy. A speed-up in answer set computation may be gained if the program can be divided into several modules in which each module is computed separately [Lifschitz and Turner, 1994; Janhunen et al., 2009; FLL, 2009]. Lifschitz and Turner propose to split a logic program into two disjoint parts, bottom and top, such that the bottom part is evaluated independently from the top part, and the results of the bottom part evaluation are used to simplify the top part. They have introduced the concept of a splitting set, i.e., a set of atoms that defines the splitting [Lifschitz and Turner, 1994]. In addition to inspiring incremental ASP solvers [Gebser et al., 2008], splitting sets have been shown to be useful also in investigating answer set semantics [Dao-Tran et al., 2009; Oikarinen and Janhunen, 2008; FLL, 2009].

In this paper we raise and answer two questions regarding splitting sets. The first question is: how do we compute a splitting set? We show that if we are looking for a splitting set having a desirable property that can be tested efficiently, we can find it in polynomial time. Examples of desirable splitting sets are minimum-size splitting sets, splitting sets that include certain atoms, or splitting sets that define a bottom part with a minimum number of rules or a bottom that is easy to compute, for example, a bottom which is an HCF program [Ben-Eliyahu and Dechter, 1994]. Second, we ask if it is possible to relax the definition of splitting sets such that we can split programs that could not be split using the original definition. We answer the second question affirmatively as well, and we present a more general and relaxed definition of a splitting set.

2 Preliminaries

2.1 Disjunctive Logic Programs and Stable Models

A propositional Disjunctive Logic Program (DLP) is a collection of rules of the form

A1 | . . . | Ak ← Ak+1, . . . , Am, not Am+1, . . . , not An,    n ≥ m ≥ k ≥ 0,

where the symbol "not" denotes negation by default, and each Ai is an atom (or variable). For k + 1 ≤ i ≤ m, we say that Ai appears positive in the body of the rule, while for m + 1 ≤ i ≤ n, we say that Ai appears negative in the body of the rule. If k = 0, then the rule is called an integrity rule. If k > 1, then the rule is called a disjunctive rule. The expression to the left of ← is called the head of the rule, while the expression to the right of ← is called the body of the rule.
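For illustration, a rule of this form can be represented directly as a triple of atom sets. The names Rule, is_integrity, and is_disjunctive below are our own hypothetical helpers, not part of any ASP system; they merely encode the k = 0 and k > 1 cases of the definition.

```python
# Hypothetical representation of a DLP rule
#   A1 | ... | Ak <- Ak+1, ..., Am, not Am+1, ..., not An
# as a triple of atom sets: the head {A1..Ak}, the positive body
# {Ak+1..Am}, and the negative body {Am+1..An}.

from collections import namedtuple

Rule = namedtuple("Rule", ["head", "pos", "neg"])

def is_integrity(rule):
    """k = 0: the head is empty."""
    return not rule.head

def is_disjunctive(rule):
    """k > 1: more than one atom in the head."""
    return len(rule.head) > 1

# e | b <- not a  (rule 2 of the running example below)
r2 = Rule(head={"e", "b"}, pos=set(), neg={"a"})
```

This flat representation is convenient because the operations used later in the paper (splitting, reduction) are all expressed as set operations on heads and bodies.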
Given a rule r, head(r) denotes the set of atoms in the head of r, and body(r) denotes the set of atoms in the body of r. From now on, when we refer to a program, we mean a DLP.

Stable models [Gelfond and Lifschitz, 1991] of a program P are defined as follows. Let Lett(P) denote the set of all atoms occurring in P, and let a context be any subset of Lett(P). First, let P be a negation-by-default-free program. Call a context S closed under P iff for each rule A1 | . . . | Ak ← Ak+1, . . . , Am in P, if Ak+1, . . . , Am ∈ S, then Ai ∈ S for some i = 1, . . . , k. A stable model of P is any minimal context S such that S is closed under P. A stable model of a general DLP is defined as follows: the reduct of P w.r.t. a context S is the DLP obtained from P by deleting (i) each rule that has not A in its body for some A ∈ S, and (ii) all subformulae of the form not A from the bodies of the remaining rules. Any context S which is a stable model of the reduct of P w.r.t. S is a stable model of P.

2.2 Programs and graphs

With every program P we associate a directed graph, called the dependency graph of P, in which (a) each atom in Lett(P) is a node, and (b) there is an arc directed from a node A to a node B if there is a rule r in P such that A ∈ body(r) and B ∈ head(r). A super-dependency graph SG is an acyclic graph built from a dependency graph G as follows: for each strongly connected component (SCC) c in G there is a node in SG, and for each arc in G from a node in a strongly connected component c1 to a node in a strongly connected component c2 (where c1 ≠ c2) there is an arc in SG from the node associated with c1 to the node associated with c2. A program P is Head-Cycle-Free (HCF) if there are no two atoms in the head of some rule in P that belong to the same component in the super-dependency graph of P [Ben-Eliyahu and Dechter, 1994].

Let G be a directed graph and SG be a super-dependency graph of G.
A source in G (or SG) is a node with no incoming edges. By abuse of terminology, we shall sometimes use the term "source" or "SCC" for the set of nodes in a certain source or a certain SCC in SG, respectively, and when there is no possibility of confusion we shall use the term rule for the set of all atoms that appear in the rule. Given a node v in G, scc(v) denotes the set of all nodes in the SCC in SG to which v belongs, and tree(v) denotes the set of all nodes that belong to any SCC S such that there is a path in SG from S to scc(v). Similarly, when S is a set of nodes, tree(S) is the union of tree(v) for every v ∈ S. For example, given the super dependency graph in Figure 1, scc(e) = {e, h}, tree(e) = {a, b, e, h}, tree({f, g}) = {a, b, c, d, f, g}, and tree(r), where r = c|f ←− not d, is actually tree({c, d, f }), which is {a, b, c, d, f }. A source in a program will serve as a shorthand for "a source in the super dependency graph of the program." Given a source S of a program P, PS denotes the set of rules in P that use only atoms from S.

Example 2.1 (Running Example) Suppose we are given the following program P:
1. a ←− not b
2. e|b ←− not a
3. f ←− not b
4. g|d ←− c
5. c|f ←− not d
6. h ←− e
7. e ←− a, not h
8. h ←− a
In Figure 1 the dependency graph of P is illustrated in solid lines. The SG is marked with dotted lines. Note that {a, b} is a source in the SG of P, but it is not a splitting set.

2.3 Splitting Sets

The definitions of Splitting Set and the Splitting Set Theorem are adopted from a paper by Lifschitz and Turner [Lifschitz and Turner, 1994]. We restate them here using the notation and the limited form of programs discussed in our work.

Definition 2.2 (Splitting Set) A Splitting Set for a program P is a set of atoms U such that for each rule r in P, if one of the atoms in the head of r is in U, then all the atoms in r are in U. We denote by bU(P) the set of all rules in P having only atoms from U.
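Definition 2.2 can be checked mechanically. The sketch below (function names are ours) represents a rule simply as a pair (head atoms, body atoms), since only the atom sets matter for the splitting condition:

```python
# A rule is a pair (head, body): each is the set of atoms involved,
# regardless of whether a body atom occurs positively or under "not".
def atoms(rule):
    head, body = rule
    return head | body

def is_splitting_set(program, U):
    # U is a splitting set iff every rule whose head meets U
    # has all of its atoms inside U (Definition 2.2).
    return all(atoms(r) <= U for r in program if r[0] & U)

def bottom(program, U):
    # b_U(P): the rules of P built only from atoms of U.
    return [r for r in program if atoms(r) <= U]

# The program of Example 2.1, as (head, body) atom sets.
P = [({"a"}, {"b"}), ({"e", "b"}, {"a"}), ({"f"}, {"b"}),
     ({"g", "d"}, {"c"}), ({"c", "f"}, {"d"}), ({"h"}, {"e"}),
     ({"e"}, {"a", "h"}), ({"h"}, {"a"})]

print(is_splitting_set(P, {"a", "b", "e", "h"}))  # the nontrivial splitting set from the text
print(is_splitting_set(P, {"a", "b"}))            # a source, but not a splitting set
```

On the running example this confirms the claims in the text: {a, b, e, h} passes the test, while the source {a, b} fails it because of rule 2, and bottom(P, {a, b, e, h}) returns the five rules r1, r2, r6, r7, r8.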
The empty set is a splitting set for any program. For an example of a nontrivial splitting set, the set {a, b, e, h} is a splitting set for the program P introduced in Example 2.1. The set b{a,b,e,h}(P) is {r1, r2, r6, r7, r8}. For the Splitting Set Theorem, we need a procedure called Reduce, which resembles many reasoning methods in knowledge representation, such as unit propagation in DPLL and other constraint satisfaction algorithms [Davis et al., 1962; Dechter, 2003]. Reduce(P, X, Y) returns the program obtained from a given program P in which all atoms in X are set to true and all atoms in Y are set to false. Reduce(P, X, Y) is shown in Procedure Reduce.

Procedure Reduce(P, X, Y)
Input: A program P and two sets of atoms: X and Y
Output: An update of P assuming all the atoms in X are true and all atoms in Y are false
1  foreach atom a ∈ X do
2    foreach rule r in P do
3      if a appears negative in the body of r, delete r;
4      if a is in the head of r, delete r;
5      delete each positive appearance of a in the body of r;
6  foreach atom a ∈ Y do
7    foreach rule r in P do
8      if a appears positive in the body of r, delete r;
9      if a is in the head of r, delete a from the head of r;
10     delete each negative appearance of a in the body of r;
11 return P;

For example, Reduce(P, {a, e, h}, {b}), where P is the program from Example 2.1, is the following program (the numbers of the rules are the same as the corresponding rules of the program in Example 2.1):
3. f ←−
4. g|d ←− c
5. c|f ←− not d

Theorem 2.3 (Splitting Set Theorem) (adopted from [Lifschitz and Turner, 1994]) Let P be a program, and let U be a splitting set for P. A set of atoms S is a stable model of P if and only if S = X ∪ Y, where X is a stable model of bU(P), and Y is a stable model of Reduce(P, X, U − X).

As seen in Example 2.1, a source is not necessarily a splitting set. A slightly different definition of a dependency graph is possible.
The nodes are the same as in our definition, but in addition to the edges that we already have, we add a directed arc from a variable A to a variable B whenever A and B are in the head of the same rule. It is clear that a source in this variation of the dependency graph must be a splitting set. The problem is that the size of a dependency graph built using this new definition may be exponential in the size of the heads of the rules, while we are looking for a polynomial-time algorithm for computing a nontrivial splitting set.

2.4 Search Problems

The area of search is one of the most studied and best-known areas in AI (see, for example, [Pearl, 1984]). In this paper we show how the problem of computing a nontrivial minimum-size splitting set can be expressed as a search problem. We first recall basic definitions in the area of search. A search problem is defined by five elements: a set of states, an initial state, actions (or a successor function), a goal test, and a path cost. A solution is a sequence of actions leading from the initial state to a goal state. Figure 2 provides a basic search algorithm [Russell and Norvig, 2010].

Figure 2: Tree Search Algorithm

There are many different strategies to employ when we choose the next leaf node to expand. In this paper we use uniform cost, according to which we expand the leaf node with the lowest path cost.

3 Between Splitting Sets and Dependency Graphs

In this section we show that a splitting set is actually a tree in the SG of the program P.

Figure 1: The [super]dependency graph of the program P.

The first lemma states that if an atom Q is in some splitting set, all the atoms in scc(Q) must be in that splitting set as well.

Lemma 3.1 Let P be a program, let SP be a Splitting Set in P, let Q ∈ SP, and let S = scc(Q). It must be the case that S ⊆ SP.

Proof: Let R ∈ S. We will show that R ∈ SP. Since Q ∈ S, and S is a strongly connected component, it must be that for each Q′ ∈ S there is a path in G (the dependency graph of P) from Q′ to Q such that all the atoms along the path belong to S. The proof goes by induction on i, the number of edges in the shortest path from Q′ to Q.
Case i = 0. Then Q = Q′, and so obviously Q′ ∈ SP.
Induction Step. Suppose that for all atoms Q′ ∈ S such that the shortest path from Q′ to Q is of size i, Q′ belongs to SP. Let R be an atom in S such that the shortest path from R to Q is of size i + 1. Then there must be an atom R′ such that there is an edge in G from R to R′, and the shortest path from R′ to Q is of size i. By the induction hypothesis, R′ ∈ SP. Since there is an edge from R to R′ in G, it must be that there is a rule r in P such that R ∈ body(r) and R′ ∈ head(r). Since R′ ∈ SP and SP is a Splitting Set, it must be the case that R ∈ SP. ✷

Lemma 3.2 Let P be a program, let SP be a Splitting Set in P, and let r be a rule in P. If head(r) ∩ SP ≠ ∅, then tree(r) ⊆ SP.

Proof: The set tree(r) is a union of SCCs. We shall show that for every SCC S such that S ⊆ tree(r), S ⊆ SP. Let S′ be the root of tree(r). The proof is by induction on the distance i from S to S′.
Case i = 0. Then S = S′, and since S′ is the root of tree(r) and head(r) ∩ SP ≠ ∅, by Lemma 3.1, S ⊆ SP.
Induction Step. Suppose that for all SCCs S ∈ tree(r) such that the distance from S to S′ is of size i, S ⊆ SP. Let R be an SCC in tree(r) such that the distance from R to S′ is of size i + 1. Then there must be an SCC R′ such that there is an edge in tree(r) from R to R′, and the distance from R′ to S′ is of size i. By the induction hypothesis, R′ ⊆ SP. Since there is an edge from R to R′ in tree(r), it must be the case that there is a rule r′ in P such that an atom from R, say Q, is in body(r′), and an atom from R′, say Q′, is in head(r′). By the induction hypothesis, Q′ ∈ SP, and since SP is a Splitting Set, it must be that Q ∈ SP. By Lemma 3.1, R ⊆ SP. ✷

Corollary 3.3 Every Splitting Set is a collection of trees.

Note that the converse of Corollary 3.3 does not hold. In our running example, for instance, tree(g) = {c, d, g}, but {c, d, g} is not a splitting set.

4 Computing a minimum-size Splitting Set as a search problem

We shall now confront the problem of computing a splitting set with a desirable property. We shall focus on computing a nontrivial minimum-size splitting set. Given a program P, this is how we view the task of computing a nontrivial minimum-size splitting set as a search problem. We assume that there is an order over the rules in the program.

State Space. The state space is a collection of forests which are subgraphs of the super dependency graph of P.
Initial State. The empty set.
Actions. 1. The initial state can unite with one of the sources in the super dependency graph of P. 2. A state S, other than the initial state, has only one possible action: (a) find the lowest rule r (recall that the rules are ordered) such that head(r) ∩ S ≠ ∅ and Lett(r) ⊈ S; (b) unite S with tree(r).
Transition Model. The result of applying an action to a state S is a state S′ that is a superset of S, as the actions describe.
Goal Test. A state S is a goal state if there is no rule r ∈ P such that head(r) ∩ S ≠ ∅ and Lett(r) ⊈ S. (In other words, a goal state is a state that represents a splitting set.)
Path Cost. The cost of moving from a state S to a state S′ is |S′| − |S|, that is, the number of atoms added to S when it was transformed into S′. So, the path cost is actually the number of atoms in the final state of the path.

Once the problem is formulated as a search problem, we can use any of the search algorithms developed in the AI community to solve it. We do claim here, however, that the computation of a nontrivial minimum-size splitting set can be done in time that is polynomial in the size of the program.
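The formulation above can be sketched in code. The snippet below is our own simplification of the search, not the paper's exact uniform-cost implementation: it assumes that tree(v) is the set of atoms with a directed path to v in the dependency graph (v included), and that every nonempty splitting set must contain the closure of tree(v) for some atom v, so taking the minimum over all atoms yields a minimum-size nontrivial splitting set. All names are invented:

```python
from itertools import chain

# Rules as (head, body) pairs of atom sets; the program of Example 2.1.
P = [({"a"}, {"b"}), ({"e", "b"}, {"a"}), ({"f"}, {"b"}),
     ({"g", "d"}, {"c"}), ({"c", "f"}, {"d"}), ({"h"}, {"e"}),
     ({"e"}, {"a", "h"}), ({"h"}, {"a"})]

def ancestors(P, targets):
    # tree(targets): all atoms that reach an atom of `targets`
    # along the body -> head arcs of the dependency graph.
    preds = {}
    for head, body in P:
        for h in head:
            preds.setdefault(h, set()).update(body)
    closure, stack = set(targets), list(targets)
    while stack:
        for u in preds.get(stack.pop(), set()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return closure

def close(P, S):
    # Repeatedly apply the single available action: pick a rule whose
    # head meets S but whose atoms are not all in S, and add its tree.
    changed = True
    while changed:
        changed = False
        for head, body in P:
            if head & S and not (head | body) <= S:
                S |= ancestors(P, head | body)
                changed = True
    return S

def min_nontrivial_splitting_set(P):
    all_atoms = set(chain.from_iterable(h | b for h, b in P))
    candidates = (close(P, ancestors(P, {v})) for v in all_atoms)
    return min(candidates, key=len)

print(sorted(min_nontrivial_splitting_set(P)))  # ['a', 'b', 'e', 'h']
```

Every atom added by close() is forced by Lemmas 3.1 and 3.2, so each candidate is the smallest splitting set containing the chosen atom; the whole computation is polynomial in the size of the program, in line with Proposition 4.1.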
This search problem can be solved, for example, by a search algorithm called Uniform Cost. Algorithm Uniform Cost [Russell and Norvig, 2010] is a variation of Dijkstra's single-source shortest-path algorithm [Dijkstra, 1959; Felner, 2011]. Algorithm Uniform Cost is optimal, that is, it returns a shortest path to a goal state. Since the search problem is formulated so that the length of the path to a goal state is the size of the splitting set that the goal state represents, Uniform Cost will find a minimum-size splitting set. The time complexity of this algorithm is O(b·m), where b is the branching factor of the search tree generated and m is the depth of the optimal solution. It is easy to see that m cannot be larger than the number of rules in the program, because once we use a rule for computing the next state, this rule cannot be used again in any subsequent state. As for the branching factor b, except for the initial state, each state has at most one child; to generate a child we apply the lowest rule that demonstrates that the current state is not a splitting set. Given a specific state, the time required to compute its child is polynomial in the size of the program. Therefore, this search problem can be solved in polynomial time. This claim is summarized in the following proposition.

Proposition 4.1 A minimum-size nontrivial splitting set can be computed in time polynomial in the size of the program.

The following example demonstrates how the search algorithm works, assuming that we are looking for the smallest non-empty splitting set and that we are using uniform-cost search.

Figure 3: The search tree for P.

Example. Suppose we are given the program P of Example 2.1, and we want to apply the search procedure to compute a nontrivial minimum-size splitting set. The search tree is shown in Figure 3. Our initial state is the empty set. By the definition of the search problem, the successors of the empty set are the sources of the super dependency graph of the program, which in this case are {a, b} and {c, d}, both with action cost 2. Since both current leaves have the same path cost, we choose one of them at random, say {c, d}, and check whether it is a goal state, or in other words, a splitting set. It turns out that {c, d} is not a splitting set, and the lowest rule that proves it is Rule no. 4, which requires a splitting set that includes d to contain also c and g. So, we make the leaf {c, d, g} the child of {c, d} with action cost 1 (only one atom, g, was added to {c, d}). Now we have two leaves in the search tree: the leaf {a, b} with path cost 2, which was there before, and the newly added leaf {c, d, g} with path cost 3. So we check whether {a, b} is a splitting set and find that Rule no. 2 is the lowest rule that proves it is not. We add the tree of Rule no. 2 and get the child {a, b, e, h} with path cost 4. We then check whether {c, d, g} is a splitting set and find that Rule no. 5 is the lowest rule that proves that it is not. We add the tree of Rule no. 5 and get the child {a, b, c, d, f, g} with path cost 6. Back at the leaf {a, b, e, h}, the leaf with the shortest path, we find that it is also a splitting set, and we stop the search. ✷

5 Experiments

We have implemented our algorithm and tested it on randomly generated programs having no negation as failure. A stable model is actually a minimal model for this type of program. For each program we have computed a nontrivial minimum-size splitting set. The average nontrivial minimum size of a splitting set and the median of all nontrivial minimum-size splitting sets, as a function of the rules-to-variables ratio, are shown in Figure 4 and Figure 5, respectively. The average and median were taken over 100 programs generated randomly, starting with a ratio of 2 and generating 100 random programs for each interval of 0.25. It is clear from the graphs that at the phase-transition value of 4.25 (see [Selman et al., 1996]) the size of the splitting set is maximal, and it is equal to the number of variables in the program. This is a new way of explaining why programs at the phase-transition value of the rules-to-variables ratio are hard to solve.

Figure 4: Average size of nonempty splitting sets.

Figure 5: Median size of nonempty splitting sets.

6 Relaxing the splitting set condition

As the experiments indicate, in the hard random problems the only nonempty splitting set is the set of all atoms in the program. In such cases splitting is not useful at all. In this section we introduce the concept of a generalized splitting set (g-splitting set), which is a relaxation of the concept of a splitting set. Every splitting set is a g-splitting set, but there are g-splitting sets that are not splitting sets.

Definition 6.1 (Generalized Splitting Set) A Generalized Splitting Set (g-splitting set) for a program P is a set of atoms U such that for each rule r in P, if one of the atoms in the head of r is in U, then all the atoms in the body of r are in U.

Thus, g-splitting sets that are not splitting sets may be found only when there are disjunctive rules in the program.

Example 6.2 Suppose we are given the following program P:
1. a ←− not b
2. b ←− not a
3. b|c ←− a
4. a|d ←− b
The program has only the two trivial splitting sets: the empty set and {a, b, c, d}. However, the set {a, b} is a g-splitting set of P.

We next demonstrate the usefulness of g-splitting sets. We show that it is possible to compute a stable model of a program P by computing a stable model of PS for a g-splitting set S of P, and then propagating the values assigned to atoms in S to the rest of the program.

Theorem 6.3 (Program Decomposition) Let P be a program. For any g-splitting set S in P, let X be a stable model of PS. Moreover, let P′ = Reduce(P, X, S − X), where Reduce(P, X, S − X) is the result of propagating the assignments of the model X in the program P. Then, for any stable model M′ of P′, M′ ∪ X is a stable model of P.

The proof can be found in the full version of the paper. Consider the program P from Example 6.2, which has two stable models: {a, c} and {b, d}. Let us compute the stable models of P according to Theorem 6.3. We take U = {a, b}, which is a g-splitting set for P. The bottom of P according to U, denoted b{a,b}(P), consists of Rule 1 and Rule 2, that is: {a ←− not b, b ←− not a}. So the bottom has two stable models: {a} and {b}. If we propagate the model {a} to the top of the program, we are left with the rule {c ←− }, and we get the stable model {a, c}. If we propagate the model {b} to the top of the program, we are left with the rule {d ←− }, and we get the stable model {b, d}.

7 Related Work

The idea of splitting is discussed in many publications. Here we discuss papers that deal with generating splitting sets and with relaxing the definition of a splitting set. The work in [Ji et al., 2015] suggests a new way of splitting that introduces a possibly exponential number of new atoms to the program. The authors show that for some typical programs their splitting method is efficient, but clearly it can be quite resource demanding in the worst case. Baumann [Baumann, 2011] discusses splitting sets and graphs, but does not go all the way in introducing a polynomial algorithm for computing classical splitting sets, as we do here. The authors of [Baumann et al., 2012] suggest quasi-splitting, a relaxation of the concept of splitting that requires the introduction of new atoms to the program, and they describe a polynomial algorithm, based on the dependency graph of the program, to efficiently compute a quasi-splitting set. Our algorithm is essentially a search algorithm with fractions of the dependency graph as states in the search space. We do not need the introduction of new atoms to define g-splitting sets.

8 Conclusions

The concept of splitting has a considerable role in logic programming. This paper has two major contributions. First, we show that the task of looking for an appropriate splitting set can be formulated as a classical search problem and computed in time that is polynomial in the size of the program. Search has been studied extensively in AI, and when we formulate a problem as a search problem, we immediately benefit from the library of search algorithms and strategies that have been developed in the past and will be developed in the future. Our second contribution is introducing g-splitting sets, a generalization of the definition of splitting sets as presented by Lifschitz and Turner. This allows a larger set of programs to be split into non-trivial parts.

References

[Baumann et al., 2012] Ringo Baumann, Gerhard Brewka, Wolfgang Dvořák, and Stefan Woltran. Parameterized Splitting: A Simple Modification-Based Approach, pages 57–71. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.
[Baumann, 2011] Ringo Baumann. Splitting an argumentation framework. In James P. Delgrande and Wolfgang Faber, editors, Logic Programming and Nonmonotonic Reasoning, pages 40–53, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg.
[Ben-Eliyahu and Dechter, 1994] Rachel Ben-Eliyahu and Rina Dechter. Propositional semantics for disjunctive logic programs. Annals of Mathematics and Artificial Intelligence, 12:53–87, 1994.
[Dao-Tran et al., 2009] Minh Dao-Tran, Thomas Eiter, Michael Fink, and Thomas Krennwallner. Modular nonmonotonic logic programming revisited. In Patricia M. Hill and David S. Warren, editors, Logic Programming, pages 145–159, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg.
[Davis et al., 1962] Martin Davis, George Logemann, and Donald Loveland.
A machine program for theorem-proving. Communications of the ACM, 5(7):394–397, 1962.
[Dechter, 2003] Rina Dechter. Constraint Processing. Morgan Kaufmann, 2003.
[Dijkstra, 1959] Edsger W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik, 1(1):269–271, 1959.
[Felner, 2011] Ariel Felner. Position paper: Dijkstra's algorithm versus uniform cost search or a case against Dijkstra's algorithm. In Fourth Annual Symposium on Combinatorial Search, 2011.
[FLL, 2009] Symmetric splitting in the general theory of stable models, 2009.
[Gebser et al., 2008] Martin Gebser, Roland Kaminski, Benjamin Kaufmann, Max Ostrowski, Torsten Schaub, and Sven Thiele. Engineering an incremental ASP solver. In Maria Garcia de la Banda and Enrico Pontelli, editors, Logic Programming, pages 190–205, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg.
[Gelfond and Lifschitz, 1991] Michael Gelfond and Vladimir Lifschitz. Classical negation in logic programs and disjunctive databases. New Generation Computing, 9:365–385, 1991.
[Janhunen et al., 2009] Tomi Janhunen, Emilia Oikarinen, Hans Tompits, and Stefan Woltran. Modularity aspects of disjunctive stable models. Journal of Artificial Intelligence Research, 35:813–857, 2009.
[Ji et al., 2015] Jianmin Ji, Hai Wan, Ziwei Huo, and Zhenfeng Yuan. Splitting a logic program revisited. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, AAAI'15, pages 1511–1517. AAAI Press, 2015.
[Lifschitz and Turner, 1994] Vladimir Lifschitz and Hudson Turner. Splitting a logic program. In ICLP, volume 94, pages 23–37, 1994.
[Oikarinen and Janhunen, 2008] Emilia Oikarinen and Tomi Janhunen. Achieving compositionality of the stable model semantics for smodels programs. Theory and Practice of Logic Programming, 8(5-6):717–761, 2008.
[Pearl, 1984] Judea Pearl. Heuristics: Intelligent Search Strategies for Computer Problem Solving. 1984.
[Russell and Norvig, 2010] Stuart J. Russell and Peter Norvig.
Artificial Intelligence - A Modern Approach, Third International Edition. Pearson Education, 2010. [Selman et al., 1996] Bart Selman, David G Mitchell, and Hector J Levesque. Generating hard satisfiability problems. Artificial intelligence, 81(1-2):17–29, 1996. Interpreting Conditionals in Argumentative Environments Jesse Heyninck,1 Gabriele Kern-Isberner,1 , Kenneth Skiba2 , Matthias Thimm2 1 Technical University Dortmund, Dortmund, Germany University of Koblenz-Landau, Koblenz, Germany jesse.heyninck@tu-dortmund.de, gabriele.kern-isberner@cs.tu-dortmund.de, kennethskiba@uni-koblenz.de, thimm@uni-koblenz.de 2 conditionals with the informal meaning “if φ is true then, usually, ψ is true as well” and written as (ψ|φ). In abstract dialectical frameworks, these pairs are interpreted as acceptance conditions, and interpreted as “if φ is accepted then ψ is accepted as well”. The resemblance of these informal interpretations is striking, but both approaches use fundamentally different semantics to formalise these interpretations. In previous works (Kern-Isberner and Thimm 2018; Heyninck, Kern-Isberner, and Thimm 2020) we looked at the question of what happens if we translate an ADF into a conditional logic knowledge base, used conditional logic reasoning mechanisms on the latter, and interpreted the results in argumentative terms. Our results showed, that the intuition behind the semantics of the two worlds is generally different, but there are also cases where their semantics coincide. In this paper, we look at the complementary question from before. We investigate what happens if we translate a conditional logic knowledge base into an ADF, use ADF reasoning mechanisms on the latter, and interpret the results in conditional logic terms. 
Outline of this Paper: After introducing the necessary preliminaries in Section 2 on propositional logic (Section 2.1), conditional logic (Section 2.2) and abstract dialectial frameworks (Section 2.3), we present our argumentative interpretation of conditionals in Section 3. We first present our translation for literal conditional knowledge bases (Section 3.2) and discuss the behaviour of the negation needed in this translation (Section 3.3). Thereafter we show the adequacy of this translation under both two-valued semantics in Section 3.4 and under other semantics in Section 3.5. We then generalize the translation as to allow for what we call extended literal conditional knowledge bases (Section 3.6) and discuss several properties of our translation in Section 3.7. Thereafter, we further motivate the design choices made in our interpretation in Section 4. Finally, we compare our work with related work (Section 5) and conclude in Section 6. Abstract In the field of knowledge representation and reasoning, different paradigms have co-existed for many years. Two central such paradigms are conditional logics and formal argumentation. Despite recent intensified efforts, the gap between these two approaches has not been fully bridged yet. In this paper, we contribute to the bridging of this gap by showing how plausible conditionals can be interpreted in argumentative reasoning enviroments. In more detail, we provide interpretations of conditional knowledge bases in abstract dialectical frameworks, one of the most general approaches to computational models of argumentation. We motivate the design choices made in our translation, show that different semantics give rise to several forms of adequacy, and show several desirable properties of our translation. 1 Introduction Different paradigms of modelling human-like reasoning behaviour have emerged over the years within the field of Knowledge Representation and Reasoning. 
For one, conditional logics (Kraus, Lehmann, and Magidor 1990; Nute 1984) are a classical approach to non-monotonic reasoning that focus on the role of defeasible rules of the form (φ|ψ) with the intuitive interpretation “if ψ is true then, usually, φ is true as well”. There exist several sophisticated reasoning approaches (Goldszmidt and Pearl 1996; Kern-Isberner 2001) that aim at resolving issues pertaining to contradictory rules. On the other hand, the more recent argumentative approaches (Atkinson et al. 2017) focus on the role of arguments, i. e., derivations of claims involving multiple rules, and how to resolve issues between arguments with contradictory claims. In particular, the abstract approach to formal argumentation (Dung 1995) has gained quite some interest in the wider community. One of the most general and expressive formalisms to abstract argumentation are Abstract Dialectical Frameworks (ADFs) (Brewka et al. 2013), which model the acceptability of arguments via general acceptability functions. In this paper we investigate the correspondence between abstract dialectical frameworks and conditional logics. Syntactically, both frameworks focus on pairs of objects such as (φ, ψ). In conditional logic, these pairs are interpreted as 2 Preliminaries In the following, we briefly recall some general preliminaries on propositional logic, as well as technical details on conditional logic and ADFs (Brewka et al. 2013). Copyright c 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. 73 2.1 Propositional Logic probabilities. Such a conditional (ψ|φ) can be accepted as plausible if its verification φ ∧ ψ is more plausible than its falsification φ∧¬ψ, where plausibility is often modelled by a total preorder on possible worlds. This is in full compliance with nonmonotonic inference relations φ |∼ ψ (Makinson 1988) expressing that from φ, ψ may be plausibly/defeasibly derived. 
An obvious implementation of total preorders are ordinal conditional functions (OCFs), (also called ranking functions) κ : Ω → N ∪ {∞} (Spohn 1988). They express degrees of (im)plausibility of possible worlds and propositional formulas φ by setting κ(φ) := min{κ(ω) | ω |= φ}. OCFs κ provide a particularly convenient formal environment for nonmonotonic and conditional reasoning, allowing for simply expressing the acceptance of conditionals and nonmonotonic inferences via stating that (ψ|φ) is accepted by κ iff φ |∼ κ ψ iff κ(φ∧ψ) < κ(φ∧¬ψ), implementing formally the intuition of conditional acceptance based on plausibility mentioned above. For an OCF κ, Bel (κ) denotes the propositional beliefs that are implied by all most plausible worlds, i. e. Bel (κ) = {φ | ∀ω ∈ κ−1 (0) : ω |= φ}. We write κ |= φ if φ ∈ Bel (κ). Specific examples of ranking models are system Z yielding the inference relation |∼Z (Goldszmidt and Pearl 1996) and c-representations (Kern-Isberner 2001). We discuss system Z defined as follows. A conditional (ψ|φ) is tolerated by a finite set of conditionals ∆ if there is a possible world ω with (ψ|φ)(ω) = 1 and (ψ ′ |φ′ )(ω) 6= 0 for all (ψ ′ |φ′ ) ∈ ∆, i. e. ω verifies (ψ|φ) and does not falsify any (other) conditional in ∆. The Z-partitioning (∆0 , . . . , ∆n ) of ∆ is defined as: • ∆0 = {δ ∈ ∆ | ∆ tolerates δ}; • ∆1 , . . . , ∆n is the Z-partitioning of ∆ \ ∆0 . For δ ∈ ∆ we define: Z∆ (δ) = i iff δ ∈ ∆i and (∆0 , . . . , ∆n ) is the Z-partioning of ∆. Finally, the ranking Z function κZ ∆ is defined via: κ∆ (ω) = max{Z(δ) | δ(ω) = 0, δ ∈ ∆} + 1, with max ∅ = −1. We can now define ∆ |∼Z φ iff ⊤ |∼ κZ φ (which can be seen to be equivalent ∆ to φ ∈ Bel (κZ ∆ )). Below the following Lemma about system Z will prove useful: Lemma 1. Let ω ∈ Ω and ∆ be a conditional knowledge −1 base. Then ω 6∈ (κZ (0) iff δ(ω) = 0 for some δ ∈ ∆. 
∆) For a set At of atoms let L(At) be the corresponding propositional language constructed using the usual connectives ∧ (and), ∨ (or), ¬ (negation) and → (material implication). We will sometimes write φ̊ to denote some element of {φ, ¬φ}. The set of literals is denoted by Lit = {φ̊ | φ ∈ At}. A (classical) interpretation (also called possible world) ω for a propositional language L(At) is a function ω : At → {⊤, ⊥}. Let Ω(At) denote the set of all interpretations for At. We simply write Ω if the set of atoms is implicitly given. An interpretation ω satisfies (or is a model of) an atom a ∈ At, denoted by ω |= a, if and only if ω(a) = ⊤. The satisfaction relation |= is extended to formulas as usual. As an abbreviation we sometimes identify an interpretation ω with its complete conjunction, i. e., if a1 , . . . , an ∈ At are those atoms that are assigned ⊤ by ω and an+1 , . . . , am ∈ At are those propositions that are assigned ⊥ by ω we identify ω by a1 . . . an an+1 . . . am (or any permutation of this). For example, the interpretation ω1 on {a, b, c} with ω(a) = ω(c) = ⊤ and ω(b) = ⊥ is abbreviated by abc. For Φ ⊆ L(At) we also define ω |= Φ if and only if ω |= φ for every φ ∈ Φ. Define the set of models Mod(X) = {ω ∈ Ω(At) | ω |= X} for every formula or set of formulas X. A formula or set of formulas X1 entails another formula or set of formulas X2 , denoted by X1 ⊢ X2 , if Mod(X1 ) ⊆ Mod(X2 ). 2.2 Reasoning with Nonmonotonic Conditionals Conditional logics are concerned with conditionals of the form (φ|ψ) whose informal meaning is “if ψ is true then, usually, φ is true as well”. A conditional knowledge base ∆ is a set of such conditionals. It is atomic if for every (φ|ψ) ∈ ∆, φ, ψ ∈ At and it is literal if for every (φ|ψ) ∈ ∆, φ, ψ ∈ Lit. We will not count the constants ⊤ or ⊥ as atoms or literals. If for every (φ|ψ) ∈ ∆, φ, ψ ∈ Lit ∪ {⊤}, we say ∆ is an extended literal conditional knowledge base. There are many different conditional logics (cf., e. 
g., (Kraus, Lehmann, and Magidor 1990; Nute 1984)), and we will just use basic properties of conditionals that are common to many conditional logics and are especially important for nonmonotonic reasoning: Basically, we follow the approach of de Finetti (de Finetti 1974) who considered conditionals as generalized indicator functions for possible worlds resp. propositional interpretations ω: ( 1 : ω |= φ ∧ ψ 0 : ω |= φ ∧ ¬ψ ((ψ|φ))(ω) = (1) u : ω |= ¬φ Proof. This follows immediately in view of the fact that ω ∈ −1 (κZ (0) iff δ(ω) 6= 0 for every δ ∈ ∆. ∆) We now illustrate OCFs in general and System Z in particular with the well-known “Tweety the penguin”-example. Example 1. Let ∆ = {(f |b), (b|p), (¬f |p)}, which expresses that most birds (b) fly (f ), most penguins ((p)) are birds, and most penguins do not fly. This conditional knowledge base has the following Z-partitioning: ∆0 = {(f |b)} and ∆1 = {(b|p), (¬f |p)}. This gives rise to the following κZ ∆ -ordering over the worlds based on the signature {b, f, p}: where u stands for unknown or indeterminate. In other words, a possible world ω verifies a conditional (ψ|φ) iff it satisfies both antecedent and conclusion ((ψ|φ)(ω) = 1); it falsifies, or violates it iff it satisfies the antecedence but not the conclusion ((ψ|φ)(ω) = 0); otherwise the conditional is not applicable, i. e., the interpretation does not satisfy the antecedence ((ψ|φ)(ω) = u). We say that ω satisfies a conditional (ψ|φ) iff it does not falsify it, i. e., iff ω satisfies its material counterpart φ → ψ. Hence, conditionals are threevalued logical entities and thus extend the binary setting of classical logics substantially in a way that is compatible with the probabilistic interpretation of conditionals as conditional ω bpf bpf 74 κZ ∆ 2 2 ω bpf bpf κZ ∆ 1 2 ω bpf bpf κZ ∆ 0 0 ω bpf bpf κZ ∆ 1 0 As an example of a κZ ∆ -belief, observe that ¬p, ¬(b ∧ ). 
¬f) ∈ Bel(κZ∆).

2.3 Abstract Dialectical Frameworks

We briefly recall some technical details on abstract dialectical frameworks (ADFs), following loosely the notation from (Brewka et al. 2013). We can depict an ADF D as a directed graph whose nodes represent statements or arguments which can be accepted or not. With links we represent dependencies between nodes: the acceptance status of a node s depends on the status of the nodes with a direct link to s, its parent nodes parD(s). With an acceptance function Cs we define the cases in which the statement s can be accepted (truth value ⊤), depending on the acceptance status of its parents in D. Formally, an ADF D is a tuple D = (S, L, C) where S is a set of statements, L ⊆ S × S is a set of links, and C = {Cs}s∈S is a set of total functions Cs : 2^parD(s) → {⊤, ⊥} for each s ∈ S with parD(s) = {s′ ∈ S | (s′, s) ∈ L}. By abuse of notation, we will often identify an acceptance function Cs with its equivalent acceptance condition, which models the acceptable cases as a propositional formula.

An ADF D = (S, L, C) is interpreted through 3-valued interpretations v : S → {⊤, ⊥, u}, which assign to each statement in S either the value ⊤ (true, accepted), ⊥ (false, rejected), or u (unknown). A 3-valued interpretation v can be extended to arbitrary propositional formulas over S via strong Kleene semantics:
1. v(¬φ) = ⊥ iff v(φ) = ⊤, v(¬φ) = ⊤ iff v(φ) = ⊥, and v(¬φ) = u iff v(φ) = u;
2. v(φ ∧ ψ) = ⊤ iff v(φ) = v(ψ) = ⊤, v(φ ∧ ψ) = ⊥ iff v(φ) = ⊥ or v(ψ) = ⊥, and v(φ ∧ ψ) = u otherwise;
3. v(φ ∨ ψ) = ⊤ iff v(φ) = ⊤ or v(ψ) = ⊤, v(φ ∨ ψ) = ⊥ iff v(φ) = v(ψ) = ⊥, and v(φ ∨ ψ) = u otherwise.

V consists of all three-valued interpretations, whereas V2 consists of all the two-valued interpretations (i. e. interpretations such that for every s ∈ S, v(s) ∈ {⊤, ⊥}). Then v is a model of D if for all s ∈ S, if v(s) ≠ u then v(s) = v(Cs). We define an order ≤i over {⊤, ⊥, u} by making u the minimal element: u <i ⊤ and u <i ⊥, and this order is lifted pointwise as follows (given two valuations v, w over S): v ≤i w iff v(s) ≤i w(s) for every s ∈ S. So intuitively the classical truth values contain more information than the truth value u. The set of two-valued interpretations extending a valuation v is defined as [v]2 = {w ∈ V2 | v ≤i w}. Given a set of valuations V, ⊓iV(s) = v(s) if for every v′ ∈ V, v(s) = v′(s), and ⊓iV(s) = u otherwise. We furthermore define the operator ΓD(v) : S → {⊤, ⊥, u} where s 7→ ⊓i{w(Cs) | w ∈ [v]2}.

For the definition of the stable model semantics, we need to define the reduct Dv of D given v: Dv = (Sv, Lv, Cv) with:
• Sv = {s ∈ S | v(s) = ⊤},
• Lv = L ∩ (Sv × Sv), and
• Cv = {Cs[{φ | v(φ) = ⊥}/⊥] | s ∈ Sv},
where Cs[φ/ψ] is the formula obtained by substituting every occurrence of φ in Cs by ψ.

Definition 1. Let D = (S, L, C) be an ADF with v : S → {⊤, ⊥, u} an interpretation:
• v is a 2-valued model iff v ∈ V2 and v is a model.
• v is complete for D iff v = ΓD(v).
• v is preferred for D iff v is ≤i-maximally complete for D.
• v is grounded for D iff v is ≤i-minimally complete for D.
• v is stable iff v is a model of D and {s ∈ S | v(s) = ⊤} = {s ∈ S | w(s) = ⊤}, where w is the grounded interpretation of Dv.

We denote by 2mod(D), complete(D), preferred(D) respectively stable(D) the sets of 2-valued models and complete, preferred, respectively stable interpretations of D. The grounded interpretation, which in (Brewka and Woltran 2010) is shown to be unique, will be denoted by vgD. If D is clear from the context we will just write vg. Notice that any complete interpretation is also a model. We finally define consequence relations for ADFs:

Definition 2. Given sem ∈ {2mod, preferred, stable}, an ADF D = (S, L, C) and s ∈ L(S), we define: D |∼∩sem s [¬s] iff v(s) = ⊤ [⊥] for all v ∈ sem(D); D |∼grounded s [¬s] iff vgD(s) = ⊤ [⊥].

We illustrate ADFs by looking at a naive formalization of the penguin example in abstract dialectical argumentation:

Example 2. Let D = ({p, b, f}, L, C) with Cp = p, Cb = p and Cf = ¬p ∨ b. The corresponding graph for D can be found in Figure 1. This ADF has two two-valued models, which are also its preferred models: v1 with v1(p) = v1(b) = ⊥ and v1(f) = ⊤, and v2 with v2(p) = v2(b) = v2(f) = ⊤. The grounded interpretation assigns u to all nodes p, b and f.

Figure 1: Graph representing links between nodes for D in Example 2.

3 Interpreting Conditionals in ADFs

In (Heyninck, Kern-Isberner, and Thimm 2020) we looked at the problem of translating an ADF into a conditional logic knowledge base. We now look at the complementary question, namely translating a conditional logic knowledge base into an ADF. These two translations will help to better understand the connection between argumentation and reasoning from conditional knowledge bases. In this section, we present an interpretation of conditional knowledge bases into abstract dialectical frameworks. In Section 3.1 we introduce the language used for translating knowledge bases and formulate several notions of adequacy used for evaluating our translation. The translation itself is presented in Section 3.2. In Section 3.3 we discuss the use of the newly introduced negation, whereafter we show the adequacy of our translation under two-valued (Section 3.4) and other semantics (Section 3.5). Thereafter, we discuss how to translate normality statements in Section 3.6, and finally we discuss properties of the translation in Section 3.7.

3.1 Translations of Conditionals into ADFs

To obtain an adequate translation, it will prove useful to extend the language with a new atomic negation operator ˜. We denote the set of atoms negated by this new negation by Ãt = {φ̃ | φ ∈ At}, and let Lit˜(At) = At ∪ Ãt. When At is clear from the context, we will sometimes just write Lit˜. It will prove useful to define the following notions:

Definition 3.
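Under the two-valued semantics, Definition 1 amounts to a simple fixpoint check: v ∈ V2 is a 2-valued model iff v(s) = v(Cs) for every s ∈ S. A brute-force sketch for the ADF of Example 2 (helper names are ours):

```python
from itertools import product

# Example 2 ADF: S = {p, b, f} with acceptance conditions
# Cp = p, Cb = p, Cf = (not p) or b.
S = ["p", "b", "f"]
C = {"p": lambda v: v["p"],
     "b": lambda v: v["p"],
     "f": lambda v: (not v["p"]) or v["b"]}

def two_valued_models(S, C):
    """All v in V^2 with v(s) = v(Cs) for every statement s."""
    out = []
    for vals in product([True, False], repeat=len(S)):
        v = dict(zip(S, vals))
        if all(v[s] == C[s](v) for s in S):
            out.append(v)
    return out

ms = two_valued_models(S, C)
# Exactly the two models of Example 2: v1 (p = b = False, f = True)
# and v2 (p = b = f = True).
```

Enumerating all 2^|S| candidate valuations is of course only feasible for small ADFs, but it suffices to check the examples discussed in this paper.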
We define the functions

⌜·⌝ : Lit → Lit˜ with ⌜φ⌝ = φ if φ ∈ At, and ⌜φ⌝ = ψ̃ if φ = ¬ψ for some ψ ∈ At;
⌞·⌟ : Lit˜ → Lit with ⌞φ⌟ = φ if φ ∈ At, and ⌞φ⌟ = ¬ψ if φ = ψ̃ for some ψ ∈ At;
−· : Lit˜ → Lit˜ with −φ = φ̃ if φ ∈ At, and −φ = ψ if φ = ψ̃ for some ψ ∈ At.

Let Cli(At) be the set of all literal conditional knowledge bases over At and D(Lit˜(At)) all the ADFs defined on the basis of S = Lit˜(At) (i. e. D = (Lit˜(At), L, C)). In this paper, we consider translations D : Cli(At) → D(Lit˜(At)), and in particular translations which preserve the meaning of the translated knowledge base ∆. In more detail, we will use two notions of adequacy to evaluate translations. The first notion is respecting ∆ and is based on de Finetti's conception of conditionals as generalized indicator functions for worlds described above. Indeed, given a conditional knowledge base ∆ we can straightforwardly extend (de Finetti 1974)'s notion of conditionals as generalized indicator functions to worlds in Ω(Lit˜(At(∆))). In more detail, for such an ω ∈ Ω(Lit˜(At(∆))), we define:

             ⎧ 1 : ω |= ⌜φ⌝ ∧ ⌜ψ⌝
((ψ|φ))(ω) = ⎨ u : ω |= ¬⌜φ⌝
             ⎩ 0 : ω |= ⌜φ⌝ ∧ −⌜ψ⌝

We will say that an interpretation ω ∈ Ω(Lit˜(At(∆))) respects ∆ if (δ)(ω) ≠ 0 for any δ ∈ ∆. The second notion of adequacy is stronger and requires equivalence on the level of the non-monotonic inference relation. In more detail, we say a translation D is inferentially equivalent w.r.t. an ADF-based inference relation[1] |∼ if for any conditional knowledge base ∆: ∆ |∼Z φ iff D(∆) |∼ φ. Clearly, inferential equivalence w.r.t. |∼sem (for some semantics sem) of a translation D : Cli(At) → D(Lit˜(At)) implies that all the interpretations in sem(D(∆)) respect ∆ for any literal conditional knowledge base ∆.

3.2 Translation D1

The guiding idea behind our first translation is that given a conditional (p|q), what we take into account is the following behaviour: if q is believed then p should be believed. Now one way to translate this into ADFs is to have q as a positive or supporting link for p. Another way to formalize this idea, however, is to require that q can be believed only if so is p, i. e. {Cq} ⊢ p. In other words, the consequent p is a supporting link of the antecedent q. We will here explore the latter idea and show in Section 4.2 that the former idea leads to inadequate translations. We are now ready to define our translation D1 from conditional knowledge bases into ADFs.

Definition 4. Given a literal conditional knowledge base ∆, we define: D1(∆) = (Lit˜(At(∆)), L, C) where Cφ = ¬−φ ∧ ⋀(ψ|⌞φ⌟)∈∆ ⌜ψ⌝ for any φ ∈ {ψ, ψ̃ | ψ ∈ At(∆)}.

Given a literal φ ∈ Lit˜(At(∆)), the intuition behind Cφ is the following. The first part, ¬−φ, ensures that ˜ behaves like a negation by ensuring that the contrary −φ of φ is not believed when φ is believed. The second part of the condition Cφ, ⋀(ψ|⌞φ⌟)∈∆ ⌜ψ⌝, ensures that conditionals are interpreted adequately. In more detail, it ensures that φ is only believed if for every conditional (ψ|⌞φ⌟) which has φ as an antecedent (modulo transformation to the original language Lit), the consequent ⌜ψ⌝ is believed (again, modulo transformation into the extended language Lit˜). Notice that for any φ ∈ At, the conditions can be equivalently written as (where ψ is an atom):
• Cφ = ¬φ̃ ∧ ⋀(ψ̊|φ)∈∆ ⌜ψ̊⌝.
• Cφ̃ = ¬φ ∧ ⋀(ψ̊|¬φ)∈∆ ⌜ψ̊⌝.

We illustrate our translation by first looking at the Tweety example:

Example 3. ∆ = {(f|b), (b|p), (¬f|p)}. The following nodes are part of the ADF: {b, b̃, f, f̃, p, p̃}. We have the following conditions:
• Cb = ¬b̃ ∧ f
• Cp = ¬p̃ ∧ b ∧ f̃
• Cx = ¬−x for x ∈ {f, f̃, b̃, p̃}.

Figure 2: Graph representing the links between nodes of D1(∆) in Example 3.

The corresponding graph can be found in Figure 2. We can read this as follows: b can be believed whenever it is not believed that b̃ (i. e. nothing is both a bird and a not-bird) and it is believed that f (i. e.
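For the Tweety base of Example 3, the translation D1 and its two-valued models can likewise be enumerated mechanically. In this sketch (names ours) the node x̃ is encoded as the string "x~":

```python
from itertools import product

# D1 for Delta = {(f|b), (b|p), (not-f|p)}; "x~" encodes the tilde-negated atom.
S = ["b", "b~", "f", "f~", "p", "p~"]
C = {"b":  lambda v: (not v["b~"]) and v["f"],                # Cb = (not b~) and f
     "p":  lambda v: (not v["p~"]) and v["b"] and v["f~"],    # Cp = (not p~) and b and f~
     "f":  lambda v: not v["f~"],                             # Cx = not(-x) for the rest
     "f~": lambda v: not v["f"],
     "b~": lambda v: not v["b"],
     "p~": lambda v: not v["p"]}

models = []
for vals in product([True, False], repeat=len(S)):
    v = dict(zip(S, vals))
    if all(v[s] == C[s](v) for s in S):
        models.append(v)
# Three two-valued models are found; in every one of them p is rejected
# and p~ accepted, i.e. being a penguin is never believed.
```

This mirrors the intended reading: believing b forces believing f, and believing p would force believing both b and f̃, which is impossible alongside Cb.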
something is a bird only if it flies). Argumentatively, b̃ attacks b and f supports b. Likewise, b and f̃ support p (whereas p̃ attacks p).

[1] An ADF-based inference relation is a relation |∼ ⊆ D(S) × L(S). Examples of such inference relations are those defined in Definition 2.

D1(∆) has the following two-valued models:

i   vi(b)  vi(b̃)  vi(f)  vi(f̃)  vi(p)  vi(p̃)
1   ⊤      ⊥      ⊤      ⊥      ⊥      ⊤
2   ⊥      ⊤      ⊥      ⊤      ⊥      ⊤
3   ⊥      ⊤      ⊤      ⊥      ⊥      ⊤

Notice that these two-valued models correspond to the most plausible worlds according to κZ∆ (see Example 1).

Another benchmark example well-known from the literature is the so-called Nixon diamond, where equally plausible rules lead to mutually inconsistent conclusions.

Example 4 (The Nixon Diamond). Let ∆ = {(p|q), (¬p|r)}. Then D1(∆) = ({p, p̃, q, q̃, r, r̃}, L, C) with:
• Cq = ¬q̃ ∧ p
• Cr = ¬r̃ ∧ p̃
• Cx = ¬−x for x ∈ {p, p̃, q̃, r̃}
2mod(D1(∆)) = {v1, v2, v3, v4} with:

i   vi(q)  vi(q̃)  vi(r)  vi(r̃)  vi(p)  vi(p̃)
1   ⊤      ⊥      ⊥      ⊤      ⊤      ⊥
2   ⊥      ⊤      ⊤      ⊥      ⊥      ⊤
3   ⊥      ⊤      ⊥      ⊤      ⊥      ⊤
4   ⊥      ⊤      ⊥      ⊤      ⊤      ⊥

It can be observed that (κZ∆)−1(0) = {pqr̄, p̄q̄r, p̊q̄r̄}. As in the previous example, 2mod(D1(∆)) corresponds to (κZ∆)−1(0).

In Section 3.4, we will see that the correspondence between 2mod(D1(∆)) and (κZ∆)−1(0) in the above examples is no coincidence.

3.3 Properties of ˜

Before discussing the adequacy of the translation D1, it is important to ask whether ˜ fulfills some well-known properties of negations, such as completeness and consistency. Completeness of ˜ in an interpretation ω means that for every φ ∈ At, at least one of φ and φ̃ is true in ω, whereas consistency in an interpretation ω means that at most one of φ and φ̃ is true in ω (for any φ ∈ At).

Definition 5. Given ω ∈ Ω(At ∪ Ãt), we say ˜ is:
• complete in ω if for all φ ∈ At, ω(φ) = ⊤ or ω(φ̃) = ⊤.
• consistent in ω if for all φ ∈ At, ω(φ) = ⊥ or ω(φ̃) = ⊥.

We can illustrate these definitions with a simple example:

Example 5. Consider the following interpretations of {p, p̃}:

i   vi(p)  vi(p̃)  is vi consistent?  is vi complete?
1   ⊥      ⊥      yes                no
2   ⊥      ⊤      yes                yes
3   ⊤      ⊤      no                 yes
4   u      u      no                 no

We first observe that there exist knowledge bases ∆ for which there are two-valued models ω of D1(∆) s.t. ˜ is not complete in ω, as witnessed by the following example:

Example 6. ∆ = {(¬p|q), (¬p|¬q)}. We have D1(∆) = ({p, q, p̃, q̃}, L, C) with Cq = ¬q̃ ∧ p̃, Cq̃ = ¬q ∧ p̃, Cp = ¬p̃, Cp̃ = ¬p. This ADF has the following two-valued models:

i   vi(p)  vi(p̃)  vi(q)  vi(q̃)
1   ⊥      ⊤      ⊥      ⊤
2   ⊥      ⊤      ⊤      ⊥
3   ⊤      ⊥      ⊥      ⊥

Notice that v3 is a two-valued model since v3(p̃) = ⊥ and thus v3(Cq) = v3(Cq̃) = ⊥. This two-valued model interprets ˜ as an incomplete negation (i. e. there might be ˜-gaps), since both q and q̃ are false in v3.

However, for any literal knowledge base ∆ and any two-valued model ω of D1(∆), ˜ is consistent in ω (i. e. there are no ˜-gluts):

Proposition 1. Let a literal conditional knowledge base ∆, some φ ∈ At(∆), and ω ∈ 2mod(D1(∆)) be given. Then ω(φ) = ⊤ implies ω(φ̃) = ⊥, and ω(φ̃) = ⊤ implies ω(φ) = ⊥.

Proof. Suppose ∆ is a literal conditional knowledge base, φ ∈ At(∆) and ω ∈ 2mod(D1(∆)). Suppose now ω(φ) = ⊤. Since ω ∈ 2mod(D1(∆)), ω(φ) = ω(Cφ). Since Cφ = ¬φ̃ ∧ ⋀(ψ|φ)∈∆ ⌜ψ⌝, ω(Cφ) = ⊤ implies ω(¬φ̃) = ⊤, i. e. ω(φ̃) = ⊥. The case for ω(φ̃) = ⊤ is analogous.

3.4 Adequacy of Translation D1

We first show that two-valued models of D1(∆) respect ∆:

Proposition 2. Let a literal conditional knowledge base ∆, ω ∈ 2mod(D1(∆)) and (φ|ψ) ∈ ∆ be given. Then ω(⌜ψ⌝) = ⊤ implies ω(⌜φ⌝) = ⊤.

Proof. Suppose that ω ∈ 2mod(D1(∆)) and let (φ|ψ) ∈ ∆. Suppose that ω(⌜ψ⌝) = ⊤. We assume first that ψ, φ ∈ At. Since Cψ = ¬ψ̃ ∧ ⋀(φ′|ψ)∈∆ ⌜φ′⌝ and (φ|ψ) ∈ ∆,

Cψ = ¬ψ̃ ∧ φ ∧ ⋀(φ′|ψ)∈∆\{(φ|ψ)} ⌜φ′⌝

and thus Cψ ⊢ φ. Since ω ∈ 2mod(D1(∆)), ω(ψ) = ω(Cψ) = ⊤. Since Cψ ⊢ φ, this means ω(φ) = ⊤. Since φ ∈ At, this implies ω(⌜φ⌝) = ⊤. The other cases are analogous.

Corollary 1. Let a literal conditional knowledge base ∆ be given. Then any ω ∈ 2mod(D1(∆)) respects ∆.

Proof.
By Proposition 2, for any ω ∈ 2mod(D1(∆)) and any (φ|ψ) ∈ ∆, ω(⌜ψ⌝) = ⊥ or ω(⌜ψ⌝ ∧ ⌜φ⌝) = ⊤, which implies that ω((φ|ψ)) ≠ 0.

We can now easily show that every two-valued model of D1(∆) corresponds to a maximally plausible world ω. We first have to define a function that allows us to associate two-valued models in the language using ˜ with the worlds Ω(At) (and vice versa).

Definition 6. Where ω ∈ Ω(Lit˜(At)) and ˜ is complete in ω, we define ω↓ ∈ Ω(At) as the world such that for every φ ∈ At: ω↓(φ) = ⊤ if ω(φ) = ⊤, and ω↓(φ) = ⊥ if ω(φ̃) = ⊤. Let ω ∈ Ω(At). Then we define ω↑ ∈ Ω(Lit˜) as the world such that for every φ ∈ At: ω↑(φ) = ⊤ and ω↑(φ̃) = ⊥ iff ω(φ) = ⊤; ω↑(φ) = ⊥ and ω↑(φ̃) = ⊤ iff ω(φ) = ⊥.

Fact 1. For any ω ∈ Ω, ˜ is complete in ω↑.

Fact 2. Let some ω ∈ Ω(Lit˜) s.t. ˜ is complete in ω and some φ ∈ At be given. Then ω |= φ̃ iff ω |= ¬φ.

Proof. Suppose first ω |= φ̃. By Proposition 1, ω ⊭ φ and thus ω |= ¬φ. Suppose now that ω |= ¬φ. Since ˜ is complete in ω, by Definition 5, ω |= φ̃.

Lemma 3. Let some ω ∈ Ω(Lit˜) s.t. ˜ is complete in ω and some φ ∈ L(At) be given. Then ω↓ |= φ iff ω |= φ.

Proof. We show this by showing the claim for any φ ∈ L(At) in disjunctive normal form, i. e. φ = ⋁i=1..n ⋀j=1..m φ̊ji. Suppose ω↓ |= φ, i. e. there is some 1 ≤ i ≤ n s.t. ω↓ |= ⋀j=1..m φ̊ji. By Fact 2 and Definition 6, this implies ω |= ⋀j=1..m φ̊ji and thus ω |= ⋁i=1..n ⋀j=1..m φ̊ji. The other direction is analogous.

Lemma 2. Let a literal conditional knowledge base ∆ and some ω ∈ Ω be given. Then if κZ∆(ω) = 0 then ω↑ ∈ 2mod(D1(∆)).

Proof. Let a literal conditional knowledge base ∆ and some ω ∈ Ω be given. Consider some φ ∈ Lit˜. We show that ω↑ |= φ iff ω↑ |= Cφ, which implies that ω↑ is a two-valued model of D1(∆). For this, suppose first that ω↑ |= φ and suppose towards a contradiction that ω↑ |= ¬Cφ, i. e. ω↑ |= −φ ∨ ¬⋀(ψ|⌞φ⌟)∈∆ ⌜ψ⌝. With Proposition 1 and since ω↑ |= φ, ω↑ ⊭ −φ, which implies ω↑ |= ¬⋀(ψ|⌞φ⌟)∈∆ ⌜ψ⌝, i. e. there is some (ψ|⌞φ⌟) ∈ ∆ s.t. ω↑ |= ¬⌜ψ⌝. By definition of ω↑, this implies that ω |= ¬ψ. But then ω |= ⌞φ⌟ ∧ ¬ψ for some (ψ|⌞φ⌟) ∈ ∆, a contradiction to κZ∆(ω) = 0. Suppose now (again towards a contradiction) that ω↑ |= Cφ and ω↑ ⊭ φ. By Fact 1, ω↑ ⊭ φ implies ω↑ |= −φ. Since Cφ = ¬−φ ∧ ⋀(ψ|⌞φ⌟)∈∆ ⌜ψ⌝, this contradicts ω↑ |= Cφ.

Given some ADF D, we define: D |∼∩,c2mod φ iff ω(φ) = ⊤ for every ω ∈ 2mod(D) for which ˜ is complete in ω.

We can now show the correspondence between ˜-complete two-valued models and maximally plausible worlds.

Proposition 3. Let a literal conditional knowledge base ∆ and an ω ∈ 2mod(D1(∆)) for which ˜ is complete in ω be given. Then κZ∆(ω↓) = 0.

Proof. Suppose ∆ is a literal conditional knowledge base, ω ∈ 2mod(D1(∆)) and ˜ is complete in ω. Indeed, let (φ|ψ) ∈ ∆ and suppose ω↓ |= ψ. By Definition 6, this implies ω |= ⌜ψ⌝. With Proposition 2, this implies that ω |= ⌜φ⌝. Again with Definition 6, this implies ω↓ |= φ. Thus, we have established that if ω ∈ 2mod(D1(∆)) and ˜ is complete in ω then ω↓ ⊭ ψ ∧ ¬φ for any (φ|ψ) ∈ ∆, i. e. ((φ|ψ))(ω↓) ≠ 0 (for any (φ|ψ) ∈ ∆). With Lemma 1 this means κZ∆(ω↓) = 0.

Theorem 1. Given a literal conditional knowledge base ∆, ∆ |∼Z φ iff D1(∆) |∼∩,c2mod φ.

Proof. Suppose first that ∆ |∼Z φ, i. e. for every ω ∈ Ω s.t. κZ∆(ω) = 0, ω |= φ. Take now some ω ∈ 2mod(D1(∆)) s.t. ˜ is complete in ω. With Proposition 3, κZ∆(ω↓) = 0 and thus ω↓ |= φ. With Definition 6, also ω |= φ. Thus, we have shown that for any ω ∈ 2mod(D1(∆)) s.t. ˜ is complete in ω, ω |= φ, which implies D1(∆) |∼∩,c2mod φ.

Suppose now that D1(∆) |∼∩,c2mod φ, i. e. for every ω ∈ 2mod(D1(∆)) s.t. ˜ is complete in ω, ω |= φ. Take now some ω ∈ Ω(At) s.t. κZ∆(ω) = 0. With Lemma 2, ω↑ ∈ 2mod(D1(∆)) and with Fact 1, ˜ is complete in ω↑. Thus, ω↑ |= φ. With Lemma 3, this implies that ω |= φ. Thus we have shown that for every ω ∈ Ω(At), κZ∆(ω) = 0 implies ω |= φ, which implies that ∆ |∼Z φ.

3.5 Other Semantics

In this section we show that other semantics also respect ∆. We first investigate the two-valued stable semantics and then move to the three-valued complete, preferred and grounded semantics.

Stable Semantics  We first notice that not every two-valued model of D1(∆) is stable:

Example 7. Let ∆ = {(p|q), (q|p)}. Then D1(∆) = ({p, q, p̃, q̃}, L, C) with Cp = ¬p̃ ∧ q, Cq = ¬q̃ ∧ p and Cx̃ = ¬x for any x ∈ {p, q}. Notice that ω with ω(p) = ω(q) = ⊤ and ω(p̃) = ω(q̃) = ⊥ is a two-valued model of D1(∆). It is, however, not stable. To see this, notice that (D1(∆))ω = ({p, q}, Lω, Cω) with Cωp = ⊤ ∧ q and Cωq = ⊤ ∧ p. The grounded extension v of (D1(∆))ω assigns v(p) = v(q) = u.

Furthermore, stable models might be incomplete w.r.t. ˜, just like the two-valued models:

Example 8. Recall the conditional knowledge base from Example 6. There, v3 ∈ 2mod(D1(∆)) with v3(p) = ⊤ and v3(p̃) = v3(q) = v3(q̃) = ⊥. We have (D1(∆))v3 = ({p}, Lv3, Cv3) with Cp = ¬⊥. Since the grounded extension v of (D1(∆))v3 assigns v(p) = ⊤, we see that v3 is stable. As was argued in Example 6, ˜ is incomplete in v3.

However, we can make some immediate observations about the stable models of D1(∆). We first recall the following result:

Theorem 2 ((Brewka et al. 2017, Theorem 3.1)). For any ADF D, stable(D) ⊆ 2mod(D).

It follows from Theorem 2 and Proposition 2 that every stable model of D1(∆) for which ˜ is complete respects ∆:

Proposition 4. Let a literal conditional knowledge base ∆ and some (φ|ψ) be given. Then for any ω ∈ stable(D1(∆)), if ω |= ⌜ψ⌝ then ω |= ⌜φ⌝.

We can furthermore show that any stable model of D1(∆) is maximally plausible according to κZ∆ (modulo the ↓-transformation):

Proposition 5.
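The reduct-based stable check of Example 7 can be sketched as follows: compute the grounded interpretation of the reduct as the least fixpoint of ΓD, starting from the all-u valuation (here Python's None plays the role of u; all names are ours):

```python
from itertools import product

# Example 7: the two-valued model omega with omega(p) = omega(q) = True and
# omega(p~) = omega(q~) = False yields the reduct over S_omega = {p, q};
# the nodes that are False in omega are substituted by bottom, so
# Cp = (not bottom) and q == q, and Cq = (not bottom) and p == p.
S_red = ["p", "q"]
C_red = {"p": lambda v: v["q"],
         "q": lambda v: v["p"]}

def gamma(v):
    """Gamma_D(v)(s): consensus of w(Cs) over all two-valued extensions w of v."""
    out = {}
    for s in S_red:
        free = [x for x in S_red if v[x] is None]   # statements still unknown
        vals = set()
        for bits in product([True, False], repeat=len(free)):
            w = dict(v)
            w.update(zip(free, bits))
            vals.add(C_red[s](w))
        out[s] = vals.pop() if len(vals) == 1 else None   # no consensus -> u
    return out

# Grounded interpretation: iterate Gamma from the all-u valuation to a fixpoint.
v = {s: None for s in S_red}
while gamma(v) != v:
    v = gamma(v)
# p and q remain unknown in the grounded interpretation of the reduct, so the
# two-valued model above is not stable -- exactly the situation of Example 7.
```

The mutual support between p and q is what keeps both statements at u: neither can be derived without already assuming the other.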
Let a literal conditional knowledge base ∆ and an ω ∈ stable(D1(∆)) for which ˜ is complete be given. Then κZ∆(ω↓) = 0.

Proof. Follows from Theorem 2 and Proposition 3.

Three-Valued Semantics  For all of the well-known three-valued semantics, we can show (just like for the two-valued and stable models) that any corresponding interpretation of the translation D1(∆) respects ∆ (thus generalizing Proposition 2):

Proposition 6. Let a literal conditional knowledge base ∆ and a model v ∈ V of D1(∆) be given. Then for any (φ|ψ) ∈ ∆, if v(⌜ψ⌝) = ⊤ then v(⌜φ⌝) = ⊤.

Proof. Suppose that v ∈ V is a model and let (φ|ψ) ∈ ∆. Suppose that v(⌜ψ⌝) = ⊤. Since v is a model, v(⌜ψ⌝) = ⊤ implies v(C⌜ψ⌝) = ⊤. Since (φ|ψ) ∈ ∆, C⌜ψ⌝ = ¬−⌜ψ⌝ ∧ ⌜φ⌝ ∧ ⋀(φ′|ψ)∈∆\{(φ|ψ)} ⌜φ′⌝, and thus v(C⌜ψ⌝) = ⊤ implies v(⌜φ⌝) = ⊤.

Corollary 2. Let a literal conditional knowledge base ∆ and some (φ|ψ) ∈ ∆ be given. Then:
1. For any sem ∈ {complete, preferred} and v ∈ sem(D1(∆)), v respects ∆.
2. vgD1(∆) respects ∆.[2]

[2] Recall that vgD1(∆) denotes the grounded interpretation of D1(∆).

3.6 Extended Literal Conditional Knowledge Bases

Since in our translation D1 a conditional (φ|ψ) results in a support link from φ to ψ, it is not immediately clear how to translate a normality statement of the form (φ|⊤), among others since ⊤ will not correspond to a node in the ADF. We circumvent this problem by modelling normality statements (φ|⊤) by requiring that −⌜φ⌝ is not believed, i. e. by setting C−⌜φ⌝ = ⊥. This results in the following translation for extended literal conditional knowledge bases:

Definition 7. Given an extended literal conditional knowledge base ∆, we define: D1elcb(∆) = (Lit˜(At(∆)), L, C) where, for any φ ∈ Lit˜(At(∆)): Cφ = ⊥ if there is some (⌞−φ⌟|⊤) ∈ ∆, and Cφ = ¬−φ ∧ ⋀(ψ|⌞φ⌟)∈∆ ⌜ψ⌝ otherwise.

We notice that the first case can be expanded into the following form (where φ ∈ At):
• Cφ = ⊥ if there is some (¬φ|⊤) ∈ ∆.
• Cφ̃ = ⊥ if there is some (φ|⊤) ∈ ∆.

We illustrate D1elcb(∆) with an example:

Example 9. Let ∆ = {(p|⊤), (q|p)}. Then D1elcb(∆) = ({p, p̃, q, q̃}, L, C) with Cp = ¬p̃ ∧ q, Cp̃ = ⊥ and Cx = ¬−x for any x ∈ {q, q̃}. We have two two-valued models, v1 and v2, with: v1(p) = v1(q) = ⊤, v1(p̃) = v1(q̃) = ⊥; v2(q̃) = ⊤ and v2(p) = v2(q) = v2(p̃) = ⊥. Even though this translation gives rise to an incomplete interpretation, v2, there is no two-valued interpretation of D1elcb(∆) that falsifies any rule in ∆. This is no coincidence, as we show below.

Figure 3: Graph representing the links between nodes of D1elcb(∆) in Example 9.

We now show the adequacy of D1elcb for extended literal knowledge bases:

Proposition 7. Let an extended literal conditional knowledge base ∆ and an ω ∈ 2mod(D1elcb(∆)) for which ˜ is complete in ω be given. Then κZ∆(ω↓) = 0.

Proof. Suppose ∆ is an extended literal conditional knowledge base and ˜ is complete in ω. We show that ω↓ ⊭ ψ ∧ ¬φ for any (φ|ψ) ∈ ∆, which with Lemma 1 implies the proposition. We show the claim for ψ = ⊤, since the case where ψ ≠ ⊤ is identical to the proof of Proposition 3. Thus consider (φ|⊤) ∈ ∆. Since this means, with Definition 7, that C−⌜φ⌝ = ⊥, and ˜ is complete in ω, ω |= ⌜φ⌝. With Definition 6, this means ω↓ |= φ.

Proposition 8. Given an extended literal conditional knowledge base ∆ and an ω ∈ Ω(At), if κZ∆(ω) = 0 then ω↑ ∈ 2mod(D1elcb(∆)).

Proof sketch. Suppose that φ ∈ {ψ, ψ̃ | ψ ∈ At} and there is some (⌞−φ⌟|⊤) ∈ ∆ (and thus Cφ = ⊥) and ω↑ |= φ. Since κZ∆(ω) = 0, (⌞−φ⌟|⊤) ∈ ∆ implies that ω |= ⌞−φ⌟, which with Definition 6 implies ω↑ |= −φ, contradicting ω↑ |= φ and Proposition 1. Thus, for any φ ∈ {ψ, ψ̃ | ψ ∈ At} for which there is some (⌞−φ⌟|⊤) ∈ ∆: ω↑ |= φ iff ω↑ |= Cφ. The other case is identical to the proof of Lemma 2.

The proof of the following theorem, stating the inferential equivalence of D1elcb w.r.t. |∼∩,c2mod, is completely analogous to the proof of Theorem 1:

Theorem 3. Given an extended literal conditional knowledge base ∆, ∆ |∼Z φ iff D1elcb(∆) |∼∩,c2mod φ.

The reader might wonder why we did not simply set Cφ = ⊤ for any (φ|⊤) ∈ ∆. This would result in an inadequate translation, since any information about conditionals with φ as an antecedent would be removed from the ADF, as illustrated by the following example.

Example 10 (Example 9 continued). We consider ∆ = {(p|⊤), (q|p)} (as in Example 9). If we translated this knowledge base using D1 and by in addition setting Cp = ⊤ as above, we would get D′(∆) = ({p, p̃, q, q̃}, L, C) with Cp = ⊤ and Cx = ¬−x for x ∈ {p̃, q, q̃}. In that case, there are two two-valued models, v3 and v4, with: v3(p) = v3(q) = ⊤, v3(p̃) = v3(q̃) = ⊥; v4(p) = v4(q̃) = ⊤ and v4(p̃) = v4(q) = ⊥. Hence there is a (complete) two-valued model, namely v4, that validates p but not q, even though (q|p) ∈ ∆ (in fact, (q|p) is even in ∆0).

3.7 Properties of the Translation

(Gottlob 1994) proposed several desirable properties for translations between non-monotonic formalisms, such as adequacy, polynomiality and modularity. In Section 3.4 we already discussed adequacy in depth, and we have shown that our translation is adequate on the level of beliefs for all semantics and for any extended literal knowledge base. A translation satisfies polynomiality if the translation is calculable with reasonable bounds. It is easy to see that our translation is polynomial in the length of the translated conditional knowledge base. For modularity we follow the formulation of (Strass 2013) for a translation from ADFs to a target formalism, even though modularity was originally defined for translations between circumscription and default logic (Imielinski 1987). In other words, modular means that “local” changes in the translated conditional knowledge base result in “local” changes in the translation. A minimal notion of modularity would be that if we have two syntactically disjoint conditional knowledge bases ∆1 and ∆2, then changes in ∆1 will result only in changes to Cs for some s ∈ Lit˜(At(∆1)). Clearly the translation presented in this paper is modular. The biggest downside of this translation is the fact that it is not language-preserving, since we use a language extension in this translation to construct the ADFs. Finally, it is clear that this translation is syntax-based, in the sense that the translation D1(∆) can be derived purely on the basis of the logical form of the knowledge base ∆.

4 Design Choices

In this section we motivate some important design choices underlying our translation D1, especially the extension of the language to include the negation ˜, the direction of supporting links resulting from conditionals (φ|ψ) in the translated conditional knowledge base, and the restriction to literal conditional knowledge bases.

4.1 The Necessity of ˜

The critical reader might wonder, given that ADFs allow for the negation ¬ to be used in formulating acceptance conditions for nodes, whether a second negation ˜ is really needed. Indeed, a first proposal for a translation avoiding ˜ would be the following:

Definition 8. Given a literal conditional knowledge base ∆, we let D2(∆) = (At(∆), L, C) where: Cφ = ⋀(ψ|φ)∈∆ ψ if there is some (ψ|φ) ∈ ∆, and Cφ = φ otherwise.

Such a translation would be inadequate, since conditionals with negative antecedents are not taken into account. Thus, for example, p̄q̄ ∈ 2mod(D2({(p|¬q)})), since (p|¬q) is not taken into account in Cq. We could propose making the following adjustment to avoid this:

Definition 9. Given a literal conditional knowledge base ∆, we let D3(∆) = (At(∆), L, C) where: Cφ = ⋀(ψ|φ)∈∆ ψ ∧ ⋀(ψ|¬φ)∈∆ ¬ψ if there is some (ψ|φ) ∈ ∆ or some (ψ|¬φ) ∈ ∆, and Cφ = φ otherwise.

However, since 2mod(D3({(q|p), (q|¬p)})) = {q̊p̄}, this also results in an inadequate translation, since ((q|¬p))(q̄p̄) = 0 and thus κZ∆(q̄p̄) = 1. A third option would be to take:

Definition 10. Given a literal conditional knowledge base ∆, we let D4(∆) = (At(∆), L, C) where: Cφ = ⋁(ψ|φ)∈∆ ψ ∨ ⋀(ψ|¬φ)∈∆ ¬ψ if there is some (ψ|φ) ∈ ∆ or some (ψ|¬φ) ∈ ∆, and Cφ = φ otherwise.

Notice that 2mod(D4({(q|p), (s|¬p)})) contains pq̄s̄. Since ((q|p))(pq̄s̄) = 0, this means D4 is not an adequate translation. There are, of course, some other variations possible, which do, however, lead to similar inadequacies. We hope to have convinced the reader of the fact that any translation which is based purely on the syntax of conditional knowledge bases does require a second negation.[3]

[3] Since ADFs under two-valued model semantics are equi-expressive with propositional logic (Strass 2014), it is not hard to come up with a translation that is adequate. For example, it is straightforward to show the adequacy (under two-valued semantics) of the following translation. Let D⋆(∆) = (At(∆), L, C) with Cφ = ⋁{ω | κZ∆(ω)=0 and ω|=φ} ω ∧ ⋀{ω | κZ∆(ω)>0 and ω|=φ} ¬ω ∨ ⋁{ω | κZ∆(ω)>0 and ω|=¬φ} ω for any φ ∈ At(∆). But such a translation is dependent on the semantics of System Z and therefore is not syntax-based.

4.2 Antecedents as Partial Sufficient Conditions

One guiding idea behind our translation D1 is that, relative to a conditional knowledge base ∆, a node φ ∈ Lit˜ can be believed only if for every conditional (ψ|⌞φ⌟) ∈ ∆, ⌜ψ⌝ is believed. In other words, the links go from the consequent ⌜ψ⌝ to the antecedent φ. One might wonder if adequacy is preserved when we let the links between nodes run from antecedent to consequent. Such an alternative translation could be the following:

Definition 11. Given a literal conditional knowledge base ∆, we define: D5(∆) = ({φ, φ̃ | φ ∈ At(∆)}, L, C) where Cφ = ¬−φ ∧ ⋁(⌞φ⌟|ψ)∈∆ ⌜ψ⌝ for any φ ∈ Lit˜.

This translation is not adequate, however:

Example 11. Let ∆ = {(p|q), (¬p|s)}. Then D5(∆) = ({p, p̃, q, q̃, s, s̃}, L, C) with: Cp = ¬p̃ ∧ q, Cp̃ = ¬p ∧ s, Cx = ¬−x for any x ∈ {q, q̃, s, s̃}. The corresponding graph is depicted in Figure 4. Consider v with v(q) = v(s) = v(p̃) = ⊤ and v(q̃) = v(s̃) = v(p) = ⊥. Then v is a two-valued model of D5(∆) (indeed, observe that v(Cp) = v(¬p̃ ∧ q) = ⊥ since v(p̃) = ⊤). However, notice that κZ∆(p̄qs) = 1, since ((p|q))(p̄qs) = 0. Thus, two-valued models of D5(∆) might not correspond to maximally plausible worlds (even if the negation ˜ is complete in such a model).

Figure 4: Graph representing the links between nodes of D5(∆) in Example 11.

4.3 Literal Conditionals

The final design choice we motivate is the restriction of attention to (possibly extended) literal conditional knowledge bases as the object of translation. The reason is that we chose to represent conditionals (φ|ψ) as links between the nodes φ and ψ (modulo transformation to the extended language). Moving to conditionals with arbitrary propositional formulas as antecedents and consequents would make it impossible to retain such a representation, since in abstract dialectical argumentation, nodes are essentially atomic.

5 Related Work

Our aim in this paper is to lay the foundations of integrative techniques for argumentative and conditional reasoning. There are previous works which have similar aims or are otherwise related to this endeavour. We discuss those in the following.

First, there is a huge body of work on structured argumentation (see e. g. (Besnard et al. 2014)). In these approaches, arguments are constructed on the basis of a knowledge base possibly consisting of conditionals. An attack relation between these arguments is constructed based on some syntactic criteria. Acceptable arguments are then identified by applying argumentation semantics to the resulting argumentation frameworks. Even though these formalisms also allow for argumentation-based inferences from a set of conditionals, these approaches will often give rise to inferences rather different from conditional logics. For example, in ASPIC+ (Modgil and Prakken 2018), the knowledge base consisting solely of the defeasible rule p ⇒ q will warrant no inference (in fact the set of arguments based on this knowledge base will be empty), whereas, for example, D1({(q|p)}) |∼∩,c2mod ¬(p ∧ ¬q). This difference is caused by the fact that in structured argumentation, arguments are typically constructed in a proof-like manner. This means that defeasible rules can only be applied when there is positive evidence for the antecedent. Conditional logics, and our translation by extension, on the other hand, generate models that do not falsify any plausible conditional.

There have been some attempts to bridge the gap between specific structured argumentation formalisms and conditional reasoning. For example, in (Kern-Isberner and Simari 2011) conditional reasoning based on System Z (Goldszmidt and Pearl 1996) and DeLP (García and Simari 2004) are combined in a novel way. Roughly, the paper provides a novel semantics for DeLP by borrowing concepts from System Z that allow using plausibility as a criterion for comparing the strength of arguments and counterarguments. Our approach differs both in goal (we investigate the correspondence between argumentation and conditional logics instead of integrating insights from the latter into the former) and generality (DeLP is a specific and arguably rather peculiar argumentation formalism, whereas ADFs are among the most general argumentation formalisms around).

Several works investigate postulates for nonmonotonic reasoning known from conditional logics (Kraus, Lehmann, and Magidor 1990) for specific structured argumentation formalisms, such as assumption-based argumentation (Čyras and Toni 2015; Heyninck and Straßer 2018) and ASPIC+ (Li, Oren, and Parsons 2017). These works revealed gaps between nonmonotonic reasoning and argumentation which we try to bridge in this paper.

Besnard et al. (Besnard, Grégoire, and Raddaoui 2013) develop a structured argumentation approach where general conditional logic is used as the base knowledge representation formalism. Their framework is constructed in a similar fashion to the deductive argumentation approach (Besnard and Hunter 2008), but they also provide, with conditional contrariety, a new conflict relation for arguments, based on conditional logical terms. Even though insights from conditional logics are used in that paper, this approach stays well within the paradigm of structured argumentation.

In (Strass 2015) Strass presents a translation from an ASPIC-style defeasible logic theory to ADFs. While Strass actually embeds one argumentative formalism (the ASPIC-style theory) into another argumentative formalism (ADFs) and shows how the latter can simulate the former, the process of embedding is similar to our approach. However, inferentially the formalism of (Strass 2015) is more akin to ASPIC+, in the sense that literals cannot be accepted unless there is some rule deriving them. Arguably, this formalism is more akin to D5 (see Definition 11), as in the ADFs generated by (Strass 2015), rules result in support of the consequents of rules.

6 Outlook and Conclusion

In this paper we have presented and investigated a translation from conditional knowledge bases into abstract dialectical argumentation, based on the syntactic similarities between the two frameworks. We provide an interpretation of plausible conditionals in abstract dialectical argumentation. We have shown that this interpretation is adequate under all of the well-known semantics for ADFs, and have shown that the translation is polynomial and modular. Interestingly, the translation requires an extension of the language, which, as we have argued in Section 4, cannot be avoided. Another limitation of our interpretation is that adequacy is only shown with respect to the level of beliefs Bel(κZ∆) (or, equivalently, the level of the most plausible worlds (κZ∆)−1(0)). In future work, we plan to investigate methods to obtain conditional inferences from ADFs and compare them with System Z. One proposal to do this is founded upon the Ramsey test (Ramsey 2007), which says that a conditional (φ|ψ) is accepted if belief in ψ leads to belief in φ. Several ways of modelling the hypothetical belief in ψ are to be considered, such as revision by ψ (using e. g. revision of ADFs as proposed by (Linsbichler and Woltran 2016)), observations of φ (Booth et al. 2012) or interventions with φ (Rienstra 2014). Furthermore, we plan to tackle the combination of the translation presented in this paper and the one from ADFs into conditional logics analyzed in previous works (Kern-Isberner and Thimm 2018; Heyninck, Kern-Isberner, and Thimm 2020). We want to answer the question of what happens if we apply these translations one after the other. Finally, we plan to generalize the results of this paper to other conditional logics besides System Z, which we have chosen because of the many desirable properties it satisfies.
Acknowledgements

The research reported here was supported by the Deutsche Forschungsgemeinschaft under grant KE 1413/11-1.

References

Atkinson, K.; Baroni, P.; Giacomin, M.; Hunter, A.; Prakken, H.; Reed, C.; Simari, G. R.; Thimm, M.; and Villata, S. 2017. Toward artificial argumentation. AI Magazine 38(3):25–36.
Besnard, P., and Hunter, A. 2008. Elements of argumentation, volume 47. MIT Press, Cambridge.
Besnard, P.; Garcia, A.; Hunter, A.; Modgil, S.; Prakken, H.; Simari, G.; and Toni, F. 2014. Introduction to structured argumentation. Argument & Computation 5(1):1–4.
Besnard, P.; Grégoire, É.; and Raddaoui, B. 2013. A conditional logic-based argumentation framework. In International Conference on Scalable Uncertainty Management, 44–56. Springer.
Booth, R.; Kaci, S.; Rienstra, T.; and van der Torre, L. 2012. Conditional acceptance functions. In 4th International Conference on Computational Models of Argument (COMMA 2012), 470–477.
Brewka, G., and Woltran, S. 2010. Abstract dialectical frameworks. In Twelfth International Conference on the Principles of Knowledge Representation and Reasoning.
Brewka, G.; Strass, H.; Ellmauthaler, S.; Wallner, J. P.; and Woltran, S. 2013. Abstract dialectical frameworks revisited. In Twenty-Third International Joint Conference on Artificial Intelligence.
Brewka, G.; Ellmauthaler, S.; Strass, H.; Wallner, J. P.; and Woltran, S. 2017. Abstract dialectical frameworks: An overview. The IfCoLog Journal of Logics and their Applications 4(8):2263–2317.
Čyras, K., and Toni, F. 2015. Non-monotonic inference properties for assumption-based argumentation. In TAFA, 92–111. Springer.
de Finetti, B. 1974. Theory of probability (2 vols.).
Dung, P. M. 1995. On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence 77:321–358.
García, A. J., and Simari, G. R. 2004. Defeasible logic programming: An argumentative approach. TPLP 4(1+2):95–138.
Goldszmidt, M., and Pearl, J. 1996. Qualitative probabilities for default reasoning, belief revision, and causal modeling. AI 84(1-2):57–112.
Gottlob, G. 1994. The power of beliefs or translating default logic into standard autoepistemic logic. In Foundations of Knowledge Representation and Reasoning. Springer. 133–144.
Heyninck, J., and Straßer, C. 2018. A comparative study of assumption-based approaches to reasoning with priorities. In Second Chinese Conference on Logic and Argumentation.
Heyninck, J.; Kern-Isberner, G.; and Thimm, M. 2020. On the correspondence between abstract dialectical frameworks and non-monotonic conditional logics. In 33rd International FLAIRS Conference.
Imielinski, T. 1987. Results on translating defaults to circumscription. Artificial Intelligence 32(1):131–146.
Kern-Isberner, G., and Simari, G. R. 2011. A default logical semantics for defeasible argumentation. In FLAIRS.
Kern-Isberner, G., and Thimm, M. 2018. Towards conditional logic semantics for abstract dialectical frameworks. In C. I. C. et al., ed., Argumentation-based Proofs of Endearment, volume 37 of Tributes. College Publications.
Kern-Isberner, G. 2001. Conditionals in nonmonotonic reasoning and belief revision: considering conditionals as agents. Springer-Verlag.
Kraus, S.; Lehmann, D.; and Magidor, M. 1990. Nonmonotonic reasoning, preferential models and cumulative logics. AI 44(1-2):167–207.
Li, Z.; Oren, N.; and Parsons, S. 2017. On the links between argumentation-based reasoning and nonmonotonic reasoning. In TAFA, 67–85. Springer.
Linsbichler, T., and Woltran, S. 2016. Revision of abstract dialectical frameworks: Preliminary report. In First International Workshop on Argumentation in Logic Programming and Non-Monotonic Reasoning, Arg-LPNMR 2016.
Makinson, D. 1988. General theory of cumulative inference. In NMR, 1–18. Springer.
Modgil, S., and Prakken, H. 2018. Abstract rule-based argumentation.
Nute, D. 1984. Conditional logic. In Handbook of philosophical logic. Springer. 387–439.
Ramsey, F. P. 2007. General propositions and causality.
Rienstra, T. 2014. Argumentation in flux: modelling change in the theory of argumentation. Ph.D. Dissertation, University of Luxembourg.
Spohn, W. 1988. Ordinal conditional functions: A dynamic theory of epistemic states. In Causation in decision, belief change, and statistics. Springer. 105–134.
Strass, H. 2013. Approximating operators and semantics for abstract dialectical frameworks. Artificial Intelligence 205:39–70.
Strass, H. 2014. On the relative expressiveness of argumentation frameworks, normal logic programs and abstract dialectical frameworks. In 15th International Workshop on Non-Monotonic Reasoning, 292.
Strass, H. 2015. Instantiating rule-based defeasible theories in abstract dialectical frameworks and beyond. Journal of Logic and Computation 28(3):605–627.

Inductive Reasoning with Difference-making Conditionals

Meliha Sezgin¹, Gabriele Kern-Isberner¹, Hans Rott²
¹Department of Computer Science, TU Dortmund University, Germany
²Department of Philosophy, University of Regensburg, Germany
meliha.sezgin@tu-dortmund.de, gabriele.kern-isberner@cs.uni-dortmund.de, hans.rott@ur.de

Abstract

In belief revision theory, conditionals are often interpreted via the Ramsey test. However, the classical Ramsey Test fails to take into account a fundamental feature of conditionals as used in natural language: typically, the antecedent is relevant to the consequent. Rott has extended the Ramsey Test by introducing so-called difference-making conditionals that encode a notion of relevance. This paper explores difference-making conditionals in the framework of Spohn's ranking functions. We show that they can be expressed by standard conditionals together with might conditionals. We prove that this reformulation is fully compatible with the logic of difference-making conditionals, as introduced by Rott. Moreover, using c-representations, we propose a method for inductive reasoning with sets of difference-making conditionals and also provide a method for revising ranking functions by a set of difference-making conditionals.

1 Introduction

On most accounts of conditionals, a conditional of the form 'If A then B' is true or accepted if (but not only if) B is true or accepted and A does not undermine B's truth or acceptance. On the suppositional account, for instance, if you believe B and the supposition that A is true does not remove B, you may (and must!) accept 'If A, then B'. On this account, there is no need that A furthers B or supports B or is evidence or a reason for B. This does not square well with the way we use conditionals in natural language. Skovgaard-Olsen et al. (2019) have conducted an empirical study and concluded that the positive relevance reading (reason-relation reading) of indicative conditionals is a conventional aspect of their meaning which cannot be cancelled 'without contradiction'. This, of course, is helpful only if the notion of contradiction is clear, but we aim to flesh out the positive relevance reading in an intuitive and yet precise way. The difference-making conditionals studied in this paper aim at capturing the relevance reading that is conveyed semantically or pragmatically by the utterance of conditionals in natural language. (Unfortunately, use of the term 'relevance conditionals' has been preempted by a completely different use in linguistics.) Let us begin by giving an example that illustrates what we mean by the term 'relevance':

Example 1. An agent wanted to escape the hustle and bustle of the city and decided to move into an old farm house in the countryside. Unfortunately, the weather quickly changed and it became cold (c). Due to the low temperatures one of the rather old pipes in the house broke (b) and the agent had to call a plumber (p) to get the damage fixed.

In this example, it is clear that the cold temperatures are the reason for the broken pipe. Yet, this is not well reflected if we use a standard conditional 'If it is cold then the pipe will break'. We would rather say that the pipe broke because it was cold. The notion of relevance featuring here is encoded in the Relevant Ramsey Test which governs difference-making conditionals, first introduced under a different name by Rott (1986) and then studied in Rott (2019). Except for a very recent paper by Raidl (2020), the logic of difference-making conditionals has been explored only in a purely qualitative framework. We characterize difference-making conditionals in the framework of Spohn's (1988) ranking functions and provide a simple and elegant semantics which we can use to define an inductive representation, that is, to build up an epistemic state from a (conditional) knowledge base, as well as a revision method for difference-making conditionals. Our main contributions in this paper are the following:

• We transfer Rott's notion of difference-making conditionals to the framework of ordinal conditional functions and reformulate the relevant Ramsey Test in this framework.
• We define an inductive representation for a set of difference-making conditionals in the framework of ranking functions.
• We set up a method for revising a ranking function by a set of difference-making conditionals, and we elaborate this general method for revising by a single difference-making conditional in the ranking functions framework, based on the c-revisions introduced by Kern-Isberner (2001).
• We compare the notion of evidence or support captured by difference-making conditionals to the one offered in related approaches like the 'evidential conditionals' of Crupi and Iacona (2019a) or Spohn's (2012) notion of 'reason'.

The rest of this paper is organized as follows: In section 2, we define the formal preliminaries and notations used throughout the paper. Section 3 summarizes concepts and results from Rott's (2019) work on difference-making conditionals. Then, in section 4, we define a ranking semantics for difference-making conditionals via an OCF-version of the Relevant Ramsey Test and prove the basic principles using a reformulation of a difference-making conditional as a pair of more standard conditionals. In section 5, we construct an inductive representation for sets of difference-making conditionals using c-representations. Section 6 introduces a method for revising by difference-making conditionals based on c-revisions in the framework of ranking functions. In section 7, we discuss alternative approaches to incorporating relevance in conditionals. The concluding section 8 sums up our findings.

sense of Halpern (2003), most commonly represented as probability distributions, possibility distributions (Dubois and Prade 2006) or ordinal conditional functions (Spohn 1988, 2012). A knowledge base is consistent if and only if there is (a representation of) an epistemic state that accepts the knowledge base, i.e., all conditionals in ∆. Ordinal conditional functions (OCFs, also called ranking functions) κ : Ω → N ∪ {∞}, with κ⁻¹(0) ≠ ∅, assign to each world ω an implausibility rank κ(ω). OCFs were first introduced by Spohn (1988). The higher κ(ω), the less plausible ω is, and the normalization constraint requires that there are worlds having maximal plausibility. Then one puts κ(A) := min{κ(ω) | ω |= A} and κ(∅) = ∞. Due to κ⁻¹(0) ≠ ∅, at least one of κ(A) and κ(¬A) must be 0. A proposition A is believed if κ(¬A) > 0, and the belief set of a ranking function κ is defined as Bel(κ) = Th(κ⁻¹(0)).

2 Formal Preliminaries

Let L be a finitely generated propositional language over an alphabet Σ with atoms a, b, c, . . . and with formulas A, B, C, . . ..
For conciseness of notation, we will omit the logical and-connector, writing AB instead of A ∧ B, and overlining formulas will indicate negation, i.e., Ā means ¬A. The set of all propositional interpretations over Σ is denoted by ΩΣ. As the signature will be fixed throughout the paper, we will usually omit the subscript and simply write Ω. ω |= A means that the propositional formula A ∈ L holds in the possible world ω ∈ Ω; then ω is called a model of A, and the set of all models of A is denoted by Mod(A). For propositions A, B ∈ L, A |= B holds iff Mod(A) ⊆ Mod(B), as usual. By slight abuse of notation, we will use ω both for the model and the corresponding conjunction of all positive or negated atoms. This will allow us to ease notation a lot. Since ω |= A means the same for both readings of ω, no confusion will arise. The set of classical consequences of a set of formulas A ⊆ L is Cn(A) = {B | A |= B}. The deductively closed set of formulas which has exactly a subset W ⊆ Ω as a model is called the formal theory of W and defined as Th(W) = {A ∈ L | ω |= A for all ω ∈ W}. We extend L to a conditional language (L|L) by introducing a conditional operator (·|·), so that (L|L) = {(B|A) | A, B ∈ L}. (L|L) is a flat conditional language; no nesting of conditionals is allowed. A is called the antecedent of (B|A), and B is its consequent. (B|A) expresses 'If A, then (plausibly) B'. In the following, conditionals (B|A) ∈ (L|L) are referred to as standard conditionals or, if there is no danger of confusion, simply conditionals. We further extend our framework of conditionals to a language with might conditionals ⟨L|L⟩ by introducing a might conditional operator ⟨·|·⟩ (the angle brackets are supposed to remind the reader of a split diamond operator). For a might conditional ⟨D|C⟩, we call C the antecedent and D the consequent. As for standard conditionals, ⟨L|L⟩ is a flat conditional language, and ⟨D|C⟩ expresses 'If C, then D might be the case'.
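The ranking-function machinery introduced in these preliminaries can be prototyped in a few lines. The following is our own illustrative sketch (not from the paper): worlds are tuples of truth values over two atoms, formulas are Python predicates over worlds, and an OCF is a dict from worlds to ranks.

```python
INF = float("inf")

# Worlds over atoms (a, b) are tuples of truth values; an OCF kappa assigns each
# world a non-negative implausibility rank, with at least one world at rank 0.
kappa = {(True, True): 0, (True, False): 0, (False, True): 1, (False, False): 2}

def rank(kappa, formula):
    """kappa(A) = min{kappa(w) | w is a model of A}; an unsatisfiable A gets rank infinity."""
    ranks = [r for w, r in kappa.items() if formula(*w)]
    return min(ranks) if ranks else INF

def believes(kappa, formula):
    """A is believed iff kappa(not A) > 0, i.e. A holds in every rank-0 world."""
    return rank(kappa, lambda *w: not formula(*w)) > 0

A = lambda a, b: a   # the atom a as a formula
B = lambda a, b: b   # the atom b as a formula
```

With this κ, A is believed (its negation has rank 1 > 0), while B is not, since one of the most plausible worlds falsifies b.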
In a way, the might conditional ⟨D|C⟩ is the negation of the standard conditional (¬D|C) (Lewis 1973): the former is accepted iff the latter isn't. A (conditional) knowledge base is a finite set of conditionals ∆ = {(B1|A1), . . . , (Bn|An)} ∪ {⟨Bn+1|An+1⟩, . . . , ⟨Bm|Am⟩}. To give an appropriate semantics to (standard resp. might) conditionals and knowledge bases, we need richer semantic structures like epistemic states in the

Definition 1. A (standard) conditional (B|A) is accepted in an epistemic state represented by an OCF κ, written as κ |= (B|A), iff κ(AB) < κ(A¬B) or κ(A) = ∞. That is, the verification of (B|A) is more plausible than its falsification, or the premise of the conditional is always false.

Definition 2. A might conditional ⟨D|C⟩ is accepted in an epistemic state represented by an OCF κ, written as κ |= ⟨D|C⟩, if and only if κ ⊭ (¬D|C) or κ(C) = ∞, i.e., κ(CD) ≤ κ(C¬D) or κ(C) = ∞.

Note that accepting a might conditional is not equivalent to the acceptance of the conditional with negated consequent (κ |= (¬D|C)) but weaker, since it allows for indifference between CD and C¬D. In this case both (D|C) and (¬D|C) fail to be accepted.

3 The Ramsey Test, the Relevant Ramsey Test and difference-making conditionals

In the following, let Ψ be an epistemic state of any general format, and let Bel be an operator on belief states that assigns to Ψ the set of beliefs held in Ψ. Let ∗ be a revision operator on epistemic states, and let (B|A) be a conditional. The Ramsey Test (so-called after a footnote in Ramsey 1931) was made popular by Stalnaker (1968). According to it, 'If A then B' is accepted in a belief state just in case B is an element of the belief set Bel(Ψ ∗ A) that results from a revision of the belief state Ψ by the sentence A. Formally:

(RT) Ψ |= (B|A) iff B ∈ Bel(Ψ ∗ A).

If belief states are identified with ranking functions, the Ramsey Test reads as follows: κ |= (B|A) iff B ∈ Bel(κ ∗ A); this, taken together with Definition 1, implies a constraint on κ ∗ A. The condition B ∈ Bel(Ψ ∗ A) can be reformulated using some basic properties of ranking functions:

B ∈ Bel(κ ∗ A) = Th((κ ∗ A)⁻¹(0))
⇔ ω |= B for all ω with (κ ∗ A)(ω) = 0
⇔ (κ ∗ A)(B) < (κ ∗ A)(¬B)
⇔ (κ ∗ A)(¬B) > 0
⇔ κ ∗ A |= B.

Rott took ≫ to be an intrinsically contrastive connective. It is important to note, however, that unlike 'B because A' and 'Since A, B', which can only be accepted if A is believed to be true, the acceptance of A ≫ B neither entails nor is entailed by a particular belief status of A. (RRT) provides a clear and simple doxastic semantics for relevance-encoding conditionals with antecedents and consequents that may be arbitrary compounds of propositional sentences. Since (RRT) is more complex than (RT), it is hardly surprising that difference-making conditionals don't satisfy some of the usual principles for standard conditionals such as CM, Cut and Or. Rott discusses some examples showing how CM, Cut and Or can fail with difference-making conditionals. The most striking fact, however, is that difference-making conditionals do not even validate Right Weakening, which has long seemed entirely innocuous to conditional logicians. Rott even called the invalidity of RW the hallmark of difference-making conditionals and indeed of the relevance relation. Another notable property of difference-making conditionals is that B ∈ Cn(A) does not imply that A ≫ B is accepted. If B is accepted 'anyway' (like, for instance, a logical truth B is), then A cannot be relevant to B, even if it implies B. That many of the familiar principles for standard conditionals become invalid for difference-making conditionals does not mean that there is no logic to the latter.
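Acceptance in the sense of Definitions 1 and 2 amounts to two rank comparisons, which can be checked mechanically. A small sketch (our own code, in the same style as the preliminaries; the ranking below is chosen to exhibit the indifference situation noted with Definition 2):

```python
INF = float("inf")

# An OCF over worlds (c, d): tuples of truth values; formulas are predicates.
kappa = {(True, True): 0, (True, False): 0, (False, True): 1, (False, False): 1}

def rank(kappa, formula):
    """kappa(A) = min rank of a model of A; unsatisfiable formulas get rank infinity."""
    ranks = [r for w, r in kappa.items() if formula(*w)]
    return min(ranks) if ranks else INF

def accepts(kappa, D, C):
    """Definition 1: kappa |= (D|C) iff kappa(CD) < kappa(C and not-D), or kappa(C) = inf."""
    cd = rank(kappa, lambda *w: C(*w) and D(*w))
    cnd = rank(kappa, lambda *w: C(*w) and not D(*w))
    return cd < cnd or rank(kappa, C) == INF

def accepts_might(kappa, D, C):
    """Definition 2: kappa |= <D|C> iff kappa(CD) <= kappa(C and not-D), or kappa(C) = inf."""
    cd = rank(kappa, lambda *w: C(*w) and D(*w))
    cnd = rank(kappa, lambda *w: C(*w) and not D(*w))
    return cd <= cnd or rank(kappa, C) == INF

C = lambda c, d: c
D = lambda c, d: d
notD = lambda c, d: not d
```

With this κ we have κ(CD) = κ(C¬D) = 0: neither (D|C) nor (¬D|C) is accepted, while both might conditionals ⟨D|C⟩ and ⟨¬D|C⟩ are.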
Here are the basic principles of difference-making conditional operators that Rott (2019) shows to be complete with respect to the basic AGM postulates for belief revision (actually Rott uses a slight weakening of the basic AGM postulates that allows that revisions by non-contradictions may result in inconsistent belief sets):

(≫0) ⊥ ≫ ⊥.
(≫1) If A ≫ BC, then A ≫ B or A ≫ C.
(≫2a) A ≫ C iff (A ≫ AC and A ≫ A ∨ C).
(≫2b) A ≫ AC iff (not A ≫ A ∨ C and A ≫ A).
(≫3–4) ⊥ ≫ A ∨ C iff (⊥ ≫ A and A ≫ A ∨ C).
(≫5) A ∨ B ≫ ⊥ iff (A ≫ ⊥ and B ≫ ⊥).
(≫6) If Cn(A) = Cn(B) and Cn(C) = Cn(D), then: A ≫ C iff B ≫ D.

All of these principles are to be read as quantified over all belief states Ψ: 'A ≫ C' is short for 'Ψ |= A ≫ C' and 'not A ≫ C' is short for 'Ψ ⊭ A ≫ C'. Roughly, a principle of the form 'If ∆, then Γ' is valid iff for every belief state Ψ, if the (possibly negated) conditionals mentioned in ∆ are all accepted in Ψ, then the (possibly negated) conditionals mentioned in Γ are accepted in Ψ, too. It follows from principles (≫0)–(≫6) that (And) is also valid for difference-making conditionals. (≫1) is dual to the well-known principle of Disjunctive Rationality; it is called Conjunctive Rationality in Rott (2020). Like its dual, Conjunctive Rationality is a non-Horn condition. Another non-Horn condition is the right-to-left direction of (≫2b). The presence of non-Horn conditions means that reasoning with difference-making conditionals is not trivial. In order to determine what may be inferred from a knowledge base containing difference-making conditionals, we cannot simply use the axioms as closure operators. This is analogous to the problem of rational consequence relations in the sense of Lehmann and Magidor (1992) that have made it necessary to invent special inference methods like rational closure/system Z and c-representations. In the following, we will use the method of c-representations to deal with difference-making conditionals. A major part of our task ahead may be described as doing for c-representations what Booth and Paris (1998) achieved for rational closure.

We can also define a Ramsey Test for might conditionals: Ψ |= ⟨B|A⟩ iff ¬B ∉ Bel(Ψ ∗ A), that is, iff Ψ ⊭ (¬B|A). Or, more specifically, in terms of ranking functions: κ |= ⟨B|A⟩ iff ¬B ∉ Bel(κ ∗ A), that is, iff κ ⊭ (¬B|A), which follows from Definition 2. The condition ¬B ∉ Bel(Ψ ∗ A) can again be reformulated using some properties of ranking functions:

¬B ∉ Bel(κ ∗ A) = Th((κ ∗ A)⁻¹(0))
⇔ there is some ω with (κ ∗ A)(ω) = 0 such that ω |= B
⇔ (κ ∗ A)(B) = 0
⇔ κ ∗ A ⊭ ¬B.

Given assumptions on belief revision in the tradition of Alchourrón, Gärdenfors and Makinson (1985), Ramsey Test conditionals are known to satisfy, among other things, the following principles of And, Right Weakening, Cautious Monotonicity, Cut and Or:

(And) If (B|A) and (C|A), then (BC|A).
(RW) If (B|A) and C ∈ Cn(B), then (C|A).
(CM) If (B|A) and (C|A), then (C|AB).
(Cut) If (B|A) and (C|AB), then (C|A).
(Or) If (C|A) and (C|B), then (C|A ∨ B).

All of these principles are to be read as quantified over all belief states Ψ: '(B|A)' is short for 'Ψ |= (B|A)'. Roughly, a principle of the form 'If ∆, then (B|A)' is valid iff for every belief state Ψ, if the conditionals mentioned in ∆ are all accepted in Ψ, then (B|A) is accepted in Ψ, too.

The Ramsey Test falls squarely within the paradigm of the suppositional account mentioned above. Assume that an agent happens to believe B. Assume further that her beliefs are consistent with A (or that she actually already believes that A). Then, given a widely endorsed condition of belief preservation, the Ramsey Test rules that the agent is committed to accepting the conditional (B|A). There need not be any relation of relevance or support between A and B. In particular, if you happen to believe A and B, this is sufficient to require acceptance of (B|A).

How can the Ramsey Test be adapted to capture the idea that the antecedent should be relevant to the consequent? One straightforward way is to interpret conditionals as being contrastive: the antecedent should make a difference to the consequent. In order to implement this idea without introducing a dependence on the actual belief status of the antecedent, Rott (2019) suggests the following Relevant Ramsey Test:

(RRT) Ψ |= A ≫ B iff B ∈ Bel(Ψ ∗ A) and B ∉ Bel(Ψ ∗ ¬A).

We call conditionals that are governed by (RRT) difference-making conditionals, and we have changed the notation here from (B|A) to A ≫ B in order to mark our transition from standard would conditionals to difference-making conditionals. A ≫ B can be read as 'If A, then (relevantly) B.' Here the consequent is accepted if we revise the belief state by the antecedent, but the consequent fails to be accepted if we revise by the negation of the antecedent. Rott's idea was to liken conditionals to the natural-language connectives 'because' and 'since' that are widely taken to express the contrast that a cause or a reason is making to its effect. Thus

4 Ranking semantics for difference-making conditionals

In this section, we define a semantics for difference-making conditionals in the framework of Spohn's ranking functions. We make use of standard conditionals and might conditionals in order to express that the antecedent of the conditional is relevant to the consequent. We justify our definition of difference-making conditionals by showing that the Relevant Ramsey Test holds, and we show that the basic principles are satisfied.

only used standard conditionals. The might conditionals express that if it is not cold, then the pipe might not break, and if the pipe does not break, we might not call the plumber. Here the might conditionals formulated in natural language perhaps sound a bit odd, but together with the standard conditionals they express the reason relations introduced by the difference-making conditionals.

Next, we turn to the basic principles for difference-making conditionals. Note that when checking the principles of Rott, instead of a general epistemic state Ψ, we use a ranking function κ.

Theorem 1. Let κ be a ranking function and let κ |= A ≫ B be as defined in (2). Then · ≫ · satisfies the basic principles of difference-making conditionals.

Proof. (≫0): We show that κ |= ⊥ ≫ ⊥, i.e., κ ∗ ⊥ |= ⊥ and κ ∗ ⊤ ⊭ ⊥. These are true by the success and consistency conditions for revisions, respectively.
(≫1): Let κ |= A ≫ BC. We have to show that κ |= A ≫ B or κ |= A ≫ C. Via (2) it follows that we have to show that κ(ABC) < κ(A(¬B ∨ ¬C)) and κ(¬A(¬B ∨ ¬C)) ≤ κ(¬ABC) implies κ(AB) < κ(A¬B) and κ(¬A¬B) ≤ κ(¬AB), or κ(AC) < κ(A¬C) and κ(¬A¬C) ≤ κ(¬AC). From κ(ABC) < κ(A(¬B ∨ ¬C)), we derive κ(ABC) < κ(A¬B ∨ A¬C) = min{κ(A¬B), κ(A¬C)}, and hence both κ(ABC) < κ(A¬B) and κ(ABC) < κ(A¬C). Since ABC |= AB, AC we obtain κ(AB) < κ(A¬B) and κ(AC) < κ(A¬C). Moreover, from κ(¬A(¬B ∨ ¬C)) ≤ κ(¬ABC), we derive that either κ(¬A¬B) ≤ κ(¬ABC) or κ(¬A¬C) ≤ κ(¬ABC). Since ¬ABC |= ¬AB, ¬AC we obtain that either κ(¬A¬B) ≤ κ(¬AB) or κ(¬A¬C) ≤ κ(¬AC).
(≫2a): We have to show that κ |= A ≫ C iff (κ |= A ≫ AC and κ |= A ≫ A ∨ C). Via (2) it follows that we have to show that κ(AC) < κ(A¬C) and κ(¬A¬C) ≤ κ(¬AC) iff (κ(AC) < κ(A¬C) and κ(¬A) ≤ κ(⊥)) and (κ(A) < κ(⊥) and κ(¬A¬C) ≤ κ(¬AC)). This holds trivially.
(≫2b): We have to show that κ |= A ≫ AC iff (not κ |= A ≫ A ∨ C and κ |= A ≫ A). Via (2) it follows that we have to show that κ(AC) < κ(A¬C) and κ(¬A) ≤ κ(⊥) iff (κ(A) ≥ κ(⊥) or κ(¬AC) < κ(¬A¬C)) and κ(A) < κ(⊥) and κ(¬A) ≤ κ(⊥). This holds trivially.
(≫3–4): We have to show that κ |= ⊥ ≫ A ∨ C iff (κ |= ⊥ ≫ A and κ |= A ≫ A ∨ C).
Via (2) it follows that we have to show that κ(¬A¬C) ≤ κ(A ∨ C) iff κ(¬A) ≤ κ(A) and κ(¬A¬C) ≤ κ(¬AC). The direction from left to right is immediate. For the converse direction, note that κ(¬A¬C) ≤ κ(¬AC) implies that κ(¬A¬C) = κ(¬A). So we get from κ(¬A) ≤ κ(A) and κ(¬A¬C) ≤ κ(¬AC) that κ(¬A¬C) ≤ min{κ(A), κ(¬AC)} = κ(A ∨ C), as desired.
(≫5): We have to show that κ |= A ∨ B ≫ ⊥ iff (κ |= A ≫ ⊥ and κ |= B ≫ ⊥). But conditionals with impossible consequents are accepted iff the antecedents are impossible, i.e., have κ-rank ∞. So the claim follows from the fact that κ(A ∨ B) = min{κ(A), κ(B)}.

Definition 3 (Relevant Ramsey Test for OCFs). Let κ be an OCF, A ≫ B be a difference-making conditional and ∗ a revision operator for OCFs. We define the Relevant Ramsey Test for OCFs as follows:

(RRTocf) κ |= A ≫ B iff B ∈ Bel(κ ∗ A) and B ∉ Bel(κ ∗ ¬A).

Using some basic properties of ranking functions, we can reformulate (RRTocf):

(1) κ |= A ≫ B iff κ ∗ A |= B and κ ∗ ¬A ⊭ B.

From (1), we obtain for A with κ(A), κ(¬A) < ∞:

(2) κ |= A ≫ B iff κ |= {(B|A), ⟨¬B|¬A⟩},

iff both of the following two conditions hold:

(3) κ(AB) < κ(A¬B) and
(4) κ(¬A¬B) ≤ κ(¬AB).

Difference-making conditionals defined by (RRTocf) can be expressed by pairs of conditionals. The first conditional (B|A) corresponds to the first part of the (RRTocf), B ∈ Bel(κ ∗ A), using basically the standard Ramsey Test. The clause for (RRTocf) implies the clause for the standard Ramsey Test. The second conditional ⟨¬B|¬A⟩ corresponds to the second part of the (RRTocf), namely B ∉ Bel(κ ∗ ¬A). We now continue with Example 1 in order to elucidate our reformulation in (2).

Example 2 (Continues Example 1). The agent's pipe broke because the temperatures were too low, and therefore she had to call a plumber to have the pipe fixed. These connections can be expressed using the difference-making conditionals c ≫ b and b ≫ p. Applying (2), we can reformulate ∆≫ = {c ≫ b, b ≫ p} = {(b|c), ⟨¬b|¬c⟩, (p|b), ⟨¬p|¬b⟩}.
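Conditions (3) and (4) make acceptance of a difference-making conditional directly computable from a ranking. The sketch below (our own code) checks Example 2's conditionals over the atoms c, b, p against one admissible κ, namely the ranking that charges one penalty point per falsified standard conditional; this particular κ is our assumption for illustration, not a construction from the paper.

```python
from itertools import product

INF = float("inf")
WORLDS = list(product((True, False), repeat=3))  # worlds (c, b, p): cold, broken pipe, plumber

# One penalty point per falsified conditional from {(b|c), (p|b)}.
kappa = {(c, b, p): (c and not b) + (b and not p) for (c, b, p) in WORLDS}

def rank(formula):
    ranks = [r for w, r in kappa.items() if formula(*w)]
    return min(ranks) if ranks else INF

def accepts_dmc(A, B):
    """kappa |= A >> B iff kappa(AB) < kappa(A, not-B)   -- condition (3)
    and kappa(not-A, not-B) <= kappa(not-A, B)           -- condition (4)."""
    conj = lambda f, g: rank(lambda *w: f(*w) and g(*w))
    neg = lambda f: (lambda *w: not f(*w))
    return conj(A, B) < conj(A, neg(B)) and conj(neg(A), neg(B)) <= conj(neg(A), B)

C   = lambda c, b, p: c
Br  = lambda c, b, p: b
P   = lambda c, b, p: p
Top = lambda c, b, p: True
```

Here `accepts_dmc(C, Br)` and `accepts_dmc(Br, P)` both hold, i.e. this κ accepts c ≫ b and b ≫ p, while `accepts_dmc(C, Top)` fails: nothing is relevant to a tautology, reflecting the failure of Right Weakening discussed in section 3.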
The standard conditionals express that if it is cold, then the pipe will break, and if the pipe breaks, then the agent will call a plumber. But the reason relation would get neglected if we used only the standard conditionals.

(≫6): If Cn(A) = Cn(B) and Cn(C) = Cn(D), then A ≫ C iff B ≫ D. This follows trivially, since structurally analogous compounds of logically equivalent sentences are logically equivalent and thus get the same κ-ranks.

The basic principles explore the logic of conditionals governed by (RRT). The reformulation in Definition 3 shows that the notion of the Relevant Ramsey Test can be transferred to the OCF framework. The relevance of the antecedent to the consequent can be expressed by splitting the two directions within (RRT_ocf) into two conditionals, one might and one standard conditional. In Theorem 1, we have shown that this reformulation serves the logic behind difference-making conditionals. Theorem 1 should be compared with the results of Raidl (2020). Rott (2019) explained the meanings of some basic difference-making conditionals, and the explanations still work within the OCF framework. They are collected in Table 1. Note that the meanings also reflect the idea of the basic principles. For example, (≫2a) says that C is in Bel(κ ∗ A) and not in Bel(κ ∗ Ā) iff A ≫ AC and A ≫ A ∨ C, which is exactly the meaning of these two basic difference-making conditionals. Also for (≫2b) the meanings of the difference-making conditionals on both sides of 'iff' are exactly the same.

⊥ ≫ ⊥ | κ ∗ ⊤ ⊭ ⊥ | Bel(κ) is consistent
A ≫ ⊥ | κ ∗ A |= ⊥ | A is a doxastic impossibility
Ā ≫ ⊥ | κ ∗ Ā |= ⊥ | A is a doxastic necessity
⊥ ≫ A | κ ∗ ⊤ ⊭ A | A is a non-belief
A ≫ A | κ ∗ Ā ⊭ A | A is contingent
A ≫ AC | κ ∗ A |= C and κ ∗ Ā ⊭ ⊥ | C is in Bel(κ ∗ A) and A is contingent
A ≫ A ∨ C | κ ∗ Ā ⊭ C | C is not in Bel(κ ∗ Ā)
not A ≫ A ∨ C | κ ∗ Ā |= C | C is in Bel(κ ∗ Ā)

Table 1: The meanings of some basic difference-making conditionals.

5 Inductive representation of difference-making conditionals

In this section, we define an inductive representation of sets of difference-making conditionals ∆≫ by setting up epistemic states in the form of OCFs that are admissible with respect to ∆≫. We use the approach of c-representations first introduced by Kern-Isberner (2001). C-representations are not only capable of setting up epistemic states that represent sets of standard conditionals, but were also extended to might conditionals (see Eichhorn, Kern-Isberner and Ragni 2018). By combining the representation of standard and might conditionals, we get a c-representation of sets of difference-making conditionals. First, we turn to the application of the technique of c-representations to sets of standard and might conditionals.

Proposition 2 (C-representation of sets of standard and might conditionals). Let ∆ = {(B_i|A_i)}_{i=1,...,n} ∪ {⟨B_i|A_i⟩}_{i=n+1,...,m} be a set of standard and might conditionals. A c-representation of ∆ is given by an OCF of the form

κ^c_∆(ω) = Σ_{1≤i≤m, ω|=A_iB̄_i} κ⁻_i (5)

with non-negative impact factors κ⁻_i for each conditional (B_i|A_i) ∈ ∆ resp. ⟨B_i|A_i⟩ ∈ ∆ satisfying

κ⁻_i (≥) min_{ω|=A_iB_i} { Σ_{k≠i, ω|=A_kB̄_k} κ⁻_k } − min_{ω|=A_iB̄_i} { Σ_{k≠i, ω|=A_kB̄_k} κ⁻_k } (6)

for all 1 ≤ i ≤ m. If i ∈ {1,...,n}, i.e. the impact factor stands for a standard conditional, then we need the strict inequality '>'. If i ∈ {n+1,...,m}, i.e. the impact factor stands for a might conditional, then we do not need strictness and '≥' is sufficient.

To calculate a c-representation of a set of conditionals ∆ we need to solve a system of inequalities, given by formula (6) for each i = 1,...,m, which ensures κ^c_∆ |= ∆. More precisely, with the ranks of formulas and formula (5), the constraint κ^c_∆(A_iB_i) < κ^c_∆(A_iB̄_i) for 1 ≤ i ≤ n resp. κ^c_∆(A_iB_i) ≤ κ^c_∆(A_iB̄_i) for n+1 ≤ i ≤ m expands to

min_{ω|=A_iB_i} { Σ_{ω|=A_kB̄_k} κ⁻_k } (≤) min_{ω|=A_iB̄_i} { Σ_{ω|=A_kB̄_k} κ⁻_k } (7)

where we refer to the sums on the left-hand side as (7a) and to those on the right-hand side as (7b). The left minimum ranges over the models of A_iB_i, so the conditional (B_i|A_i) resp. ⟨B_i|A_i⟩ is not falsified by any considered world and thus κ⁻_i is no element of any sum (7a). As opposed to this, the right minimum ranges over the models of A_iB̄_i, so the conditional (B_i|A_i) resp. ⟨B_i|A_i⟩ is falsified by every considered world and thus κ⁻_i is an element of every sum (7b). With these deliberations, we can rewrite the inequalities to

min_{ω|=A_iB_i} { Σ_{k≠i, ω|=A_kB̄_k} κ⁻_k } (≤) κ⁻_i + min_{ω|=A_iB̄_i} { Σ_{k≠i, ω|=A_kB̄_k} κ⁻_k } (8)

and therefore

κ⁻_i (≥) min_{ω|=A_iB_i} { Σ_{k≠i, ω|=A_kB̄_k} κ⁻_k } − min_{ω|=A_iB̄_i} { Σ_{k≠i, ω|=A_kB̄_k} κ⁻_k } (9)

for all 1 ≤ i ≤ m. As we have seen, save for the strictness the inequalities defining impact factors for standard and might conditionals are the same and can therefore be expressed using '(≥)'. Also note that c-representations are not unique, since the solution of the system of inequalities is not unique. If the system of inequalities in (6) has a solution, then ∆ is consistent and (5) is a model of ∆. For the converse, Kern-Isberner (2001, p. 69; 2004, p. 26) has shown that every finite consistent knowledge base consisting solely of standard conditionals has a c-representation; but it is still an open question whether this result extends to knowledge bases including might conditionals.

In the special case when the set ∆≫ consists of just one difference-making conditional ∆≫ = {A ≫ B}, the system of inequalities always has a solution and we can define κ^c_{A≫B} as follows:

Theorem 3. Let A ≫ B be a difference-making conditional. κ^c_{A≫B} is a c-representation of A ≫ B iff there are integers κ⁻_st, κ⁻_w such that, for all ω ∈ Ω,

κ^c_{A≫B}(ω) = κ⁻_st if ω |= AB̄, κ⁻_w if ω |= ĀB, and 0 otherwise, (13)

and

κ⁻_st > 0 and κ⁻_w ≥ 0. (14)
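The c-representation construction can be sketched in code. The following minimal Python sketch (helper names are mine; the impact factors κ⁻₁ = κ⁻₂ = 1 and λ⁻₁ = λ⁻₂ = 0 follow the paper's running example {c ≫ b, b ≫ p}) ranks each world by summing the impacts of the conditionals it falsifies, and then checks acceptance of c ≫ p:

```python
from itertools import product

# Each conditional is (antecedent, consequent, impact); it is falsified by a
# world satisfying the antecedent but not the consequent.
# c >> b and b >> p split into (b|c), (p|b) and the might conditionals
# <~b|~c>, <~p|~b> (written with ~ for negation).
conditionals = [
    (lambda w: w["c"], lambda w: w["b"], 1),          # (b|c), impact 1
    (lambda w: w["b"], lambda w: w["p"], 1),          # (p|b), impact 1
    (lambda w: not w["c"], lambda w: not w["b"], 0),  # might <~b|~c>, impact 0
    (lambda w: not w["b"], lambda w: not w["p"], 0),  # might <~p|~b>, impact 0
]
worlds = [dict(zip("cbp", bits)) for bits in product([True, False], repeat=3)]

def kappa(w):
    """Rank of a world: total impact of the conditionals it falsifies."""
    return sum(imp for ante, cons, imp in conditionals if ante(w) and not cons(w))

def rank(cond):
    """Rank of a formula: minimal rank over its models."""
    return min((kappa(w) for w in worlds if cond(w)), default=float("inf"))

# Acceptance of c >> p via (3) and (4).
transitive = (rank(lambda w: w["c"] and w["p"]) < rank(lambda w: w["c"] and not w["p"])
              and rank(lambda w: not w["c"] and not w["p"]) <= rank(lambda w: not w["c"] and w["p"]))
```

With these impacts `transitive` comes out `True`, mirroring the observation that κ |= c ≫ p holds for this knowledge base.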
According to (2), a difference-making conditional A ≫ B can be reformulated as a set consisting of a standard and a might conditional, {(B|A), ⟨B̄|Ā⟩}. So a set of difference-making conditionals ∆≫ = {A_i ≫ B_i | i = 1,...,n} can be implemented via {(B_k|A_k) | k = 1,...,n} ∪ {⟨B̄_l|Ā_l⟩ | l = 1,...,n}. In this way, we can get an inductive representation of ∆≫ by a c-representation as follows:

Definition 4 (C-representation for sets of difference-making conditionals). Let ∆≫ = {A_i ≫ B_i | i = 1,...,n} be a set of difference-making conditionals. An OCF κ is a c-representation of ∆≫ iff

κ^c_∆≫(ω) = Σ_{k=1..n, ω|=A_kB̄_k} κ⁻_k + Σ_{l=1..n, ω|=Ā_lB_l} λ⁻_l (10)

with non-negative impact factors κ⁻_k resp. λ⁻_l for each conditional (B_k|A_k) ∈ ∆≫ resp. ⟨B̄_l|Ā_l⟩ ∈ ∆≫ satisfying

κ⁻_j > min_{ω|=A_jB_j} { Σ_{k≠j, ω|=A_kB̄_k} κ⁻_k + Σ_{ω|=Ā_lB_l} λ⁻_l } − min_{ω|=A_jB̄_j} { Σ_{k≠j, ω|=A_kB̄_k} κ⁻_k + Σ_{ω|=Ā_lB_l} λ⁻_l } (11)

and

λ⁻_j ≥ min_{ω|=Ā_jB̄_j} { Σ_{ω|=A_kB̄_k} κ⁻_k + Σ_{l≠j, ω|=Ā_lB_l} λ⁻_l } − min_{ω|=Ā_jB_j} { Σ_{ω|=A_kB̄_k} κ⁻_k + Σ_{l≠j, ω|=Ā_lB_l} λ⁻_l }. (12)

The minima on the left-hand side range over worlds verifying the corresponding (standard resp. might) conditional, and the minima on the right-hand side range over worlds falsifying these. Equations (11) and (12) ensure that the impact factors are chosen such that κ^c_∆≫ |= ∆≫. Just like in (6), (11) resp. (12) follows from the success condition in (3) resp. (4). Since we chose different impact factors κ⁻ resp. λ⁻ for the standard resp. the might conditionals, the terms in the minima look more complex, even though they can be derived from (6). Also, we replaced the general form of might conditionals ⟨B|A⟩ by the more specific might conditional ⟨B̄_i|Ā_i⟩, taking advantage of the special structure of difference-making conditionals. C-representations of difference-making conditionals exist iff all inequalities (11) and (12) are solvable. Sets of difference-making conditionals can thus be inductively represented by a c-representation. The crucial part is the reformulation of difference-making conditionals as sets of one standard and one might conditional in (2). Due to the high adaptability of the approach of c-representations, it is possible to deal with such a set of mixed conditionals.

Let us now continue with our example concerning the agent's broken pipe:

Example 3 (Example 2 continued). Using the representation of the set of difference-making conditionals ∆≫ = {c ≫ b, b ≫ p} as pairs of standard and might conditionals from Example 2, we can construct a c-representation κ^c_∆≫ using Definition 4. First we have to solve the system of inequalities defining the impact factors. Let κ⁻_1 and λ⁻_1 correspond to the standard and the might conditional representations of c ≫ b, and let κ⁻_2 and λ⁻_2 apply similarly to b ≫ p:

κ⁻_1 > min{0, κ⁻_2} − min{0, λ⁻_2} = 0,
λ⁻_1 ≥ min{0, λ⁻_2} − min{0, κ⁻_2} = 0,
κ⁻_2 > min{0, λ⁻_1} − min{0, λ⁻_1} = 0,
λ⁻_2 ≥ min{0, κ⁻_1} − min{0, κ⁻_1} = 0.

We take the minimum of the summed-up impact factors indicating that other conditionals are falsified. Since the impact factors are non-negative, the minima equal zero. We choose κ⁻_1 = κ⁻_2 = 1 and λ⁻_1 = λ⁻_2 = 0 and get the c-representation presented in Table 2. It is easy to verify that κ^c_∆≫ |= c ≫ p, so in this example the difference-making conditionals satisfy transitivity. Note, however, that transitivity is only 'valid by default', that is, it can easily be undercut by the addition of another premise. For instance, it is possible to consistently add c ≫ p̄ as a third premise to ∆≫. The extended knowledge base has a c-representation (based on κ⁻_1 = κ⁻_3 = 2, κ⁻_2 = 1 and λ⁻_1 = λ⁻_2 = λ⁻_3 = 0) that does not satisfy c ≫ p because it does not even satisfy (p|c).

ω | κ^c_∆≫(ω)            ω | κ^c_∆≫(ω)
cbp | 0                  c̄bp | λ⁻_1 = 0
cbp̄ | κ⁻_2 = 1           c̄bp̄ | κ⁻_2 + λ⁻_1 = 1
cb̄p | κ⁻_1 + λ⁻_2 = 1    c̄b̄p | λ⁻_2 = 0
cb̄p̄ | κ⁻_1 = 1           c̄b̄p̄ | 0

Table 2: The ranking function κ^c_∆≫ of Example 3.

In order to illustrate c-representations of difference-making conditionals further, Theorem 3 covers the special case of a single difference-making conditional.

Proof (of Theorem 3). Let ∆≫ = {A ≫ B}. Since AB̄ ∧ ĀB ≡ ⊥, (13) follows immediately from (10). κ⁻_st > 0 follows from (11) and κ⁻_w ≥ 0 follows from (12), since there are no other difference-making conditionals to interact with.

6 Revision by difference-making conditionals

In this section we discuss a revision method for epistemic states represented by an OCF with one difference-making conditional. To this end, we make use of the characterisation of a difference-making conditional as a set of one standard conditional and one might conditional in (2) and provide a method for simultaneously revising an epistemic state by a standard and a might conditional.

C-revisions, introduced by Kern-Isberner (2001), provide a highly general framework for revising epistemic states by sets of conditionals. In the framework of ranking functions, c-revisions are capable of revising an OCF by a set of conditionals with respect to conditional interaction within the new information, while preserving conditional beliefs in the former belief state. This is all depicted in the principle of conditional preservation, which implies the Darwiche-Pearl postulates for revising epistemic states (Kern-Isberner 2001, 2004). We will now introduce a simplified version of c-revisions for sets of standard and might conditionals.

Proposition 4 (C-revisions by sets of standard and might conditionals). Let κ be an OCF specifying a prior epistemic state and let ∆ = {(B_i|A_i) | i = 1,...,n} ∪ {⟨B_i|A_i⟩ | i = n+1,...,m} be a set of standard and might conditionals which represents the new information. Then a c-revision of κ by ∆ is given by an OCF of the form

κ ∗ ∆(ω) = κ^∗_∆(ω) = κ_0 + κ(ω) + Σ_{ω|=A_iB̄_i} κ⁻_i (15)

with non-negative impact factors for each conditional (B_i|A_i) ∈ ∆ resp. ⟨B_i|A_i⟩ ∈ ∆ satisfying

κ⁻_i (≥) min_{ω|=A_iB_i} { κ(ω) + Σ_{k≠i, ω|=A_kB̄_k} κ⁻_k } − min_{ω|=A_iB̄_i} { κ(ω) + Σ_{k≠i, ω|=A_kB̄_k} κ⁻_k }. (16)

κ_0 is a normalization factor to ensure that κ^∗_∆ is an OCF. The κ⁻_i can be considered as impact factors of the single conditional (B_i|A_i) ∈ ∆ resp. ⟨B_i|A_i⟩ ∈ ∆ for falsifying the conditionals in ∆, which have to be chosen so as to ensure success, κ^∗_∆ |= ∆, by (16). As before, we use '(≥)' as a dummy operator which is replaced by the strict inequality symbol '>' for standard conditionals, while for might conditionals it is replaced by the inequality symbol '≥'. From the success condition κ^∗_∆(A_iB_i) (≤) κ^∗_∆(A_iB̄_i) and the ranks of formulas, it holds that

min_{ω|=A_iB_i} { κ_0 + κ(ω) + Σ_{ω|=A_kB̄_k} κ⁻_k } (≤) min_{ω|=A_iB̄_i} { κ_0 + κ(ω) + Σ_{ω|=A_kB̄_k} κ⁻_k }.

Since κ_0 is a constant factor, it can be removed from the inequality. As in c-representations, the factor κ⁻_i is no element of the left sum, whereas the right sum ranges over worlds falsifying (B_i|A_i) resp. ⟨B_i|A_i⟩ and therefore the factor κ⁻_i is an element of every sum. With these deliberations we can rewrite the inequalities to (16) for all 1 ≤ i ≤ m. Note that the impact factors defining c-revisions are not unique, because there are multiple solutions of the system of inequalities in (16). The question as to which choice of the impact factors is 'best' is part of our ongoing work.

Now we turn to the revision of an epistemic state by a single difference-making conditional in the framework of OCFs. In (2) we showed that revising by a difference-making conditional is equivalent to revising a ranking function by a special set of conditionals, since A ≫ B corresponds to {(B|A), ⟨B̄|Ā⟩}. Thus, we need a revision method which is capable of dealing with a mixed set of conditionals. As we have seen before, c-revisions are an adaptable revision method for sets of conditionals, both for standard and for might conditionals. Following the general schema of c-revisions, we get:

Definition 5 (C-revision by a difference-making conditional). Let κ be an OCF specifying a prior epistemic state and let A ≫ B = {(B|A), ⟨B̄|Ā⟩} be a difference-making conditional which represents the new information. Then a c-revision of κ by A ≫ B is given by an OCF of the form

κ ∗ A ≫ B(ω) = κ^∗_∆≫(ω) = κ_0 + κ(ω) + { κ⁻_st if ω |= AB̄; κ⁻_w if ω |= ĀB; 0 otherwise } (17)

with

κ⁻_st > κ(AB) − κ(AB̄) (18)

and

κ⁻_w ≥ κ(ĀB̄) − κ(ĀB). (19)

As before, κ_0 is a normalization factor. The premises of the standard and the might conditional defining the difference-making conditional A ≫ B are exclusive, so the set A ≫ B = {(B|A), ⟨B̄|Ā⟩} is consistent and κ^∗_∆≫ always exists. The form of κ^∗_∆≫ in (17) follows from (15):

κ ∗ (A ≫ B)(ω) = κ ∗ {(B|A), ⟨B̄|Ā⟩}(ω) = κ_0 + κ(ω) + Σ_{ω|=AB̄} κ⁻_st + Σ_{ω|=ĀB} κ⁻_w.

Since AB̄ and ĀB are exclusive and we revise with just a single difference-making conditional, we get (17). The success condition for the standard conditional (B|A) in (3) and the success condition for the might conditional ⟨B̄|Ā⟩ in (4) lead to the inequalities defining the impact factors κ⁻_st resp. κ⁻_w.
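Definition 5 can be turned into a small procedure. The sketch below (helper names mine) picks the minimal admissible impact factors from (18) and (19), realizes the normalization κ_0 by shifting the minimal rank back to 0, and revises an illustrative uniform prior (the prior is my assumption, not from the paper):

```python
# Sketch of a c-revision by a single difference-making conditional A >> B,
# following the shape of (17)-(19).

def c_revise_dmc(kappa, A, B):
    """Return kappa* with kappa*(w) = kappa(w) + k_st if w |= A~B,
       + k_w if w |= ~AB, where k_st, k_w are minimal admissible impacts."""
    worlds = list(kappa)
    def rank(cond):
        return min((kappa[w] for w in worlds if cond(w)), default=float("inf"))
    # (18): k_st > kappa(AB) - kappa(A~B);  (19): k_w >= kappa(~A~B) - kappa(~AB)
    k_st = max(rank(lambda w: A(w) and B(w)) - rank(lambda w: A(w) and not B(w)), 0) + 1
    k_w = max(rank(lambda w: not A(w) and not B(w)) - rank(lambda w: not A(w) and B(w)), 0)
    revised = {}
    for w in worlds:
        bump = k_st if A(w) and not B(w) else (k_w if (not A(w)) and B(w) else 0)
        revised[w] = kappa[w] + bump
    # the role of kappa_0: shift so the minimal rank is 0 again
    m = min(revised.values())
    return {w: r - m for w, r in revised.items()}

A = lambda w: bool(w[0])
B = lambda w: bool(w[1])
prior = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}  # no prior beliefs about A, B
posterior = c_revise_dmc(prior, A, B)
```

After the revision, the A¬B-worlds are strictly less plausible than the AB-worlds, while the ¬A-worlds satisfy condition (4), so the posterior accepts A ≫ B.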
For κ⁻_st it holds that (18) follows immediately from (16):

κ⁻_st > min_{ω|=AB} { κ(ω) + Σ_{ω|=A_kB̄_k} κ⁻_k } − min_{ω|=AB̄} { κ(ω) + Σ_{ω|=A_kB̄_k} κ⁻_k }.

The minima range over worlds satisfying AB resp. AB̄, so the might conditional ⟨B̄|Ā⟩ is not falsified by any considered world

ω | κ^∗(ω)               ω | κ^∗(ω)
cbpd | 0                 c̄bpd | 0
cbpd̄ | 0 + κ⁻_w = 0      c̄bpd̄ | 0 + κ⁻_w = 0
cbp̄d | 1                 c̄bp̄d | 1
cbp̄d̄ | 1 + κ⁻_w = 1      c̄bp̄d̄ | 1 + κ⁻_w = 1
cb̄pd | 1 + κ⁻_st = 2     c̄b̄pd | 0 + κ⁻_st = 1
cb̄pd̄ | 1                 c̄b̄pd̄ | 0
cb̄p̄d | 1 + κ⁻_st = 2     c̄b̄p̄d | 0 + κ⁻_st = 1
cb̄p̄d̄ | 1                 c̄b̄p̄d̄ | 0

The idea of incorporating relevance into the analysis of conditionals has been around for a long time, and several attempts to implement this kind of connective have been made. In this section, we explore and compare some of these ideas. The earliest work establishing a tight connection between conditionals and belief revision was Gärdenfors (1979).
In a similar vein, Fariñas and Herzig (1996) uncover a strong link between belief contraction (which is known to be dual to belief revision) and dependence. Their idea is close to the idea of relevance introduced in Rott (1986), and their work is cited by Rott (2019). This is what Fariñas and Herzig understand by the phrase 'B depends on A':

(FHD) Ψ |= A ❀ B iff B ∈ Bel(Ψ) and B ∉ Bel(Ψ −̇ A).

So B depends on A if and only if B is believed in the current belief state Ψ and B is no longer believed if A is withdrawn from the belief set of Ψ. There are some notable differences to the Relevant Ramsey Test. The most striking one is that the domain of Fariñas and Herzig's dependency relation is restricted to the agent's current belief set, since Ψ |= A ❀ B implies that A, B ∈ Bel(Ψ). It fails to acknowledge dependencies between non-beliefs, i.e., propositions that the agent either believes to be false or suspends judgement on, like the propositions featuring in counterfactuals, which typically are non-beliefs. A second strand of research to compare with the present one is the study of conditionals incorporating relevance in a probabilistic framework that was begun by Douven (2016) and Crupi and Iacona (2019b). Crupi and Iacona (2019a) suggested a non-probabilistic possible-worlds semantics for the 'evidential conditional' that can be defined as follows:

(CPC) A ✄ B iff (B|A) and (Ā|B̄).

Let us call such conditionals contraposing conditionals. Crupi and Iacona call a rule essentially identical to (CPC) the 'Chrysippus Test' (Crupi and Iacona 2019a) and say that it characterizes the evidential interpretation of conditionals, according to which 'a conditional is true just in case its antecedent provides evidence [or support] for its consequent.' Raidl (2019) provided the first completeness proof for the 'evidential conditional', which has been improved in Raidl, Crupi and Iacona (2020).
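On ranking functions the two acceptance conditions can be contrasted directly. A minimal sketch (the rank assignments are illustrative, not from the paper; the difference-making reading follows conditions (3)-(4), the evidential reading follows the contraposing definition of (CPC)):

```python
# Worlds are (a, b) pairs of truth values; kappa maps worlds to ranks.

def rank(kappa, cond):
    return min((v for w, v in kappa.items() if cond(w)), default=float("inf"))

def dmc(kappa, A, B):
    """A >> B: kappa(AB) < kappa(A~B) and kappa(~A~B) <= kappa(~AB)."""
    return (rank(kappa, lambda w: A(w) and B(w)) < rank(kappa, lambda w: A(w) and not B(w))
            and rank(kappa, lambda w: not A(w) and not B(w)) <= rank(kappa, lambda w: not A(w) and B(w)))

def evidential(kappa, A, B):
    """A |> B (contraposing): (B|A) and (~A|~B) both accepted."""
    return (rank(kappa, lambda w: A(w) and B(w)) < rank(kappa, lambda w: A(w) and not B(w))
            and rank(kappa, lambda w: not A(w) and not B(w)) < rank(kappa, lambda w: A(w) and not B(w)))

m = lambda w: bool(w[0])
r = lambda w: bool(w[1])
# Illustrative rankings: in kappa1, ~B is plausible when ~A; in kappa2,
# B is highly plausible regardless of A.
kappa1 = {(1, 1): 0, (1, 0): 1, (0, 0): 2, (0, 1): 3}
kappa2 = {(0, 1): 0, (0, 0): 1, (1, 1): 2, (1, 0): 3}
```

Here `kappa1` accepts the difference-making conditional but not the contraposing one, and `kappa2` the other way around, so the two notions of relevance are incomparable.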
Independently, Booth and Chandler (2020, Proposition 12) hit upon the same concept of contraposing conditionals and have started investigating it. Rott (2020) raises doubts as to whether contraposition really captures the idea of evidence or support. It is true that contraposing conditionals do violate RW, and this violation was called the hallmark of relevance by Rott. Except for that, contraposing conditionals are formally very well behaved, as they validate, for example, Or, Cautious Monotony, Negation Rationality and Disjunctive Rationality. These principles are all violated by difference-making conditionals. However, Rott argues that the contrastive notion of difference-making is better motivated as an explication of evidence and support than contraposition. The Relevant Ramsey Test—which can be found, under the name 'Strong Ramsey Test', already in Rott (1986)—has ancestors in Gärdenfors' (1980) notion of explanation and in Spohn's (1983) notion of reason, which both encode the idea that the

Table 3: Schematic c-revised ranking function κ^∗ = κ^c_∆≫ ∗ (d ≫ b) of Example 4. Note that κ_0 = 0, which is why it is not represented in this table.

and thus the sums are empty and we get (18). Analogously, (19) follows from (16):

κ⁻_w ≥ min_{ω|=ĀB̄} { κ(ω) + Σ_{ω|=A_lB̄_l} κ⁻_l } − min_{ω|=ĀB} { κ(ω) + Σ_{ω|=A_lB̄_l} κ⁻_l }.

The sums in the minima are empty because the standard conditional is not falsified by any world satisfying ĀB̄ resp. ĀB. Conditions (18) and (19) ensure the success condition κ^∗_∆≫ |= ∆≫. As we have seen, c-revision provides a revision method for OCFs which can handle sets of standard and might conditionals. The admissible impact factors allow for a combination of standard and might conditionals in the revision. Together with the special structure of difference-making conditionals, we obtain a revision method for epistemic states which takes a difference-making conditional as input and thereby ensures that the antecedent of the conditional is relevant to its consequent.
Now we give an example of a c-revision by a single difference-making conditional:

Example 4 (Example 3 continued). The plumber arrives at the agent's house and tells her that another common reason for broken pipes are deposits in the pipe (d). Since the house is pretty old, the pipe could also have broken because of these deposits. The agent revises her belief state κ^c_∆≫ with the new information d ≫ b = {(b|d), ⟨b̄|d̄⟩}. Note that κ^c_∆≫(ċḃṗ) = κ^c_∆≫(ċḃṗḋ) with ȧ ∈ {a, ā} for any Boolean variable a. Using (18) and (19), we calculate κ⁻_st > κ^c_∆≫(db) − κ^c_∆≫(db̄) = 0 and κ⁻_w ≥ κ^c_∆≫(d̄b̄) − κ^c_∆≫(d̄b) = 0, and choose κ⁻_st = 1 and κ⁻_w = 0. Using Definition 5 we get κ^c_∆≫ ∗ (d ≫ b) = κ^∗, which is depicted in Table 3. Note that in κ^∗ the difference-making conditional c ≫ b still holds, so the new reason relation between the deposits and the broken pipe does not overwrite the connection between the cold temperatures and the broken pipe.

7 Related Work

Difference-making conditionals establish a notion of relevance for conditionals, namely that the antecedent A of a conditional 'If A, then B' is relevant for its consequent B.

κ_2(mr) = 2 and κ_2(mr̄) = 3 capture scenario 2. As we can see, κ_1 |= m ≫ r, since κ_1(mr) = 0 < 1 = κ_1(mr̄) and κ_1(m̄r̄) = 2 ≤ 3 = κ_1(m̄r), but κ_1 ⊭ m ✄ r, since κ_1(m̄r̄) = 2 > 1 = κ_1(mr̄). For the second scenario, it holds that κ_2 |= m ✄ r but κ_2 ⊭ m ≫ r. If we compare this with our intuition about the relation between the medicine and recovery, we find that the difference-making conditional gets the example right.

Another argument for the notion of relevance encoded by difference-making conditionals is that they comply with Spohn's work, who defines causation as follows:

A is a cause of B iff A and B obtain, A precedes B, and A raises the metaphysical or epistemic status of B given the obtaining circumstances. (Spohn 2012, p. 352)

As we can see, this is a compound of facts, times, obtaining circumstances and a reason relation.
We do not deal with the first three components, but we can compare difference-making conditionals with Spohn's concept of reason. In terms of ranking functions, A is a reason for B if the following inequality holds for a ranking function κ:

explanans (or the reason) should raise the doxastic status of the explanandum (or of what the reason is a reason for).

If we define a ranking semantics for contraposing conditionals using the framework of Spohn's ranking functions, we can compare these two notions of relevance from a technical point of view. Let κ be a ranking function and A ✄ B a contraposing conditional with contingent A and B. Then

(CPC_ocf) κ |= A ✄ B iff κ |= (B|A) and κ |= (Ā|B̄) iff both of the following two conditions hold:

κ(AB) < κ(AB̄) and (20)
κ(ĀB̄) < κ(AB̄). (21)

Difference-making and contraposing conditionals both require the acceptance of the standard conditional (B|A), but they differ in the case when the antecedent is denied. Compare (20) and (21) with (3) and (4). Difference-making conditionals require the ĀB̄-worlds to be at least as plausible as the ĀB-worlds, stressing that the denial of the antecedent should not lead to acceptance of the consequent. For contraposing conditionals, the denial of the consequent leads to denial of the antecedent, so some ĀB̄-worlds are required to be strictly more plausible than all the AB̄-worlds. Difference-making conditionals place inequality constraints on all possible worlds in Ω_{A,B}, whereas contraposing conditionals do not deal with the position of the ĀB-worlds at all. To give a feel for the contrast between difference-making conditionals and contraposing conditionals, we present an example from Rott (2020) and transfer it to the framework of ranking functions. Suppose an infectious disease breaks out with millions of cases, and consider the following two scenarios concerning a treatment:

Scenario 1: Almost all of the people infected were administered a medicine, and almost all of them have recovered.
However, only a few of the persons who did not receive the medicine have recovered.

Scenario 2: Only very few of the people infected were administered the medicine. But fortunately, most people end up recovering anyway. It turns out that within the group of people who got the medicine slightly fewer people have recovered than within the group who did not get it.

We compare these two scenarios and imagine an agent who has contracted the disease, but of whom it is not known whether she got the medicine. In Scenario 1, the fact that the agent received the medicine would clearly support the fact that she recovered, as it would clearly make the recovery more likely. So we are justified in accepting the conditional 'If the agent received the medicine, she has recovered'. However, in Scenario 2 it does not make sense to apply this conditional. It is likely that the agent has recovered, but having received the medicine would not be evidence for the recovery. We depict these two scenarios using ranking functions. Let m stand for 'the agent received the medicine' and r for 'the agent recovered'. The ranking function κ_1 with κ_1(mr) = 0, κ_1(mr̄) = 1, κ_1(m̄r̄) = 2 and κ_1(m̄r) = 3 captures scenario 1, and κ_2 with κ_2(m̄r) = 0, κ_2(m̄r̄) = 1,

κ(B̄|A) − κ(B|A) > κ(B̄|Ā) − κ(B|Ā). (22)

Compare Spohn (2012, p. 105, using the definition of two-sided ranks τ(B|A) = κ(B̄|A) − κ(B|A)). Inequality (22) expresses that the conditional (B|A) is stronger than (B|Ā). Thus, A is a direct[!] cause of B in Spohn's sense just in case A and B are true, the event represented by A precedes the event represented by B, and Spohn's inequality (22) holds, given the obtaining circumstances. For κ |= A ≫ B, equations (3) and (4) hold. Via the definition of ranks for conditionals we first elaborate on (22):

(22) ⇔ κ(AB̄) − κ(A) − (κ(AB) − κ(A)) > κ(ĀB̄) − κ(Ā) − (κ(ĀB) − κ(Ā))
⇔ κ(AB̄) − κ(AB) > κ(ĀB̄) − κ(ĀB).
Now if κ |= A ≫ B, then the left-hand side is positive, due to (3), whereas the right-hand side is not, due to (4). So the inequality expressing the notion of reason defined by Spohn follows immediately from the definition of difference-making conditionals as a set of standard and might conditionals. As was pointed out by Eric Raidl (2020, p. 17), A ≫ C expresses that A is a 'sufficient reason' for C in the terminology of Spohn (2012, pp. 107–108).

8 Conclusion

Difference-making conditionals aim at capturing the intuition that the antecedent A of a conditional is relevant to its consequent B, that A supports B or is a reason or evidence for it. The Relevant Ramsey Test encodes this idea, ruling that revising by the antecedent should lead to acceptance of the consequent, which is the standard Ramsey Test, but also ruling that revising by the negation of the antecedent should not lead to the acceptance of the consequent. Rott (2019) defined the Relevant Ramsey Test and difference-making conditionals in a purely qualitative framework. In the present paper we extended his approach to ranking functions by first

on Theoretical Aspects of Rationality and Knowledge, 147–161. San Francisco, CA: Morgan Kaufmann. Gärdenfors, P. 1979. Conditionals and changes of belief. In Niiniluoto, I., and Tuomela, R., eds., The Logic and Epistemology of Scientific Change, volume 30(2–4) of Acta Philosophica Fennica. Amsterdam: North-Holland. 381–404. Gärdenfors, P. 1980. A pragmatic approach to explanations. Philosophy of Science 47(3):404–423. Halpern, J. 2003. Reasoning about Uncertainty. Cambridge, MA: MIT Press. Kern-Isberner, G. 2001. Conditionals in Nonmonotonic Reasoning and Belief Revision, volume 2087 of Lecture Notes in Computer Science. Berlin: Springer. Kern-Isberner, G. 2004. A thorough axiomatization of a principle of conditional preservation in belief revision. Annals of Mathematics and Artificial Intelligence 40(1–2):127–164. Lehmann, D., and Magidor, M. 1992.
What does a conditional knowledge base entail? Artificial Intelligence 55(1):1–60. Lewis, D. K. 1973. Counterfactuals. Oxford: Blackwell. Raidl, E.; Iacona, A.; and Crupi, V. 2020. The logic of the evidential conditional. Manuscript March 2020. Raidl, E. 2019. Quick completeness for the evidential conditional. PhilSci-Archive, http://philsci-archive.pitt.edu/16664. Raidl, E. 2020. Definable conditionals. Topoi. https://doi.org/10.1007/s11245-020-09704-3. Rott, H. 1986. Ifs, though, and because. Erkenntnis 25(3):345–370. Rott, H. 2019. Difference-making conditionals and the relevant Ramsey test. Review of Symbolic Logic. https://doi.org/10.1017/S1755020319000674. Rott, H. 2020. Notes on contraposing conditionals. PhilSci-Archive, http://philsci-archive.pitt.edu/17092. Skovgaard-Olsen, N.; Collins, P.; Krzyżanowska, K.; Hahn, U.; and Klauer, K. C. 2019. Cancellation, negation, and rejection. Cognitive Psychology 108:42–71. Spohn, W. 1983. Deterministic and probabilistic reasons and causes. In Hempel, C. G.; Putnam, H.; and Essler, W. K., eds., Methodology, Epistemology, and Philosophy of Science. Dordrecht: Springer. 371–396. Spohn, W. 1988. Ordinal conditional functions: A dynamic theory of epistemic states. In Harper, W. L., and Skyrms, B., eds., Causation in Decision, Belief Change, and Statistics. Dordrecht: Springer. 105–134. Spohn, W. 2012. The Laws of Belief. Oxford: Oxford University Press. Stalnaker, R. C. 1968. A theory of conditionals. In Rescher, N., ed., Studies in Logical Theory (American Philosophical Quarterly Monographs 2). Oxford: Blackwell. 98–112.

transferring the Relevant Ramsey Test to the framework of OCFs. We defined difference-making conditionals as a pair consisting of a standard and a might conditional, which is in full compliance with the basic principles that Rott identified for difference-making conditionals.
Using this transformation we benefitted from the flexible approach of c-representations and c-revisions, defining an inductive representation and a revision method for conditionals incorporating relevance. To the best of our knowledge, there is no other revision method capable of dealing not only with sets of conditionals but also with sets of conditionals of different types, namely standard and might conditionals. Finally, drawing on the ranking semantics for difference-making conditionals, we compared different approaches to relevance or evidence in conditionals. We showed that difference-making conditionals express something very close to Spohn's concept of reason in the context of ranking functions, but that they are fundamentally different from the evidential (or contraposing) conditionals studied by Crupi, Iacona and Raidl. For future work we plan on elaborating on the inductive representation of mixed sets of conditionals. Moreover, we will continue working on the incorporation of relevance in different kinds of epistemic states and examine different revision methods for conditionals incorporating relevance.

References

Alchourrón, C.; Gärdenfors, P.; and Makinson, D. 1985. On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic 50(2):510–530. Booth, R., and Chandler, J. 2020. On strengthening the logic of iterated belief revision: Proper ordinal interval operators. Artificial Intelligence 285:103289. Booth, R., and Paris, J. 1998. A note on the rational closure of knowledge bases with both positive and negative knowledge. Journal of Logic, Language and Information 7(2):165–190. Crupi, V., and Iacona, A. 2019a. The evidential conditional. PhilSci-Archive, http://philsci-archive.pitt.edu/16759. Crupi, V., and Iacona, A. 2019b. Three ways of being non-material. PhilSci-Archive, http://philsci-archive.pitt.edu/16478. Douven, I. 2016. The Epistemology of Indicative Conditionals: Formal and Empirical Approaches.
Cambridge: Cambridge University Press. Dubois, D., and Prade, H. 2006. Possibility theory and its applications: a retrospective and prospective view. In Della Riccia, G.; Dubois, D.; Kruse, R.; and Lenz, H.-J., eds., Decision Theory and Multi-Agent Planning. Vienna: Springer. 89–109. Eichhorn, C.; Kern-Isberner, G.; and Ragni, M. 2018. Rational inference patterns based on conditional logic. In McIlraith, S. A., and Weinberger, K. Q., eds., Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 1827–1834. Menlo Park, CA: AAAI Press. Fariñas del Cerro, L., and Herzig, A. 1996. Belief change and dependence. In Proceedings of the 6th Conference

Stability in Abstract Argumentation

Jean-Guy Mailly, Julien Rossit
LIPADE, Université de Paris
{jean-guy.mailly, julien.rossit}@u-paris.fr

Abstract

mention some application to crime investigation (more precisely, Internet trade fraud). We also have in mind some other natural applications, like automated negotiation. For instance, if an agent is certain her argument for supporting her preferred offer cannot be accepted at any future step of the debate, she can switch her offer to another one, that may be less preferred, but at least could be accepted. In this paper, we adapt the notion of stability to abstract argumentation, and we show that checking stability is equivalent to performing some well-known reasoning tasks in Argument-Incomplete AFs (Baumeister, Rothe, and Schadrack 2015; Baumeister, Neugebauer, and Rothe 2018; Niskanen et al. 2020). While existing work on stability in structured argumentation focuses on a particular semantics (namely the grounded semantics), our approach is generic with respect to the underlying extension-based semantics. Moreover we consider both credulous and skeptical variants of argumentative reasoning. This paper is organized as follows.
Section 2 introduces the basic notions of the abstract argumentation literature in which our work takes place, and presents the concept of stability for structured argumentation frameworks. We then propose in Section 3 a counterpart of this notion of stability adapted to abstract argumentation frameworks, and we show how we can reduce it to well-known reasoning tasks. We provide some lower and upper bounds for the computational complexity of checking whether an AF is stable. Section 4 then describes an application scenario in the context of automated negotiation. Finally, Section 5 discusses related work, and Section 6 concludes the paper by highlighting some promising future works. The notion of stability in a structured argumentation setup characterizes situations where the acceptance status associated with a given literal will not be impacted by any future evolution of this setup. In this paper, we abstract away from the logical structure of arguments, and we transpose this notion of stability to the context of Dungean argumentation frameworks. In particular, we show how this problem can be translated into reasoning with Argument-Incomplete AFs. Then we provide preliminary complexity results for stability under four prominent semantics, in the case of both credulous and skeptical reasoning. Finally, we illustrate to what extent this notion can be useful with an application to argument-based negotiation.

1 Introduction

Formal argumentation is a family of non-monotonic reasoning approaches with applications to (e.g.) multi-agent systems (McBurney, Parsons, and Rahwan 2012), automated negotiation (Dimopoulos, Mailly, and Moraitis 2019) or decision making (Amgoud and Vesic 2012). Roughly speaking, we can group the research in this domain in two families: abstract argumentation (Dung 1995) and structured argumentation (Besnard et al. 2014).
The former is mainly based on the seminal paper proposed by Dung, where abstract argumentation frameworks (AFs) are defined as directed graphs where the nodes represent arguments and the edges represent attacks between them. In this setting, the nature of arguments and attacks is not defined, only their interactions are represented in order to determine the acceptability status of arguments. On the opposite, different settings have been proposed where the arguments are built from logical formulas or rules, and the nature of attacks is based on logical conflicts between the elements inside the arguments. See e.g. (Baroni, Gabbay, and Giacomin 2018) for a recent overview of abstract and structured argumentation. In a particular structured argumentation setting, the notion of stability has been defined recently (Testerink, Odekerken, and Bex 2019). Intuitively, it represents a situation where a certain argument of interest will not anymore have the possibility to change its acceptability status. Either it is currently accepted and it will remain so, or on the contrary it is currently rejected, and nothing could make it accepted in the future. In the existing work on this topic, the authors 2 2.1 Background Abstract Argumentation Let us first introduce the abstract argumentation framework defined in (Dung 1995). Definition 1. An argumentation framework (AF) is a pair F = hA, Ri where A is the set of arguments and R ⊆ A×A is the attack relation. In this framework, we are not concerned by the precise nature of arguments (e.g. their internal structure or their origin) and attacks (e.g. the presence of contradictions between elements on which arguments are built). Only the relations 93 between arguments (i.e. the attacks) are taken into account to evaluate the acceptability of arguments. We focus on finite AFs, i.e. AFs with a finite set of arguments. For a, b ∈ A, we say that a attacks b if (a, b) ∈ R. Moreover, if b attacks some c ∈ A, then a defends c against b. 
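Definition 1 and the attack/defence notions can be encoded directly. The following is a minimal illustrative sketch in Python; the class and method names are our own, not from the paper or any existing implementation:

```python
class AF:
    """An abstract argumentation framework F = (A, R), as in Definition 1."""

    def __init__(self, arguments, attacks):
        self.A = frozenset(arguments)
        self.R = frozenset(attacks)  # attack relation, a subset of A x A

    def attacks(self, a, b):
        """Does argument a attack argument b?"""
        return (a, b) in self.R

    def defends(self, a, c):
        """Does a defend c against some attacker, i.e. is there a b
        such that a attacks b and b attacks c?"""
        return any(self.attacks(a, b) and self.attacks(b, c) for b in self.A)
```

For instance, in the AF with attacks a → b and b → c, the argument a defends c against b.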
These notions are extended to sets of arguments: S ⊆ A attacks (respectively defends) b ∈ A if there is some a ∈ S that attacks (respectively defends) b. The acceptability of arguments is evaluated through a notion of extension, i.e. a set of arguments that are jointly acceptable. To be considered as an extension, a set has to satisfy some minimal requirements:

We refer the interested reader to (Baroni, Caminada, and Giacomin 2018) for more details about these semantics, as well as others defined after Dung's initial work. From the set of extensions σ(F) (for σ ∈ {co, pr, gr, st}), we define two reasoning modes:
• an argument a ∈ A is credulously accepted with respect to σ iff a ∈ S for some S ∈ σ(F);
• an argument a ∈ A is skeptically accepted with respect to σ iff a ∈ S for each S ∈ σ(F).

Then, a possible enrichment of Dung's framework consists in taking into account some uncertainty in the AF. This yields the notion of Incomplete AFs, studied e.g. in (Baumeister, Rothe, and Schadrack 2015; Baumeister, Neugebauer, and Rothe 2018; Niskanen et al. 2020). Here, we focus on a particular type, namely Argument-Incomplete AFs, but for simplicity we just refer to them as Incomplete AFs.

Definition 3. An incomplete argumentation framework (IAF) is a tuple I = ⟨A, A?, R⟩ where
• A is the set of certain arguments;
• A? is the set of uncertain arguments;
• R ⊆ (A ∪ A?) × (A ∪ A?) is the attack relation;
and A, A? are disjoint sets of arguments.

Example 2. The IAF I = ⟨A, A?, R⟩ is shown in Figure 2. The dotted nodes represent the uncertain arguments A?. Plain nodes and arrows have the same meaning as previously.

• S ⊆ A is conflict-free (denoted S ∈ cf(F)) iff ∀a, b ∈ S, (a, b) ∉ R;
• S ∈ cf(F) is admissible (denoted S ∈ ad(F)) iff S defends all its elements against all their attackers.

Then, Dung defines several semantics:

Definition 2.
Given F = ⟨A, R⟩ an AF, a set S ⊆ A is:
• a complete extension (S ∈ co(F)) iff S ∈ ad(F) and S contains all the arguments that it defends;
• a preferred extension (S ∈ pr(F)) iff S is a ⊆-maximal complete extension;
• the unique grounded extension (S ∈ gr(F)) iff S is the ⊆-minimal complete extension;
• a stable extension (S ∈ st(F)) iff S ∈ cf(F) and S attacks each a ∈ A \ S,
where ⊆-maximal and ⊆-minimal denote respectively the maximal and the minimal elements for classical set inclusion.

Example 1. Let F = ⟨A, R⟩ be the AF depicted in Figure 1. Nodes in the graph represent the arguments A, while the edges correspond to the attacks R. Its extensions for σ ∈ {gr, st, pr, co} are given in Table 1.

(Figure 2: An Example of IAF I — graph over a1–a7 omitted)

Uncertain arguments are those that may not actually belong to the system (for instance because of some uncertainty about the agent's environment). There are different ways to "solve" the uncertainty in an IAF, which correspond to different completions:

Definition 4. Given I = ⟨A, A?, R⟩ an IAF, a completion is an AF F = ⟨A′, R′⟩ where
• A ⊆ A′ ⊆ A ∪ A?;
• R′ = R ∩ (A′ × A′).

Example 3. Considering again I from the previous example, we show all its completions in Figure 3. For each uncertain argument in A? = {a4, a7}, there are two possibilities: either the argument is present, or it is not. Thus, there are four completions.

(Figure 1: An Example of AF F — graph over a1–a7 omitted)

Semantics σ | σ-extensions
grounded    | {{a1}}
stable      | {{a1, a4, a6}}
preferred   | {{a1, a4, a6}, {a1, a3}}
complete    | {{a1, a4, a6}, {a1, a3}, {a1}}

Table 1: σ-Extensions of F

(Figure 3, panels (a) C1, (b) C2, (c) C3: completions of I — graphs omitted)

¬p and −(¬p) = p, with ¬ the classical negation. We call p (respectively ¬p) a positive (respectively negative) literal.

Definition 5. An argumentation setup is a tuple AS = ⟨L, R, Q, K, τ⟩ where:
• L is a set of literals s.t.
l ∈ L implies −l ∈ L;
• R is a set of defeasible rules p1, . . . , pm ⇒ q s.t. p1, . . . , pm, q ∈ L. Such a rule is called "a rule for q";
• Q ⊆ L is a set of queryable literals, s.t. no q ∈ Q is a negative literal;
• K ⊆ L is the agent's (consistent) knowledge base;
• τ ∈ L is a particular literal called the topic.

The usual mechanisms are used to define arguments and attacks. An argument for a literal q is an inference tree rooted in a rule p1, . . . , pm ⇒ q, such that for each pi, there is a child node that is either an argument for pi, or an element of the knowledge base. Then, an argument A attacks an argument B if the literal supported by A is the negation of some literal involved in the construction of B. From the sets of arguments and attacks built in this way, the grounded extension is defined as usual (see Definition 2). Given an argumentation setup AS, the status of the topic τ may be:
• unsatisfiable if there is no argument for τ in AS;
• defended if there is an argument for τ in the grounded extension of AS;
• out if there are some arguments for τ in AS, and all of them are attacked by the grounded extension;
• blocked in the remaining case.

Then, stability can be defined, based on the following notion of future setups:

Definition 6. Let AS = ⟨L, R, Q, K, τ⟩ be an argumentation setup. The set of future setups of AS, denoted by F(AS), is defined by F(AS) = {⟨L, R, Q, K′, τ⟩ | K ⊆ K′}. AS is called stable if for each AS′ ∈ F(AS), the status of τ is the same as in AS.

Intuitively, a future setup is built by adding new literals to the knowledge base (keeping the consistency property, of course). Then, new arguments and attacks may be built thanks to these new literals. The setup is stable if these new arguments and attacks do not change the status of the topic.
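The topic statuses above are all defined relative to the grounded extension. For a finite AF, the grounded extension can be computed by iterating Dung's characteristic function to its least fixpoint; the sketch below uses our own naming and is not the authors' implementation:

```python
def grounded_extension(A, R):
    """Least fixpoint of the characteristic function on a finite AF (A, R):
    starting from the empty set, repeatedly collect every argument all of
    whose attackers are attacked by the current set."""
    attackers = {a: {b for (b, c) in R if c == a} for a in A}
    E = set()
    while True:
        new = {a for a in A
               if all(any((e, b) in R for e in E) for b in attackers[a])}
        if new == E:
            return E
        E = new
```

For attacks a → b → c, the first iteration collects the unattacked argument a, and the second adds c, which a defends against b; a mutual attack between two otherwise unattacked arguments yields the empty grounded extension.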
To conclude this section, let us mention that (Testerink, Odekerken, and Bex 2019) provides a sound algorithm that approximates the reasoning task of checking the stability of a setup. This algorithm is however not complete: AS is actually stable whenever the algorithm returns a positive answer, but some stable setups are not identified by the algorithm. The interest of this algorithm is that it runs in polynomial time (more precisely, it stops in O(n²) steps, where n = |L| + |R|).

(Figure 3, panel (d): completion C4 — graph omitted. Figure 3: The Completions of I)

This means that a completion is a "classical" AF made of all the certain arguments, some of the uncertain ones, and all the attacks that concern the selected arguments. Reasoning with an IAF generalizes reasoning with an AF, by taking into account either some or each completion. Formally, given I an IAF and σ a semantics, the status of an argument a ∈ A is:
• possibly credulously accepted with respect to σ iff a belongs to some σ-extension of some completion of I;
• possibly skeptically accepted with respect to σ iff a belongs to each σ-extension of some completion of I;
• necessarily credulously accepted with respect to σ iff a belongs to some σ-extension of each completion of I;
• necessarily skeptically accepted with respect to σ iff a belongs to each σ-extension of each completion of I.

Example 4. Let us consider again I from the previous example, and its completions C1, C2, C3 and C4. We observe that a1 is necessarily skeptically accepted for any semantics, since it appears unattacked in every completion (thus, it belongs to every extension of every completion). On the contrary, a6 is possibly credulously accepted with respect to the preferred semantics: it belongs to some extension of C4. It is not skeptically accepted (because {a1, a3} is a preferred extension of C4 as well), and it is not necessarily accepted (because in C1 it is not defended against a5, and thus cannot belong to any extension).
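The four acceptance statuses above can be checked by brute force on small instances by enumerating all completions of the IAF and all extensions of each completion. The sketch below is our own illustrative code (exponential, for toy instances only); it uses the stable semantics for brevity, while Example 4 discusses the preferred semantics:

```python
from itertools import combinations

def completions(A, A_unc, R):
    """All completions of an argument-incomplete AF (Definition 4):
    any subset of the uncertain arguments may be present."""
    A_unc = list(A_unc)
    for k in range(len(A_unc) + 1):
        for extra in combinations(A_unc, k):
            A2 = set(A) | set(extra)
            yield A2, {(x, y) for (x, y) in R if x in A2 and y in A2}

def stable_extensions(A, R):
    """Brute-force stable extensions: conflict-free sets that attack
    every argument outside the set."""
    A = list(A)
    for k in range(len(A) + 1):
        for S in map(set, combinations(A, k)):
            conflict_free = not any((x, y) in R for x in S for y in S)
            if conflict_free and all(any((x, y) in R for x in S)
                                     for y in set(A) - S):
                yield S

def possibly_credulous(a, A, A_unc, R):
    """a belongs to some stable extension of some completion."""
    return any(any(a in S for S in stable_extensions(A2, R2))
               for A2, R2 in completions(A, A_unc, R))

def necessarily_credulous(a, A, A_unc, R):
    """a belongs to some stable extension of every completion."""
    return all(any(a in S for S in stable_extensions(A2, R2))
               for A2, R2 in completions(A, A_unc, R))
```

With one certain argument a and one uncertain attacker b, a is possibly but not necessarily credulously accepted: the completion without b accepts a, the completion with b does not.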
2.2 Stability in Structured Argumentation

Now we briefly introduce the argumentation setting from (Testerink, Odekerken, and Bex 2019), based on ASPIC+ (Modgil and Prakken 2014). Let us start with the notation used to represent the negation of a literal: for a propositional variable p, −p =

3 Stability in Abstract Argumentation

In this section, we describe how we adapt the notion of stability to abstract argumentation. Contrary to previous works, we do not focus on a specific semantics, and thus we consider both credulous and skeptical reasoning. Moreover, we provide a translation of the stability problem into reasoning with AFs and IAFs. Although these problems are theoretically intractable, efficient algorithms exist for solving them in practice. This paves the way to future implementations of an exact algorithm for checking stability, and to its application in concrete scenarios.

3.1 Formal Definition of Stability in AFs

From now on, we consider a finite argumentation universe that is represented by an AF FU = ⟨AU, RU⟩. We suppose that any "valid" AF is made of arguments and attacks in FU, i.e. F = ⟨A, R⟩ s.t. A ⊆ AU and R = RU ∩ (A × A).

Definition 7. Given an AF F = ⟨A, R⟩, we call future AFs the set of AFs F(F) = {F′ = ⟨A′, R′⟩ | A ⊆ A′}.

Intuitively speaking, this means that a future AF represents a possible way to continue the argumentative process (by adding arguments and attacks), in accordance with FU. This corresponds to a kind of expansion of F (Baumann and Brewka 2010), where the authorized expansions are constrained by FU. It is also reminiscent of the set of authorized updates defined in (de Saint-Cyr et al. 2016). Notice that F itself is a particular future AF. Now we have all the elements to define stability.

Definition 8.
Given an AF F = ⟨A, R⟩, a ∈ A an argument, and σ a semantics, we say that F is credulously (respectively skeptically) σ-stable with respect to a iff
• either ∀F′ ∈ F(F), a is credulously (respectively skeptically) accepted with respect to σ;
• or ∀F′ ∈ F(F), a is not credulously (respectively skeptically) accepted with respect to σ.

Although in this paper we focus on σ ∈ {gr, st, pr, co}, the definition of stability is generic, and the concept can be applied with any extension semantics (Baroni, Caminada, and Giacomin 2018).

Example 5. Let us consider the argumentation universe FU and the AF F, both depicted in Figure 4. The argument a3 is not credulously σ-stable for σ = st, since it is credulously accepted in F, but not in the future AF where a2 is added. On the contrary, it is skeptically σ-stable, since it is not skeptically accepted in F, nor in any future AF. a6 is skeptically σ-stable as well, but for another reason: we observe that in F (and in any future AF), a6 is defended by the (unattacked) argument a7, thus it belongs to every extension. On this simple example, it may seem obvious to determine that a5, a6 and a7 will keep their status. However, let us notice that determining whether an argument keeps its status when an AF is updated has been studied, and is not a trivial question in the general case (Baroni, Giacomin, and Liao 2014; Alfano, Greco, and Parisi 2019).

3.2

(Figure 4: The Argumentation Universe FU (a) and a Possible AF F (b) — graphs omitted)

Definition 9. Given F = ⟨A, R⟩, the corresponding IAF is IF = ⟨A, AU \ A, RU⟩.

The corresponding IAF is built from the whole set of arguments that appear in the universe. The ones that belong to F are the certain arguments, while the others are uncertain. Then, of course, all the attacks from the universe appear in the IAF. The set of completions of IF is actually F(F).

Example 6. Figure 5 shows the IAF corresponding to F.
The arguments that belong to the universe but not to F (namely, a1 and a2) appear as uncertain arguments. This means that the four completions of this IAF correspond to F(F).

(Figure 5: The IAF IF Corresponding to F — graph omitted)

Computational Issues

We now provide a method for checking the stability of an AF with respect to some argument. The method is generic regarding the underlying extension semantics. It is based on the observation that the set of future AFs can be encoded into a single IAF (see Definition 3). We give a characterization of stability based on the IAF corresponding to an AF.

Proposition 1. Given an AF F = ⟨A, R⟩, a ∈ A an argument, and σ a semantics, F is credulously (respectively skeptically) σ-stable with respect to a iff
• either a is necessarily credulously (respectively skeptically) accepted in IF with respect to σ;
• or a is not possibly credulously (respectively skeptically) accepted in IF with respect to σ.

This result shows that the stability problem can be solved efficiently, using for instance the SAT-based tool taeydennae (Niskanen et al. 2020) for reasoning in IF. Now, we provide preliminary complexity results. We start with upper bounds for the computational complexity of stability.

Proposition 2. The upper bound complexity of checking whether an AF is (credulously or skeptically) σ-stable with respect to an argument is as presented in Table 2.

σ  | Credulous | Skeptical
st | ∈ Π₂^P    | ∈ Σ₂^P
co | ∈ Π₂^P    | ∈ coNP
gr | ∈ coNP    | ∈ coNP
pr | ∈ Π₃^P    | ∈ Σ₃^P

Table 2: Upper Bound Complexity of Checking Stability

Sketch of proof. Non-deterministically guess a pair of future AFs F′ and F′′. Check that a is credulously (respectively skeptically) accepted in F′, and that a is not credulously (respectively skeptically) accepted in F′′. The complexity of credulous (respectively skeptical) acceptance in AFs (Dvořák and Dunne 2018) allows us to deduce an upper bound for credulous (respectively skeptical) stability.

Now we also identify lower bounds for the computational complexity of stability.

Proposition 3. The lower bound complexity of checking whether an AF is (credulously or skeptically) σ-stable with respect to an argument is as presented in Table 3.

σ  | Credulous | Skeptical
st | NP-hard   | coNP-hard
co | NP-hard   | P-hard
gr | P-hard    | P-hard
pr | NP-hard   | Π₂^P-hard

Table 3: Lower Bound Complexity of Checking Stability

Sketch of proof. Credulous (respectively skeptical) acceptance in an AF F can be reduced to credulous (respectively skeptical) stability, by taking F as the current AF and FU = F as the argumentation universe. Then F is credulously (respectively skeptically) σ-stable with respect to some argument a iff a is credulously (respectively skeptically) accepted in F with respect to σ. The nature of the reduction (its computation is bounded in logarithmic space and polynomial time) makes it suitable for establishing both P-hardness and C-hardness, for C ∈ {NP, coNP, Π₂^P}. Thus, we can conclude that stability is at least as hard as acceptance in AFs. From known complexity results for AFs (Dvořák and Dunne 2018), we deduce the lower bounds given in Table 3.

instance, these preferences can be obtained from a notion of utility associated with each offer). So, each agent's goal is to have her preferred practical argument (i.e. the one that supports the preferred offer) accepted at the end of the debate. Each agent, in turn, can add one (or more) argument(s) that defend her preferred argument. In this first version of the negotiation framework, agents are in total ignorance about their opponent.

Then, an enriched version of this protocol can be defined, where the agents use the notion of argumentation universe to model their (uncertain) knowledge about the opponent. Then, stability can help the agent to obtain a better outcome: if at some point the agent's preferred practical argument is rejected and stable, this means that this argument will not be accepted at the end of the debate, whatever the actual moves of the other agents. It is then profitable for the agent to change her goal, defending now the argument that supports her second preferred offer instead of the first one. This can reduce the number of rounds in the negotiation (and thus any communication cost associated with these rounds), and even improve the outcome of the negotiation for the agent.

Let us now provide a concrete example. We suppose that the offers O = {o1, o2, o3} are supported by one practical argument each, i.e. {p1, p2, p3} with pi supporting oi. The practical arguments are mutually exclusive. The preferences of the agents are opposed: agent 1 has the preference ranking o3 >₁ o2 >₁ o1, while the preferences of agent 2 are o1 >₂ o2 >₂ o3. So, at the beginning of the debate, the goal of agent 1 (respectively agent 2) is to get the argument p3 (respectively p1) accepted. Let us suppose that the first round consists in agent 1 attacking the argument p1 with three arguments a1, a2 and a3, thus defending p3. This situation is depicted in Figure 6.

(Figure 6: The Negotiation Debate F1 — graph over p1, p2, p3, a1, a2, a3 omitted)

In F1, which represents the state of the debate after agent 1's move, the argument p1 is clearly rejected under the stable semantics,¹ since it is not defended against a1, a2 and a3. Consider that agent 2 has one argument at her disposal, a4, with the corresponding attacks (a4, a3) and (a4, p2). Without a possibility to anticipate the evolution of the debate, the best action for agent 2 is to utter this argument, thus defending p1 against a3 and p2. Now, let us suppose that agent 2 has an opponent model in the form of the argumentation universe FU, described in Figure 7. Now, we observe that p1 does not appear in any extension of any future framework.
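A claim like "p1 keeps its rejected status in every future framework" can be verified by brute force in the spirit of Proposition 1, by enumerating every completion of the corresponding IAF and checking that the argument's status never changes. The sketch below is our own toy version instantiated with the grounded semantics (a single extension per AF); a realistic implementation would instead query an IAF solver such as taeydennae:

```python
from itertools import combinations

def grounded(A, R):
    """Grounded extension as the least fixpoint of the characteristic function."""
    attackers = {x: {b for (b, c) in R if c == x} for x in A}
    E = set()
    while True:
        new = {x for x in A
               if all(any((e, b) in R for e in E) for b in attackers[x])}
        if new == E:
            return E
        E = new

def gr_stable(a, A, A_univ, R_univ):
    """Is a's membership in the grounded extension identical in every
    future AF allowed by the universe (A_univ, R_univ)?"""
    pool = [x for x in A_univ if x not in A]
    statuses = set()
    for k in range(len(pool) + 1):
        for extra in combinations(pool, k):
            A2 = set(A) | set(extra)
            R2 = {(x, y) for (x, y) in R_univ if x in A2 and y in A2}
            statuses.add(a in grounded(A2, R2))
    return len(statuses) == 1
```

For instance, an unattacked certain argument a with a potential attacker b in the universe is not gr-stable (its status flips once b arrives), while it is gr-stable in a universe with no attacks.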
Indeed, it is obvious that, if one of a4, a5 and a6 is not present at the end of the debate, then

4 Applying Stability to Automated Negotiation

Now, we discuss the benefit of stability in a concrete application scenario, namely automated negotiation. Let us consider a simple negotiation framework, where practical arguments (i.e. those that support some offers) are mutually exclusive, and for each agent there is a preference relation between the offers supported by these arguments (for

¹ As well as any semantics considered in this paper.

6 Conclusion

In this paper, we have presented a first study investigating to what extent the notion of stability can be adapted to abstract argumentation frameworks. In particular, we have shown how it relates to Incomplete AFs, a model that integrates uncertainty into abstract argumentation. Our preliminary complexity results, as well as the translation of stability into reasoning with IAFs, pave the way to the development of efficient computational approaches for stability, taking benefit from SAT-based techniques. Finally, we have shown that, besides the existing application of stability to Internet fraud inquiry (Testerink, Odekerken, and Bex 2019), this concept has other potential applications, like automated negotiation.

This paper opens several promising research tracks. First of all, we plan to study complexity issues in more depth, in order to determine tight results for the semantics studied here. Other direct future works include the investigation of other semantics, and the implementation of our stability solving technique in order to experimentally evaluate its impact in a context of automated negotiation. We have focused on stability under extension semantics, which means that an argument will either remain accepted or remain unaccepted. However, in some cases it is important to deal more finely with unaccepted arguments. This is possible with 3-valued labellings (Caminada 2006).
Studying the notion of stability when such labellings are used to evaluate the acceptability of arguments is a natural extension of our work. In some contexts, the assumption of a completely known argumentation universe is too strong. For such cases, using arbitrary IAFs (with uncertainty also on the attack relation) is a potential solution. Uncertainty on the existence (or direction) of attacks makes sense, for instance, when preferences are at play. Indeed, dealing with preferences in abstract argumentation usually involves a notion of defeat relation, i.e. a combination of the attacks and preferences. This defeat relation may "cancel" or "reverse" an initial attack (Amgoud and Cayrol 2002; Amgoud and Vesic 2014; Kaci, van der Torre, and Villata 2018), thus some uncertainty or ignorance about the other agents' preferences can be represented as uncertainty in the attack relation of the argumentation universe. We are also interested in stability for other abstract argumentation frameworks. Besides preference-based argumentation, which we have already mentioned, Dung's AFs have been generalized by adding a support relation (Amgoud et al. 2008), by associating quantitative weights with attacks (Dunne et al. 2011) or arguments (Rossit et al. 2020), or by associating values with arguments (Bench-Capon 2002). But adapting the notion of stability to these frameworks may require different techniques than the ones used in this paper. Also, the recent claim-based argumentation (Dvořák and Woltran 2020) provides an interesting bridge between structured argumentation and purely abstract frameworks. It makes sense to study stability in this setting, as a step that would make our results for different semantics and reasoning modes available for structured argumentation frameworks.

(Figure 7: The Argumentation Universe FU — graph over p1, p2, p3, a1–a6 omitted)

p1 is not defended against (respectively) a3, a2 or a1.
Otherwise, if a4, a5 and a6 appear together, the mutual attack between a5 and a6 will give rise to two extensions: one where a6 appears with a2 (thus defeating p1), and another containing a5 and a1 (thus again defeating p1). This means that p1 is rejected in F1, and it is (both credulously and skeptically) σ-stable. In this situation, it is in the interest of agent 2 to stop arguing, and to propose instead the option supported by the argument p2. Indeed, according to the agent's preferences, p2 is the best option if p1 is not available anymore. Not only does using the notion of stability in the argumentation universe allow the debate to stop earlier, but it also allows agent 2 to propose her second best option, which would not be possible if she had uttered a4.

5 Related Work

The dynamics of abstract argumentation frameworks (Doutre and Mailly 2018) has received much attention in the last decade. We can summarize this field in two kinds of approaches: the goal is either to modify an AF to enforce some (set of) arguments as accepted, or to determine to what extent the acceptability of arguments is impacted by some changes in the AF. In the first family, we can mention extension enforcement (Baumann and Brewka 2010), which is somehow dual to stability. Enforcement is exactly the operation that consists in determining whether it is possible to modify an AF to ensure that a set of arguments becomes (included in) an extension, while stability is the property of an argument that will keep its acceptance status, whatever the future evolution of the AF. Control Argumentation Frameworks (Dimopoulos, Mailly, and Moraitis 2018) are also somehow related to stability, since they are a generalization of Dung's AFs that permits extension enforcement under uncertainty.
The second family of works in the field of argumentation dynamics consists of those that propose efficient approaches to recompute the extensions or the set of (credulously/skeptically) accepted arguments when the AF is modified (Baroni, Giacomin, and Liao 2014; Alfano, Greco, and Parisi 2019). Although related to stability, these approaches do not provide an algorithmic solution to the problem studied in our work, since they focus on one update of the AF at a time, instead of the set of all the future AFs.

References

Doutre, S., and Mailly, J.-G. 2018. Constraints and changes: A survey of abstract argumentation dynamics. Argument & Computation 9(3):223–248.
Dung, P. M. 1995. On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artif. Intell. 77(2):321–358.
Dunne, P. E.; Hunter, A.; McBurney, P.; Parsons, S.; and Wooldridge, M. J. 2011. Weighted argument systems: Basic definitions, algorithms, and complexity results. Artif. Intell. 175(2):457–486.
Dvořák, W., and Dunne, P. E. 2018. Computational problems in formal argumentation and their complexity. In Baroni, P.; Gabbay, D.; Giacomin, M.; and van der Torre, L., eds., Handbook of Formal Argumentation. College Publications. 631–688.
Dvořák, W., and Woltran, S. 2020. Complexity of abstract argumentation under a claim-centric view. Artif. Intell. 285:103290.
Kaci, S.; van der Torre, L. W. N.; and Villata, S. 2018. Preference in abstract argumentation. In Proc. of COMMA'18, 405–412.
McBurney, P.; Parsons, S.; and Rahwan, I., eds. 2012. Proc. of ArgMAS'11, volume 7543 of Lecture Notes in Computer Science. Springer.
Modgil, S., and Prakken, H. 2014. The ASPIC+ framework for structured argumentation: a tutorial. Argument & Computation 5(1):31–62.
Niskanen, A.; Neugebauer, D.; Järvisalo, M.; and Rothe, J. 2020. Deciding acceptance in incomplete argumentation frameworks. In Proc. of AAAI'20, 2942–2949.
Rossit, J.; Mailly, J.-G.; Dimopoulos, Y.; and Moraitis, P.
2020. United we stand: Accruals in strength-based argumentation. Argument & Computation. To appear.
Testerink, B.; Odekerken, D.; and Bex, F. 2019. A method for efficient argument-based inquiry. In Proc. of FQAS'19, 114–125.
Alfano, G.; Greco, S.; and Parisi, F. 2019. An efficient algorithm for skeptical preferred acceptance in dynamic argumentation frameworks. In Proc. of IJCAI'19, 18–24.
Amgoud, L., and Cayrol, C. 2002. A reasoning model based on the production of acceptable arguments. Ann. Math. Artif. Intell. 34(1-3):197–215.
Amgoud, L., and Vesic, S. 2012. On the use of argumentation for multiple criteria decision making. In Proc. of IPMU'12, 480–489.
Amgoud, L., and Vesic, S. 2014. Rich preference-based argumentation frameworks. Int. J. Approx. Reason. 55(2):585–606.
Amgoud, L.; Cayrol, C.; Lagasquie-Schiex, M.; and Livet, P. 2008. On bipolarity in argumentation frameworks. Int. J. Intell. Syst. 23(10):1062–1093.
Baroni, P.; Caminada, M.; and Giacomin, M. 2018. Abstract argumentation frameworks and their semantics. In Baroni, P.; Gabbay, D.; Giacomin, M.; and van der Torre, L., eds., Handbook of Formal Argumentation. College Publications. 159–236.
Baroni, P.; Gabbay, D. M.; and Giacomin, M. 2018. Handbook of Formal Argumentation. College Publications.
Baroni, P.; Giacomin, M.; and Liao, B. 2014. On topology-related properties of abstract argumentation semantics. A correction and extension to dynamics of argumentation systems: A division-based method. Artif. Intell. 212:104–115.
Baumann, R., and Brewka, G. 2010. Expanding argumentation frameworks: Enforcing and monotonicity results. In Proc. of COMMA'10, 75–86.
Baumeister, D.; Neugebauer, D.; and Rothe, J. 2018. Credulous and skeptical acceptance in incomplete argumentation frameworks. In Proc. of COMMA'18, 181–192.
Baumeister, D.; Rothe, J.; and Schadrack, H. 2015. Verification in argument-incomplete argumentation frameworks. In Proc. of ADT'15, 359–376.
Bench-Capon, T. J. M. 2002.
Value-based argumentation frameworks. In Proc. of NMR'02, 443–454.
Besnard, P.; García, A. J.; Hunter, A.; Modgil, S.; Prakken, H.; Simari, G. R.; and Toni, F. 2014. Introduction to structured argumentation. Argument & Computation 5(1):1–4.
Caminada, M. 2006. On the issue of reinstatement in argumentation. In Proc. of JELIA'06, 111–123.
de Saint-Cyr, F. D.; Bisquert, P.; Cayrol, C.; and Lagasquie-Schiex, M. 2016. Argumentation update in YALLA (yet another logic language for argumentation). Int. J. Approx. Reason. 75:57–92.
Dimopoulos, Y.; Mailly, J.-G.; and Moraitis, P. 2018. Control argumentation frameworks. In Proc. of AAAI'18, 4678–4685.
Dimopoulos, Y.; Mailly, J.-G.; and Moraitis, P. 2019. Argumentation-based negotiation with incomplete opponent profiles. In Proc. of AAMAS'19, 1252–1260.

Weak Admissibility is PSPACE-complete

Wolfgang Dvořák¹, Markus Ulbricht², Stefan Woltran¹
¹ TU Wien, Institute of Logic and Computation
² Leipzig University
{dvorak,woltran}@dbai.tuwien.ac.at, mulbricht@informatik.uni-leipzig.de

Abstract

self-defeating. In a nutshell, weak admissibility captures this idea of defense only against "reasonable" arguments in the following way: given a conflict-free candidate set E of a framework F, E is weakly admissible if no attacker of E is weakly admissible in the subframework F^E containing the arguments whose acceptance state is still undecided wrt. E (i.e. neither contained in E nor attacked by E). As a matter of fact, this definition includes all admissible sets (the subframework F^E induced by an admissible set E does not contain any attacker of E whatsoever), but it also tolerates, among other situations, self-attacking attackers (since those are never contained in any weakly admissible set) or attackers which are contained in a self-defeating odd cycle which is not resolved.
For example, b is weakly admissible in both F and G:

We study the complexity of decision problems for weak admissibility, a recently introduced concept in abstract argumentation to deal with arguments of a self-defeating nature. Our results reveal that semantics based on weak admissibility are of much higher complexity (under typical assumptions) compared to all argumentation semantics which have been analysed in terms of complexity so far. In fact, we show PSPACE-completeness of all standard decision problems for w-admissible and w-preferred semantics (with the exception of skeptical w-admissible acceptance, which is trivial). As a strategy for implementation we also provide a polynomial-time reduction to DATALOG with stratified negation.

1 Introduction

Abstract argumentation frameworks as introduced by Dung (1995) are nowadays identified as a key concept for understanding the fundamental mechanisms behind formal argumentation and nonmonotonic reasoning. In these frameworks, it is solely the attack relation between (abstract) arguments that is used to determine acceptable sets of arguments. A central property for a set of arguments to be acceptable is admissibility, which states that (i) arguments from the set do not attack each other and (ii) each attacker of an argument in the set is attacked by the set. The vast majority of semantics for abstract argumentation are based on this concept, most prominently the preferred semantics, which is defined via subset-maximal admissible sets. However, already Dung noticed that the concept of defense as expressed by Condition (ii) can be seen as problematic when self-defeating arguments are involved, i.e. are attacking the candidate set. Indeed, the concern comes from the fact that a defense against an argument which is self-contradicting might not be necessary at all. Although this issue has been known for a long time, no semantics for abstract argumentation among the numerous invented so far (see e.g.
(Baroni, Caminada, and Giacomin 2011)) has addressed this problem in a commonly agreed way. A recent approach to tackle this particular problem has been proposed by Baumann, Brewka, and Ulbricht (2020) where the concept of weak admissibility is introduced. The underlying idea is to weaken admissibility in a way that counterattacks are only required against “proper” arguments, i. e. arguments that are not directly or indirectly a2 F: a b a3 G: b a1 but not in H, because here the self-defeat is resolved: a2 H: a3 c b a1 The price we have to pay is that weak admissibility is recursive in its nature as since a set E of arguments is verified by checking weak admissibility of certain sets of arguments contained in an induced subframework F E and so on. The key question in terms of computational complexity is now whether this recursion does any harm. In this paper, we answer this question affirmatively. Our main contributions are as follows: • We show that all standard decision problems for wadmissible and w-preferred semantics (with the exception of skeptical w-admissible acceptance) are PSPACEcomplete. • Towards implementation we provide a polynomial-time reduction to non-recursive DATALOG with stratified negation which is known to be PSPACE-complete in terms of program-complexity (cf. (Dantsin et al. 2001)). 100 By definition, F E is the subframework of F obtained by removing the range of E as well as corresponding attacks, i. e. F E = F↓A\E ⊕ . Intuitively, the E-reduct contains those arguments whose status still needs to be decided, assuming the arguments in E are accepted. Consider therefore the following illustrating example. Example 2.3 (Reduct and Admissibility). Let F be the AF depicted below. In contrast to {a} we verify the admissibility of {b} in F . However, their reducts are identical and contain the self-defeating argument c only. 
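Spelled out in code, the reduct and the classical admissibility check are a few lines each. The following Python sketch is our own illustration (not from the paper); the attack relation of Example 2.3 is our reconstruction of the figure: a and b attack each other, and the self-attacking c attacks a.

```python
def reduct(A, R, E):
    """E-reduct of F = (A, R): keep only the arguments outside the range of E."""
    range_E = set(E) | {b for (a, b) in R if a in E}   # E together with E+
    keep = A - range_E
    return keep, {(a, b) for (a, b) in R if a in keep and b in keep}

def admissible(A, R, E):
    """Classical admissibility: conflict-free and defending all its elements."""
    if any((a, b) in R for a in E for b in E):
        return False
    attackers = {a for (a, b) in R if b in E}
    return all(any((d, a) in R for d in E) for a in attackers)

# Example 2.3, attacks read off the figure: a <-> b, c attacks a and itself
A = {"a", "b", "c"}
R = {("a", "b"), ("b", "a"), ("c", "a"), ("c", "c")}
print(admissible(A, R, {"b"}))                     # True
print(admissible(A, R, {"a"}))                     # False: c is not counter-attacked
print(reduct(A, R, {"a"}) == reduct(A, R, {"b"}))  # True: both leave only c
```

As the example shows, the two reducts coincide and consist of the self-attacking c alone, which is exactly the situation weak admissibility is designed to tolerate.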
The complexity analysis we provide is of particular interest, since all known complexity results for argumentation semantics are located within the first two layers of the polynomial hierarchy (see, e.g. (Dvořák and Dunne 2018)). This holds even for semantics which have a certain recursive nature like cf2- or stage2-semantics; see (Gaggl and Woltran 2013; Dvořák and Gaggl 2016) for the respective complexity analyses. We recall that, under the assumption that the polynomial hierarchy does not collapse, problems complete for PSPACE are rated as significantly harder than problems located at lower levels of the polynomial hierarchy. Our results are mirrored in the complexity landscape of nonmonotonic reasoning in the broad sense, where decision problems for many prominent formalisms (like default logic or circumscription) are located on the second level of the polynomial hierarchy (see, e.g. (Cadoli and Schaerf 1993; Thomas and Vollmer 2010) for survey articles), and only a few formalisms reach PSPACE-hardness. Examples for the latter are nested circumscription (Cadoli, Eiter, and Gottlob 2005), nested counterfactuals (Eiter and Gottlob 1996), model-preference default logic (Papadimitriou 1991), and theory curbing (Eiter and Gottlob 2006).

2 Background

2.1 Standard Concepts and Classical Semantics

We fix a non-finite background set U. An argumentation framework (AF) (Dung 1995) is a directed graph F = (A, R) where A ⊆ U represents a set of arguments and R ⊆ A × A models attacks between them. In this paper we consider finite AFs only. Let F denote the set of all finite AFs over U. Now assume F = (A, R). For S ⊆ A we let F↓S = (A ∩ S, R ∩ (S × S)). For a, b ∈ A, if (a, b) ∈ R we say that a attacks b, as well as that a attacks (the set) E given that b ∈ E ⊆ A. Moreover, E is conflict-free in F (for short, E ∈ cf(F)) iff for no a, b ∈ E, (a, b) ∈ R. We say a set E defends an argument a (in F) if any attacker of a is attacked by some argument of E, i. e.
for each (b, a) ∈ R, there is c ∈ E such that (c, b) ∈ R. A semantics σ is a mapping σ : F → 2^U with F 7→ σ(F) ⊆ 2^A, i.e. given an AF F = (A, R), a semantics returns a subset of 2^A. In this paper we consider the so-called admissible and preferred semantics (abbr. ad and pr).

Definition 2.1. Let F = (A, R) be an AF and E ∈ cf(F).
1. E ∈ ad(F) iff E defends all its elements,
2. E ∈ pr(F) iff E is ⊆-maximal in ad(F).

2.2 Weak Admissibility

The reduct is a central subject of study in this paper. For a compact definition we use, given an AF F = (A, R), E_F^+ = {a ∈ A | E attacks a in F} as well as E_F^⊕ = E ∪ E_F^+. The latter set is known as the range of E in F. When clear from the context, we omit the subscript F.

Definition 2.2. Let F = (A, R) be an AF and let E ⊆ A. The E-reduct of F is the AF F^E = (E*, R ∩ (E* × E*)) where E* = A \ E_F^⊕.

[Figure: the AF F of Example 2.3 and the identical reducts F^{a} = F^{b}, both containing only the self-attacking argument c]

Observe that the reduct does not contain any attacker of the admissible set {b}, in contrast to the non-admissible set {a}. The reduct is the central notion in the definition of weakly admissible semantics (Baumann, Brewka, and Ulbricht 2020):

Definition 2.4. For an AF F = (A, R), E ⊆ A is called weakly admissible (or w-admissible) in F (E ∈ ad^w(F)) iff
1. E ∈ cf(F), and
2. for any attacker y of E we have y ∉ ⋃ ad^w(F^E).

The major difference between the standard definition of admissibility and the "weak" one is that arguments do not have to defend themselves against all attackers: attackers which do not appear in any w-admissible set of the reduct can be neglected.

Example 2.5 (Example 2.3 ctd.). In the previous example we observed {a} ∉ ad(F). Let us verify the weak admissibility of {a} in F. Obviously, {a} is conflict-free in F (condition 1). Moreover, since c is the only attacker of {a} in F, we have to check c ∉ ⋃ ad^w(F^{a}) (condition 2). Since {c} violates conflict-freeness in the reduct F^{a} = ({c}, {(c, c)}), we find {c} ∉ ad^w(F^{a}), yielding ⋃ ad^w(F^{a}) = ∅. Hence, c ∉ ⋃ ad^w(F^{a}) holds, proving the claim.

Example 2.6. Now assume a is attacked by an odd cycle a1, a2, a3. Let us check whether {a} ∈ ad^w(F):

[Figure: the AF F, where a is attacked by the odd cycle a1, a2, a3, together with the reduct F^{a}]

In fact, the only conflict-free set attacking a is {a3}. However, in the reduct (F^{a})^{a3} the set {a2} is weakly admissible. Since a2 attacks a3, {a3} ∉ ad^w((F^{a})^{a3}):

[Figure: the reduct (F^{a})^{a3}]

The class ΠP2 = coNP^NP contains the problems that can be solved in nondeterministic polynomial time when the algorithm has access to an NP oracle. Finally, PSPACE contains the problems that can be solved in polynomial space and exponential time. We have P ⊆ NP/coNP ⊆ ΠP2 ⊆ PSPACE.

Let us consider another example illustrating the mechanisms of weak admissibility beyond self-defeating arguments.

Example 2.7. Consider the following AF F.

[Figure: the AF F of Example 2.7 on the arguments a1, a2, a3, a4]

Let us verify that (although this seems a bit surprising at first glance) {a4} ∈ ad^w(F). To see this, we note that a2 is the only attacker in the corresponding reduct:

[Figure: the reduct F^{a4}]

Now since a2 is attacked by a1, it stands no chance of being w-admissible in F^{a4}, although it is not a self-defeating argument. Thus {a4} ∈ ad^w(F).

3 Complexity Analysis

In this section, we investigate the complexity of the standard decision problems in argumentation for w-admissible and w-preferred semantics. Let us start by building up some intuition about the complexity of weak admissibility semantics. As for most semantics, the verification problem is a suitable groundwork.

Example 3.1. Consider the following AF F, adapted from the well-known standard translation from propositional formulas to AFs:

[Figure: the AF of Example 3.1, with argument t attacked by the clause arguments c1, c2, c3, which are in turn attacked by the literal arguments x1, x̄1, x2, x̄2, x3, x̄3]

Although Example 2.7 may appear somewhat counterintuitive, it is similar in spirit to Example 2.6: in both cases, weak admissibility verifies whether a certain attacker can be neglected.
In Example 2.6, a3 does no harm since it is contained in a self-defeating odd loop; in Example 2.7, a2 does no harm since it is defeated by the undisputed a1.

Let us check whether E = {t} ∈ ad^w(F). This is the case if none of the attackers c1, c2, or c3 occurs in a w-admissible extension of the reduct F^E, which is obtained by removing the argument t from F. Now take c1. We see that c1 does not occur in a w-admissible extension in F^E: it is attacked by both x1 and x̄1, which are in turn both w-admissible in any relevant sub-AF of F. Similarly, neither c2 nor c3 occurs in a w-admissible extension of F^E. Thus E ∈ ad^w(F).

Although this example was quite straightforward, several observations can be made:

• weak admissibility does not appear to be a local property: the reason why E = {t} is w-admissible in the previous example are the arguments x1, ..., x̄3, which are not contained in E; we also see that this example is quite small and can be extended to chains of arbitrary length,

• unless there is some shortcut, several sub-AFs need to be computed, inducing a recursion with depth in O(|A|) in the worst case,

• it is not clear at first glance whether deciding credulous acceptance is actually much easier, because guessing a suitable set (here {t, x1, x2, x3}) might skip computationally expensive recursive steps.

The main contribution of this paper is to formally prove that there are no shortcuts and no suitable guessing in any case: all considered non-trivial problems are PSPACE-complete. Our results are summarized in Table 1 together with the results for admissible and preferred semantics (cf. (Dvořák and Dunne 2018)).

Table 1: Complexity of w-admissible / w-preferred semantics in comparison with admissible / preferred semantics.

σ     | Credσ    | Skeptσ   | Verσ     | NEmptyσ
ad    | NP-c     | trivial  | in P     | NP-c
pr    | NP-c     | ΠP2-c    | coNP-c   | NP-c
ad^w  | PSPACE-c | trivial  | PSPACE-c | PSPACE-c
pr^w  | PSPACE-c | PSPACE-c | PSPACE-c | PSPACE-c

For NEmpty_{ad^w} = NEmpty_{pr^w} we iterate over all non-empty subsets of the arguments and test whether the set is w-admissible. If one of them is w-admissible we terminate and return true; otherwise we return false.

Following the classical Dung-style semantics, weakly preferred extensions are defined as ⊆-maximal w-admissible extensions.

Definition 2.8. For an AF F = (A, R), E ⊆ A is called weakly preferred (or w-preferred) in F (E ∈ pr^w(F)) iff E is ⊆-maximal in ad^w(F).

For more details regarding the definition and basic properties of weak admissibility we refer the reader to (Baumann, Brewka, and Ulbricht 2020).

2.3 Decision Problems and Complexity Classes

For an AF F = (A, R) and a semantics σ, we say an argument a ∈ A is credulously accepted (skeptically accepted) in F w.r.t. σ if a ∈ ⋃σ(F) (a ∈ ⋂σ(F)). The corresponding decision problems for a semantics σ, given an AF F and an argument a, are as follows:

Credulous Acceptance Credσ: deciding whether a is credulously accepted in F w.r.t. σ;

Skeptical Acceptance Skeptσ: deciding whether a is skeptically accepted in F w.r.t. σ.

We also consider the following decision problems, given an AF F:

Verification of an extension Verσ: deciding whether a set of arguments is in σ(F); and

Existence of a non-empty extension NEmptyσ: deciding whether σ(F) contains a non-empty set.

Finally, we assume the reader to be familiar with the basic concepts of computational complexity theory (see, e.g. (Dvořák and Dunne 2018)) as well as the standard classes P, NP, and coNP. In addition we consider the classes ΠP2 = coNP^NP and PSPACE.

3.2 Hardness Results

We show hardness by a reduction from the PSPACE-complete problem of deciding whether a QBF is valid. To this end we consider QBFs of the form Φ = ∀xn ∃xn−1 . . . ∀x2 ∃x1 : φ(x1, x2, . . . , xn−1, xn).
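For concreteness, validity of such a QBF (the canonical PSPACE-complete problem) can be decided by the obvious recursion over the quantifier prefix, using polynomial space but exponential time. The following toy checker is our own illustration, not part of the paper; clauses are encoded as sets of signed variable indices.

```python
def qbf_valid(prefix, clauses, env=None):
    """prefix: list of ('A', i) / ('E', i) pairs, outermost quantifier first.
    clauses: CNF as a list of sets of signed ints (i means x_i, -i means not x_i)."""
    env = env or {}
    if not prefix:
        # all variables assigned: evaluate the CNF matrix
        return all(any(env[abs(l)] == (l > 0) for l in c) for c in clauses)
    q, i = prefix[0]
    branches = (qbf_valid(prefix[1:], clauses, {**env, i: b}) for b in (False, True))
    return all(branches) if q == 'A' else any(branches)

# The QBF used later in Example 3.5: forall x2 exists x1: (~x2 or x1) and (x2 or ~x1)
print(qbf_valid([('A', 2), ('E', 1)], [{-2, 1}, {2, -1}]))  # True (valid)
```

The recursion depth equals the number of variables, which mirrors the O(|A|)-deep reduct recursion of weak admissibility noted above.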
Notice that Φ might start with a universal or an existential quantifier, then alternates between universal and existential quantifiers after each variable, and ends with an existential quantifier. φ is a propositional formula in CNF given by a set of clauses C, i.e. φ = ⋀_{c∈C} ⋁_{l∈c} l. We call a QBF starting with a universal quantifier a ∀-QBF and a QBF starting with an existential quantifier an ∃-QBF. Finally, observe that we named the variables in reverse order to avoid renaming variables in our proofs by induction.

3.1 Membership Results

In this section we provide an algorithm that can be implemented in PSPACE and closely follows the definition of w-admissibility.

Lemma 3.2. Ver_{ad^w} is in PSPACE.

Proof. An algorithm for verifying that E ∈ ad^w(F) proceeds as follows:
• test whether E ∈ cf(F); if not, return false,
• compute the reduct F^E,
• iterate over all subsets S of F^E that contain at least one attacker of E and test whether S is w-admissible; if so, return false; else return true.

Notice that the last step involves recursive calls. However, the size of the considered AF is decreasing in each step and thus the recursion depth is in O(n). Moreover, we only need to store the current AF as well as the set S to verify. Finally, iterating over all subsets of an AF can be done in PSPACE as well. Hence, the above algorithm is in PSPACE.

We start with a reduction that maps QBFs to AFs such that the validity of the QBF can be read off by inspecting the w-admissible sets of the AF. We will later extend this reduction to encode the specific decision problems under our consideration.

Reduction 3.4. Given a QBF Φ with propositional formula φ(x1, . . . , xn) we define the AF GΦ = (A, R) with A and R as follows.
A = {xi, x̄i, pi | 1 ≤ i ≤ n} ∪ {c | c ∈ C}
R = {(xi, x̄i), (x̄i, xi) | 1 ≤ i ≤ n} ∪
    {(xi, xi+1), (xi, x̄i+1) | 1 ≤ i < n} ∪
    {(x̄i, xi+1), (x̄i, x̄i+1) | 1 ≤ i < n} ∪
    {(xi, c) | xi ∈ c ∈ C} ∪ {(x̄i, c) | ¬xi ∈ c ∈ C} ∪
    {(c, x1), (c, x̄1) | c ∈ C} ∪
    {(pi, pi+1) | 1 ≤ i < n} ∪
    {(xi, pi), (x̄i, pi) | 1 ≤ i ≤ n} ∪
    {(pi, xi−1), (pi, x̄i−1) | 2 ≤ i ≤ n} ∪ {(p1, c) | c ∈ C}

Given that verifying is in PSPACE, we can adapt standard algorithms to obtain the PSPACE membership of the other problems. Notice that Skept_{ad^w} is always false, as the empty set is always w-admissible.

Proposition 3.3. All of the following problems can be solved in PSPACE: Cred_{ad^w}, Ver_{ad^w}, NEmpty_{ad^w}, Cred_{pr^w}, Skept_{pr^w}, Ver_{pr^w}, and NEmpty_{pr^w}.

Proof. Ver_{ad^w} ∈ PSPACE is by Lemma 3.2. The other memberships are by the following algorithms, which can be easily implemented in PSPACE with calls to other PSPACE problems, e.g. Ver_{ad^w}, and thus are themselves in PSPACE. Ver_{pr^w} can be solved by first verifying that the set is w-admissible and then iterating over all supersets and verifying that they are not w-admissible. For Cred_{ad^w} = Cred_{pr^w} we iterate over all subsets of the arguments that contain the query argument and test whether the set is w-admissible. As soon as we find a subset that is w-admissible we can stop and return that the argument is credulously accepted. Otherwise, if none of the sets is w-admissible, the argument is not credulously accepted. For Skept_{pr^w} we iterate over all subsets of the arguments that do not contain the query argument and test whether the set is w-preferred. As soon as we find a subset that is w-preferred we can stop and return that the argument is not skeptically accepted. Otherwise, if none of the sets is w-preferred, the argument is skeptically accepted.

Example 3.5. Let us consider the valid QBF ∀x2 ∃x1 : φ with φ = c1 ∧ c2 = (¬x2 ∨ x1) ∧ (x2 ∨ ¬x1). Let us apply Reduction 3.4 to obtain an AF F. It will be convenient to think of several layers, each one induced by a variable occurring in the QBF at hand. We thus have two layers here, with xi and x̄i attacking each other in the expected way and each layer attacked by its predecessor:

[Figure: the variable layers x2, x̄2 and x1, x̄1]

The x-arguments attack the c-arguments in the natural way. The c-arguments attack the X1 layer only.

[Figure: the layers X2 and X1 together with the clause arguments C]

The arguments p1 and p2 induce odd cycles to forbid certain possible extensions.

[Figure: the full AF F with p1 and p2 added]

For the variant F′ considered below, E = {x̄2} is w-admissible: the reduct is the same as the one depicted above, with the attack from x̄1 to c2 removed. Thus neither {x1} nor {x̄1} is w-admissible in (F′)^E: the argument c2 occurs in both ((F′)^E)^{x1} and ((F′)^E)^{x̄1} and hence witnesses that both arguments are not w-admissible in (F′)^E. This means in turn that E = {x̄2} is w-admissible in F′.

Example 3.6. For the sake of demonstrating our construction, let us assume our QBF consists of three variables, i.e. consider ∃x3 ∀x2 ∃x1 : φ with φ = (¬x2 ∨ x1) ∧ (x2 ∨ ¬x1) as above. In full rig-out, Reduction 3.4 applied to our QBF looks as follows:

[Figure: the three-layer AF for ∃x3 ∀x2 ∃x1 : φ, with layers x3, x̄3; x2, x̄2; x1, x̄1, clause arguments c1, c2, and odd-cycle arguments p1, p2, p3]

Now E = {x3} is w-admissible: the reduct F^{x3} is the AF F from the previous example, where both {x2} and {x̄2} are not w-admissible (recall that we consider the formula φ from above). Thus E is w-admissible. The previous examples hint at a general behavior of Reduction 3.4, made precise below.

Now regarding our QBF, note that setting x2 to true requires x1 to be true as well, and setting x2 to false requires x1 to be false. This translates to F as follows: take E = {x̄2}, corresponding to setting x2 to false. The set E is not w-admissible in F.
To see this, consider the reduct F^E:

[Figure: the reduct F^E, containing c2, x1, x̄1, p2, p1]

Now {x̄1} (corresponding to ¬x1 in the QBF) is w-admissible in F^E (even admissible) and attacks x̄2, witnessing that E ∉ ad^w(F). Similarly, {x2} is not w-admissible since it is attacked by x1 in the corresponding reduct.

Let us now consider a QBF which evaluates to false. For this, we move from φ to φ′ = C1 ∧ C2 = (¬x2 ∨ x1) ∧ (x2). Note that φ′ is obtained from φ by removing ¬x1 from C2. Consider the induced QBF ∀x2 ∃x1 : φ′. Let F′ be the AF obtained by applying Reduction 3.4. This time, E = {x̄2} turns out to be w-admissible.

• if a QBF of the form ∀x2 ∃x1 : φ evaluates to true, then neither {x2} nor {x̄2} is w-admissible,

• if a QBF of the form ∃x3 ∀x2 ∃x1 : φ evaluates to true, then at least one of {x3} and {x̄3} is w-admissible, and

• analogous reasoning applies to QBFs which evaluate to false.

We also want to mention that, e.g. in Example 3.6, the arguments x3 and x̄3 are the only possible candidates for w-admissible extensions:

• p2, for example, is attacked by x2 and x̄2 in the corresponding reduct; this can only be prevented by also including x1 or x̄1, which in turn are in conflict with p2,

• c1, for example, is attacked by p1, which can only be removed from the reduct by including x1 or x̄1, but both are attacked by c1,

• x2, for example, is attacked by p3, but attacks x3, x̄3 and p2, which are the only attackers of p3.

The following proposition formalizes that these observations are true in general.

Proposition 3.7. For a QBF Φ,
1. if Φ is of the form ∃xn ∀xn−1 . . . ∃x1 : φ(x1, x2, . . . , xn) we have that ad^w(GΦ) ∩ {{xn}, {x̄n}} ≠ ∅ if Φ is valid and ad^w(GΦ) = {∅} otherwise; and
2. if Φ is of the form ∀xn ∃xn−1 . . . ∃x1 : φ(x1, x2, . . . , xn) we have that ad^w(GΦ) = {∅} if Φ is valid and ad^w(GΦ) ∩ {{xn}, {x̄n}} ≠ ∅ otherwise.
Moreover, in both cases ad^w(GΦ) ⊆ {{xn}, {x̄n}, ∅}.

In order to prove the above proposition we introduce several technical lemmas.

Lemma 3.8. For a QBF Φ, ad^w(GΦ) ⊆ {{xn}, {x̄n}, ∅}.

Proof. Let E ∈ ad^w(GΦ). Assume pi ∈ E for some i ≥ 2. Since E must be conflict-free, we have xi−1 ∉ E, x̄i−1 ∉ E, and pi+1 ∉ E, as well as xi ∉ E and x̄i ∉ E. Thus both xi and x̄i occur in the reduct F^E and do not have any attacker in F^E. In this case E cannot be w-admissible, since pi ∈ E is attacked by {xi} and {x̄i}, which are w-admissible in F^E. So the assumption pi ∈ E for some i ≥ 2 must be false. Consider now p1 ∈ E. Similarly, if E ∈ cf(F), then cj ∉ E for each j, as well as x1, x̄1 ∉ E, and hence p1 is attacked by x1 and x̄1, which are unattacked in F^E; contradiction. Now assume cj ∈ E for some j. Since cj attacks both x1 and x̄1, {p1} is w-admissible in F^E, which in turn attacks cj. Thus cj ∈ E is impossible. Finally, if xi ∈ E or x̄i ∈ E for i ≤ n − 1, then either pi+1 is unattacked in F^E (which attacks both arguments) or pi ∈ E, xi+1 ∈ E, or x̄i+1 ∈ E (which contradicts E ∈ cf(F)). Hence xi ∉ E.

The remainder of the proof proceeds by induction on the number of variables: Lemma 3.9 is the base case and the consecutive lemmata constitute the induction step.

Lemma 3.9. For Φ = ∃x1 : φ(x1) we have that ad^w(GΦ) ∩ {{x1}, {x̄1}} ≠ ∅ iff Φ is valid.

Proof. By the above lemma it suffices to consider the sets {x1}, {x̄1}. The formula Φ is valid iff x1 or ¬x1 appears in all clauses. ⇒: Assume {x1} is a w-admissible set but Φ is not valid, i.e. there is a c ∈ C such that x1 ∉ c. By construction, x1 attacks p1 and is attacked by c, and thus c is unattacked in the reduct; hence {c} is w-admissible in the reduct, which is in contradiction to {x1} being w-admissible. A similar reasoning applies to the case where {x̄1} is w-admissible but Φ is not valid. ⇐: Assume that the formula is valid and w.l.o.g. assume that x1 appears in all clauses. Then by construction x1 attacks all the other arguments in GΦ and thus {x1} is a w-admissible set.

First assume that Φ is valid and consider {xn} (the argument for {x̄n} is analogous). We show that {xn} is not w-admissible. Consider GΦ^{xn} = GΦ1. By the induction hypothesis we have that {xn−1} or {x̄n−1} is weakly admissible in GΦ^{xn}, and as both xn−1 and x̄n−1 attack xn, we have that {xn} is not w-admissible. Now assume that Φ is not valid and w.l.o.g. assume that Φ1 is not valid. By the induction hypothesis we have that neither {xn−1} nor {x̄n−1} is weakly admissible in GΦ^{xn} = GΦ1, and thus {xn} is w-admissible.

Lemma 3.11. If Proposition 3.7 holds for ∀-QBFs with n−1 variables then it also holds for ∃-QBFs with n variables.

Proof. Consider an ∃-QBF Φ = ∃xn ∀xn−1 . . . ∃x1 : φ(x1, x2, . . . , xn). We have that Φ is valid iff one of Φ1 = ∀xn−1 ∃xn−2 . . . ∃x1 : φ(x1, x2, . . . , ⊤) and Φ2 = ∀xn−1 ∃xn−2 . . . ∃x1 : φ(x1, x2, . . . , ⊥) is valid. Moreover, GΦ^{x̄n} = GΦ1 and GΦ^{xn} = GΦ2. First assume that Φ is valid and w.l.o.g. assume that Φ1 is valid. We show that {xn} is w-admissible. Consider GΦ^{xn} = GΦ1. By the induction hypothesis we have that neither {xn−1} nor {x̄n−1} is weakly admissible in GΦ^{xn}, and thus {xn} is w-admissible. Now assume that Φ is not valid and consider {xn} (the argument for {x̄n} is analogous). By the induction hypothesis we have that {xn−1} or {x̄n−1} is weakly admissible in GΦ^{xn}, and as both xn−1 and x̄n−1 attack xn, we have that {xn} is not w-admissible.

We next extend our reduction by two further arguments φ, pn+1 in order to show our hardness results.

Reduction 3.12. Given a ∀-QBF Φ = ∀xn ∃xn−1 . . . ∃x1 : φ(x1, x2, . . . , xn) we define the AF FΦ = GΦ ∪ ({φ, pn+1}, {(φ, pn+1), (pn+1, xn), (pn+1, x̄n), (xn, φ), (x̄n, φ)}).

Example 3.13. Recall the valid QBF from our first example: ∀x2 ∃x1 : φ with φ = C1 ∧ C2 = (¬x2 ∨ x1) ∧ (x2 ∨ ¬x1).
Augmenting Reduction 3.4 with Reduction 3.12 yields the following AF F:

[Figure: the AF FΦ on the arguments c1, c2, x1, x2, x̄1, x̄2, p1, p2, p3, and φ]

Note the similarity to Example 3.6: basically, φ replaces the pair x3, x̄3 of arguments. Hence it is easy to see that {φ} is w-admissible, since the reduct F^{φ} is again the first AF from Example 3.5, possessing no w-admissible argument.

Lemma 3.10. If Proposition 3.7 holds for ∃-QBFs with n−1 variables then it also holds for ∀-QBFs with n variables.

Proof. Consider a ∀-QBF Φ = ∀xn ∃xn−1 . . . ∃x1 : φ(x1, x2, . . . , xn). We have that Φ is valid iff both Φ1 = ∃xn−1 ∀xn−2 . . . ∃x1 : φ(x1, x2, . . . , ⊤) and Φ2 = ∃xn−1 ∀xn−2 . . . ∃x1 : φ(x1, x2, . . . , ⊥) are valid. Moreover, GΦ^{x̄n} = GΦ1 and GΦ^{xn} = GΦ2.

We now formally characterize the potential w-admissible sets in Reduction 3.12.

Lemma 3.14. For a QBF Φ, ad^w(FΦ) ⊆ {∅, {φ}}.

Proof. In comparison to Lemma 3.8, we are only left to consider pn+1. The assumption pn+1 ∈ E ∈ ad^w(F) yields an analogous contradiction: E ∈ cf(F) implies φ, xn ∉ E, and hence pn+1 is attacked by {φ} ∈ ad^w(F^E).

Proposition 3.15. Given a ∀-QBF Φ = ∀xn ∃xn−1 . . . ∃x1 : φ(x1, x2, . . . , xn) we have that Φ is valid if and only if ad^w(FΦ) = {∅, {φ}}, and ad^w(FΦ) = {∅} otherwise.

Proof. We have that the empty set is always w-admissible, and by Lemma 3.14 that {φ} is the only candidate for being a w-admissible set. Now consider {φ} and the reduct FΦ^{φ}. We have that FΦ^{φ} = GΦ, with xn and x̄n being the attackers of φ. By Proposition 3.7 we have that {xn} or {x̄n} is w-admissible in the reduct iff Φ is not valid. Thus {φ} is w-admissible iff Φ is valid.

BU denotes the set of all ground atoms over U. A rule r is of the form

a ← b1, . . . , bk, not bk+1, . . . , not bm

with m ≥ k ≥ 0, where a, b1, . . . , bm are atoms, and "not" stands for default negation. The head of r is a and the body of r is body(r) = {b1, . . . , bk, not bk+1, . . . , not bm}. Furthermore, body+(r) = {b1, . . . , bk} is the positive body and body−(r) = {bk+1, . . . , bm} is the negative body. When convenient, we write (parts of) a rule body as a conjunction. A rule r is ground if r does not contain variables. Moreover, the DATALOG safety condition requires that all variables of a rule appear in the positive body. We say that a predicate A depends on predicate B if there is a rule where A appears in the head and B in the body. We say a program is non-recursive if the dependencies between predicates have no cycle. In DATALOG we distinguish between input predicates, which are given by an input database, i.e. a set of ground atoms, and the predicates that are defined by the rules of the program. The complexity analysis of DATALOG distinguishes data-complexity, where one considers an arbitrary but fixed program and analyses the complexity w.r.t. the size of the database, and program-complexity, where one considers an arbitrary but fixed database and analyses the complexity w.r.t. the size of the program. Our encoding refers to the latter notion, as we encode the weakly-admissible sets of an AF as a non-recursive DATALOG program which is then applied to a fixed input database.

For our encoding we consider an AF F = (A, R) with arguments A = {a1, . . . , an}. The weakly-admissible sets will be encoded as an n-ary predicate wadm(e1, . . . , en), where variable ei indicates whether argument ai is in the extension or not. That is, our fixed database will be over the boolean domain {0, 1}. Our encoding closely follows the definition of w-admissible sets, which of course is recursive. To avoid recursion in the DATALOG program we will exploit that the recursion depth is bounded by n, as a reduct is always smaller than the AF from which it is built (we delete at least the arguments from the extension). We will introduce n copies of certain predicates, each of which can only be used on a certain recursion depth of the w-admissible definition. The input database contains a unary predicate dom = {0, 1} defining the boolean domain of our variables and standard predicates that allow to encode the arithmetic operations we are using in our rules, e.g. the binary predicates equal = {(0, 0), (1, 1)} and leq = {(0, 0), (0, 1), (1, 1)} (below we denote them via "=" and "≤" symbols). We will first introduce certain auxiliary predicates which we require in order to define wadm(e1, . . . , en). In our encoding we will use variables {xi, yi, di, ei | 1 ≤ i ≤ n} to represent whether arguments are in certain sets or not. We will use the following shorthands to group variables that together represent a set of arguments: X = x1, . . . , xn; Y = y1, . . . , yn; D = d1, . . . , dn; and E = e1, . . . , en. We will use each set of variables to represent a set of arguments such that the i-th variable is set to 1 iff the i-th argument is in the set and 0 otherwise.

Theorem 3.16. All of the following problems are PSPACE-complete: Cred_{ad^w}, Ver_{ad^w}, NEmpty_{ad^w}, Cred_{pr^w}, Skept_{pr^w}, Ver_{pr^w}, and NEmpty_{pr^w}.

Proof. The membership results are by Proposition 3.3. The hardness results are all by Reduction 3.12 and Proposition 3.15. It only remains to state the precise problem instances that are equivalent to testing the validity of the ∀-QBF Φ. First, consider Cred_{ad^w} = Cred_{pr^w}. In the AF FΦ we have that φ is credulously accepted w.r.t. w-admissible semantics iff {φ} ∈ ad^w(FΦ) iff Φ is valid. Now, consider Ver_{ad^w} and Ver_{pr^w}. We have that {φ} ∈ ad^w(FΦ) iff {φ} ∈ pr^w(FΦ) iff Φ is valid. Next, consider Skept_{pr^w}. We have that φ is skeptically accepted iff pr^w(FΦ) = {{φ}} iff Φ is valid. Finally, consider NEmpty_{ad^w} = NEmpty_{pr^w}. We have that the only w-preferred/w-admissible extension is the empty set iff Φ is not valid.
That is, Reduction 3.12 provides a reduction from ∀-QBF to all of the considered problems, and as it can be clearly performed in polynomial time, the PSPACE-hardness of all these problems follows. 4 DATALOG Encoding In this section we provide a DATALOG encoding for w-admissible semantics. Our reduction will generate a polynomial size logic program that falls into the class of non-recursive DATALOG with stratified negation which is known to be PSPACE-complete in terms of programcomplexity (Dantsin et al. 2001). We first briefly recall the syntax of DATALOG programs. We fix a countable set U of constants. An atom is an expression p(t1 , . . . , tn ), where p is a predicate symbol of arity n ≥ 0 and each term ti is either a variable or an element from U . An atom is ground if it is free of variables. BU 106 The first constraint ensures that each argument in E is also in the range. The second constraint ensures that each argument in F ↓X that is attacked by E is in the range (but makes no statement about arguments not in F↓X ). The third constraint encodes that an argument is only in the range if it is in E or attacked by E and the final constraint ensures that only arguments in F↓X can be in the range. We start with encoding the subset relation between two sets of arguments X, Y by n ^ x i ≤ yi . X⊆Y ← i=1 Next we define a predicate cf (·) encoding conflictfreeness. Conflict-free sets can be defined by a rule which for each attack checks that not both incident arguments are in the set. cf (E) ← n ^ ^ dom(ei ), i=1 Example 4.3. Consider our running example, the initial AF, and the set E = {a4 }. We then obtain Range(1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1) which represents that the range of E equals {a3 , a4 }. Now consider this sub-AF G = F ↓{a1 ,a2 } and the set E = {a2 }. We obtain that Range(1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0) which represents that the range of E in G equals {a2 }. ei + ej ≤ 1. 
(i,j)∈R Notice that we added the dom(ei ) predicates in the body only to meet the safety condition of DATALOG (for arguments that are not involved in any attack). Example 4.1. We will use the AF F from Example 2.7 as our running example for this section. We are now ready to encode w-admissible semantics. In a first step we encode the reduct operation and use a predicate Red(X, E, Y ) encoding that when we are in the sub-AF F ↓X and build the reduct for the argument set E we obtain the sub-AF F↓Y a2 F: a1 a4 Red(X, E, Y ) ←Range(X, E, D), Range(X, E, D) defines the range of E within the subframework F ↓X and the second constraint makes sure that exactly those argument which are in X but not in the range of E are included in the reduct F↓Y (notice that by the definition of Range we have di ≤ xi ). For our example we obtain that cf (e1 , e2 , e3 , e4 ) = {(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1), (0, 1, 1, 0), (1, 0, 0, 1)} which corresponds to the seven conflict-free sets of F . Next we define the predicate Att(·, ·) which encodes that the first set of arguments attacks the second set. To this end, for each attack (aj , ak ) ∈ R, we add the following rule to the DATALOG program: n ^ dom(di ), i=1 n ^ Example 4.4. Consider our running example, the initial AF, and the set E = {a4 }. This determines the first eight variables of Red and we then obtain Red(1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0) as the Range predicate sets D to (0, 0, 1, 1). This reflects that F E = F ↓{a1 ,a2 } . Now consider this sub-AF G = F ↓{a1 ,a2 } and the set E = {a2 }. We obtain Red(1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0) as the Range predicate sets D to (1, 0, 0, 0). This reflects that GE = F↓{a1 } . dom(ei ), dj = 1, ek = 1. i=1 The dom(·) predicates again ensure that we meet the safety condition of DATALOG. Example 4.2. Consider our running example and sets D = {a2 }, E = {a4 }. We have that D attacks E as (a2 , a4 ) is an attack of F . 
Our DATALOG program thus has the rule

  Att(D, E) ← ⋀_{i=1..4} dom(di), ⋀_{i=1..4} dom(ei), d2 = 1, e4 = 1,

and we thus obtain Att(0, 1, 0, 0, 0, 0, 0, 1). Similarly we obtain Att(1, 0, 0, 0, 0, 1, 0, 0) as a1 attacks a2.

In the following, with slight abuse of notation, we will use F↓X to refer to the sub-AF of F that is given by the arguments in the set represented by X, i.e. F↓X = (A′, R ∩ (A′ × A′)) with A′ = {ai | xi = 1}. We define a predicate Range(X, E, D) that defines the range D of an extension E in the AF F↓X:

  Range(X, E, D) ← E ⊆ D, ⋀_{(i,j)∈R} min(ei, xj) ≤ dj, ⋀_{i=1..n} (di ≤ ei + max_{(j,i)∈R} ej), D ⊆ X.

In order to define the predicate wadm(E) we introduce predicates Pi(X, E), 1 ≤ i ≤ n, that encode the w-admissible sets of the reducts on the i-th recursion level. Recall that the recursion depth of the computation is bounded by n. The variables X in Pi(X, E) encode the arguments of the reduct and the variables E encode the extension, i.e. E represents a w-admissible set of F↓X. We have that the initial AF F corresponds to the reduct containing all arguments:

  wadm(E) ← P1(1, . . . , 1, E).

Example 4.5. We want to show that E = {a4} is w-admissible in our running example, i.e. we want to prove that wadm(0, 0, 0, 1). By the above rule this is equivalent to showing P1(1, 1, 1, 1, 0, 0, 0, 1).

Next we define the w-admissible sets of each reduct. We first state the rules for 1 ≤ i ≤ n − 1 and then consider the special case Pn where at most one argument is left in the reduct.
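The intended semantics of Range and Red over 0/1 vectors can likewise be mirrored procedurally. The sketch below uses hypothetical helper names and an illustrative attack relation, not the paper's running example:

```python
def range_of(x, e, attacks):
    # Range D of E within the sub-AF F|X: d_i = 1 iff argument i lies in X
    # and is either in E or attacked by some member of E.
    d = []
    for i in range(len(x)):
        attacked_by_e = any(e[j] for (j, k) in attacks if k == i)
        d.append(1 if x[i] and (e[i] or attacked_by_e) else 0)
    return tuple(d)

def reduct(x, e, attacks):
    # Mirrors the conjunct y_i = x_i - d_i of the Red rule.
    d = range_of(x, e, attacks)
    return tuple(xi - di for xi, di in zip(x, d))

# Hypothetical 4-argument AF over indices 0..3 (not the paper's running example).
attacks = {(0, 1), (1, 3), (3, 2)}
```

For instance, taking E to be the argument with index 3 in the full framework, range_of((1, 1, 1, 1), (0, 0, 0, 1), attacks) gives (0, 0, 1, 1) and the corresponding reduct is (1, 1, 0, 0).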
The lower bound was proved by a suitable adjustment of the well-known standard translation from propositional formulas to AFs, with some noteworthy novel features: i) the argument φ representing whether or not the formula evaluates to true is not attacked by the arguments Ci representing the clauses, but only by two of the variables, ii) the arguments representing the variables occurring in the given formula attack each other forming several layers in order to implement the quantifier alternation, iii) auxiliary arguments pj are required to guide the simulation of the aforementioned alternation, and iv) none of the arguments corresponding to variables in the QBF at hand are contained in a w-admissible extension of the constructed AF; the important part of the construction is the interaction of arguments which are not accepted. The construction demonstrates that the “look ahead” incorporated in w-admissible semantics, which we hinted at in Examples 2.7 and 3.1, is as expressive as any reasoning problem in PSPACE. It is quite surprising that a simple definition of an argumentation semantics (after all, only the reduct and conflict-free semantics are mentioned) with a natural motivation such as reducing the damage caused by self-defeating arguments (as opposed to tailoring artificial semantics in order to reach this expressive power) is PSPACE-hard. However, this high computational complexity also calls for the investigation of suitable algorithms. As a first step towards this direction, we provided a DATALOG encoding for w-admissible semantics. Implementing and evaluating it is part of future work. 
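As a naive baseline for such an implementation, the definition of w-admissibility (conflict-freeness plus the absence of a w-admissible attacker in the reduct) can be prototyped directly by brute-force recursion. This is an illustrative sketch with hypothetical names and exponential enumeration, not the proposed DATALOG encoding:

```python
from itertools import combinations

def attacks_between(d, e, attacks):
    # True iff some argument in d attacks some argument in e.
    return any((a, b) in attacks for a in d for b in e)

def conflict_free(e, attacks):
    return not attacks_between(e, e, attacks)

def reduct(args, e, attacks):
    # Arguments of the reduct F^E: args minus the range of e.
    rng = set(e) | {tgt for (src, tgt) in attacks if src in e and tgt in args}
    return set(args) - rng

def weakly_admissible(args, e, attacks):
    # e is w-admissible in F|args iff e is conflict-free and no
    # w-admissible subset of the reduct attacks e (mirrors the P_i/Q_i rules).
    if not set(e) <= set(args) or not conflict_free(e, attacks):
        return False
    red = reduct(args, e, attacks)
    for k in range(1, len(red) + 1):  # the empty set never attacks e
        for d in map(set, combinations(red, k)):
            if attacks_between(d, e, attacks) and weakly_admissible(red, d, attacks):
                return False
    return True
```

On a two-argument AF in which a self-attacking b attacks a, the sketch confirms the motivating behaviour: {a} is weakly admissible even though it is attacked, because no weakly admissible set of the reduct attacks it.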
In light of the results obtained in this paper, several other reasoning problems are now also expected to be PSPACE-complete: most notably, the standard reasoning problems for weakly complete and weakly grounded extensions (see (Baumann, Brewka, and Ulbricht 2020)), but also more sophisticated ones like enforcement (Wallner, Niskanen, and Järvisalo 2017) or computing repairs (Baumann and Ulbricht 2019; Brewka, Thimm, and Ulbricht 2019). A possible future work direction is to formally prove these conjectures. The most natural strategy to handle the considerable computational complexity is the investigation of suitable subclasses of AFs, which has already been proven beneficial for the classical Dung-style semantics (see e.g. (Dvořák and Dunne 2018, Section 3.4)). This might also provide the basis for potential fixed-parameter tractable algorithms (Dvořák, Ordyniak, and Szeider 2012; Dvořák, Pichler, and Woltran 2012).

  Pi(X, E) ← E ⊆ X, cf(E), not Qi(X, E).
  Qi(X, E) ← Red(X, E, Y), D ⊆ Y, Att(D, E), Pi+1(Y, D).
  Pn(X, E) ← Σ_{i=1..n} xi ≤ 1, E ⊆ X, cf(E).

The first two rules are a direct encoding of the definition of w-admissible sets. That is, a set E is weakly admissible in F if it is conflict-free in F and there is no weakly admissible set D in the reduct F^E that attacks E. The last rule covers the special case that at recursion depth n at most one argument is left in the reduct.

Example 4.6. In order to prove that {a4} is w-admissible we need to show that P1(1, 1, 1, 1, 0, 0, 0, 1) is derived by the encoding. By the above we have that (0, 0, 0, 1) ⊆ (1, 1, 1, 1) and cf(0, 0, 0, 1) hold, and thus we have to investigate Q1(1, 1, 1, 1, 0, 0, 0, 1). We have that, by Red(X, E, Y), Y must be (1, 1, 0, 0), and further, by D ⊆ Y, D must be one of (0, 0, 0, 0), (0, 1, 0, 0), (1, 0, 0, 0), (1, 1, 0, 0). Finally, by Att(D, 0, 0, 0, 1), we have that D must be either (0, 1, 0, 0) or (1, 1, 0, 0), corresponding to the sets {a2} and {a1, a2}.
We thus have to test P2(1, 1, 0, 0, 0, 1, 0, 0) and P2(1, 1, 0, 0, 1, 1, 0, 0). The latter immediately fails as (1, 1, 0, 0) ∉ cf. For the former we have that (0, 1, 0, 0) ⊆ (1, 1, 0, 0) and cf(0, 1, 0, 0) hold, and we have to investigate Q2(1, 1, 0, 0, 0, 1, 0, 0). For Red(X, E, Y), Y must be (1, 0, 0, 0), and further, by D ⊆ Y and Att(D, 0, 1, 0, 0), D must be (1, 0, 0, 0). We thus have to test P3(1, 0, 0, 0, 1, 0, 0, 0). We have that (1, 0, 0, 0) ⊆ (1, 0, 0, 0) and cf(1, 0, 0, 0) hold, and we have to investigate Q3(1, 0, 0, 0, 1, 0, 0, 0). Now, Y must be (0, 0, 0, 0) to make Red(X, E, Y) hold, and by D ⊆ Y also D = (0, 0, 0, 0). But as (0, 0, 0, 0, 1, 0, 0, 0) ∉ Att, i.e. the empty set does not attack {a1}, we cannot prove Q3(1, 0, 0, 0, 1, 0, 0, 0). But then not Q3(1, 0, 0, 0, 1, 0, 0, 0) is true and we obtain P3(1, 0, 0, 0, 1, 0, 0, 0) as well as Q2(1, 1, 0, 0, 0, 1, 0, 0). Given that Q2(1, 1, 0, 0, 0, 1, 0, 0) holds, we cannot prove P2(1, 1, 0, 0, 0, 1, 0, 0) and thus, as also P2(1, 1, 0, 0, 1, 1, 0, 0) failed, we cannot prove Q1(1, 1, 1, 1, 0, 0, 0, 1). But then not Q1(1, 1, 1, 1, 0, 0, 0, 1) is true and we obtain P1(1, 1, 1, 1, 0, 0, 0, 1) and wadm(0, 0, 0, 1).

Notice that our DATALOG encoding is indeed non-recursive and thus can be solved in PSPACE.

5 Conclusion

In this paper, we investigated the computational complexity of the standard reasoning problems for weakly admissible and weakly preferred semantics. More specifically, we examined the verification problem, the problem of deciding whether or not a given AF possesses a non-empty extension, as well as credulous and skeptical acceptance of a given argument. It turns out that all of them, except the trivial skeptical acceptance for ad^w, are PSPACE-complete in general.

Acknowledgments

This research has been supported by WWTF through project ICT19-065, FWF through project P30168, and DFG through project BR 1817/7-2.
References

Baroni, P.; Caminada, M.; and Giacomin, M. 2011. An introduction to argumentation semantics. The Knowledge Engineering Review 26:365–410.
Baumann, R., and Ulbricht, M. 2019. If nothing is accepted – repairing argumentation frameworks. Journal of Artificial Intelligence Research 66:1099–1145.
Baumann, R.; Brewka, G.; and Ulbricht, M. 2020. Revisiting the foundations of abstract argumentation: Semantics based on weak admissibility and weak defense. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2742–2749. AAAI Press.
Brewka, G.; Thimm, M.; and Ulbricht, M. 2019. Strong inconsistency. Artificial Intelligence 267:78–117.
Cadoli, M., and Schaerf, M. 1993. A survey of complexity results for nonmonotonic logics. J. Log. Program. 17(2/3&4):127–160.
Cadoli, M.; Eiter, T.; and Gottlob, G. 2005. Complexity of propositional nested circumscription and nested abnormality theories. ACM Trans. Comput. Log. 6(2):232–272.
Dantsin, E.; Eiter, T.; Gottlob, G.; and Voronkov, A. 2001. Complexity and expressive power of logic programming. ACM Comput. Surv. 33(3):374–425.
Dung, P. M. 1995. On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence 77(2):321–357.
Dvořák, W., and Dunne, P. E. 2018. Computational problems in formal argumentation and their complexity. In Baroni, P.; Gabbay, D.; Giacomin, M.; and van der Torre, L., eds., Handbook of Formal Argumentation. College Publications. Also appears in IfCoLog Journal of Logics and their Applications 4(8):2623–2706.
Dvořák, W., and Gaggl, S. A. 2016. Stage semantics and the scc-recursive schema for argumentation semantics. J. Log. Comput. 26(4):1149–1202.
Dvořák, W.; Ordyniak, S.; and Szeider, S. 2012. Augmenting tractable fragments of abstract argumentation. Artif. Intell. 186:157–173.
Dvořák, W.; Pichler, R.; and Woltran, S. 2012. Towards fixed-parameter tractable algorithms for abstract argumentation.
Artif. Intell. 186:1–37.
Eiter, T., and Gottlob, G. 1996. The complexity of nested counterfactuals and iterated knowledge base revisions. J. Comput. Syst. Sci. 53(3):497–512.
Eiter, T., and Gottlob, G. 2006. Reasoning under minimal upper bounds in propositional logic. Theor. Comput. Sci. 369(1-3):82–115.
Gaggl, S. A., and Woltran, S. 2013. The cf2 argumentation semantics revisited. J. Log. Comput. 23(5):925–949.
Papadimitriou, C. H. 1991. On selecting a satisfying truth assignment (extended abstract). In 32nd Annual Symposium on Foundations of Computer Science, San Juan, Puerto Rico, 1-4 October 1991, 163–169. IEEE Computer Society.
Thomas, M., and Vollmer, H. 2010. Complexity of nonmonotonic logics. Bulletin of the EATCS 102:53–82.
Wallner, J. P.; Niskanen, A.; and Järvisalo, M. 2017. Complexity results and algorithms for extension enforcement in abstract argumentation. J. Artif. Intell. Res. 60:1–40.

Cautious Monotonicity in Case-Based Reasoning with Abstract Argumentation

Guilherme Paulino-Passos, Francesca Toni
Imperial College London, Department of Computing
{g.passos18, f.toni}@imperial.ac.uk

Abstract

of the “default argument” in the grounded extension (Dung 1995), and use fragments of the AA framework for explanation (e.g. dispute trees as in (Čyras, Satoh, and Toni 2016b; Cocarascu et al. 2020) or excess features in (Čyras et al. 2019)). Different incarnations of AA-CBR use different mechanisms for defining “specificity”, “irrelevance” and “default argument”: the original version in (Čyras, Satoh, and Toni 2016a) defines all three notions in terms of ⊇ (and is thus referred to in this paper as AA-CBR⊇); thus, AA-CBR⊇ is applicable only to cases characterised by sets of features; the version used for classification in (Cocarascu et al.
2020) defines “specificity” in terms of a generic partial order ⪰, “irrelevance” in terms of a generic relation ≁ and “default argument” in terms of a generic characterisation δC (and is thus referred to in this paper as AA-CBR⪰,≁,δC). Thus, AA-CBR⪰,≁,δC is in principle applicable to cases characterised in any way, as sets of features or unstructured (Cocarascu et al. 2020). Here we will study a special, regular instance of AA-CBR⪰,≁,δC (which we refer to as AA-CBR⪰) in which “irrelevance” and the “default argument” are both defined in terms of “specificity” (and in particular the “default argument” is defined in terms of the “most specific” case). AA-CBR⪰ admits AA-CBR⊇ as an instance, obtained by choosing ⪰ = ⊇ and by restricting attention to “coherent” casebases (whereby there is no “noise”, in that no two cases with different outcomes are characterised by the same set of features).

Recently, abstract argumentation-based models of case-based reasoning (AA-CBR in short) have been proposed, originally inspired by the legal domain, but also applicable as classifiers in different scenarios, including image classification, sentiment analysis of text, and in predicting the passage of bills in the UK Parliament. However, the formal properties of AA-CBR as a reasoning system remain largely unexplored. In this paper, we focus on analysing the non-monotonicity properties of a regular version of AA-CBR (that we call AA-CBR⪰). Specifically, we prove that AA-CBR⪰ is not cautiously monotonic, a property frequently considered desirable in the literature of non-monotonic reasoning. We then define a variation of AA-CBR⪰ which is cautiously monotonic, and provide an algorithm for obtaining it. Further, we prove that such variation is equivalent to using AA-CBR⪰ with a restricted casebase consisting of all “surprising” cases in the original casebase.
1 Introduction Case-based reasoning (CBR) relies upon known solutions for problems (past cases) to infer solutions for unseen problems (new cases), based upon retrieving past cases which are “similar” to the new cases. It is widely used in legal settings (e.g. see (Prakken et al. 2015; Čyras, Satoh, and Toni 2016a)), for classification (e.g. via the k-NN algorithm) and, more recently, within the DEAr methodology (Cocarascu et al. 2020)) and for explanation (e.g. see (Nugent and Cunningham 2005; Kenny and Keane 2019; Cocarascu et al. 2020)). In this paper we focus on a recent approach to CBR based upon an argumentative reading of (past and new) cases (Čyras, Satoh, and Toni 2016a; Čyras, Satoh, and Toni 2016b; Cocarascu, Čyras, and Toni 2018; Čyras et al. 2019; Cocarascu et al. 2020), and using Abstract Argumentation (AA) (Dung 1995) as the underpinning machinery. In this paper, we will refer to all proposed incarnations of this approach in the literature generically as AA-CBR (the acronym used in the original paper (Čyras, Satoh, and Toni 2016a)): they all generate an AA framework from a CBR problem, with attacks from “more specific” past cases to “less specific” past cases or to a “default argument” (embedding a sort of bias), and attacks from new cases to ”irrelevant” past cases; then, they all reduce CBR to membership AA-CBR was originally inspired by the legal domain in (Čyras, Satoh, and Toni 2016a), but some incarnations of AA-CBR, integrating dynamic features, have proven useful in predicting and explaining the passage of bills in the UK Parliament (Čyras et al. 2019), and some instances of AA-CBR,6∼,δC have also shown to be fruitfully applicable as classifiers in a number of scenarios, including classification with categorical data, with images and for sentiment analysis of text (Cocarascu et al. 2020). In this paper we study non-monotonicity properties of AA-CBR understood at the same time as a reasoning system and as a classifier. 
These properties, typically considered for logical systems, intuitively characterise in which sense systems may stop inferring some conclusions when more information is made available to them (Makinson 1994). These properties are thus related to modelling inference which is tentative and defeasible, as opposed to the indefeasible form of inference of classical logic. Non-monotonicity properties have already been studied in argumentation systems, such as ABA and ABA+ (Čyras and Toni 2015; Čyras and Toni 2016), ASPIC+ (Dung 2014; Dung 2016) and logic-based argumentation systems (Hunter 2010).

[Example 1, continued] was considered guilty, represented as ({hm}, +). Consider now a new case ({hm, sd}, ?), with an unknown outcome, of a defendant who committed homicide, but for which it was proven that it was in self-defence (sd). In order to predict the new case's outcome by CBR, AA-CBR reduces the prediction problem to that of membership of the default argument in the grounded extension G (Dung 1995) of the AA framework in Figure 1: given that (∅, −) ∉ G, the predicted outcome is positive (i.e. guilty), disregarding sd and, indeed, no matter what other feature this case may have. Thus, up to this point, having the feature hm is a sufficient condition for predicting guilty. If, however, the court decides that for this new case the defendant should be acquitted, the case ({hm, sd}, −) enters our casebase. Now, having the feature hm is no longer a sufficient condition for predicting guilty, and any case with both hm and sd will be predicted a negative outcome (i.e. that the person is innocent). This is the case for predicting the outcome of a new case with again both hm and sd, in AA-CBR using the AA framework in Figure 2. Thus, adding a new case to the casebase removed some conclusions which were inferred from the previous, smaller casebase. This illustrates non-monotonicity.
In this paper, we study those properties for the application of argumentation to classification, in particular in the form of AA-CBR. The following example illustrates AA-CBR (and AA-CBR⊇ in particular) as well as its non-monotonicity, in a legal setting.

Figure 1 (arguments (∅, −), ({hm}, +) and ({hm, sd}, ?)): Initial AA framework for Example 1. Past cases (with their outcomes) and the new case (with no outcome, indicated by a question mark) are represented as arguments. AA-CBR predicts outcome + for the new case. (Grounded extension in colour.)

In this paper we prove that the kind of inference underpinning AA-CBR⪰ lacks a standard non-monotonicity property, namely cautious monotonicity. Intuitively this property means that if a conclusion is added to the set of premises (here, the casebase), then no conclusion is lost, that is, everything which was inferable still is so. In terms of a supervised classifier, satisfying cautious monotonicity culminates in being “closed” under self-supervision. That is, augmenting the dataset with conclusions inferred by the classifier itself does not change the classifier. Then, we make a two-fold contribution: we define (formally and algorithmically) a provably cautiously monotonic variant of AA-CBR⪰, that we call cAA-CBR⪰, and prove that it is equivalent to AA-CBR⪰ applied to a restricted casebase consisting of all “surprising” cases in the original casebase. We also show that the property of cautious monotonicity of cAA-CBR⪰ leads to the desirable properties of cumulativity and rational monotonicity. All results here presented are restricted to coherent casebases, in which no case characterisation (problem) occurs with more than one outcome (solution).

Figure 2 (arguments (∅, −), ({hm}, +), ({hm, sd}, −) and ({hm, sd}, ?)): Revised AA framework for Example 1. Here, the added past case changes the AA-CBR-predicted outcome to − by limiting the applicability of the previous past case. (Again, grounded extension in colour.)

Example 1.
Consider a simplified legal system built by cases and adhering, like most modern legal systems, to the principle by which, unless proven otherwise, no person is to be considered guilty of a crime. This can be represented by a “default argument” (∅, −), indicating that, in the absence of any information about any person, the legal system should infer a negative outcome − (that the person is not guilty). (∅, −) can be understood as an argument, in the AA sense, given that it is merely what is called a relative presumption, since it is open to proof to the contrary, e.g. by proving that the person did indeed commit a crime. Let us consider here one possible crime: homicide (hm).1 In one case, it was established that the defendant committed homicide, and he

2 Background

2.1 Abstract argumentation

An abstract argumentation framework (AF) (Dung 1995) is a pair (Args, ⇝), where Args is a set (of arguments) and ⇝ is a binary relation on Args. For α, β ∈ Args, if α ⇝ β, then we say that α attacks β and that α is an attacker of β. For a set of arguments E ⊆ Args and an argument α ∈ Args, E defends α if for all β ⇝ α there exists γ ∈ E such that γ ⇝ β. Then, the grounded extension of (Args, ⇝) can be constructed as G = ⋃_{i≥0} Gi, where G0 is the set of all unattacked arguments and, for all i ≥ 0, Gi+1 is the set of arguments that Gi defends. For any (Args, ⇝), the grounded extension G always exists and is unique and, if (Args, ⇝) is well-founded (Dung 1995), extensions under other semantics (e.g. stable extensions (Dung 1995), where E ⊆ Args is stable if ∄α, β ∈ E such that α ⇝ β and, moreover, ∀α ∈ Args \ E, ∃β ∈ E such that β ⇝ α) are equal to G. In particular, for finite AFs, (Args, ⇝) is well-founded iff it is acyclic.

A casebase D is coherent if there are no two cases (αC, αo), (βC, βo) ∈ D such that αC = βC but αo ≠ βo.

1 This is merely a hypothetical example, so the terms used do not correspond to a specific jurisdiction.
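The iterative construction of the grounded extension described above can be sketched directly (a hypothetical helper over a finite AF given as a set of arguments and a set of attack pairs):

```python
def grounded_extension(args, attacks):
    # G0 = unattacked arguments; G_{i+1} = arguments defended by G_i;
    # G is the union of all G_i (Dung 1995).
    def defended_by(s, a):
        # every attacker of a is counter-attacked by some member of s
        return all(any((g, b) in attacks for g in s)
                   for (b, target) in attacks if target == a)
    g = set()
    while True:
        nxt = {a for a in args if defended_by(g, a)}  # includes all unattacked arguments
        if nxt <= g:
            return g
        g |= nxt
```

For instance, with attacks a ⇝ b and b ⇝ c, the grounded extension is {a, c}: a is unattacked and defends c against b.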
Given (Args, ⇝), we will sometimes use α ∈ (Args, ⇝) to stand for α ∈ Args.

2.2 Non-monotonicity properties

We will be interested in the following properties.2 An arbitrary inference relation ⊢ (for a language including, in particular, sentences a, b, etc., with negations ¬a and ¬b, etc., and sets of sentences A, B) is said to satisfy:
1. non-monotonicity, iff A ⊢ a and A ⊆ B do not imply that B ⊢ a;
2. cautious monotonicity, iff A ⊢ a and A ⊢ b imply that A ∪ {a} ⊢ b;
3. cut, iff A ⊢ a and A ∪ {a} ⊢ b imply that A ⊢ b;
4. cumulativity, iff ⊢ is both cautiously monotonic and satisfies cut;
5. rational monotonicity, iff A ⊢ a and A ⊬ ¬b imply that A ∪ {b} ⊢ a;
6. completeness, iff either A ⊢ a or A ⊢ ¬a.

3 Setting the ground

For simplicity of notation, we sometimes extend the definition of ⪰ to X × Y, by setting (αC, αo) ⪰ (βC, βo) iff αC ⪰ βC.3

Definition 3 (Adapted from (Cocarascu et al. 2020)). The AF mined from a dataset D and a new case (NC, ?) is (Args, ⇝), in which:
• Args = D ∪ {(δC, δo)} ∪ {(NC, ?)};
• for (αC, αo), (βC, βo) ∈ D ∪ {(δC, δo)}, it holds that (αC, αo) ⇝ (βC, βo) iff
  1. αo ≠ βo,
  2. αC ⪰ βC, and
  3. ∄(γC, γo) ∈ D ∪ {(δC, δo)} with αC ≻ γC ≻ βC and γo = αo;
• for (βC, βo) ∈ D ∪ {(δC, δo)}, it holds that (NC, ?) ⇝ (βC, βo) iff (NC, ?) ≁ (βC, βo).

The AF mined from a dataset D alone is (Args′, ⇝′), with Args′ = Args \ {(NC, ?)} and ⇝′ = ⇝ ∩ (Args′ × Args′).

Note that if D is coherent, then the “equals” case in item 2 of the definition of attack will never apply. As a result, the AF mined from a coherent D (and any (NC, ?)) is guaranteed to be well-founded.

Definition 4 (Adapted from (Cocarascu et al. 2020)). Let G be the grounded extension of the AF mined from D and (NC, ?), with default argument (δC, δo). The outcome for NC is δo if (δC, δo) is in G, and δ̄o otherwise.

In this section we define AA-CBR⪰, adapting definitions from (Cocarascu et al. 2020).
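The attack relation of Definition 3 can be mirrored for the AA-CBR⊇ instance, where characterisations are sets of features and specificity is the superset relation. The sketch below is illustrative (hypothetical function name, coherent casebase assumed so that "more specific" becomes proper superset), not the authors' implementation:

```python
def mine_attacks(casebase, new_c, default):
    """Attack relation of the AF mined from a casebase and a new case,
    instantiated with specificity as the superset relation (AA-CBR with >=
    taken as superset). Cases are (frozenset_of_features, outcome) pairs."""
    past = set(casebase) | {default}
    atts = set()
    for (ac, ao) in past:
        for (bc, bo) in past:
            # attack: different outcomes, strictly more specific, and no
            # intermediate case with the attacker's outcome in between
            if ao != bo and ac > bc and not any(
                ac > gc > bc and go == ao for (gc, go) in past
            ):
                atts.add(((ac, ao), (bc, bo)))
    # the new case attacks past cases irrelevant to it
    newcase = (new_c, '?')
    for (bc, bo) in past:
        if not new_c >= bc:
            atts.add((newcase, (bc, bo)))
    return atts
```

On the casebase of Example 1, ({hm}, +) attacks the default argument (∅, −), and adding ({hm, sd}, −) introduces an attack on ({hm}, +), mirroring the change from Figure 1 to Figure 2.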
All incarnations of AA-CBR, including AA-CBR⪰, map a database D of examples labelled with an outcome and an unlabelled example (for which the outcome is unknown) into an AF. Here, the database may be understood as a casebase, the labelled examples as past cases and the unlabelled example as a new case: we will use these terminologies interchangeably throughout. In this paper, as in (Cocarascu et al. 2020), examples/cases have a characterisation (e.g., as in (Čyras, Satoh, and Toni 2016a), characterisations may be sets of features), and outcomes are chosen from two available ones, one of which is selected up-front as the default outcome. Finally, in the spirit of (Cocarascu et al. 2020), we assume that the set of characterisations of (past and new) cases is equipped with a partial order ⪯ (whereby α ≺ β holds if α ⪯ β and α ≠ β, and is read “α is less specific than β”) and with a relation ≁ (whereby α ≁ β is read as “β is irrelevant to α”).

In this paper we focus on a particular case of this scenario:

Definition 5. The AF mined from D alone and the AF mined from D and (NC, ?), with default argument (δC, δo), are regular when the following requirements are satisfied:
1. the irrelevance relation ≁ is defined as: x1 ≁ x2 iff x1 ⋡ x2, and
2. δC is the least element of X.4

This restriction connects the treatment of a characterisation αC as a new case and as a past case. We will see below that these conditions are necessary in order to satisfy desirable properties, such as Theorem 7. In the remainder, we will restrict attention to regular mined AFs. We will refer to the (regular) AF mined from D and (NC, ?), with default argument (δC, δo), as AF(D, NC), and to the (regular) AF mined from D alone as AF(D). Also, for short, given AF(D, NC), with default argument (δC, δo), we will refer to the outcome for NC

Formally:

Definition 2 (Adapted from (Cocarascu et al. 2020)).
Let X be a set of characterisations, equipped with a partial order ⪯ and a binary relation ≁. Let Y = {δo, δ̄o} be the set of (all possible) outcomes, with δo the default outcome. Then, a casebase D is a finite set such that D ⊆ X × Y (thus a past case α ∈ D is of the form (αC, αo) for αC ∈ X and αo ∈ Y) and a new case is of the form (NC, ?) for NC ∈ X. We also discriminate a particular element δC ∈ X and define the default argument (δC, δo) ∈ X × Y.

2 We are mostly following the treatment of Makinson (1994).
3 In (Cocarascu et al. 2020) ⪯ was directly given over X × Y. Note that, in X × Y, anti-symmetry may fail for two cases with different outcomes but the same characterisation, if D is not coherent, and thus ⪯ is merely a preorder on X × Y. When we are restricted to a coherent D, we can guarantee it is a partial order.
4 Indeed this is not a strong condition, since it can be proved that if αC ⋡ δC then all cases (αC, αo) in the casebase could be removed, as they would never change an outcome. On the other hand, assuming also the first condition in Definition 5, if (αC, ?) is the new case and αC ⋡ δC, then the outcome is δ̄o necessarily.

as AA-CBR⪰(D, NC).5
(b) For the inductive step, let us assume that the property holds for a generic Gi , and let us prove it for Gi+1 . Let β = (βC , βo ) ∈ Gi+1 \ Gi (if β ∈ Gi , the property holds by the induction hypothesis). (NC , ?) does not attack β, as otherwise β would not be defended by Gi , as Gi is conflict-free. Thus, once again, as β is not a nearest case, there is a nearest case α = (αC , αo ) such that βC ≺ αC . Again, assume that βo 6= o. Then let Γ = {γ ∈ Args | γ = (γC , γo ), βC ≺ γC  αC and γo = o}, with η a -minimal element of Γ. Then η attacks β. However, as Gi defends β, there is then θ ∈ Gi such that θ attacks η. By inductive hypothesis, θ is either (NC , ?) or θ = (θC , o). The first option is not possible, as η ∈ Γ, and thus ηC  αC , and of course αC  NC . Thus, ηC  NC and is thus not attacked by (NC , ?). This means that (θC , o) attacks η = (ηC , ηo ). But this is absurd as well, as η ∈ Γ and thus ηo = o = θo . Therefore, our assumption that βo 6= o was false, that is, βo = o, as required. 2. If o = δ¯o , the default argument (δC , δo ) is not in G, since we have just proven that all arguments in G other than (NC , ?) have outcome o. 3. If o = δo , then let β be an attacker of (δC , δo ), and thus of the form β = (βC , δ¯o ) (again see how regularity is necessary, since otherwise (NC , ?) could be the attacker). β is not in G and, since G is also a stable extension, some argument in G attacks β. This is true for any attacker β of the default argument, and thus the default argument is defended by G. As G contains every argument it defends, the default argument is in the grounded extension, confirming that the outcome for NC is δo . Agreement with nearest cases. Our first property regards the predictions of AA-CBR in relation to the “most similar” (or nearest) cases to the new case, when these nearest cases all agree on an outcome. 
This property generalises (Čyras, Satoh, and Toni 2016a, Proposition 2) in two ways: by considering the entire set of nearest cases, instead of requiring a unique nearest case, for AA-CBR , instead of its instance AA-CBR⊇ . As in (Čyras, Satoh, and Toni 2016a), we prove this property for coherent casebases. We first define the notion of nearest case. Definition 6. A case (αC , αo ) ∈ D is nearest to NC iff αC  NC and it is maximally so, that is, there is no (βC , βo ) ∈ D such that αC ≺ βC  NC . Theorem 7. If D is coherent and every nearest case to NC is of the form (αC , o) for some outcome o ∈ Y (that is, all nearest cases to the new case agree on the same outcome), then AA-CBR (D, NC ) = o (that is, the outcome for NC is o). Proof. Let G be the grounded extension of AF (D, NC ). An outline of the proof is as follows: 1. We will first prove that each argument in G is either (NC , ?) or of the form (βC , o) (that is, agreeing in outcome with all nearest cases). 2. Then we will prove that if o = δ¯o (that is, o is the non-default outcome), then (δC , δo ) 6∈ G (and thus AA-CBR (D, NC ) = δ¯o , as envisaged by the theorem). 3. Finally, by using the fact that AF (D, NC ) is wellfounded (given that D is coherent), and thus G is also stable, we will prove that if o = δo (that is, o is the default outcome), then (δC , δo ) ∈ G (and thus AA-CBR (D, NC ) = δo , as envisaged by the theorem). Addition of new cases. The next result characterises the set of past cases/arguments attacked when the dataset is extended with a new labelled case/argument. In particular, this result compares the effect of predicting the outcome of some N2 from D alone and from D extended with (N1 , o1 ), when there is no case in D with characterisation N1 already and moreover D is coherent. 
This result will be used later in the paper and is interesting in its own right as it shows that, any argument attacked by the “newly added” case (N1 , o1 ) is easily identified in the sets G0 and G1 in the grounded extension G, being sufficient to check those rather than the entire casebase D. We will now prove 1-3. S 1. By definition G = i>0 Gi . We prove by induction that, for every i, each argument in Gi is either (NC , ?) or of the form (βC , o). Then, given that each element of G belongs to some Gi , the property holds for G. (a) For the base case, consider G0 . (NC , ?) and all nearest cases are unattacked, and thus in G0 (notice how this requires the AF to be regular, otherwise nearest cases could be irrelevant). G0 may however contain further unattacked cases. Let β = (βC , βo ) be such a case. If NC 6 βC , then (δC , δo ) 6∼ β and thus (NC , ?) attacks β, contradicting that β in unattacked. So βC  NC . As β is not a nearest case, there is a nearest case Lemma 8. Let D be coherent, N1 , N2 ∈ X, o1 ∈ Y , and suppose that there is no case in D with char6 Note that η is guaranteed to exist, as Γ is non-empty and otherwise we would be able to build an arbitrarily long chain of (distinct) arguments, decreasing w.r.t. ≺. However this would allow a chain with more elements than the cardinality of Γ, which is absurd. 5 Note that we omit to indicate in the notations the default argument (δC , δo ), and leave it implicit instead for readability. 113 relevant to NC , and thus α  NC , which in turn implies that α ∈ D2 , since D1NC = D2NC . On the other hand, as α 6∈ G20 , there is a case β ∈ AF2 such that β α. However, α 6∈ AF1 , otherwise α would be attacked in AF1 and thus not in G10 . But then, since D1NC = D2NC , this means that β 6 NC . Finally, this means that (NC , ?) β, and thus G20 defends it. Therefore, β ∈ G21 , what we wanted to prove. • For the induction step, from j to j + 1: Again, if G1j+1 ⊆ G2j+1 , we are done. If not, there is a α ∈ G1j+1 \ G2j+1 . 
Again we can check that this implies that α ∈ D2 . Now, since α ∈ G1j+1 , then G1j defends it. But now, by inductive hypothesis, G1j ⊆ G2j+1 . Therefore, G2j+1 also defends α, which implies that α ∈ G2j+2 ,as we wanted.7 This concludes the induction. acterisation N1 . Consider AF1 = AF (D, N1 ) and AF2 = AF (D ∪ {(N1 , o1 )}, N2 ). Finally, let G(AF1 )and G(AF2 ) be the respective grounded extensions. Let β ∈ D be such that (N1 , o1 ) β in AF2 . Then, 1. for every γ that attacks β in AF1 , N1 6∼ γ (that is, γ is irrelevant to N1 and, by regularity, N1 6 γ); 2. in AF1 , (N1 , ?) defends β; S 3. β ∈ G(AF1 ) and, for G(AF1 ) = i>0 Gi , β is either in G0 (that it, it is unattacked), or in G1 . 4. For every θ = (θC , θo ) ∈ D such that (N1 , ?) defends θ in AF1 , if θo 6= o1 , then, in AF2 , (N1 , o1 ) θ. Proof. 1. Let β = (βC , βo ). From the definition of attack: (i) N1 ≻ βC , (ii) o1 6= βo , and (iii) there is no (αC , xo ) such that xo = o1 and N1 ≻ αC ≻ βC . Consider η = (ηC , ηo ) such that η attacks β in AF1 (if there is no such η then the result trivially holds). Assume by contradiction that η is relevant to N1 . Then by regularity N1  ηC . But since D is coherent and (N1 , o1 ) 6∈ D, η and N1 are distinct, and thus N1 ≻ ηC . As η attacks β, ηo 6= βo , but this in turn implies that ηo = o1 , since (N1 , o1 ) also attacks β, in AF2 . But then N1 ≻ ηC ≻ βC , with ηo = o1 . This contradicts requirement 3 in the second bullet of Definition 3 of the attack between (N1 , o1 ) and β. Therefore, η is not relevant to N1 , as we wanted to prove. 2. Trivially true, by 1 (as, if η is an attacker β, then N1 6∼ η; but then (N1 , ?) η). 3. Trivially true, by 2. 4. Since (N1 , ?) defends θ in AF1 , then any attacker η of θ is irrelevant to N1 , and by regularity, N1 6 η. Thus requirement 3 in the second bullet of Definition 3 is satisfied. Requirement 1 is the hypothesis and requirement 2 is satisfied since (N1 , ?) defends θ in AF1 . 
To conclude, we can now see that G1 = G2 since, once more without loss of generality, if we consider α ∈ G1 , by definition of G1 there is a j such that α ∈ G1j . But since G1j ⊆ G2j+1 , α ∈ G2 . This proves that G1 ⊆ G2 . The converse can be proven analogously. 4 Non-monotonicity analysis of classifiers In this section we provide a generic analysis of the non-monotonicity properties of data-driven classifiers, using D, X and Y to denote generic inputs and outputs of classifiers, admitting our casebases, characterisations and outcomes as special instances. Later in the paper, we will apply this analysis to AA-CBR and our modification thereof. Typically, a classifier can be understood as a function from an input set X to an output set Y . In machine learning, classifiers are obtained by training with an initial, finite D ⊆ (X × Y ), called the training set. In (any form of) AA-CBR, D can also be seen as a training set of sorts. Thus, we will characterise a classifier as a two-argument function C that maps a dataset D ⊆ (X × Y ) and a new input x ∈ X to a prediction y ∈ Y .8 Notice that this function is total, in line with the common assumption that classifiers generalise beyond their training dataset. Let us model directly the relationship between the dataset D and the predictions made via the classifier as an inference system in the following way: Coinciding predictions. The last result (also used later in the paper) identifies a “core” in the casebase for the purposes of outcome prediction: this amounts to all past cases that are less (or equally) specific than the new case for which the prediction is sought. In other words, irrelevant cases in the casebase do not affect the prediction in regular AFs. Lemma 9. Let D1 and D2 be two datasets. Let NC ∈ X be a characterisation, and DiNC = {α ∈ Di | α ⪯ NC } for i = 1, 2. If D1NC = D2NC , then AA-CBR (D1 , NC ) = AA-CBR (D2 , NC ) (that is, AA-CBR predicts the same outcome for NC given the two datasets).
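Lemma 9 can be pictured by instantiating the partial order as set inclusion over feature sets: only the "core" of cases at most as specific as the new characterisation can influence the prediction. A minimal sketch (the function name `core` and the subset-based order are illustrative, not from the paper):

```python
# Sketch of the "core" D^{N_C} from Lemma 9, taking the partial order ⪯ to be
# set inclusion over feature sets (an illustrative instantiation).

def core(dataset, new_case):
    """Cases at most as specific as the new characterisation."""
    return {(c, y) for (c, y) in dataset if c <= new_case}

# Two datasets differing only in a case irrelevant to N share the same core,
# so by Lemma 9 AA-CBR gives them the same prediction for N.
D1 = {(frozenset('a'), '+'), (frozenset('ab'), '-'), (frozenset('xy'), '+')}
D2 = {(frozenset('a'), '+'), (frozenset('ab'), '-')}
N = frozenset('abc')
assert core(D1, N) == core(D2, N)
```

Here ({x, y}, +) is irrelevant to N, so dropping it cannot change the outcome, exactly as the lemma states.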
Definition 10. Given a classifier C : 2^(X×Y) × X → Y , let L = L+ ∪ L− be a language consisting of atoms L+ = X × Y and negative sentences L− = {¬(x, y) | (x, y) ∈ X × Y }. Then, ⊢C is an inference relation from 2^(L+) to L such that Proof. For i = 1, 2, let AFi = AF (Di , NC ) and the grounded extensions be Gi = ∪j≥0 Gij . We will prove that ∀j : G1j ⊆ G2j+1 and G2j ⊆ G1j+1 , and this allows us to prove that G1 = G2 , which in turn implies the outcomes are the same. Here we consider only G1j ⊆ G2j+1 , as the other case is entirely symmetric. By induction on j: 7 In abstract argumentation it can be verified that, if E ⊆ Args defends an argument γ, and E ⊆ E ′ , then E ′ also defends γ. 8 Notice that this understanding relies upon the assumption that classifiers are deterministic. Of course this is not the case for many machine learning models, e.g. artificial neural networks trained using stochastic gradient descent and randomised hyperparameter search. This understanding is however in line with recent work using decision functions as approximations of classifiers whose output needs explaining (e.g. see (Shih, Choi, and Darwiche 2019)). Moreover, it works well when analysing AA-CBR . • For the base case j = 0: If G10 ⊆ G20 , we are done, since we always have that Gij ⊆ Gij+1 . If not, there is an α ∈ G10 \ G20 . Since α ∈ G10 , it is • D ⊢C (x, y) iff C(D, x) = y; • D ⊢C ¬(x, y) iff there is a y ′ such that C(D, x) = y ′ and y ′ ≠ y.9
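The two bullets of Definition 10 can be transcribed almost directly, wrapping an arbitrary total classifier into an inference relation. A sketch under illustrative assumptions: `entails` and `toy_classifier` are our names, and the majority-vote predictor is only a stand-in for a generic classifier C, not AA-CBR itself:

```python
# |-_C from Definition 10: a positive literal holds when C predicts it;
# a negative literal holds by the closed-world reading (C predicts another outcome).

def entails(classifier, D, literal):
    positive, (x, y) = literal
    prediction = classifier(frozenset(D), x)
    return prediction == y if positive else prediction != y

def toy_classifier(D, x, default='-'):
    # majority outcome among stored cases whose characterisation is
    # contained in x; a stand-in for any total classifier
    outcomes = [y for (c, y) in D if c <= x]
    return max(set(outcomes), key=outcomes.count) if outcomes else default

D = {(frozenset('a'), '+'), (frozenset('ab'), '-')}
assert entails(toy_classifier, D, (True, (frozenset('a'), '+')))   # D |- (a, +)
assert entails(toy_classifier, D, (False, (frozenset('q'), '+')))  # D |- ¬(q, +)
```

Completeness and consistency (items 1-2 of Theorem 11 below) are visible here: since the classifier is a total function, exactly one of the two literals for any (x, y) is entailed.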
⊢C is cautiously monotonic iff it is cumulative. 5. ⊢C is cautiously monotonic iff it satisfies rational monotonicity. Figure 3: AF (D), given (δC , δo ) = (∅, −), for the proof of Theorem 12. Proof. 1. By definition of ⊢C , directly from the totality of C. 2. By definition of ⊢C , since C is a function. 3. Let ⊢C be cautiously monotonic, D ⊢C p and D ∪ {p} ⊢C q, for p, q ∈ L. By completeness, either D ⊢C q or D ⊢C ¬q (here ¬q = r if q = ¬r, and ¬r if q = r). In the first case we are done. Suppose the second case holds. Since D ⊢C p, by cautious monotonicity D ∪ {p} ⊢C ¬q. But then D ⊢C q and D ⊢C ¬q, which is absurd since ⊢C is consistent. Therefore D ⊬C ¬q, and then D ⊢C q. The converse can be proven analogously. 4. Trivial from 3. 5. Since ⊢C is complete, D ⊬C ¬p implies D ⊢C p, and thus rational monotonicity reduces to cautious monotonicity. 5 Cautious monotonicity in AA-CBR Our first main result is about (lack of) cautious monotonicity of the inference relation drawn from the classifier AA-CBR (D, NC ). Theorem 12. ⊢AA-CBR is not cautiously monotonic. Figure 4: AF (D, N1 ) for the proof of Theorem 12, with the grounded extension coloured. Proof. We will show a counterexample, instantiating in the following way: X = 2^{a,b,c,z} , Y = {−, +}, and ⪯ = ⊇. Define D = {({a}, +), ({c}, +), ({a, b}, −), ({c, z}, −)} and (δC , δo ) = (∅, −), from which AF (D) in Figure 3 is obtained, and two new cases: N1 = {a, b, c} and N2 = {a, b, c, z}. Let us now consider AA-CBR (D, N1 ) and AA-CBR (D, N2 ). We can see in Figure 4 that D ⊢AA-CBR (N1 , +) and in Figure 5 that D ⊢AA-CBR (N2 , −). Now, finally, let us consider AF (D ∪ {(N1 , +)}, N2 ) in Figure 6. We can then conclude that D ∪ {(N1 , +)} ⊢AA-CBR (N2 , +) even though D ⊢AA-CBR (N2 , −), as required.
9 We could equivalently have defined D ⊢C ¬(x, y) iff C(D, x) ≠ y. We have not done so as the definition used can be generalised to a scenario in which C is not necessarily a total function. This scenario is left for future work. Figure 5: AF (D, N2 ) for the proof of Theorem 12, with the grounded extension coloured. question was judged as expected by the case law, and it may seem strange that the order in which this happens may affect the case in the second question. The example above aims only to illustrate an interpretation in which the way AA-CBR operates does not seem appropriate. Whether this behaviour of AA-CBR⊇ in particular is desirable or not depends on other elements such as the interrelation between features (in general, for AA-CBR , between the characterisations and the partial order). 6 A cumulative AA-CBR We will now present cAA-CBR , a novel, cumulative incarnation of AA-CBR which satisfies cautious monotonicity. Figure 6: AF (D ∪ {(N1 , +)}, N2 ) for the proof of Theorem 12, with the grounded extension coloured. Preliminaries. Firstly, let us present some general notions, defined in terms of the ⊢C inference relation from an arbitrary classifier C. Intuitively, we are after a relation ⊢′C such that if D ⊢C c and D ⊢C d, then D ∪ {c} ⊢′C d (in our concrete setting, ⊢C = ⊢AA-CBR and ⊢′C = ⊢cAA-CBR ). We also want the property that, whenever D is “well-behaved” (in a sense to be made precise later), D ⊢C s iff D ⊢′C s. In this way, given that D ⊢′C c and D ⊢′C d, we would conclude D ∪ {c} ⊢′C d, making ⊢′C a cautiously monotonic relation. We will define ⊢′C by building a subset of the original dataset in such a way that cautious monotonicity is preserved.
We start with the following notion of (un)surprising examples: Note that the proof of Theorem 12 shows that the inference relation drawn from the original form of AA-CBR (that is, AA-CBR⊇ ) is also non-cautiously monotonic, given that the counterexample in the proof is also obtained by using AA-CBR⊇ . This counterexample amounts to an expansion of Example 1, as follows. Example 13. (Example 1 continued) Consider now that a different type of crime happened: publicly offending someone’s honour, which we will call defamation (df ). In one case, it was established that the defendant did publicly damage someone’s honour, and was considered guilty ({df }, +). In a subsequent case, even if proven that the defendant did hurt someone’s honour, it was established that this was done by a true allegation (the truth defence), and thus the case was dismissed, represented as ({df, td}, −). What happens, then, if the same defendant is: Definition 14. An example (x, y) ∈ X × Y is unsurprising (or not surprising) w.r.t. D iff D \ {(x, y)} ⊢C (x, y). Otherwise, (x, y) is called surprising. We then define the notion of a concise (subset of the) dataset, amounting to surprising cases only w.r.t. the dataset: 1. simultaneously proven guilty of homicide and of defamation, but shown to have committed the homicide in self-defence (({hm, df, sd}, ?))? 2. simultaneously proven guilty of homicide and of defamation, shown to have committed the homicide in self-defence, and also shown to have committed defamation by a true allegation (({hm, df, sd, td}, ?))? Definition 15. Let S ⊆ X × Y be a dataset, S ′ ⊆ S, and let ϕ(S ′ ) = {(x, y) ∈ S | (x, y) is surprising w.r.t. S ′ }. Then S ′ is concise w.r.t. S whenever it is a fixed point of ϕ, that is, ϕ(S ′ ) = S ′ . To illustrate this notion in the context of AA-CBR, consider the dataset S from which the AF in Figure 6 is drawn. S is not concise w.r.t. itself, since ({a, b, c}, +) is unsurprising w.r.t.
S (indeed, S \ {({a, b, c}, +)} ⊢AA-CBR ({a, b, c}, +), see Figure 4). Also, S ′ = S \ {({a, b}, −), ({a, b, c}, +)} is not concise either (w.r.t. S), as ({a, b}, −) is surprising w.r.t. S ′ (the predicted outcome being +), but not an element of S ′ . The only concise subset of S in this example is thus S ′′ = S \ {({a, b, c}, +)}. Let us now consider D′ ⊆ D, for D the dataset underpinning our ⊢C . If D′ is concise w.r.t. D, (x, y) ∈ (X × Y ) \ D is an example not already in D, and D′ ⊢C (x, y), then (x, y) is unsurprising w.r.t. D′ , and thus D′ is still concise w.r.t. D ∪ {(x, y)}. Now, suppose that there is exactly one such concise D′ ⊆ D w.r.t. D (let us refer to this subset simply as concise(D)). Then, it seems attractive to define ⊢′C as: D ⊢′C (x, y) iff concise(D) ⊢C (x, y). Such a ⊢′C We can map this to our counterexample in Theorem 12 by setting a = hm, b = sd, c = df , and z = td. The first question is answered by the AF represented in Figure 4, with outcome +, that is, the defendant is considered guilty. What we show in the proof of Theorem 12, given this interpretation of the counterexample, is that the answer to the second question in AA-CBR would depend on whether the case in the first question was already judged or not. If not, then the cases ({hm, sd}, −) and ({df, td}, −) would be the nearest cases, and the outcome would be −, that is, not guilty. However, if the case in the first question was already judged and incorporated into the case law, it would serve as a counterargument for ({hm, sd}, −), and guarantee that the outcome is +, that is, guilty. Intuitively this seems strange, and we focus on one reason for that: the case in the first for ({c}, ?). Thus, every argument in stratum1 is surprising, and all are thus included in the next AF, resulting in D1 = {({a}, +), ({c}, +)} and AF1 = AF (D1 ). Now, the second stratum is stratum2 = {({a, b}, −), ({c, z}, −)}. We can verify that AA-CBR (D1 , ({a, b}, ?)) = + and AA-CBR (D1 , ({c, z}, ?)) = +.
Thus ({a, b}, −) and ({c, z}, −) are both surprising, and thus included in the next step, that is, D2 = D1 ∪ {({a, b}, −), ({c, z}, −)}, and AF2 = AF (D2 ). Finally, stratum3 = {({a, b, c}, +)}. Now we verify that AA-CBR (D2 , ({a, b, c}, ?)) = +, which means that ({a, b, c}, +) is unsurprising. Therefore it is not added to the argumentation framework, that is, D3 = D2 and thus AF3 = AF (D3 ) = AF (D2 ) = AF2 . Now unprocessed = ∅, the selected subset is D3 , with corresponding AF (D3 ) = AF3 , and we are done. We can check that, using cAA-CBR , the counterexample in the proof of Theorem 12 would fail, since ({a, b, c}, +) would not have been added to the AF. inference relation would then be cautiously monotonic if concise(D) = concise(D ∪ {(x, y)}). This identity is indeed guaranteed given that a concise subset of D is still a concise subset of D ∪ {(x, y)}, and given our assumption that there is a unique concise subset of D. In the remainder of this section we will prove uniqueness and (constructively) existence of concise(D) in the case of AA-CBR . Uniqueness of concise subsets in AA-CBR . Theorem 16. Given a coherent dataset D, if there exists a concise D′ ⊆ D w.r.t. D then D′ is unique. Proof. By contradiction, let D′′ be a concise subset of D distinct from D′ . Let then (x, y) ∈ (D′ \ D′′ ) ∪ (D′′ \ D′ ) be such that (x, y) is ⪯-minimal in this set. Then the sets {(x′ , y ′ ) ∈ D′ | (x′ , y ′ ) ≺ (x, y)} and {(x′ , y ′ ) ∈ D′′ | (x′ , y ′ ) ≺ (x, y)} are equal, otherwise (x, y) would not be minimal. But then, since D is coherent, by Lemma 9 we can conclude that D′ \ {(x, y)} ⊢AA-CBR (x, y) iff D′′ \ {(x, y)} ⊢AA-CBR (x, y). Thus, (x, y) is surprising w.r.t. both D′ and D′′ or w.r.t. neither. But since it is an element of one but not the other, one of them is either missing a surprising element or contains a non-surprising element. Such a set is not concise, contradicting our initial assumption.
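Definitions 14-15 translate almost verbatim into code. In the sketch below, the most-specific-case `predict` function is only an illustrative stand-in for ⊢AA-CBR, and all names are ours:

```python
# An example is surprising w.r.t. S when S minus the example fails to predict
# it (Definition 14); S' is concise w.r.t. S when it is a fixed point of phi,
# the surprising-examples operator (Definition 15).

def predict(S, x, default='-'):
    # most specific stored case c ⊆ x wins; an illustrative stand-in classifier
    cands = [(c, y) for (c, y) in S if c <= x]
    return max(cands, key=lambda cy: len(cy[0]))[1] if cands else default

def surprising(S, example):
    x, y = example
    return predict(set(S) - {example}, x) != y

def is_concise(D, S):
    phi = {ex for ex in D if surprising(S, ex)}
    return phi == S

D = {(frozenset('a'), '+'), (frozenset('ab'), '-')}
assert is_concise(D, D)                      # D is its own fixed point
D2 = D | {(frozenset('ac'), '+')}            # (ac, +) is already predicted,
assert not is_concise(D2, D2)                # so D2 is not concise w.r.t. itself
assert is_concise(D2, D)                     # the concise subset drops (ac, +)
```

This mirrors the discussion of the dataset S above: a case the rest of the dataset already predicts is exactly what the fixed point of ϕ excludes.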
Notice that we could have defined the algorithm equivalently by looking at cases one-by-one rather than grouping them in strata. However, using strata has the advantage of allowing for parallel testing of new cases. Theorem 18 (Convergence). Algorithm 2 converges. Existence of concise subsets in AA-CBR . We have proven that concise(D) is unique, if it exists. Here we prove that existence is guaranteed too. We do so constructively, and by doing so we also prove that our approach is practical, giving as we do so a (reasonable) algorithm that finds the concise subset of D. The main idea behind the algorithm is simple: we start with the default argument, and progressively build the argumentation framework by adding cases from D following the partial order ⪯. Before adding a past case, we test whether it is surprising or not w.r.t. the dataset underpinning the current AF: if it is, then it is added; otherwise, it is not added. More specifically, the algorithm works with strata over D, along ⪯. In the simplest setting where each stratum is a singleton, the algorithm works as follows: starting with D0 = {(δC , δo )} and the entire dataset D = {di }i∈{1,...,|D|} unprocessed, at each step i + 1, we obtain either Di+1 = Di ∪ {di+1 }, if di+1 is surprising w.r.t. Di , or Di+1 = Di otherwise. Then D̂ = D|D| ⊆ D is the result of the algorithm. In the general case, each example of the current stratum is tested for “surprise”, and only the surprising examples are added to Di . The procedure is formally stated in Algorithm 2, using in turn Algorithm 1. We illustrate the application of the algorithms next. Proof. Obvious, since at each iteration of the while loop, the variable stratum is assigned a non-empty set, due to the fact that unprocessed is always a finite set, and thus there is always at least one minimal element. Thus, the cardinality of unprocessed is reduced by at least 1 at each loop iteration, which guarantees that it will eventually become empty.
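The stratified loop just described can be sketched generically over any predictor. Below, `predict` is the same illustrative most-specific-case stand-in used earlier, not AA-CBR inference itself, and `learn_concise` is our name for the Algorithm 2 skeleton:

```python
# Sketch of the stratified loop of Algorithm 2: process ⪯-minimal strata of
# unprocessed cases and keep only the surprising ones, with ⪯ taken as ⊆.

def predict(S, x, default='-'):
    cands = [(c, y) for (c, y) in S if c <= x]            # c ⪯ x as c ⊆ x
    return max(cands, key=lambda cy: len(cy[0]))[1] if cands else default

def learn_concise(dataset):
    unprocessed = set(dataset)
    current = set()                       # default argument left implicit
    while unprocessed:
        stratum = {(c, y) for (c, y) in unprocessed
                   if not any(c2 < c for (c2, _) in unprocessed)}
        unprocessed -= stratum
        # every case of a stratum is tested against the same current subset,
        # so these tests could run in parallel
        current |= {(c, y) for (c, y) in stratum if predict(current, c) != y}
    return current

D = {(frozenset('a'), '+'), (frozenset('ab'), '-'), (frozenset('ac'), '+')}
assert learn_concise(D) == {(frozenset('a'), '+'), (frozenset('ab'), '-')}
```

In the tiny run above, ({a, c}, +) is already predicted once ({a}, +) is in the current subset, so it is unsurprising and never added, which is the mechanism the convergence and correctness results rely on.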
Theorem 19 (Correctness of Algorithm 1). Every execution of simple add((Args, ⇝), next case) (Algorithm 1) in Algorithm 2 correctly returns AF (Args ∪ {next case}). Proof (sketch). This is essentially a consequence of Lemma 8. We know that there will never be an argument in Args with the same characterisation as next case, since they would occur in the same stratum, thus the lemma applies. The lemma guarantees that Algorithm 1 adds all attacks that need to be added and only those. Finally, we need to check that it will never be necessary to remove an attack. This is true due to requirement 3 in the second bullet of Definition 3, and since arguments are added following the partial order. Therefore the only modifications to the set of attacks are the ones in simple add. Example 17. Once more consider the dataset D = {({a}, +), ({c}, +), ({a, b}, −), ({c, z}, −), ({a, b, c}, +)} in Figure 6, as well as the definitions used in that example for X, Y , (δC , δo ) and ⪯. Let us examine the application of Algorithm 2 to it. We start with an AF consisting only of (δC , δo ), that is, D0 = ∅, AF0 = AF (D0 ) = AF (∅) = ({(∅, −)}, ∅). The first stratum is stratum1 = {({a}, +), ({c}, +)}. Of course, then, we have AA-CBR ({(∅, −)}, ({a}, ?)) = −, and similarly Theorem 20 (Correctness of Algorithm 2). If the input dataset is coherent, then the dataset underpinning the AF resulting from Algorithm 2 is concise. Proof (sketch). In order to prove that, for the returned Argscurrent , Argscurrent \ {(δC , δo )} is concise, we just need to prove that at the end of each loop Argscurrent \ {(δC , δo )} is concise w.r.t. the set of all seen examples. Algorithm 1: simple add algorithm for AA-CBR . Input: An AA-CBR framework (Args, ⇝) and a case n = (nC , no ) Output: A new AA-CBR framework (Args′ , ⇝′ ) DEF ←− {(x, y) ∈ AF (Args, nC ) | (x, y) ≠ (nC , ?) and (nC , ?)
defends (x, y) in AF (Args, nC )} ; Args′ ←− Args ∪ {n} ; ⇝′ ←− (⇝ ∪ {(n, a) | a = (aC , ao ), a ∈ DEF, and ao ≠ no }) ; return (Args′ , ⇝′ ) Algorithm 2: Setup/learning algorithm for cAA-CBR . Input: A dataset D Output: An AF cAA-CBR (D) unprocessed ←− D ; Argscurrent ←− {(δC , δo )} ; ⇝current ←− ∅ ; while unprocessed ≠ ∅ do stratum ←− {(x, y) ∈ unprocessed | (x, y) is ⪯-minimal in unprocessed} ; unprocessed ←− unprocessed \ stratum ; to add ←− ∅ ; for next case ∈ stratum do (case characterisation, case outcome) ←− next case ; if the outcome for case characterisation w.r.t. (Argscurrent , ⇝current ) is not case outcome then to add ←− to add ∪ {next case} ; end end for next case ∈ to add do (Argscurrent , ⇝current ) ←− simple add((Argscurrent , ⇝current ), next case) ; end end return (Argscurrent , ⇝current ) checking whether the next case is surprising or not, thus we could optimise its implementation with the use of caching. Besides, the subset of minimal cases (that is, the stratum) can be extracted efficiently by representing the partial order as a directed acyclic graph and traversing this graph. Finally, as mentioned before, the order in which the cases in the same stratum are added does not affect the outcome. Thus, each case in the same stratum can be safely tested for surprise in parallel. As the base case, before the loop is entered, this is clearly the case, as the only seen argument is the default. As the induction step, we know that every case previously added is still surprising, since the new cases added are not smaller than them according to the partial order, and thus by Lemma 9 their prediction is not changed, that is, they keep being surprising. The same is true for every case previously not added: adding more cases afterwards does not change their prediction. For the cases added at this new iteration, by definition the surprising ones are added and the unsurprising ones are not.
Regarding the order in which cases of the same stratum are added, each of the surprising cases will be included and the unsurprising ones will not be. It can be seen that the order is irrelevant since, as they are all ⪯-minimal and the dataset is coherent, they are incomparable, so each case in the stratum is irrelevant with respect to the others. Thus, for every case seen until this point, it is in the AF iff it is surprising. As this is true for every iteration, it is true for the final, returned AF. cAA-CBR . All theorems in this section so far lead to the following corollary: Corollary 21. Given a coherent dataset D, the dataset underpinning the AF resulting from Algorithm 2 is the unique concise D′ ⊆ D, w.r.t. D. To conclude, we can then define inference in cAA-CBR , the classifier yielded by the strategy described until now: Definition 22. Let D be a coherent dataset and let concise(D) be the unique concise subset of D, w.r.t. D. Let cAF (D, NC ) be the AF mined from concise(D) and (NC , ?), with default argument (δC , δo ). Then, cAA-CBR (D, NC ) stands for the outcome for NC , given cAF (D, NC ). A full complexity analysis of the algorithm is outside the scope of this paper. However, notice here that the algorithm refrains from building the AF from scratch each time a new case is considered, as seen in Theorem 19. Still regarding Algorithm 1, notice that it is easy to compute the set DEF while Thus, we directly obtain the inference relation ⊢cAA-CBR . Then, cAA-CBR amounts to the form of AA-CBR using this inference relation. It is easy to see, in line with the discussion before Theorem 16, and using the results of Theorem 11, that cAA-CBR satisfies several non-monotonicity properties, as follows: Theorem 23. ⊢cAA-CBR is cautiously monotonic and also satisfies cut, cumulativity, and rational monotonicity.
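The mechanism behind Theorem 23 can be illustrated in miniature (again with the illustrative most-specific-case predictor standing in for AA-CBR inference): a case the concise subset already predicts is unsurprising by Definition 14, so adding it to D leaves the concise subset, and hence every inference, unchanged:

```python
# A tiny illustration of why inference through the concise subset is
# cautiously monotonic; predict is an illustrative stand-in classifier.

def predict(S, x, default='-'):
    cands = [(c, y) for (c, y) in S if c <= x]
    return max(cands, key=lambda cy: len(cy[0]))[1] if cands else default

concise_D = {(frozenset('a'), '+'), (frozenset('ab'), '-')}
new = (frozenset('abc'), '-')

# D |-' (abc, -): the concise subset predicts it ...
assert predict(concise_D, new[0]) == new[1]
# ... so (abc, -) is unsurprising w.r.t. concise_D (Definition 14's test),
# and adding it to D would never enlarge the concise subset:
assert predict(set(concise_D) - {new}, new[0]) == new[1]
```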
7 European Conference on Artificial Intelligence, 18-22 August 2014, Prague, Czech Republic - Including Prestigious Applications of Intelligent Systems (PAIS 2014), volume 263 of Frontiers in Artificial Intelligence and Applications, 267– 272. IOS Press. Dung, P. M. 2016. An axiomatic analysis of structured argumentation with priorities. Artificial Intelligence 231:107–150. Hunter, A. 2010. Base logics in argumentation. In Baroni, P.; Cerutti, F.; Giacomin, M.; and Simari, G. R., eds., Computational Models of Argument: Proceedings of COMMA 2010, Desenzano del Garda, Italy, September 8-10, 2010, volume 216 of Frontiers in Artificial Intelligence and Applications, 275–286. IOS Press. Kenny, E. M., and Keane, M. T. 2019. Twin-systems to explain artificial neural networks using case-based reasoning: Comparative tests of feature-weighting methods in ANN-CBR twins for XAI. In Kraus, S., ed., Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019, 2708–2715. ijcai.org. Makinson, D. 1994. General patterns in nonmonotonic reasoning. 35–110. Oxford University Press. Nugent, C., and Cunningham, P. 2005. A case-based explanation system for black-box systems. Artif. Intell. Rev. 24(2):163–178. Prakken, H.; Wyner, A. Z.; Bench-Capon, T. J. M.; and Atkinson, K. 2015. A formalization of argumentation schemes for legal case-based reasoning in ASPIC+. J. Log. Comput. 25(5):1141–1166. Shih, A.; Choi, A.; and Darwiche, A. 2019. Compiling bayesian network classifiers into decision graphs. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 February 1, 2019, 7966–7974. Čyras, K., and Toni, F. 2015. Non-monotonic inference properties for assumption-based argumentation. 
In Black, E.; Modgil, S.; and Oren, N., eds., Theory and Applications of Formal Argumentation - Third International Workshop, TAFA 2015, Buenos Aires, Argentina, July 25-26, 2015, Revised Selected Papers, volume 9524 of Lecture Notes in Computer Science, 92–111. Springer. Čyras, K., and Toni, F. 2016. Properties of ABA+ for nonmonotonic reasoning. CoRR abs/1603.08714. Čyras, K.; Birch, D.; Guo, Y.; Toni, F.; Dulay, R.; Turvey, S.; Greenberg, D.; and Hapuarachchi, T. 2019. Explanations by arbitrated argumentative dispute. Expert Syst. Appl. 127:141–156. Čyras, K.; Satoh, K.; and Toni, F. 2016a. Abstract argumentation for case-based reasoning. In KR 2016, 549–552. Čyras, K.; Satoh, K.; and Toni, F. 2016b. Explanation for case-based reasoning via abstract argumentation. In Proceedings of COMMA 2016, 243–254. Conclusion In this paper we study regular AA-CBR frameworks, and propose a new form of AA-CBR, denoted cAA-CBR , which is cautiously monotonic, as well as, as a byproduct, cumulative and rationally monotonic. Given that AA-CBR admits the original AA-CBR⊇ (Čyras, Satoh, and Toni 2016a) as an instance, we have (implicitly) also defined a cautiously monotonic version thereof. (Some incarnations of) AA-CBR have been shown successful empirically in a number of settings (see (Cocarascu et al. 2020). The formal properties we have considered in this paper do not necessarily imply better empirical results at the tasks in which AA-CBR has been applied. We thus leave for future work an empirical comparison between AA-CBR and cAA-CBR . Other issues open for future work are comparisons w.r.t. learnability (such as model performance in the presence of noise), as well as a full complexity analysis of the new model. Also, we conjecture that the reduced size of the AF our method generates could possibly have advantages in terms of time and space complexity: we leave investigation of this issue to future work. 
8 Acknowledgements We are very grateful to Kristijonas Čyras for very valuable discussions, as well as to Alexandre Augusto Abreu Almeida, Victor Luis Barroso Nascimento and Matheus de Elias Muller for reviewing initial drafts of this paper. The first author was supported by Capes (Brazil, Ph.D. Scholarship 88881.174481/2018-01). References Cocarascu, O.; Stylianou, A.; Čyras, K.; and Toni, F. 2020. Data-empowered argumentation for dialectically explainable predictions. In ECAI 2020 - 24th European Conference on Artificial Intelligence, Santiago de Compostela, Spain, 10-12 June 2020. Cocarascu, O.; Čyras, K.; and Toni, F. 2018. Explanatory predictions with artificial neural networks and argumentation. In 2nd Workshop on XAI at the 27th IJCAI and the 23rd ECAI. Dung, P. M. 1995. On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence 77(2):321 – 357. Dung, P. M. 2014. An axiomatic analysis of structured argumentation for prioritized default reasoning. In Schaub, T.; Friedrich, G.; and O’Sullivan, B., eds., ECAI 2014 - 21st 119 A Preference-Based Approach for Representing Defaults in First-Order Logic James Delgrande , Christos Rantsoudis Simon Fraser University, Canada first last@sfu.ca Abstract over possible worlds. While properties such as specificity followed directly from the semantics, other properties, such as handling irrelevant properties, were not obtained. Arguably, at present there is no generally-accepted approach that adequately handles inference of default properties, reasoning in the presence of irrelevant information, and reasoning about default properties of an individual known to be exceptional with respect to another default property. In this paper, we present a new account of defaults. Consider the default assertions “birds fly” and “birds build nests”. The usual interpretation is that a normal bird flies and it builds nests. 
Our interpretation is that, with regards to flying, a normal bird flies, and with regards to nest building, a normal bird builds nests. That is, normality is given with respect to some property. Consequently, “birds fly” would be interpreted as saying that, with respect to the property of flight, an individual that is a bird in fact flies. Similarly, a penguin, as concerns flight, does not fly. Semantically, for each n-ary relation in the domain, we associate a total preorder over n-tuples of individuals, where the preorder gives the relative normality of a tuple with respect to that relation. Syntactically, we introduce a “predicate-forming construct” into the language of FOL that lets us identify those individuals that satisfy a certain condition (like Bird) and that are minimal in a given ordering (like that corresponding to F ly); one can then state assertions regarding such (minimal-in-the-ordering) individuals, for example that they indeed satisfy F ly. Notably, an individual abnormal in one respect (like flight) may be normal in another respect (like nest building). These orderings allow us to naturally specify a wide class of default assertions, including on predicates of arity > 1. Default inference, in which an individual is concluded to have a given property “by default”, is specified via a preference ordering over models. Then inferences that follow by default are just those that obtain in the minimal models. In the approach we avoid a modal semantics on the one hand and fixed-point constructions on the other. We also show how a “predicate-forming construct” can be translated into a standard first-order theory and argue that the approach presents various advantages: it satisfies a set of broadly-desirable properties; it is perspicuous, and presents a more nuanced and expressive account of defaults than previous approaches; and it is couched within classical FOL. 
A major area of knowledge representation concerns representing and reasoning with defaults such as “birds fly”. In this paper we introduce a new, preference-based approach for representing defaults in first-order logic (FOL). Our central intuition is that an individual (or tuple of individuals) is not simply normal or not, but rather is normal with respect to a particular predicate. Thus an individual that satisfies Bird may be normal with respect to F ly but not BuildN est. Semantically we associate a total preorder over n-tuples with each n-ary relation in the domain. Syntactically, a predicateforming construct is introduced into FOL that lets us assert properties of minimal elements in an ordering that satisfy a given condition. Default inference is obtained by (informally) asserting that a tuple in an ordering is ranked as “low” as consistently possible. The approach has appealing properties: specificity of defaults is obtained; irrelevant properties are handled appropriately; and one can reason about defaults within FOL. We also suggest that the approach is more expressive than extant approaches and present some preliminary ideas for its use in Description Logics. 1 Introduction One of the major challenges of Artificial Intelligence (AI) has been in representing and reasoning with defaults such as “birds fly”. Since the early days of AI, researchers in the field have recognized the importance of intelligent systems being able to draw default assertions, where one would conclude by default that a bird flies, while allowing for exceptional conditions and non-flying birds. Of the early approaches to nonmonotonic reasoning, default logic (Reiter 1980) and autoepistemic logic (Moore 1985) were based on the notion of a fixed-point construction in order to expand the set of obtained consequences, while circumscription (McCarthy 1980) was based on the idea of minimizing the extension of a predicate. 
In these approaches, desirable properties (such as specificity) had to be hand-coded in a theory (Reiter and Criscuolo 1981; McCarthy 1986). About a decade later, approaches based on conditional logics (Delgrande 1987; Lamarre 1991; Boutilier 1994; Fariñas del Cerro, Herzig, and Lang 1994) and nonmonotonic consequence relations (Kraus, Lehmann, and Magidor 1990; Lehmann and Magidor 1992) represented defaults as objects (binary modal operators in conditional logics) in a theory. In such approaches, the semantics was based on an ordering 120 Syntactically, we introduce a new construct into the language of FOL that, for an ordering associated with a relation, enables us to specify minimal domain elements in the ordering that satisfy a given condition. This construct has two parts, a predicate P and a formula φ; it is written {P (~y ), φ(~y )}. The construct stands for a (new) predicate that denotes a domain relation which holds for just those (tuples of) individuals that satisfy φ and that are minimal in the ordering corresponding to P (i.e., the ordering associated with the denotation of P ). Given this construct, one can make assertions regarding individuals that satisfy this relation. For example, to express “birds normally fly” we use:  ∀x {F ly(y), Bird(y)}(x) → F ly(x) (1) In the next section we informally describe our framework. In Section 3 we present the syntax and semantics of our logic. After presenting some examples in Section 4, we look at various properties of our logic as well as provide a characterization result in Section 5. We briefly present our treatment of nonmonotonic inferences in Section 6. In Section 7 we compare our approach with related work as well as discuss about future directions. Section 8 concludes. 2 The Approach: Intuitions A common means for specifying the meaning of a default is via a preference order over models or possible worlds in which worlds or models are ordered with respect to their normality. 
Then, something holds normally (typically, defeasibly, etc.) just when it holds in the most preferred models or possible worlds. For example, in a conditional logic, “birds fly” can be represented propositionally as Bird ⇒ Fly. This assertion is true just when, in the minimal Bird-worlds, Fly is also true. In circumscription, “birds fly” can be represented as ∀x. Bird(x) ∧ ¬Ab_f(x) → Fly(x), so a bird that is not abnormal with respect to flight flies. Then models are ordered based on the extensions of the Ab predicates (with smaller extensions preferred), and a bird a flies by default just if Fly(a) is satisfied in the minimal models that satisfy Bird(a). Our approach belongs to the preference-based paradigm, but with significant differences from earlier work. Our preferences are expressed within FOL models, and not between models (or possible worlds, as in a modal framework). Preferences are given by a total preorder over n-tuples of individuals for each n-ary relation in the domain; these orderings give the relative normality of a tuple with respect to the underlying relation. Defaults are then expressed by making assertions concerning sets of minimal (tuples of) individuals in an ordering. Consider again the assertion that birds normally fly. We interpret this as saying that a bird that is normal with respect to the unary relation fly¹ flies. In a model, the relative normality of individuals with respect to flight is given by a total preorder associated with the relation fly. Then we can say that “birds fly” is true in a model just when, in the order associated with fly, the minimal bird individuals satisfy fly. Similarly, “penguins do not fly” is true in a model just when, in the order associated with fly, the minimal penguin individuals do not satisfy fly. The ranking of an individual with respect to one relation (like fly) is not related to the ranking associated with another relation (like build nest).
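The worlds-versus-individuals contrast can be made concrete on a tiny finite model. The following is a minimal sketch, not from the paper: all names and data are invented, and each relation's preorder is represented by a rank function, with a lower rank meaning "more normal" with respect to that relation.

```python
# A minimal sketch of the paper's core idea: each relation carries its
# own total preorder over individuals, given here as a rank function
# (lower rank = more normal with respect to that relation).

def minimal(elements, rank):
    """Return the elements with the lowest rank, i.e. the most normal ones."""
    if not elements:
        return set()
    best = min(rank[e] for e in elements)
    return {e for e in elements if rank[e] == best}

bird = {"tweety", "opus"}
penguin = {"opus"}
fly = {"tweety"}

# Hypothetical ranks in the ordering associated with the relation `fly`:
# tweety is a normal flyer, opus (a penguin) is abnormal wrt flight.
fly_rank = {"tweety": 0, "opus": 1}

# "Birds normally fly": minimal bird elements in the fly ordering fly.
birds_fly = minimal(bird, fly_rank) <= fly
# "Penguins normally do not fly": minimal penguins in that ordering do not.
penguins_dont = minimal(penguin, fly_rank).isdisjoint(fly)
print(birds_fly, penguins_dont)  # True True
```

Because opus is ranked as abnormal only in the ordering attached to fly, it could simultaneously be ranked as fully normal in, say, a build_nest ordering; this is exactly the independence of orderings described above.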
These considerations extend to relations of arity > 1. Consider “elephants normally like their keepers”. Semantically we would express this by having, in the total preorder associated with the relation likes, that the most normal pairs of individuals (d1, d2), in which d1 is an elephant and d2 is a keeper, satisfy likes. Analogously we could go on and express that “elephants normally do not like (keeper) Fred”. Whereas for “penguins normally do not fly” we use:

∀x ({Fly(y), Penguin(y)}(x) → ¬Fly(x))  (2)

So those individuals that satisfy bird and that are minimal in the ordering associated with fly also satisfy fly, whereas the minimal elements in the fly ordering that satisfy penguin do not satisfy fly. That is, “birds fly” and “penguins do not fly” both concern the property of flight and so are with respect to the same (fly) ordering.² The fact that we deal with orderings over individuals means that our approach is irreducibly first-order. This is in contrast to most work in default logic, in which default theories are very often expressed in propositional terms, and where a rule with variables is treated as the set of corresponding grounded instances. It is also in contrast to work in conditional logics and nonmonotonic inference relations, which are nearly always expressed in propositional terms. As we suggest later, for many domains, it may well be that a first-order framework is essential for an adequate expression of default assertions. For our earlier example “elephants (E) normally like (L) their keepers (K)”, we have the following:

∀x1, x2 ({L(y1, y2), E(y1) ∧ K(y2)}(x1, x2) → L(x1, x2))  (3)

whereas for “elephants normally do not like (keeper) Fred”:

∀x1, x2 ({L(y1, y2), E(y1) ∧ K(y2) ∧ y2 = Fred}(x1, x2) → ¬L(x1, x2))  (4)

In addition, we suggest that our approach leads to a reconsideration of how some defaults are best expressed. Consider the assertion “adults are normally employed at a company”.
In a conditional approach, one might express this as:

Adult(x) ⇒x ∃y (EmployedAt(x, y) ∧ Company(y))

where, without worrying about details, ⇒x is a variable-binding connective (Delgrande 1998). But the interpretation

² One way of viewing this is that a relation such as fly gives a partition of a domain, into those elements that belong to the relation and those that do not. We also note that the interpretation of “birds fly” as a default conditional (like Bird ⇒ Fly) is somewhat superficial. A more nuanced approach would assert that for birds, flight is a default means of locomotion, perhaps along with others such as bipedal walking. We return to this point later.

¹ We use the notation that a lower-case string like fly is used for a relation in a model whereas upper case, like Fly, is used for a predicate symbol in the language (in this case denoting fly).

N} and a set of variables V = {x, y, z, . . . }.³ Predicate symbols and variables may be subscripted, as may other entities in the language. The constants and variables make up the set of terms, which are denoted by ti, i ∈ N. A tuple of variables x1, . . . , xn is denoted by ~x, and similarly for terms. For any formula φ, the expression φ(~x) indicates that the free variables of φ are among those in ~x. Our language LN is given in the following definition, with L given in Items 1–3.

Definition 1. The well-formed formulas (wffs) of LN are defined inductively as follows:
1. If P is an n-ary predicate symbol and t1, . . . , tn are terms then P(t1, . . . , tn) is a wff.
2. If t1 and t2 are terms then t1 = t2 is a wff.
3. If φ and ψ are wffs and x is a variable then (¬φ), (φ → ψ) and (∀x φ) are wffs.
4. If P is an n-ary predicate symbol, ~y is a tuple of n variables, φ(~y) is a wff and ~t is a tuple of n terms then {P(~y), φ(~y)}(~t) is a wff.

Parentheses may be omitted if no confusion results. The connectives ∧, ∨, ≡ and ∃ are introduced in the usual way.
For a wff of the form {P(~y), φ(~y)}(~t), the part {P(~y), φ(~y)} can be thought of as a self-contained predicate-forming construct (pfc). The first part, P(~y), specifies that the ordering is with respect to predicate P; it also provides names for the n variables of P, in ~y. The second part, φ(~y), will in general be true of some substitutions for ~y and false for others. The denotation of {P(~y), φ(~y)} is just those n-tuples of domain elements that satisfy φ and are minimal in the ordering corresponding to P. So {P(~y), φ(~y)} behaves just like any predicate symbol. Thus {Fly(y), Bird(y)}(x) can be thought of as analogous to an atomic formula, which in a model will be true of some individuals (viz. those that belong to bird and that are minimal in the fly ordering) and false of others. Similarly, {Fly(y), Bird(y)}(Tweety) will assert that Tweety is a minimal bird element in the fly ordering. In the wff {P(~y), φ(~y)}(~t), there is a one-to-one correspondence between the terms in ~t and the variables ~y inside the pfc; but otherwise they are unrelated. Hence for the expression {Fly(x), Bird(x)}(x) the occurrences of variable x within {. . . } are distinct from the third occurrence. For {P(~y), φ(~y)}, the variables in ~y are local to {. . . }, and can be thought of as effectively bound within the expression. In the following, we use the term predicate expression to refer to both predicate symbols and pfcs. We next remind the reader of some terminology regarding orderings. A total preorder on a set is a transitive and connected relation on the elements of the set. A well-founded order is one that has no infinitely-descending chains of elements. Formally, for a set S and a total preorder ⪯ on S, ⪯ is well-founded iff:

(∀T ⊆ S) T ≠ ∅ → (∃x ∈ T)(∀y ∈ T) x ⪯ y

that “the most normal adults are employed at a company” is unsuitable, since an abnormal adult here would be abnormal with respect to other normality assertions regarding adults.
As well, it does not seem to make much sense to say that normality now refers to the full consequent. Instead, it seems that the best way of interpreting this assertion is that we have a normality ordering associated with employed at, giving the relative normality of pairs of domain elements with respect to this relation. Then for the most normal pairs (d1 , d2 ) where d1 is an adult, there is some pair (d1 , d3 ) among them for which d1 is employed at d3 and d3 is a company. Consequently, this suggests that simple conditionals, at least in a first-order framework, may not be adequate to represent general default information. The preceding sketches our intuitions regarding how we intend to represent and interpret default information. With regards to inferring default information in a knowledge base (KB), we define preferences between models in a similar manner to those of other preferential logics (McCarthy 1980; Shoham 1987; Kraus, Lehmann, and Magidor 1990). Again, what is new in our approach is that we have multiple orderings inside our models and so we can define more nuanced preferences between models. As we will see, although we only briefly treat nonmonotonic inferring of assertions, our ordering between the models will result in desirable properties with respect to defeasibility. Specifically, we satisfy the following principles: 1. Specificity: Properties are ascribed on the basis of most specific applicable information. Hence a penguin will not fly by default whereas a bird will. 2. Inheritance: Individuals will inherit all typical properties of the classes to which they belong, except for those we know are exceptional. Hence, by default, a penguin may be concluded to not fly, but will be concluded to have feathers, etc. 3. Irrelevance: Default inference is not affected by irrelevant information. Hence, by default, a yellow bird will be concluded to fly. As we have noted, there is no generally-accepted approach that fully captures these properties. 
Default logic, autoepistemic logic and circumscription do not satisfy specificity, while the rational closure mechanism of the KLM framework does not satisfy inheritance.

3 Language and Semantics

As discussed, a first-order setting is required for our investigation. Thus, the language we employ is based on standard FOL enhanced with the aforementioned minimality operators. We start with some formal preliminaries, including the syntax of our new logic, and finish the section by presenting the semantics.

3.1 Formal Preliminaries

We assume that the reader has some familiarity with standard FOL (Enderton 1972; Mendelson 2015). Let L be a first-order language containing a set of predicate symbols P = {P, Q, . . . }, a set of constant symbols C = {ci | i ∈

³ For simplicity, except for constants, we exclude function symbols. Note that this does not affect expressiveness, since any n-ary function can be encoded by an (n + 1)-place predicate.

1. M, v |= P(t1, . . . , tn) iff (t1^{I,v}, . . . , tn^{I,v}) ∈ P^{M,v}
2. M, v |= t1 = t2 iff t1^{I,v} = t2^{I,v}
3. M, v |= ¬φ iff M, v ⊭ φ
4. M, v |= φ → ψ iff M, v ⊭ φ or M, v |= ψ
5. M, v |= ∀x φ iff M, v |= φ(x/d) for all d ∈ D

We will work only with well-founded total preorders. Given well-foundedness, for a total preorder ⪯ we can define the minimal S-elements of ⪯, as follows:

min(⪯, S) = {x ∈ S | ∀y ∈ S : x ⪯ y}

As usual, if M, v |= φ for all M and v, then φ is valid in LN. If φ is a sentence (i.e., without free variables) then M, v |= φ iff M, v′ |= φ for all variable maps v, v′; thus we just write M |= φ. If Φ is a set of sentences then M |= Φ iff M |= φ for all φ ∈ Φ, and we say that M is a model of Φ. Finally, we write Φ |= φ when all models of Φ are models of φ, and we say that Φ logically entails φ.

3.2 Semantics

We next present the formal semantics, which will interpret the terms and formulas in LN with respect to a model. Definition 2.
A model is a triple M = ⟨D, I, O⟩ where D ≠ ∅ is the domain, I is the interpretation function, and O is a set containing, for each n-ary relation r in D, a well-founded total preorder ⪯_r on D^n. Specifically:
1. I interprets the predicate and constant symbols into D as follows:
• P^I ⊆ D^n, for each n-ary predicate symbol P ∈ P
• c^I ∈ D, for each constant symbol c ∈ C
2. O = { ⪯_r ⊆ D^n × D^n | r ⊆ D^n and ⪯_r is a well-founded total preorder on D^n }

4 We have already seen some wffs in Equations 1–4 of Section 2. We now give some more examples that illustrate the range and application of our approach. As we have described, the first part of a pfc denotes an ordering associated with a given predicate. The second part is a formula that specifies minimal (tuples of) individuals in the ordering. The order of the variables in the two parts is important, as the next two equations illustrate:

∀x1, x2 ({L(y1, y2), P(y1, y2)}(x1, x2) → L(x1, x2))  (6)
∀x1, x2 ({L(y1, y2), P(y2, y1)}(x1, x2) → L(x1, x2))  (7)

A variable map v : V → D assigns each variable x ∈ V an element of the domain v(x) ∈ D.

Definition 3. Let M = ⟨D, I, O⟩ be a model and v a variable map. The denotation of a term t, written as t^{I,v}, is defined as follows:
1. t^{I,v} = t^I, if t is a constant
2. t^{I,v} = v(t), if t is a variable

When L abbreviates Likes and P abbreviates ParentOf, it is easy to see that (6) states that “parents normally like their children” while (7) states that “children normally like their parents”. Recall also that the tuples of domain elements belonging to the denotation of a pfc {P(~y), φ(~y)} do not necessarily have to satisfy the predicate P. See, for instance, Equations 2 and 4. On another note, predicates in φ may have a higher arity than P in a pfc {P(~y), φ(~y)}. For instance, consider the statement “people that trust (T) themselves are normally daring (D)”.
We would express that using the following wff:  ∀x {D(y), T (y, y)}(x) → D(x) The satisfaction relation |= is defined below. We first give some preliminary terminology and notation. Assume we have a model M, a variable map v and a wff φ. When M satisfies φ under v we write M, v |= φ. When M satisfies φ under v where the free variable x of φ is assigned to d we write M, v |= φ(x/d). For ~x a tuple of variables x1 , . . . , xn and d~ a tuple of domain elements d1 , . . . , dn , we denote by ~x/d~ the one-to-one assignment x1 /d1 , . . . , xn /dn . Similarly for ~x/~y and ~x/~t . Last, given a tuple of n variables ~y and a formula φ with free variables among ~y , the values of ~y for which φ can be satisfied are given by the set: ~ φ(~y )M,v = {d~ ∈ Dn | M, v |= φ(~y /d)} Examples Furthermore, we can express statements about specific individuals by directly replacing variables with constants. For example, for constant John, we can express that “John’s pets are normally happy (H)” by:  ∀x {H(y), HasP et(John, y)}(x) → H(x) (5) We can now define the denotation of a pfc {P (~y ), φ(~y )}, written {P (~y ), φ(~y )}M,v , as the set of domain tuples that: 1. belong to the denotation of φ, as given in Equation 5 and The reading of our new wffs can sometimes be a bit cumbersome. Consider the earlier example that “adults (A) are normally employed (Em) at a company (C)”. According to the discussion in Section 2, this statement can be expressed by the following wff: 2. are the minimal such tuples in the ordering associated with P I , viz. P I . Definition 4. Let M = hD, I, Oi be a model and v a variable map. The denotation of {P (~y ), φ(~y )} is defined as the set: {P (~y ), φ(~y )}M,v = min(P I , φ(~y )M,v ) ∀x1 , x2 {Em(y1 , y2 ), A(y1 )}(x1 , x2 ) → ∃x3  {Em(y1 , y2 ), A(y1 )}(x1 , x3 ) ∧ Em(x1 , x3 ) ∧ C(x3 ) This contains two instances of the pfc {Em(y1 , y2 ), A(y1 )}. 
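Definition 4's denotation of a pfc can be sketched in the same finite-model style for a binary relation. Everything here is invented for illustration: `pfc_denotation` is our own name, and the data and ranks are made up.

```python
# A hedged sketch of Definition 4: the denotation of a pfc {P(y~), phi(y~)}
# is the set of tuples satisfying phi that are minimal in the preorder
# attached to P's denotation.

def pfc_denotation(phi_tuples, leq):
    """min(preorder, phi_tuples): the tuples below all others in phi_tuples."""
    return {t for t in phi_tuples
            if all(leq(t, u) for u in phi_tuples)}

elephant = {"dumbo"}
keeper = {"fred", "gina"}
likes = {("dumbo", "gina")}

# Hypothetical preorder on pairs for the relation `likes`, via ranks:
likes_rank = {("dumbo", "gina"): 0, ("dumbo", "fred"): 1}
leq = lambda s, t: likes_rank.get(s, 2) <= likes_rank.get(t, 2)

# phi(y1, y2) = E(y1) & K(y2): pairs of an elephant and a keeper.
phi = {(e, k) for e in elephant for k in keeper}
normal_pairs = pfc_denotation(phi, leq)
# "Elephants normally like their keepers" holds iff normal_pairs is a
# subset of likes.
print(normal_pairs <= likes)  # True
```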
As a possible solution, if we introduce the predicate NAE (for Normal Adults wrt the Em ordering) we can rewrite the previous into the more compact and perspicuous formula:  ∀x, y NAE(x, y) → ∃z NAE(x, z) ∧ Em(x, z) ∧ C(z) Finally, for each predicate symbol P ∈ P we define P M,v = P I . The satisfaction relation is given as follows (recall that a predicate expression is either a predicate symbol or a pfc). Definition 5. Let M = hD, I, Oi be a model, v a variable map and P a predicate expression. 123 to the more semantic approach of Section 3; and our equivalence result (Theorem 1) provides a counterpart to a standard soundness and completeness result.5 More precisely: This method of abbreviating pfcs via smaller “predicate names” could be used at the outset in order to make KBs more readable. A key point is that our approach is highly versatile, and can express nuances that (arguably) other approaches cannot. Consider for example the ambiguous statement “undergraduate students attend undergraduate courses”.4 Let U GS stand for “undergrad student” and U GC for “undergrad course”. Among other possibilities, we have the following interpretations: 1. “Normally, the things U GSs attend are U GCs” That is, for the most normal pairs (d1 , d2 ) according to the attend relation, such that d1 is an U GS that attends d2 , d2 is an U GC. In LN : ∀x1 , x2 {Attend(y1 , y2 ), U GS(y1 ) ∧  Attend(y1 , y2 )}(x1 , x2 ) → U GC(x2 ) 1. For each n-ary predicate symbol P we introduce a new predicate symbol P  of arity 2n. We use these new predicate symbols to express the preference orderings instead of embedding them directly into the models. That is, each P  will be used in the place of P I . 2. 
After having interpreted the new predicate symbols P  in the aforementioned way, we translate each wff {P (~y ), φ(~y )}(~t) to the first-order formula:  φ(~y /~t) ∧ ∀~z φ(~y /~z) → P  (~t, ~z) So the variables from ~y that appear free in φ are assigned to the respective terms and variables from ~t and ~z, with the latter being employed in order to ensure the minimality of the former through the new predicate symbols P  . The following list shows some of the examples of Sections 2 and 4 expressed in FOL using this translation: 2. “Normal U GSs attend only U GCs” That is, for the most normal pairs (d1 , d2 ) according to the attend relation, such that d1 is an U GS, everything d1 attends is an U GC. In LN : ∀x1 , x2 {Attend(y1 , y2 ), U GS(y1 )}(x1 , x2 ) →  ∀x3 Attend(x1 , x3 ) → U GC(x3 ) 1. Birds (B) normally fly (F )   ∀x B(x) ∧ ∀z B(z) → F  (x, z) → F (x) 2. Penguins (P ) normally do not fly   ∀x P (x) ∧ ∀z P (z) → F  (x, z) → ¬F (x) 3. “Normal U GSs attend some U GC” That is, for the most normal pairs (d1 , d2 ) according to the attend relation, such that d1 is an U GS, there exists an U GC that d1 attends. In LN : ∀x1 , x2 {Attend(y1 , y2 ), U GS(y1 )}(x1 , x2 ) →  ∃x3 Attend(x1 , x3 ) ∧ U GC(x3 ) 3. Elephants normally like their keepers ∀x1 , x2 E(x1 ) ∧ K(x2 ) ∧ ∀z1 , z2 E(z1 ) ∧ K(z2 ) →   L (x1 , x2 , z1 , z2 ) → L(x1 , x2 ) 4. Children normally like their parents 4. “Normally U GSs attend U GCs” ∀x1 , x2 P (x2 , x1 ) ∧ ∀z1 , z2 P (z2 , z1 ) →   L (x1 , x2 , z1 , z2 ) → L(x1 , x2 ) Analogous to Equation 3, we have: ∀x1 , x2 {Attend(y1 , y2 ), U GS(y1 ) ∧  U GC(y2 )}(x1 , x2 ) → Attend(x1 , x2 ) 5. People that trust themselves are normally daring   ∀x T (x, x) ∧ ∀z T (z, z) → D (x, z) → D(x) These examples illustrate the wealth of expressiveness in our logic and present a contrast to the more limited expressiveness of current approaches in the literature. 5 6. 
John’s pets are normally happy ∀x HasP et(John, x) ∧ ∀z HasP et(John, z) →   H  (x, z) → H(x) Characterization and Properties In this section we provide a characterization of our new logic through a translation into standard FOL. As well, we present some notable properties and briefly compare our approach to other well-known systems from the literature. First, we show how to encode our approach in standard FOL, via the introduction of a new set of predicate symbols representing the preference orderings. Then, we express the pfcs inside the language using these new predicate symbols. This translation then serves as a syntactic counterpart As we see, everything presented so far can be expressed using standard FOL without the need to enhance the models with preference orderings or the syntax with pfcs. Instead, a new set of predicate symbols together with a translation of pfcs suffice. Next, we present the formal translation as well as a characterization theorem. 5 Alternatively, we could have provided an axiomatisation of our new construct and directly proven a soundness and completeness result. This is done in (Brafman 1997), where a conditional logic is developed based on orderings over individuals, but for each n ∈ N there is a single ordering on n-tuples; see Section 7. We feel that the given translation is at least as informative as an axiomatisation, while being more straightforward to obtain. 4 This example is a type of assertion that might occur unconditionally in a description logic TBox. The fact that there are (at least) four corresponding nonmonotonic (normality) assertions indicates that a fully general approach to defeasibility in description logics may require substantial expressive power. See Section 7 for a further discussion. 124 5.1  τ ~ τ} 3. ~t I ,v ∈ {d~ ∈ (Dτ )n | Mτ , v |= ψ(~y /d) τ τ 4. 
∀~e ∈ (Dτ )n if Mτ , v |= ψ(~y /~e) then (~t I ,v , ~e) ∈ τ (P  )I Translation into FOL We first extend P with a new set of predicate symbols P  = {P  | P ∈ P}. Let P + = P ∪ P  and let L+ be the extension of L with P + . From 3. it immediately follows that: τ 5. Mτ , v |= ψ(~y /~t) Definition 6. Given the syntax of LN , the translation τ : LN → L+ is defined as follows: τ 1. P (t1 , . . . , tn ) = P (t1 , . . . , tn ) 2. (t1 = t2 )τ = (t1 = t2 ) 3. (¬φ)τ = ¬φτ 4. (φ → ψ)τ = φτ → ψ τ 5. (∀xφ)τ = ∀xφτ τ τ 6. {P (~y ), φ(~y )}(~t) = φ(~y /~t) ∧   τ ∀~z φ(~y /~z) → P  (~t, ~z) Next, let ~z be a random tuple of variables and let: τ 6. Mτ , v |= ψ(~y /~z) Let us also assume that v(~z) = ~e ∈ (Dτ )n . It immediately follows: τ 7. Mτ , v |= ψ(~y /~e) τ From 4., 7. and the fact that ~e = v(~z) then (~t I ,v , v(~z)) ∈ τ (P  )I which is equivalent to: As for the semantics, the language L+ is interpreted over the usual models of FOL. Regarding the relation between the models of LN and the models of L+ , we can define a similar translation τ from the former into the latter. 8. Mτ, v |= P  (~t, ~z) From 6., 8. and the fact that ~z was a random tuple, we have that:   τ 9. Mτ , v |= ∀~z ψ(~y /~z) → P  (~t, ~z) Definition 7. For a given model M = hD, I, Oi of LN , the model Mτ = hDτ , I τ i of L+ is defined as follows: 1. Dτ = D τ 2. for every constant symbol c ∈ C: cI = cI ∈ D 3. for every n-ary predicate symbol P ∈ P: τ • P I = P I ⊆ Dn τ ~ ~e) ∈ Dn × Dn | d~ P I ~e } • (P  )I = {(d, Finally, 5. and 9. give:   τ τ Mτ , v |= ψ(~y /~t) ∧ ∀~z ψ(~y /~z) → P  (~t, ~z) τ By definition of τ then we get Mτ , v |= {P (~y ), ψ(~y )}(~t) , i.e., Mτ , v |= φτ . The reverse procedure gives the other direction as well: if Mτ , v |= φτ we end up with 3. and 4. which, by (IH) and the definition of Mτ , are equivalent to M, v |= φ. It is easy to see that Mτ interprets all new predicate symbols P  as (well-founded) total preorders, in the following sense. Proposition 1. 
Let Mτ be a model of L+ according to Definition 7 and P  ∈ P + . The following hold: Through this characterization result we can move from LN into L+ and use the known machinery of standard FOL when evaluating formulas in LN . 1. Mτ |= ∀~x P  (~x, ~x)  2. Mτ |= ∀~x, ~y , ~z P  (~x, ~y ) ∧ P  (~y , ~z) → P  (~x, ~z)  3. Mτ |= ∀~x, ~y P  (~x, ~y ) ∨ P  (~y , ~x) 5.2 Given this translation τ on formulas and models, we obtain the following characterization of LN through L+ . Properties We now examine some properties of our logic, starting with the fact that we can reason about defaults directly within our framework. A representative example of this property is showcased in the next proposition. Theorem 1. Let φ and M be a wff and a model of LN , respectively, and let v be a variable map. Then: Proposition 2. Let Φ = {φ1 , φ2 , φ3 } be a KB where:  1. φ1 = ∀x P (x) → B(x) “All penguins are birds”  2. φ2 = ∀x {F (y), B(y)}(x) → F (x) “Birds normally fly”  3. φ3 = ∀x {F (y), P (y)}(x) → ¬F (x) “Penguins normally do not fly” M, v |= φ iff Mτ , v |= φτ Proof. The proof follows by induction on the construction of φ. We only present the step for pfcs: Consider φ = {P (~y ), ψ(~y )}(~t) and that the Induction Hypothesis (IH) holds for ψ. We have that M, v |= φ iff: ~t I,v ∈ min(P I , ψ(~y )M,v ) Let us also assume that ~y is a tuple of n variables. By definition then: ~ 1. ~t I,v ∈ {d~ ∈ Dn | M, v |= ψ(~y /d)} 2. ∀~e ∈ Dn if M, v |= ψ(~y /~e) then ~t I,v P I ~e Furthermore, consider the following sentence:  4. ψ = ∀x P (x) → ¬{F (y), B(y)}(x) “Penguins are not normal birds with respect to flying” Then Φ |= ψ is derivable in LN . By (IH) and the definition of Mτ then we also have: 125 6 This is quite an important characteristic of LN since reasoning about defaults within the logic is not possible with many other approaches, e.g. default logic or circumscription. 
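The pfc clause of the translation (clause 6 of Definition 6) can be sketched mechanically as string rewriting. This is only a toy, not the paper's formal translation: formulas are plain strings, substitution is a naive textual replace, and `F_leq` stands in for the fresh ordering predicate F⪯.

```python
# A rough sketch of Definition 6's key clause: a pfc applied to terms
# translates to "phi holds of t, and t is P_leq-below every tuple z
# satisfying phi". Formulas here are plain strings.

def translate_pfc(P, y, phi, t, z="z"):
    """Translate {P(y), phi(y)}(t) into FOL text using the fresh symbol P_leq."""
    phi_t = phi.replace(y, t)  # phi(y/t), naive textual substitution
    phi_z = phi.replace(y, z)  # phi(y/z)
    return f"{phi_t} & forall {z} ({phi_z} -> {P}_leq({t},{z}))"

# "Birds normally fly": the antecedent {Fly(y), Bird(y)}(x) becomes:
print(translate_pfc("F", "y", "B(y)", "x"))
# B(x) & forall z (B(z) -> F_leq(x,z))
```

This matches the shape of translated example 1 above ("Birds (B) normally fly (F)"), modulo ASCII notation.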
Next, we move on to compare the logic LN to the well-known KLM systems (Kraus, Lehmann, and Magidor 1990; Lehmann and Magidor 1992). We start by noting that, like most of the approaches that employ preference orderings (either between worlds or between elements of a domain), the KLM systems rely on a single ordering. This is in contrast to our multiple orderings and the fact that we can use multiple pfcs, each one associated with a different ordering, inside the same expression. Consider, e.g., the following instance of the KLM postulate of Right Weakening (RW): from |= Fly → Mobile and Bird |∼ Fly, infer Bird |∼ Mobile. We would express this instance of RW in LN as follows:

(∀x (F(x) → M(x)) ∧ ∀x ({F(y), B(y)}(x) → F(x))) → ∀x ({M(y), B(y)}(x) → M(x))

where F abbreviates Fly, M abbreviates Mobile and B abbreviates Bird. This formula is not valid in our logic, since the two pfcs refer to two different orderings (corresponding to Fly and Mobile). The same holds for any other KLM postulate apart from Reflexivity. This is because we have not imposed any relationship between the different orderings or attempted to combine them in any way. We could impose, e.g., the following condition between two orderings: whenever ∀~x (P(~x) → Q(~x)) we also have that ⪯_{P^I} ⊆ ⪯_{Q^I}, which would make the previous formula valid in our logic. One could propose such restrictions on our models (and more specifically on the set O), but this is not our intention here. However, if we introduce a new predicate symbol G that corresponds to a global ordering, we get the following.

Proposition 3. The KLM postulates articulated using only the (global) ordering associated with predicate G are valid in LN.

It immediately follows that our approach is at least as expressive as the KLM systems.

Corollary 1. Any proof wrt the KLM systems can be transformed into a proof in LN.
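The failure of this Right Weakening instance can be checked on a tiny finite countermodel (data invented), again representing the two orderings by rank functions: every flyer is mobile and the minimal birds in the Fly ordering fly, yet the minimal birds in the Mobile ordering need not be mobile.

```python
# A tiny countermodel sketch: with separate orderings for Fly and Mobile,
# the premises of this RW instance hold while its conclusion fails.

def minimal(elements, rank):
    """Most normal elements: those of lowest rank."""
    best = min(rank[e] for e in elements)
    return {e for e in elements if rank[e] == best}

B = {"a", "b"}   # birds
F = {"a"}        # flyers
M = {"a"}        # mobile things, so F is a subset of M

fly_rank = {"a": 0, "b": 1}      # a is the normal flyer among birds
mobile_rank = {"a": 1, "b": 0}   # but b is "normal" wrt mobility

premise1 = F <= M                          # |= Fly -> Mobile
premise2 = minimal(B, fly_rank) <= F       # birds normally fly
conclusion = minimal(B, mobile_rank) <= M  # birds normally mobile?
print(premise1, premise2, conclusion)  # True True False
```

Nothing constrains the two rank functions to agree, which is precisely why RW fails across orderings and is recovered only under a global ordering G.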
We end this section by presenting some properties of pfcs in the next proposition, with the names suggesting similar properties that have appeared in the literature. Proposition 4. The following formulas are valid in LN :  1. REF: ∀~x {P (~y ), φ(~y )}(~x) → φ(~y /~x) To this point, in presenting LN , we have dealt with a monotonic formalism. We now examine nonmonotonic reasoning in LN and explore how default inferences can be obtained. Our investigations are still preliminary and are on the semantic level, i.e., we work with models. Nevertheless, a syntactic approach is also in the works and employs an extension of the Closed World Assumption for nonmonotonic reasoning. The goal then will be to provide a correspondence between the two approaches (syntactic and semantic). We present the latter here which, similar to (McCarthy 1980; Shoham 1987; Kraus, Lehmann, and Magidor 1990), employs preferences between the models of LN . We start by restricting our models a bit further so that they only contain orderings without infinitely-ascending chains of elements, i.e., our orderings are “upwards” well-founded as well. We then proceed with the following definitions. Definition 8. Let M be a model of LN and r ∈ O. The set mink (r ) is defined inductively as follows: 1. min1 (r ) = {d~ ∈ Dn | ∀~e ∈ Dn : d~ r ~e } 2. mink+1 (r ) = {d~ ∈ Dn | ∀~e ∈ Dn \ d~ r ~e } {P (~y ), (φ ∧ ψ)(~y )}(~x) 5. OR: ∀~x {P (~y ), (φ ∨ ψ)(~y )}(~x) → {P (~y ), φ(~y )}(~x) ∨ {P (~y ), ψ(~y )}(~x) k S minn (r ) : n=1 Intuitively, the set mink (r ) denotes the k-th least set of r -equivalent elements in the ordering r . Using these sets, we can define a preference between orderings on the same relation r as follows. Definition 9. Let M = hD, I, Oi and M′ = hD, I ′ , O′ i be two models of LN with r ∈ O and ′r ∈ O′ . We say that r is lexicographically preferred to ′r iff ∃n ∈ N such that: 1. mink (′r ) = mink (r ) ∀ k ∈ {1, . . . , n − 1} 2. 
minn (′r ) ⊂ minn (r ) Given the lexicographic preference between two orderings, we can now generalize our definition to a preference between models. Definition 10. Let M = hD, I, Oi and M′ = hD, I ′ , O′ i be two models of LN . We say that M is preferred to M′ , viz. M < M′ , iff for every P ∈ P we have that P I is lexicographically preferred to ′P I′ . 2. RCE: ∀~x (φ(~x) → ψ(~x)) →  ∀~x {P (~y ), φ(~y )}(~x) → ψ(~y /~x) 3. LLE/RCEC: ∀~x (φ(~x) ≡ ψ(~x)) →  ∀~x {P (~y ), φ(~y )}(~x) ≡ {P (~y ), ψ(~y )}(~x) 4. AND: ∀~x {P (~y ), φ(~y )}(~x) ∧ {P (~y ), ψ(~y )}(~x) → Default Inference in LN Next, we define the minimal models of a KB, which will be our main tool for drawing default inferences. Definition 11. Let Φ and M be a KB and a model of LN , respectively. M is a minimal model of Φ iff M is a model of Φ and there is no model M′ of Φ such that M′ < M.  Using the above, the next definition shows how to obtain default inferences in LN .  126 7 Definition 12. Let Φ and φ be a KB and a sentence of LN , respectively. We say that Φ entails φ by default iff M |= φ for all minimal models M of Φ. 7.1 Proposition 5. Let M be a model of LN and r ∈ O. Then: 1. either ∀k ∈ N mink (r ) 6= ∅ 2. or ∃n ∈ N: • ∀k ∈ {1, . . . , n} mink (r ) 6= ∅ • ∀k > n mink (r ) = ∅ This means that the sets mink (r ), the preference between two orderings, and the preference between two models are all well-defined. Furthermore, a KB has a model iff it has a minimal one and, similar to monotonic inferences, inconsistent KBs entail all sentences by default. We conclude this section with a showcase of how specificity, inheritance and irrelevance, the three principles that we highlighted in Section 2, are handled in LN . Corollary 2. Let Φ = {φi | 1 ≤ i ≤ 6} be a KB where: 1. φ1 = B(T weety) ∧ Y (T weety) “Tweety is a yellow (Y ) bird” 2. φ2 = P (Opus) 3. φ3 = ∀x P (x) → B(x)  “All penguins are birds” 4. φ4 = ∀x {F (y), B(y)}(x) → F (x) “Birds normally fly”  5. 
φ5 = ∀x ({W(y), B(y)}(x) → W(x)) “Birds normally have wings (W)” “Penguins normally do not fly” Such a conditional does not seem to capture accurately the meaning of the original expression, as argued in Section 2. However, we are able to capture Brafman’s approach in ours, provided the formula φ of “φ →~x ψ” has free variables only among ~x and there are no iterated occurrences of “→~x”. More precisely, we can consider the class of models in our approach in which there is a single ordering for each arity n, say ⪯_{Un}. Then, we can use the formula:

∀~y ({Un(~x), φ(~x)}(~y) → ψ(~x/~y))

6. φ6 = ∀x ({F(y), P(y)}(x) → ¬F(x))

Related Work

Our view that normality is relative to a property such as fly was anticipated by work in circumscription, in particular in its use of Ab predicates (McCarthy 1986). (Otherwise the approaches have little in common.) Conditional approaches to assertions of normality are generally propositional; first-order approaches include (Delgrande 1998; Kern-Isberner and Thimm 2012). A predecessor to our work, in a full first-order setting, is Brafman’s (1997) approach to conditional statements. There, conditional statements of the form “if φ then normally ψ” are written as “φ →~x ψ”, with the intuition being that the minimal tuples of the domain that make φ true also make ψ true. There are two main differences between the “φ →~x ψ” notation and our corresponding “∀~x ({P(~y), φ(~y)}(~x) → ψ)” notation that make the latter more expressive. The first difference comes from the fact that we employ multiple orderings, which gives a more nuanced approach. In (Brafman 1997) it is not possible to have an individual that is normal in some respect (say, nest building) while abnormal in another (like flying). Secondly, our approach allows more expressive formulas, as we have seen in the sequence of examples in Section 4.
As well, consider the “adults are normally employed at a company” example, which Brafman would write as:

Adult(x) →x ∃y (EmployedAt(x, y) ∧ Company(y))

The formula given above can then be used to represent Brafman’s assertion φ(~x) →~x ψ. Furthermore, combining this translation with the method described in Section 5.1 implies that the approach of (Brafman 1997) can also be expressed in standard FOL.

More recently there has been work in Description Logic (Baader et al. 2007) that deals with the representation of, and reasoning with, defeasible assertions. The literature on so-called defeasible DLs is large and most of the established approaches to nonmonotonic reasoning (like default logic, circumscription or the rational closure) have been adapted for the DL setting; see for instance (Baader and Hollunder 1992; Bonatti, Lutz, and Wolter 2009; Giordano et al. 2015). Nevertheless, there have recently been interesting new proposals that relate to our work here. First, driven by the need to overcome problems like the inheritance of properties in the presence of exceptions, multiple orderings have also been considered in (Gliozzi 2016; Giordano and Gliozzi 2019) to account for different rankings between individuals, each corresponding to a particular aspect (like Fly or BuildNest). However, although multiple

Then Φ entails the following sentences by default:
1. ψ1 = ¬F(Opus) “Opus does not fly”
2. ψ2 = W(Opus) “Opus has wings”
3. ψ3 = F(Tweety) ∧ W(Tweety) “Tweety flies and has wings”

So Opus, being both a bird and a penguin, is concluded not to fly by ψ1 (specificity) but to have wings by ψ2 (inheritance), since it is an exceptional bird wrt flying but inherits any other typical property of birds.
Then ψ3 (irrelevance) shows that Tweety, being a yellow bird, is still concluded to fly and have wings since being yellow is irrelevant wrt those two properties.

orderings are considered in the semantics, only one “typicality” (in practice minimality) operator is employed in the syntax and there is no corresponding syntactic construct like our pfcs. Furthermore, their multiple orderings are employed only among individuals (and not tuples) and the use of typicality operators is limited, being only allowed on the left side of a subsumption axiom. This results in an interesting, but less expressive, representation of defaults, as opposed to that developed here. Similar to the previous approach, but closer to ours, is the work in (Gil 2014), where the author takes into account multiple typicality operators. This work however suffers from similar limitations regarding the scope of the orderings and the limited employment of these typicality operators. A further limitation is the lack of any association between its operators/orderings and any relations or aspects. A third line of work, originating from an approach in (Britz and Varzinczak 2016) to define orderings not only among individuals but also among tuples, culminated in interesting recent developments regarding defeasible reasoning in DLs (Varzinczak 2018; Britz and Varzinczak 2019). An important characteristic of this work is that the orders on individuals are derived from the ones specified by the roles, i.e., they do not correspond to any concepts like in the previously mentioned (and our) work. This results in (contextual) defeasible subsumption needing specific role names to be subscripted in order to specify the origin of the order that will be employed, something that we do within the language by means of the pfcs.

example that we saw at the end of Section 4. No approach in the literature can adequately handle the various interpretations we gave in Section 4, especially in a DL setting.
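The behaviour established in Corollary 2 (specificity, inheritance, irrelevance) can be mimicked with a naive prioritized-defaults resolver. This is a deliberate simplification of the minimal-model semantics, not the paper's construction: specificity is hard-coded as rule order rather than derived from the orderings, and all names below are our own.

```python
# Toy re-enactment of Corollary 2 with prioritized defaults (our
# simplification: the penguin rule is listed before the bird rules to
# encode specificity, instead of deriving it from minimal models).
facts = {"tweety": {"B", "Y"}, "opus": {"P"}}   # B=bird, Y=yellow, P=penguin
strict = [("P", "B")]                           # all penguins are birds
# defaults: (premise, conclusion, positive?), most specific first
defaults = [("P", "F", False),                  # penguins normally do not fly
            ("B", "F", True),                   # birds normally fly
            ("B", "W", True)]                   # birds normally have wings

def conclusions(props):
    props = set(props)
    for pre, post in strict:                    # close under strict rules
        if pre in props:
            props.add(post)
    decided = {}
    for pre, post, positive in defaults:        # first (most specific) rule wins
        if pre in props and post not in decided:
            decided[post] = positive
    return props, decided

for name in ("opus", "tweety"):
    base, extra = conclusions(facts[name])
    print(name, sorted(base), extra)
```

Opus comes out non-flying but winged (specificity plus inheritance), and yellow Tweety flies and has wings (irrelevance of Y), matching ψ1–ψ3.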
Our goal then for future work is to try to express these interpretations in DL terms in a way that would semantically correspond to the intended formulas of LN. While the question of how to syntactically express such assertions is certainly non-trivial, we believe that the current framework could provide a basis for (more elaborately) dealing with defeasibility in DLs. Perhaps the biggest advantage will be that the properties we presented, both the KLM postulates as well as defeasible principles like specificity, inheritance and irrelevance, will continue to hold in any DL language (consider, e.g., Corollary 2 adapted for such a DL). The combination of employing multiple orderings in the domain of any interpretation together with using LN and its pfcs to interpret the new default concept inclusions seems to overcome the difficulties of the established approaches as well as to allow a more “informed” representation of defaults in any DL language.

Apart from DLs, we plan to expand on this work in a number of directions. First, a natural extension would be to allow quantifying into a pfc. This would allow an assertion such as “each elephant normally likes its keeper”, which is somewhat different from our previous example. Moreover, by allowing quantifying into a pfc, we would be able to encode nested default assertions, such as “profs that (normally) give good lectures are (normally) liked by their students”. Second, we plan to allow for complex expressions in a pfc, and so allow a predicate expression in place of P in {P(~y), φ(~y)}. Last, as we already mentioned in Section 6, a thorough treatment and examination of nonmonotonic reasoning in LN is also in the works.

Future Work

One goal in our work is to extend the approach to a DL setting. In the following we present some preliminary ideas behind such an extension.
Consider again the assertion “birds fly”, which in a (non-defeasible) DL language is expressed by the concept inclusion Bird ⊑ Fly, whereas in a defeasible DL it could be expressed as T(Bird) ⊑ Fly or Bird ⊑~ Fly, among others. We propose to express the same concept inclusion, perhaps through some extended syntax, in such a way that its structure will invoke the use of the pfcs from our setting. In this specific assertion, e.g., the “new” DL expression would be semantically equivalent to the LN-formula:

∀x ({Fly(y), Bird(y)}(x) → Fly(x)) (8)

8 Conclusion

We have presented a new and well-behaved approach to representing default assertions through an expressive language and novel formalism. This approach takes the position that normality is not an absolute characteristic of an individual but instead is relative to a property (or, in general, a relation). This is achieved via an extension to the language of FOL, along with an enhancement to models in FOL; a subsequent result, however, shows that the approach may be embedded in standard FOL. The approach allows for a substantially more expressive language for representing default information than previous approaches. Moreover, we show that the approach possesses quite natural and desirable features and satisfies the standard KLM properties. With a variety of future directions and promising possible applications, like the one we briefly discussed for the DL setting, we believe the current framework presents an interesting new approach to representing and reasoning about defaults, as well as to obtaining “well-behaved” nonmonotonic reasoning in general.

That is, we are interested in the minimal elements that satisfy the left side of the inclusion, similar to some of the aforementioned DL approaches, while also specifying the preference ordering we want to employ.
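The reading of Equation (8) just described, that the Fly-minimal elements of Bird's extension are all flyers, is easy to check on a finite interpretation. Representing the Fly-preference ordering as integer ranks is our own simplifying assumption for illustration.

```python
# Check a default concept inclusion on a finite interpretation, reading
# Equation (8) as: the minimal birds (under the Fly ordering) all fly.
# Integer ranks stand in for the preference ordering (an assumption).
fly_rank = {"tweety": 0, "opus": 1, "woody": 0}   # lower = more normal wrt Fly
bird_ext = {"tweety", "opus", "woody"}
fly_ext = {"tweety", "woody"}

def default_holds(rank, left_ext, right_ext):
    """Do the rank-minimal elements of left_ext all belong to right_ext?"""
    if not left_ext:
        return True
    best = min(rank[d] for d in left_ext)
    minimal = {d for d in left_ext if rank[d] == best}
    return minimal <= right_ext

print(default_holds(fly_rank, bird_ext, fly_ext))  # True: the minimal birds fly
```

Opus, an abnormal bird with respect to Fly, sits at a higher rank and so does not block the default, which is exactly the intended min(·) ⊆ Fly^I behaviour.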
This means that the domain of any given interpretation would once again be enhanced with preference orderings and the new syntax would somehow indicate the preference ordering that is used inside a (default) concept inclusion. In other words, whereas the inclusion Bird ⊑ F ly would be interpreted as Bird I ⊆ F ly I , its default version would translate into and follow the same semantics of Equation 8, being instead interpreted (roughly) as min(F lyI , BirdI ) ⊆ F ly I . As for more complex and ambiguous statements, consider the “undergraduate students attend undergraduate courses” Acknowledgements We thank the reviewers for their helpful comments. Financial support was gratefully received from the Natural Sciences and Engineering Research Council of Canada. 128 References Kraus, S.; Lehmann, D.; and Magidor, M. 1990. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence 44(1-2):167–207. Lamarre, P. 1991. S4 as the conditional logic of nonmonotonicity. In Proceedings of the Second International Conference on the Principles of Knowledge Representation and Reasoning, 357–367. Lehmann, D., and Magidor, M. 1992. What does a conditional knowledge base entail? Artificial Intelligence 55(1):1–60. McCarthy, J. 1980. Circumscription – a form of nonmonotonic reasoning. Artificial Intelligence 13:27–39. McCarthy, J. 1986. Applications of circumscription to formalizing common-sense knowledge. Artificial Intelligence 28:89–116. Mendelson, E. 2015. Introduction to Mathematical Logic. CRC Press, 6th edition. Moore, R. 1985. Semantical considerations on nonmonotonic logic. Artificial Intelligence 25:75–94. Reiter, R., and Criscuolo, G. 1981. On interacting defaults. In Proceedings of the International Joint Conference on Artificial Intelligence, 270–276. Reiter, R. 1980. A logic for default reasoning. Artificial Intelligence 13(1-2):81–132. Shoham, Y. 1987. A semantical approach to nonmonotonic logics (extended abstract). 
In Symposium on Logic in Computer Science, 275–279. Varzinczak, I. 2018. A note on a description logic of concept and role typicality for defeasible reasoning over ontologies. Logica Universalis 12(3-4):297–325. Baader, F., and Hollunder, B. 1992. Embedding defaults into terminological knowledge representation formalisms. In Nebel, B.; Rich, C.; and Swartout, W., eds., Proceedings of the Third International Conference on the Principles of Knowledge Representation and Reasoning, 306–317. Baader, F.; Calvanese, D.; McGuinness, D.; Nardi, D.; and Patel-Schneider, P., eds. 2007. The Description Logic Handbook. Cambridge: Cambridge University Press, second edition. Bonatti, P. A.; Lutz, C.; and Wolter, F. 2009. The complexity of circumscription in description logic. Journal of Artificial Intelligence Research 35:717–773. Boutilier, C. 1994. Conditional logics of normality: A modal approach. Artificial Intelligence 68(1):87–154. Brafman, R. I. 1997. A first-order conditional logic with qualitative statistical semantics. Journal of Logic and Computation 7(6):777–803. Britz, K., and Varzinczak, I. J. 2016. Introducing role defeasibility in description logics. In Logics in Artificial Intelligence - 15th European Conference, JELIA, 174–189. Britz, K., and Varzinczak, I. 2019. Contextual rational closure for defeasible ALC. Annals of Mathematics and Artificial Intelligence 87(1-2):83–108. Delgrande, J. 1987. A first-order conditional logic for prototypical properties. Artificial Intelligence 33(1):105–130. Delgrande, J. 1998. On first-order conditional logics. Artificial Intelligence 105(1-2):105–137. Enderton, H. 1972. A Mathematical Introduction to Logic. Academic Press. Fariñas del Cerro, L.; Herzig, A.; and Lang, J. 1994. From ordering-based nonmonotonic reasoning to conditional logics. Artificial Intelligence 66(2):375–393. Gil, O. F. 2014. On the non-monotonic description logic alc+tmin . CoRR abs/1404.6566. Giordano, L., and Gliozzi, V. 2019. 
Reasoning about exceptions in ontologies: An approximation of the multipreference semantics. In Kern-Isberner, G., and Ognjanovic, Z., eds., Symbolic and Quantitative Approaches to Reasoning with Uncertainty, ECSQARU, volume 11726, 212–225. Springer.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. L. 2015. Semantic characterization of rational closure: From propositional logic to description logics. Artificial Intelligence 226:1–33.
Gliozzi, V. 2016. Reasoning about multiple aspects in rational closure for DLs. In Adorni, G.; Cagnoni, S.; Gori, M.; and Maratea, M., eds., AI*IA 2016: XVth International Conference of the Italian Association for Artificial Intelligence, volume 10037 of Lecture Notes in Computer Science, 392–405. Genova, Italy: Springer.
Kern-Isberner, G., and Thimm, M. 2012. A ranking semantics for first-order conditionals. In Proceedings of the European Conference on Artificial Intelligence, 456–461. IOS Press.

Probabilistic Belief Fusion at Maximum Entropy by First-Order Embedding

Marco Wilhelm1, Gabriele Kern-Isberner2
1,2 Department of Computer Science, TU Dortmund University, Dortmund, Germany
1 marco.wilhelm@tu-dortmund.de, 2 gabriele.kern-isberner@cs.tu-dortmund.de

Example 1. Consider a doctor who comes to the conclusion that a symptom s is an indicator for a disease d with probability 0.9, which she formalizes in a probabilistic conditional (d|s)[0.9] (“if s holds, then d holds with probability 0.9”), while her colleague is more skeptical and assigns the probability 0.8 to the same conditional, (d|s)[0.8]. If we trust both doctors, we do not want to reject either one’s opinion but to exploit both in order to obtain a unified view on this issue. Obviously, both conditionals cannot be satisfied at the same time, and Rdoc = {(d|s)[0.9], (d|s)[0.8]} is inconsistent. Hence, a more sophisticated approach is needed to combine both doctors’ views than purely joining them by set union.
Instead, it seems to make sense to derive some kind of mean value of the two probabilities 0.9 and 0.8.

Abstract

Belief fusion is the task of combining beliefs of several reasoners such that the outcome reflects the consensual opinions of the group of reasoners properly. We consider the case in which the beliefs are formalized by probabilistic conditional statements of the form “if A holds, then B follows with probability p”, where A and B are propositions, and present a formal framework for generating belief fusion operators that deal with such probabilistic beliefs. For this, we translate the beliefs of the reasoners into first-order conditionals and apply the principle of maximum entropy to the merged beliefs in order to obtain a consolidated belief state. By varying the semantics of first-order conditionals, it is possible to generate different belief fusion operators. We prove that well-known belief fusion operations like linear and logarithmic pooling of maximum entropy distributions can be reproduced with our approach, while it can also be used to generate novel operators.

1 Introduction

Judgment aggregation (Grossi and Pigozzi 2014) is a rapidly growing research area which addresses the problem of combining judgments of several individuals when their common opinion on a certain issue is in demand. It has practical applications in domains such as economics, philosophy, political science, law, and medicine. In the subfield of probabilistic aggregation one is interested in a “probability assignment to a given set of propositions on basis of the group members’ individual probability assignments” (List 2012), which shifts this research topic into the field of belief fusion (Bloch et al. 2001; Dubois et al. 2016), as uncertain probabilistic beliefs have to be combined. In this paper, we present a novel approach for merging probabilistic belief bases based on a first-order translation of beliefs and apply the principle of maximum entropy (Paris 2006) to the merged belief base in order to infer an aggregated belief state. The first-order embedding mainly brings three advantages with it: (a) While a simple merging of the belief bases of several reasoners typically causes conflicts (inconsistencies in the merged belief base; cf. Example 1), our approach guarantees consistency by a syntactic separation of the beliefs of different reasoners. (b) The first-order setting allows one to ask and answer more complex queries than the initial propositional setting. And, most importantly, (c) the semantical freedom of first-order conditionals can easily be used to define different belief fusion operators.

While we represent the beliefs of single reasoners by propositional conditionals (B|A)[p] with the meaning “if A holds, then B follows with probability p,” we translate them into first-order conditionals when merging. This enables a (fictive) decision maker to differentiate between the viewpoints of the single reasoners. Ground instantiated first-order conditionals (B(i)|A(i))[p] express that “B follows from A with probability p in the view of reasoner ri,” while open first-order conditionals (B(X)|A(X))[p] stand for “B follows from A with probability p in the consolidated view of the group of reasoners.” By doing so, the semantics of open conditionals affects (or reflects, depending on your point of view) the reasoning behavior of the decision maker. Eventually, the application of the principle of maximum entropy to the merged belief base completes missing probability values in order to obtain a whole belief state while adding as little information as possible. In our opinion, this methodology fits the mission of the decision maker perfectly, as she should not contribute her own beliefs; her task is to process the beliefs of the reasoners as unbiasedly as possible. In summary, the main contribution of this paper is the presentation of a framework for generating belief fusion operators based on first-order embedding, which comes along with an inherent connection between probabilistic belief fusion and first-order semantics.

The paper is organized as follows: First, we give a brief insight into probabilistic belief fusion in general, followed by a short discussion of pooling maximum entropy distributions in a propositional setting. Afterwards, we switch to the first-order level, introduce different semantics for first-order conditionals, present our approach for belief base merging based on first-order embedding, and apply the principle of maximum entropy. We elaborate on different first-order semantics and their connection to belief fusion operators, compare our approach with related work, and conclude.

Figure 1: The two ways of processing social inferences. (Diagram: from the belief bases R1, . . . , Rn of n reasoners, either inductive inference yields the belief states P(R1), . . . , P(Rn), which are then opinion-pooled into the fused belief state; or the belief bases are first merged into an aggregated belief base R, from which the fused belief state P(R) is inferred inductively.)

2 Probabilistic Belief Fusion

The linear pooling operator combines the distributions by a (normalized) weighted arithmetic mean:

⊕_{i=1..n} Pi(ω) = Σ_{i=1..n} µi · Pi(ω), ω ∈ Ω.

A more general setting is dealt with in social inference processes (Wilmers and Jensen 2010; Adamcik 2014; Wilmers 2015). In social inference processes, the reasoners do not contribute their whole belief state but a set of beliefs which is called a belief base. In general, a belief base does not determine the belief state of the reasoner completely, and missing information has to be inferred inductively. When starting with a family of belief bases R1, . . . , Rn instead of belief states P1, . . . , Pn, there are basically two different ways of obtaining a fused belief state P. On the one hand, it is possible to inductively infer a belief state Pi(Ri) from the belief base Ri for every reasoner ri independently and then apply an opinion pooling operator on P1(R1), . . . , Pn(Rn). This two-stage process is called obdurate merging (Adamcik 2014; Wilmers 2015).
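Both pooling operations discussed in this section are direct to implement; the sketch below transcribes the linear and logarithmic pooling formulas with distributions as dicts over possible worlds and uniform weights by default. The toy distributions for the two doctors are our own (each satisfies the corresponding conditional (d|s)[p] from Example 1).

```python
import math

def linear_pool(dists, weights=None):
    """Weighted arithmetic mean of probability distributions (dicts world -> prob)."""
    n = len(dists)
    w = weights or [1.0 / n] * n
    return {omega: sum(wi * P[omega] for wi, P in zip(w, dists))
            for omega in dists[0]}

def log_pool(dists, weights=None):
    """Normalized weighted geometric mean of probability distributions."""
    n = len(dists)
    w = weights or [1.0 / n] * n
    raw = {omega: math.prod(P[omega] ** wi for wi, P in zip(w, dists))
           for omega in dists[0]}
    z = sum(raw.values())   # normalization constant
    return {omega: r / z for omega, r in raw.items()}

# Toy belief states for the two doctors: P(d|s) = 0.9 resp. 0.8, P(s) = 0.5.
P1 = {"ds": 0.45, "d~s": 0.05, "~ds": 0.05, "~d~s": 0.45}
P2 = {"ds": 0.40, "d~s": 0.10, "~ds": 0.10, "~d~s": 0.40}
print(linear_pool([P1, P2]))
print(log_pool([P1, P2]))
```

With equal weights 1/n, both pools land between the two doctors' distributions, realizing the "mean value" intuition from Example 1.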
On the other hand, it is possible to merge the belief bases into a unified belief base R first and to infer a belief state P(R) from R afterwards (cf. Figure 1). As for opinion pooling, for belief base merging (Konieczny and Pérez 2011) there is not one generally accepted strategy; there are many different competing approaches.

Belief fusion (Bloch et al. 2001; Dubois et al. 2016) addresses the task of aggregating beliefs of several reasoners when their common opinion is in demand. The usual way of aggregating beliefs in the form of probability assignments is opinion pooling (Dietrich and List 2017; Genest and Zidek 1986). In opinion pooling one assumes that the reasoners (say, ri for i = 1, . . . , n) contribute their whole belief state, which is formalized by a probability distribution Pi over a set of possible worlds Ω. The probability of a possible world ω ∈ Ω expresses the reasoner’s degree of belief in whether ω formalizes the real world accurately or not (in relation to the other possible worlds). For all reasoners, the set of possible worlds is assumed to be the same in order to guarantee comparability of the probabilities. As there is no obvious solution to the opinion pooling problem, which is finding a mapping from the single belief states to a consensual belief state that reflects the opinions of the group of reasoners best, various properties are declared in order to determine what is a ‘good’ opinion pooling operation (see, e.g., (Dietrich and List 2017; Genest and Zidek 1986)). The most prominent approaches map probabilities to some kind of mean value: while linear pooling (Stone 1961; McConway 1981) maps probabilities to a (normalized) weighted arithmetic mean, logarithmic pooling (Bacharach 1972) maps probabilities to a (normalized) weighted geometric mean:

⊕_{i=1..n} Pi(ω) = (Π_{i=1..n} Pi(ω)^{µi}) / (Σ_{ω′∈Ω} Π_{i=1..n} Pi(ω′)^{µi}), ω ∈ Ω.

The weights µ1, . . . , µn usually satisfy µi ≥ 0 for i = 1, . . . , n, and Σ_{i=1..n} µi = 1, and they regulate the impact of the single belief states on the aggregated one. They can be understood as a measure of how much an external decision maker trusts the particular reasoners when fusing their beliefs. If nothing is known about the reliability (or expertise) of the reasoners, the weights should equal 1/n.

For probabilistic inductive inferences, the principle of maximum entropy (Shannon and Weaver 1949; Paris 2006) provides a well-founded methodology. The maximum entropy distribution for a belief base R is the probability distribution which satisfies all beliefs in R while adding as little information as possible. In (Paris 1999) it is shown that the maximum entropy distribution is the only probability distribution which satisfies a couple of fundamental principles from commonsense reasoning. Accordingly, some effort has been expended on obdurate merging at maximum entropy (Wilmers and Jensen 2010; Adamcik 2014; Wilmers 2015), i.e., on processing social inferences following the way “down right” in Figure 1. As opposed to this, the interaction of merging belief bases first and applying the maximum entropy principle afterwards has not been investigated satisfactorily yet (the way “right down” in Figure 1). In this paper, we want to provide insights into this second way of processing social inferences. In detail, we present a novel approach for merging belief bases which uses a first-order translation of beliefs. Then, we apply the principle of maximum entropy to the merged beliefs and draw inferences which depend on the semantics of first-order conditionals. We show that our approach is expressive enough to produce well-known belief fusion operators that are defined via obdurate merging. In addition, our approach allows one to easily define novel belief fusion operators by varying the semantics of first-order conditionals. Due to the expressiveness of first-order logic, we are also able to formulate and answer more complex queries than in a purely propositional setting. Before we present our first-order embedding approach and highlight its benefits, we briefly recall obdurate merging at maximum entropy and discuss different semantics for our first-order setting.

3 Obdurate Merging at Maximum Entropy

We start this section with a discussion of maximum entropy reasoning for a single reasoner who expresses her beliefs in the form of probabilistic conditional statements “if A holds, then B follows with probability p” over a propositional language L. Afterwards, we recall the obdurate merging operators OLEP and OSEP (see, e.g., (Dietrich and List 2017; Adamcik 2014)), with which the beliefs of several reasoners can be fused.

More precisely, for each probability p ∈ [0, 1], there is a model of Rcp in which (c|ab)[p] holds. At the same time, it is reasonable to assume that the probability p of (c|ab)[p] is at least 0.7 (unless a and b weaken each other’s strength of evidence for c, which is possible but certainly not to be assumed by default). Hence, for reasoning tasks, it is useful to select a single model of a consistent belief base R. Of course, this model should reflect the belief state of the reasoner with belief base R appropriately. Here, we rely on the aforementioned maximum entropy distribution (Paris 2006), which is formally defined by

ME(R) = arg max_{P |= R} − Σ_{ω} P(ω) · log P(ω),

where the convention 0 · log 0 = 0 applies. Note that, if R is consistent, ME(R) exists and is unique.

Let Σ = {a, b, c, . . .} be a finite set of propositions which can either be true or false.
A formula in L(Σ) is a proposition or is inductively defined by ¬A (negation), A ∧ B (conjunction), or A ∨ B (disjunction), where A, B ∈ L(Σ). The interpretation of formulas is as usual in propositional logic. To shorten mathematical expressions, we write Ā instead of ¬A, AB instead of A ∧ B, and ⊤ for any tautological formula like A ∨ Ā. Probabilistic conditionals are denoted by (B|A)[p], where A, B ∈ L(Σ) and p ∈ [0, 1], and formalize a reasoner’s degree of belief in B in the presence of A, measured by the probability p. Finite sets of probabilistic conditionals R serve as belief bases. The semantics of probabilistic conditionals is based on probability distributions over possible worlds. Here, a possible world ω is a complete conjunction of literals, i.e., every proposition from Σ occurs in ω exactly once, either positive or negated. A probability distribution P over the set of all possible worlds Ω(Σ) is a model of a belief base R iff

∀(B|A)[p] ∈ R : P(A) > 0 ∧ P(AB)/P(A) = p,

where P(A) = Σ_{ω |= A} P(ω) for A ∈ L(Σ) and |= is the classical entailment relation. A belief base is consistent iff it has at least one model. Note that P(A) = P(A|⊤) for A ∈ L(Σ). Hence, probabilistic formulas are subsumed within this framework by identifying A[p] = (A|⊤)[p]. Due to the vast number of probability distributions over Ω(Σ), reasoning over all models of a consistent belief base is often very uninformative.

Example 2. If Rcp = {(c|a)[0.7], (c|b)[0.9]}, i.e., a and b are evidence for c, nothing can be said about the likelihood of c in the presence of a and b when reasoning over all models of Rcp.

The maximum entropy distribution yields the non-monotonic inference relation R |=ME (B|A)[p] iff ME(R)(B|A) = p. For example (cf. Example 2), Rcp |=ME (c|ab)[p] with p ≈ 0.908.

Now, assume that there are several reasoners r1, . . . , rn, each equipped with a belief base Ri, the common opinion of which is in demand. The obdurate merging operator

OLEP(R1, . . . , Rn)(ω) = (1/n) · Σ_{i=1..n} ME(Ri)(ω), ω ∈ Ω(Σ),

linearly pools the maximum entropy distributions for R1, . . . , Rn and is called the obdurate linear entropy process (Adamcik 2014; Dietrich and List 2017). The operator

OSEP(R1, . . . , Rn)(ω) = (Π_{i=1..n} ME(Ri)(ω)^{1/n}) / (Σ_{ω′ ∈ Ω(Σ)} Π_{i=1..n} ME(Ri)(ω′)^{1/n}), ω ∈ Ω(Σ),

logarithmically pools the maximum entropy distributions and is called the obdurate social entropy process (Adamcik 2014; Dietrich and List 2017). Both operators satisfy a couple of desirable properties from opinion pooling and, hence, are important representatives of pooling operators (see (Adamcik 2014) for a comprehensive collection and comparison of properties).

4 Maximum Entropy and First-Order Conditionals

In preparation for our belief base merging approach, which uses a first-order translation of beliefs, we briefly discuss some particularities of maximum entropy reasoning with first-order conditionals. The main difference to reasoning with propositional conditionals is that one has to specify a semantics for first-order conditionals with free variables. Actually, we profit from this additional requirement when varying the semantics of first-order conditionals in order to produce different belief fusion operators later on.

In this section, we consider a function-free first-order language FOL over the signature (Pred, Const), consisting of finite sets of predicates Pred and constants Const. Formulas in FOL are built by using the common connectives (∧, ∨, ¬) and quantifiers (∃, ∀). While constants and predicates are denoted with sans serif lowercase letters, we denote variables with uppercase letters. We use the same abbreviations for conjunction and negation as in the propositional case. If p is a predicate of arity n and c1, . . . , cn are constants, the formula p(c1, . . . , cn) is called a ground atom. The set of all ground atoms is denoted by Σfo = Σfo(Pred, Const). A ground literal is a ground atom or its negation. Possible worlds in Ω(Σfo) are complete conjunctions of ground literals.
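The maximum entropy distribution recalled above can be computed numerically: each conditional (B|A)[p] contributes the linear constraint P(AB) − p·P(A) = 0, so ME(R) has the exponential-family form P(ω) ∝ exp(Σ_i λ_i f_i(ω)) with features f_i = 1_{A_iB_i} − p_i·1_{A_i}, and the λ_i can be found by gradient descent on the convex dual log Z(λ). This is a generic textbook method, not the authors' implementation; the sketch below reproduces Example 2's value Rcp |=ME (c|ab)[p] with p ≈ 0.908.

```python
import itertools, math

ATOMS = ["a", "b", "c"]
WORLDS = [dict(zip(ATOMS, bits)) for bits in itertools.product([True, False], repeat=3)]

# Rcp = {(c|a)[0.7], (c|b)[0.9]}: each conditional as (antecedent, consequent, p).
RCP = [(lambda w: w["a"], lambda w: w["c"], 0.7),
       (lambda w: w["b"], lambda w: w["c"], 0.9)]

def maxent(conds, steps=5000, eta=0.5):
    # feature f_i(w) = 1_{A_i B_i}(w) - p_i * 1_{A_i}(w); ME requires E[f_i] = 0
    feats = [[float(a(w) and c(w)) - p * float(a(w)) for w in WORLDS]
             for a, c, p in conds]
    lam = [0.0] * len(conds)
    probs = [1.0 / len(WORLDS)] * len(WORLDS)
    for _ in range(steps):
        scores = [math.exp(sum(l * f[j] for l, f in zip(lam, feats)))
                  for j in range(len(WORLDS))]
        z = sum(scores)
        probs = [s / z for s in scores]
        # gradient of the convex dual log Z w.r.t. lambda_i is E[f_i]
        lam = [l - eta * sum(pq * f[j] for j, pq in enumerate(probs))
               for l, f in zip(lam, feats)]
    return probs

def cond_prob(probs, ante, cons):
    num = sum(p for w, p in zip(WORLDS, probs) if ante(w) and cons(w))
    den = sum(p for w, p in zip(WORLDS, probs) if ante(w))
    return num / den

me = maxent(RCP)
p_c_ab = cond_prob(me, lambda w: w["a"] and w["b"], lambda w: w["c"])
print(round(p_c_ab, 3))  # ≈ 0.908, matching Example 2
```

OLEP and OSEP then amount to linearly or logarithmically pooling the lists `maxent(R1), . . . , maxent(Rn)` world by world.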
Formulas in FOL can be instantiated by substituting each free variable by a constant (e.g., ∀X r(X, a) is an instance of ∀X r(X, Y)). The set of all instances of a formula A is denoted by inst(A) and the set of the free variables in A by var(A). Formulas without free variables are called closed. In analogy to the propositional case, probabilistic conditionals are expressions of the form (B|A)[p] with a probability p. In this context, however, A and B may be arbitrary first-order formulas from FOL. In particular, they may contain free variables. While the interpretation of closed conditionals, i.e. conditionals (B|A)[p] with closed formulas A and B, is obvious (P |= (B|A)[p] iff P(B|A) = p), conditionals with free variables can be interpreted in different ways. We recall three popular semantics of first-order conditionals from the literature, namely the grounding semantics, the averaging semantics, and the aggregating semantics (Kern-Isberner and Thimm 2010; Thimm and Kern-Isberner 2012), and present two novel semantics as well.

Beforehand, we introduce some further notation that is used in the definitions of the semantics. Let r = (B|A)[p] be a first-order conditional. Then, grnd(r) denotes the set of all proper groundings of r. Proper groundings are obtained by substituting each free variable that is mentioned in A or B by any constant from Const. For example, (d(1)|s(1))[p] and (d(2)|s(2))[p] are the proper groundings of (d(X)|s(X))[p] when Const = {1, 2}. Note that (d(1)|s(2))[p] is not a proper grounding, as free variables that are mentioned in both A and B have to be substituted with the same constant in A and B. With verr(ω) and appr(ω) we count the numbers of proper groundings of r = (B|A)[p] that are verified or applicable, respectively, in the possible world ω:

verr(ω) = |{(B′|A′)[p] ∈ grnd(r) | ω |= A′B′}|,
appr(ω) = |{(B′|A′)[p] ∈ grnd(r) | ω |= A′}|.

It holds that 0 ≤ verr(ω) ≤ appr(ω) ≤ |grnd(r)| for all conditionals r and possible worlds ω.

Definition 1 (Grounding Semantics). Let r = (B|A)[p] be a conditional defined over FOL, and let P be a probability distribution over Ω(Σfo). Then, P is a grounding-model of r, written P |=grnd r, iff ∀(B′|A′)[p] ∈ grnd(r) : P |= (B′|A′)[p].

The grounding semantics requires that for every proper grounding of a conditional statement the conditional probability is the same, namely p. Note that this is a rather strong constraint. The following semantics are more ‘smoothing’ and are built upon mean values over all proper groundings instead. For example, the averaging semantics looks at the arithmetic mean of the probabilities of all groundings of r.

Definition 2 (Averaging Semantics). Let r = (B|A)[p] be a conditional defined over FOL, and let P be a probability distribution over Ω(Σfo). Then, P is an averaging-model of r, written P |=avrg r, iff

(1/|grnd(r)|) · Σ_{(B′|A′)[p] ∈ grnd(r)} P(B′|A′) = p.

The idea behind the aggregating semantics is to mimic statistical probabilities from a subjective point of view: not the relative frequency of the sum of the verifications of the instances of the conditional (spread over all possible worlds) is measured against the applicability of the conditional, but the reasoner’s beliefs in the eventuation of these instances are taken into account.

Definition 3 (Aggregating Semantics). Let r = (B|A)[p] be a conditional defined over FOL, and let P be a probability distribution over Ω(Σfo). Then, P is an aggregating-model of r, written P |=aggr r, iff

Σ_{(B′|A′)[p] ∈ grnd(r)} P(A′) > 0 and (Σ_{(B′|A′)[p] ∈ grnd(r)} P(A′B′)) / (Σ_{(B′|A′)[p] ∈ grnd(r)} P(A′)) = p.

Our two novel semantics are specially constructed for our belief fusion approach. In the approving semantics, the probabilities of the possible worlds are weighted with the relative frequency of the verifications of the conditional within the respective possible world. That is, probabilities get a higher weight the more proper groundings of the conditional are verified relative to the number of falsifications.

Definition 4 (Approving Semantics). Let r = (B|A)[p] be a conditional defined over FOL, and let P be a probability distribution over Ω(Σfo). Then, P is an approving-model of r, written P |=appr r, iff

Σ_{ω ∈ Ω(Σfo)} fr(ω) · P(ω) = p, where fr(ω) = verr(ω)/appr(ω) if appr(ω) > 0, and fr(ω) = 0 otherwise.

Our last semantics, the uniformity semantics, works on certain subclasses of first-order conditionals only. For simplicity, we concentrate on a Boolean fragment BOOL of FOL here. BOOL is the quantifier-free fragment of FOL where Pred consists of unary predicates only. Hence, formulas in BOOL are Boolean combinations of unary predicates. We specify Const = {1, . . . , n}. Then, possible worlds ω can be decomposed into ω = ⋀_{i=1..n} ωi such that ωi contains those ground literals from ω that are instantiated with i. With A[i/j] we denote the formula A in which every occurrence of the constant i is replaced by the constant j.

We now present our approach to merging (propositional) belief bases R1, . . . , Rn of n reasoners. The core idea of our approach is to lift the background language of the conditionals in Ri to a fragment of first-order logic. By doing so, the fictive decision maker who has access to the merged belief base is equipped with a more expressive language than the reasoners and is able to express statements about the opinions of the reasoners, i.e., statements of the form

Definition 5 (Uniformity Semantics). Let r = (B|A)[p] be a conditional defined over BOOL, and let P be a probability distribution over Ω(Σfo).
Then, P is an uniformity-model of r, written X P |=unif r, iff “if A holds, then B follows with probability p in the view of reasoner ri ” P(ω)1/n > 0 and of the form ω∈Ω(Σbool ) ∀i=1,...,n: ωi [i/1]=ω1 ω1 |=A(1) “if A holds, then B follows with probability p in the view of the group of reasoners.” P and P(ω)1/n ω∈Ω(Σbool ) ∀i=1,...,n: ωi [i/1]=ω1 ω1 |=A(1)B(1) P P(ω)1/n This is a reasonable and natural extension of the language that meets the intention of the decision maker appropriately: The decision maker does not appear as an autonomous, standalone reasoning agent who revises her own beliefs but reflects and processes the opinions of the other reasoners. In nearly all approaches to belief base merging, the merged belief base R makes use of the same background language as the single belief bases that are merged instead. In our setting this would mean that R was a set of conditionals (B|A)[p] with A, B ∈ L(Σ). With this, the fictive decision maker with belief base R would be able to express the same statements as the reasoning agents but should assign aggregated probabilities to the statements in order to achieve a consensus. She would no longer be able to differentiate between the viewpoints of the reasoners. In particular, statements that involve opposing attitudes of several reasoners like “if the first doctor believes in disease d but the second does not, the decision maker/group of reasoners believes in the presence of symptom s with probability p” are not (directly) expressible. Further, it is a widely accepted but not an uncontroversial postulate of belief base merging Sn that merging should be performed by set union, R = i=1 Ri , if the union is consistent (Konieczny and Pérez 2011). This might be reasonable if, for example, the merged belief base belongs to a single reasoner who connects information from several sources that are formalized by the belief bases which are merged. 
However, in our setting, merging by set union is inappropriate, as it disregards the parts of the belief states of the single reasoners that are given only implicitly by the inference behaviors of the reasoners. We now discuss the technical aspects behind our merging approach. We lift the propositional language L(Σ) to a Boolean fragment of FOL, basically, by translating propositions from L(Σ) to unary predicates. While instantiations of these predicates correspond to propositions in view of a single reasoning agent, a predicate with a (free) variable represents a proposition in view of the group. This translation eventuates in a first-order signature (Const, Pred) consisting of the finite set of constants Const = {1, . . . , n} (the reasoners’ ids) and the finite set of predicates = p. ω∈Ω(Σbool ) ∀i=1,...,n: ωi [i/1]=ω1 ω1 |=A(1) We will further investigate and compare these semantics later on in the light of social inference processes. As minimal requirements for developing new semantics of first-order conditionals, one should guarantee that conditionals are evaluated to probability values and that closed conditionals are interpreted by conditional probabilities. We call semantics which satisfy these requirements well-behaved. All five aforementioned semantics are wellbehaved. In order to refer to an arbitrary semantics of first-order conditionals, we write |=sem . Hence, the subscript sem serves as a placeholder for grnd, avrg, and so on. In order to indicate that conditionals are interpreted under a certain semantics, we annotate the respective subscript also to probability distributions. For example, we write Paggr (B|A) when the conditional statement (B|A) is evaluated under the aggregating semantics. A probability distribution P is a sem-model of a firstorder belief base Rfo , i.e. a finite set of first-order conditionals, if it models all conditionals in Rfo with respect to the sem-semantics. 
Once a semantics sem is fixed, the maximum entropy distribution for Rfo is defined by ME(Rfo ) = arg max − P|=sem Rfo X Belief Base Merging by First-Order Embedding P(ω) · log P(ω). ω∈Ω(Σfo ) It remains to note that the maximum entropy distribution for arbitrary first-order belief bases and with respect to arbitrary semantics does neither need to exist nor need to be unique. However, we will show that in our concrete application the maximum entropy distribution will exist and will be unique with respect to all well-behaved semantics. Pred = {a/1 | a ∈ Σ}, 134 fo fo separation means that the union Rfo doc = Rdoc,1 ∪ Rdoc,2 of the reasoners’ belief bases is no longer necessarily inconsistent as the two conditionals in Rfo doc deal with the same issue but from different points of view, implemented by different syntactic elements. As we will see later on, this consistency preservation carries over to arbitrary (consistent) prior belief bases, and belief base merging can be performed by simply joining their first-order pendants. Definition 6 (First-Order Merging). The first-order merging Rfo of the belief bases R1 , . . . , Rn is defined by where all predicates are of arity 1. For simplicity, we name the predicates as their corresponding propositions but write them in sans serif letters. This leads to an easy-to-read translation while it still enables one to distinguish between propositional and first-order expressions. Hence, each proposition a ∈ Σ corresponds to an atom a(X). The set of ground atoms becomes Σfo = {a(i) | a ∈ Pred, i ∈ Const}. One can say that propositions are translated into several duplicates, one for each reasoner ri . Further, we define the first-order translation fo(·) of formulas from L(Σ) in a straightforward, recursive way by Rfo = Rfo (R1 , . . . , Rn ) := n G i=1 • fo(a) = a(X) for propositions a ∈ Σ, and Ri := n [ Rfo i . i=1 Although the merging operator ⊔ pretty much looks like ordinary set union, it differs from joining R1 , . . . 
, Rn as the fo fo first-order translations Sn R1 , . . . , Rn are joined instead, even in the case when i=1 Ri is consistent. This deviance is intended as already mentioned. A side product of our approach is that the belief base of a single reasoner ri can be recovered from the merged belief base Rfo by extracting those conditionals that mention ground atoms with constant i and by back-translating them into the propositional language L(Σ). • fo(¬A) = ¬fo(A), fo(A ∧ B) = fo(A) ∧ fo(B), and fo(A ∨ B) = fo(A) ∨ fo(B) for A, B ∈ L(Σ). In plain words, fo(A) is the formula A in which every proposition is replaced by its corresponding (non-grounded) atom from first-order logic. Again, with an easy readability in mind, we name the resulting formulas of a first-order translation with sans serif letters, i.e. fo(A) = A, similarly as we have done for propositions. The entirety of all possible first-order translations of formulas from L(Σ) forms a Boolean fragment of the first-order language FOL to which we refer as BOOL(Σ). Hence, fo(·) is a bijection between L(Σ) and BOOL(Σ). We denote the set of conditionals (B|A)[p] with A, B ∈ BOOL(Σ) with CondB (Σ). In order to merge belief bases R1 , . . . , Rn we compile them so that they fit into our first-order setting. For this, we substitute every conditional (B|A)[p] ∈ Ri with (B(i)|A(i))[p], where A(i) and B(i) are the first-order translations of A and B that are instantiated with the constant i. As such an instantiated formula A(i) means “A in the view of agent ri ,” the conditional (B(i)|A(i))[p] can be understood as the conditional (B|A)[p] in the view of reasoner ri . With this, we define the first-order translation of Ri by 6 Social Inference at Maximum Entropy Based on First-Order Embedding We now discuss the semantical aspects of our merging approach and define a schema for generating belief fusion operators at maximum entropy Fn based on our first-order embedding. 
For this, let Rfo = i=1 Ri be the first-order merging of the belief bases R1 , . . . , Rn . Without a proper semantics, Rfo is just a collection of the single reasoners’ beliefs that are marked with the reasoners’ id’s. The essential question when inferring fused beliefs from the belief base Rfo is how the conditionals in Rfo should be combined in order to observe a unified view that reflects the opinions of all reasoners. This aggregation is done by relating the ground instantiated conditionals (B(i)|A(i))[p] ∈ Rfo , i = 1, . . . , n, to the corresponding open conditional (B(X)|A(X))[p] ∈ CondB (Σ). While (B(i)|A(i))[p] expresses a belief of reasoner ri , the open conditional (B(X)|A(X))[p] expresses the unified view of all reasoners on the conditional event (B|A). In Example 1, for instance, we are interested in the unified view of both doctors on the influence of symptom s on disease d which can be formalized by the open conditional (d(X)|s(X))[p]. That is, we answer the query “With what probability do the doctors assume d in the presence of s?”, written (d|s)[?], with the probability p of the conditional (d(X)|s(X))[p]. Hence, the answer to the query depends on the semantics of open first-order conditionals. Once the belief bases R1 , . . . , Rn are merged to Rfo and a semantics of first-order conditionals is fixed, belief fusion is straightforward. Definition 7 (First-Order Belief Fusion). Let R1 , . . . , Rn be consistent belief bases,Flet P(Rfo ) be a model of the n merged belief base Rfo = i=1 Ri , and let sem be a wellfus behaved first-order semantics. Then, the fo-fusion Psem of Rfo i = {(B(i)|A(i))[p] | (B|A)[p] ∈ Ri } for i = 1, . . . , n. Example 3. The first-order translation of the belief base Rdoc,1 = {(d|s)[0.9]} of the first doctor from Example 1 is Rfo doc,1 = {(d(1)|s(1)[0.9]}, and the first-order translation of the belief base of the second doctor is Rfo doc,2 = {(d(2)|s(2))[0.8]}. 
When considering only a single reasoner ri , the first-order translation of the belief base Ri is nothing else than renaming the propositions, since Rfo i is grounded, and reasoning about Rfo works the same as reasoning about Ri . When i comparing the belief bases of several reasoners, the translafo tion process implies that every two belief bases Rfo i and Rj with i 6= j do not share any ground atoms so that they are syntactically separated. In particular, they are disjoint sets, fo i.e., Rfo i ∩ Rj = ∅, even if this is not the case for Ri and Rj , i.e., Ri ∩ Rj 6= ∅. In the doctors example this syntactic 135 MEfus sem (R1 , . . . , Rn ) depends on the semantics of first-order conditionals. We conclude this section by illustrating the ME-fusion by means of an example. R1 , . . . , Rn with respect to P and sem is defined by fus Psem (R1 , . . . , Rn ) |= (B|A)[p] iff P(Rfo ) |=sem (B(X)|A(X))[p] Example 4. We recall Example 1. One has for propositional conditionals (B|A)[p]. MEfus avrg (Rdoc,1 , Rdoc,2 )(d|s) =  1 = · ME(Rfo )(d(1)|s(1)) + ME(Rfo )(d(2)|s(2)) 2 1 = · (0.9 + 0.8) = 0.85 2 fus We usually omit the arguments of Psem and P when they are clear from the context. In particular, one has fus Psem (ω) = p iff P |=sem (w(X)|⊤)[p], where w(X) = fo(ω) is the first-order translation of the possible world ω ∈ Ω(Σ). Bear in mind that possible worlds from Ω(Σ) are not translated to possible worlds in Ω(Σfo ) but to open formulas w(X) ∈ BOOL(Σ). In fact, Definition 7 is not the definition of a single belief fusion operator but is a schema for generating a whole family of belief fusion operators which can be observed by varying the models of Rfo as well as the first-order semantics. We have already mentioned that there are several semantics of first-order conditionals but it remains to clarify under which constraints there is a model of Rfo . For this, F we show n that the maximum entropy distribution for Rfo = i=1 Ri exists if R1 , . . . , Rn are consistent. 
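To make the averaging and aggregating semantics concrete, the following sketch evaluates the open conditional (d(X)|s(X)) over the ground atoms s(1), d(1), s(2), d(2) of the two-doctor example. The joint distribution used here is an arbitrary hand-built product distribution (a stand-in, not the maximum entropy distribution ME(Rfo)), and all function names are illustrative, not from the paper.

```python
from itertools import product

# Worlds over the ground atoms s(1), d(1), s(2), d(2); a world maps atom -> bool.
ATOMS = ["s1", "d1", "s2", "d2"]

def worlds():
    for bits in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, bits))

# Stand-in local distribution for doctor i with P(d(i)|s(i)) = p_d_given_s
# and P(s(i)) = 0.5; d is a fair coin when s is false.
def local(s, d, p_d_given_s):
    if s:
        return 0.5 * (p_d_given_s if d else 1 - p_d_given_s)
    return 0.5 * 0.5

# Joint distribution P: product of the two doctors' local distributions.
def P(w):
    return local(w["s1"], w["d1"], 0.9) * local(w["s2"], w["d2"], 0.8)

def prob(event):  # probability of an event, i.e. a predicate on worlds
    return sum(P(w) for w in worlds() if event(w))

# grnd((d(X)|s(X))[p]) for Const = {1, 2}: the two proper groundings.
groundings = [("s1", "d1"), ("s2", "d2")]

# Averaging semantics (Definition 2): arithmetic mean of P(d(i)|s(i)).
avg = sum(prob(lambda w, a=a, b=b: w[a] and w[b]) / prob(lambda w, a=a: w[a])
          for a, b in groundings) / len(groundings)

# Aggregating semantics (Definition 3): summed verifications over summed applicabilities.
agg = (sum(prob(lambda w, a=a, b=b: w[a] and w[b]) for a, b in groundings)
       / sum(prob(lambda w, a=a: w[a]) for a, b in groundings))

print(round(avg, 4))  # 0.85, the mean of 0.9 and 0.8
print(round(agg, 4))  # also 0.85 here, since P(s(1)) = P(s(2)) by construction
```

Under this toy distribution the two semantics coincide because both groundings of the premise are equally probable, so the aggregating weights μ_i are equal; under ME(Rfo) the weights differ, which is why Example 4 reports ≈ 0.8475 for the aggregating semantics.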
Hence, Rfo is consistent in this case, too. Recall that the maximum entropy optimization problem, i.e. finding a model of Rfo which has maximal entropy among all models of Rfo, is mathematically the same in our first-order setting as in the propositional case, aside from the fact that one has to replace the belief base of a single reasoner with the merged belief base and the search space of all probability distributions over Ω(Σ) with those over Ω(Σfo). The maximum entropy distribution ME(Rfo) does not depend on the interpretation of first-order conditionals with free variables, though, since all conditionals in Rfo are ground. Therefore, the constraints in the optimization problem are the same for all well-behaved semantics and are linear combinations of the probabilities that have to be found. According to (Boyd and Vandenberghe 2004), the maximum entropy optimization problem has a unique solution in this case provided that the belief bases R1, . . . , Rn are consistent, which guarantees that the search space is non-empty. Consequently, the maximum entropy distribution ME(Rfo) exists and Rfo is consistent. A further consequence is that belief fusion operators according to Definition 7 and with respect to ME(Rfo) always exist.

Definition 8 (Social Inference at Maximum Entropy). Let R1, . . . , Rn be consistent belief bases, and let sem be a well-behaved first-order semantics. Then, we define a social inference operator at maximum entropy for R1, . . . , Rn and sem, the ME-fusion for short, by

MEfus_sem(R1, . . . , Rn) |= (B|A)[p] iff ME(Rfo) |=sem (B(X)|A(X))[p]

for propositional conditionals (B|A)[p].

Note that in contrast to the definition of the maximum entropy distribution ME(Rfo), the ME-fusion MEfus_sem(R1, . . . , Rn) depends on the semantics of first-order conditionals. We conclude this section by illustrating the ME-fusion by means of an example.

Example 4. We recall Example 1. One has

MEfus_avrg(Rdoc,1, Rdoc,2)(d|s) = (1/2) · (ME(Rfo)(d(1)|s(1)) + ME(Rfo)(d(2)|s(2))) = (1/2) · (0.9 + 0.8) = 0.85,

which equals MEfus_appr(Rdoc,1, Rdoc,2) = 0.85. In contrast to this, one has MEfus_aggr(Rdoc,1, Rdoc,2)(d|s) ≈ 0.8475 and MEfus_unif(Rdoc,1, Rdoc,2) ≈ 0.8571. We leave the more sophisticated calculations in the latter cases to the reader. With the grounding semantics, it is not possible to draw an inference, as different probabilities are stated in the merged belief base for the two proper groundings of (d(X)|s(X))[p].

7 Comparison of First-Order Semantics in the Light of Belief Fusion

In the last section, we formally defined the notion of ME-fusion. Apart from the input belief bases, the ME-fusion operator also depends on the chosen semantics of first-order conditionals. We reformulate the semantics from Section 4 in the light of belief fusion and prove that the aggregating semantics coincides with the fusion operator OLEP while the uniformity semantics leads to OSEP. Example 4 has shown that the three remaining semantics differ from both OLEP and OSEP. As our first-order translation of beliefs leads to first-order conditionals in CondB(Σ), the definitions of first-order semantics in Section 4, which take conditionals defined over the entire language FOL into account, are too comprehensive to assess beliefs from the group of reasoners’ point of view (at least for many of them; see the end of this section for an extension of the notion of beliefs of the decision maker). Thus, we give characterizations of the semantics in the Boolean context before we discuss the role they play for belief fusion.

Characterization 1 (Grounding Semantics). A probability distribution P over Ω(Σfo) is a grounding-model of a conditional (B(X)|A(X))[p] ∈ CondB(Σ) iff

∀i = 1, . . . , n: P(A(i)) > 0 and P(B(i)|A(i)) = p.

The grounding semantics causes the decision maker to draw only those inferences at maximum entropy that are supported by all reasoners in the same manner:

MEfus_grnd(R1, . . . , Rn) |= (B|A)[p] iff ∀i = 1, . . . , n: ME(Ri) |= (B|A)[p].

As we have seen, an external decision maker would not come to a conclusion in Example 1, as the two doctors differ in their appraisal. Belief fusion based on the grounding semantics can thus be seen as the most cautious way of decision making.

Characterization 2 (Averaging Semantics). A probability distribution P over Ω(Σfo) is an averaging-model of a conditional (B(X)|A(X))[p] ∈ CondB(Σ) iff

(1/n) · Σ_{i=1}^{n} P(B(i)|A(i)) = p.

The averaging semantics leads to a linear pooling of conditional probabilities. Note that this is not the same as linear pooling of probabilities. In fact, in (Genest and Zidek 1986) it is mentioned that there is no non-trivial linear pooling operation which also linearly pools conditional probabilities. In Example 1, the decision maker assigns the probability 0.85, i.e. the arithmetic mean of 0.9 and 0.8, to the conditional (d|s), which is a very obvious assignment at first glance.

Characterization 3 (Aggregating Semantics). A probability distribution P over Ω(Σfo) is an aggregating-model of a conditional (B(X)|A(X))[p] ∈ CondB(Σ) iff

(Σ_{i=1}^{n} P(A(i)B(i))) / (Σ_{i=1}^{n} P(A(i))) = p,

which can be reordered to a weighted arithmetic mean of the probabilities P(B(i)|A(i)):

Σ_{i=1}^{n} μ_i · P(B(i)|A(i)) = p with μ_i = P(A(i)) / Σ_{j=1}^{n} P(A(j)).

The aggregating semantics leads to linear pooling of maximum entropy distributions with equal weights of 1/n as in OLEP. This is not so obvious, as the weights μ_i seem to differ from reasoner to reasoner (because of the probability P(A(i)) in the numerator of μ_i); this happens because we deal with conditional statements here. In order to prove that the aggregating semantics in combination with the principle of maximum entropy coincides with OLEP, we recall that each ω ∈ Ω(Σfo) can be written as ω = ∧_{i=1}^{n} ω_i, where ω_i is the conjunction of those ground literals in ω that mention the constant i. Further, consider the back-translation prop(ω_i) that translates the marginalized world ω_i to a possible world from Ω(Σ) by substituting each ground atom a(i) in ω_i by the proposition a ∈ Σ. For example, the possible world ω = d(1)s(1)d(2)s(2) ∈ Ω(Σfo) decomposes into the marginalized worlds ω_1 = d(1)s(1) and ω_2 = d(2)s(2), which leads to prop(ω_1) = ds and prop(ω_2) = ds.

Proposition 1. Let R1, . . . , Rn be consistent belief bases, and let Rfo = ⊔_{i=1}^{n} Ri. Then, for all ω ∈ Ω(Σfo),

MEaggr(Rfo)(ω) = Π_{i=1}^{n} MEaggr(Rfo)(ω_i) = Π_{i=1}^{n} ME(Ri)(prop(ω_i)).

Proof (Sketch). The first equality holds as MEaggr satisfies Syntax Splitting. By construction, Rfo = ∪_{i=1,...,n} Rfo_i is a union of syntactically independent conditionals, whereby conditionals in Rfo_i are defined using atoms from Σfo_i only, i.e. the set of atoms that mention i. As a consequence, MEaggr(Rfo) factorizes over Σfo_1, . . . , Σfo_n. See (Wilhelm, Kern-Isberner, and Ecke 2017) for the technical details. The second equality is an immediate consequence of the property System Independence. It states that “it should not matter whether one accounts for independent information about independent systems separately in terms of different densities or together in terms of a joint density.” (Shore and Johnson 1980)

Proposition 1 shows that MEaggr(Rfo) is the joint distribution of the independently distributed opinions of the single reasoners.

Corollary 1. Let R1, . . . , Rn be consistent belief bases and let (B|A)[p] with A, B ∈ L(Σ) be a conditional. Then, MEaggr(Rfo)(B(i)|A(i)) = p iff ME(Ri)(B|A) = p.

Proof. According to Proposition 1,

MEaggr(Rfo)(B(i)|A(i))
= (Σ_{ω ∈ Ω(Σfo): ω |= A(i)B(i)} Π_{j=1}^{n} MEaggr(Rfo)(ω_j)) / (Σ_{ω ∈ Ω(Σfo): ω |= A(i)} Π_{j=1}^{n} MEaggr(Rfo)(ω_j))
= (Σ_{ω_i |= A(i)B(i)} MEaggr(Rfo)(ω_i)) / (Σ_{ω_i |= A(i)} MEaggr(Rfo)(ω_i))
= (Σ_{prop(ω_i) |= AB} ME(Ri)(prop(ω_i))) / (Σ_{prop(ω_i) |= A} ME(Ri)(prop(ω_i)))
= ME(Ri)(B|A).

Corollary 1 states that the maximum entropy probability of the i-th instance of the conditional statement (B(X)|A(X)) indeed corresponds to the probability which the i-th reasoner would assign to the conditional statement (B|A) if she were a maximum entropy reasoner. This is a strong justification for using the aggregating semantics for ME-fusion, and it directly leads to the central connection between OLEP and ME-fusion with respect to the aggregating semantics.

Theorem 1. Let R1, . . . , Rn be consistent belief bases. Then, MEfus_aggr(R1, . . . , Rn) = OLEP(R1, . . . , Rn).

Proof. For ω ∈ Ω(Σ), one has

MEfus_aggr(R1, . . . , Rn)(ω)
= (Σ_{i=1}^{n} MEaggr(Rfo)(w(i))) / (Σ_{i=1}^{n} MEaggr(Rfo)(⊤))
= (1/n) · Σ_{i=1}^{n} ME(Ri)(ω)
= OLEP(R1, . . . , Rn)(ω),

using MEaggr(Rfo)(⊤) = 1 and Corollary 1.

From a semantical point of view, the contribution of Theorem 1 is as follows: when using OLEP for belief fusion, one assumes that every single reasoner infers her beliefs according to the principle of maximum entropy, i.e. reasons in a most cautious way, which is at least questionable. Theorem 1 states, however, that it is sufficient to assume that the decision maker is a maximum entropy reasoner in order to obtain the same results as with OLEP. The assumption that the decision maker is a cautious reasoner is much more reasonable and is in accordance with the idea of finding a consensus between the reasoners.

For the approving semantics, there is no simplifying characterization aside from the fact that the numbers of verified and applicable groundings of r = (B|A)[p] in ω reduce to

ver_r(ω) = |{i ∈ {1, . . . , n} | ω |= A(i)B(i)}|,
app_r(ω) = |{i ∈ {1, . . . , n} | ω |= A(i)}|.

Maximum entropy reasoning based on the approving semantics is neither a linear nor a logarithmic pooling operation.

Example 5. We consider Example 1 but with a third doctor who believes in (d|s) with probability 0.6. A calculation of some length shows MEfus_appr(d|s) ≈ 0.908, i.e. MEfus_appr(d|s) is higher than the highest rating of the doctors. In particular, MEfus_appr is not a (weighted) arithmetic or geometric mean.

Example 5 shows that the approving semantics leads to rather credulous decision making.

Characterization 4 (Uniformity Semantics). A probability distribution P over Ω(Σfo) is a uniformity-model of a conditional (B(X)|A(X))[p] ∈ CondB(Σ) iff

(Σ_{ω ∈ Ω(Σ), ω |= AB} P(∧_{i=1}^{n} w(i))^{1/n}) / (Σ_{ω ∈ Ω(Σ), ω |= A} P(∧_{i=1}^{n} w(i))^{1/n}) = p.

The uniformity semantics assigns positive probabilities only to those possible worlds in Ω(Σfo) that are duplicates of worlds ω ∈ Ω(Σ) for each reasoner ri. With duplicated worlds we mean, for example, s(1)d(1)s(2)d(2) and ¬s(1)¬d(1)¬s(2)¬d(2) in Example 1, but not s(1)d(1)¬s(2)¬d(2). This restriction causes a unified view on the world for all reasoners ri. The uniformity semantics in combination with the principle of maximum entropy results in OSEP.

Theorem 2. Let R1, . . . , Rn be consistent belief bases. Then, MEfus_unif(R1, . . . , Rn) = OSEP(R1, . . . , Rn).

Proof. One has

MEfus_unif(R1, . . . , Rn)(ω)
= MEunif(Rfo)(∧_{i=1}^{n} w(i))^{1/n} / (Σ_{ω′ ∈ Ω(Σ)} MEunif(Rfo)(∧_{i=1}^{n} w′(i))^{1/n})
= (Π_{i=1}^{n} ME(Ri)(ω)^{1/n}) / (Σ_{ω′ ∈ Ω(Σ)} Π_{i=1}^{n} ME(Ri)(ω′)^{1/n})
= OSEP(R1, . . . , Rn)(ω).

We close this section with a brief discussion of the benefits owing to the gain in expressivity when translating beliefs into first-order statements. In principle, the translation allows the decision maker to easily express beliefs that mix the viewpoints of the single reasoners. For example, with respect to the scenario in Example 1, one could be interested in the common belief in disease d when the first doctor believes in the presence of symptom s while the second doctor does not, i.e., one queries the conditional statement (d(X)|s(1)¬s(2)). This is, at least under consideration of the aggregating semantics, equivalent to the course of action where the doctors update their beliefs with s resp. ¬s first, before they communicate them to the decision maker. Further queries of interest could be: with which probability should the decision maker assume that the group of doctors believes in d, that both doctors believe in d, that at least one doctor believes in d, and so on. Clearly, there is still a lot of work to be done in this line of thinking in order to translate these queries properly into first-order conditionals such that the conditionals and the chosen semantics of the conditionals reflect the idea behind the query properly.

8 Related Work

Further approaches to fusing probabilities based on the principle of maximum entropy have been published (Myung, Ramamoorti, and Bailey 1996; Levy and Delic 1994; Mohammad-Djafari 1998; Fassinut-Mombot and Choquel 2000). What comes closest to our view on belief fusion is the fusion operator defined in (Kern-Isberner and Rödder 2004), to which we refer as KIR(R1, . . . , Rn). This operator also merges the belief bases R1, . . . , Rn first before applying the maximum entropy principle to the merged belief base. In order to represent beliefs from different reasoners’ points of view, fresh propositions Wi are introduced and conditionals (B|A)[p] are translated to (B|AWi)[p]. Hence, the conditional (B|AWi)[p] states that “B holds in the presence of A with probability p in the view of reasoner ri.” To separate the different viewpoints, the conditionals (W1 ∨ . . . ∨ Wn|⊤)[1] and (WiWj|⊤)[0] for i ≠ j are added. However, this translation shows certain side effects when applying the principle of maximum entropy.

Example 6. With respect to Example 1, it holds that KIR(Rdoc,1, Rdoc,2)(d|s) ≈ 0.8456, which differs from the results that can be obtained with OLEP and OSEP. In particular, KIR does not compute the arithmetic mean of the probabilities 0.9 and 0.8 but tends towards the lower probability 0.8. This is a direct consequence of the cautiousness of maximum entropy reasoning, which also affects the calculation of the mean values in the KIR approach.

It is an open question whether KIR can be reproduced with our first-order embedding approach. For a further comparison of KIR with OLEP and OSEP, see (Adamcik 2014).

9 Conclusion and Future Work

In this paper, we presented a novel approach for merging belief bases that consist of probabilistic conditionals. While the belief bases that were merged used a propositional background language, the merged beliefs were formalized by first-order conditionals. The first-order lifting allowed us to distinguish between statements of the form “if A holds, then B follows with probability p in the view of reasoner ri,” and “if A holds, then B follows with probability p in the view of the group of reasoners.” The first type of statement was expressed by ground instantiated conditionals (B(i)|A(i))[p] and the second type by open conditionals (B(X)|A(X))[p] with a free variable X. We proceeded with inferring a fused belief state from the merged belief base by applying the principle of maximum entropy. We showed that with our approach it is possible both to reproduce the well-known pooling operators OLEP and OSEP and to define novel operators by varying the semantics of first-order conditionals. In addition, the first-order translation of beliefs allows one to ask and answer more complex queries than in the initial propositional setting. In future work, we want to investigate whether it is possible to benefit from the connection between belief fusion operators and semantics of first-order conditionals the other way around: for belief fusion operators, a wide range of postulates has been declared. We plan to reformulate these postulates in terms of first-order conditionals in order to evaluate the quality of first-order semantics.

References

Adamcik, M. 2014. Collective Reasoning under Uncertainty and Inconsistency. Ph.D. Dissertation, University of Manchester.
Bacharach, M. 1972. Scientific disagreement. Unpublished manuscript, Christ Church, Oxford.
Bloch, I.; Hunter, A.; Appriou, A.; Ayoun, A.; Benferhat, S.; Besnard, P.; Cholvy, L.; Cooke, R.; Cuppens, F.; Dubois, D.; Fargier, H.; Grabisch, M.; Kruse, R.; Lang, J.; Moral, S.; Prade, H.; Saffiotti, A.; Smets, P.; and Sossai, C. 2001. Fusion: General concepts and characteristics. International Journal of Intelligent Systems 16(10):1107–1134.
Boyd, S., and Vandenberghe, L. 2004. Convex Optimization. Cambridge University Press.
Dietrich, F., and List, C. 2017. Probabilistic Opinion Pooling. Oxford University Press.
Dubois, D.; Liu, W.; Ma, J.; and Prade, H. 2016. The basic principles of uncertain information fusion. An organised review of merging rules in different representation frameworks. Inf. Fusion 32(PA):12–39.
Fassinut-Mombot, B., and Choquel, J. B. 2000. An entropy method for multisource data fusion. In Proceedings of the Third International Conference on Information Fusion, volume 2.
Genest, C., and Zidek, J. V. 1986. Combining probability distributions: A critique and an annotated bibliography. Statistical Science 1(1):114–135.
Grossi, D., and Pigozzi, G. 2014. Judgment Aggregation: A Primer. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers.
Kern-Isberner, G., and Rödder, W. 2004. Belief revision and information fusion on optimum entropy. Int. J. Intell. Syst. 19(9):837–857.
Kern-Isberner, G., and Thimm, M. 2010. Novel semantical approaches to relational probabilistic conditionals. In Proceedings of the 12th KR Conference, 382–392. AAAI Press.
Konieczny, S., and Pérez, R. P. 2011. Logic based merging. J. Philosophical Logic 40(2):239–270.
Levy, W. B., and Delic, H. 1994. Maximum entropy aggregation of individual opinions. IEEE Transactions on Systems, Man, and Cybernetics 24(4):606–613.
List, C. 2012. The theory of judgment aggregation: an introductory review. Synthese 187(1):179–207.
McConway, K. J. 1981. Marginalization and linear opinion pools. Journal of the American Statistical Association 76:410–414.
Mohammad-Djafari, A. 1998. Probabilistic methods for data fusion. Maximum Entropy and Bayesian Methods 57–69.
Myung, I. J.; Ramamoorti, S.; and Bailey, A. 1996. Maximum entropy aggregation of expert predictions. Management Science 42(10):1420–1436.
Paris, J. B. 1999. Common sense and maximum entropy. Synthese 117(1):75–93.
Paris, J. B. 2006. The Uncertain Reasoner’s Companion: A Mathematical Perspective. Cambridge University Press.
Shannon, C. E., and Weaver, W. 1949. The Mathematical Theory of Communication. University of Illinois Press.
Shore, J. E., and Johnson, R. W. 1980. Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Trans. Information Theory 26(1):26–37.
Stone, M. 1961. The opinion pool. Annals of Mathematical Statistics 32(4):1339–1342.
Thimm, M., and Kern-Isberner, G. 2012. On probabilistic inference in relational conditional logics. Logic Journal of the IGPL 20(5):872–908.
Wilhelm, M.; Kern-Isberner, G.; and Ecke, A. 2017. Basic independence results for maximum entropy reasoning based on relational conditionals. In Proceedings of the 3rd Global Conference on Artificial Intelligence (GCAI), 36–50.
Wilmers, G., and Jensen, O. E. 2010. The social entropy process: Axiomatising the aggregation of probabilistic beliefs.
Wilmers, G. 2015. A foundational approach to generalising the maximum entropy inference process to the multi-agent context. Entropy 17:594–645.
Stratified disjunctive logic programs and the infinite-valued semantics

Panos Rondogiannis, Ioanna Symeonidou
National and Kapodistrian University of Athens
{prondo, isymeonidou}@di.uoa.gr

Abstract

Numerous adaptations of the well-founded semantics have been proposed for the class of disjunctive logic programs with negation. The proposals differ in philosophy as well as in the results they produce; perhaps surprisingly, the differences occur even in the case of stratified programs. In this paper we focus on one of the most recent of these approaches, namely the disjunctive infinite-valued semantics, and explore its relationship to stratification. We demonstrate some close connections between the tiered structure of a stratified program and the structure of its minimal infinite-valued models. Most importantly, we show that the number of distinct truth values assigned by any minimal infinite-valued model of a stratified program is bounded by the number of strata. In addition, we present some evidence of the approach's affinity to the disjunctive perfect model semantics, such as the similar properties of the models defined by the two semantics and their identical behavior with respect to selected benchmark programs.

1 Introduction

Disjunctive logic programming has been shown to hold more expressive power than traditional logic programming (Eiter and Gottlob 1993; Eiter, Gottlob, and Mannila 1994). As a result, it has drawn a fair amount of attention and a substantial body of work has been produced over the years pertaining to this paradigm. Among other things, great effort has gone into the search for a disjunctive well-founded semantics. After a plethora of variations has been proposed (Ross 1989; Przymusinski 1990; Baral 1992; Brass and Dix 1995; Przymusinski 1995; Wang 2000; Alcântara, Damásio, and Pereira 2005; Cabalar et al. 2007), it seems a consensus has not yet been reached on a widely accepted approach, or even on the criteria by which such an approach should be selected. It is also noteworthy that one rarely comes across a version of the disjunctive well-founded semantics which is proven to extend the disjunctive perfect model semantics (Przymusinski 1988) of stratified disjunctive programs.

In traditional logic programming, stratified programs have always had a special significance, as they are regarded to hold a clear and indisputable meaning, captured by the perfect model semantics (Apt, Blair, and Walker 1988; van Gelder 1988). The two main and established approaches to the semantics of negation, namely the stable model semantics (Gelfond and Lifschitz 1988) and the well-founded semantics (van Gelder, Ross, and Schlipf 1988), both agree with the perfect model semantics when restricted to the class of stratified programs. Yet, most attempts at defining a disjunctive well-founded semantics do not have the analogous relationship with the disjunctive perfect model semantics. The only exception, to our knowledge, is the Stationary Semantics (Przymusinski 1990). Interestingly, Przymusinski later replaced his proposal with the weaker Static Semantics (Przymusinski 1995) and criticized the disjunctive perfect model semantics, reopening the question of the "correct" interpretation of stratified disjunctive programs.

It is in this light that we examine one of the most recent attempts at defining a well-founded semantics for disjunctive programs, namely the infinite-valued semantics (Cabalar et al. 2007). Under this purely model-theoretic approach, the meaning of a program is captured by the set of its minimal infinite-valued models. These models are defined over an expanded truth domain with infinitely many values, which express different degrees of certainty ranging from certainly true or certainly false to undefined. In this paper, we demonstrate how the structure of these models is closely connected to the stratifications of the program, by proving a number of properties. First, we show that these minimal models retain their minimality for every subset of the program defined by a given stratification. Then we argue that the stratum in which each atom is placed limits the degree of uncertainty of its truth value: the atoms in the lower strata are evaluated with greater certainty than the ones in the higher strata. This sets the number of strata as an upper bound on the number of distinct truth values appearing in the minimal infinite-valued models. At the same time, it ensures that the minimal infinite-valued models of a stratified program never assign the undefined truth value, as is also the case with the perfect models. Moreover, we show that the two-valued interpretation produced from a minimal infinite-valued model of a stratified program, by collapsing all true values to T and all false values to F, is always a classical minimal model of the program, another property shared by the perfect models. In addition, we discuss the behavior of the infinite-valued semantics with respect to a selection of example stratified disjunctive programs, borrowed from the available literature. We find that, for all of these programs, there is a one-to-one correspondence between their minimal infinite-valued models and their perfect models, even when some of the other approaches produce different results. Combined with the aforementioned original results, we believe that these observations serve as evidence of the potential equivalence of the infinite-valued semantics and the perfect model semantics and contribute to the better understanding of the semantics of stratified disjunctive programs.

The rest of the paper is organized as follows. Section 2 gives an overview of the major disjunctive well-founded semantics that have been proposed in the past few years and their relationship to the disjunctive perfect model semantics.
Section 3 defines the syntax of disjunctive logic programs, as we consider it in this paper, and gives a detailed presentation of their infinite-valued semantics. In Section 4 we recall the definition of stratification for disjunctive programs, and in Section 5 we formally present our main results, i.e., we demonstrate the above-mentioned properties of the minimal infinite-valued models of stratified programs. Section 6 empirically compares the infinite-valued semantics to the perfect model semantics based on a curated set of examples. Finally, Section 7 concludes the paper with a discussion of possible future research directions.

2 Generalizations of the well-founded semantics

The generalization of the well-founded semantics to the class of disjunctive logic programs remains to this day an open problem. Despite the numerous approaches that have been proposed over the last three decades, none has succeeded in finding wide acceptance. Moreover, the completely different characterizations of the approaches and the intuitions behind them make direct comparisons a particularly challenging task. As a result, an additional volume of literature has been built around these proposals, investigating their connections and differences and debating the criteria by which they should be judged. In this section we discuss some of the most popular semantics for disjunctive programs with negation which are genuine generalizations of the well-founded semantics. We will make mention of results concerning the relationships of these approaches with the perfect model semantics for stratified disjunctive programs (Przymusinski 1988), as well as the relationships among the approaches themselves, whenever such results are available. Most of the approaches discussed here have been found to be different from each other and, in most cases, the comparisons are limited to observing their different behavior with respect to one or more example programs. However, there are some instances where two semantics are compared in terms of the amount of information they can extract from a program. In (Brass and Dix 1995), a semantics is defined to be weaker than a second semantics (or the second is stronger than the first) if every disjunction of all positive or all negative ground literals which can be derived under the first semantics can also be derived under the second.

One of the first attempts at defining a well-founded semantics for the disjunctive case was the Strong Well-Founded Semantics (Ross 1989), which features a procedural, top-down characterization. The meaning of the program is given by a unique set comprising disjunctions of ground atoms and negations of such disjunctions, all considered to be true with respect to the program. It is shown in (Brass and Dix 1995) that the Strong Well-Founded Semantics is not a generalization of the perfect model semantics.

At around the same time, the Stationary Semantics (Przymusinski 1990; Przymusinski 1991) was introduced. The semantics is equivalently characterized in terms of program completion, iterated minimal models and the least fixed-point of a minimal model operator. This is the only approach that we are aware of which is shown to generalize the disjunctive perfect model semantics. Consequently, it is different from the Strong Well-Founded Semantics.

Shortly after, Baral (1992) proposed a disjunctive well-founded semantics (DWFS) with a fixed-point and a procedural characterization. The fixed-point characterization is a generalization of both Przymusinski's characterization of the classical well-founded semantics (1989) and the fixed-point semantics of negationless disjunctive programs (Lobo, Minker, and Rajasekar 1992). The semantics is different from the Strong Well-Founded and the Stationary Semantics, as shown in (Baral 1992). We are not aware of any results regarding its relationship to the disjunctive perfect model semantics, other than that the two semantics agree for certain benchmark programs discussed by the author.

Another proposal, labeled D-WFS, was given by Brass and Dix (1995). It is defined abstractly as the weakest semantics which remains unchanged under a set of elementary program transformations. It is also defined procedurally, by means of a bottom-up query evaluation method, for programs with a finite ground instantiation. The D-WFS is proven to be strictly weaker than (and therefore not an extension of) the disjunctive perfect model semantics and different from the Strong Well-Founded Semantics in (Brass and Dix 1995).

Przymusinski revisited the subject of disjunctive well-founded semantics in (1995), where he introduced the Static Semantics. The definition of the semantics is based on translating the program into a belief theory, and a fixed-point characterization is given. This approach is strictly weaker than the perfect model semantics and different from the author's earlier approach, the Stationary Semantics (Przymusinski 1995). On the other hand, it is strictly stronger than the D-WFS unless we restrict attention to a common query language, in which case the two semantics coincide (Brass et al. 2001). Przymusinski argued that the weaker nature of his newer semantics makes it a preferable alternative (see Example 5 in Section 6 for more details).

Interest in the issue of disjunctive well-founded semantics continued in the next decade. In (2000), Wang presented a semantic framework based on argumentation-based abduction, which he used to define a number of different semantics, including a generalization of the well-founded semantics. Wang named his proposal the Well-Founded Disjunctive Semantics (WFDS) and in (2001) he showed that it can be equivalently characterized using several different approaches, such as program transformations (an augmentation of the approach of (Brass and Dix 1995)), argumentation, unfounded sets and a bottom-up procedure. He also showed that the semantics is strictly stronger than the D-WFS (2001), and different from the Static (2001) and Stationary Semantics (2000). Its relationship to the perfect model semantics is not examined, but a comparison of example programs from (Brass and Dix 1995) and (Wang 2000) reveals they are incompatible.

The WFSd is another fixed-point approach, presented in (Alcântara, Damásio, and Pereira 2005). The semantics is compared to most of the previous approaches and found to be different from all of them; in particular, it is different from the Strong Well-Founded Semantics, the Stationary and Static Semantics and Wang's WFDS, while it is strictly stronger than the D-WFS. It is also shown that it does not generalize the perfect model semantics.

In (Cabalar et al. 2006), Partial Equilibrium Logic (PEL) was employed as a framework for providing a purely declarative semantics for disjunctive programs. It was shown that the semantics does not agree with the D-WFS, the Static Semantics and Wang's WFDS; in particular, it is neither stronger nor weaker than the former two approaches. A peculiarity that distinguishes it from many approaches is that it does not guarantee the existence of a model for every program. The authors do not attempt a comparison to the perfect model semantics.

In this paper we will focus on the most recent approach, named the Infinite-Valued Semantics L^min_∞ (Cabalar et al. 2007). To our knowledge, this is the only version of a disjunctive well-founded semantics, other than PEL, with a purely model-theoretic characterization. Analogously to the semantics of positive programs, the meaning of a program is captured by the set of its minimal models. However, in this case the models are defined over a new logic of infinitely many truth values, used to signify the decreasing reliability of information obtained through negation-as-failure. In (Cabalar et al. 2007), the approach is compared to and shown to be different from the D-WFS, the Static Semantics, the WFDS, the WFSd and PEL, but it is not compared to the perfect model semantics. More details on the intuitive ideas behind the infinite-valued semantics, as well as a formal definition of L^min_∞, are given in the next section.

3 Disjunctive Programs and the Infinite-Valued Semantics L^min_∞

In this section we present background on the infinite-valued semantics for disjunctive logic programs with negation. We follow closely the presentation of (Cabalar et al. 2007). The authors focus on the class of disjunctive logic programs:

Definition 1. A disjunctive logic program is a finite set of clauses of the form
p1 ∨ ··· ∨ pn ← q1, ..., qk, ∼r1, ..., ∼rm
where n ≥ 1 and k, m ≥ 0.

For the sake of simplicity of notation, the authors consider only finite programs, but they note that the results can be lifted to the more general first-order case. We adhere to this simplification in this paper.

The key idea of the infinite-valued approach is that, in order to give a logical semantics to negation-as-failure and to distinguish it from ordinary negation, one needs to extend the domain of truth values. For example, consider the program:
p ←
r ← ∼p
s ← ∼q
According to negation-as-failure, both p and s receive the value T. However, p seems "truer" than s, because there is a rule which says so, whereas s is true only because we are never obliged to make q true. In a sense, s is true only by default. For this reason, it was proposed in (Rondogiannis and Wadge 2005) to introduce a "default" truth value T1 just below the "real" true T0, and (by symmetry) a weaker false value F1 just above ("not as false as") the real false F0. Then, negation-as-failure is a combination of ordinary negation with a weakening. Thus ∼F0 = T1 and ∼T0 = F1. Since negations can be iterated, the new truth domain requires a sequence ..., T3, T2, T1 of weaker and weaker truth values below T0 but above the neutral value 0, and a mirror-image sequence F1, F2, F3, ... above F0 and below 0. In fact, in (Rondogiannis and Wadge 2005) a Tα and an Fα are introduced for all countable ordinals α; since in this paper we deal with finite propositional programs, we will not need this generality here. The new truth domain V is ordered as follows:
F0 < F1 < ··· < 0 < ··· < T1 < T0
Every truth value in V is associated with a natural number, called the order of the value:

Definition 2. The order of a truth value is defined as follows: ord(Tn) = n, ord(Fn) = n and ord(0) = +∞.

It is straightforward to generalize the notion of interpretation under the prism of our infinite-valued logic. We use HB_P to denote the set of propositional symbols appearing in a given program P, also called the Herbrand base of P:

Definition 3. An (infinite-valued) interpretation of a disjunctive program P is a function from HB_P to the set V of truth values.

If v ∈ V is a truth value, we will use I‖v to denote the set of atoms which are assigned the value v by I.

Definition 4. The meaning of a formula with respect to an interpretation I can be defined as follows:
I(∼A) = Tn+1, if I(A) = Fn
I(∼A) = Fn+1, if I(A) = Tn
I(∼A) = 0, if I(A) = 0
I(A ∧ B) = min{I(A), I(B)}
I(A ∨ B) = max{I(A), I(B)}
I(A ← B) = T0, if I(A) ≥ I(B)
I(A ← B) = I(A), if I(A) < I(B)

The notion of satisfiability of a clause can now be defined:

Definition 5. Let P be a program and I an interpretation. Then, I satisfies a clause p1 ∨ ··· ∨ pn ← q1, ..., qk, ∼r1, ..., ∼rm if I(p1 ∨ ··· ∨ pn) ≥ I(q1, ..., qk, ∼r1, ..., ∼rm). Moreover, I is a model of P if I satisfies all clauses of P.

The semantics L^min_∞ is a minimal model semantics. This implies we need a partial order on the set of interpretations:

Definition 6. Let I, J be interpretations and n < ω. We write I =n J, if for all k ≤ n, I‖Tk = J‖Tk and I‖Fk = J‖Fk. We write I ⊑n J, if for all k < n, I =k J and, moreover, I‖Tn ⊆ J‖Tn and I‖Fn ⊇ J‖Fn. We write I ❁n J, if I ⊑n J but I =n J does not hold.

Definition 7. Let I, J be interpretations. We write I ❁∞ J, if there exists n < ω (that depends on I and J) such that I ❁n J. We write I ⊑∞ J if I = J or I ❁∞ J.

It is easy to see that the relation ⊑∞ on the set of interpretations is a partial order (i.e., it is reflexive, transitive and antisymmetric). On the other hand, for every n < ω, the relation ⊑n is a preorder (i.e., reflexive and transitive). In comparing two interpretations I and J, we consider first only those propositional symbols assigned "standard" truth values (T0 or F0) by at least one of the two interpretations. If I assigns T0 to a particular symbol and J does not, or J assigns F0 to a particular symbol and I does not, then we can rule out I ⊑∞ J. Conversely, if J assigns T0 to a particular variable and I does not, or I assigns F0 to a particular variable and J does not, then we can rule out J ⊑∞ I. If both these conditions apply, we can immediately conclude that I and J are incomparable. If exactly one of these conditions holds, we can conclude that I ⊑∞ J or J ⊑∞ I, as appropriate. However, if neither applies, then I and J are equal in terms of standard truth values; they both assign T0 to each of one group of propositional symbols and F0 to each of another. In this case we must now examine the symbols assigned F1 or T1. If this examination proves inconclusive, we move on to T2 and F2, and so on. Thus ⊑∞ gives the standard truth values the highest priority, T1 and F1 the next priority, T2 and F2 the next, and so on. These ideas are illustrated by the following example:

Example 1. Consider the program:
work ∨ play ← ∼rest
play ∨ rest ←
The minimal models of the above program are:
M1 = {(play, T0), (rest, F0), (work, F0)}
M2 = {(play, F0), (rest, T0), (work, F1)}
M3 = {(play, F1), (rest, T0), (work, F0)}.

From the above discussion it should now be clear that the infinite-valued semantics of a disjunctive logic program with negation is captured by the set of ⊑∞-minimal infinite-valued models of the program. One of the major results of (Cabalar et al. 2007) is that this set is always non-empty:

Theorem 1. Every disjunctive logic program with negation has a non-empty set of minimal infinite-valued models. Moreover, the set of minimal models of a program is finite.

This is an immediate consequence of the following theorem from (Cabalar et al. 2007), which shows that the maximum order of truth values assigned by a minimal model M is bounded by the number of propositional symbols appearing in the program.

Theorem 2. Let P be a program and let M be a minimal infinite-valued model of P. For every propositional symbol p ∈ HB_P, M(p) ∈ {0, F0, T0, ..., F(|HB_P|−1), T(|HB_P|−1)}.

In Section 5 we will show that, in the case of stratified programs, this bound can be improved.

In the special case of traditional non-disjunctive or normal programs, the number of minimal models is reduced to exactly one. This unique minimum model is equivalent to the well-founded model of the program. This is one of the main results of (Rondogiannis and Wadge 2005) and is summarized in Theorem 3 below. The notion of collapsing an infinite-valued interpretation into a three-valued one will be useful in formally stating the theorem:

Definition 8. Let P be a program and let I be an infinite-valued interpretation of P. We denote by Col(I) the three-valued interpretation obtained from I by collapsing each Ti to T and each Fi to F. We may also say that I collapses to a three-valued interpretation I′, if I′ = Col(I).

Theorem 3. Every normal logic program with negation P has a unique minimum infinite-valued model, which collapses to the well-founded model of P.

Returning to disjunctive programs, another important result of (Cabalar et al. 2007) is that the semantics L^min_∞ extends the minimal model semantics of (negation-less) disjunctive logic programs:

Theorem 4. Let P be a disjunctive program which does not contain negation. If we identify the value T0 with T and the value F0 with F, then:
1. If M is a minimal classical model of P, then it is also a minimal infinite-valued model of P.
2. If M is a minimal infinite-valued model of P, then it assigns to every propositional symbol in P a value of order 0 and it is a minimal classical model of P.

4 Stratification

In classical logic programming, stratification was first defined by Apt, Blair and Walker (1988) and, independently, by van Gelder (1988). The definition was generalized to apply to disjunctive programs by Przymusinski (1988). A stratified program allows us to determine a priority ordering of its propositional symbols, through which we evaluate a symbol only after having evaluated all of its dependencies through negation. In other words, stratified programs do not display circular dependencies through negation.

Definition 9. A program P is stratified, if it is possible to decompose HB_P into disjoint sets S0, S1, ..., Sr, called strata, so that for every clause p1 ∨ ··· ∨ pn ← q1, ..., qk, ∼r1, ..., ∼rm in P we have that:
1. the propositional symbols p1, ..., pn belong to the same stratum, say Sl;
2. the propositional symbols q1, ..., qk belong to ∪{Sj : j ≤ l};
3. the propositional symbols r1, ..., rm belong to ∪{Sj : j < l}.

Example 2. The program
work ∨ play ← ∼rest
work ← workday
is stratified and a possible stratification is S0 = {rest, workday}, S1 = {work, play}. The program of Example 1 is not stratified, because the first requirement of Definition 9 demands that play and rest be in the same stratum (due to the second clause), while at the same time the third requirement demands that rest be placed at a lower stratum than play (due to the first clause). Finally, the program
work ∨ play ← ∼rest
rest ∨ play ← ∼work
is not stratified, as there exist circular dependencies through negation.

The class of stratified programs is particularly interesting, as they have some nice semantic properties. Most importantly, the two major semantic approaches for general programs, namely the well-founded semantics (van Gelder, Ross, and Schlipf 1988) and the stable model semantics (Gelfond and Lifschitz 1988), coincide for this class of programs, giving a unique two-valued minimum model (also called the perfect model (Apt, Blair, and Walker 1988; van Gelder 1988)) for every stratified program. When examining first-order programs, the notion of stratification can be extended to define the significantly more general but equally "unproblematic" (from the semantics point of view) class of locally stratified programs (Przymusinski 1988). However, the class of locally stratified programs coincides with that of stratified programs when we restrict attention to finite propositional programs, as we have, so we need not consider it in this paper.

5 Properties of the Infinite-Valued Models of Stratified Programs

In this section we present the main results of this paper. In particular, we study the relationship between the structure of a stratified program and the structure of its minimal models and show that these models collapse to minimal two-valued models. As in the previous section, we restrict attention to finite programs. The next definition introduces notation that will be useful in stating the results of this section.

Definition 10. Let P be a stratified program, {S0, ..., Sr} be a stratification of P and I be an interpretation of P.
• We use Pi, 0 ≤ i ≤ r, to denote the set of clauses whose head consists of propositional symbols in Si. We use P^i to denote the set P0 ∪ ··· ∪ Pi.
• We use HB^i, 0 ≤ i ≤ r, to denote the set of propositional symbols appearing in the clauses of P^i.
• We use I^i, 0 ≤ i ≤ r, to denote the subset of I that assigns values only to the propositional symbols in HB^i.
• We define a function S, with S(p) = n if p ∈ Sn.

From the above definition and the definition of stratification, it becomes obvious that the clauses in a set P^n cannot contain propositional symbols belonging to the strata above Sn. In this sense, we might say that P^n defines a "self-contained" subset of the program. Moreover, HB^n is the Herbrand base and I^n is a valid interpretation for this subset. The next lemma, which will be useful in proving our main result of Theorem 5, states that every minimal model of a stratified program is also a minimal model of every such subset of the program.

Lemma 1. Let P be a stratified program, {S0, ..., Sr} be a stratification of P and M be a minimal model of P. For any n, 0 ≤ n ≤ r, M^n is a minimal model of P^n.

Proof. By definition, for each n = 1...r, M^n and M assign the same truth values to all propositional symbols in HB^n, so M^n satisfies a clause in P^n iff M satisfies the same clause. It follows that M^n is a model of P^n for all n = 1...r, so it suffices to show that M^n is minimal. For the sake of contradiction, assume that M^n is not a minimal model of P^n. Then there is a model N^n of P^n such that N^n ❁∞ M^n. Assume, more specifically, that N^n ❁k M^n for some natural number k. We define the interpretation N of P as follows: for each propositional symbol p ∈ HB_P,
N(p) = N^n(p), if S(p) ≤ n and ord(N^n(p)) ≤ k
N(p) = M(p), if S(p) > n and ord(M(p)) ≤ k
N(p) = Tk+1, otherwise

We will show that: 1. N ❁k M, and 2. N is a model of the program P. Obviously, this will contradict the assumption that M is a minimal model of P and prove that N^n cannot exist.

First we show statement 1: we have that N^n ❁k M^n ⇒ N^n =k−1 M^n. So, for all p such that S(p) ≤ n and ord(M(p)) < k (or, equivalently, ord(N^n(p)) < k), we have that M(p) = M^n(p) = N^n(p) = N(p) (from the construction of N). Moreover, for all p such that S(p) > n and ord(M(p)) ≤ k, N(p) = M(p). One can easily see that M =k−1 N and that, for order k, the two interpretations differ exactly for the same atoms as M^n and N^n. Thus we conclude that N ❁k M.

Now we show statement 2: we need to prove that N satisfies every clause p1 ∨ ··· ∨ pnh ← Q1, ..., Qnb in P. If the clause is in P^n, then every atom appearing in the clause is in S0 ∪ ··· ∪ Sn. Assume that, among the atoms in the head of the clause, N^n assigns the maximum value to some pi. Also assume that, among the literals in the body of the clause, N^n assigns the minimum value to some Qj of the form q or the form ∼q. Because N^n is a model of P^n:
N^n(pi) = max{N^n(p1), ..., N^n(pnh)}
        = N^n(p1 ∨ ··· ∨ pnh)
        ≥ N^n(Q1, ..., Qnb)
        = min{N^n(Q1), ..., N^n(Qnb)}
        = N^n(Qj)
Similarly, we have: N(p1 ∨ ··· ∨ pnh) = max{N(p1), ..., N(pnh)} ≥ N(pi) and N(Qj) ≥ min{N(Q1), ..., N(Qnb)} = N(Q1, ..., Qnb). This suggests that, if we can show N(pi) ≥ N(Qj), then we will have shown that N satisfies the clause. We distinguish the following cases:

A. If ord(N^n(pi)) ≤ k: In this case, N^n(pi) ≤ Fk or Tk ≤ N^n(pi). Also, by the construction of N, N(pi) = N^n(pi).
A.1. If ord(N^n(q)) ≤ k: Again, by the construction of N, N^n(q) = N(q). Because N^n is a model of P^n, N(Qj) = N^n(Qj) ≤ N^n(pi) = N(pi), whether Qj = q or Qj = ∼q.
A.2. If ord(N^n(q)) > k: In this case, Fk < N^n(q) < Tk, N(q) = Tk+1 and N(∼q) = Fk+2. Because N^n is a model of P^n and therefore N^n(Qj) ≤ N^n(pi), it is not possible that N^n(pi) ≤ Fk. It follows that N(Qj) < Tk ≤ N^n(pi) = N(pi), whether Qj = q or Qj = ∼q.
B. If ord(N^n(pi)) > k: In this case, Fk < N^n(pi) < Tk and N(pi) = Tk+1.
B.1. If ord(N^n(q)) ≤ k: This assumption makes N^n(q) ≤ Fk or Tk ≤ N^n(q), and by the construction of N, N(q) = N^n(q). Also, N(∼q) = N^n(∼q) ≤ Fk+1 or Tk+1 ≤ N^n(∼q) = N(∼q). However, if Qj = q, then N^n(Qj) ≤ N^n(pi) and N^n(pi) < Tk imply that only N^n(q) ≤ Fk can hold. Then, N(q) = N^n(q) ≤ Fk < Tk+1 = N(pi). Similarly, if Qj = ∼q, then N^n(Qj) ≤ N^n(pi) and N^n(pi) < Tk imply that either N(∼q) = N^n(∼q) ≤ Fk+1 or N(∼q) = N^n(∼q) = Tk+1. Obviously, N(∼q) ≤ N(pi) for all these values.
B.2. If ord(N^n(q)) > k: By the construction of N, N(q) = Tk+1; then N(∼q) = Fk+2. Again, N(Qj) ≤ N(pi) holds for either possible form of Qj.

We have shown that N(Qj) ≤ N(pi) in every case and thus that N satisfies every clause in P^n.

Let us now examine a clause not in P^n. This time we consider the values assigned by M (as opposed to N^n) and follow a similar reasoning as before. So again we assume that, among the atoms in the head of the clause, M assigns the maximum value to some pi. Also we assume that, among the literals in the body of the clause, M assigns the minimum value to some Qj of the form q or the form ∼q. We now have M(Qj) ≤ M(pi) and, as in the previous case, to show that N satisfies the clause, it suffices to show that N(Qj) ≤ N(pi). Observe that, if S(q) > n, then N(Qj) ≤ N(pi) can be shown in exactly the same way as in the previous case, by simply substituting M for N^n. If, on the other hand, S(q) ≤ n, we need to follow a slightly different line of argument. We distinguish the following cases:

A. If ord(M(pi)) ≤ k: In this case, M(pi) ≤ Fk or Tk ≤ M(pi). Also, by the construction of N, N(pi) = M(pi).
A.1. If ord(M(q)) < k: We have constructed N so that N =k−1 M; therefore N(q) = M(q). Because M is a model of P, N(Qj) = M(Qj) ≤ M(pi) = N(pi), whether Qj = q or Qj = ∼q.
A.2. If ord(M(q)) ≥ k: Because N ❁k M, if M(q) = Fk it must also be N(q) = Fk. Then we have N(Qj) = M(Qj) ≤ M(pi) = N(pi), whether Qj = q or Qj = ∼q. If Fk < M(q) ≤ Tk, then Fk+1 ≤ M(∼q) < Tk+1. Consequently, whether Qj = q or Qj = ∼q, the relations Fk < M(q) ≤ Tk, M(Qj) ≤ M(pi) and M(pi) ≤ Fk cannot all hold at the same time, so it must be Tk ≤ M(pi) = N(pi) in this case. By N ❁k M and the construction of N, N(q) ∈ {Fk, Tk+1, Tk} and so N(∼q) ∈ {Fk+1, Fk+2, Tk+1}. Observe that N(Qj) ≤ Tk ≤ M(pi) = N(pi) holds for all three possible values of N(q).
B. If ord(M(pi)) > k: In this case, Fk < M(pi) < Tk and N(pi) = Tk+1.
B.1. If ord(M(q)) < k: This assumption translates to M(q) < Fk or Tk < M(q), and by N =k−1 M, N(q) = M(q). Also, N(∼q) = M(∼q) < Fk+1 or Tk+1 < M(∼q) = N(∼q). However, if Qj = q, then M(Qj) ≤ M(pi) and M(pi) < Tk imply that only M(q) < Fk can hold. Then, N(q) = M(q) < Fk < Tk+1 = N(pi). Similarly, if Qj = ∼q, then M(Qj) ≤ M(pi) and M(pi) < Tk imply that either N(∼q) = M(∼q) ≤ Fk+1 or N(∼q) = M(∼q) = Tk+1. Obviously, N(∼q) ≤ N(pi) for all these values.
B.2. If ord(M(q)) = k: In this case M(q) = Fk or M(q) = Tk, while M(∼q) = Tk+1 or M(∼q) = Fk+1, respectively. If Qj = q, then M(Qj) ≤ M(pi) and M(pi) < Tk imply that only M(q) = Fk can hold. Then, N ❁k M suggests that N(Qj) = N(q) = Fk < Tk+1 = N(pi). On the other hand, if Qj = ∼q, then M(q) = Fk and M(q) = Tk both allow M to satisfy the clause. By N ❁k M and the construction of N, N(q) ∈ {Fk, Tk+1, Tk} and so N(∼q) ∈ {Fk+1, Fk+2, Tk+1}. Observe that N(∼q) ≤ Tk+1 = N(pi) holds for all three possible values of N(q).
B.3. If ord(M(q)) > k: By the construction of N, N(q) = Tk+1; then N(∼q) = Fk+2. Again, N(Qj) ≤ N(pi) holds for either possible form of Qj.

We have shown that N(Qj) ≤ N(pi) in every case and thus that N satisfies the clauses of P that are not in P^n, as well as those that are in P^n. In conclusion, we have shown that N is a model of P which satisfies N ❁k M, i.e., we have reached a contradiction. This proves that M^n must be a minimal model of P^n.

The above lemma shows there is a relationship between any arbitrary stratification of a given program and the structure of the program's minimal models. In the following we will explore another, more obvious aspect of this relationship, which highlights the proximity of the core ideas behind stratification and the infinite-valued semantics. In every clause of a stratified program, the propositional symbols appearing in negative form have to be placed in a lower stratum than the stratum in which we place the propositional symbols of the clause head. Because of this, P0, as defined by any stratification of a program P, is a positive program. As such, Theorem 4 of Section 3 states that its minimal models assign only truth values of order 0 to all atoms. The next theorem demonstrates that this correlation between the stratum and the order of the truth values assigned by the minimal model extends to all strata of the program. This way, it sets the number of strata as an improved bound (compared to the one set by Theorem 2 for the general case) on the order of the truth values in the minimal models of a stratified program.

Theorem 5. Let P be a stratified program, {S0, ..., Sr} be a stratification of P and M be a minimal infinite-valued model of P. For every propositional symbol p appearing in P, M(p) ∈ {F0, T0, ..., Fr, Tr}.

Proof. The proof is by induction on the strata of the program.
Base case: For S0, P0 is a positive program and, by Lemma 1, M^0 is a minimal model of P^0. From the second part of Theorem 4, M^0 assigns to the propositional symbols in HB^0 values in the set {F0, T0}.
Induction step: We will show that M^n assigns to the propositional symbols in HB^n values in the set {F0, T0, ..., Fn, Tn}, assuming that for all m < n, M^m assigns to the propositional symbols in HB^m values in the set {F0, T0, ..., Fm, Tm}. Assume that there exists some propositional symbol p ∈ HB^n such that ord(M^n(p)) > n. We construct an interpretation N^n as follows:
N^n(p) = Fn, if ord(M^n(p)) > n
N^n(p) = M^n(p), otherwise
Note that N^n ❁n M^n, so it suffices to show that N^n is a model of P^n, contradicting the minimality of M^n established by Lemma 1. Consider any clause p1 ∨ ··· ∨ pnh ← Q1, ..., Qnb of P^n and assume that, among the atoms in its head, M^n assigns the maximum value to some pi. We distinguish the following cases, depending on the form of a literal Qj to which M^n assigns the minimum value among the body literals:
A. Among the literals in the body of the clause, M^n assigns the minimum value to some Qj of the form ∼q. Observe that, because q appears in a negative literal in a clause of P^n, it must be that S(q) = n′ for some n′ < n. Then, by the induction hypothesis, M^n′(q) ∈ {F0, T0, ..., Fn′, Tn′}. Also, M^n′(q) = M^n(q) by definition, which eventually makes ord(M^n(q)) < n. In this case, N^n(q) = M^n(q). Since M^n is a model of P^n and so satisfies the clause, it must be
M^n(pi) ≥ M^n(∼q). (1)
A.1. If M^n(q) = Fm (so M^n(∼q) = Tm+1), with m < n: M^n(pi) ≥ M^n(∼q) ⇒ M^n(pi) ≥ Tm+1 ⇒ ord(M^n(pi)) ≤ m + 1 ≤ n ⇒ N^n(pi) = M^n(pi). The latter relation, in conjunction with N^n(q) = M^n(q) and M^n(pi) ≥ M^n(∼q), yields N^n(pi) ≥ N^n(Qj).
A.2. If M^n(q) = Tm (so M^n(∼q) = Fm+1), with m < n: M^n(pi) ≥ M^n(∼q) ⇒ M^n(pi) ≥ Fm+1 ⇒
N^n(pi) = Fn, if Fn < M^n(pi) < Tn
N^n(pi) = M^n(pi), if Fm+1 ≤ M^n(pi) ≤ Fn or Tn ≤ M^n(pi)
The latter relation, in conjunction with N^n(q) = M^n(q) and Fm+1 ≤ Fn, gives us N^n(pi) ≥ N^n(Qj).
B. Among the literals in the body of the clause, M^n assigns the minimum value to some Qj of the form q. Since M^n is a model of the program, it must satisfy the clause, i.e., M^n(pi) ≥ M^n(Qj) = M^n(q). We have:
B.1. If Fn < M^n(q) < Tn, then N^n(q) = Fn. Moreover, M^n(pi) ≥ M^n(q) > Fn, and therefore:
N^n(pi) = Fn, if Fn < M^n(pi) < Tn
N^n(pi) = M^n(pi), if M^n(pi) ≥ Tn
This makes N^n(pi) ≥ N^n(q) in every case.
B.2. If M^n(q) = Tm, m ≤ n, then N^n(q) = M^n(q).
Moreover: M n (pi ) ≥ M n (q) ⇒ M n (pi ) ≥ Tm ≥ Tn ⇒ N n (pi ) = M n (pi ) n Therefore M (pi ) ≥ M n (q) ⇒ N n (pi ) ≥ N n (q) = N n (Qj ). B.3. If M n (q) = Fm , m ≤ n, then N n (q) = M n (q). Moreover: M n (pi ) ≥ M n (q) ⇒ M n (pi ) ≥ Fm ⇒  if Fn < M n (pi ) < Tn  Fn , n n N (pi ) = M (pi ), if M n (pi ) ≤ Fn  or M n (pi ) ≥ Tn It is simple to see that N n ❁n M n and we will show that N n is a model of P n . This constitutes a contradiction, since by Lemma 1 M n is a minimal model of P n . The contradiction renders impossible the existence of atoms q ∈ HB n such that ord(M n (q)) > n. Every clause in Pn is of the form: p 1 ∨ · · · ∨ p n h ← Q 1 , . . . , Q nb We examine the possible truth values that M n and N n assign to the propositional symbols in the above arbitrary clause and show that N n satisfies the clause in every case. Assume that, among the propositional symbols in the head of the clause, M n assigns the maximum truth value to some pi . We distinguish the following cases: A. Assume that among the literals in the body of the clause, M n assigns the minimum value to some Qj of the form Therefore N n (pi ) ≥ N n (q). 146 we have N ′ (ri ) = T . The latter case is not possible because it would imply that N (ri ) > 0, ie., N (∼ri ) < 0, and therefore N (q1 , . . . , qk , ∼r1 , . . . , ∼rm ) < 0. Therefore, for some qi it is N ′ (qi ) = F while N (qi ) > F . This implies that Col(M )(qi ) = T and therefore Col(M )(q1 , . . . , qk , ∼r1 , . . . , ∼rm ) = T . However, since N ′ (p1 ∨ · · · ∨ pn ) = F , we also have Col(M )(p1 ∨ · · · ∨ pn ) = F , and thus Col(M )(p1 ∨ · · · ∨ pn ) < Col(M )(q1 , . . . , qk , ∼r1 , . . . , ∼rm ). This is a contradiction because Col(M ) is a model of P. C. N (p1 ∨ · · · ∨ pn ) = Fi and N (q1 , . . . , qk , ∼r1 , . . . , ∼rm ) = Fj with i < j. Then, M (p1 ∨ · · · ∨ pn ) = Fi and M (q1 , . . . , qk , ∼r1 , . . . , ∼rm ) = Fj , which is a contradiction because M is a model of P. D. N (p1 ∨· · ·∨pn ) = 0 and N (q1 , . . . 
, qk , ∼r1 , . . . , ∼rm ) = Ti . Since N (p1 ∨ · · · ∨ pn ) = 0, there exists some pi such that N (pi ) = 0, and therefore N ′ (pi ) < Col(M )(pi ) = T . This means that N ′ (pi ) ≤ ′ 0 and thus N (p1 ∨ · · · ∨ pn ) ≤ 0. However, since N (q1 , . . . , qk , ∼r1 , . . . , ∼rm ) = Ti , we get that N ′ (q1 , . . . , qk , ∼r1 , . . . , ∼rm ) = T . This is a contradiction because N ′ is a model of P. E. N (p1 ∨ · · · ∨ pn ) = Ti and N (q1 , . . . , qk , ∼r1 , . . . , ∼rm ) = Tj with i > j. Then, M (p1 ∨ · · · ∨ pn ) = Ti and M (q1 , . . . , qk , ∼r1 , . . . , ∼rm ) = Tj , which is a contradiction because M is a model of P. Naturally, the above theorem also applies to the class of normal (non-disjunctive) logic programs, since this is included in the class of disjunctive programs: Corollary 1. Let P be a stratified normal program, {S0 , . . . , Sr } be a stratification of P and M be the minimum infinite-valued model of P. For every propositional symbol p appearing in P, M (p) ∈ {F0 , T0 , . . . , Fr , Tr }. Note that a similar bound for the order or multitude of truth values was never given in (Rondogiannis and Wadge 2005), so Corollary 1 is in itself a novel result. On the other hand, the fact that the minimum infinite-valued model of a stratified normal program does not contain the undefined truth value was already implied in Theorem 3, as it is known that the well-founded model coincides with the perfect model for (locally) stratified programs. The exclusion of the undefined truth value from the intended models of a stratified program illustrated by Theorem 5 is a trademark characteristic of the semantics of stratified programs in traditional logic programing, retained by the disjunctive perfect model semantics (but not some other disjunctive semantics). 
The following lemma reveals that the perfect models and minimal infinite-valued models of stratified programs have another quality in common, in stating that the minimal infinite-valued models of stratified programs in fact correspond to traditional minimal models. Lemma 2. Let P be a stratified disjunctive logic program with negation and let M be a minimal infinite-valued model of P. Then, Col(M ) is a minimal classical model of P. In conclusion, N is a model of P and N ❁∞ M . This is a contradiction because we have assumed that M is a ⊑∞ minimal model of P. Proof. Assume that Col(M ) is not minimal. Then, there exists a model N ′ of P with N ′ < Col(M ). Notice that Col(M ) is two-valued by Theorem 5, but N ′ need not necessarily be two-valued. We construct the following infinitevalued interpretation:  M (p), if Col(M )(p) = N ′ (p) N (p) = 0, if N ′ (p) < Col(M )(p) 6 Comparing the Semantics of Stratified Programs In this section we revisit some examples of stratified programs presented in the literature for the purpose of comparing various semantics, including the perfect model semantics. We find that the Infinite-Valued Semantics Lmin be∞ haves equivalently to the perfect model semantics in every case. We also give a summary of the more general comparison results mentioned in Section 2 supplemented by a few additional observations drawn from this section. We begin with this example from (Brass and Dix 1994): Notice that N < M and N ⊑∞ M . We claim that N is a model of P. Assume it is not. Then, there exists a clause: p1 ∨ · · · ∨ pn ← q1 , . . . , qk , ∼r1 , . . . , ∼rm such that N (p1 ∨· · ·∨pn ) < N (q1 , . . . , qk , ∼r1 , . . . , ∼rm ). We distinguish cases based on the values of N (p1 ∨· · ·∨pn ) and N (q1 , . . . , qk , ∼r1 , . . . , ∼rm ). Example 3. Consider the following stratified program: p←q p ← ∼q q∨r← A. N (p1 ∨ · · · ∨ pn ) = Fi and N (q1 , . . . , qk , ∼r1 , . . . , ∼rm ) = Tj for some i, j. 
But then, N (pi ) < 0 for all i; therefore N ′ (pi ) = F for all i and thus N ′ (p1 ∨ · · · ∨ pn ) = F . Moreover N (qi ) > 0 for all i and N (ri ) < 0 for all i. Therefore, N ′ (qi ) = T for all i and N ′ (ri ) = F for all i, and consequently N ′ (q1 , . . . , qk , ∼r1 , . . . , ∼rm ) = T . This means that N ′ is not a model of P (contradiction). B. N (p1 ∨ · · · ∨ pn ) = Fi and N (q1 , . . . , qk , ∼r1 , . . . , ∼rm ) = 0. But then, N ′ (p1 ∨ · · · ∨ pn ) = ′ F , and since N is a model of P, it must also be N ′ (q1 , . . . , qk , ∼r1 , . . . , ∼rm ) = F . This means that either for some qi we have N ′ (qi ) = F or for some ri The only possible stratification for this program is S0 = {q, r}, S1 = {p}. This program has two minimal infinitevalued models: M1 = {(p, T0 ), (q, T0 ), (r, F0 )} and M2 = {(p, T1 ), (q, F0 ), (r, T0 )}. It also has two perfect models, N1 = {p, q} and N2 = {p, r}. The two semantics give equivalent results: M1 collapses to N1 and M2 to N2 . The next example (again from (Brass and Dix 1994)) was used to demonstrate that neither the Strong Well-Founded Semantics (Ross 1989) nor the D-WFS (Brass and Dix 1995) coincide with the perfect model semantics. The same 147 Static 6= 6= 6= WFDS WFSd PEL Lmin ∞ Perfect ≶ 6= 6= 6= ≈ 6= 6= 6= 6= ≶ > ≈ > > ≶ 6= > 6= 6= ≶ 6= > 6= 6= 6= 6= 6= 6= 6= 6= Lmin ∞ PEL 6= WFSd 6= 6 = WFDS DWFS ≶ 6= ≶ Static The Strong Well-Founded Semantics, the D-WFS and Wang’s WFDS all leave r undefined, while the DWFS of (Baral 1992), the Stationary and the WFSd all allow to conclude r. The program has two perfect models, N1 = {p, r} and N2 = {q, r}, and r is true in both of them. Again, there exists a one-to-one equivalence between the perfect models and the minimal infinite-valued models M1 = {(p, T0 ), (q, F0 ), (r, T1 )} and M2 = {(p, F0 ), (q, T0 ), (r, T1 )}. As mentioned in Section 2, the Static Semantics is based on translating the program into a belief theory, i.e. a set of rules featuring the belief operator B. 
The following example is used in (1995) to justify the weak nature of the semantics: Example 5. Consider the following stratified program: Strong Stationary DWFS D-WFS D-WFS p∨q← r ← ∼p r ← ∼q Stationary Strong example is also discussed in (Baral 1992), (Wang 2000) and (Alcântara, Damásio, and Pereira 2005). Example 4. Consider the following stratified program: Table 1: Relationships among disjunctive well-founded semantics Table 1 summarizes the relationships among the discussed generalizations of the well-founded semantics, which are either stated in or inferred (by juxtaposing examples) from the cited references and this section. We use the symbol > to denote that the semantics that defines the respective row of the table is stronger than the semantics that defines the respective column. The symbol 6= simply denotes that the two semantics are different in general, while ≶ is used when we know that neither semantics is stronger or weaker than the other. Finally, ≈ denotes that they coincide under certain conditions, e.g. if restricted to a common language or to stratified programs. goto australia ∨ goto europe ← goto both ← goto australia, goto europe save ← ∼goto both cancel reservation ← ∼goto australia cancel reservation ← ∼goto europe The propositional symbol cancel reservation is assigned the value T in the two perfect models of the program. Once more, the perfect models coincide with the collapsed infinite-valued models, M1 = {(goto australia, F0 ), (goto europe, T0 ), (goto both, F0 ), (save, T1 ), (cancel reservation, T1 )}, M2 = {(goto australia, T0 ), (goto europe, F0 ), (goto both, F0 ), (save, T1 ), (cancel reservation, T1 )}. Under the Static Semantics, either ∼goto australia or ∼goto europe must be believed for cancel reservation to be derived; even though the Static Semantics derives B(∼goto australia∨ ∼goto europe) from the above program, this does not imply B∼goto australia ∨ B∼goto europe and cancel reservation is not concluded. 
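Lemma 2 can be checked mechanically on small programs such as Example 5 by enumerating classical interpretations and keeping the ⊆-minimal models. The following sketch is ours, with shortened, hypothetical atom names (not the paper's notation); it confirms that both collapsed infinite-valued models above are minimal classical models of the program.

```python
from itertools import combinations

# Example 5, encoded as (head_atoms, positive_body, negative_body);
# atom names are shortened stand-ins for goto_australia, goto_europe, etc.
rules = [
    (("australia", "europe"), [], []),         # goto_australia ∨ goto_europe ←
    (("both",), ["australia", "europe"], []),  # goto_both ← goto_australia, goto_europe
    (("save",), [], ["both"]),                 # save ← ∼goto_both
    (("cancel",), [], ["australia"]),          # cancel_reservation ← ∼goto_australia
    (("cancel",), [], ["europe"]),             # cancel_reservation ← ∼goto_europe
]
atoms = sorted({a for head, pos, neg in rules for a in head} |
               {a for _, pos, neg in rules for a in pos + neg})

def is_model(true_set):
    """A rule is satisfied if its head holds or its body fails."""
    return all(
        any(a in true_set for a in head)          # some head atom true, or
        or any(a not in true_set for a in pos)    # a positive body atom false, or
        or any(a in true_set for a in neg)        # a negated body atom true
        for head, pos, neg in rules
    )

models = [frozenset(c) for k in range(len(atoms) + 1)
          for c in combinations(atoms, k) if is_model(set(c))]
minimal = [m for m in models if not any(m2 < m for m2 in models)]

# The two collapsed minimal infinite-valued models are minimal classical models:
assert frozenset({"europe", "save", "cancel"}) in minimal
assert frozenset({"australia", "save", "cancel"}) in minimal
# Lemma 2 gives one direction only: here {australia, europe, both} is also a
# ⊆-minimal classical model, but it is not a perfect model.
assert frozenset({"australia", "europe", "both"}) in minimal
```

The enumeration is exponential in the number of atoms, so it is only a sanity check for toy programs, not a semantics implementation.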
Based on this example, Przymusinski argues that the perfect model semantics is too strong. He points out that cancel reservation is not derivable under his semantics by a conscious choice, one which could be remedied, if so desired, by assuming the Disjunctive Belief Axiom, B(F ∨ G) ≡ BF ∨ BG. The Static Semantics, Przymusinski continues, is a minimal approach that can be adapted to fit a variety of requirements by explicitly adding axioms. This versatility is indeed an attractive quality of the Static Semantics. However, there is no formal statement in (Przymusinski 1995) that the inclusion of the Disjunctive Belief Axiom (or any finite set of axioms) would render the semantics equivalent to the perfect model semantics. It is our view that the perfect model semantics and Lmin∞ do give the intuitive interpretation of this specific program, though we acknowledge that for some applications, the outcome of the Static Semantics could be more desirable.

7 Conclusions and future work

In this paper we have singled out the Infinite-Valued Semantics Lmin∞ as an attractive approach to generalizing the well-founded semantics to the class of disjunctive programs and focused on its behavior in the case of stratified programs. We showed that the minimal infinite-valued models of such programs have a tiered structure that corresponds to that of the stratified program and close connections to the perfect models. Our most notable contribution is that the rank of the stratum at which a propositional symbol is placed correlates with the maximum order of the truth value this atom is assigned in a minimal model. This way we show the number of strata to be an upper bound on the order - and, consequently, the multitude - of truth values assigned by the minimal models. We consider this a significant improvement of the upper bound set by Theorem 2 (originally stated in (Cabalar et al. 2007)) for the general case, namely the cardinality of the Herbrand base of the program.
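For a concrete sense of this improvement, a least stratification can be computed as a fixpoint over the rules: a head atom must sit at least as high as its positive body atoms and strictly higher than its negated body atoms. The sketch below is ours (the rule encoding and names are illustrative assumptions, not the paper's); on the five-atom program of Example 5 it gives r = 2 strata against a Herbrand base of 5 atoms.

```python
def stratify(rules):
    """Least stratification of a program given as (head, pos_body, neg_body) rules:
    stratum(head) >= stratum(positive body atom), and
    stratum(head) >= stratum(negated body atom) + 1."""
    atoms = {a for head, pos, neg in rules for a in head} | \
            {a for _, pos, neg in rules for a in pos + neg}
    stratum = dict.fromkeys(atoms, 0)
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            need = max([stratum[q] for q in pos] +
                       [stratum[q] + 1 for q in neg] + [0])
            for p in head:
                if stratum[p] < need:
                    if need > len(atoms):  # cycle through negation: not stratified
                        raise ValueError("program is not stratified")
                    stratum[p] = need
                    changed = True
    return stratum

# Example 5 with shortened atom names: S0 = {australia, europe, both}, S1 = {save, cancel}
rules = [
    (("australia", "europe"), [], []),
    (("both",), ["australia", "europe"], []),
    (("save",), [], ["both"]),
    (("cancel",), [], ["australia"]),
    (("cancel",), [], ["europe"]),
]
strata = stratify(rules)
r = max(strata.values()) + 1        # number of strata: 2
assert r == 2 and len(strata) == 5  # Theorem 5's bound vs. |HB| from Theorem 2
```

By Theorem 5, the minimal infinite-valued models of this program only use truth values of order at most 1, which matches the models M1 and M2 listed for Example 5.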
Moreover, the undefined truth value was excepted from the bound of Theorem 2; this is not the case for our improved bound, i.e. the minimal models of a stratified program do not assign this value.

We have presented an overview of the existing alternative generalizations of the well-founded semantics for disjunctive programs, especially noting their very different characterizations, as well as any comparative results we have found involving the discussed approaches and the perfect model semantics. As noted by other authors, this great difference in characterizations makes it particularly hard to perform thorough comparisons of the approaches. Therefore, most comparisons are limited to examining example programs, some of which we have discussed in this paper.

Many questions remain open in regard to the infinite-valued semantics of disjunctive programs in general and stratified disjunctive programs in particular. For example, the development of an immediate consequence operator for the disjunctive infinite-valued semantics would hold great value in itself. Additionally, it could contribute greatly to the comparative study of the infinite-valued semantics and other disjunctive semantics that have fixed-point characterizations, including the perfect model semantics (Fernández and Minker 1995). This could help to better understand the place of the disjunctive infinite-valued semantics as a generalization of the well-founded semantics and perhaps the perfect model semantics.

References

Alcântara, J.; Damásio, C. V.; and Pereira, L. M. 2005. A well-founded semantics with disjunction. In Gabbrielli, M., and Gupta, G., eds., ICLP, volume 3668 of Lecture Notes in Computer Science, 341–355. Springer.
Apt, K. R.; Blair, H. A.; and Walker, A. 1988. Towards a theory of declarative knowledge. In Foundations of Deductive Databases and Logic Programming. Morgan Kaufmann. 89–148.
Baral, C. 1992. Generalized negation as failure and semantics of normal disjunctive logic programs. In Voronkov, A., ed., LPAR, volume 624 of Lecture Notes in Computer Science, 309–319. Springer.
Brass, S., and Dix, J. 1994. A disjunctive semantics based on unfolding and bottom-up evaluation. In GI Jahrestagung, 83–91.
Brass, S., and Dix, J. 1995. Disjunctive semantics based upon partial and bottom-up evaluation. In ICLP, 199–213.
Brass, S.; Dix, J.; Niemelä, I.; and Przymusinski, T. C. 2001. On the equivalence of the static and disjunctive well-founded semantics and its computation. Theor. Comput. Sci. 258(1-2):523–553.
Cabalar, P.; Odintsov, S. P.; Pearce, D.; and Valverde, A. 2006. Analysing and extending well-founded and partial stable semantics using partial equilibrium logic. In Etalle, S., and Truszczynski, M., eds., ICLP, volume 4079 of Lecture Notes in Computer Science, 346–360. Springer.
Cabalar, P.; Pearce, D.; Rondogiannis, P.; and Wadge, W. W. 2007. A purely model-theoretic semantics for disjunctive logic programs with negation. In Baral, C.; Brewka, G.; and Schlipf, J. S., eds., LPNMR, volume 4483 of Lecture Notes in Computer Science, 44–57. Springer.
Eiter, T.; Gottlob, G.; and Mannila, H. 1994. Adding disjunction to datalog (extended abstract). In Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS ’94, 267–278. New York, NY, USA: Association for Computing Machinery.
Fernández, J. A., and Minker, J. 1995. Bottom-up computation of perfect models for disjunctive theories. J. Log. Program. 25(1):33–51.
Gelfond, M., and Lifschitz, V. 1988. The stable model semantics for logic programming. In ICLP/SLP, 1070–1080. MIT Press.
Lobo, J.; Minker, J.; and Rajasekar, A. 1992. Foundations of Disjunctive Logic Programming. Cambridge, MA, USA: MIT Press.
Przymusinski, T. C. 1988. On the declarative semantics of deductive databases and logic programs. In Foundations of Deductive Databases and Logic Programming. Morgan Kaufmann. 193–216.
Przymusinski, T. C. 1989. Every logic program has a natural stratification and an iterated least fixed point model. In Silberschatz, A., ed., PODS, 11–21. ACM Press.
Przymusinski, T. C. 1990. Stationary semantics for disjunctive logic programs and deductive databases. In Proceedings of the 1990 North American Conference on Logic Programming, 40–59. Cambridge, MA, USA: MIT Press.
Przymusinski, T. C. 1991. Semantics of disjunctive logic programs and deductive databases. In Delobel, C.; Kifer, M.; and Masunaga, Y., eds., Deductive and Object-Oriented Databases, Second International Conference, DOOD’91, Munich, Germany, December 16-18, 1991, Proceedings, volume 566 of Lecture Notes in Computer Science, 85–107. Springer.
Przymusinski, T. C. 1995. Static semantics for normal and disjunctive logic programs. Ann. Math. Artif. Intell. 14(2-4):323–357.
Rondogiannis, P., and Wadge, W. W. 2005. Minimum model semantics for logic programs with negation-as-failure. ACM Trans. Comput. Log. 6(2):441–467.
Ross, K. A. 1989. The well founded semantics for disjunctive logic programs. In DOOD, 385–402.
van Gelder, A.; Ross, K. A.; and Schlipf, J. S. 1988. Unfounded sets and well-founded semantics for general logic programs. In Edmondson-Yurkanan, C., and Yannakakis, M., eds., PODS, 221–230. ACM.
van Gelder, A. 1988. Negation as failure using tight derivations for general logic programs. In Minker, J., ed., Foundations of Deductive Databases and Logic Programming. Morgan Kaufmann. 149–176.
Wang, K. 2000. Argumentation-based abduction in disjunctive logic programming. J. Log. Program. 45(1-3):105–141.
Wang, K. 2001. A comparative study of well-founded semantics for disjunctive logic programs. In Eiter, T.; Faber, W.; and Truszczynski, M., eds., LPNMR, volume 2173 of Lecture Notes in Computer Science, 133–146. Springer.
Springer. Eiter, T., and Gottlob, G. 1993. Complexity aspects of various semantics for disjunctive databases. In Proceedings of the Twelfth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS ’93, 158–167. New York, NY, USA: Association for Computing Machinery. 149 Information Revision: The Joint Revision of Belief and Trust Ammar Yasser1 , Haythem O. Ismail2,1 1 German University in Cairo, Egypt 2 Cairo University, Egypt ammar.abbas@guc.edu.eg, haythem.ismail@guc.edu.eg Abstract and Bonnet 2018). Nevertheless, we believe that there are several issues that are left unaddressed by the logical approaches. Intuitively, trust is intimately related to misleading, on one hand, and belief revision, on the other. While several logical treatments of misleading are to be found in the literature (Sakama, Caminada, and Herzig 2010; van Ditmarsch 2014; Sakama 2015; Ismail and Attia 2017, for instance), the relation of misleading to trust erosion is often not attended to or delegated to future work. On the other hand, the extensive literature on belief revision (Alchourrón, Gärdenfors, and Makinson 1985; Hansson 1994; Darwiche and Pearl 1997; Van Benthem 2007, for example), while occasionally addressing trust-based revision of beliefs (Lorini, Jiang, and Perrussel 2014; Rodenhäuser 2014; Booth and Hunter 2018) does not have much to say about the revision of trust (but see (Liau 2003; Lorini, Jiang, and Perrussel 2014) for minimal discussions) and, as far as we know, any systematic study of jointly revising belief and trust. The goal of this paper is, hence, twofold: (i) to motivate why belief and trust revision are intertwined and should be carried out together, and (ii) to propose AGM-style postulates for the joint revision of trust and belief. The paper is structured as follows. Section 2 describes what we mean by trust, information and information sources. It also highlights the intuitions behind joint trust and belief revision. 
In Section 3, we present information states, a generic structure representing information and investigating its properties. Section 4 presents a powerful notion of relevance which information structures give rise to. In Section 5, the formal inter-dependency of belief and trust is explored, culminating in AGM-style postulates for joint belief-trust revision. Finally, Section 6 presents an extended example highlighting some of the key concepts proposed in the paper.1 Most of our decisions are guided by trust, specifically decisions about what to believe and what not to believe. We accept information from sources we trust and doubt information from sources we do not trust and, in general, rely on trust in revising our beliefs. While we may have difficulty defining exactly what trust is, we can, on one hand, rather easily explain why we trust or mistrust someone and, on the other hand, occasionally revise how much we trust them. In this paper, we propose that trust revision and belief revision are inseparable processes. We address issues concerning the formalization of trust in information sources and provide AGMstyle postulates for rational joint revision of the two attitudes. In doing so, we attempt to fill a number of gaps in the literature on trust, trust revision, and their relation to belief revision. 1 Introduction Trust acts, even if we are not aware, as an information filter. We are willing to believe in information communicated by sources we trust, cautious about information from sources we do not trust, and suspicious about information from sources we mistrust. Trust and mistrust are constantly revised; we gain more trust in information sources the more they prove themselves to be reliable, and our trust in them erodes as they mislead us one time after the other. Such attitudes allow us to be resilient, selective and astute. 
If exhibited by logic-based agents, these same attitudes would make them less susceptible to holding false beliefs and, hence, less prone to excessive belief revision. Moreover, by revising trust, these agents will not forever be naively trusting nor cynically mistrusting. Trust has been thoroughly investigated within multiagent systems (Castelfranchi and Falcone 1998; Falcone and Castelfranchi 2001; Jones and Firozabadi 2001; Jones 2002; Sabater and Sierra 2005; Katz and Golbeck 2006, for instance), psychology (Simpson 2007; Elangovan, Auer-Rizzi, and Szabo 2007; Haselhuhn, Schweitzer, and Wood 2010, for instance), and philosophy (Holton 1994; Hardwig 1991; McLeod 2015, for instance). Crucially, it was also investigated in the logic-based artificial intelligence (AI) literature by several authors (Demolombe 2001; Demolombe and Liau 2001; Liau 2003; Katz and Golbeck 2006; Herzig et al. 2010; Drawel, Bentahar, and Shakshuki 2017; Leturc 2 2.1 Trust and Belief Trust in Information Sources It is often noted that trust is not a dyadic relation, between the trusted and the trustee, but is a triadic relation involving an object of trust (McLeod 2015). You trust your doctor 1 Because of space constraints, we were not able to provide all results in this paper. Hence, some selected proofs are available through this online appendix: proofs. 150 in σ2 who earlier conveyed ψ to us. Moreover, suppose that φ, together with other beliefs, implies our old belief ξ. We say that φ is a confirmation of ξ. This confirmation may trigger us to revise, and increase, our trust in σ3 who is the source of ξ. Thus, trust revision depends on belief revision. In fact, belief revision may be the sole factor that triggers rational trust revision in information sources. We need not stop there though. For, by reducing our trust in σ2 ’s reliability, we are perhaps obliged to stop believing (or reduce our degree of belief in) ψ ′ which was conveyed by σ2 . 
It is crucial to note that ψ ′ may be totally consistent with φ and we, nevertheless, give it away. While we find such scenario quite plausible, classical belief revision, with its upholding of the principle of minimal change, would deem it irrational. Likewise, by increasing our trust in σ3 we may start believing (or raise our degree of belief) in ξ ′ which was earlier conveyed by σ3 . This second round of belief revision can start a second round of trust revision. It is clear that we may keep on doing this for several rounds (perhaps indefinitely) if we are really fanatic about information and its sources. Hence, we contend that belief revision and trust revision are so entangled that they need to be combined into one process of joint belief-trust revision or, as we shall henceforth refer to it, information revision. with your health, your mechanic with your car, your parents to unconditionally believe you, and your mathematics professor to tell you only true statements of mathematics. Our investigation of the coupling of belief and trust lets us focus only on trust in sources of information. Trust in information sources comes in different forms. Among Demolombe’s (Demolombe 2004; Lorini and Demolombe 2008) different types of trust in information sources, we focus on trust in sincerity and competence since they are the two types relevant to belief revision and realistic information sources.2 A sincere information source is one which (if capable of forming beliefs) only conveys what it believes; a competent source is one which only conveys what is true. In this paper, we consider trust in the reliability of information sources, where a source is reliable if it is both sincere and competent.3 Note that we do not take information sources to only be cognitive agents. For example, a sensor (or perception, in general) is a possible source of information. For information sources which are not cognitive agents, reliability reduces to competence. 
2.2 Joint Revision of Trust and Belief Rational agents constantly receive information, and are faced with the question of whether to believe or not to believe. The question is rather simple when the new information is consistent with the agent’s beliefs, since no obvious risk lies in deciding either way. Things become more interesting if the new information is inconsistent with what the agent believes; if the agent decides to accept the new information, it is faced with the problem of deciding on which of its old beliefs to give up in order to maintain consistency. Principles for rationally doing this are the focus of the vast literature on belief revision (Alchourrón, Gärdenfors, and Makinson 1985; Hansson 1999a, for example). It is natural to postulate that deciding whether to believe and how to revise our beliefs–the process of belief revision– are influenced by how much we trust the source of the new piece of information. (Also see (Lorini, Jiang, and Perrussel 2014; Rodenhäuser 2014; Booth and Hunter 2018).) In particular, in case of a conflict with old beliefs, how much we trust in the source’s reliability and how much evidence we have accumulated for competing beliefs seem to be the obvious candidates for guiding us in deciding what to do. Thus, rational belief revision depends on trust. But things are more complex. For example, suppose that information source σ1 , whom we trust very much, conveys φ to us. φ is inconsistent with our beliefs but, because we trust σ1 , we decide to believe in φ and give away ψ which, together with other beliefs, implies ¬φ. In this case, we say that φ is a refutation of ψ. So far, this is just belief revision, albeit one which is based on trust. 
But, by stopping believing in ψ, we may find it rational to revise, and decrease, our trust in the sources that previously conveyed ψ.

3 Information States

In order for an agent A to perform information revision, revising both its beliefs and its trust in sources, it needs to be able to recall more than just what it believes, or how much it trusts certain sources, as is most commonly the case in the literature. Hence, we introduce formal structures for representing information in a way that facilitates information revision.

2 Trust in completeness, for example, is unrealistic since it requires that the source informs about P whenever P is true.
3 As suggested by (Ismail and Attia 2017), it is perhaps possible that breach of sincerity and competence should have different effects on belief revision; for simplicity, we do not consider this here though.

Definition 3.1. An information grading structure G is a quintuple (Db, Dt, ≺b, ≺t, δ), where Db and Dt are nonempty, countable sets; ≺b and ≺t are, respectively, total orders over Db and Dt; and δ ∈ Dt.

Db and Dt contain the degrees of belief and trust, respectively. They are not necessarily finite, and they may be disjoint, overlapping, or identical.4 Moreover, to be able to distinguish the strength with which an agent believes a proposition or trusts a source, the two sets are ordered; here, we assume them to be totally ordered. δ is interpreted as the default trust degree assigned to an information source with which the agent has no earlier experience.

4 Db and Dt are usually the same; however, a qualitative account of trust and belief might have different sets for grading the two attitudes.

Definition 3.2. An information structure I is a quadruple (L, C, S, G), where
1. L is a logical language with a Tarskian consequence operator Cn,
2. C is a finite cover of L whose members are referred to as topics,
3. S is a non-empty finite set of information sources, and
4. G is an information grading structure.

Information structures comprise our general assumptions about information. S is the set of possible information sources. Possible pieces of information are statements of the language L, with each piece being about one or more, but finitely many, topics as indicated by the L-cover C. L is only required to have a Tarskian consequence operator (Hansson 1999b). A topic represents the scope of trust. It is a set of statements which may be closed under all connectives, some connectives, or none at all. Topics may also be disjoint or overlapping. Choosing topics that are not necessarily closed under logical connectives allows us to accommodate interesting cases. For example, A may have, for the same source, a different trust value when it conveys φ than when it conveys ¬φ.

Definition 3.3. Let I = (L, C, S, (Db, Dt, ≺b, ≺t, δ)) be an information structure. An information state K over I is a triple (B, T, H), where
1. B : L ↪ Db is a partial function referred to as the belief base,
2. T : S × C ↪ Dt is a partial function referred to as the trust base, and
3. H ⊆ L × S, the history, is a finite set where, for every (φ, σ) ∈ H and every T ∈ C, if φ ∈ T then (σ, T, dt) ∈ T, for some dt ∈ Dt.

Trust in information sources is recorded in T(K). This is a generalization that accommodates logics with an explicit account of trust in the object language (Demolombe and Liau 2001; Leturc and Bonnet 2018, for instance) as well as those without (Katz and Golbeck 2006; Jøsang, Ivanovska, and Muller 2015, for example). H(K) acts as a formal device for recording conveyance instances.5 As with T(K), we do not require L to have an explicit account of conveying.6 With this setup, trust in single propositions, as is most commonly the case in the literature (Demolombe 2004; Leturc and Bonnet 2018, for instance), reduces to restricting all topics to be singletons. On the other hand, we may account for absolute trust in sources by having a single topic to which all propositions belong.

5 We chose H(K) to be a set for simplicity. However, it could be more beneficial if it were a sequence instead. An agent might need to record multiple conveyances by the same source for the same formula, or to distinguish more recent ones, without an explicit representation of time.
6 The transfer of information is sometimes referred to using either "inform" or "communicate". In this paper, we use "convey" as a cover term for any modality of information transfer.

So far, we have defined what information states are. We now define the following abbreviations, of which we will later make use:
• σ(H(K)) = {φ | (φ, σ) ∈ H(K)}
• For(B(K)) = {φ | (φ, db) ∈ B(K)}

Information revision is the process of revising an information state K with the conveyance of a formula φ by a source σ. Every information revision operator is associated with a conveyance inclusion filter F ⊆ L × S which determines the conveyance instances that make it into H(K). Hence, a generic revision operator is denoted by ⋉F, where F is the associated filter. Revising K with a conveyance of φ by σ is denoted by K ⋉F (φ, σ). We require all revision operators ⋉F to have the same effect on the history:

H(K ⋉F (φ, σ)) = H(K) ∪ {(φ, σ)} if (φ, σ) ∈ F, and H(K ⋉F (φ, σ)) = H(K) otherwise.

There are three major filter types. A filter F is non-forgetful if F = L × S; it is forgetful if ∅ ≠ F ⊂ L × S; and it is memory-less if F = ∅. Having filters besides the non-forgetful one allows us to simulate realistic scenarios where an agent does not always remember every piece of information that was conveyed to it. Henceforth, the subscript F will be dropped from ⋉F whenever this does not lead to ambiguity.

We now turn to what happens to the belief and trust bases of a revised information state. We start with two general definitions.

Definition 3.4. Formula φ is more entrenched in state K2 than in state K1, denoted K1 ≺φ K2, if
1. φ ∉ Cn(For(B(K1))) and φ ∈ Cn(For(B(K2))), or
2. (φ, b1) ∈ B(K1), (φ, b2) ∈ B(K2), and b1 ≺b b2.
If K1 ⊀φ K2 and K2 ⊀φ K1, we write K1 ≡φ K2.

Definition 3.5. Source σ is more trusted on topic T in state K2 than in state K1, denoted K1 ≺σ,T K2, if (σ, T, t1) ∈ T(K1), (σ, T, t2) ∈ T(K2), and t1 ≺t t2. If K1 ⊀σ,T K2 and K2 ⊀σ,T K1, we write K1 ≡σ,T K2.

Intuitively, a belief changes after revision if it is added to or removed from the belief base, or if its associated grade changes. Similarly, trust in a source regarding a topic changes after revision if the associated trust grade changes.

4 Relevant Change

As proposed earlier, the degrees of trust in sources depend on the degrees of belief in formulas conveyed by these sources, and vice versa. Hence, on changing the degree of belief in some formula φ, the degree of trust in a source σ that previously conveyed φ is likely to change. Conversely, when the degree of trust in σ changes, the degrees of belief in formulas conveyed by σ might change as well. To model such behavior, we need to keep track of which formulas and which sources are "relevant" to each other. First, we recall a piece of terminology due to (Hansson 1994): Γ ⊆ L is a φ-kernel (φ ∈ L) if Γ ⊨ φ and, for every ∆ ⊂ Γ, ∆ ⊭ φ.

Definition 4.1. Let K be an information state. The support graph G(K) = (SK ∪ ΦK, E), where
• SK = {σ | (φ, σ) ∈ H(K)} and
• ΦK = For(B(K)) ∪ {φ | φ ∈ σ(H(K)) for some σ ∈ SK},
is such that (u, v) ∈ E if and only if
1. u ∈ SK, v ∈ ΦK, and v ∈ u(H(K));
2. u ∈ ΦK, v ∈ ΦK, u ≠ v, and u ∈ Γ ⊆ ΦK where Γ is a v-kernel; or
3. u ∈ ΦK, v ∈ SK, and (v, u) ∈ E.

A node u supports a node v if there is a simple path from u to v. Figure 1 shows an example of the support graph for the following information state: source σ1 conveys φ, which logically implies ψ, which, in turn, is conveyed by σ2. Hence, there is an edge from σ1 to φ and from σ2 to ψ, given the first clause in the definition of the graph. Also, there is an edge from φ to ψ, given the second clause. Finally, according to the last clause, there is an edge from both φ and ψ to σ1 and σ2, respectively. Note, for example, that φ supports ψ and that σ1 supports σ2. Intuitively, φ supports ψ directly by logically implying it; σ1 supports σ2 by virtue of conveying a formula (φ) which confirms a formula (ψ) conveyed by σ2.

Figure 1: The support graph where σ1 conveys φ, which logically implies ψ, which is conveyed by σ2.

The support graph allows us to trace back and propagate changes in trust and belief to relevant beliefs and information sources along support paths. Instances of support may be classified according to the type of the relata.

Observation 4.1. Let K be an information state.
1. φ ∈ ΦK supports ψ ∈ ΦK if and only if φ ≠ ψ and (i) φ ∈ Γ ⊆ ΦK where Γ is a ψ-kernel, or (ii) φ supports some σ ∈ SK which supports ψ.
2. φ ∈ ΦK supports σ ∈ SK if and only if there is ψ ∈ σ(H(K)) such that φ ∈ Γ ⊆ ΦK where Γ is a ψ-kernel, or φ supports some σ′ ∈ SK which supports σ.
3. σ ∈ SK supports φ ∈ ΦK if and only if there is ψ ∈ σ(H(K)) such that ψ ∈ Γ ⊆ ΦK where Γ is a φ-kernel, or σ supports some σ′ ∈ SK which supports φ.
4. σ ∈ SK supports σ′ ∈ SK if and only if σ ≠ σ′ and σ supports some φ ∈ ΦK which supports σ′.

Thus, given the first three clauses, a support relation from a formula to a formula, from a formula to a source, or from a source to a formula may be established in two ways: (i) purely logically, via a path of only formulas, or (ii) with the aid of a trust link via an intermediate source. A source can only support another source, however, by supporting a formula which supports that other source.

#  | K                | K⋉                | Notes
B1 | neither          | (φ, b) ∈ B(K⋉)    | -
B2 | neither          | neither           | -
B3 | (φ, b1) ∈ B(K)   | (φ, b2) ∈ B(K⋉)   | b1 ≺b b2
B4 | (φ, b1) ∈ B(K)   | (φ, b1) ∈ B(K⋉)   | -
B5 | (¬φ, b1) ∈ B(K)  | (φ, b2) ∈ B(K⋉)   | -
B6 | (¬φ, b1) ∈ B(K)  | neither           | -
B7 | (¬φ, b1) ∈ B(K)  | (¬φ, b2) ∈ B(K⋉)  | b2 ≺b b1
B8 | (¬φ, b1) ∈ B(K)  | (¬φ, b1) ∈ B(K⋉)  | -

Table 1: The admissible scenarios of belief revision.
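To make the support graph of Definition 4.1 concrete, the following minimal Python sketch builds the edge set from a history and a belief base and checks support via simple paths. It is a sketch under stated assumptions: kernel computation requires an entailment checker and is assumed to be done elsewhere (kernels are passed in as a dictionary), and the names `support_graph` and `supports` are ours, not the paper's.

```python
def support_graph(history, beliefs, kernels):
    """Build the support graph of an information state (a sketch).

    history : set of (formula, source) pairs, i.e. H(K)
    beliefs : set of believed formulas, i.e. For(B(K))
    kernels : dict mapping a formula v to a list of its v-kernels
              (sets of formulas); kernel computation is assumed external.
    """
    sources = {s for (_, s) in history}
    formulas = beliefs | {f for (f, _) in history}
    edges = set()
    for (f, s) in history:
        edges.add((s, f))   # clause 1: source -> formula it conveyed
        edges.add((f, s))   # clause 3: formula -> source that conveyed it
    for v, v_kernels in kernels.items():
        for gamma in v_kernels:
            for u in gamma:
                if u != v and u in formulas and v in formulas:
                    edges.add((u, v))   # clause 2: kernel member -> formula
    return sources | formulas, edges

def supports(u, v, edges):
    """u supports v iff there is a simple path from u to v."""
    def dfs(node, seen):
        if node == v:
            return True
        return any(dfs(nxt, seen | {nxt})
                   for (a, nxt) in edges if a == node and nxt not in seen)
    return u != v and dfs(u, {u})
```

On the Figure 1 scenario (σ1 conveys φ, σ2 conveys ψ, and {φ} is a ψ-kernel since φ logically implies ψ), the sketch reproduces the support relations discussed above: φ supports ψ, and σ1 supports σ2 through the path σ1, φ, ψ, σ2.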
Note that self-support is avoided by requiring support paths to be simple. The support graph provides the basis for constructing an operator of rational information revision. Traditionally, belief revision is concerned with minimal change (Gärdenfors and Makinson 1988; Hansson 1999a). In this paper, we model minimality using relevance. However, our notion of relevance is not restricted to logical relevance, as in classical belief revision; it also accounts for source relevance. When an information state K is revised with a formula φ conveyed by a source σ, we want to confine changes in belief and trust to formulas and sources relevant to φ, ¬φ, and σ.

Definition 4.2. Let K be an information state and let u and v be nodes in G(K). u is v-relevant if u supports v or v supports u. Further, if φ, ψ ∈ L with Γφ ⊆ ΦK a φ-kernel and Γψ ⊆ ΦK a ψ-kernel, where u is v-relevant for some u ∈ Γφ and v ∈ Γψ, then φ is ψ-relevant.

Observation 4.2. Let K be an information state where u is v-relevant. The following are true.
1. v is u-relevant.
2. If v ∈ σ(H(K)) and u ≠ σ, then u is σ-relevant.
3. If v ∈ SK, φ ∈ v(H(K)), and u ≠ φ, then u is φ-relevant.

Hence, relevance is a symmetric relation. Crucially, if σ conveys φ, then the formulas and sources relevant to φ (other than σ) are exactly the formulas and sources relevant to σ (other than φ). For this reason, when revising with a conveyance of φ by σ, it suffices to consider only φ-relevant (and ¬φ-relevant) formulas and sources.

5 Information Revision

Before formalizing the postulates of information revision, we start by presenting the intuitions about changing beliefs and trust that constitute the foundation of said formalization.

5.1 Intuitions

Table 1 shows the possible reasonable effects on B(K) as agent A revises its information state K with (φ, σ); K⋉ is shorthand for K ⋉ (φ, σ). The cases depend on whether φ ∈ Cn(For(B(K))), ¬φ ∈ Cn(For(B(K))), or neither φ nor ¬φ is in Cn(For(B(K))).
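Definition 4.2's relevance relation is just support taken in either direction, which is why it is symmetric (Observation 4.2, clause 1). A minimal Python sketch, self-contained and with illustrative names of our own choosing, over an explicit edge set:

```python
def supports(u, v, edges):
    # u supports v iff there is a simple path from u to v in the support graph
    def dfs(node, seen):
        if node == v:
            return True
        return any(dfs(nxt, seen | {nxt})
                   for (a, nxt) in edges if a == node and nxt not in seen)
    return u != v and dfs(u, {u})

def relevant(u, v, edges):
    # Definition 4.2: u is v-relevant iff u supports v or v supports u
    return supports(u, v, edges) or supports(v, u, edges)
```

Because the disjunction is symmetric in u and v, `relevant(u, v, edges)` and `relevant(v, u, edges)` always agree, mirroring the observation.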
It is important to note that the "neither" cases are stated that strongly only to simplify introducing the intuitions. For further simplicity, we only consider cases where B(K) (and, of course, B(K⋉)) is consistent, so it is never the case that both φ and ¬φ are believed.

In Case B1, A believes neither φ nor ¬φ. Since A has no evidence to the contrary, on revising with φ, it is believed with some degree b, as it is now confirmed by a trusted source σ. Moreover, since ¬φ was neither refuted nor supported, it stays the same (K ≺φ K⋉ and K ≡¬φ K⋉). As with Case B1, in Case B2, A is neutral about φ. However, on revision, A finds that the weight of evidence for and against φ is comparable, so that it cannot accept φ (K ≡φ K⋉ and K ≡¬φ K⋉). Unlike the previous two scenarios where A believed neither φ nor its negation, in Case B3, A already believes φ with some degree b1. Consequently, revision with φ confirms what is already believed. Since a new source σ now supports φ, it becomes more entrenched (K ≺φ K⋉ and K ≡¬φ K⋉). On the other hand, in Case B4, on revising with φ, despite A's already believing φ, which is now being confirmed, φ does not become more entrenched (K ≡φ K⋉ and K ≡¬φ K⋉). An example where this might occur is when φ is believed with the maximum degree of belief, if such a degree exists, or when φ has only ever been conveyed by σ, who is now only confirming itself. In this latter case, A might choose not to increase the degree of belief in φ.

We now consider cases where A already believes ¬φ. In Case B5, revising with the conflicting piece of information φ coming from a highly-trusted source σ, A stops believing ¬φ and starts believing φ instead (K ≺φ K⋉ and K⋉ ≺¬φ K). Similarly, in Case B6, A is presented with evidence against ¬φ. After revision, A decides that there is not enough evidence to keep believing ¬φ; however, there is also not enough evidence to believe φ (K ≡φ K⋉ and K⋉ ≺¬φ K). Moreover, in Case B7, A decides, on revision, that there is not enough evidence to completely give up ¬φ, but there is enough evidence to doubt ¬φ (decrease its degree of belief), making it less entrenched (K ≡φ K⋉ and K⋉ ≺¬φ K). On the contrary, in Case B8, A decides that there is not enough evidence to change its beliefs, even when provided with φ, and hence ¬φ remains unchanged (K ≡φ K⋉ and K ≡¬φ K⋉). A possible scenario for this is when the source is not trusted and so A decides not to consider this instance of conveyance.

Other cases, we believe, should be forbidden for a rational operation of information revision. These cases are presented in Table 2.

#   | K                | K⋉                | Notes
B9  | neither          | (¬φ, b) ∈ B(K⋉)   | -
B10 | (φ, b1) ∈ B(K)   | (φ, b2) ∈ B(K⋉)   | b2 ≺b b1
B11 | (φ, b) ∈ B(K)    | neither           | -
B12 | (φ, b1) ∈ B(K)   | (¬φ, b2) ∈ B(K⋉)  | -
B13 | (¬φ, b1) ∈ B(K)  | (¬φ, b2) ∈ B(K⋉)  | b1 ≺b b2

Table 2: The forbidden scenarios of belief revision.

A is neutral about φ in Case B9. However, when provided with evidence for φ, A neither believes φ nor remains neutral; surprisingly, A starts believing ¬φ (K⋉ ≡φ K and K ≺¬φ K⋉). φ is already believed in Case B10; however, on getting a confirmation of φ, it becomes less entrenched (K⋉ ≺φ K and K ≡¬φ K⋉). Similarly, in Case B11, on receiving a confirmation of the already believed φ, A instead gives up believing φ (K⋉ ≺φ K and K ≡¬φ K⋉). An extreme case is Case B12, where A, already believing φ, receives a confirmation thereof and, upon revision, ¬φ ends up being believed (K⋉ ≺φ K and K ≺¬φ K⋉). Finally, in Case B13, A believes ¬φ; but, when provided with evidence against it, it becomes more entrenched nevertheless (K ≡φ K⋉ and K ≺¬φ K⋉).

The cases in Table 2 may seem far-fetched or even implausible. However, there is a line of reasoning that could accommodate such cases. Although, in this paper, we do not pursue this line of reasoning, it is at least worth a brief discussion. If agent A does not trust information source σ, A may be reluctant to believe what σ conveys, given no further supporting evidence; this much is perhaps uncontroversial. But if A not only does not trust σ but strongly mistrusts them (given a long history of being misled by the malicious source), then A may reject what σ conveys and also believe its negation. To further illustrate this concept, consider a possible example.

Example 5.1. Bob believes that there will be no classes tomorrow (φ), but he is not very certain about that. He meets Tim, who tells him "there will be no classes tomorrow". This is a direct confirmation of φ and, in normal circumstances, we should expect that Bob's belief will become more entrenched. However, Bob recalls that, time and again, Tim has viciously lied to him about cancelled classes, thereby harming his academic standing. One may consider it rational in this case for Bob to lower his degree of belief in φ, to stop believing φ or, in extreme cases, to opt for believing ¬φ.

Intuitions about when and how trust in an information source should change are very context-sensitive, and we believe it unwise to postulate sufficient conditions for trust change in a generic information revision operation. For example, one might be tempted to say that, if after revision with φ, ¬φ is no longer believed, then trust in any source supporting ¬φ should decrease. Things are not that straightforward, though.

Example 5.2. Let the belief base of agent A be {(S → P, b1), (Q → ¬S, b2)}. Information source Jordan conveys P and then conveys Q. Since A has no evidence against either, it believes both. Now, information source Nour, who is more trusted than Jordan, conveys S. Consequently, A starts believing S despite having evidence against it. To maintain consistency, A also stops believing Q (because it supports ¬S). What should happen to A's trust in Jordan?
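The admissible and forbidden scenarios of Tables 1 and 2 can be summarized as a small predicate on the agent's attitude toward φ before and after revision. The following Python sketch is ours, not the paper's: the attitude encoding and the name `classify` are illustrative, and degrees are assumed to be comparable numbers.

```python
# Attitudes toward φ: ("none",) for "neither", ("pos", b) for believing φ
# with degree b, ("neg", b) for believing ¬φ with degree b.

ADMISSIBLE = "admissible"
FORBIDDEN = "forbidden"

def classify(before, after):
    """Classify a (K, K⋉) attitude pair against Tables 1 and 2 (a sketch)."""
    kb, ka = before[0], after[0]
    if kb == "none":
        # B1 (accept φ) and B2 (stay neutral) are admissible;
        # B9 (jump to ¬φ) is forbidden.
        return ADMISSIBLE if ka in ("pos", "none") else FORBIDDEN
    if kb == "pos":
        # B3/B4: a confirmation never weakens φ; B10-B12 are forbidden.
        if ka == "pos" and after[1] >= before[1]:
            return ADMISSIBLE
        return FORBIDDEN
    # kb == "neg": evidence for φ must not strengthen ¬φ (B13 forbidden);
    # B5-B8 (switch, drop, weaken, or keep ¬φ) are admissible.
    if ka == "neg" and after[1] > before[1]:
        return FORBIDDEN
    return ADMISSIBLE
```

Under this encoding, each of the thirteen rows of the two tables maps to one branch: for instance, Case B5 (giving up ¬φ for φ) classifies as admissible, while Case B12 (a confirmation of φ flipping the agent to ¬φ) classifies as forbidden.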
We might, at first glance, think that trust in Jordan should decrease as he conveyed Q which is no longer believed. However, one could also argue that trust in Jordan should increase because he conveyed P , which is now being confirmed by N our. This example shows that setting general rules for how trust must change is almost impossible, as it depends on several factors. Whether A ends up trusting Jordan less, more, or without change appears to depend on how the particular revision operators manipulates grades. The situation becomes more complex if the new conveyance by N our supports several formulas supporting Jordan and refutes several formulas supported by him. In this case, how trust in Jordan changes (or not) would also depend on how the effects of all these support relations are aggregated. We contend that such issues should not, and cannot, be settled by general constraints on information revision. This non-determinism about how trust changes extends to similar non-determinism about how belief changes. According to Observation 4.1, a formula φ may support another formula ψ by transitivity through an intermediate source σ. Given that, in general, the effect of revising with φ on σ is non-deterministic, then so is its effect on ψ. Hence, the postulates to follow only provide necessary conditions for different ways belief and trust may change; the general principle being that the scope of change on revising with φ is limited to formulas and sources which are φ- and ¬φ-relevant. Postulating sufficient conditions is, we believe, ill-advised. 154 5.2 evidence for φ (⋉7 ). If σ ′ is less trusted after revision, then it must be either that φ succeeds and σ ′ (possibly identical to σ) is relevant to ¬φ, or that σ ′ is σ and there is believed evidence for ¬φ that leads to rejecting φ (⋉7 ). ψ is more entrenched after revision only if it is supported by φ (⋉9 ). 
Finally, ψ is less entrenched after revision only if it is relevant to φ or ¬φ (or both) and the one it is relevant to is not favored by the revision (⋉10 ). Postulates In the sequel, where φ is a formula and σ is a source, a σindependent φ-kernel is, intuitively, a φ-kernel that would still exist if σ did not exit. More precisely, for every ψ ∈ Γ, ψ is supported by some σ ′′ 6= σ, or ψ has no source. Of course, all formulas are conveyed by sources. However, given a forgetful filter, record of sources for some formulas may be missing from the history. We believe a rational information revision operator should observe the following postulates on revising an information state K with (φ, σ) and φ ∈ T where T is a topic. The postulates are a formalization of the intuitions outlined earlier. 5.3 The following observations follow from the definition of information states, the support graph, and the postulates. Observation 5.1. Let K be an information state. (⋉1 : Closure) K⋉(φ, σ) is an information state. (⋉2 : Default Attitude) If (σ, T, t) (σ, T, δ) ∈ T (K⋉(φ, σ)). ∈ / Discussion 1. Positive Entrenchment. If Cn(F or(B(K))) 6= L, then, K⋉(φ, σ) ⊀φ K. 2. Positive Persistence. If Cn(F or(B(K))) 6= L and φ ∈ Cn(F or(B(K))), then φ ∈ Cn(F or(B(K ⋉ (φ, σ)))). 3. Negative Persistence. If ¬φ ∈ / Cn(F or(B(K))), then ¬φ ∈ / Cn(F or(B(K ⋉ (φ, σ)))). 4. Formula Relevance. If K 6≡ψ K ⋉ (φ, σ), then ψ is φ- or ¬φ-relevant. 5. Trust Relevance. If K 6≡σ′ ,T K ⋉ (φ, σ), then σ ′ is φ- or ¬φ-relevant. 6. No Trust Increase I. If φ ∈ / F or(B(K ⋉ (φ, σ))), then there is no σ ′ ∈ SK such that K ≺σ′ ,T K ⋉ (φ, σ). 7. Rational Revision. If Cn(F or(B(K))) 6= L, then an operator that observes ⋉5 and ⋉6 allows for only cases in Table 1 to occur. T (K), then (⋉3 : Consistency) Cn(F or(B(K⋉(φ, σ)))) 6= L. (⋉4 : Resilience) If Cn({φ}) = L, then K ⊀σ,T K⋉(φ, σ). (⋉5 : Supported Entrenchment) K⋉(φ, σ) ≺φ K only if Cn(F or(B(K))) = L. (⋉6 : Opposed Entrenchment) K ⊀¬φ K⋉(φ, σ). 
(⋉7 : Positive Relevance) If K ≺σ′ ,T K⋉(φ, σ) and φ ∈ F or(B(K⋉(φ, σ))), then 1. σ ′ 6= σ is supported by φ; or 2. σ ′ = σ and there is Γ ⊆ F or(B(K)) where Γ is a σ-independent φ-kernel. (⋉8 : Negative Relevance) If K⋉(φ, σ) ≺σ′ ,T K, then The first two clauses of Observation 5.1 follow straight away from the definition of the postulates. On the other hand, the third and fourth clauses demonstrate how the postulates managed to reflect the intuitions behind information revision that lead us to propose the support graph. As previously discussed, information revision is considered with relevant change. Thus, we achieved our goal by ensuring that if belief in a formula (or trust in a source) is revised, this formula (or source) is relevant to the formula that triggered the revision (or possibly its negation). The fifth clause highlights the fact that if the formula of revision is rejected, no extra support is provided for anyone and hence no source will be more trusted. Finally, the last clause shows how the postulates managed to capture our intuitions about belief revision highlighted in Tables 1 and 2. Observation 5.2. Let K0 = {{}, {}, {}} be an information state. For i > 0, let Ki refer to any state resulting from the revision of Ki−1 using an operator ⋉, with a non-forgetful conveyance inclusion filter, which observes the postulates in Section 5.2. The following hold. 1. φ ∈ F or(B(K⋉(φ, σ))) and σ ′ is ¬φ-relevant; or 2. σ ′ = σ, but, there is Γ ⊆ F or(B(K ⋉ (φ, σ))) where Γ is a ¬φ-kernel. (⋉9 : Belief Confirmation) If K ≺ψ K⋉(φ, σ), then, ψ 6= φ is supported by φ. (⋉10 : Belief Refutation) If K⋉(φ, σ) ≺ψ K, then 1. ψ is ¬φ-relevant and φ ∈ Cn(F or(B(K⋉(φ, σ)))) or K⋉(φ, σ) ≺¬φ K; or 2. ψ is φ-relevant and φ ∈ / Cn(F or(B(K⋉(φ, σ)))) or K⋉(φ, σ) ≺φ K. Information revision should yield an information state (⋉1 ). An information source that has no prior degree of trust is associated with the default degree of trust (⋉2 ). 
A revised information state is consistent even if the revising formula is itself contradictory (⋉3 ). If φ is inconsistent, σ should not become more trusted7 (⋉4 ). Abiding by the admissible and forbidden cases of information revision outlined in Tables 1 and 2, φ cannot become less entrenched unless the belief base is inconsistent (⋉5 ) while, even if the belief base is inconsistent, ¬φ should not become more entrenched (⋉6 ). If an information source σ ′ is more trusted after revision, then (i) φ succeeds and (ii) either σ ′ is different from σ and supported by φ or σ ′ is σ and there is independent believed 1. Single Source Revision. If S = {σ}, then, for any information state Ki where i > 0, the maximum degree in { t | (σ, T, t) ∈ T (K) } is δ. 2. No Trust Increase II. If for every σ ∈ SKi , there is no source σ ′ that is σ-relevant, then there is no σ ∈ SKi such that Ki−1 ≺σ,T Ki . 3. No Trust Increase III. If for every information state Kj , 0 < j ≤ i, and for every source σj ∈ SKj there is no 7 A specific operator might choose to actually decrease trust in a source that conveys contradictions as this is a proof of its unreliability. 155 source σj′ that is σj -relevant, then the maximum degree in { t | (σ, T, t) ∈ T (Ki ) } is δ. to be a part of revision. Contraction is the process of removing a formula from the consequences of a belief base. AGM-Recovery states that expansion with φ after contraction with φ should yield the original belief set before contraction (φ already belongs to the belief set). Thus modeling recovery in information states is enforcing that removing a formula φ from B(K) and then expanding with (φ, σ) will result in the original belief base. As with the previous cases, ⋉ fails to observe recovery because if the contraction of φ occurred, the reintroduction of φ will affect the resulting degrees of belief and trust differently depending on the source of φ. 
The first clause in Observation 5.2 represents the case where, in fact, trust does not matter. When there is a single source, the relevance relata is reduced to logical implication between formulas. As trust can only decrease, because there can be no confirmations, information revision becomes traditional non-prioritized belief revision. The second clause draws upon the same line of reasoning. If sources are completely independent of each other, only self support is present, no source will be more trusted because, intuitively, there is no reliable independent evidence present for any of them. Last but not least, the third clause further supports our claim that if sources are not relevant to each other, relevance reduces to logical implication and, in the absence of source based support, no source will be more trusted (will not exceed the default). An ⋉ operator that observes the postulates in Section 5.2, by design, fails to observe the following AGM postulates (Alchourrón, Gärdenfors, and Makinson 1985): 6 Extended Example Let information structure I = (LV , C, S, G), where • Language LV is a propositional language with the set V = {Arr, Inc, Doomed, Kwin, Jwin, Af ather, Lymarrid, Lymother} of propositional variables. The intuitive meaning of the variables is as follows. Arr means “The army of Dany will arrive”. Inc means “Jon’s army increased in size”. Doomed means “We are all doomed”. Kwin denotes “The Knight King wins”, while Jwin denotes “Jon wins”. Af ather means that “Agon is the father of Jon”. Lymarried denotes that “Agon married Lyanna”, and finally, Lymother represents “Lyanna is the mother of Jon”. • Success. The success postulate states that, on revising with φ, agent A should believe φ. The ⋉ operator fails to observe AGM-success because information revision depends not only on the formula of revision but also on the source of that formula. Thus, A does not just accept a new piece of information. • C = {{LV }}. • Vacuity. 
AGM-vacuity states that expansion with φ, if the belief base does not derive ¬φ, is a subset of revision with φ. In order to draw a comparison, we have to first define expansion of information states. Let K + (φ, σ) denote the expansion of information state K with φ conveyed by σ. Expansion just adds a formula to the belief base without checking for consistency. Obviously, φ ∈ Cn(F or(B(K +F (φ, σ)))). However, since ⋉ does not observe success, it is not always the case that φ ∈ Cn(F or(B(K ⋉ (φ, σ)))), and hence expansion is not a subset of revision even if ¬φ ∈ / Cn(F or(B(K)))). Expansion on its own does not exist for information states as, we believe, adding a piece of information from an information source should be part of revision because: i) accepting or rejecting φ depends, among other things, on trust in σ, ii) an expanded belief base is not necessarily a consistent one. • S = {T yrion, Sam, P eter, V arys, Jon}. • G = (N, N, ≺N , ≺N , 1) where ≺N is the natural order on natural numbers. Then, we define information state K0 = (B0 , T0 , H0 ) as follows: • B0 = {(Arr → Inc, 20), (Inc → Jwin, 20), (¬Jwin → Kwin, 20), (Kwin → Doomed, 20), (Af ather, 10), (Lymarried, 6), (Af ather ∧ Lymarried → Lymother, 10)}. • T0 = {(T yrion, 5), (Sam, 5), (V arys, 4), (P eter, 3), (Jon, 10)}. The topic attribute was dropped from the tuples because there is only a single topic. • H0 = {} That is, we start revising with a consistent, non-empty, belief base, an empty history, and with an attribution of trust for all information sources. Let ⋉G be an information revision operator that will be used in this example.8 To illustrate how it works, we need to define what we call the support degree. The support degree of a formula φ with respect to a source σ, is the number of believed σ-independent φ-kernels (other than {φ}) and the number of sources (other than σ) that conveyed φ directly. 
Moreover, the support degree of a source σ is the sum of support degrees of all formulas it conveyed with respect to σ. The intuition is as follows. A source is supported to • Extensionality. Extensionality says that if φ ⇔ ψ, then revision with φ is equivalent to revision with ψ. There is no notion of an information source in the traditional AGM approach. Again, to draw a comparison, we will consider the case where revision is taking place with φ and ψ both conveyed by the same information source σ. Even then, ⋉ fails to observe extensionality. Since trust in a source is associated with a topic, and since topics need not be closed, it is not always the case that φ conveyed by σ is believed, if believed at all, to the same degree of ψ that is also conveyed by σ. Hence, revision with (φ, σ), in general, is not the same as revision (ψ, σ) even if φ ⇔ ψ. 8 ⋉G is just an operator created for the purpose of demonstrating interesting cases and is not a generic operator of information revision. • Recovery. In our framework of information revision, there is no operation of “contraction” on its own, it has 156 B(K2 ) = B(K1 ) ∪ {(Doomed, 5)}, T (K2 ) = T (K1 ), and H(K2 ) = H(K1 ) ∪ {(Doomed, T yrion)}. the extent formulas conveyed by this source are supported. However, we took into account source-independent kernels to eliminate exclusive self-support. Given an information state K, with a support graph G(K), on revising with (φ, σ), ⋉G operates as follows. Third Instance: Sam conveys Kwin. Kwin confirms Doomed. Since Doomed had no kernels and was conveyed only by T yrion, the support degree of T yrion in K2 (d1 ) was 0. This is why T yrion was not more trusted in K2 . However, after Sam conveyed Kwin, there is a T yrion-independent Doomed-kernel {Kwin, Kwin → Doomed}. Thus, the support degree of Doomed in K3 with respect to T yrion is 1, which makes T yrion’s new support degree (d2 ) 1 as well. T yrion will be more trusted by a value equal to d2 − d1 = 1. 
1. If φ is inconsistent, it will be rejected.

2. Otherwise, a degree of belief for φ is derived. For any formula φ, the degree of belief bφ = Max(F, S). F represents the degree to which an agent believes φ given all φ-kernels, while S represents how much an agent believes φ given trust in the sources that conveyed φ. Since a kernel is only as strong as its weakest formula, let Γφ be the set containing, for every φ-kernel, the formula with the lowest degree. Then, F is the degree of the formula with the maximum degree in Γφ. Intuitively, the derived degree of belief in φ, given formulas, is that of its strongest support. Similarly, S is the degree of the most trusted source among those that previously conveyed φ, including σ.

3. Add (φ, bφ) to the belief base. If any contradiction arises, for example between ¬φ and φ (or between any two formulas ψ and ¬ψ that both belong to Cn(For(B(K)))), the derived degree of belief in ¬φ is compared to that of φ, and the formula with the lower degree is contracted. To contract a formula ξ, ⋉G removes, recursively, from every single ξ-kernel the formula with the lowest degree.

4. Once the beliefs are consistent, the support graph is reconstructed. Given the new graph, trust in every φ- or ¬φ-relevant source σ′ is, possibly, revised. If (σ′, t1) ∈ T(K) and (σ′, t2) ∈ T(K⋉G), while the support degree of σ′ in K is d1 and the support degree of σ′ in K⋉G is d2, then the new degree of trust is t2 = t1 + (d2 − d1). The proposed trust update formula ensures that, for any source, if the support degree increases, trust increases, and if the support degree decreases, trust decreases.

5. Finally, given the new trust degrees derived in the previous step, for every formula ψ that is φ- or ¬φ-relevant, a possibly new degree of belief is derived in the same way bφ was derived in step 2.

Observation 6.1. ⋉G observes ⋉3–⋉10 of Section 5.2.

We now follow the changes to the information state of agent A as it observes the following conveyance instances. Every information state Ki is the result of revising Ki−1 using ⋉G, starting from K0. The conveyance inclusion filter is non-forgetful, hence every instance of conveyance will make it into the history.

First Instance: Peter conveys Arr. Since A has no evidence against Arr, A believes Arr. As there is no evidence for Arr, its degree of belief will be equal to that of the trust in its source, Peter. There were no confirmations nor refutations of any formulas, so the trust base remains unchanged. K1 is as follows: B(K1) = B(K0) ∪ {(Arr, 3)}, T(K1) = T(K0), and H(K1) = {(Arr, Peter)}.

Second Instance: Tyrion conveys Doomed. Similar to the previous case, A believes Doomed. K2 is as follows:

Because Tyrion is more trusted, Tyrion-relevant formulas could be more entrenched. In this particular case, Doomed will be more entrenched. Hence, K3 is as follows: B(K3) = B(K1) ∪ {(Doomed, 6), (Kwin, 5)}, T(K3) = (T(K1) \ {(Tyrion, 5)}) ∪ {(Tyrion, 6)}, and H(K3) = H(K2) ∪ {(Kwin, Sam)}.

Fourth Instance: Peter conveys Inc. The newly conveyed Inc is supported by a kernel {Arr, Arr → Inc}. However, since the only source supporting both Arr and Inc is Peter himself, Peter's support degree will not increase, but Inc will be believed, as A has no evidence against it. K4 is as follows: B(K4) = B(K3) ∪ {(Inc, 3)}, T(K4) = T(K3), and H(K4) = H(K3) ∪ {(Inc, Peter)}.

Fifth Instance: Varys conveys Jwin. A has no evidence against Jwin, thus A believes it. Moreover, both Arr and Inc are believed propositions supporting Jwin. Now there are two Varys-independent Jwin-kernels, namely {Inc, Inc → Jwin} and {Arr, Arr → Inc, Inc → Jwin}. Hence, the support degree of Varys becomes 2, and thus Varys will be more trusted, with trust increasing by 2. As with the third instance, since Varys is more trusted, the formulas that Varys supports could be more entrenched, and hence K5 is as follows: B(K5) = B(K4) ∪ {(Jwin, 6)}, T(K5) = (T(K4) \ {(Varys, 4)}) ∪ {(Varys, 6)}, and H(K5) = H(K4) ∪ {(Jwin, Varys)}.

Sixth Instance: Varys conveys Lymother. A already has evidence for Lymother. Lymother has a single kernel {Afather, Lymarried, Afather ∧ Lymarried → Lymother}. Since this kernel is not dependent on Varys, Varys's support degree increases by 1 (due to Lymother), resulting in Varys being more trusted. The weakest formula in the Lymother-kernel has a degree of 6. However, trust in Varys, a source who directly conveyed Lymother, is 7, and hence Lymother will have a degree of belief equal to 7. As sources become more trusted, belief in formulas conveyed by these sources could increase. Hence, belief in Jwin will increase, and K6 is as follows: B(K6) = B(K4) ∪ {(Jwin, 7), (Lymother, 7)}, T(K6) = (T(K5) \ {(Varys, 6)}) ∪ {(Varys, 7)}, and H(K6) = H(K5) ∪ {(Lymother, Varys)}.

Seventh Instance: Jon himself, after the battle, conveys ¬Jwin. Here, ¬Jwin supports Kwin and Doomed. However, it is a direct refutation of Jwin and provides evidence against Inc and Arr. This is the first time A has evidence against the newly conveyed formula. However, Jon has the highest degree of trust, and hence ¬Jwin will have a higher degree of belief than Jwin. Thus, A will choose to remove Jwin as follows. Jwin has three kernels: Γ1 = {Jwin}, Γ2 = {Inc → Jwin, Inc}, and Γ3 = {Arr, Arr → Inc, Inc → Jwin}. The operator will remove the formula with the lowest degree from every kernel. Γ1 has a single formula, so it is removed, and hence A gives up Jwin. Moreover, in Γ2, Inc has a lower degree than Inc → Jwin, thus A will give up Inc. Finally, following the same line of reasoning, Arr will be removed from Γ3. The support degree of Varys in K7 is 1, as opposed to 3 in K6.
Hence, Varys's support degree decreased by 2 and, subsequently, Varys becomes less trusted. Although Peter is Jwin-relevant, according to the definition of ⋉G, in this particular case, trust in Peter will not decrease. Both Tyrion and Sam received a new confirmation and their support degrees increased by 1, resulting in them being more trusted, which led formulas supported by them to become more entrenched. Jon is not supported by any sources or formulas, so trust in Jon will remain unchanged. K7 is as follows: B(K7) = B(K0) ∪ {(Doomed, 7), (Kwin, 6), (Lymother, 6), (¬Jwin, 10)}, T(K7) = {(Tyrion, 7), (Sam, 6), (Varys, 4), (Peter, 3), (Jon, 10)}, and H(K7) = H(K6) ∪ {(¬Jwin, Jon)}.

The case of Lymother is particularly interesting. Lymother became less entrenched after revision with ¬Jwin. In traditional AGM approaches, Lymother would not have been considered relevant to ¬Jwin, and hence it would not change, according to the principle of minimality. However, as we previously argued, belief in a formula depends on trust in the sources of said formula. Thus, when trust in Varys decreased, formulas conveyed by Varys (including Lymother) were subject to revision, irrespective of their relevance to ¬Jwin.

7 Conclusion and Future Work

It is our conviction that belief and trust revision are intertwined processes that should not be separated. Hence, in this paper, we argued why that is the case and provided a model for performing joint belief-trust (information) revision with minimal assumptions on the modelling language. Then, we introduced the notion of information states, which allows for the representation of information in a way that facilitates the revision process. Moreover, we introduced the support graph, a novel formal structure that highlights the relevance relations between not only formulas but also information sources. Finally, we proposed the postulates that we believe any rational information revision operator should observe.
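As a closing illustration of the model, the degree-of-belief derivation of step 2 can be sketched concretely. The Python below is our own toy sketch, not the authors' implementation; the function name and the non-minimal kernel degrees are illustrative assumptions, while the numbers in the assertion follow the Lymother case (weakest kernel formula of degree 6, trust of 7 in Varys):

```python
# Toy sketch (ours, not the authors' implementation) of step 2:
# the derived degree of belief b_phi = Max(F, S).
# - A kernel is only as strong as its weakest formula, so F is the maximum,
#   over all phi-kernels, of the minimum degree inside each kernel.
# - S is the trust degree of the most trusted source that conveyed phi.

def degree_of_belief(kernel_degrees, source_trusts):
    """kernel_degrees: one list of formula degrees per phi-kernel.
    source_trusts: trust degrees of the sources that conveyed phi."""
    F = max((min(kernel) for kernel in kernel_degrees), default=0)
    S = max(source_trusts, default=0)
    return max(F, S)

# The Lymother case: its single kernel's weakest formula has degree 6
# (the other degrees are illustrative), while Varys, who conveyed
# Lymother directly, is trusted at 7.
assert degree_of_belief([[6, 8, 9]], [7]) == 7
```

When a formula has no kernels at all, F defaults to 0 and the belief degree reduces to trust in the conveying sources, matching the First Instance, where Arr's degree equals Peter's trust.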
Future work could go in one or more of the following directions: 1. We intend to define a representation theorem for the postulates we provided. 2. We intend to further investigate conveyance and information acquisition, allowing agents to trust/mistrust their own perception(s). 3. Lastly, we would like to add desires, intentions, and other mental attitudes to create a unified revision theory for all mental attitudes, giving rise to an explainable AI architecture.

References

Alchourrón, C. E.; Gärdenfors, P.; and Makinson, D. 1985. On the logic of theory change: Partial meet contraction and revision functions. The Journal of Symbolic Logic 50(2):510–530. Booth, R., and Hunter, A. 2018. Trust as a precursor to belief revision. Journal of Artificial Intelligence Research 61:699–722. Castelfranchi, C., and Falcone, R. 1998. Principles of trust for MAS: Cognitive anatomy, social importance, and quantification. In Proceedings International Conference on Multi Agent Systems, 72–79. IEEE. Darwiche, A., and Pearl, J. 1997. On the logic of iterated belief revision. Artificial Intelligence 89(1–2):1–29. Demolombe, R., and Liau, C.-J. 2001. A logic of graded trust and belief fusion. In Proceedings of the 4th Workshop on Deception, Fraud and Trust in Agent Societies, 13–25. Demolombe, R. 2001. To trust information sources: a proposal for a modal logical framework. In Castelfranchi, C., and Tan, Y.-H., eds., Trust and Deception in Virtual Societies. Dordrecht: Springer Netherlands. 111–124. Demolombe, R. 2004. Reasoning about trust: A formal logical framework. In Jensen, C.; Poslad, S.; and Dimitrakos, T., eds., Trust Management, 291–303. Berlin, Heidelberg: Springer Berlin Heidelberg. Drawel, N.; Bentahar, J.; and Shakshuki, E. 2017. Reasoning about trust and time in a system of agents. Procedia Computer Science 109:632–639. Elangovan, A.; Auer-Rizzi, W.; and Szabo, E. 2007. Why don't I trust you now? An attributional approach to erosion of trust.
Journal of Managerial Psychology 22(1):4–24. Falcone, R., and Castelfranchi, C. 2001. Social trust: A cognitive approach. In Castelfranchi, C., and Tan, Y.-H., eds., Trust and Deception in Virtual Societies. Dordrecht: Springer Netherlands. 55–90. Gärdenfors, P., and Makinson, D. 1988. Revisions of knowledge systems using epistemic entrenchment. In Proceedings of the 2nd conference on Theoretical aspects of reasoning about knowledge, 83–95. Morgan Kaufmann Publishers Inc. Hansson, S. O. 1994. Kernel contraction. The Journal of Symbolic Logic 59(3):845–859. Hansson, S. O. 1999a. A survey of non-prioritized belief revision. Erkenntnis 50(2-3):413–427. Hansson, S. O. 1999b. A textbook of belief dynamics - theory change and database updating, volume 11 of Applied logic series. Kluwer. Hardwig, J. 1991. The role of trust in knowledge. The Journal of Philosophy 88(12):693–708. Haselhuhn, M. P.; Schweitzer, M. E.; and Wood, A. M. 2010. How implicit beliefs influence trust recovery. Psychological Science 21(5):645–648. Herzig, A.; Lorini, E.; Hübner, J. F.; and Vercouter, L. 2010. A logic of trust and reputation. Logic Journal of the IGPL 18(1):214–244. Holton, R. 1994. Deciding to trust, coming to believe. Australasian Journal of Philosophy 72(1):63–76. Ismail, H., and Attia, P. 2017. Towards a logical analysis of misleading and trust erosion. In Gordon, A. S.; Miller, R.; and Turán, G., eds., Proceedings of the Thirteenth International Symposium on Commonsense Reasoning, COMMONSENSE 2017, London, UK, November 6-8, 2017, volume 2052 of CEUR Workshop Proceedings. CEUR-WS.org. Jones, A. J., and Firozabadi, B. S. 2001. On the characterisation of a trusting agent—aspects of a formal approach. In Castelfranchi, C., and Tan, Y.-H., eds., Trust and Deception in Virtual Societies. Dordrecht: Springer Netherlands. 157–168. Jones, A. J. 2002. On the concept of trust. Decision Support Systems 33(3):225–232. Jøsang, A.; Ivanovska, M.; and Muller, T. 2015. 
Trust revision for conflicting sources. In The 18th International Conference on Information Fusion (Fusion 2015), 550–557. IEEE. Katz, Y., and Golbeck, J. 2006. Social network-based trust in prioritized default logic. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI 2006), 1345–1350. Leturc, C., and Bonnet, G. 2018. A normal modal logic for trust in the sincerity. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 175–183. International Foundation for Autonomous Agents and Multiagent Systems. Liau, C.-J. 2003. Belief, information acquisition, and trust in multi-agent systems—a modal logic formulation. Artificial Intelligence 149(1):31–60. Lorini, E., and Demolombe, R. 2008. From binary trust to graded trust in information sources: A logical perspective. In International Workshop on Trust in Agent Societies, 205–225. Springer. Lorini, E.; Jiang, G.; and Perrussel, L. 2014. Trust-based belief change. In Schaub, T.; Friedrich, G.; and O'Sullivan, B., eds., Proceedings of the 21st European Conference on Artificial Intelligence (ECAI 2014), volume 263 of Frontiers in Artificial Intelligence and Applications, 549–554. Amsterdam: IOS Press. McLeod, C. 2015. Trust. In Zalta, E. N., ed., The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, fall 2015 edition. Rodenhäuser, L. B. 2014. A matter of trust: Dynamic attitudes in epistemic logic. Universiteit van Amsterdam. Sabater, J., and Sierra, C. 2005. Review on computational trust and reputation models. Artificial Intelligence Review 24(1):33–60. Sakama, C.; Caminada, M.; and Herzig, A. 2010. A logical account of lying. In European Workshop on Logics in Artificial Intelligence, 286–299. Springer. Sakama, C. 2015. A formal account of deception. In 2015 AAAI Fall Symposia, Arlington, Virginia, USA, November 12–14, 2015, 34–41. AAAI Press. Simpson, J. A. 2007. Psychological foundations of trust.
Current Directions in Psychological Science 16(5):264–268. Van Benthem, J. 2007. Dynamic logic for belief revision. Journal of Applied Non-Classical Logics 17(2):129–155. van Ditmarsch, H. 2014. Dynamics of lying. Synthese 191(5):745–777.

Algebraic Foundations for Non-Monotonic Practical Reasoning

Nourhan Ehab (Department of Computer Science and Engineering, German University in Cairo) and Haythem O. Ismail (Department of Engineering Mathematics, Cairo University; Department of Computer Science and Engineering, German University in Cairo). {nourhan.ehab, haythem.ismail}@guc.edu.eg

Abstract. Practical reasoning is a hallmark of human intelligence. We are confronted everyday with situations that require us to meticulously choose among our possibly conflicting desires, and we usually do so with ease, guided by beliefs which may be uncertain or even contradictory. The desires we end up choosing to pursue make up intentions, which we seamlessly revise whenever our beliefs change. Modelling the intricate process of practical reasoning has attracted a lot of attention in the KR community, giving rise to a wide array of logics coming in diverse flavours. However, a robust logic of practical reasoning with adequate semantics, representing the preferences among the agent's different mental attitudes while capturing the intertwined revision of beliefs and intentions, remains missing. In this paper, we aspire to fill this gap by introducing general algebraic foundations for practical reasoning. We present an algebraic logic we refer to as LogA PR, capable of modelling preferences among the agent's different mental attitudes and capturing their joint revision.

1 Introduction

"What should I do?" is a question that repeatedly poses itself in our everyday lives. The process of reflection and deliberation we undergo to answer this question is what we refer to as practical reasoning (Broome 2002). To demonstrate the intricate process of practical reasoning, consider the following example.

Example 1. The Weekend Dilemma. Ted needs to decide what to do over the weekend. He has to work on a long overdue presentation, as his boss will be really mad if Ted does not give the presentation by the beginning of next week at the latest. Ted also had previous plans with his best friend Marshall to go on a hunting trip during the weekend. Ted thinks that he can go on the trip and dedicate some time to working on the presentation there. Barney, Ted's other best friend, who is currently in a fall-out with Marshall, told Ted that he heard from his fiancée Robin that the trip location has no internet connectivity, so Ted will not be able to work on the presentation there. Ted trusts that Robin usually tells the truth, but suspects that Barney might be lying to make him not go on the trip with Marshall. Ted desires to go on the trip, but he still wishes he desired not to make his boss mad. As Ted wants to start being more rational and responsible, he prefers to give up his desires or obligations whenever they conflict with his beliefs, and prefers to give up his desires whenever they conflict with his obligations. What should Ted do?

Since the days of Aristotle in the twelfth century BC, modelling practical reasoning has posed a difficult challenge for philosophers, logicians, and computer scientists alike. Several attempts have been made over the years to come up with logical theories of practical reasoning; however, a comprehensive and adequate theory remains missing (Thomason 2018). Some endeavours at modelling practical reasoning have been successful when the problem was viewed as a means to model rational agency. In this view, rational agents are thought of as practical reasoners that act based on their beliefs about the world and driven by their desires (Rao and Wooldridge 1999; Searle 2003). The action-attitudes of the agent, representing its commitment to some motivations, as permissible by its beliefs, are classically referred to as intentions (Bratman 1987; Cohen and Levesque 1990). In this way, the agent's intentions are evidently dependent on both its beliefs and desires. Taking the trinity of beliefs, desires, and intentions to be the key elements of the agent's mental state is the approach taken by the much renowned BDI model of rational agents (Rao and Georgeff 1995) and its extensions to include other mental attitudes such as obligations (Broersen et al. 2001; Broersen et al. 2002). In practical settings, the agent's beliefs and desires are often governed by a system of preferences and are continuously revised. Consequently, the revision of beliefs and desires must be reflected on the agent's intentions. The existing logical approaches to modelling preferences within the BDI architecture are the graded-BDI (g-BDI) model (Casali, Godo, and Sierra 2008; Casali, Godo, and Sierra 2011) and TEAMLOG (Dunin-Keplicz, Nguyen, and Szalas 2010). While both approaches propose frameworks for joint reasoning with graded beliefs, desires, and intentions, neither has an account of the joint revision of the three mental attitudes. Moreover, the g-BDI model lacks precise semantics, and TEAMLOG is based on a normal modal logic, providing only a third-person account of reasoning about the mental attitudes. On the other hand, the joint revision of beliefs and intentions has been attempted in (Shoham 2009; Icard, Pacuit, and Shoham 2010). These theories, however, do not account for desire or for preferences over beliefs and intentions.

In this paper, we aspire to address this gap in the literature. Our contribution is twofold. First, we introduce general algebraic foundations for first-person practical reasoning with several mental attitudes where preference and joint revision can be captured. Second, we provide precise semantics for an algebraic logic we refer to as LogA PR for joint reasoning with graded beliefs and motivations. "Log" stands for logic, "A" for algebraic, and "PR" for practical reasoning. The grades associated with the beliefs and motivations in LogA PR are reified and are taken to represent measures of trust or preference. In LogA PR, we deviate from the BDI model and its extensions in (at least) two ways: (i) we replace the notion of desire with a more general notion of motivation to encompass all the different types of motivational attitudes a rational agent can have, including (but not limited to) desires, obligations, and social norms; and (ii) we follow (Castelfranchi and Paglieri 2007; Cohen and Levesque 1990) and treat intention as a mental attitude derived from belief and motivation rather than treating it as a basic attitude.

The rest of the paper is structured as follows. In Section 2, we present the motivations behind employing a non-classical logic like LogA PR by highlighting its different capabilities. Since we are taking the algebraic route, we review in Section 3 foundational concepts of Boolean algebra on which LogA PR will be based. We also generalize the classical notion of filters in Boolean algebra into what we will refer to as multifilters, providing a generalized algebraic treatment of reasoning with multiple mental attitudes. Next, in Section 4, we present the syntax and semantics of LogA PR. In Section 5, we extend multifilters to accommodate reasoning with graded beliefs and motivations. Additionally, we present our extended graded consequence relation representing the joint reasoning from graded beliefs and motivations to intentions. Finally, in Section 6 we outline some concluding remarks.

2 Why LogA PR?

LogA PR is the most recent addition to a growing family of algebraic logics (Ismail 2012; Ismail 2013; Ismail 2020; Ehab and Ismail 2020). As such, it is essential for a treatment of practical reasoning within the algebraic framework. Hence, independent motivations for the algebraic approach are also motivations for LogA PR. Such motivations do exist, and are detailed in (Ismail 2012; Ismail 2013; Ismail 2020; Ehab and Ismail 2020). Furthermore, LogA PR is a generalization of LogA G, an algebraic logic we presented earlier for non-monotonic reasoning about graded beliefs. As proven in (Ehab and Ismail 2018; Ehab and Ismail 2019), LogA G can capture a wide array of non-monotonic reasoning formalisms such as possibilistic logic, circumscription, default logic, autoepistemic logic, and the principle of negation as failure. Thus, LogA G can be considered a unifying framework for non-monotonicity. LogA PR naturally inherits all the features of LogA G, yielding a very powerful system of practical reasoning. In what follows, we briefly present the different features of LogA PR and motivate why they are needed by referring to the introductory example. To the best of our knowledge, there does not exist a formalism for practical reasoning that possesses all the following capabilities as LogA PR does.

1. LogA PR is a graded logic. The use of graded propositions in LogA PR allows the representation of preferences among the agent's beliefs and motivations. This is useful to represent Ted's different trust degrees in his own supposition that he can go on the trip and work on the presentation and in the contradicting assertion attributed to Robin by Barney. Further, preferences among Ted's different motivations can likewise be represented.

2. The nesting of graded propositions in LogA PR admits the representation of nested graded beliefs and motivations. This naturally facilitates the representation of information acquired by Ted through a chain of sources (Barney and Robin) with different trust degrees. Permitting the nesting of graded motivations facilitates the representation of higher-order desires, first introduced in (Frankfurt 1988), which is useful for representing Ted's wish to desire not to make his boss mad, for instance.

3. Different scales of graded motivations can be represented in LogA PR. Having separate scales is useful for modelling agents with contradicting motivations and allows us to circumvent several paradoxes of deontic logic, as suggested by (Ismail 2020). Two scales, personal desire and obligation, are needed to account for Ted's desire to go on the trip and his obligation towards working on the presentation. We also provide an account for modelling the characters of artificial agents as an ordering over their belief and motivation scales. For example, a hedonistic agent will always prefer to pursue its desires over its obligations, while a selfless agent will always prefer to pursue its obligations over its desires. Whenever contradictions among the agent's motivations arise, they are resolved by alluding to the grades of the conflicting motivations in addition to the agent's character. In our example, Ted's character is represented as his preference to pursue his obligations whenever they conflict with his desires.

4. The precise semantics of LogA PR account for joint reasoning with, and revision of, graded beliefs and motivations. We follow (Rao and Georgeff 1995) and refer to the subset of consistent motivations the agent chooses to pursue as its intentions.

At this point, questions about the grades associated to beliefs and motivations may occur to the reader. The set of grades in LogA PR can be any totally ordered set of (reified) particulars. The grades may be numeric or may merely be locations on a qualitative scale. As such, only the order among the grades is significant and not their actual nature. The assignment of grades to particular beliefs and motivations is out of the scope of this paper; we briefly remark on this, however. The grades come from the same knowledge source that provides the beliefs and motivations themselves. If the latter are provided by a knowledge engineer, for example, then so must the former be. This should not complicate the task of the knowledge engineer, as several studies showed that domain experts are often quite good at setting and subjectively assessing numbers to be used as grades for the beliefs and motivations (Charniak 1991). Alternatively, if the beliefs and motivations are learned by some machine learning procedure, then the, typically numeric, grades can be learned as well. Several attempts for accomplishing this are suggested in (Fern 2010; Paccanaro and Hinton 2001; Richardson and Domingos 2006; Yang et al. 2015; Vovk, Gammerman, and Shafer 2005). It is also worth pointing out that any difficulty resulting from the task of assigning grades is a price one is bound to pay to account for non-monotonic reasoning. It can be argued that similar, equally challenging tasks arise in other non-graded non-monotonic formalisms. For instance, how do we set the priorities among the default rules when using prioritized default logic? It might even be that using quantitative grades simplifies the problem, as there are several well-defined computational approaches to setting the grades, as we previously pointed out.

3 Boolean Algebras and Multifilters

In this section we lay the algebraic foundations on which LogA PR is based. We start by reviewing the algebraic concepts of Boolean algebras and filters underlying classical logic; then we extend the notion of filters to accommodate a practical logic of multiple mental attitudes. A Boolean algebra is a sextuple A = ⟨P, +, ·, −, ⊥, ⊤⟩, where P is a non-empty set with {⊥, ⊤} ⊆ P. A is closed under the two binary operators + and · and the unary operator −, with commutativity, associativity, absorption, and complementation properties as detailed in (Sankappanavar and Burris 1981). For the purposes of this paper, we will take the elements of P to be propositions and the operators +, ·, and − to be disjunction, conjunction, and negation, respectively. The following definition of filters is an essential notion of Boolean algebras, representing an algebraic counterpart to logical consequence. Filters are defined in pure algebraic terms, without alluding to the notion of truth, by utilizing the natural lattice order ≤ on the algebra: for p1, p2 ∈ P, p1 ≤ p2 =def p1 · p2 = p1. Henceforth, A is a Boolean algebra ⟨P, +, ·, −, ⊥, ⊤⟩.

Definition 3.1. A filter of A is a subset F of P where 1. ⊤ ∈ F; 2. if a, b ∈ F, then a · b ∈ F; and 3. if a ∈ F and a ≤ b, then b ∈ F. The filter generated by Q ⊆ P is the smallest filter F(Q) of which Q is a subset.

Since practical reasoning typically involves joint reasoning with multiple mental attitudes (beliefs, motivations, intentions, wishes, etc.), we extend the notion of filters, giving rise to what we will refer to as multifilters. In contrast to classical filters, which rely on the natural order ≤ on the Boolean algebra, multifilters rely on an order on tuples. (Recall that ≤ is the classical lattice order.)

Definition 3.2. Let k be a positive integer. A k partial-order on A is a partial order ≼k on P^k such that (a1, ..., ak) ≼k (b1, ..., bk) and bi = ⊥, for some 1 ≤ i ≤ k, only if aj = ⊥, for some 1 ≤ j ≤ k. Further, we say that ≼k is classical in i just in case (i) if (a1, ..., ak) ≼k (b1, ..., bk), then ai ≤ bi; and (ii) if a ≤ b, then ({⊤}^(i−1) × {a} × {⊤}^(k−i)) × (P^(i−1) × {b} × P^(k−i)) ⊆ ≼k. We will henceforth drop the subscript k in ≼k whenever there is no resulting ambiguity.

Definition 3.3. Let ≼ be a k partial order on A and C ⊆ {1, ..., k}. A ≼-multifilter of A with respect to C is a tuple F≼(C) = ⟨F1, F2, ..., Fk⟩ of subsets of P such that 1. ⊤ ∈ Fi, for 1 ≤ i ≤ k; 2. if i ∈ C, a ∈ Fi, and b ∈ Fi, then a · b ∈ Fi; and 3. if (a1, ..., ak) ≼ (b1, ..., bk) and (a1, ..., ak) ∈ ×_{i=1}^{k} Fi, then (b1, ..., bk) ∈ ×_{i=1}^{k} Fi.

We can observe at this point that the three conditions on multifilters are just generalizations of the three conditions on filters. The second condition, though, need not apply to all the sets F1, ..., Fk. The set C specifies the sets which behave classically in observing the second condition. We next define how multifilters can be generated by a tuple of sets of propositions. The intuition is that each set of propositions represents a mental attitude and the tuple of sets represents the collective mental state.

Definition 3.4. Let Q1, ..., Qk ⊆ P, ≼ be a k partial order on A, and C ⊆ {1, ..., k}. The ≼-multifilter generated by ⟨Q1, ..., Qk⟩ with respect to C, denoted F≼(⟨Q1, ..., Qk⟩, C), is a ≼-multifilter ⟨Q′1, ..., Q′k⟩ with respect to C where Q′i is the smallest set containing Qi, for 1 ≤ i ≤ k.

The following theorem states that, under certain conditions, multifilters can be reduced to classical filters applied to the different sets of propositions representing the different mental attitudes.

Theorem 1. Let Q1, ..., Qk ⊆ P, C ⊆ {1, ..., k}, and ≼ be a k partial order on A which is classical in i for some i ∈ C. If F≼(⟨Q1, ..., Qk⟩, C) = ⟨Q′1, ..., Q′k⟩, then Q′i = F(Qi).

In the remainder of the paper, we will be assuming that practical reasoning is based on a tuple of sets of propositions, the first set representing the agent's beliefs and the rest representing different types of motivations that the agent acts upon. When using multifilters, we will assume that only the set of beliefs behaves classically (C = {1}). We will henceforth use F≼(⟨Q1, ..., Qk⟩) as a shorthand for F≼(⟨Q1, ..., Qk⟩, {1}).

4 LogA PR Languages

In this section, we present the syntax and semantics of LogA PR, in addition to defining two logical consequence relations, one for beliefs and the other for motivations. Utilizing the multifilters presented in Section 3, we show that our logical consequence relations have the distinctive properties of classical Tarskian logical consequence.

4.1 LogA PR Syntax

LogA PR consists of terms constructed algebraically from function symbols. There are no sentences; instead, we use terms of a distinguished syntactic type to denote propositions. Propositions are included as first-class individuals in the LogA PR ontology and are structured in a Boolean algebra. Though non-standard, the inclusion of propositions in the ontology has been suggested by several authors (Church 1950; Bealer 1979; Parsons 1993; Shapiro 1993). Grades are also taken to be first-class individuals. As a result, propositions about graded beliefs and motivations can be constructed, which are themselves recursively gradable. A LogA PR language is a many-sorted language composed of a set of terms partitioned into three base sorts: σP is a set of terms denoting propositions, σG is a set of terms denoting grades, and σI is a set of terms denoting anything else. A LogA PR alphabet Ω includes a non-empty, countable set of constant and function symbols, each having a syntactic sort from the set σ = {σP, σG, σI} ∪ {τ1 −→ τ2 | τ1 ∈ {σP, σG, σI} and τ2 ∈ σ} of syntactic sorts. Intuitively, τ1 −→ τ2 is the syntactic sort of function symbols that take a single argument of sort σP, σG, or σI and produce a functional term of sort τ2. Given the restriction of the first argument of function symbols to base sorts, LogA PR is, in a sense, a first-order language.
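Before continuing with the alphabet, it is worth noting that the classical filter construction of Definition 3.1 is directly computable on finite algebras. The Python sketch below is our own illustration, not part of LogA PR: propositions are modelled as sets of worlds, the meet (·) as intersection, and the lattice order ≤ as the subset relation; the names `powerset` and `generated_filter` are our assumptions:

```python
from itertools import combinations

def powerset(worlds):
    """All subsets of `worlds`, as frozensets."""
    ws = list(worlds)
    return [frozenset(c) for r in range(len(ws) + 1) for c in combinations(ws, r)]

def generated_filter(Q, worlds):
    """F(Q): the smallest filter of the powerset algebra over `worlds`
    containing Q (Definition 3.1): it contains top, is closed under
    meets (here, intersection), and is upward closed under <= (here, subset)."""
    P = powerset(worlds)
    F = set(Q) | {frozenset(worlds)}            # condition 1: top is in F
    changed = True
    while changed:
        changed = False
        for a in list(F):
            for b in list(F):
                if (a & b) not in F:            # condition 2: close under meets
                    F.add(a & b)
                    changed = True
            for b in P:
                if a <= b and b not in F:       # condition 3: upward closure
                    F.add(b)
                    changed = True
    return F

# The filter generated by {{1}} over the worlds {1, 2}: {1} and everything above it.
assert generated_filter({frozenset({1})}, {1, 2}) == {frozenset({1}), frozenset({1, 2})}
```

The same fixpoint pattern extends to multifilters (Definition 3.4), except that, with C = {1}, only the belief component is closed under meets.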
In addition, an alphabet Ω includes a countably infinite set of variables of the three base sorts; a set of syncategorematic symbols including the comma, various matching pairs of brackets and parentheses, and the symbol ∀; and a set of logical symbols defined as the union of the following sets: (i) {¬} ⊆ σP −→ σP , (ii) {∧, ∨} ⊆ σP −→ σP −→ σP , (iii) . {⋖, =} ⊆ σG −→ σG −→ σP , and (iv) {G} ∪ {Mi }ki=1 ⊆ σP −→ σG −→ σP . G(p, g) denotes a belief that the grade of p is g and Mi (p, g) denotes that p is a motivation of type i with a grade of g. Terms involving ⇒ (material implication) and ∃ are abbreviations defined in the standard way. A LogA PR language L is the smallest set of terms formed according to the following rules, where t and ti (i ∈ N) are terms in L. The bridge rules serve to “bridge” propositions across the different mental attitudes. A bridge rule B, M1 , ..., Mk 7−→ B ′ , M1′ , ..., Mk′ means that if B is a subset of the current beliefs and Mi is a subset of the current i motivations, then B ′ should be added to the current beliefs and each Mi′ should be added to the current i motivations. We now go back to Example 1 showing a corresponding encoding of it as a LogA PR theory. Example 2. Let “p” denote working on the presentation, “t” denote going to the trip, and “m” denote the boss’s getting mad. A possible LogA PR theory representing Example 1 is T = hB, (M1 , M2 ), Ri where: • B is made up of the following terms. b1. G(p ∧ t, 5) b2. G(G(p ⇔ ¬t, 10), 2) • M1 = {M1 (t, 1)}. • M2 is the set made up of the following terms. o1. M2 (p, 1) o2. M2 (M1 (¬m, 2), 3) • R is the set of instances of the following rule schema where φ and g are variables. r1. {}, {}, {M1 (φ, g)} 7−→ {}, {M1 (φ, g)}, {}. r2. {}, {M1 (¬m, g)}, {} 7−→ {}, {M1 (p, g)}, {}. b1 represents Ted’s belief that he can work on the presentation. He trusts his belief b1 with a degree of 5. 
b2 represents the information Ted acquired through a chain of sources (Barney and Robin) that he cannot work on the presentation while being on the trip. Since Ted trusts Robin more than Barney, p ⇔ ¬t which is acquired through Robin is given the grade 10 and the whole graded belief G(¬p ⇔ ¬t, 10) acquired through Barney is given the grade of 2 as Ted trusts Barney the least. There are two types of motivations in this example: Ted’s personal desires and his obligations. M1 (φ, g) represents that Ted desires to φ with a degree of g. Likewise, M2 (φ, g) represents that Ted is obliged to φ with a degree of g. M1 is made up of Ted’s desire to go to the trip with a degree of 1. o1 represents Ted’s obligation to work on the presentation with a degree of 1 as well. o2 represents his obligation to desire to not make his boss mad. r1 is a bridge rule motivated by Ted’s character that prefers to pursue his obligations. So whenever Ted is obliged to have a desire, then he has it as a desire. r2 represents that if Ted has a desire to make his boss not mad with a degree g, then he should desire to work on the presentation with the same degree g. • All variables and constants in the alphabet Ω are in L. • f (t1 , . . . , tm ) ∈ L, where f ∈ Ω is of type τ1 −→ . . . −→ τm −→ τ (m > 0) and ti is of type τi . • ¬t ∈ L, where t ∈ σP . • (t1 ⊗ t2 ) ∈ L, where ⊗ ∈ {∧, ∨} and t1 , t2 ∈ σP . • ∀x(t) ∈ L, where x is a variable in Ω and t ∈ σP . • t1 ⋖ t2 ∈ L, where t1 , t2 ∈ σG . . • t1 = t2 ∈ L, where t1 , t2 ∈ σG . • G(t1 , t2 ) ∈ L, where t1 ∈ σP and t2 ∈ σG . • Mi (t1 , t2 ) ∈ L, where t1 ∈ σP and t2 ∈ σG . In what follows, we consider two distinguished subsets ΦG and ΦM of σP . ΦG is the set of terms of the form G(φ, g) and ΦM is a set of terms of the form Mi (ψ, g) with ψ not containing any occurrence of G. 4.2 Definition 4.1. A LogA PR theory T is a triple hB, M, Ri where: From Syntax to Semantics A key element in the semantics of LogA PR is the notion of a LogA PR structure. 
• B ⊆ σP represents the agent’s beliefs;
• M = (M1, ..., Mk) is a k-tuple of subsets of ΦM representing the agent’s k motivation types; and
• R is a set of bridge rules, each of the form B, M1, ..., Mk ↦ B′, M1′, ..., Mk′, where B ⊆ σP, B′ ⊆ ΦG, and M1, ..., Mk, M1′, ..., Mk′ ⊆ ΦM.
Definition 4.2. A LogA PR structure is a sextuple Sk = ⟨D, A, g, Mk, ≪, e⟩, where
• D, the domain of discourse, is a set with two disjoint, non-empty, countable subsets: a set of propositions P, and a set of grades G.
• A = ⟨P, +, ·, −, ⊥, ⊤⟩ is a complete, non-degenerate Boolean algebra (Sankappanavar and Burris 1981).
• g : P × G → P is a belief-grading function.
• Mk = {mi | 1 ≤ i ≤ k} is a set of k motivation-grading functions such that each mi : P × G → P.
• ≪ : G × G → P is an ordering function imposing a total order.
• e : G × G → {⊥, ⊤} is an equality function, where for every g1, g2 ∈ G: e(g1, g2) = ⊤ if g1 = g2, and e(g1, g2) = ⊥ otherwise.
We next utilise a multifilter based on a ⪯TV-induced order to define an extended logical consequence relation for beliefs and motivations.
Definition 4.5. Let T = ⟨B, (M1, ..., Mk), R⟩ be a LogA PR theory. For every φ ∈ σP, φ is a belief (or motivation) consequence of T, denoted T |=B φ (or T |=M φ), if, for every valuation V, [[φ]]V ∈ B (or [[φ]]V ∈ Mi for some i, 1 ≤ i ≤ k), where ⟨B, M1, ..., Mk⟩ = F⪯TV(⟨[[B]]V, [[M1]]V, ..., [[Mk]]V⟩).
A valuation V of a LogA PR language is a triple ⟨S, Vf, Vx⟩, where S is a LogA PR structure, Vf is a function that assigns to each function symbol an appropriate function on D, and Vx is a function mapping each variable to a corresponding element of the appropriate block of D. An interpretation of LogA PR terms is given by a function [[·]]V.
Both |=B and |=M are monotonic and have the distinctive properties of classical Tarskian logical consequence, with |=B observing a variant of the deduction theorem.
Theorem 2. Let T = ⟨B, M, R⟩ and T′ = ⟨B′, M′, R′⟩ be LogA PR theories.
1. If φ ∈ B, then T |=B φ.
2. If φ ∈ Mi for some Mi ∈ M, then T |=M φ.
3. If T |=B φ, B ⊆ B′, Mi ⊆ M′i for all 1 ≤ i ≤ k, and R′ ⊆ R, then T′ |=B φ.
4. If T |=M φ, B ⊆ B′, Mi ⊆ M′i for all 1 ≤ i ≤ k, and R′ ⊆ R, then T′ |=M φ.
5. If T |=B ψ and ⟨B ∪ {ψ}, M, R⟩ |=B φ, then T |=B φ.
6. Let M′i = Mi ∪ {ψ}, for some 1 ≤ i ≤ k, and M′j = Mj, for j ≠ i. If T |=M ψ and ⟨B, M′, R⟩ |=M φ, then T |=M φ.
7. If ⟨B ∪ {φ}, M, R⟩ |=M ψ, then T |=B φ ⇒ ψ.
Definition 4.3. Let L be a LogA PR language and let V be a valuation of L. An interpretation of the terms of L is given by a function [[·]]V:
• [[x]]V = Vx(x), for a variable x
• [[c]]V = Vf(c), for a constant c
• [[f(t1, ..., tm)]]V = Vf(f)([[t1]]V, ..., [[tm]]V), for an m-adic (m ≥ 1) function symbol f
• [[(t1 ∧ t2)]]V = [[t1]]V · [[t2]]V
• [[(t1 ∨ t2)]]V = [[t1]]V + [[t2]]V
• [[¬t]]V = −[[t]]V
• [[∀x(t)]]V = ∏a∈D [[t]]V[a/x]
• [[t1 ⋖ t2]]V = [[t1]]V ≪ [[t2]]V
• [[t1 ≐ t2]]V = e([[t1]]V, [[t2]]V)
• [[G(t1, t2)]]V = g([[t1]]V, [[t2]]V)
• [[Mi(t1, t2)]]V = mi([[t1]]V, [[t2]]V)
A ⪯TV-induced order (Definition 4.4 below) is required to satisfy the following two properties.
1. If b ≤ b′, then (b, ⊤, ..., ⊤) ⪯TV (b′, ⊤, ..., ⊤).
2. If (B, ..., Mk ↦ B′, ..., Mk′) ∈ R, then ([[B]]V, ..., [[Mk]]V) ⪯TV ([[B′]]V, ..., [[Mk′]]V).
5 Graded Multifilters
Consider the LogA PR theory of the weekend dilemma from Example 2. Given that Ted believes G(p ∧ t, 5) and does not believe ¬(p ∧ t), it would make sense for him to accept p ∧ t despite his uncertainty about it. (Who is ever absolutely certain of their beliefs?) Similarly, it would make sense for Ted to add t to his desires and p to his obligations if they do not conflict with other motivations or beliefs.
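Returning to the interpretation clauses of Definition 4.3 above, a minimal sketch of them can be given over the two-element Boolean algebra {⊥, ⊤} (here `False`/`True`). This simplifies the full semantics considerably: atoms are mapped directly to truth values, and the grade sort, grading functions, and valuations Vf/Vx are omitted; the `interp` function and tuple term format are assumptions of the sketch.

```python
# Sketch of Definition 4.3's clauses over the two-element Boolean algebra:
# product is conjunction, sum is disjunction, complement is negation, and
# the infinite product for ∀ becomes `all` over a finite domain.

def interp(term, v):
    """[[term]] under an atom valuation v, per the clauses for ∧, ∨, ¬, ∀."""
    op = term[0]
    if op == "atom":
        return v[term[1]]
    if op == "and":                       # [[t1 ∧ t2]] = [[t1]] · [[t2]]
        return interp(term[1], v) and interp(term[2], v)
    if op == "or":                        # [[t1 ∨ t2]] = [[t1]] + [[t2]]
        return interp(term[1], v) or interp(term[2], v)
    if op == "not":                       # [[¬t]] = −[[t]]
        return not interp(term[1], v)
    if op == "forall":                    # [[∀x(t)]] = product over the domain
        var, body, domain = term[1], term[2], term[3]
        return all(interp(body, {**v, var: a}) for a in domain)
    raise ValueError(op)

v = {"p": True, "q": False}
assert interp(("and", ("atom", "p"), ("not", ("atom", "q"))), v) is True
```

A faithful implementation would instead interpret terms into an arbitrary complete Boolean algebra of propositions, as the definition requires.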
However, if we only use multifilters, we will never be able to reason with those nested graded beliefs and motivations, as they are not themselves in the agent’s theory, which contains only propositions grading them. For this reason, we extend our notion of multifilters into a more liberal notion of graded multifilters, enabling the agent to conclude, in addition to the consequences of the initial theory, beliefs and motivations graded by the initial beliefs and motivations (like p ∧ t). Should this lead to contradictions, the agent’s character and the grades of the contradictory propositions are used to resolve them. Due to nested grading, graded multifilters come in degrees depending on the depth of nesting of the admitted graded propositions. The rest of this section is dedicated to formalizing graded filters and presenting our definition of graded consequence for beliefs and motivations. We start by introducing some convenient abbreviations and notational conventions.
Observation 4.1. If T = ⟨B, M, R⟩ is a LogA PR theory and V a valuation, then ⪯TV is a k + 1 partial order on A. Further, if, for every (B, M1, ..., Mk ↦ B′, M1′, ..., Mk′) ∈ R, B′ ≠ {} only if Mi = M′i = {} and [[B]]V ≤ [[B′]]V, then ⪯TV is classical in 1.
• Since we are modelling joint reasoning with beliefs and different types of motivations, in the sequel we assume a tuple Q = ⟨Q0, Q1, ..., Qk⟩ where Q0, ..., Qk ⊆ P. Q0
In the rest of the paper, for any Γ ⊆ σP, we will use [[Γ]]V to denote ∏p∈Γ [[p]]V for notational convenience.
4.3 Logical Consequence
In this section, we employ our notion of multifilters from Section 3 to define logical consequence for LogA PR in algebraic terms. In Section 3, we defined multifilters based on an arbitrary partial order ⪯. We start by defining how to construct such an order for the tuples of propositions in P. The intuition is that the order is induced by the bridge rules in a LogA PR theory, in addition to the natural order ≤ among the belief propositions.
Definition 4.4. Let T = ⟨B, (M1, ..., Mk), R⟩ be a LogA PR theory and V a valuation.
A TV-induced order, denoted ⪯TV, is a partial order over Pk+1 with the two properties listed at the end of Section 4.
where ⊕ : ⋃∞i=1 Gi → G is commutative and ⟨Ck⟩nk=1 is a permutation of the set of longest grading chains of p in R.
represents a set of believed propositions, and Q1, ..., Qk represent sets of motivation propositions, where each Qi represents a different type of motivation. We will refer to Q as the mental state of the agent.
• For every p ∈ P and g ∈ G, g(p, g) is referred to as a belief-grading proposition that grades p, and p is a graded belief. Similarly, mi(p, g) is a motivation-grading proposition and p is a graded motivation.
• If g(p, g) ∈ Q0, then p is graded in Q0. Similarly, if mi(p, g) ∈ Qi, then p is graded in Qi.
• If R ⊆ P and p ∈ P, then GB(p, R) = {g(p, g) | g ∈ G and g(p, g) ∈ R} and GMi(p, R) = {mi(p, g) | g ∈ G and mi(p, g) ∈ R}, for 1 ≤ i ≤ k.
5.2 Telescoping and Graded Multifilters
The key to defining graded multifilters is the intuition that the set of consequences of Q = ⟨Q0, Q1, ..., Qk⟩ may be further enriched by telescoping Q and accepting some of the beliefs and motivations embedded therein. We refer to this process as “telescoping” as the set of graded multifilters at increasing depths can be thought of as an inverted telescope. To this end, we need to define (i) the process of telescoping, which is a step-wise process that considers both beliefs and motivations at increasing degrees of embedding, and (ii) a criterion for accepting embedded beliefs and motivations without introducing inconsistencies. In this section, we formalize the process of telescoping and the construction of graded multifilters. A first step towards defining graded multifilters is the notion of telescoping structures.
Definition 5.4. Let Sk be a LogA PR structure with a depth- and fan-out-bounded P.
A telescoping structure for Sk is a septuple T = ⟨T, O, ⊗B, ⊕B, ⊗M, ⊕M, C⟩, where
• T = ⟨T0, T1, ..., Tk⟩, where T0, T1, ..., Tk ⊆ P. T0 is referred to as the set of top beliefs, and each Ti, for 1 ≤ i ≤ k, is referred to as a set of top motivations.
• O is an ultrafilter of the subalgebra induced by Range(≪) (an ultrafilter is a maximal filter with respect to not including ⊥ (Sankappanavar and Burris 1981));
• ⊗B, ⊕B, ⊗M, and ⊕M are fusion functions from tuples of grades to grades; ⊕B and ⊕M are commutative.
• C is a partial preorder over the set {0, ..., k} representing the agent’s character.
The telescoping structure provides the top beliefs and motivations that will never be given up, together with their consequences. The ultrafilter O provides a total ordering over grades to enable comparing them. The operators ⊗B, ⊕B and ⊗M, ⊕M are used to obtain fused grades for beliefs and motivations, respectively, as per Definition 5.3. It is worth noting that, for simplicity, we opted for fusing the grades of all types of motivations using the same pair of operators ⊗M and ⊕M. The agent’s character C is defined as an ordering over the set {0, ..., k}, where 0 represents the agent’s beliefs and 1, ..., k represent the different types of motivation. The character of the agent, in addition to the grades of the motivations, will be utilised when picking a consistent set of motivations making up the agent’s intentions. For simplicity, in the sequel, we assume that the agent’s character is a total order, giving rise to what we will refer to as a linear character. We are now ready to define the T-induced telescoping of Q. The process of telescoping Q is made up of first getting the multifilter of Q and then extracting the graded propositions embedded at depth 1. This might introduce inconsistencies. We resolve the inconsistencies by getting the tuple of kernel survivors κ(E1(F(Q)), T) given the telescoping structure T.
The telescoping structure T is useful in getting the survivors, as it contains the top beliefs and motivations, fusion operators, and the agent character that will all be used to decide which propositions to keep and which to give up.
5.1 Embedding and Grading Chains
As a building step towards formalizing graded multifilters, the structure of graded propositions should be carefully scrutinized.
Definition 5.1. Let R ⊆ P and X be any of B or Mi, for 1 ≤ i ≤ k. The set EnX(R) of X-embedded propositions at depth n ∈ ℕ in R is inductively defined as follows.
• E0X(R) = R and
• Ei+1X(R) = EiX(R) ∪ {p | GX(p, EiX(R)) ≠ {}}.
In the sequel, recalling that Q = ⟨Q0, Q1, ..., Qk⟩ is the agent’s mental state, we let
En(Q) = ⟨EnB(Q0), EnM1(Q1), ..., EnMk(Qk)⟩.
Having carefully defined the notions of embedding and the degree of embedding of a graded proposition, we say that a grading chain of a belief (or motivation) p is a non-empty, finite sequence ⟨q0, q1, ..., qn⟩ where q0, q1, ..., qn are belief (or motivation) grading propositions such that qi grades qi+1 for 0 ≤ i < n and qn grades p. We next define some properties of sets of propositions based on the grading chains they include.
Definition 5.2. Let R ⊆ P.
1. R is depth-bounded if there is some d ∈ ℕ such that every belief (or motivation) grading chain in R has at most d distinct grading propositions.
2. R is fan-out-bounded if there is some fout ∈ ℕ such that every belief (or motivation) grading chain in R grades at most fout propositions.
3. R is fan-in-bounded if there is some fin ∈ ℕ where |GB(p, R)| ≤ fin (|GMi(p, R)| ≤ fin), for every p ∈ R.
Since nested grading is allowed, it is necessary to define the fused grade of a graded proposition p in a chain C. Moreover, a proposition p might be graded by more than one grading chain. Accordingly, we also need to fuse the grades of p across all the chains grading it in some R ⊆ P.
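The extraction of embedded propositions in Definition 5.1 can be sketched as a simple fixpoint-style loop. The tuple representation of graded terms (`("G", p, g)`) and the function name `embedded` are assumptions carried over from earlier sketches, and only the belief case (X = B) is shown.

```python
# Sketch of E^n_X(R) from Definition 5.1 for beliefs: starting from R,
# repeatedly add every proposition p that some member of the current set
# grades, i.e. p such that G_X(p, E^i_X(R)) is non-empty.

def embedded(R, depth):
    """X-embedded propositions at the given depth (X = B)."""
    current = set(R)
    for _ in range(depth):
        # every ("G", p, g) in the set grades p, so p becomes embedded
        graded = {term[1] for term in current if term[0] == "G"}
        current = current | graded
    return current

b2 = ("G", ("G", ("atom", "q"), 10), 2)   # nested graded belief
e1 = embedded({b2}, 1)                    # extracts G(q, 10)
e2 = embedded({b2}, 2)                    # extracts q as well
assert ("G", ("atom", "q"), 10) in e1
assert ("atom", "q") in e2 and ("atom", "q") not in e1
```

Each additional depth step corresponds to looking one level deeper through the “inverted telescope” of nested grading.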
The intuition is to compute the fused grade of p for each chain that grades it by some operator ⊗, and then to combine these fused grades using another operator ⊕.
Definition 5.3. Let R ⊆ P be fan-in-bounded. Then the fused grade of p in R is defined as
f⊕(p, R) = ⊕⟨f⊗(p, Ck)⟩nk=1
of Q where, for all i, j such that i < j in C, Qi appears before Qj in QC. In the sequel, let QC = ⟨Q′0, ..., Q′k⟩ be a C-ordered Q and QC⊥ be the set of ⊥-kernels in QC throughout.
Since this process can cause some propositions to be given up, other propositions may lose their support. For this reason, we only retain the tuple of supported propositions ς(κ(E1(F(Q)), T), T) amongst the kernel survivors.
Definition 5.7. Let X = ⟨X0, ..., Xk⟩ be a ⊥-kernel of QC. p does not survive X given T iff there is an i, 1 ≤ i ≤ k, such that p is a graded proposition in Xi, where Xi is the left-most non-empty set in X, and for all q ∈ Xi such that q ∉ T′i, with F(T) = ⟨T′0, ..., T′k⟩, (fT(p, Q′i) ≪ fT(q, Q′i)) ∈ O.
We next define what we refer to as a next-best ⊥-kernel in QC⊥. The next-best ⊥-kernel is the ⊥-kernel that must be examined next to pick a proposition to give up from one of its sets to resolve the inconsistency. Our intuition in picking such a ⊥-kernel is that, as a first condition, it has the longest sequence of empty sets from the left. That is, we are forced to give up a proposition from a more preferred set. If there are multiple ⊥-kernels satisfying this first condition, a next-best ⊥-kernel will be a kernel that contains a proposition with the highest grade in the left-most non-empty set. We do this in order to tend first to the ⊥-kernels (satisfying the first condition) that contain the more preferred propositions. It is worth noting here that there might be multiple next-best ⊥-kernels if there is more than one ⊥-kernel with the same number of empty sets from the left whose left-most non-empty set contains propositions with the same highest grade.
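The fused grade of Definition 5.3 can be sketched directly. Here chains are given as plain grade sequences rather than extracted from a theory, and `fused_grade` is a hypothetical name; ⊗ = mean and ⊕ = max are the canon operators used for the weekend dilemma later in the paper.

```python
# Sketch of Definition 5.3: fuse the grades along each grading chain with
# an operator ⊗ (here: mean), then combine the per-chain results with a
# commutative operator ⊕ (here: max).

from statistics import mean

def fused_grade(chains, fuse=mean, combine=max):
    """f_⊕(p, R): combine the ⊗-fused grade of every chain grading p."""
    return combine(fuse(chain) for chain in chains)

# b2 = G(G(p <-> not t, 10), 2): a single chain with grades <10, 2>.
assert fused_grade([[10, 2]]) == 6          # mean(10, 2) = 6, as in Sec. 5.4
# A proposition graded by two chains keeps the best chain under max.
assert fused_grade([[10, 2], [4]]) == 6
```

The commutativity requirement on ⊕ matters because the chains ⟨Ck⟩ are given only up to permutation.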
Definition 5.5. Let T be a telescoping structure for Sk. If Q = ⟨Q0, Q1, ..., Qk⟩ where every item in E1(F(Q)) is fan-in-bounded, then the T-induced telescoping of Q is given by
τT(Q) = ς(κ(E1(F(Q)), T), T).
The first two steps of telescoping were already presented in Definitions 3.4 and 5.1 respectively; we only need to define the tuples of kernel survivors and supported propositions. For kernel survival, we generalize the notion of a ⊥-kernel of a belief base (Hansson 1994) to suit reasoning with multiple sets of propositions. The intuition is that a ⊥-kernel is a tuple of sets, one for each mental attitude, where the union of the sets is a subset-minimal inconsistent set. This means that if we remove all occurrences of a single proposition from the sets in the ⊥-kernel, the union becomes consistent. In what follows, we say that a set R ⊆ P is inconsistent whenever the classical filter of R is improper (F(R) = P).
Definition 5.6. Let Q = ⟨Q0, ..., Qk⟩, X = ⟨X0, ..., Xk⟩, and ⟨X′0, ..., X′k⟩ = F(X). X is a ⊥-kernel of Q iff Xi ⊆ Qi, 1 ≤ i ≤ k, and X′0 ∪ X′1 ∪ ... ∪ X′k is a subset-minimal inconsistent set of propositions.
Definition 5.8. A next-best ⊥-kernel X∗ = ⟨X0, ..., Xk⟩ ∈ QC⊥ satisfies the following properties.
1. There does not exist another ⊥-kernel in QC⊥ with a longer sequence of empty sets from the left.
2. Let Xi be the left-most non-empty set in X∗ with a proposition p ∈ Xi. If there does not exist another proposition r ∈ Xi such that (fT(p, Q′i) ≪ fT(r, Q′i)) ∈ O, then there does not exist another ⊥-kernel ⟨X′0, ..., X′k⟩ ∈ QC⊥ satisfying condition (1) with X′i containing a proposition q where (fT(p, Q′i) ≪ fT(q, Q′i)) ∈ O.
We are now ready to present the construction of the tuple of kernel survivors.
Example 3. We refer back to Example 2. The following are examples of ⊥-kernels.1 The first set in each ⊥-kernel represents beliefs of Ted’s, the second represents desires thereof, and the third obligations.
1. ⟨{p ∧ t, p ⇔ ¬t}, {}, {}⟩.
2. ⟨{p ⇔ ¬t}, {t}, {p}⟩.
3. ⟨{p ⇔ ¬t}, {p, t}, {}⟩.
The first ⊥-kernel shows a contradiction within Ted’s beliefs; the second shows a contradiction between a belief, a desire, and an obligation; and the third shows a contradiction between a belief and two desires. Note that the unions resulting from the three ⊥-kernels are subset-minimal.
How do we choose propositions to give up and resolve inconsistency? The intuition is this: the proposition to be given up must be from the least preferred set in the mental state according to the agent’s character. If the least preferred set contains more than one proposition, then the proposition to be given up must be the proposition with the least grade in the set. To make finding the least preferred set easier, we reorder Q such that its items are ordered from the least preferred to the most preferred according to the character, and construct the ⊥-kernels out of the reordered Q.
What we do is pick a next-best ⊥-kernel from QC⊥ and get the propositions that do not survive it according to Definition 5.7. There might be more than one proposition that does not survive if they all have the same lowest grade. Such propositions are removed from all the beliefs and motivations in Q to resolve the inconsistency. The intuition behind doing this is that if some propositions in some set in the next-best ⊥-kernel do not survive, then they cannot survive in any other set, so as to guarantee the consistency of the union of the sets in the mental state. We then proceed to getting the kernel survivors from the updated Q until the union of the sets in the mental state becomes consistent.
Definition 5.9. The tuple of kernel survivors of Q given T is κ(Q, T), where κ(Q, T) is defined as follows:
1. if QC⊥ = ∅, then κ(Q, T) = Q; and
2. if X∗ is a next-best ⊥-kernel in QC⊥ and S is the set of propositions that do not survive X∗ given T, then κ(Q, T) = κ(Q′, T), where Q′ = ⟨Q0 − S, Q1 − S, ..., Qk − S⟩.
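The kernel-survivor computation can be sketched as a loop. This is a deliberate simplification of Definitions 5.7–5.9: kernels are tuples of sets ordered from least to most preferred attitude, grades come from a plain dict rather than the fused grades fT, and top sets, bridge rules, and ⊥-kernel recomputation from the algebra are all omitted; every function name here is a hypothetical stand-in.

```python
# Simplified sketch: repeatedly take a next-best kernel (longest empty
# prefix, then highest grade in its left-most non-empty set), drop the
# lowest-graded propositions of that set from the whole mental state,
# and keep only the kernels still contained in the state.

def next_best(kernels, grade):
    def empty_prefix(k):
        n = 0
        for s in k:
            if s:
                break
            n += 1
        return n
    def key(k):
        first = next(s for s in k if s)          # left-most non-empty set
        return (empty_prefix(k), max(grade[p] for p in first))
    return max(kernels, key=key)

def survivors(state, kernels, grade):
    state = [set(s) for s in state]
    while kernels:
        k = next_best(kernels, grade)
        first = next(s for s in k if s)
        low = min(grade[p] for p in first)
        doomed = {p for p in first if grade[p] == low}
        state = [s - doomed for s in state]      # remove from every attitude
        kernels = [k2 for k2 in kernels
                   if all(s <= st for s, st in zip(k2, state))]
    return state

# Two kernels echoing Example 4 (desires, obligations, beliefs):
grade = {"p&t": 5, "p<->-t": 6, "t": 1, "p": 2}
kernels = [(set(), set(), {"p&t", "p<->-t"}),    # forced to give up a belief
           ({"t"}, {"p"}, {"p<->-t"})]
state = [{"t", "p"}, {"p"}, {"p&t", "p<->-t"}]
out = survivors(state, kernels, grade)
assert "p&t" not in out[2] and "p<->-t" in out[2]
```

On this input the loop first removes p ∧ t (lowest grade in the all-beliefs kernel) and then t, mirroring the second scenario of Example 4.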
In this way, the least preferred set in a ⊥-kernel will be the left-most non-empty set. Henceforth, we assume Q = ⟨Q0, ..., Qk⟩ where each set in Q is fan-in-bounded, and a telescoping structure T = ⟨T, O, ⊗B, ⊕B, ⊗M, ⊕M, C⟩ with T = ⟨T0, T1, ..., Tk⟩. We say that QC is a C-ordered Q if QC is a permutation
1 We use the syntactic ∧ and ⇔ operators rather than their semantic counterparts for readability.
κ(Q, T) = κ(Q′, T), where Q′ = ⟨Q0 − S, Q1 − S, ..., Qk − S⟩ (Definition 5.9).
Example 4. Suppose we have the same ⊥-kernels as in Example 3. The agent character according to Example 1 is C = {0 < 1, 0 < 2, 2 < 1}, with 0 representing the agent’s beliefs, 1 representing the desires, and 2 representing the obligations. After getting QC, QC⊥ contains the following three kernels. The sets in the kernels are now ordered according to the agent character, with the first set containing desires, the second set containing obligations, and the third set containing beliefs.
Suppose we give up p ∧ t from the first ⊥-kernel. This means that any proposition that was supported by p ∧ t must go away as well, as it loses its support. Therefore, the following definition states that the supported propositions are the propositions in the sets of the multifilter of T, or the propositions that are graded by supported propositions in Q.
Definition 5.10. The set of supported propositions in Q = ⟨Q0, Q1, ..., Qk⟩ given T = ⟨T0, T1, ..., Tk⟩, denoted ς(Q, T), is the tuple ⟨S0, S1, ..., Sk⟩ where S0, S1, ..., Sk are the smallest subsets of, respectively, Q0, Q1, ..., Qk such that, for 0 ≤ i ≤ k,
1. p ∈ Si if F(T) = ⟨T′0, T′1, ..., T′k⟩ and p ∈ T′i; and
2. p ∈ Si if there is a grading chain ⟨q0, ..., qn⟩ of p in Si and there is a tuple (R0, ..., Rk) where Rj ⊆ Sj, for 0 ≤ j ≤ k, such that q0 ∈ R′i where F(R) = ⟨R′0, ..., R′k⟩.
We can now present an important result.
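The C-ordering described above is just a permutation of the mental-state tuple. A minimal sketch, with `c_ordered` and the list encoding of a linear character as assumed names:

```python
# Sketch of a C-ordered mental state: with a linear character such as
# C = {0 < 1, 0 < 2, 2 < 1} (beliefs preferred over obligations over
# desires), Q = <Q0, ..., Qk> is permuted from least to most preferred,
# so the least preferred attitude of a kernel ends up left-most.

def c_ordered(Q, preference):
    """Permute Q given attitude indices listed from most to least preferred."""
    return tuple(Q[i] for i in reversed(preference))

Q = ({"p&t"}, {"t"}, {"p"})          # beliefs, desires, obligations
preference = [0, 2, 1]               # beliefs > obligations > desires
assert c_ordered(Q, preference) == ({"t"}, {"p"}, {"p&t"})
```

The result matches Example 4, where the reordered kernels list desires first, then obligations, then beliefs.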
The following theorem states that if the union of the sets in the multifilter of T is consistent, then the union of the sets in the multifilter of the tuple of supported propositions in the kernel survivors of any Q ⊆ P given T is consistent. This basically means that the process of telescoping is consistency-preserving. Accordingly, we can revise the agent’s beliefs and motivations while maintaining consistency amongst all the beliefs and motivations.
Theorem 3. Let F(T) = ⟨B, M1, ..., Mk⟩ and F(ς(κ(Q, T), T)) = ⟨B′, M′1, ..., M′k⟩. If F(B ∪ M1 ∪ ... ∪ Mk) is proper, then F(B′ ∪ M′1 ∪ ... ∪ M′k) is proper.
Since graded multifilters come in degrees depending on the nesting level of the telescoped propositions, we need to extend the T-induced telescoping of Q to a generalized notion of T-induced telescoping of Q at degree n, as follows.
Definition 5.11. If each set in Q has finitely many grading propositions, then τT(Q) is defined, for every telescoping structure T. In what follows, provided that the right-hand side is defined, let
τnT(Q) = Q if n = 0, and τnT(Q) = τT(τn−1T(Q)) otherwise.
1. ⟨{}, {}, {p ∧ t, p ⇔ ¬t}⟩
2. ⟨{t}, {p}, {p ⇔ ¬t}⟩
3. ⟨{p, t}, {}, {p ⇔ ¬t}⟩
The next-best ⊥-kernel is the first kernel, as it has the longest sequence of empty sets from the left. While the agent’s character prefers to give up desires and obligations rather than beliefs, in the first kernel we are forced to give up beliefs to resolve the contradiction, as the desires and obligations sets are empty. The beliefs that are given up will then be removed from the less preferred obligations and desires. In this way, we first treat ⊥-kernels where we have to give up a proposition from a more preferred attitude, so that the removal of propositions from them affects less preferred attitudes. To decide which propositions to give up, we look at the grades of p ∧ t and p ⇔ ¬t. We consider three cases. 1.
If the grade of p ⇔ ¬t is less, it will be removed from the first kernel and from Ted’s beliefs and motivations, resulting in a consistent union of the sets in the mental state. In this case, giving up p ⇔ ¬t from the first ⊥-kernel resolves the inconsistency in the second and third ⊥-kernels as well.
2. If the grade of p ∧ t is less, it will be removed from the first kernel and from Ted’s beliefs and motivations. However, the inconsistency in the second and third kernels is not resolved by giving up p ∧ t. Ted now believes that he cannot work on the presentation and go on the trip, desires to go on the trip and work on the presentation, and is obliged to work on the presentation. From the updated QC, the second and third kernels are reconstructed. The second and third kernels have the same number of empty sets from the left, so to identify a next-best kernel we look at the kernel whose desires set contains a proposition with the highest grade. Suppose that p has a grade of 2 and t has a grade of 1. The next-best kernel will then be the third kernel. t is removed from the third kernel and from the beliefs and motivations, resulting in a consistent mental state. Giving up t from the third kernel resolves the inconsistency in the second kernel.
3. If the grades of p ∧ t and p ⇔ ¬t are equal, then both beliefs are given up. Both propositions are accordingly removed from the motivations, resulting in a consistent union of the sets in the mental state and resolving the inconsistency in the second and third kernels.
We are now finally ready to define graded multifilters as the multifilter of the T-induced telescoping of the tuple of sets of top propositions T at degree n.
Definition 5.12. Let T be a telescoping structure. We refer to F(τnT(T)) as a degree n (∈ ℕ) graded multifilter of T = ⟨T0, T1, ..., Tk⟩, denoted Fn(T).
The following observation states that there might be several graded multifilters of degree n.
This is due to the possible existence of several next-best ⊥-kernels at each step of getting the kernel survivors. The order of considering the possible next-best ⊥-kernels will affect the graded multifilter we end up with. It is worth noting, though, that according to Theorem 3 the union of the sets in each of the possible graded multifilters is consistent.
Observation 5.1. Let T be a telescoping structure with T = ⟨T0, T1, ..., Tk⟩. The degree n graded multifilter Fn(T) might not be unique.
After defining the kernel survivors, what remains for us to fully define the process of telescoping is to present the notion of the tuple of supported propositions in Q given a tuple of top sets of propositions T = ⟨T0, T1, ..., Tk⟩. The motivation for defining this is the following. Suppose, in Example 4,
5.3 Graded Consequence
presented in Example 2. Figure 1 shows the graded belief and motivation consequences of T with respect to a series of canons with ⊗B = mean, ⊕B = max, ⊕M, ⊗M = max, and 0 ≤ n ≤ 2, with the agent character C = {0 < 1, 0 < 2, 2 < 1}.
Level 1: Upon telescoping to level 1, the embedded beliefs, desires, and obligations at level 1 are extracted. The classical consequences of the beliefs are added as well, including p and t. Once M1(¬m, 3) is extracted in the obligations, the bridge rule r1 fires to bridge M1(¬m, 3) to Ted’s desires. This fires r2 to add M1(p, 3) to Ted’s desires as well. There are no contradictions between the extracted beliefs, desires, and obligations, so all the extracted beliefs and motivations survive telescoping and are supported. Hence, at level 1, Ted believes he can work on the presentation and go on the trip, desires to go on the trip, and is obliged to work on the presentation.
Level 2: At level 2, the embedded graded propositions at level 1 in the previous level are extracted, adding p ⇔ ¬t to Ted’s beliefs, and ¬m and p to Ted’s desires.
Note that ¬m is not extracted into the obligations, as we only telescope obligations within the set of obligations according to Definition 5.1, and ¬m occurred within a desire term. No new bridge rules are fired at level 2. However, once we do this, we get several contradictions between Ted’s beliefs, desires, and obligations: we get the three ⊥-kernels of Example 4. Since p ∧ t has a lower grade (5) than p ⇔ ¬t (with fused grade ⊗(⟨10, 2⟩) = 6, as it is graded in a grading chain), the second scenario explained in Example 4 ensues, resulting in removing p ∧ t from the agent’s beliefs and t from the desires. This causes both p and t in the agent’s beliefs to go away as well, as they lose their support. Hence, at level 2, Ted gives up his belief that he can work on the presentation while being on the trip. He accordingly gives up his desire to go on the trip and ends up desiring not to make his boss mad and, consequently, desiring to work on the presentation. Ted’s obligations to desire to make his boss not mad and to work on the presentation are retained at level 2, as Ted’s character prefers to give up desires rather than obligations. Note that a proposition was removed from the beliefs even though beliefs are the most preferred attitude, but this was necessary in order to resolve the contradiction within the beliefs. This revision only happened at level 2, as it is only there that we look deep enough into the nested graded propositions that contradicted Ted’s beliefs and motivations at level 1.
In what follows, given a LogA PR theory T = ⟨B, (M1, ..., Mk), R⟩ and a valuation V = ⟨Sk, Vf, Vx⟩, let the valuation of T be denoted as V(T) = ⟨V(B), V(M1), ..., V(Mk)⟩. Just like we used multifilters to define logical consequence in Section 4, we use graded multifilters to define graded consequence as follows.
Definition 5.13. Let T = ⟨B, (M1, ..., Mk), R⟩ be a LogA PR theory and ⪯T be a T-induced order.
For every φ ∈ σP, valuation V = ⟨Sk, Vf, Vx⟩ where Sk has a set P which is depth- and fan-out-bounded, and grading canon C = ⟨⊗B, ⊕B, ⊗M, ⊕M, C, n⟩, φ is a graded belief (or motivation) consequence of T with respect to C, denoted T |≃CB φ (or T |≃CM φ), if Fn⪯T(T) = ⟨B, M1, ..., Mk⟩ is defined and [[φ]]V ∈ B (or Mi, respectively) for every telescoping structure T = ⟨V(T), O, ⊗B, ⊕B, ⊗M, ⊕M, C⟩ where O extends F(V(T) ∩ Range(≪)).2
It is worth pointing out that |≃CB and |≃CM are non-monotonic, and that they reduce to |=B and |=M, respectively, if n = 0. The set of belief consequences makes up a consistent set of beliefs, and the set of motivation consequences makes up a consistent set of motivations representing the agent’s intentions.
5.4 The Weekend Dilemma in LogA PR
Figure 1: The graded consequences of the LogA PR theory in Example 2. The top portion of each level contains Ted’s beliefs, the middle portion contains Ted’s desires, and the bottom portion contains Ted’s obligations. The newly added terms in each level are shown in red.
In this section we revisit the weekend dilemma, showing how it can be accounted for in LogA PR and illustrating the joint belief and intention revision. Recall the LogA PR theory T = ⟨B, (M1, M2), R⟩ representing the weekend dilemma
6 Conclusion
Despite the abundance of logical theories in the literature for modelling practical reasoning, a robust theory with adequate semantics remains missing. In this paper, we introduced general algebraic foundations for practical reasoning with several mental attitudes. We also provided semantics for an algebraic logic, LogA PR, for joint reasoning with graded beliefs and motivations to decide on sets of consistent beliefs and intentions. The LogA PR semantics also captures the joint revision of the agent’s beliefs and intentions, all in one framework. We are currently working on a proof theory for LogA PR.
Reasons for intentions are to be computed in the same way reason-maintenance systems compute supports for beliefs. The end result would be a proof theory for practical reasoning augmented with the ability to explain the reasons for choosing to adopt particular intentions to achieve an initial set of motivations, giving rise to an explainable AI system.
2 An ultrafilter O extends a filter F if F ⊆ O.
Frankfurt, H. G. 1988. Freedom of the will and the concept of a person. In What is a Person? Springer. 127–144.
Hansson, S. O. 1994. Kernel contraction. The Journal of Symbolic Logic 59(03):845–859.
Icard, T.; Pacuit, E.; and Shoham, Y. 2010. Joint revision of belief and intention. In Proc. of the 12th International Conference on Knowledge Representation, 572–574.
Ismail, H. O. 2012. LogA B: A first-order, non-paradoxical, algebraic logic of belief. Logic Journal of the IGPL 20(5):774–795.
Ismail, H. O. 2013. Stability in a commonsense ontology of states. Proceedings of the Eleventh International Symposium on Logical Formalization of Commonsense Reasoning (COMMONSENSE 2013).
Ismail, H. O. 2020. The good, the bad, and the rational: Aspects of character in logical agents. In ElBolock, A.; Abdelrahman, Y.; and Abdennadher, S., eds., Character Computing. Springer.
Paccanaro, A., and Hinton, G. E. 2001. Learning distributed representations of concepts using linear relational embedding. IEEE Transactions on Knowledge and Data Engineering 13(2):232–244.
Parsons, T. 1993. On denoting propositions and facts. Philosophical Perspectives 7:441–460.
Rao, A. S., and Georgeff, M. P. 1995. BDI agents: From theory to practice. In ICMAS, volume 95, 312–319.
Rao, A. S., and Wooldridge, M. 1999. Foundations of rational agency. In Rao, A. S., and Wooldridge, M., eds., Foundations of Rational Agency. Springer. 1–10.
Richardson, M., and Domingos, P. 2006. Markov logic networks. Machine Learning 62(1-2):107–136.
Sankappanavar, H., and Burris, S. 1981. A Course in Universal Algebra.
Graduate Texts in Mathematics 78.
Searle, J. R. 2003. Rationality in Action. MIT Press.
Shapiro, S. C. 1993. Belief spaces as sets of propositions. Journal of Experimental & Theoretical Artificial Intelligence 5(2-3):225–235.
Shoham, Y. 2009. Logical theories of intention and the database perspective. Journal of Philosophical Logic 38(6):633.
Thomason, R. H. 2018. The formalization of practical reasoning: Problems and prospects. In Gabbay, D. M., and Guenthner, F., eds., Handbook of Philosophical Logic: Volume 18. Cham: Springer International Publishing. 105–132.
Vovk, V.; Gammerman, A.; and Shafer, G. 2005. Algorithmic Learning in a Random World. Springer Science & Business Media.
Yang, B.; Yih, W.; He, X.; Gao, J.; and Deng, L. 2015. Embedding entities and relations for learning and inference in knowledge bases. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
References
Bealer, G. 1979. Theories of properties, relations, and propositions. The Journal of Philosophy 76(11):634–648.
Bratman, M. 1987. Intention, Plans, and Practical Reason. Harvard University Press.
Broersen, J.; Dastani, M.; Hulstijn, J.; Huang, Z.; and van der Torre, L. 2001. The BOID architecture: conflicts between beliefs, obligations, intentions and desires. In Proceedings of the Fifth International Conference on Autonomous Agents, 9–16.
Broersen, J.; Dastani, M.; Hulstijn, J.; and van der Torre, L. 2002. Goal generation in the BOID architecture. Cognitive Science Quarterly 2(3-4):428–447.
Broome, J. 2002. Practical reasoning. In Bermúdez, J. L., and Millar, A., eds., Reason and Nature: Essays in the Theory of Rationality. Oxford: Clarendon Press. 85–111.
Casali, A.; Godo, L.; and Sierra, C. 2008. A logical framework to represent and reason about graded preferences and intentions.
In Brewka, G., and Lang, J., eds., Principles of Knowledge Representation and Reasoning: Proceedings of the Eleventh International Conference, KR 2008, Sydney, Australia, September 16-19, 2008, 27–37. AAAI Press. Casali, A.; Godo, L.; and Sierra, C. 2011. A graded BDI agent model to represent and reason about preferences. Artificial Intelligence 175(7-8):1468–1478. Castelfranchi, C., and Paglieri, F. 2007. The role of beliefs in goal dynamics: prolegomena to a constructive theory of intentions. Synthese 155(2):237–263. Charniak, E. 1991. Bayesian networks without tears. AI magazine 12(4):50–50. Church, A. 1950. On carnap’s analysis of statements of assertion and belief. Analysis 10(5):97–99. Cohen, P. R., and Levesque, H. J. 1990. Intention is choice with commitment. Artificial intelligence 42(2-3):213–261. Dunin-Keplicz, B.; Nguyen, L. A.; and Szalas, A. 2010. A framework for graded beliefs, goals and intentions. Fundam. Inform. 100(1-4):53–76. Ehab, N., and Ismail, H. O. 2018. Towards a unified algebraic framework for non-monotonicity. Proceedings of the KI 2018 Workshop on Formal and Cognitive Reasoning 26– 40. Ehab, N., and Ismail, H. O. 2019. A unified algebraic framework for non-monotonicity. In Moss, L. S., ed., Proceedings Seventeenth Conference on Theoretical Aspects of Rationality and Knowledge, TARK 2019, Toulouse, France, 17-19 July 2019, volume 297 of EPTCS, 155–174. Ehab, N., and Ismail, H. O. 2020. LogA G: An algebraic non-monotonic logic for reasoning with graded propositions. Annals of Mathematics and Artificial Intelligence. Fern, A. 2010. Weighted logic. Technical report. 169 BKLM - An expressive logic for defeasible reasoning Guy Paterson-Jones2 , Giovanni Casini1,2 , Thomas Meyer2 1 ISTI-CNR, Italy 2 CAIR and Univ. 
of Cape Town, South Africa guy.paterson.jones@gmail.com , giovanni.casini@isti.cnr.it , tmeyer@cair.org.za Abstract class of entailment relations for KLM-style logics (Casini, Meyer, and Varzinczak 2019), and it is widely agreed upon that there is no unique best answer. The options can be narrowed down, however, and Lehmann et al. propose Rational Closure (RC) as the minimally acceptable form of rational entailment. Rational closure is based on the principle of presumption of typicality (Lehmann 1995), which states that propositions should be considered typical unless there is reason to believe otherwise. For instance, if we know that birds typically fly, and all we know about a robin is that it is a bird, we should tentatively conclude that it flies, as there is no reason to believe it is atypical. While RC is not always appropriate, there is fairly general consensus that interesting forms of conditional reasoning should extend RC from an inferential perspective (Lehmann 1995; Casini, Meyer, and Varzinczak 2019). Since KLM-style logics have limited conditional expressivity (see Section 2.1), there has been some work in extending the KLM constructions to more expressive logics. Perhaps the main question is whether entailment relations resembling RC can be defined also for more expressive logics. The first investigation in such a direction was proposed by Booth and Paris (1998) who consider an extension in which both positive (α |∼ β) and negative (α 6|∼ β) conditionals are allowed. Booth et al. (2013) introduce a significantly more expressive logic called Propositional Typicality Logic (PTL), in which propositional logic is extended with a modal-like typicality operator •. This typicality operator can be used anywhere in a formula, in contrast to KLMstyle logics, where typicality refers only to the antecedent of conditionals of the form α |∼ β. The price one pays for this expressiveness is that rational entailment becomes more difficult to pin down. 
This is shown by Booth et al. (2015), who prove that several desirable properties of rational closure are mutually inconsistent for PTL entailment. They interpret this as saying that the correct form of entailment for PTL is contextual, and depends on which properties are considered more important for the task at hand. In this paper we consider a different extension of KLMstyle logics, which we refer to as Boolean KLM (BKLM), and in which we allow negative conditionals, as well as arbitrary conjunctions and disjunctions of conditionals. We do not allow the nesting of conditionals, though. We show, Propositional KLM-style defeasible reasoning involves a core propositional logic capable of expressing defeasible (or conditional) implications. The semantics for this logic is based on Kripke-like structures known as ranked interpretations. KLM-style defeasible entailment is referred to as rational whenever the defeasible entailment relation under consideration generates a set of defeasible implications all satisfying a set of rationality postulates known as the KLM postulates. In a recent paper Booth et al. proposed PTL, a logic that is more expressive than the core KLM logic. They proved an impossibility result, showing that defeasible entailment for PTL fails to satisfy a set of rationality postulates similar in spirit to the KLM postulates. Their interpretation of the impossibility result is that defeasible entailment for PTL need not be unique. In this paper we continue the line of research in which the expressivity of the core KLM logic is extended. We present the logic Boolean KLM (BKLM) in which we allow for disjunctions, conjunctions, and negations, but not nesting, of defeasible implications. Our contribution is twofold. Firstly, we show (perhaps surprisingly) that BKLM is more expressive than PTL. Our proof is based on the fact that BKLM can characterise all single ranked interpretations, whereas PTL cannot. 
Secondly, given that the PTL impossibility result also applies to BKLM, we adapt the different forms of PTL entailment proposed by Booth et al. to apply to BKLM. 1 Introduction Non-monotonic reasoning has been extensively studied in the AI literature, as it provides a mechanism for making bold inferences that go beyond what classical methods can provide, while retaining the possibility of revising these inferences in light of new information. In their seminal paper, Kraus et al. (1990) consider a general framework for non-monotonic reasoning, phrased in terms of defeasible, or conditional implications of the form α |∼ β, to be read as ‘If α holds, then typically β holds’. Importantly, they provide a set of rationality conditions, in the form of structural properties, that a reasonable form of entailment for these conditionals should satisfy, and characterise these semantically. Lehmann and Magidor (1992) also considered the question of which entailment relations definable in the KLM framework can be considered to be the correct ones for non-monotonic reasoning. In general, there is a large 170 perhaps surprisingly, that BKLM is strictly more expressive than PTL by exhibiting an explicit translation of PTL knowledge bases into BKLM. We also prove that BKLM entailment is more restrictive than PTL entailment, in the sense that a stronger class of entailment properties are inconsistent for BKLM. In particular, attempts to extend rational closure to BKLM in the manner of LM-entailment as defined by Booth et al. (2015), are shown to be untenable. The rest of the paper is structured as follows. In section 2 we provide the relevant background on the KLM approach to defeasible reasoning, and discuss various forms of rational entailment. We then define Propositional Typicality Logic, and give a brief overview of the entailment problem for PTL. 
In section 3 we define the logic BKLM, an extension of KLM-style logics that allows for arbitrary boolean combinations of conditionals. We investigate the expressiveness of BKLM, and show that it is strictly more expressive than PTL by exhibiting an explicit translation of PTL formulas into BKLM. In section 4 we turn to the entailment problem for BKLM, and show that BKLM suffers from stronger versions of the known impossibility results for PTL. Section 5 discusses some related work, while section 6 concludes and points out some future research directions.

2 Background

Let P be a set of propositional atoms, and let p, q, . . . be meta-variables for elements of P. We write L^P for the set of propositional formulas over P, defined by α ::= p | ¬α | α ∧ α | ⊤ | ⊥. Other boolean connectives are defined as usual in terms of ∧, ¬, →, and ↔. We write U^P for the set of valuations of P, which are functions v : P → {0, 1}. Valuations are extended to L^P in the usual way, and satisfaction of a formula α will be denoted v ⊩ α. For the remainder of this paper we will assume that P is finite and drop superscripts whenever there isn't any danger of ambiguity.

2.1 The Logic KLM

Kraus et al. (1990) study a conditional logic, which we refer to as KLM. It is defined by assertions of the form α |∼ β, which are read "if α, then typically β". For example, if P = {b, f} refers to the properties of being a bird and flying respectively, then b |∼ f states that birds typically fly. There are various possible semantic structures for this logic, but in this paper we are interested in the case of rational conditional assertions. The semantics for rational conditionals is given by ranked interpretations (Lehmann and Magidor 1992). The following is an alternative, but equivalent, definition of such a class of interpretations.

Definition 1. A ranked interpretation R is a function from U to N ∪ {∞} satisfying the following convexity condition: if R(u) < ∞, then for every 0 ≤ j < R(u), there is some v ∈ U for which R(v) = j.

Given a ranked interpretation R, we call R(u) the rank of u with respect to R. Valuations with a lower rank are viewed as being more typical than those with a higher rank, whereas valuations with infinite rank are viewed as being impossibly atypical. We refer to the set of possible valuations as U^R = {u ∈ U : R(u) < ∞}, and for any α ∈ L we define ⟦α⟧^R = {u ∈ U^R : u ⊩ α}. Every ranked interpretation R determines a total preorder on U in the obvious way, namely u ≤R v iff R(u) ≤ R(v). Writing the strict version of this preorder as ≺R, it is straightforward to show that it is modular:

Proposition 1. ≺R is modular, i.e. for all u, v, w ∈ U, u ≺R v implies that either w ≺R v or u ≺R w.

Lehmann et al. (1992) define ranked interpretations in terms of modular orderings on U. The following straightforward observation proves the equivalence of the two definitions:

Proposition 2. Let R1 and R2 be ranked interpretations. Then R1 = R2 iff ≺R1 = ≺R2.

We define satisfaction with respect to ranked interpretations as follows. Given any α ∈ L, we say R satisfies α (written R ⊩ α) iff ⟦α⟧^R = U^R. Similarly, R satisfies a conditional assertion α |∼ β iff min≤R ⟦α⟧^R ⊆ ⟦β⟧^R, or in other words iff all of the ≤R-minimal valuations satisfying α also satisfy β.

Example 1. Let R be the ranked interpretation in figure 1. Then R satisfies p → b, b |∼ f and p |∼ ¬f. Note that in our figures we omit rank ∞ for brevity, and we represent a valuation as a string of literals, with p̄ indicating the negation of the atom p.

[Figure 1: A ranked interpretation over P = {p, b, f}, with rank 0 = {p̄bf, p̄b̄f, p̄b̄f̄}, rank 1 = {p̄bf̄, pbf̄} and rank 2 = {pbf}.]

A useful simplification is the fact that classical statements (such as p → b) can be viewed as special cases of conditional assertions:

Proposition 3. (Kraus, Lehmann, and Magidor 1990, p. 174) For all α ∈ L, R ⊩ α iff R ⊩ ¬α |∼ ⊥.

In what follows we define a knowledge base as a finite set of conditional assertions. We sometimes abuse notation by including classical statements (of the form α ∈ L) in knowledge bases, but in the context of Proposition 3 this should be understood to be shorthand for the conditional assertion ¬α |∼ ⊥. For example, the knowledge base {p → b, b |∼ f} is shorthand for {¬(p → b) |∼ ⊥, b |∼ f}. We denote the set of all ranked interpretations over P by RI, and we write MOD(K) for the set of ranked models of a knowledge base K. For any U ⊆ RI, we write U ⊩ α to mean R ⊩ α for all R ∈ U. Finally, we write sat(R) for the set of formulas satisfied by the ranked interpretation R.

Even though KLM extends propositional logic, it is still quite restrictive, as it only permits positive conditional assertions. Booth et al. (1998) consider an extension allowing for negative conditionals, i.e. assertions of the form α ̸|∼ β. Such an assertion is satisfied by a ranked interpretation R if and only if R ⊮ α |∼ β.

2.2 Rank Entailment

A central question in non-monotonic reasoning is determining what forms of entailment are appropriate in a defeasible setting. Given a knowledge base K, we write K |≈ α |∼ β to mean that K defeasibly entails α |∼ β. In the literature, there is a plethora of options available for the entailment relation |≈, each with its own strengths and weaknesses (Casini, Meyer, and Varzinczak 2019). As such, it is useful to understand defeasible entailment relations in terms of their global properties. An obviously desirable property is Inclusion:

(Inclusion) K |≈ α |∼ β for all α |∼ β ∈ K

Kraus et al. (1990) argue that a defeasible entailment relation should satisfy each of the properties given in figure 2, known as the rationality properties. We will call such relations rational. Rational properties are essentially intertwined with the class of ranked interpretations.

(REFL) K |≈ α |∼ α
(AND) from K |≈ α |∼ β and K |≈ α |∼ γ, infer K |≈ α |∼ β ∧ γ
(LLE) from |= α ↔ β and K |≈ α |∼ γ, infer K |≈ β |∼ γ
(OR) from K |≈ α |∼ γ and K |≈ β |∼ γ, infer K |≈ α ∨ β |∼ γ
(RW) from |= β → γ and K |≈ α |∼ β, infer K |≈ α |∼ γ
(CM) from K |≈ α |∼ β and K |≈ α |∼ γ, infer K |≈ α ∧ β |∼ γ
(RM) from K ̸|≈ α ∧ β |∼ γ and K ̸|≈ α |∼ ¬β, infer K ̸|≈ α |∼ γ

Figure 2: Rationality properties for defeasible entailment.

Proposition 4 (Lehmann et al. (1992)). A defeasible entailment relation |≈ is rational iff for each knowledge base K, there is a ranked interpretation RK such that K |≈ α |∼ β iff RK ⊩ α |∼ β.

The following natural form of entailment, called rank entailment, is not rational in general, as it fails to satisfy the property of rational monotonicity (RM):

Definition 2. A conditional α |∼ β is rank entailed by a knowledge base K (written K |≈R α |∼ β) iff R ⊩ α |∼ β for every ranked model R of K.

Despite failing to be rational, rank entailment is important as it can be viewed as the monotonic core of an appropriate defeasible entailment relation. In other words, the following property is desirable:

(KLM-Ampliativity) K |≈ α |∼ β whenever K |≈R α |∼ β

Note that a rational entailment relation satisfying Inclusion also satisfies KLM-Ampliativity by proposition 4.

2.3 Rational Closure

A well-known form of rational entailment for KLM is rational closure. Lehmann et al. (1992) propose rational closure as the minimum acceptable form of rational defeasible entailment, and give a syntactic characterisation of rational closure in terms of an ordering on KLM knowledge bases. Here we refer to the semantic approach (Giordano et al. 2015) and define rational closure in terms of an ordering on ranked interpretations:

Definition 3. (Giordano et al. 2015, Definition 7) Given two ranked interpretations R1 and R2, we write R1 <G R2, that is, R1 is preferred to R2, iff R1(u) ≤ R2(u) for every u ∈ U, and there is some v ∈ U s.t. R1(v) < R2(v).

Consider the set of models of a KLM knowledge base K. Intuitively, the lower a model is with respect to the ordering ≤G, the fewer exceptional valuations it has modulo the constraints of K. Thus the ≤G-minimal models can be thought of as the semantic counterpart to the principle of typicality seen above. This idea of making valuations as typical as possible was first presented by Booth et al. (1998) for the case of KLM knowledge bases with both positive and negative conditionals. For these knowledge bases, it turns out that there is always a unique minimal model:

Proposition 5. Let K ⊆ L|∼ be a knowledge base. Then if K is consistent, MOD(K) has a unique ≤G-minimal element, denoted R_K^RC.

The rational closure of a knowledge base can be characterised as the set of formulas satisfied by this minimal model:

Proposition 6. (Giordano et al. 2015, Theorem 2) A conditional α |∼ β is in the rational closure of a knowledge base K ⊆ L|∼ (written K |≈RC α |∼ β) iff R_K^RC ⊩ α |∼ β.

A well-known behaviour of rational closure is the so-called drowning effect. To make this concrete, consider the knowledge base K = {b |∼ f, b |∼ w, r → b, p → b, p |∼ ¬f}. This states that birds have wings and typically fly, that robins are birds, and that penguins are birds that typically don't fly. Intuitively one would expect to be able to conclude from this that robins typically have wings (r |∼ w), since robins are not exceptional birds. More generally, every subclass that does not show any exceptional behaviour should inherit all the typical properties of a class by default. This is the principle of the Presumption of Typicality mentioned earlier, which rational closure obeys. But what happens with subclasses that are exceptional with respect to some property? In the above example, since penguins are exceptional only with respect to their ability to fly, the question is whether penguins should inherit the other typical properties of birds, such as having wings (p |∼ w). Rational closure does not sanction this type of conclusion. That is, subclasses that are exceptional with respect to a typical property of a class do not inherit the other typical properties of the class. This is the drowning effect which, while being a desirable form of reasoning in some contexts, is considered a limitation if we are interested in modelling some form of Presumption of Independence (Lehmann 1995), in which a subclass inherits all the typical properties of a class, unless there is explicit information to the contrary. So, even though penguins are exceptional birds in the sense of typically not being able to fly, the Presumption of Independence requires us to conclude that penguins typically have wings.

There are several refinements of rational closure, such as lexicographic closure (Lehmann 1995), relevant closure (Casini et al. 2014) and inheritance-based closure (Casini and Straccia 2013), that satisfy both the Presumption of Typicality and the Presumption of Independence. Unlike rational closure, lexicographic closure formalises the presumptive reading of α |∼ β, which states that "α implies β unless there is reason to believe otherwise" (Lehmann 1995; Casini, Meyer, and Varzinczak 2019).

2.4 Propositional Typicality Logic

The present paper investigates whether the notion of rational closure can be extended to more expressive logics. The first investigation in such a direction was proposed by Booth and Paris (1998), who consider an extension of KLM in which both positive (α |∼ β) and negative (α ̸|∼ β) conditionals are allowed. This additional expressiveness introduces some technical issues, as not every such knowledge base has a model (consider K = {α |∼ β, α ̸|∼ β}, for instance). Nevertheless, Booth and Paris show that this is the only limit to the validity of Proposition 5: every consistent knowledge base in this extension has a rational closure.

Another investigated logic that extends KLM is Propositional Typicality Logic (PTL), a logic for defeasible reasoning proposed by Booth et al. (2015), in which propositional logic is enriched with a modal typicality operator (denoted •). Formulas of PTL are defined by α ::= ⊤ | ⊥ | p | •α | ¬α | α ∧ α, where p is any propositional atom. As before, other boolean connectives are defined in terms of ¬, ∧, →, ↔. The intuition behind a formula •α is that it is true for typical instances of α. Note that the typicality operator can be nested, so α may itself contain some •β as a subformula. The set of all PTL formulas is denoted L•.

Satisfaction for PTL is defined with respect to a ranked interpretation R. Given a valuation u ∈ U and formula α ∈ L•, we define u ⊩R α inductively in the same way as propositional logic, with an additional rule for the typicality operator: u ⊩R •α if and only if u ⊩R α and there is no v ≺R u such that v ⊩R α. We then say that R satisfies the formula α, written R ⊩ α, iff u ⊩R α for all u ∈ U^R.

Given that the typicality operator can be nested and used anywhere within a PTL formula, one would intuitively expect PTL to be at least as expressive as KLM. The following result shows that this is indeed the case:

Proposition 7 (Booth et al. (2013)). A ranked interpretation R satisfies the KLM formula α |∼ β if and only if it satisfies the PTL formula •α → β.

Given two knowledge bases K1 and K2, we say they are equivalent if they have exactly the same set of ranked models, i.e. if MOD(K1) = MOD(K2). Proposition 7 can be rephrased as saying that every KLM knowledge base has an equivalent PTL knowledge base. Note that the converse doesn't hold; there are PTL knowledge bases with no equivalent in KLM:

Proposition 8 (Booth et al. (2013)). For any p ∈ P, the knowledge base K = {•p} has no equivalent KLM knowledge base.

The obvious form of entailment for a PTL knowledge base K is rank entailment (denoted |≈R), presented earlier in definition 2. As noted before, rank entailment is monotonic and therefore inappropriate in many contexts. To pin down better forms of PTL entailment, Booth et al. (2015) consider the following properties, modelled after properties of rational closure, where |≈? is a PTL entailment relation and Cn?(K) = {α ∈ L• : K |≈? α} is its associated consequence operator:

(Cumulativity) For all K1, K2 ⊆ L•, if K1 ⊆ K2 ⊆ Cn?(K1), then Cn?(K1) = Cn?(K2).
(Ampliativity) For all K ⊆ L•, CnR(K) ⊆ Cn?(K).
(Strict Entailment) For all K ⊆ L• and α ∈ L, α ∈ Cn?(K) iff α ∈ CnR(K).
(Typical Entailment) For all K ⊆ L• and α ∈ L, •⊤ → α ∈ Cn?(K) iff •⊤ → α ∈ CnR(K).
(Single Model) For all K ⊆ L•, there is some R ∈ MOD(K) such that for all α ∈ L•, α ∈ Cn?(K) iff R ⊩ α.

Surprisingly, it turns out that an entailment relation cannot satisfy all of these properties simultaneously:

Proposition 9 (Booth et al. (2015)). There is no PTL entailment relation |≈? satisfying Cumulativity, Ampliativity, Strict Entailment, Typical Entailment and the Single Model property.

Booth et al. suggest that this is best interpreted as an argument for developing more than one form of PTL entailment, which can be compared to the divide between presumptive and prototypical readings for KLM entailment. An example of PTL entailment is LM-entailment, which is based on the following adaption of proposition 5:

Proposition 10 (Booth et al. (2019)). Let K ⊆ L• be a consistent knowledge base. Then MOD(K) has a unique ≤G-minimal element, denoted R_K^LM.

Given a knowledge base K ⊆ L•, we define LM-entailment by writing K |≈LM α iff either K is inconsistent or R_K^LM ⊩ α. Booth et al. prove that LM-entailment satisfies all of the above properties except for Strict Entailment, and hence in general there may be classical statements that are LM-entailed by K but not rank-entailed by it.
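To make the rational-closure construction of Section 2.3 concrete, here is a small self-contained Python sketch (our own illustration, not code from the paper; all names are ours). It builds the minimal ranked model of the penguin knowledge base K = {b |∼ f, b |∼ w, r → b, p → b, p |∼ ¬f} using the standard base-rank-style construction, and then checks conditionals against that model, reproducing both the Presumption of Typicality (r |∼ w holds) and the drowning effect (p |∼ w fails):

```python
from itertools import product

# Atoms: p(enguin), b(ird), f(lies), r(obin), w(ings)
P, B, F, R_, W = range(5)
WORLDS = list(product((0, 1), repeat=5))

# K = {b |~ f, b |~ w, r -> b, p -> b, p |~ ¬f}
classical = [lambda v: not v[R_] or v[B],        # r -> b
             lambda v: not v[P] or v[B]]         # p -> b
conds = [(lambda v: v[B], lambda v: v[F]),       # b |~ f
         (lambda v: v[B], lambda v: v[W]),       # b |~ w
         (lambda v: v[P], lambda v: not v[F])]   # p |~ ¬f

# Valuations violating a classical statement get rank ∞; we drop them.
possible = [v for v in WORLDS if all(c(v) for c in classical)]

# Minimal ranked model of K, built base-rank style: place every remaining
# valuation satisfying the material version (α -> β) of all surviving
# conditionals at the next rank, then drop conditionals whose antecedents
# are no longer exceptional (i.e. are now satisfied at some finite rank).
rank, level, remaining, left = {}, 0, conds[:], possible[:]
while left:
    layer = [v for v in left if all(not a(v) or b(v) for a, b in remaining)]
    if not layer:               # no progress: park the rest one rank up
        layer = left[:]
    for v in layer:
        rank[v] = level
    left = [v for v in left if v not in layer]
    remaining = [(a, b) for a, b in remaining
                 if not any(a(v) for v in rank)]
    level += 1

def entails(ant, cons):
    """K |≈RC ant |~ cons: all minimal ant-valuations satisfy cons."""
    ranks = [rank[v] for v in rank if ant(v)]
    if not ranks:
        return True
    m = min(ranks)
    return all(cons(v) for v in rank if ant(v) and rank[v] == m)

print(entails(lambda v: v[R_], lambda v: v[W]))     # True:  r |~ w
print(entails(lambda v: v[P], lambda v: not v[F]))  # True:  p |~ ¬f
print(entails(lambda v: v[P], lambda v: v[W]))      # False: p |~ w drowned
```

Robins inherit wings because some robin valuation survives at rank 0, while every minimal penguin valuation sits at rank 1, where the conditional b |∼ w has already been given up.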
Other forms of entailment, such as PT-entailment, can be shown to satisfy Strict Entailment but fail both Typical Entailment and the Single Model property.

3 Boolean KLM

In section 2.1, we noted that the logic KLM is quite restrictive, as it allows only for positive conditional assertions. As mentioned there, Booth and Paris (1998) consider an extension allowing for negative conditionals, i.e. assertions of the form α ̸|∼ β. Here we take that extension further, and propose Boolean KLM (BKLM), which allows for arbitrary boolean combinations of conditionals, but not for nested conditionals. BKLM formulas are defined by A ::= α |∼ β | ¬A | A ∧ A, with other boolean connectives defined as usual. Following Booth and Paris, we will write α ̸|∼ β as a synonym for ¬(α |∼ β) where convenient, and we denote the set of all BKLM formulas by Lb. So, for example, (α |∼ β) ∧ (γ ̸|∼ δ) and ¬((α ̸|∼ β) ∨ (γ |∼ δ)) are BKLM formulas, but α |∼ (β |∼ γ) is not.

Satisfaction for BKLM is defined in terms of ranked interpretations, by extending KLM satisfaction to boolean combinations of conditionals in the obvious fashion, namely R ⊩ ¬A iff R ⊮ A, and R ⊩ A ∧ B iff R ⊩ A and R ⊩ B. This leads to some subtle differences between BKLM satisfaction and the other logics. For instance, care must be taken to apply proposition 3 correctly when translating between propositional formulas and BKLM formulas. The propositional formula p ∨ q translates to the BKLM formula ¬(p ∨ q) |∼ ⊥, and not to the BKLM formula (¬p |∼ ⊥) ∨ (¬q |∼ ⊥), as the following example illustrates:

Example 2. Consider the propositional formula A = p ∨ q and the BKLM formula B = (¬p |∼ ⊥) ∨ (¬q |∼ ⊥). If R is the ranked interpretation in figure 3, then R satisfies A but not B, as neither clause of the disjunction is satisfied.

[Figure 3: A ranked interpretation illustrating the difference between classical disjunction and BKLM disjunction; its two possible valuations both satisfy p ∨ q, but one falsifies p and the other falsifies q.]

To prevent possible confusion, we will avoid mixing classical and defeasible assertions in a BKLM knowledge base. For similar reasons, it's also worth noting the difference between boolean connectives in PTL and the corresponding connectives in BKLM. By proposition 7, one might expect a BKLM formula such as ¬(p |∼ q) to translate into the PTL formula ¬(•p → q). The following example shows that this naïve approach fails:

Example 3. Consider the formulas A = ¬(•p → q) and B = ¬(p |∼ q), and let R be the ranked interpretation in figure 3. Then A is equivalent to •p ∧ ¬q, which isn't satisfied by R. On the other hand, R satisfies B.

One might ask whether there is a more nuanced way of translating BKLM knowledge bases into PTL. In the next section we answer this question in the negative, by showing that BKLM is in fact strictly more expressive than PTL.

3.1 Expressiveness of BKLM for Ranked Interpretations

So far we have been rather vague about what we mean by the expressiveness of a logic. All of the logics we consider in this paper share the same semantic structures, which provides us with a handy definition. We say that a logic can characterise a set of ranked interpretations U ⊆ RI if there is some knowledge base K with U as its set of ranked models. Given this, we say that a logic is more expressive than another logic if it can characterise at least as many sets of interpretations.

Example 4. Let K ⊆ L|∼ be a KLM knowledge base. Then its PTL translation K′ = {•α → β : α |∼ β ∈ K} has exactly the same ranked models by proposition 7, and hence PTL is at least as expressive as KLM. Proposition 8 shows that this comparison is strict.

In this section we show that BKLM is maximally expressive, in the sense that it can characterise any set of ranked interpretations. For a valuation u ∈ U, we write û to mean any characteristic formula of u, namely any propositional formula such that v ⊩ û iff v = u. It is easy to see that these always exist, as P is finite, and that all characteristic formulas of u are logically equivalent.

Lemma 1. For any ranked interpretation R and valuations u, v ∈ U, it is straightforward to check that:
1. R ⊩ ⊤ ̸|∼ ¬û iff R(u) = 0.
2. R ⊩ û |∼ ⊥ iff R(u) = ∞.
3. R ⊩ û ∨ v̂ |∼ ¬v̂ iff u ≺R v or R(u) = R(v) = ∞.

Note that this lemma holds even in the vacuous case where R(u) = ∞ for all u ∈ U. Following Lehmann et al. (1992), we write α < β as shorthand for the defeasible implication α ∨ β |∼ ¬β. We now show that the concept of characteristic formulas can be applied to ranked interpretations as well:

Lemma 2. Let R be any ranked interpretation. Then there exists a formula ch(R) ∈ Lb with R as its unique model.

Proof. Consider the following knowledge bases:
1. K≺ = {û < v̂ : u ≺R v} ∪ {û ≮ v̂ : u ⊀R v}
2. K∞ = {û |∼ ⊥ : R(u) = ∞} ∪ {û ̸|∼ ⊥ : R(u) < ∞}
By lemma 1, R satisfies K = K≺ ∪ K∞. To show that it is the unique model of K, consider any R∗ ∈ MOD(K). Since R∗ satisfies K∞, R∗(u) = ∞ iff R(u) = ∞ for any u ∈ U. Now consider any u, v ∈ U, and suppose that R(u) < ∞. Then u ≺R v iff K≺ contains û < v̂. But R∗ satisfies K≺, so this is true iff u ≺R∗ v as R∗(u) < ∞. On the other hand, if R(u) = ∞, then u ⊀R v and u ⊀R∗ v. Hence ≺R = ≺R∗, which implies that R = R∗ by proposition 2. We conclude the proof by letting ch(R) = ⋀α∈K α.

We refer to ch(R) as the characteristic formula of R. A simple application of disjunction allows us to prove the following more general corollary:

Corollary 1. Let U ⊆ RI be a set of ranked interpretations. Then there exists a formula ch(U) ∈ Lb with U as its set of models.

This proves that BKLM is at least as expressive as PTL since, in principle, for every PTL knowledge base there is some BKLM knowledge base with the same set of models. It is not clear, however, whether there is a more natural description of this knowledge base than that provided by characteristic formulas. In the next section we will address this shortcoming by describing an explicit translation from PTL to BKLM knowledge bases.

In fact, BKLM is strictly more expressive than PTL. This is illustrated by the knowledge base K = {(⊤ |∼ p) ∨ (⊤ |∼ ¬p)}, which expresses the "excluded-middle" statement that typically one of p or ¬p is true. There are two distinct ≤G-minimal ranked models of K, given by R1 and R2 in figure 4, and hence K cannot have an equivalent PTL knowledge base by proposition 10.
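Lemmas 1 and 2 can be sanity-checked by brute force on a small vocabulary. The sketch below (our own illustration; the helper names are ours) enumerates every ranked interpretation over two atoms, encodes the knowledge base K≺ ∪ K∞ of a given interpretation R0 as satisfaction tests, and counts how many interpretations satisfy it; for the example interpretation, which has a single impossible valuation, the count is 1, as Lemma 2 predicts:

```python
from itertools import product

WORLDS = list(product((0, 1), repeat=2))   # valuations over two atoms
INF = float("inf")

def interpretations():
    """All ranked interpretations over WORLDS: rank functions into
    N ∪ {∞} satisfying the convexity condition of Definition 1."""
    for ranks in product((0, 1, 2, 3, INF), repeat=len(WORLDS)):
        finite = [r for r in ranks if r != INF]
        if all(j in finite for r in finite for j in range(int(r))):
            yield ranks

def sat_cond(R, ant, cons):
    """R satisfies ant |~ cons: all minimal possible ant-worlds are
    cons-worlds (vacuously true if ant has no possible worlds)."""
    idx = [i for i, w in enumerate(WORLDS) if ant(w) and R[i] != INF]
    if not idx:
        return True
    m = min(R[i] for i in idx)
    return all(cons(WORLDS[i]) for i in idx if R[i] == m)

def hat(u):    # characteristic formula of u, represented as a predicate
    return lambda v: v == u

def lt(u, v):  # û < v̂, shorthand for û ∨ v̂ |~ ¬v̂ (Lemma 1, item 3)
    return lambda R: sat_cond(R, lambda w: w in (u, v), lambda w: w != v)

def ch_model_count(R0):
    """Encode K = K≺ ∪ K∞ for R0 (Lemma 2) and count its models."""
    def satisfies_K(R):
        for i, u in enumerate(WORLDS):
            # K∞: û |~ ⊥ must hold iff R0(u) = ∞
            if sat_cond(R, hat(u), lambda w: False) != (R0[i] == INF):
                return False
            for j, v in enumerate(WORLDS):
                if j == i:
                    continue
                # K≺: û < v̂ must hold iff u ≺ v in R0
                if lt(u, v)(R) != (R0[i] < R0[j]):
                    return False
        return True
    return sum(1 for R in interpretations() if satisfies_K(R))

print(ch_model_count((0, 1, 0, INF)))  # 1: R0 is the unique model of its K
```

Note that û < v̂ abbreviates û ∨ v̂ |∼ ¬v̂, so its satisfaction is checked with the same conditional-satisfaction test as everything else.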
We say that a logic can characterise a set of ranked interpretations U ⊆ RI if there is some knowledge base K with U as its set of ranked models. Given this, we say that a logic is more expressive than another logic if it can characterise at least as many sets of interpretations. Example 4. Let K ⊆ L|∼ be a K LM knowledge base. Then its P TL translation K′ = {•α → β : α |∼ β ∈ K} has exactly the same ranked models by proposition 7, and hence P TL is at least as expressive as K LM. Proposition 8 shows that this comparison is strict. 174 3.2 A formula α ∈ L• is satisfied by a ranked interpretation R iff it is satisfied by every possible valuation of R. We can combine the translation operators of definition 4 to formalise this statement as follows:   V def Definition 5. tr(α) = (û | 6 ∼ ⊥) → tr (α) u u∈U Translating PTL Into BKLM In section 2.4, satisfaction for P TL formulas with respect to a ranked interpretation R was defined in terms of the possible valuations of R. In order to define a translation operator between P TL and B KLM, our main idea is to encode satisfaction with respect to a valuation u ∈ U in terms of an appropriate B KLM formula. In other words, we will define an operator tru : L• → Lb such that for each u ∈ U R , R tru (α) iff u R α. All that remains is to check that this formula correctly encodes P TL satisfaction: Lemma 4. For all α ∈ L• , a ranked model R satisfies α if and only if it satisfies tr(α). Definition 4. Given α, β ∈ L• , p ∈ P and u ∈ U, we define tru by structural induction as follows: Proof. Suppose R α. Then for all u ∈ U, either R(u) = ∞ or u R α. The former implies R û |∼ ⊥ by lemma 1, and the latter implies R tru (α) by lemma 3. Thus R (û 6|∼ ⊥) → tru (α) for all u ∈ U , which proves R tr(α) as required. Conversely, suppose R tr(α). Then for any u ∈ U , either R û |∼ ⊥ and hence R(u) = ∞ by lemma 1, or R û 6|∼ ⊥ and hence R tru (α) by hypothesis. But then R α by lemma 3. 
def tru (p) = û |∼ p def û |∼ ⊤ tru (⊤) = def tru (⊥) = û |∼ ⊥ def ¬tru (α) tru (¬α) = def tru (α) ∧ tru (β) tru (α ∧ β) = h i V def 6. tru (•α) = tru (α) ∧ v∈U (v̂ < û) → ¬trv (α) 1. 2. 3. 4. 5. Note that this is well-defined, as each case is defined in terms of the translation of strict subformulas. The translations can be viewed as formal version of the definition of P TL satisfaction - case 6 states that •α is satisfied by a possible valuation u iff u is a minimal valuation satisfying α, for instance. 4 Entailment Results for BKLM We now turn to the question of defeasible entailment for B KLM knowledge bases. As in previous cases, an obvious approach to this is rank entailment, which we define in the usual fashion: Definition 6. Given any K ⊆ Lb and A ∈ Lb , we say K rank entails A (written K |≈R A) iff R A for all R ∈ M OD(K). Lemma 3. Let R be a ranked interpretation, and u ∈ U R a valuation with R(u) < ∞. Then for all α ∈ L• we have R tru (α) if and only if u R α. Being monotonic, rank entailment serves as a useful lower bound for defeasible B KLM entailment, but cannot be considered a good solution in its own right. Letting |≈? be an entailment relation and Cn? its associated consequence operator, consider the entailment properties in section 2.4 in the context of B KLM. Our first observation is that the premises of proposition 9 can be weakened as a consequence of global disjunction: Proof. We will prove the result by structural induction on the cases in definition 4: 1. Suppose that R tru (p), i.e. R û |∼ p. This is true iff u |= p, which is equivalent by definition to u R p. Cases 2 and 3 are similar. 4. Suppose that R tru (¬α), i.e. R ¬tru (α). This is true iff R 6 tru (α), which by the induction hypothesis is equivalent to u 6 R α. But this is equivalent to u R ¬α by definition. Case 5 is similar. 6. Suppose there exists an α ∈ L• such that R tru (•α) but u 6 R •α. 
Then either u ⊮_R α, which by the induction hypothesis is a contradiction since R ⊩ tr_u(α), or there is some v ∈ U with v ≺_R u such that v ⊩_R α. But by Lemma 1, v ≺_R u is true only if R ⊩ v̂ < û. We also have, by the induction hypothesis, that R ⊩ tr_v(α) since v ⊩_R α. Hence R ⊩ (v̂ < û) ∧ tr_v(α), which implies that one of the clauses in tr_u(•α) is false. This is a contradiction, so we conclude that R ⊩ tr_u(•α) implies u ⊩_R •α.

Conversely, suppose that u ⊩_R •α. Then u ⊩_R α, and hence R ⊩ tr_u(α) by the induction hypothesis. We also have that if v ≺_R u then v ⊮_R α, which is equivalent to R ⊩ ¬tr_v(α) by the induction hypothesis. But by Lemma 1, v ≺_R u iff R ⊩ v̂ < û. We conclude that R ⊩ (v̂ < û) → ¬tr_v(α) for all v ∈ U, and hence R ⊩ tr_u(•α).

Lemma 5. There is no BKLM entailment relation |≈? satisfying Ampliativity, Typical Entailment and the Single Model property.

Proof. Suppose that |≈? is such an entailment relation, and consider the knowledge base K = {(⊤ |∼ p) ∨ (⊤ |∼ ¬p)}. Both interpretations in Figure 4, R1 and R2, are models of K. R1 satisfies ⊤ |∼ p and not ⊤ |∼ ¬p, whereas R2 satisfies ⊤ |∼ ¬p and not ⊤ |∼ p. Thus, by the Typical Entailment property, K ̸|≈? ⊤ |∼ p and K ̸|≈? ⊤ |∼ ¬p. On the other hand, by Ampliativity we get K |≈? (⊤ |∼ p) ∨ (⊤ |∼ ¬p). A single ranked interpretation cannot satisfy all three of these assertions, however, and hence no such entailment relation can exist.

In the PTL context, LM-entailment satisfies Ampliativity, Typical Entailment and the Single Model property. Thus Lemma 5 is a concrete sense in which BKLM entailment is more constrained than PTL entailment. This raises an interesting question: can we nevertheless define a notion of entailment for BKLM, in the same spirit as rational closure

[Figure 4(a): R1, with rank 0: p and rank 1: ¬p.]

the context of an agent's knowledge. For BKLM entailment, it turns out that there is a partial converse to this discussion, which we will prove in the next section.
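As an illustration, the translation tr_u of Definition 4 can be sketched in code. This is our own sketch, not part of the paper: PTL formulas are encoded as nested tuples, valuations are named by strings, and the BKLM output is rendered as a plain string, with `u |~ x` standing for the conditional û |∼ x and `v < u` for the comparison v̂ < û.

```python
# Hypothetical formula encoding: ("atom", p), ("top",), ("bot",),
# ("not", a), ("and", a, b), and ("typ", a) for the typicality formula •a.

def tr(u, f, U):
    """Translate the PTL formula f, relative to valuation u, into a BKLM string."""
    kind = f[0]
    if kind == "atom":
        return f"({u} |~ {f[1]})"
    if kind == "top":
        return f"({u} |~ T)"
    if kind == "bot":
        return f"({u} |~ F)"
    if kind == "not":
        return f"~{tr(u, f[1], U)}"
    if kind == "and":
        return f"({tr(u, f[1], U)} & {tr(u, f[2], U)})"
    if kind == "typ":
        # case 6: a holds at u, and at no strictly more preferred valuation v
        guards = [f"(({v} < {u}) -> ~{tr(v, f[1], U)})" for v in U]
        return "(" + " & ".join([tr(u, f[1], U)] + guards) + ")"
    raise ValueError(f"unknown formula kind: {kind}")
```

Note how the typicality case quantifies over every valuation in U; this is the source of the exponential blow-up in formula size that the translation incurs.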
[Figure 4(b): R2, with rank 0: ¬p and rank 1: p.]

Figure 4: Ranked models of K = {(⊤ |∼ p) ∨ (⊤ |∼ ¬p)}.

and LM-entailment, by giving up one of the above properties? In order to guarantee a rational entailment relation, it is desirable to keep the Single Model property in view of Proposition 4. For the rest of this section we will investigate the consequences of this choice, and show that while it is possible to satisfy the Single Model property for BKLM entailment, the resulting entailment relations are heavily restricted.

4.1 The Single Model Property

In this section we prove that, under some mild assumptions, a BKLM entailment relation satisfying the Single Model property is always equivalent to a total order entailment relation.

Theorem 1. Suppose |≈? is a BKLM entailment relation satisfying Cumulativity, Ampliativity and the Single Model property. Then |≈? = |≈<, where |≈< is a total order entailment relation.

For the remainder of the section, consider a fixed BKLM entailment relation |≈? (with associated consequence operator Cn?), and suppose that |≈? satisfies Cumulativity, Ampliativity and the Single Model property. In what follows, we will move between the entailment relation and consequence operator notations freely as convenient. To begin with, we note the following straightforward lemma:

Lemma 6. For any knowledge base K ⊆ Lb, Cn?(K) = CnR(Cn?(K)) = Cn?(CnR(K)).

Our approach to proving Theorem 1 is to assign a unique index ind(R) ∈ N to each ranked interpretation R ∈ RI, and then show that Cn?(K) corresponds to minimisation of index in MOD(K). To construct this indexing scheme, consider the following algorithm:

1. Set M0 := RI, i := 0.
2. If Mi = ∅, terminate.
3. By Corollary 1, there is some knowledge base Ki ⊆ Lb such that MOD(Ki) = Mi.
4. By the Single Model property, there is some Ri ∈ Mi such that Cn?(Ki) = sat(Ri).
5. Set Mi+1 := Mi \ {Ri}, i := i + 1.
6. Go to step 2, and iterate until termination.
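The indexing construction above can be sketched as follows. This is our own sketch: `single_model` is a stand-in name for the map obtained from steps 3 and 4 (picking the interpretation R_i with Cn?(K_i) = sat(R_i)), and interpretations are treated as opaque values.

```python
def build_index(RI, single_model):
    """Assign a unique index to each element of the finite set RI."""
    remaining = set(RI)                 # M_0 := RI
    ind, i = {}, 0
    while remaining:                    # step 2: stop when M_i is empty
        R_i = single_model(remaining)   # steps 3-4: the single model of K_i
        assert R_i in remaining
        ind[R_i] = i                    # R_i receives index i
        remaining = remaining - {R_i}   # step 5: M_{i+1} := M_i \ {R_i}
        i += 1
    return ind
```

For instance, with `single_model = min` over a set of strings, the induced index is alphabetical order; termination follows exactly as in the text, since |M_{i+1}| < |M_i| at every iteration.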
4.2 Order Entailments

As we have seen in Section 2.3, rational closure can be modelled as a form of minimal model entailment. In other words, given a knowledge base K, we can construct the rational closure of K by placing an appropriate ordering on its set of ranked models (in this case ≤G), and picking out the consequences of the minimal ones. In this section we formalise this notion of entailment, with a view towards understanding the Single Model property for BKLM.

Definition 7. Let < be a strict partial order on RI. Then for all knowledge bases K and formulas α, we define K |≈< α iff R ⊩ α for all <-minimal elements R of MOD(K). The relation |≈< will be referred to as the order entailment relation of <.

Note that we have been deliberately vague about which logic we are dealing with, as the construction works identically for KLM, PTL and BKLM. It is also worth mentioning that the set of models of a consistent knowledge base always has <-minimal elements, since we have assumed P to be finite, which implies that the set of ranked interpretations is finite.

Example 5. By Definition 6, the rational closure of any KLM knowledge base K is the set of formulas satisfied by the (unique) ≤G-minimal element of MOD(K). Thus rational closure is the order entailment relation of ≤G over KLM.

In general, order entailment relations satisfy all of the rationality properties in Figure 2 except for Rational Monotonicity (RM). Rational Monotonicity holds, for instance, if MOD(K) has a unique <-minimal model for every knowledge base K. This is the case for rational closure and LM-entailment, which both satisfy the Single Model property. The following proposition follows easily from the definitions, and shows that this is typical:

Proposition 11. An order entailment relation |≈< satisfies the Single Model property iff MOD(K) has a unique <-minimal model for every knowledge base K.
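Definition 7 can be prototyped directly. In this sketch (names are ours), MOD(K) is a finite collection, the order < is a predicate, and K |≈< α holds iff every <-minimal model satisfies α:

```python
def minimal_models(models, less):
    # <-minimal elements of MOD(K); nonempty whenever `models` is a
    # nonempty finite collection, as noted in the text.
    return [m for m in models
            if not any(less(n, m) for n in models if n is not m)]

def order_entails(models, satisfies, alpha, less):
    # K |~_< alpha iff R satisfies alpha for every <-minimal R in MOD(K)
    return all(satisfies(m, alpha) for m in minimal_models(models, less))

# Toy usage: models are (rank, set-of-satisfied-formulas) pairs, ordered by rank.
mods = [(0, {"x"}), (1, {"x", "y"})]
less = lambda a, b: a[0] < b[0]
sat = lambda m, f: f in m[1]
```

Here only the rank-0 model is minimal, so "x" is order-entailed while "y" is not, even though "y" holds in some model.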
A class of order entailment relations for which the Single Model property always holds are the total order entailment relations, i.e. those |≈< corresponding to a total order <. Intuitively, this is a strong restriction, as an a priori total ordering over all possible ranked interpretations is unnatural in

This algorithm is guaranteed to terminate, as M0 is finite and 0 ≤ |Mi+1| < |Mi|. Note that once the algorithm terminates, for each R ∈ RI there will have been a unique i ∈ N such that R = Ri. We will call this i the index of R, and denote it by ind(R). Given a knowledge base K, we define ind(K) = min{ind(R) : R ∈ MOD(K)} to be the minimum index of the knowledge base. For clarity, when we write Rn, Kn and Mn in the following lemmas, we mean the ranked interpretations, knowledge bases and sets of models constructed in steps 3 to 5 of the algorithm when i = n.

Lemma 7. Given any knowledge base K ⊆ Lb, MOD(K) ⊆ Mn, where n = ind(K).

Proof. An easy induction on step 5 of the algorithm proves that Mn = {R ∈ RI : ind(R) ≥ n}. By hypothesis, ind(R) ≥ n for all R ∈ MOD(K), and hence MOD(K) ⊆ Mn.

The following lemma proves that entailment under |≈? corresponds to minimisation of index:

Lemma 8. Given any knowledge base K ⊆ Lb, Cn?(K) = sat(Rn), where n = ind(K).

operator T(·) that does not go beyond the expressivity of a KLM-style conditional language, but revised, of course, for the expressivity of description logics. In the context of adaptive logics, Straßer (2014) defines the logic R+ as an extension of KLM in which arbitrary boolean combinations of defeasible implications are allowed, and the set of propositional atoms has been extended to include the symbols {l_i : i ∈ N}. Semantically, these symbols encode rank in the object language, in the sense that u ⊩ l_i in a ranked interpretation R iff R(u) ≥ i.
Straßer's interest in R+ is to define an adaptive logic ALC S that provides a dynamic proof theory for rational closure, whereas our interest in BKLM is to generalise rational closure to more expressive extensions of KLM. Nevertheless, the Minimal Abnormality Strategy (see the work of Batens (2007), for instance) for ALC S is closely related to LM-entailment as defined in this paper.

Proof of Lemma 8. For all A, Kn |≈R A iff R ⊩ A for all R ∈ MOD(Kn) = Mn. But by Lemma 7, MOD(K) ⊆ Mn, and hence CnR(Kn) ⊆ CnR(K). On the other hand, Rn ∈ MOD(K) by hypothesis, and hence Rn ⊩ A for all A ∈ K. By the definition of step 4 of the algorithm we have sat(Rn) = Cn?(Kn), and thus K ⊆ Cn?(Kn). Applying CnR to each side of this inclusion (using the monotonicity of rank entailment), we get CnR(K) ⊆ CnR(Cn?(Kn)) = Cn?(Kn), with the last equality following from Lemma 6. Putting it all together, we have CnR(Kn) ⊆ CnR(K) ⊆ Cn?(Kn), and hence by Cumulativity we conclude Cn?(K) = Cn?(Kn) = sat(Rn).

Consider the strict partial order on RI defined by R1 < R2 iff ind(R1) < ind(R2). By construction, the index of a ranked interpretation is unique, and hence < is total. It follows from Lemma 8 that |≈? = |≈<, and hence |≈? is equivalent to a total order entailment relation. This completes the proof of Theorem 1.

6 Conclusion

The main focus of this paper is exploring the connection between expressiveness and entailment for extensions of the core logic KLM. Accordingly, we introduce the logic BKLM, an extension of KLM that allows for arbitrary boolean combinations of defeasible implications. We take an abstract approach to the analysis of BKLM, and show that it is strictly more expressive than existing extensions of KLM such as PTL (Booth, Meyer, and Varzinczak 2013) and KLM with negation (Booth and Paris 1998).
Our primary conclusion is that a logic as expressive as BKLM has to give up several desirable properties for defeasible entailment, most notably the Single Model property, and thus appealing forms of entailment for PTL such as LM-entailment (Booth et al. 2015) cannot be lifted to the BKLM case.

For future work, an obvious question is what forms of defeasible entailment are appropriate for BKLM. For instance, is it possible to skirt the impossibility results proven in this paper while still retaining the KLM rationality properties? Other forms of entailment for PTL, such as PT-entailment, have also yet to be analysed in the context of BKLM, and may be better suited to such an expressive logic. Another line of research to be explored is whether there is a more natural translation of PTL formulas into BKLM than that defined in this paper. Our translation is based on a direct encoding of PTL semantics, and consequently results in an exponential blow-up in the size of the formulas being translated. It is clear that there are much more efficient ways to translate specific PTL formulas, but we leave it as an open problem whether this can be done in general. In a similar vein, it is interesting to ask how PTL could be extended in order to make it equiexpressive with BKLM. Finally, it may be interesting to compare BKLM with an extension of KLM that allows for nested defeasible implications, i.e. formulas such as α |∼ (β |∼ γ). While such an extension cannot be more expressive than BKLM, at least for a semantics given by ranked interpretations, it may provide more natural encodings of various kinds of typicality, and thus be easier to work with from a pragmatic point of view.

5 Related Work

The most relevant work w.r.t. the present paper is that of Booth and Paris (1998), in which they define rational closure for the extended version of KLM in which negated conditionals are allowed, and the work on PTL (Booth et al. 2015; Booth et al. 2019).
The relation this work has with BKLM was investigated in detail throughout the paper. Delgrande (1987) proposes a logic that is as expressive as BKLM. The entailment relation he proposes is different from the minimal entailment relations we consider here and, given the strong links between our constructions and the KLM approach, the remarks in the comparison made by Lehmann and Magidor (1992, Section 3.7) are also applicable here. Boutilier (1994) defines a family of conditional logics using preferential and ranked interpretations. His logic is closer to ours and even more expressive, since nesting of conditionals is allowed, but he too does not consider minimal constructions. That is, both Delgrande's and Boutilier's approaches adopt a Tarskian-style notion of consequence, in line with rank entailment. The move towards a nonmonotonic notion of defeasible entailment was precisely our motivation in the present work.

Giordano et al. (2010) propose the system Pmin, which is based on a language that is as expressive as PTL. However, they end up using a constrained form of such a language that goes only slightly beyond the expressivity of the language of KLM-style conditionals (their well-behaved knowledge bases). Also, the system Pmin relies on preferential models and a notion of minimality that is closer to circumscription (McCarthy 1980). In the context of description logics, Giordano et al. (2007; 2015) propose to extend the conditional language with an explicit typicality operator T(·), with a meaning that is closely related to the PTL operator •. It is worth pointing out, though, that most of the analysis in the work of Giordano et al. is dedicated to a constrained use of the typicality

References

McCarthy, J. 1980. Circumscription, a form of non-monotonic reasoning. Artificial Intelligence 13(1-2):27–39.

Straßer, C. 2014. An adaptive logic for rational closure. In Adaptive Logics For Defeasible Reasoning, volume 38 of Trends in Logic. Springer International Publishing.
181–206.

Batens, D. 2007. A universal logic approach to adaptive logics. Logica Universalis 1:221–242.

Booth, R., and Paris, J. 1998. A note on the rational closure of knowledge bases with both positive and negative knowledge. Journal of Logic, Language and Information 7(2):165–190.

Booth, R.; Casini, G.; Meyer, T.; and Varzinczak, I. 2015. On the entailment problem for a logic of typicality. In IJCAI 2015, 2805–2811.

Booth, R.; Casini, G.; Meyer, T.; and Varzinczak, I. 2019. On rational entailment for propositional typicality logic. Artificial Intelligence 277.

Booth, R.; Meyer, T.; and Varzinczak, I. 2013. A propositional typicality logic for extending rational consequence. In Fermé, E.; Gabbay, D.; and Simari, G., eds., Trends in Belief Revision and Argumentation Dynamics, volume 48 of Studies in Logic – Logic and Cognitive Systems. King's College Publications. 123–154.

Boutilier, C. 1994. Conditional logics of normality: A modal approach. Artificial Intelligence 68(1):87–154.

Casini, G., and Straccia, U. 2013. Defeasible inheritance-based description logics. JAIR 48:415–473.

Casini, G.; Meyer, T.; Moodley, K.; and Nortje, R. 2014. Relevant closure: A new form of defeasible reasoning for description logics. In JELIA 2014, 92–106.

Casini, G.; Meyer, T.; and Varzinczak, I. 2019. Taking defeasible entailment beyond rational closure. In Calimeri, F.; Leone, N.; and Manna, M., eds., Logics in Artificial Intelligence – 16th European Conference, JELIA 2019, Rende, Italy, May 7-11, 2019, Proceedings, volume 11468 of Lecture Notes in Computer Science, 182–197. Springer.

Delgrande, J. 1987. A first-order logic for prototypical properties. Artificial Intelligence 33:105–130.

Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2007. Preferential description logics. In Dershowitz, N., and Voronkov, A., eds., Logic for Programming, Artificial Intelligence, and Reasoning (LPAR), number 4790 in LNAI, 257–272. Springer.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. L. 2010. A nonmonotonic extension of KLM preferential logic P. In Logic for Programming, Artificial Intelligence, and Reasoning – 17th International Conference, LPAR-17, Yogyakarta, Indonesia, October 10-15, 2010. Proceedings, 317–332.

Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2015. Semantic characterization of rational closure: From propositional logic to description logics. Artificial Intelligence 226:1–33.

Kraus, S.; Lehmann, D.; and Magidor, M. 1990. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence 44:167–207.

Lehmann, D., and Magidor, M. 1992. What does a conditional knowledge base entail? Artificial Intelligence 55:1–60.

Lehmann, D. 1995. Another perspective on default reasoning. Annals of Mathematics and Artificial Intelligence 15(1):61–82.

Towards Efficient Reasoning with Intensional Concepts

Jesse Heyninck¹, Ricardo Gonçalves², Matthias Knorr², João Leite²
¹ Technische Universität Dortmund
² NOVA LINCS, Departamento de Informática, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa
jesse.heyninck@tu-dortmund.de, {rjrg,mkn,jleite}@fct.unl.pt

Abstract

as "next", "at time", "during interval T") or space (e.g., "at place P", "within a given radius", "connected to"), but also legal reasoning (e.g., "is obliged to", "is permitted").

Example 1. Consider an airport that constantly receives data from sensors, cameras, etc. for monitoring, which, in combination with, e.g., facial recognition algorithms, allows one to automate and optimize relevant tasks, such as boarding, checking in, and giving or denying access to certain parts of the airport. For example, checked-in passengers that are missing for boarding can be traced and alerted, and, in the case of non-compliance with the instruction to proceed to the gate given the constraints on time and location, be subject to penalties.
Also, passengers that comply with relevant security procedures can be automatically boarded, and irregularities can be communicated to the security personnel for possible intervention, allowing for a more efficient allocation of (human) resources, and a better-functioning and safer airport. In this context, efficient reasoning with non-monotonic rules over intensional concepts is indeed mandatory, since a) rules allow us to encode monitoring and intervention guidelines and policies in a user-friendly and declarative manner; b) conclusions may have to be revised in the presence of newly arriving information; c) different intensional concepts need to be incorporated in the reasoning process; and d) timely decisions are required, even in the presence of large amounts of data, as in streams. However, relevant existing work usually deals with only one kind of intensional concepts (as detailed before), and, in general, the computational complexity of the proposed formalisms is too high, usually due to both the adopted underlying formalism and the unrestricted reasoning with expressive intensional concepts. In this paper, we introduce a formalism that allows us to reason with defeasible knowledge over intensional concepts. We build on so-called intensional logic programs (Orgun and Wadge 1992), extended with non-monotonic default negation, and equip them with a novel three-valued semantics with favorable properties. In particular, we define a well-founded model in the line of the well-founded semantics for logic programs (Gelder, Ross, and Schlipf 1991). Provided the adopted intensional operators satisfy certain properties, which turn out to be aligned with practical applications such as the one outlined in Ex. 
1, the well-founded model is unique, minimal among the three-valued models, Recent developments triggered by initiatives such as the Semantic Web, Linked Open Data, the Web of Things, and geographic information systems resulted in the wide and increasing availability of machine-processable data and knowledge in the form of data streams and knowledge bases. Applications building on such knowledge require reasoning with modal and intensional concepts, such as time, space, and obligations, that are defeasible. For example, in the presence of data streams, conclusions may have to be revised due to newly arriving information. The current literature features a variety of domain-specific formalisms that allow for defeasible reasoning using specific intensional concepts. However, many of these formalisms are computationally intractable and limited to one of the mentioned application domains. In this paper, we define a general method for obtaining defeasible inferences over intensional concepts, and we study conditions under which these inferences are efficiently computable. 1 INTRODUCTION In this paper, we develop a solution that allows us to efficiently reason with intensional concepts, such as time, space and obligations, providing defeasible/non-monotonic inferences in the presence of large quantities of data. Initiatives such as the Semantic Web, Linked Open Data, and the Web of Things, as well as modern Geographic Information Systems, resulted in the wide and increasing availability of machine-processable data and knowledge in the form of data streams and knowledge bases. To truly take advantage of this kind of knowledge, it is paramount to be able to reason in the presence of intensional or modal concepts, which has resulted in an increased interest in formalisms, often based on rules with defeasible inferences, that allow for reasoning with time (Anicic et al. 2012; Gonçalves, Knorr, and Leite 2014; Brandt et al. 2018; Beck, Dao-Tran, and Eiter 2018; Brewka et al. 
2018; Walega, Kaminski, and Grau 2019), space (Brenton, Faber, and Batsakis 2016; Walega, Schultz, and Bhatt 2017; Izmirlioglu and Erdem 2018; Suchan et al. 2018), and possibility or obligations (Panagiotidi, Nieves, and Vázquez-Salceda 2009; Gonçalves and Alferes 2012; Governatori, Rotolo, and Riveret 2018; Beirlaen, Heyninck, and Straßer 2019). Examples of such concepts may be found in applications with data referring for example to time (e.g., operators such

in the sense of only providing derivable consequences, and, crucially, its computation is tractable. Our approach allows us to add to relevant related work in the sense of providing a well-founded semantics to formalisms that did not have one, which we illustrate on a relevant fragment of LARS programs (Beck, Dao-Tran, and Eiter 2018). The remainder of the paper is structured as follows. We introduce intensional logic programs in Sec. 2, define our three-valued semantics in Sec. 3, show how to compute the well-founded model in Sec. 4, discuss the complexity and related work in Secs. 5 and 6, respectively, before we conclude.

a finite set of rules of the form:

A ← A1, . . . , An, ∼ B1, . . . , ∼ Bm   (1)

where A, A1, . . . , An, B1, . . . , Bm ∈ L_O^A. We call A the head of the rule, and A1, . . . , An, ∼ B1, . . . , ∼ Bm its body. We also call P simply a program when this does not cause confusion, and positive if it does not contain default negation. Intensional logic programs are highly expressive, as intensional operators can appear arbitrarily anywhere in the rules, in particular in rule heads and in the scope of default negation.

Example 2. Consider a fragment of the setting in Ex. 1 with two gateways: a and b. The area before gateway a is α, the area between gateways a and b is β, and the area behind gateway b is γ. α and β are transit zones where one is not allowed to wait.
For simplicity we assume a finite timeline T = {1, 2}.¹ Consider the set of operators O1:

2 INTENSIONAL LOGIC PROGRAMS

In this section, building on previous work by Orgun and Wadge [1992], we introduce intensional logic programs, a very expressive framework that allows us to reason with intensional concepts, such as time, space, and obligations, in the presence of large quantities of data, including streams of data. Such intensional logic programs are based on rules, as used in normal logic programs, enriched with atoms that introduce the desired intensional concepts. The usage of default negation in the rules is a distinctive feature compared to the original work (Orgun and Wadge 1992); it is particularly well-suited to model non-monotonic and defeasible reasoning (Gelfond 2008) and allows us to capture many other forms of non-monotonic reasoning, see, e.g., (Caminada et al. 2015; Chen et al. 2010). To assign meaning to intensional programs, we rely on the framework of neighborhood semantics (Pacuit 2017), a generalization of the Kripke semantics, that easily allows us to capture a wide variety of intensional operators. In this section, we introduce neighborhood frames to assign semantics to intensional operators, and leave the definition of our novel three-valued semantics for such programs to the next section.

We start by defining the basic elements of our language. We consider a function-free first-order signature Σ = ⟨P, C⟩, a set X of variables, and a set of operation symbols O, such that the sets P (of predicates), C (of constants), X and O are mutually disjoint. The set of atoms over Σ and X is defined in the usual way. We say that an atom is ground if it does not contain variables, and we denote by A_Σ the set of all ground atoms over Σ. In what follows, and without loss of generality, we leave the signature Σ implicit and consider only the set of ground atoms over Σ, denoted by A.
The set O contains the symbols representing the various intensional operators ∇. Based on these, we introduce intensional atoms.

Definition 1. Given a set of atoms A and a set of operation symbols O, the set I_O^A of intensional atoms over A and O is defined as I_O^A = {∇p | p ∈ A and ∇ ∈ O}, and the set of program atoms L_O^A is defined as L_O^A = A ∪ I_O^A.

We can define intensional logic programs as sets of rules with default negation, denoted by ∼, over program atoms.

Definition 2. Given a set of atoms A and a set of operation symbols O, an intensional logic program P over A and O is

O1 = {O, □_t, @_{t,ℓ}, @_ℓ, ⊳_t | t ∈ T, ℓ ∈ {α, β, γ}}

where O expresses that "something is obligatory", □_t means "something is the case at time t and every place ℓ", @_ℓ means "something is the case at location ℓ", @_{t,ℓ} means "something is the case at time t and location ℓ", and ⊳_t means "something is the case at or before time t". We use a signature ⟨P, C⟩ where the set C of constants contains identifiers representing persons, including p representing Petra, and the set P of predicates is composed of the following unary predicates: passed_a, passed_b, move, is and called. They express that x passed through gate a or b, x moves, x is at a spatio-temporal point, and x is called, respectively. Consider program P composed of the following rules:²

called(x) ← O move(x), ∼ move(x)   (2)
O move(x) ← ∼ @_γ is(x)   (3)
□_t move(x) ← @_{t+1,β} is(x), @_{t,α} is(x)   (4)
□_t move(x) ← @_{t+1,γ} is(x), @_{t,β} is(x)   (5)
@_{t,β} is(x) ← ⊳_t passed_a(x), ∼ ⊳_t passed_b(x)   (6)
@_{t,γ} is(x) ← ⊳_t passed_b(x)   (7)
□_1 passed_a(p) ←   (8)

Rule (2) encodes that if a person should move, but does not, she will be called. Rule (3) encodes that a person ought to move if she is not at γ. In the case of rules (4) and (5), a person moved if she was at two different locations at two subsequent time points.
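Program atoms and rules in the sense of Definitions 1 and 2 can be represented with simple immutable records. This is our own encoding (names and ASCII operator labels are ours, not the paper's):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    pred: str
    args: tuple = ()

@dataclass(frozen=True)
class IAtom:
    # an intensional atom ∇p, with the operator ∇ named by a string
    op: str
    atom: Atom

@dataclass(frozen=True)
class Rule:
    # A <- A1, ..., An, ~B1, ..., ~Bm  (form (1))
    head: object
    pos: tuple = ()   # A1, ..., An
    neg: tuple = ()   # B1, ..., Bm, under default negation ~

# Rule (3) of Example 2, grounded for the constant p (Petra):
#   O move(p) <- ~ @_gamma is(p)
r3 = Rule(head=IAtom("O", Atom("move", ("p",))),
          neg=(IAtom("@gamma", Atom("is", ("p",))),))
```

A program is then just a finite set of such `Rule` values; a rule is positive exactly when its `neg` component is empty.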
Rule (6) encodes that if a person passed through gate a, but not through gate b, she is at β, whereas rule (7) imposes that she is at γ if she passed through gate b. Finally, rule (8) asserts that Petra passed through gate a at time 1.

¹ Note that for applications such as the one described here, in practice, considering an arbitrarily large, but finite timeline does indeed suffice.
² In the course of this example, we use variables to ease the presentation. They represent the ground instantiation of such rules with all possible constants in the usual way. Time variable t also represents all possible values.

In order to give semantics to intensional operators, we follow the same ideas as employed by Orgun and Wadge [1992] and consider the neighborhood semantics, a strict generalization of Kripke-style semantics that allows capturing intensional operators (Pacuit 2017) such as temporal, spatial, or deontic operators, even those that do not satisfy the normality property imposed by Kripke frames (Chellas 1980). We start by recalling neighborhood frames.
Given a neighborhood frame, we start by defining interpretations that contain a valuation function which indicates in which worlds (of the frame) an atom from A is true (W ⊤ ), and in which ones it is true or undefined (W u ), i.e., not false 4 . Definition 3. Given a set of operation symbols O, a neighborhood frame (over O) is a pair F = hW, N i where W is a non-empty set (of worlds) and N = {θ∇ | ∇ ∈ O} is a set of neighborhood functions θ∇ : W → ℘(℘(W )).3 Thus, in comparison to Kripke frames, instead of a relation over W , neighborhood frames have functions for each operator that map worlds to a set of sets of worlds. These sets intuitively represent the atoms necessary (according to the correspondent intensional operator) at that world. Example 3. The operators from Ex. 2 are given semantics using a neighborhood frame where the set of worlds W1 is composed of triples (t, ℓ, ⋆) where t ∈ T is a time point, ℓ ∈ {α, β, γ} is a location and ⋆ ∈ {I, A} indicates if the world is the actual world A or the ideal world I (postulating ideal worlds is a standard technique for giving semantics to modal operators (McNamara 2019)). The neighborhoods of O1 are defined, for t, t′ ∈ T , ℓ, ℓ′ ∈ {α, β, γ}, w ∈ W1 and ⋆ ∈ {I, A}, as: Definition 4. Given a set of atoms A and a frame F = hW, N i, an interpretation I over A and F is a tuple hW, N, V i with a valuation function V : A → ℘(W ) × ℘(W ) s.t., for every p ∈ A, V (p) = (W ⊤ , W u ) with W ⊤ ⊆ W u . If, for every p ∈ A, W ⊤ = W u , then we call I total. The subset inclusion on the worlds ensures that no p ∈ A can be true and false in some world simultaneously. This intuition of the meaning is made precise with the denotation of program atoms for which we use the three truth values. We denote the truth values true, undefined and false with ⊤, u, and ⊥, respectively, and we assume that the language LA O contains a special atom u (associated to u). 
• θ_O((t, ℓ, ⋆)) = {W′ ⊆ W1 | (t, ℓ, I) ∈ W′};
• θ_□t(w) = {W′ ⊆ W1 | (t, ℓ, A) ∈ W′ for every ℓ ∈ {α, β, γ}};
• θ_@ℓ((t, ℓ′, ⋆)) = {W′ ⊆ W1 | (t, ℓ, A) ∈ W′};
• θ_@t,ℓ(w) = {W′ ⊆ W1 | (t, ℓ, A) ∈ W′};
• θ_⊳t(w) = {W′ ⊆ W1 | (t′, ℓ, A) ∈ W′, for some t′ ≤ t and for some ℓ ∈ {α, β, γ}}.

Intuitively, θ_O((t, ℓ, ⋆)) consists of all the sets of worlds which include the ideal counterpart (t, ℓ, I) of (t, ℓ, ⋆); θ_□t(w) consists of all the sets of worlds that include all actual worlds with time component t; θ_@ℓ(w) contains all sets of worlds that contain at least one actual world with space component ℓ; a set of worlds is in θ_@t,ℓ(w) if it contains (t, ℓ, A); finally, a set of worlds is in θ_⊳t(w) if it contains at least one actual world with a time stamp t′ which is earlier than or equal to t.

Definition 5. Given a set of atoms A, a frame F, and an interpretation I = ⟨W, N, V⟩, we define the denotation of A ∈ L_O^A in I:

• ‖p‖_I^† = W^† if A = p ∈ A, with V(p) = (W^⊤, W^u) and † ∈ {⊤, u};
• ‖u‖^u = W and ‖u‖^⊤ = ∅, if A = u;
• ‖∇p‖_I^† = {w ∈ W | ‖p‖_I^† ∈ θ_∇(w)} if A = ∇p ∈ I_O^A and † ∈ {⊤, u};
• ‖A‖_I^⊥ = W \ ‖A‖_I^u for A ∈ L_O^A.

For a formula A ∈ L_O^A and an interpretation I, ‖A‖_I^⊤ is the set of worlds in which A is true, ‖A‖_I^u is the set of worlds in which A is not false, i.e., undefined or true, and ‖A‖_I^⊥ is the set of worlds in which A is false. For atoms p ∈ A, the denotation is straightforwardly derived from the interpretation I, i.e., from the valuation function V, and for the special atom u it is defined as expected (undefined in all worlds). For an intensional atom ∇p, w is in the denotation ‖∇p‖_I^† of ∇p if the denotation of p (according to I) is a neighborhood of ∇ for w, i.e. ‖p‖_I^† ∈ θ_∇(w).
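The clause for intensional atoms in Definition 5 can be prototyped by treating a neighborhood function as a membership test θ(w, S). This is our own encoding, not the paper's: w lies in the denotation of ∇p exactly when the denotation of p is a neighborhood of w.

```python
def denote_intensional(worlds, theta, p_den):
    # ||∇p|| = {w in W | ||p|| ∈ θ_∇(w)}, with θ_∇ given as a predicate
    return {w for w in worlds if theta(w, frozenset(p_den))}

# A toy operator in the style of @_{t,l} from Example 3: a set S is a
# neighborhood of any world iff it contains a designated world `target`.
W = {1, 2, 3}
target = 1
theta_at = lambda w, S: target in S
```

With this frame, ∇p holds at every world whenever the denotation of p contains the designated world, and at no world otherwise; note that the result depends only on the denotation of p as a whole, not on pointwise truth.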
Thus, neighborhood functions θ can be both invariant under the input w, i.e., θ(w) = θ(w′ ) for any w, w′ ∈ W (e.g., θt and θ@t,ℓ ), or variate depending on w (e.g., θO and @ℓ ). This is why the above definitions of neighborhood functions that depend on w need to explicit the components of the world w, i.e., (t, ℓ, ⋆). 3 Note that we often leave O implicit as N allows to uniquely determine all elements from O. Also, to ease the presentation, we only consider unary intensional operators. Others can then often be represented using rules (cf. also (Orgun and Wadge 1992)). 4 We follow the usual notation in modal logic and interpretations explicitly include the corresponding frame. 181 We often leave the subscript I from kAk†I as well as the reference to A and F for interpretations and programs implicit. Example 4. Consider hW1 , {θO , θ@1 ,α , θ@β }i as in Ex. 3 and I1 = hW1 , {θO , θ@1 ,α , θ@β }, V i where: V (passeda (p)) = ({(1, α, A)}, {(1, α, A)}) V (move(p)) = ({(1, α, I)}, {(1, α, I), (2, β, A)}) Then the following are examples of denotations of intensional atoms: kOmove(p)k⊤ I1 = {(1, α, A), (1, α, I)} k@1,α passeda (p)k⊤ I1 = W 1 k@β move(p)kuI1 = {(2, ℓ, ⋆) | ℓ ∈ {α, β, γ}, ⋆ ∈ {A, I}} since We explain the first denotation kOmove(p)k⊤ I1 : kmove(p)k ∋ (1, α, I) and {(1, α, I)} ∈ θO ((1, α, I)) and {(1, α, I)} ∈ θO ((1, α, A)), we get the denotation kOmove(p)k⊤ I1 as stated above. Based on the denotation, we can now define our model notion, which is inspired by partial stable models (Przymusinski 1991), which come with two favorable properties, minimality and support. The former captures the idea of minimal assumption, the latter provides traceable inferences from rules. We adapt this notion here by defining a reduct that, given an interpretation, transforms programs into positive ones, for which a satisfaction relation and a minimal model notion are defined. 
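Before moving on, the core of Definitions 3-5 can be illustrated in code. The following is a hypothetical Python sketch, not the authors' implementation: all names are ours, and a neighborhood function is represented as a membership test theta(w, S) deciding whether a set of worlds S belongs to θ∇(w), which avoids materializing sets of sets of worlds. Under these assumptions, it reproduces the first two denotations of Example 4 (restricting T to {1, 2}).

```python
# Hypothetical encoding of Definitions 3-5 (all names are illustrative):
# a neighborhood function is given as a membership test theta(w, S) deciding
# whether the set of worlds S belongs to theta(w); the denotation ||A||† of
# Definition 5 then follows directly.

T = [1, 2]                                    # small time domain (assumption)
LOCS = ["alpha", "beta", "gamma"]
W1 = [(t, l, s) for t in T for l in LOCS for s in ("A", "I")]

def theta_O(w, S):
    # O: all sets containing the ideal counterpart (t, l, I) of w
    t, l, _ = w
    return (t, l, "I") in S

def theta_at(t, l):
    # @t,l: all sets containing the actual world (t, l, A), independent of w
    return lambda w, S: (t, l, "A") in S

def denot(formula, dagger, V, thetas):
    """||formula||† for dagger in {'true', 'undef'}: intensional atoms are
    pairs (operator_name, atom), program atoms are strings."""
    if isinstance(formula, tuple):            # intensional atom (op, p)
        op, p = formula
        inner = denot(p, dagger, V, thetas)
        # worlds at which the denotation of p is a neighborhood of the operator
        return {w for w in W1 if thetas[op](w, inner)}
    w_true, w_undef = V[formula]
    return set(w_true if dagger == "true" else w_undef)

# Valuation of Example 4 (restricted to T = {1, 2}):
V = {
    "passed_a(p)": ({(1, "alpha", "A")}, {(1, "alpha", "A")}),
    "move(p)": ({(1, "alpha", "I")}, {(1, "alpha", "I"), (2, "beta", "A")}),
}
thetas = {"O": theta_O, "@1,alpha": theta_at(1, "alpha")}

# ||O move(p)||⊤: the worlds (1, alpha, A) and (1, alpha, I), as in Example 4
print(sorted(denot(("O", "move(p)"), "true", V, thetas)))
# ||@1,alpha passed_a(p)||⊤ covers all of W1, as in Example 4
print(len(denot(("@1,alpha", "passed_a(p)"), "true", V, thetas)))
```

The false-denotation ‖A‖⊥ needs no separate case here, since by Definition 5 it is simply the complement W \ ‖A‖u of the undefined-or-true denotation.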
We start by adapting two orders on interpretations, the truth ordering ⊑ and the knowledge ordering ⊑k. The former prefers higher truth values in the order ⊥ < u < ⊤, the latter prefers more knowledge (i.e., less undefined knowledge). Formally, for interpretations I and I′ and every p ∈ A:

• I ⊑ I′ iff ‖p‖†_I ⊆ ‖p‖†_I′ for every † ∈ {⊤, u};
• I ⊑k I′ iff ‖p‖⊤_I ⊆ ‖p‖⊤_I′ and ‖p‖⊥_I ⊆ ‖p‖⊥_I′.

We write I ⊏ I′ if I ⊑ I′ and I′ ⋢ I, and I ⊏k I′ if I ⊑k I′ and I′ ⋢k I.

We proceed with a generalization of the notion of reduct to programs with intensional atoms.

Definition 6. Let A be a set of atoms, and F = ⟨W, N⟩ a frame. The reduct of a program P at w ∈ W w.r.t. an interpretation I, P/Iw, contains for each r ∈ P of the form (1):

• A ← A1, …, An if w ∉ ⋃_{i≤m} ‖Bi‖u;
• A ← A1, …, An, u if w ∈ ⋃_{i≤m} ‖Bi‖u \ ⋃_{i≤m} ‖Bi‖⊤.

Intuitively, for each rule r of P, the reduct P/Iw contains either a rule of the first form, if all negated program atoms in the body of r are false at w (or the body contains no negated atoms); or a rule of the second form, if none of the negated program atoms in the body of r is true at w, but some of them are undefined at w; or no rule at all, otherwise. This also explains why the reduct is defined at w: truth and undefinedness vary between worlds. The special atom u is used to ensure that rules of the second form cannot impose the truth of the head in the notion of satisfaction for positive programs. Note that the reduct of a program is a positive program, for which we can define a notion of satisfaction as follows.

Definition 7. Let A be a set of atoms, and F = ⟨W, N⟩ a frame. An interpretation I satisfies a positive program P at w ∈ W iff, for each r ∈ P of the form (1), w ∈ ⋂_{i≤n} ‖Ai‖† implies w ∈ ‖A‖† (for any † ∈ {⊤, u}). (Footnote 5: Since the intersection of an empty sequence of subsets of a set is the entire set, for n = 0, i.e., when the body of the rule is empty, the satisfaction condition is just w ∈ ‖A‖† for any † ∈ {⊤, u}.)

Stable models can now be defined by imposing minimality w.r.t. the truth ordering on the corresponding reduct.

Definition 8. Let A be a set of atoms, and F = ⟨W, N⟩ a frame. An interpretation I is a stable model of a program P if:

• for every w ∈ W, I satisfies P/Iw at w, and
• there is no interpretation I′ such that I′ ⊏ I and, for each w ∈ W, I′ satisfies P/Iw at w.

Example 5. Recall P from Ex. 2. For simplicity of presentation, suppose that the set of constants C only contains p, resulting in the following grounded program. (Footnote 6: To ease the presentation, we still use t to represent all possible values.)

called(p) ← Omove(p), ∼move(p)
Omove(p) ← ∼@γ is(p)
@t move(p) ← @t+1,β is(p), @t,α is(p)
@t move(p) ← @t+1,γ is(p), @t,β is(p)
@t,β is(p) ← ⊳t passeda(p), ∼⊳t passedb(p)
@t,γ is(p) ← ⊳t passedb(p)
@1 passeda(p) ←

Consider F = ⟨W1, N1⟩ as in Ex. 3 and the total interpretation I1 defined by:

‖passeda(p)‖⊤_I1 = {(1, ℓ, A) | ℓ ∈ {α, β, γ}}
‖passedb(p)‖⊤_I1 = {}
‖is(p)‖⊤_I1 = {(1, β, A)}
‖move(p)‖⊤_I1 = {(t, ℓ, I) | t ∈ T, ℓ ∈ {α, β, γ}}
‖called(p)‖⊤_I1 = {(t, ℓ, A) | t ∈ T, ℓ ∈ {α, β, γ}}

We see that for any (t′, ℓ, A) ∈ W1, P/(I1)w consists of the following rules occurring in P:

called(p) ← Omove(p)
Omove(p) ←
@t move(p) ← @t+1,β is(p), @t,α is(p)
@t move(p) ← @t+1,γ is(p), @t,β is(p)
@t,β is(p) ← ⊳t passeda(p)
@t,γ is(p) ← ⊳t passedb(p)
@1 passeda(p) ←

Whereas for any (t′, ℓ, I) ∈ W1, P/(I1)w consists of:

Omove(p) ←
@t move(p) ← @t+1,β is(p), @t,α is(p)
@t move(p) ← @t+1,γ is(p), @t,β is(p)
@t,β is(p) ← ⊳t passeda(p)
@t,γ is(p) ← ⊳t passedb(p)
@1 passeda(p) ←

It can be checked that I1 satisfies minimality and is therefore a stable model of P. Consider now the total interpretation I2, identical to I1 except for ‖passedb(p)‖⊤_I2 = {(2, ℓ, A) | ℓ ∈ {α, β, γ}}. Then, for e.g. (2, α, A), the reduct P/(I2)(2,α,A) is:

called(p) ← Omove(p)
Omove(p) ←
@t move(p) ← @t+1,β is(p), @t,α is(p)
@t move(p) ← @t+1,γ is(p), @t,β is(p)
@t,γ is(p) ← ⊳t passedb(p)
@1 passeda(p) ←

Since (2, α, A) ∉ ‖@2,γ is(p)‖ = ∅, even though @2,γ is(p) ← ⊳2 passedb(p) ∈ P/(I2)(2,α,A) and (2, α, A) ∈ ‖⊳2 passedb(p)‖⊤_I2 = W1, we see that I2 does not satisfy P/(I2)(2,α,A) and is therefore not a stable model.

We can show that our model notion is faithful to partial stable models of normal logic programs (Przymusinski 1991), i.e., if we consider a program without intensional atoms, then its semantics corresponds to that of partial stable models.

Proposition 1. Let A be a set of atoms, F a frame, and P a program with no intensional atoms. Then there is a one-to-one correspondence between the stable models of P and the partial stable models of the normal logic program P.

While partial stable models are indeed truth-minimal, this turns out not to be the case for intensional programs, due to non-monotonic intensional operators.

Example 6. Consider the operator |j, k|γ representing that an atom is true at γ at all time points in [j, k], and not in any interval properly containing [j, k]. This operator has the following neighborhood function (given W1 from Ex. 3):

θ|j,k|γ(w) = {W′ ⊆ W1 | {(j, γ, A), (j+1, γ, A), …, (k, γ, A)} ⊆ W′ and (j−1, γ, A), (k+1, γ, A) ∉ W′}.

Consider the following program P consisting of:

@1,γ is(p) ←
@2,γ is(p) ←
@3,γ is(p) ← ∼|1, 2|γ is(p)

This program has two stable models, of which one is not minimal. In more detail, the following interpretations are stable: I1 with ‖is(p)‖⊤_I1 = ‖is(p)‖u_I1 = {(1, γ, A), (2, γ, A)}, and I2 with ‖is(p)‖⊤_I2 = ‖is(p)‖u_I2 = {(1, γ, A), (2, γ, A), (3, γ, A)}. To see that I2 is stable, observe first that, since {(1, γ, A), (2, γ, A), (3, γ, A)} ∉ θ|1,2|γ(w) for any w ∈ W1, we have ‖|1, 2|γ is(p)‖⊤_I2 = ∅, which means that P/I2 = {@1,γ is(p) ←; @2,γ is(p) ←; @3,γ is(p) ←}. Clearly, I2 is the ⊏-minimal interpretation that satisfies P/I2. However, I1 ⊏ I2, and thus I2 is not a truth-minimal stable model.

To counter that, we consider monotonic operators. Formally, given a set of atoms A and a frame F, an intensional operator ∇ is said to be monotonic in F if, for any two interpretations I and I′ such that I ⊑ I′, we have that ‖∇p‖†_I ⊆ ‖∇p‖†_I′ for every p ∈ A and † ∈ {⊤, u}. If all intensional operators in a frame are monotonic, then truth-minimality of stable models is guaranteed.

Proposition 2. Let A be a set of atoms, and F a frame in which all intensional operators are monotonic. If I is a stable model of P, then there is no stable model I′ of P such that I′ ⊏ I.

Regarding support, recall that the stable model semantics of normal logic programs satisfies the support property, in the sense that for every atom of a stable model there is a rule that justifies it. In other words, if we remove an atom p from a stable model, some rule becomes false in the resulting model. Such a rule can be seen as a justification for p being true in the stable model. In the case of intensional logic programs, we say that an interpretation I = ⟨W, N, V⟩ is supported for a program P if, for every p ∈ A and w ∈ W with w ∈ ‖p‖⊤, there is a rule r ∈ P/Iw that is not satisfied at w by I′ = ⟨W, N, V′⟩, where V′(q) = V(q) for q ≠ p, and V′(p) = (W⊤ \ {w}, Wu) for V(p) = (W⊤, Wu). This notion of supportedness is desirable for intensional logic programs since we also want a justification why each atom is true at each world of a stable model. The following result shows that this is indeed the case.

Proposition 3. Let A be a set of atoms, and F a frame. Then every stable model of a program P is supported.

In general, the existence and uniqueness of stable models of a program is not guaranteed, not even for positive programs and/or under the restriction that all operators are monotonic.

Example 7. Let O = {⊕}, A = {p} and F = ⟨{1, 2}, {θ⊕}⟩ where θ⊕(1) = θ⊕(2) = {{1}, {2}}. Let P = {⊕p ←}. This program has two stable models:

• I1 with V1(p) = ({1}, {1});
• I2 with V2(p) = ({2}, {2}).

The existence of two stable models of the above positive program is caused by the non-determinism introduced by the intensional operator in the head of the rule. Formally, an operator θ of a frame F = ⟨W, N⟩ is deterministic if ⋂θ(w) ∈ θ(w) for every w ∈ W. A program P is deterministic in the head if, for every rule r ∈ P of the form (1) with A = ∇p, θ∇ is deterministic. We can show that every positive program that is deterministic in the head and only uses monotonic operators has a single minimal model.

Proposition 4. Given a set of atoms A and a frame F, if P is a positive program that is deterministic in the head and every ∇ ∈ O is monotonic in F, then P has a unique stable model.

Due to this result, in what follows we focus on monotonic operators and programs that are deterministic in the head, as this is important for several of the results we obtain subsequently. This does not mean that non-monotonic intensional operators cannot be used in our framework. In fact, we can take advantage of the default negation operator ∼ to define non-monotonic formulas on the basis of monotonic operators and default negation. As an example, consider again the operator |j, k| from Ex. 6. We can use the following rule to define |j, k|p for some atom p ∈ A:

|j, k|p ← @j p, @j+1 p, …, @k−1 p, @k p, ∼@j−1 p, ∼@k+1 p.

Among the stable models of a program, we can distinguish the well-founded models as those that are minimal in terms of the knowledge order.

Definition 9. Given a set of atoms A and a frame F, an interpretation I = ⟨W, N, V⟩ is a well-founded model of a program P if it is a stable model of P and, for every stable model I′ of P, it holds that I ⊑k I′.

Example 8 (Example 5 continued). Since I1 is in fact the unique stable model, it is therefore the well-founded model.

Given our assumptions about monotonicity and determinism in the head, we can also show that the well-founded model of an intensional program exists and is unique.

Theorem 1. Given a set of atoms A and a frame F, every program P has a unique well-founded model.

4 ALTERNATING FIXPOINT

In this section, we show how the well-founded model can be efficiently computed. Essentially, we extend the idea of the alternating fixpoint developed for logic programs (Gelder 1989), which builds on computing, in an alternating manner, underestimates of what is necessarily true and overestimates of what is not false, with the mechanisms to handle intensional inferences. Namely, we first define an operator for inferring consequences from positive programs, and make it applicable in general using the reduct. Based on an iterative application of these, the alternating fixpoint provides the well-founded model.

First, since different pieces of knowledge are inferable in different worlds, we need a way to distinguish between these. Therefore, we introduce labels referring to worlds and apply them to formulas of a given language as well as to programs. Given a language L, a frame F = ⟨W, N⟩, and a program P, we define the language labelled by W, LW, as {w : A | w ∈ W and A ∈ L}, and the program labelled by W, PW, as {w : r | r ∈ P, w ∈ W}. PW is positive iff P is. This allows us to say that some (program) atom is true (or undefined) at world w, which can, e.g., be used for inferences using rules labelled with w. This way, we can also compute inferences for several worlds simultaneously, since the labels allow us to avoid any potential overlaps.

We now proceed to define an operator for positive labelled programs for computing inferences given a set of labelled atoms. This operator is composed of three operators, to incorporate reasoning over rules as well as over intensional atoms. We first define an immediate consequence operator applied to sets of labelled program atoms. To ease notation, here and in the following, we use LW to represent (L^A_O)W.

Definition 10. Given a frame F, a set of LW-formulas ∆, and a positive program PW, we define TPW(∆) as follows:

TPW(∆) = {w : A | w : A ← A1, …, An ∈ PW and w : A1, …, w : An ∈ ∆}

Example 9. Let PW = {(2, α, A) : @1 passeda(p) ←}. Then TPW(∅) = {(2, α, A) : @1 passeda(p)}.

The result of TPW may contain labelled intensional atoms, such as in Ex. 9, which implies that passeda(p) holds at (1, α, A). The next operator, the intensional extraction operator IE∇, allows us to derive such labelled atoms from labelled intensional atoms.

Definition 11. Given a frame F, a set of LW-formulas ∆, and ∇ ∈ O, we define IE∇(∆) as follows:

IE∇(∆) = {w : A | w′ : ∇A ∈ ∆, w ∈ ⋂θ∇(w′)}

As IE∇ is intended to be applied to results of TPW, and these only contain intensional atoms occurring in the head of some rule in a given program P, we restrict IE∇ to this set of intensional operators, which we denote by OP. Since P is deterministic in the head, this also ensures that IE∇ has a unique outcome. Also, since programs do not contain nested operators, we can consider the union of IE∇ for all ∇ ∈ OP.

Finally, the intensional consequence operator IC∇ maps atoms to intensional atoms that are implied by the former, i.e., it maps w1 : A, …, wn : A to w : ∇A if {w1, …, wn} ∈ θ∇(w).

Definition 12. Given a frame F = ⟨W, N⟩, a set of LW-formulas ∆, and θ∇ ∈ N, we define IC∇(∆) as follows:

IC∇(∆) = {w : ∇A ∈ LW | {w′ | w′ : A ∈ ∆} ∈ θ∇(w), w ∈ W}

Here, we can also simply consider the union for all θ∇ ∈ N.

Example 10. Consider the frame F = ⟨W1, {θ@1, θ⊳1}⟩ as defined in Ex. 3. Let ∆ = {(2, α, A) : @1 passeda(p)}.
Since (2, α, A) : @1 passeda(p) ∈ ∆ and (1, ℓ, A) ∈ ⋂θ@1((2, α, A)) for every ℓ ∈ {α, β, γ}, we have IE@1(∆) = {(1, ℓ, A) : passeda(p) | ℓ ∈ {α, β, γ}}. Informally, to believe ∆, and thus to believe that Petra passed gate a at time 1, passeda(p) has to be true at every actual world with time stamp 1. Also, IC⊳1(IE@1(∆)) = {w : ⊳1 passeda(p) | w ∈ W1}, since {(1, ℓ, A) | ℓ ∈ {α, β, γ}} ∈ θ⊳1(w) for any w ∈ W1 and (1, ℓ, A) : passeda(p) ∈ IE@1(∆) for every ℓ ∈ {α, β, γ}. Informally, since we know that passeda(p) is the case at time 1, we derive that passeda(p) is the case at or before time 1 (formally: ⊳1 passeda(p)).

We are now ready to define the closure operator as the least fixpoint of the composition of TPW, IE∇ and IC∇.

Definition 13. Given a frame F = ⟨W, N⟩ and a positive program PW, Cn(PW) is defined as the least fixpoint (Footnote 7) of

(⋃_{∇∈O} IC∇)((⋃_{∇∈OP} IE∇)(TPW)).

The consequence operator defined above is adequate in the sense that it calculates the minimal (total) model of a positive program. In more detail, defining Wp = {w ∈ W | w : p ∈ Cn(PW)}, the interpretation I(Cn) = ⟨W, N, V⟩ with V(p) = (Wp, Wp) for every p ∈ A is a minimal model.

Proposition 5. Given a frame F = ⟨W, N⟩ and a positive program P, if every ∇ ∈ O is monotonic in F and P is deterministic in the head, then I(Cn) is the unique (and total) stable model of P.

This result can be generalized to arbitrary programs, relying on the reduct and on the alternating fixpoint (Gelder 1989). For the former, we adapt the reduct from Def. 6 to labelled languages, with the benefit that we can define a single reduct for all worlds w ∈ W. Given a frame F = ⟨W, N⟩, a set of LW-formulas ∆, and a program PW, the reduct is PW/∆ = {w : A ← A1, …, An | w : r ∈ PW with r of the form (1) and, for all i ≤ m, w : Bi ∉ ∆}.

The idea of the alternating fixpoint can be summarized as follows. We create two sequences, meant to represent an underestimate of what is true (P^i) and an overestimate of what is not false (N^i). Each iteration further increases the elements in P^i and further decreases the elements in N^i, using Cn over reducts obtained from the labelled program and the results of the previous iteration. Given a frame F = ⟨W, N⟩ and a program P, we define:

P^0 = ∅
N^0 = LW
P^{i+1} = Cn(PW/N^i)
N^{i+1} = Cn(PW/P^i)
P^ω = ⋃_i P^i
N^ω = ⋂_i N^i

We can show that P^i is increasing and that N^i is decreasing, and that both sequences reach a fixpoint, because the operator for determining Cn is monotonic and the reduct is antitonic.

Proposition 6. Given a set of atoms A, a frame F = ⟨W, N⟩, and a program P, if every ∇ ∈ O is monotonic in F and P is deterministic in the head, then there are i, j ∈ N s.t. P^i = P^{i+1} and N^j = N^{j+1}.

Example 11. Consider the frame F = ⟨W1, {θ@1, θ@1,β}⟩ as defined in Ex. 3. Let PW = {w : @1 passeda(p) ← ; w : @1,β is(p) ← passeda(p), ∼passedb(p) | w ∈ W1}. The alternating fixpoint construction is carried out as follows. We start with P^0 = ∅ and N^0 = LW. Then PW/N^0 = {w : @1 passeda(p) ← | w ∈ W1} and PW/P^0 = {w : @1 passeda(p) ← ; w : @1,β is(p) ← passeda(p) | w ∈ W1}, which implies P^1 = {w : @1 passeda(p); (1, ℓ, A) : passeda(p) | w ∈ W1, ℓ ∈ {α, β, γ}} and N^1 = P^1 ∪ {(1, ℓ, A) : @1,β is(p) | ℓ ∈ {α, β, γ}}. From this we derive PW/P^1 = PW/N^1 = {w : @1 passeda(p) ← ; w : @1,β is(p) ← passeda(p) | w ∈ W1}, which allows us to calculate P^2 = N^2 = N^1. Notice that a fixpoint is reached, and thus P^ω = N^ω = N^1.

Given a frame F in which every ∇ ∈ O is monotonic, the alternating fixpoint construction defined above offers a characterization of the well-founded model for programs that are deterministic in the head. In more detail, given a pair ⟨∆, Θ⟩ of sets of LW-formulas, we define a partial interpretation I(⟨∆, Θ⟩) = ⟨W, N, V⟩ as follows: for every A ∈ A, V(A) = ({w ∈ W | w : A ∈ ∆}, {w ∈ W | w : A ∈ Θ}). We can then show the following correspondence.

Theorem 2. Given a frame F = ⟨W, N⟩ and a program P s.t. every ∇ ∈ O is monotonic in F and P is deterministic in the head, I(⟨P^ω, N^ω⟩) is the well-founded model of P.

Thus, the result of the alternating fixpoint operator is a precise representation of the well-founded model of the considered intensional program.

5 COMPUTATIONAL COMPLEXITY

In this section, we study the computational complexity of several of the problems considered. We recall that the problem of satisfiability under neighborhood semantics has been studied for a variety of epistemic structures (Vardi 1989). Here, we consider the problem of determining models for the two notions we established, stable models and the well-founded model, and we focus on the propositional case. (Footnote 8: Corresponding results for the data complexity of this problem for programs with variables can then be achieved in the usual way (Dantsin et al. 2001).) We assume familiarity with standard complexity concepts, including oracles and the polynomial hierarchy.

We first provide a result in the spirit of model-checking for programs P. As we do not impose any semantic properties on the neighborhood frames we consider, determining a model for a frame that can be arbitrarily chosen is not meaningful. Thus, in the remainder, we assume a fixed frame F, fixing the worlds and the semantics of the intensional operators (see Footnote 9).

Proposition 7. Given a program P and an interpretation I, deciding whether I is a stable model of P is in coNP.

This result is due to the minimization of stable models, i.e., we need to check satisfaction and verify that there is no other interpretation which is smaller (cf. Def. 8). This also impacts the complexity of finding a stable model given a fixed frame.

Theorem 3. Given a program P, deciding whether there is a stable model of P is in Σ^P_2.
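To illustrate the alternating fixpoint of Section 4, which underlies the polynomial-time computation of the well-founded model, the following is a minimal, hypothetical Python sketch. The encoding and all names are ours, not the paper's: rules are triples (head, positive body, negated body) of labelled atoms, Cn is approximated by iterating the immediate consequence operator of Definition 10, and the intensional extraction and consequence operators IE and IC are omitted for brevity, so the sketch covers labelled program atoms only.

```python
# Hypothetical sketch of the alternating fixpoint (Section 4), restricted to
# labelled program atoms; the IE/IC operators are omitted for brevity.
# A rule is a triple (head, positive_body, negated_body) of labelled atoms.

def tp(rules, delta):
    # immediate consequence operator (Definition 10) on a set of labelled atoms
    return {h for h, pos, _ in rules if all(a in delta for a in pos)}

def cn(rules):
    # closure Cn of a positive program: least fixpoint of tp
    derived, new = set(), tp(rules, set())
    while new != derived:
        derived = new
        new = tp(rules, derived)
    return derived

def reduct(rules, delta):
    # labelled reduct P_W/Δ: keep a rule (dropping its negated body) iff
    # none of its negated-body atoms occurs in Δ
    return [(h, pos, ()) for h, pos, neg in rules
            if not any(a in delta for a in neg)]

def alternating_fixpoint(rules, herbrand):
    # p underestimates what is true, n overestimates what is not false:
    # P^0 = ∅, N^0 = all labelled atoms, P^{i+1} = Cn(P_W/N^i),
    # N^{i+1} = Cn(P_W/P^i), iterated until both sequences stabilize
    p, n = set(), set(herbrand)
    while True:
        p2 = cn(reduct(rules, n))
        n2 = cn(reduct(rules, p))
        if (p2, n2) == (p, n):
            return p, n
        p, n = p2, n2

# Example: at a single world w, the program {a ← ; b ← a, ∼c}
w = "w"
rules = [((w, "a"), (), ()),
         ((w, "b"), ((w, "a"),), ((w, "c"),))]
herbrand = {(w, x) for x in "abc"}
p, n = alternating_fixpoint(rules, herbrand)
print(sorted(p), sorted(n))   # a and b end up true, c false
```

Since each iteration of either sequence only adds to P^i or removes from N^i, the number of iterations, and hence the overall computation, is polynomial in the number of labelled atoms, in line with the P-completeness result below.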
Footnote 7: Recall that, given an operator T over a lattice (L, ≤) and an ordinal α, T↑α is defined as: T↑0 = ∅, T↑α = T(T↑(α−1)) for successor ordinals, and T↑α = ⋃_{β<α} T↑β for limit ordinals.

Footnote 9: This also aligns well with related work, e.g., for reasoning with time, such as stream reasoning, where often a finite timeline is assumed, and it avoids the exponential explosion in the number of worlds for satisfiability for some epistemic structures (Vardi 1989).

Note that these results do not require that intensional operators be monotonic or deterministic in the head. In fact, if intensional operators are monotonic, we obtain, via Prop. 2, the following results improving on Prop. 7 and Thm. 3.

Corollary 1. Given a program P such that all operators occurring in P are monotonic, and an interpretation I, deciding whether I is a stable model of P is in P.

Corollary 2. Given a program P, deciding whether there is a stable model of P is in NP.

Thus, if all operators are monotonic, the complexity results coincide with those for normal logic programs (without intensional atoms) (Dantsin et al. 2001), which indicates that monotonic operators do not add an additional burden in terms of computational complexity. Now, if we in addition consider programs that are deterministic in the head, then we know that the unique well-founded model exists (cf. Thm. 1). As we have shown, this model can be computed efficiently (cf. Thm. 2), and we obtain the following result in terms of computational complexity.

Theorem 4. Given a program P that is deterministic in the head and such that all operators occurring in P are monotonic, computing the well-founded model of P is P-complete.

Note that this result is indeed crucial in contexts where reasoning with a variety of intensional concepts needs to be highly efficient.

6 RELATED WORK

In this section, we discuss related work, establishing relations to relevant formalisms in the literature.

Intensional logic programs were first defined by Orgun and Wadge [1992], focussing on the existence of models as a function of the properties of the intensional operators. Only positive programs are considered, but nesting of intensional operators is allowed. The latter can, however, be covered in our approach by introducing corresponding additional operators that represent each nesting occurring in such a program. This allows us to show that our approach covers the previous work.

Proposition 8. Let P be a program as in (Orgun and Wadge 1992). Then there is a positive intensional program P′ such that there is a one-to-one correspondence between the models of P and the total stable models of P′.

The converse does not hold, already for programs without intensional operators. We could use a non-monotonic intensional operator for representing default negation, but such operators are not considered in (Orgun and Wadge 1992), confirming that our work is indeed an extension of the previous approach. Since (Orgun and Wadge 1992) covers classical approaches for intensional reasoning, such as TempLog (Abadi and Manna 1989) and MoLog (del Cerro 1986), our work applies to these as well. Our work also relates to more recent approaches with intensional operators, and we first discuss two prominent ones from the area of stream reasoning.

LARS (Beck, Dao-Tran, and Eiter 2018) assumes a set of atoms A and a stream S = (T, v), where T is a closed interval of the natural numbers and v is an evaluation function that defines which atoms are true at each time point of T. Several temporal operators are defined, including expressive window operators, and answer streams, a generalization of FLP-semantics, are employed for reasoning. A number of related approaches are covered, including CQL (Arasu, Babu, and Widom 2006), C-SPARQL (Barbieri et al. 2010), and CQELS (Phuoc et al. 2011). Among the implementations is LASER (Bazoobandi, Beck, and Urbani 2017), which focuses on a considerable fragment called plain LARS. We can represent a plain LARS program P for stream S as a program PS, encoding S using the @t operator. This allows us to show the following result, relating the answer streams of plain LARS to the total stable models of such a program PS.

Proposition 9. Given a plain LARS program P for stream S and PS, there is a one-to-one correspondence between the answer streams of P for S and the total stable models of PS.

In addition, for such an encoding of plain LARS programs into intensional programs, we can apply our well-founded semantics, since the operators applied in plain LARS are monotonic and deterministic. Hence, our work also provides a well-founded semantics for plain LARS, i.e., we allow the usage of unrestricted default negation while preserving polynomial reasoning.

ETALIS (Anicic et al. 2012) aims at complex event processing. It assumes as input atomic events with a time stamp and uses complex events, based on Allen's interval algebra (Allen 1990), that are associated with a time interval; it is therefore considerably different from LARS (which considers time points). It contains no negation in the traditional sense, but allows for a negated pattern among the events. Many of the complex event patterns from ETALIS can be captured as neighborhood functions in our framework. However, ETALIS also makes use of some event patterns that would result in a non-monotonic operator, such as the negated pattern not(p)[q, r], which expresses that p is not the case in the interval between the end time of q and the starting time of r. We conjecture that such a negation can be modelled with a combination of the default negation ∼ and an operator [q, r]p, which expresses that p is the case in the interval between the end time of q and the starting time of r, and which in turn can be defined using rules such as:

[q, r]p ← [t, t′]p, @t q, @t′ r, ∼@t+1 q, ∼@t′−1 r.

Defining a transformation that converts a set of ETALIS rules into an intensional logic program is left for future work.

The deontic logic programs of (Gonçalves and Alferes 2012) are similar in spirit to our work, as they extend logic programs with deontic logic formulas under the stable model semantics. Although complex deontic formulas can appear in the rules, the deontic operators are restricted to those of Standard Deontic Logic (SDL), and computational aspects are not considered.

Answer Set Programming Modulo Theories extended to the Qualitative Spatial Domain, in short ASPMT(QS) (Walega, Schultz, and Bhatt 2017), allows for the systematic modelling of dynamic spatial systems. It is based on logic programs over first-order formulas (with function symbols), which are not yet integrated in our approach. On the other hand, that work only considers spatial reasoning. An interesting option for future work would be to consider an extension incorporating such formulas.

7 CONCLUSIONS

Building on work by Orgun and Wadge (1992), we have introduced intensional logic programs that allow defeasible reasoning with intensional concepts, such as time, space, and obligations, and with streams of data. Relying on neighborhood semantics (Pacuit 2017), we have introduced a novel three-valued semantics based on ideas adapted from partial stable models (Przymusinski 1991). Due to the expressivity of the intensional operators, stable models may not be minimal nor deterministic, even for programs without default negation. Hence, we have studied the characteristics of our semantics for monotonic intensional operators and programs that only admit deterministic operators in the heads of rules, and shown that a unique minimal model, the well-founded model, exists and can be computed with an alternating fixpoint construction. We have studied the computational complexity of checking for the existence of models and of computing models, and established that the well-founded model can be computed in polynomial time. Finally, we have discussed related work and shown that several relevant approaches in the literature can be covered.

In terms of future work, we want to investigate in more detail the exact relations to existing approaches in the literature that are not formally covered in this paper. Furthermore, this work can be generalized in several directions, for example, by allowing first-order formulas instead of essentially propositional formulas (which is what programs with constants and variables over a finite instantiation domain amount to), and by allowing nested, non-deterministic, and non-monotonic intensional operators. We may also want to consider intensional operators with multiple minimal neighborhoods, by defining IE∇ as a non-deterministic operator that extracts a minimal neighborhood W′ ∈ θ∇(w′). In that case, of course, the alternating fixpoint construction as it is defined now might not result in a unique well-founded model. However, the occurrence of non-deterministic operators in the heads of rules is very similar to disjunctive logic programs, where the truth of the head of a rule can also be guaranteed by a choice among different atoms (the disjuncts) being made true. Therefore, we plan to look at techniques from disjunctive logic programming to generate unique well-founded extensions (cf. references in (Knorr and Hitzler 2007)). Finally, the integration with taxonomic knowledge in the form of description logic ontologies (Baader et al. 2007) may also be worth pursuing, as applications sometimes require both (see e.g. (Alberti et al. 2011; Alberti et al. 2012; Kasalica et al. 2019)). Hybrid MKNF knowledge bases (Motik and Rosati 2010) are a prominent approach for combining non-monotonic rules and such ontologies, and the well-founded semantics for these (Knorr, Alferes, and Hitzler 2011), also based on an alternating fixpoint construction, together with its efficient implementation (Kasalica et al. 2020), may prove fruitful for such an endeavour.

Acknowledgments

The authors are indebted to the anonymous reviewers of this paper for helpful feedback. The authors were partially supported by FCT project RIVER (PTDC/CCI-COM/30952/2017) and by FCT project NOVA LINCS (UIDB/04516/2020). J. Heyninck was also supported by the German National Science Foundation under the DFG-project CAR (Conditional Argumentative Reasoning) KE-1413/11-1.

References

Abadi, M., and Manna, Z. 1989. Temporal logic programming. J. Symb. Comput. 8(3):277–295.
Alberti, M.; Gomes, A. S.; Gonçalves, R.; Leite, J.; and Slota, M. 2011. Normative systems represented as hybrid knowledge bases. In CLIMA, volume 6814 of LNCS, 330–346. Springer.
Alberti, M.; Knorr, M.; Gomes, A. S.; Leite, J.; Gonçalves, R.; and Slota, M. 2012. Normative systems require hybrid knowledge bases. In AAMAS, 1425–1426. IFAAMAS.
Allen, J. F. 1990. Maintaining knowledge about temporal intervals. In Readings in Qualitative Reasoning about Physical Systems. Elsevier. 361–372.
Anicic, D.; Rudolph, S.; Fodor, P.; and Stojanovic, N. 2012. Stream reasoning and complex event processing in ETALIS. Semantic Web 3(4):397–407.
Arasu, A.; Babu, S.; and Widom, J. 2006. The CQL continuous query language: semantic foundations and query execution. VLDB J. 15(2):121–142.
Baader, F.; Calvanese, D.; McGuinness, D. L.; Nardi, D.; and Patel-Schneider, P. F., eds. 2007. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2nd edition.
Barbieri, D. F.; Braga, D.; Ceri, S.; Valle, E. D.; and Grossniklaus, M. 2010. C-SPARQL: a continuous query language for RDF data streams. Int. J. Semantic Computing 4(1):3–25.

Bazoobandi, H. R.; Beck, H.; and Urbani, J. 2017. Expressive stream reasoning with LASER. In Procs. of ISWC, volume 10587 of LNCS, 87–103. Springer.

Beck, H.; Dao-Tran, M.; and Eiter, T. 2018. LARS: A logic-based framework for analytic reasoning over streams. Artif. Intell. 261:16–70.

Beirlaen, M.; Heyninck, J.; and Straßer, C. 2019. Structured argumentation with prioritized conditional obligations and permissions. Journal of Logic and Computation 29(2):187–214.

Brandt, S.; Kalayci, E. G.; Ryzhikov, V.; Xiao, G.; and Zakharyaschev, M. 2018. Querying log data with metric temporal logic. J. Artif. Intell. Res. 62:829–877.

Brenton, C.; Faber, W.; and Batsakis, S. 2016. Answer set programming for qualitative spatio-temporal reasoning: Methods and experiments. In Technical Communications of ICLP, volume 52 of OASICS, 4:1–4:15. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik.

Brewka, G.; Ellmauthaler, S.; Gonçalves, R.; Knorr, M.; Leite, J.; and Pührer, J. 2018. Reactive multi-context systems: Heterogeneous reasoning in dynamic environments. Artif. Intell. 256:68–104.

Caminada, M.; Sá, S.; Alcântara, J.; and Dvořák, W. 2015. On the equivalence between logic programming semantics and argumentation semantics. International Journal of Approximate Reasoning 58:87–111.

Chellas, B. F. 1980. Modal Logic: An Introduction. Cambridge University Press.

Chen, Y.; Wan, H.; Zhang, Y.; and Zhou, Y. 2010. dl2asp: implementing default logic via answer set programming. In European Workshop on Logics in Artificial Intelligence, 104–116. Springer.

Dantsin, E.; Eiter, T.; Gottlob, G.; and Voronkov, A. 2001. Complexity and expressive power of logic programming. ACM Comput. Surv. 33(3):374–425.

del Cerro, L. F. 1986. MOLOG: A system that extends PROLOG with modal logic. New Generation Comput. 4(1):35–50.
Gelder, A. V.; Ross, K. A.; and Schlipf, J. S. 1991. The well-founded semantics for general logic programs. J. ACM 38(3):620–650.

Gelder, A. V. 1989. The alternating fixpoint of logic programs with negation. In Procs. of SIGACT-SIGMOD-SIGART, 1–10. ACM Press.

Gelfond, M., and Lifschitz, V. 1991. Classical negation in logic programs and disjunctive databases. New Generation Comput. 9(3-4):365–385.

Gelfond, M. 2008. Answer sets. In Handbook of Knowledge Representation, volume 3 of Foundations of Artificial Intelligence. Elsevier. 285–316.

Gonçalves, R., and Alferes, J. J. 2012. Specifying and reasoning about normative systems in deontic logic programming. In Procs. of AAMAS, 1423–1424. IFAAMAS.

Gonçalves, R.; Knorr, M.; and Leite, J. 2014. Evolving multi-context systems. In ECAI, volume 263 of Frontiers in Artificial Intelligence and Applications, 375–380. IOS Press.

Governatori, G.; Rotolo, A.; and Riveret, R. 2018. A deontic argumentation framework based on deontic defeasible logic. In International Conference on Principles and Practice of Multi-Agent Systems, 484–492. Springer.

Izmirlioglu, Y., and Erdem, E. 2018. Qualitative reasoning about cardinal directions using answer set programming. In Procs. of AAAI, 1880–1887. AAAI Press.

Kasalica, V.; Gerochristos, I.; Alferes, J. J.; Gomes, A. S.; Knorr, M.; and Leite, J. 2019. Telco network inventory validation with NoHR. In LPNMR, volume 11481 of LNCS, 18–31. Springer.

Kasalica, V.; Knorr, M.; Leite, J.; and Lopes, C. 2020. NoHR: An overview. Künstl. Intell.

Knorr, M.; Alferes, J. J.; and Hitzler, P. 2011. Local closed world reasoning with description logics under the well-founded semantics. Artif. Intell. 175(9-10):1528–1554.

Knorr, M., and Hitzler, P. 2007. A comparison of disjunctive well-founded semantics. In FAInt, volume 277 of CEUR Workshop Proceedings. CEUR-WS.org.

McNamara, P. 2019. Deontic logic. In Zalta, E. N., ed., The Stanford Encyclopedia of Philosophy.
Metaphysics Research Lab, Stanford University, summer 2019 edition. Motik, B., and Rosati, R. 2010. Reconciling description logics and rules. J. ACM 57(5):30:1–30:62. Orgun, M. A., and Wadge, W. W. 1992. Towards a unified theory of intensional logic programming. The Journal of Logic Programming 13(4):413–440. Pacuit, E. 2017. Neighborhood semantics for modal logic. Springer. Panagiotidi, S.; Nieves, J. C.; and Vázquez-Salceda, J. 2009. A framework to model norm dynamics in answer set programming. In MALLOW. Phuoc, D. L.; Dao-Tran, M.; Parreira, J. X.; and Hauswirth, M. 2011. A native and adaptive approach for unified processing of linked streams and linked data. In Procs. of ISWC, volume 7031 of LNCS, 370–388. Springer. Przymusinski, T. C. 1991. Stable semantics for disjunctive programs. New Generation Comput. 9(3/4):401–424. Suchan, J.; Bhatt, M.; Walega, P. A.; and Schultz, C. P. L. 2018. Visual explanation by high-level abduction: On answer-set programming driven reasoning about moving objects. In Procs. of AAAI, 1965–1972. AAAI Press. Vardi, M. Y. 1989. On the complexity of epistemic reasoning. In Procs. of LICS, 243–252. IEEE Computer Society. Walega, P. A.; Kaminski, M.; and Grau, B. C. 2019. Reasoning over streaming data in metric temporal datalog. In Procs. of AAAI, 3092–3099. AAAI Press. Walega, P. A.; Schultz, C. P. L.; and Bhatt, M. 2017. Nonmonotonic spatial reasoning with answer set programming modulo theories. TPLP 17(2):205–225. 
Obfuscating Knowledge in Modular Answer Set Programming∗

Ricardo Gonçalves1, Tomi Janhunen2, Matthias Knorr1, João Leite1, Stefan Woltran3
1 Universidade Nova de Lisboa  2 Tampere University  3 Vienna University of Technology
{rjrg,mkn,jleite}@fct.unl.pt, tomi.janhunen@tuni.fi, woltran@dbai.tuwien.ac.at

∗ This paper has been published in the Proceedings of the Thirty-third AAAI Conference on Artificial Intelligence (AAAI), 2019 (Gonçalves et al. 2019).

Abstract

Modular programming facilitates the creation and reuse of large software, and has recently gathered considerable interest in the context of Answer Set Programming (ASP). In this setting, forgetting, or the elimination of middle variables no longer deemed relevant, is of importance as it allows one to, e.g., simplify a program, make it more declarative, or even hide some of its parts without affecting the consequences for those parts that are relevant. While forgetting in the context of ASP has been extensively studied, its known limitations make it unsuitable to be used in Modular ASP. In this paper, we present a novel class of forgetting operators and show that such operators can always be successfully applied in Modular ASP to forget all kinds of atoms – input, output and hidden – overcoming the impossibility results that exist for general ASP. Additionally, we investigate conditions under which this class of operators preserves the module theorem in Modular ASP, thus ensuring that answer sets of modules can still be composed, and how the module theorem can always be preserved if we further allow the reconfiguration of modules.

1 Introduction

Modularity in Answer Set Programming (ASP) (Dao-Tran et al. 2009; Harrison and Lierler 2016; Baral, Dzifcak, and Takahashi 2006; Janhunen et al. 2009; Oikarinen and Janhunen 2008), just as in many other programming paradigms, is a fundamental concept to ease the creation and reuse of large programs. In one of the most significant general approaches to modularity – the so-called programming-in-the-large – compositional operators are provided for combining separate and independent modules, i.e., essentially answer set programs extended with well-defined input/output interfaces, based on standard semantics. The compositionality of the semantics of individual modules is ensured by the so-called module theorem (Janhunen et al. 2009).

The operation of forgetting, which aims at eliminating a set of variables from a knowledge base while preserving all relationships (direct and indirect) between the remaining variables, has recently gained a lot of attention, not only because it is useful, e.g., as a means to clean up a theory by eliminating all auxiliary variables that have no relevant declarative meaning, but also because it may be necessary, e.g., as a means to deal with privacy and legal issues such as to eliminate illegally obtained data, or to comply with the recently enacted right to be forgotten (European Union 2016).

Whereas forgetting in the context of classical logic is essentially a solved problem (Bledsoe and Hines 1980; Weber 1986; Middeldorp, Okui, and Ida 1996; Lang, Liberatore, and Marquis 2003; Moinard 2007; Gabbay, Schmidt, and Szalas 2008), new challenging issues arise when it is considered in the context of a non-monotonic logic based language such as ASP (Zhang and Foo 2006; Eiter and Wang 2008; Wong 2009; Wang, Wang, and Zhang 2013; Knorr and Alferes 2014; Wang et al. 2014; Delgrande and Wang 2015; Gonçalves, Knorr, and Leite 2016b). According to (Goncalves, Knorr, and Leite 2016a), forgetting in ASP is best captured by strong persistence (Knorr and Alferes 2014), a property inspired by strong equivalence, which requires that there be a correspondence between the answer sets of a program before and after forgetting a set of atoms, and that such correspondence be preserved in the presence of additional rules not containing the atoms to be forgotten. However, it has also been shown that, in ASP, it is not always possible to forget and satisfy strong persistence (Gonçalves, Knorr, and Leite 2016b).

What about forgetting in Modular ASP? Do the same negative results hold, and is it sometimes simply impossible to forget while satisfying strong persistence? Is strong persistence an adequate requirement in the case of Modular ASP? Can forgetting be reconciled with the module theorem?

Investigating forgetting in the context of Modular ASP is the central topic of this paper. Our main contributions are:

• We argue that, given that the input of a module is just a set of facts, strong persistence is too strong when forgetting in Modular ASP, and that it is more suitable to rely on uniform equivalence (Sagiv 1988; Eiter and Fink 2003) for a weaker form of persistence, say uniform persistence, which has not been considered before.

• We thoroughly investigate forgetting in ASP under uniform equivalence, including formalizing uniform persistence and showing that, unlike with strong persistence, it is always possible to forget under this new property.

• We show that no previously known class of forgetting operators satisfies uniform persistence, which leads us to introduce a new class of forgetting operators that satisfies uniform persistence, and investigate its other properties.

• We employ the newly introduced class of operators to forget in a prominent approach of modular ASP, DLP-functions (Janhunen et al. 2009), and show how it can adequately be used to forget input, output, and hidden atoms from a module, while obeying uniform persistence.

• We also show that, not unexpectedly, the module theorem no longer holds in general after forgetting.

• To overcome the latter problem, we investigate ways to modify modules so that the module theorem can be preserved while forgetting under uniform persistence, i.e., ways to reconfigure ASP modules by merging and splitting modules, so that we can properly forget while preserving the compositionality of stable models of modules.

2 Preliminaries

We start by recalling some notions about logic programs. An (extended) rule r is an expression of the form

a1 ∨ ... ∨ an ← b1, ..., bm, not c1, ..., not ck, not not d1, ..., not not dl,   (1)

where a1, ..., an, b1, ..., bm, c1, ..., ck, and d1, ..., dl are atoms of a given propositional alphabet A. Note that double negation is standard in the context of forgetting in ASP. We also write such rules as A ← B, not C, not not D where A = {a1, ..., an}, B = {b1, ..., bm}, C = {c1, ..., ck}, and D = {d1, ..., dl}. An (extended) logic program is a finite set of rules. By A(P) we denote the set of atoms appearing in P and by Ce the class of extended programs. We call r disjunctive if D = ∅; normal if, additionally, A has at most one element; Horn if on top of that C = ∅; and fact if also B = ∅. The classes of disjunctive, normal and Horn programs, Cd, Cn, and CH, are then defined as usual.

Given a program P and an interpretation I, i.e., a set I ⊆ A, the reduct P^I is defined as P^I = {A ← B | A ← B, not C, not not D ∈ P, C ∩ I = ∅, D ⊆ I}. An interpretation I is a model of a rule A ← B if A ∩ I ≠ ∅ whenever B ⊆ I; I is a model of a reduct R if it satisfies every rule of R; I is a minimal model of the reduct R if I is a model of R and there is no model I′ of R s.t. I′ ⊂ I; and I is an answer set of an extended program P if it is a minimal model of the reduct P^I. The set of all answer sets of a program P is denoted by AS(P). Given a set of atoms V, the V-exclusion of a set of sets M, denoted M∥V, is {X\V | X ∈ M}. Two programs P1 and P2 are said to be equivalent if AS(P1) = AS(P2), strongly equivalent, denoted by P1 ≡ P2, if AS(P1 ∪ R) = AS(P2 ∪ R) for any R ∈ Ce, and uniformly equivalent, denoted by P1 ≡u P2, if AS(P1 ∪ R) = AS(P2 ∪ R), for any set of facts R.

An HT-interpretation is a pair ⟨X, Y⟩ s.t. X ⊆ Y ⊆ A. Given a program P, an HT-interpretation ⟨X, Y⟩ is an HT-model of P if Y ⊨ P and X ⊨ P^Y, where ⊨ stands for the classical satisfaction relation for rules. The set of all HT-models of P is denoted by HT(P). Also, Y ∈ AS(P) iff ⟨Y, Y⟩ ∈ HT(P) and there is no X ⊂ Y s.t. ⟨X, Y⟩ ∈ HT(P). Also, HT(P1) = HT(P2) precisely when P1 ≡ P2 (Lifschitz, Pearce, and Valverde 2001). Given a set of atoms V, the V-exclusion of a set of HT-interpretations M, M∥V, is {⟨X\V, Y\V⟩ | ⟨X, Y⟩ ∈ M}.

A forgetting operator over a class C of programs over A is a partial function f : C × 2^A → C s.t. the result of forgetting about V from P, f(P, V), is a program over A(P)\V, for each P ∈ C and V ⊆ A. We denote the domain of f by C(f) and usually we focus on C = Ce, and leave C implicit. The operator f is called closed for C′ ⊆ C(f) if f(P, V) ∈ C′, for every P ∈ C′ and V ⊆ A. A class F of forgetting operators (over C) is a set of forgetting operators f s.t. C(f) ⊆ C.

We recall notions of modules using ELP-functions, a generalization of DLP-functions (Janhunen et al. 2009).¹ An ELP-function, Π, is a quadruple ⟨P, I, O, H⟩, where I, O, and H are pairwise distinct sets of input atoms, output atoms, and hidden atoms, respectively, and P is a logic program s.t. for each rule A ← B, not C, not not D of P:

1. A ∪ B ∪ C ∪ D ⊆ I ∪ O ∪ H, and
2. if A ≠ ∅, then A ∩ (O ∪ H) ≠ ∅.

Input atoms and output atoms are also called visible atoms. An interpretation for an ELP-function Π = ⟨P, I, O, H⟩ is an arbitrary set M ⊆ A(Π), where A(Π) = I ∪ O ∪ H. We denote by Ai(Π), Ao(Π), Ah(Π), and by Mi, Mo, Mh the subsets of A(Π) and M restricted to elements in I, O, and H, respectively. Given an ELP-function Π = ⟨P, I, O, H⟩ and an interpretation M, the reduct of Π w.r.t. M is the ELP-function Π^M = ⟨P^M, I, O, H⟩, where P^M is the reduct of P w.r.t. M. An interpretation N is a model of Π^M iff N is a model of P^M. A model N of Π^M is I-minimal iff there is no model N′ of Π^M such that N′i = Ni and N′ ⊂ N. An interpretation M is a stable model² of Π iff M is an I-minimal model of Π^M. The set of all stable models of Π is denoted by SM(Π). We have M ∈ SM(Π) iff M ∈ AS(P ∪ Mi) (Lierler and Truszczynski 2011).

¹ While we limit our generalization to extended logic programs to the necessary notions for individual modules, we do not foresee major difficulties for other aspects left out of scope of this paper.
² We reserve the term “answer set” for programs and the term “stable model” for ELP-functions to ease the reading.

Given a program P and a set of atoms S, the set of defining rules for S is DefP(S) = {A ← B, not C, not not D ∈ P | A ∩ S ≠ ∅}. Two ELP-functions Π1 = ⟨P1, I1, O1, H1⟩ and Π2 = ⟨P2, I2, O2, H2⟩ respect the input/output interfaces of each other iff (1) (I1 ∪ O1 ∪ H1) ∩ H2 = ∅; (2) (I2 ∪ O2 ∪ H2) ∩ H1 = ∅; (3) O1 ∩ O2 = ∅; (4) DefP1(O1) = DefP1∪P2(O1); and (5) DefP2(O2) = DefP1∪P2(O2). Let Π1 = ⟨P1, I1, O1, H1⟩ and Π2 = ⟨P2, I2, O2, H2⟩ be ELP-functions that respect the input/output interfaces of each other. The composition Π1 ⊕ Π2 is defined as

⟨P1 ∪ P2, (I1 \ O2) ∪ (I2 \ O1), O1 ∪ O2, H1 ∪ H2⟩.

The join ⊔ of modules builds on this composition imposing further restrictions. The positive dependency graph of an ELP-function Π = ⟨P, I, O, H⟩ is the pair DG+(Π) = ⟨O ∪ H, ≤1⟩, where b ≤1 a holds for a, b ∈ (O ∪ H) iff there is a rule A ← B, not C, not not D ∈ P s.t. a ∈ A and b ∈ B. The reflexive and transitive closure of ≤1 provides the dependency relation ≤ over output and hidden atoms.
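For concreteness, the reduct-based definition of answer sets recalled in the preliminaries can be executed directly by brute-force enumeration over all interpretations. The encoding of rules as quadruples of frozensets is our own illustrative choice, not the paper's, and this is only practical for tiny alphabets.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of s as frozensets, smallest first."""
    s = sorted(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

def reduct(program, i):
    """P^I = {A <- B | (A <- B, not C, not not D) in P, C ∩ I = ∅, D ⊆ I}."""
    return [(a, b) for (a, b, c, d) in program if not (c & i) and d <= i]

def is_model(positive_program, i):
    """I models A <- B iff A ∩ I ≠ ∅ whenever B ⊆ I."""
    return all(a & i or not b <= i for (a, b) in positive_program)

def answer_sets(program, atoms):
    """I is an answer set iff I is a minimal model of the reduct P^I."""
    return [set(i) for i in powerset(atoms)
            if is_model(reduct(program, i), i)
            and not any(is_model(reduct(program, i), j)
                        for j in powerset(i) if j < i)]

# Example 1 of the paper: a <- p, b <- q, p <- not q, q <- not p,
# encoded as (head, body, neg, double-neg) quadruples of frozensets.
P = [(frozenset("a"), frozenset("p"), frozenset(), frozenset()),
     (frozenset("b"), frozenset("q"), frozenset(), frozenset()),
     (frozenset("p"), frozenset(), frozenset("q"), frozenset()),
     (frozenset("q"), frozenset(), frozenset("p"), frozenset())]
print(answer_sets(P, "abpq"))  # the two answer sets {a,p} and {b,q}
```

Note that the double-negation atoms D only survive the reduct when D ⊆ I, which is what makes rules like b ← not not b behave as choices.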
A strongly connected component (SCC) S of DG+ (Π) is a maximal set S ⊆ Ao (Π) ∪ Ah (Π) s.t. b ≤ a for all pairs a, b ∈ S. If Π1 ⊕ Π2 is defined, then Π1 and Π2 are mutually dependent iff DG+ (Π1 ⊕ Π2 ) has an SCC S s.t. S ∩ Ao (Π1 ) 6= ∅ and S ∩ Ao (Π2 ) 6= ∅, and mutually independent otherwise. Thus, given ELP-functions Π1 and Π2 , if the composition Π1 ⊕ Π2 is defined and Π1 and Π2 are mutually independent, then the join Π1 ⊔ Π2 of Π1 and Π2 is defined and coincides with Π1 ⊕ Π2 (Janhunen et al. 2009). 3 Due to this negative result and the fact that it is not always possible to forget while satisfying (SP), the question that arises is whether this is actually different for (UP), given that it is less demanding in its requirements. Example 1. Consider program P used in the impossibility result for (SP) (Gonçalves, Knorr, and Leite 2016b): a←p b←q p ← not q q ← not p Adding program R = {a ←; b ←}, it is shown there that any result of forgetting {p, q} from P , f(P, {p, q}), that satisfies (SP) is required to have an HT-model hab, abi5 . At the same time, since {a, b} (modulo {p, q}) is not an answer set of P , we must have hX, abi ∈ HT (f(P, {p, q})) for at least one X ⊂ {a, b}, to prevent {a, b} from being an answer set of f(P, {p, q}). It is then shown that due to different programs R, hX, abi 6∈ HT (f(P, {p, q})) for any such X, thus causing a contradiction. However, in the case of X = ∅, R = {a ← b; b ← a} is used, which is not a set of facts and thus not relevant w.r.t. (UP). In fact, given the only possible four sets of facts over {a, b} to be considered for R, we can verify that P ′ = {a ← not b; a ← not not a, b; b ← not a; b ← not not b, a} is a result of forgetting {p, q} from P for which the condition of (UP) is satisfied. 
Forgetting under Uniform Persistence Arguably, among the many properties for forgetting in ASP, strong persistence is the one that should intuitively hold, since it imposes the preservation of all original direct and indirect dependencies between atoms not to be forgotten. Here and in the sequel, F is a class of forgetting operators. (SP) F satisfies Strong Persistence if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have AS(f(P, V ) ∪ R) = AS(P ∪ R)kV , for all programs R ∈ C(f) with A(R) ⊆ A\V . Essentially, (SP) requires that the answer sets of f(P, V ) correspond to those of P , no matter what programs R over A\V we add to both, which is closely related to the concept of strong equivalence. However, this property is rather demanding, witnessed by the fact that it cannot always be satisfied (Gonçalves, Knorr, and Leite 2016b). On the other hand, in the case of a module, i.e., an ELP-function, its program P is fixed, and we only vary the input, which is closely related to considering a fixed ASP program, encoding the declarative specification of a problem, and only varying the instances corresponding to the specific problem to be solved. This is captured by the notion of uniform equivalence, which weakens strong equivalence by considering that only facts can be added. To investigate forgetting in such cases, we introduce Uniform Persistence, (UP), obtained from (SP) by restricting the varying programs R to sets of facts. A naive approach to define a class of forgetting operators that satisfies (UP) would be to use relativized uniform equivalence (Eiter, Fink, and Woltran 2007), which is close in spirit to (UP). However, this would not work, for the same reasons that a similar approach based on relativized strong equivalence fails to capture (SP) (Gonçalves et al. 2017; Gonçalves et al. 2020). 
Instead, we define a class of forgetting operators that satisfies (UP), dubbed FUP , whose more involved definition – that we will gently introduce in an incremental way – builds on the manipulation of HT-models given an input program P and a set of atoms V ⊆ A(P ) to forget. To this end, we aim at devising a mapping from HT (P ) to the set of HTmodels of the result of forgetting, f(P, V ), for any operator f ∈ FUP . This mapping can be illustrated as follows. Example 2. The program P from Ex. 1 has 15 HT-models: (UP) F satisfies Uniform Persistence if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have AS(f(P, V ) ∪ R) = AS(P ∪ R)kV , for all sets of facts R with A(R) ⊆ A\V . hbq, bqi hap, api hap, abpi habp, abpi Having introduced (UP) as the desired property for forgetting in ELP-functions, we now turn our attention to which forgetting operator to use. Unfortunately, none of the existing classes mentioned in the literature3 satisfy (UP).4 hbq, abqi habq, abqi h∅, abpqi ha, abpqi hb, abpqi hbq, abpqi hap, abpqi hab, abpqi habp, abpqi habq, abpqi habpq, abpqi The HT-models for the proposed result P ′ of forgetting are ha, ai, hb, bi, h∅, abi and hab, abi. But how could we determine the latter set of HT-models for any P and V ? Given the HT-models listed above, the set HT (P )kV contains extra tuples such as ha, abi and hb, abi. Thus, a more involved analysis of HT-models is in order. By the definition of (UP), an answer set Y of f(P, V ) ∪ R corresponds to an answer set Y ∪ A of P ∪ R, for some A ⊆ V . We will therefore collect all HT-models hX, Y ∪ Ai in HT (P ) with the same Y and join them in blocks separated Theorem 1. None of the classes F of forgetting operators studied in (Goncalves, Knorr, and Leite 2016a; Gonçalves et al. 2017) satisfy (UP). 3 Cf. the survey on forgetting in ASP (Goncalves, Knorr, and Leite 2016a), (Gonçalves et al. 2017; Gonçalves et al. 2020), and references therein. 4 Note that the result in (Goncalves, Knorr, and Leite 2016a) (Fig. 
1) indicating that class FSas satisfies (SP), the generalization of (UP), is in fact not entirely accurate, since the only known operator in FSas is not defined for a class of programs, but rather for instances of forgetting. 5 We follow a common convention and abbreviate sets in HTinterpretations such as {a, b} with the sequence of its elements, ab. 191 which is why h∅, abi ∈ HT (P ′ ) in Ex. 2 holds. Generalizing this observation, whenever there is a set X s.t. each Y,A ′ ′ NhP,V i contains an element X with X ⊆ X , then adding X as facts to P cannot result in an answer set of P , and thus, hX, Y i should be part of the forgetting result. In Ex. 6, the only such set X is indeed X = ∅. Y,A We thus collect all sets NhP,V i for each Y , define tuples over this set of sets, and intersections over these tuples. The latter correspond to the maximal subsets X, which suffices for uniform equivalence (Eiter, Fink, and Woltran 2007). Definition 1. Let P be a program, V ⊆ A, and Y ⊆ A\V . Y,i Y Consider the indexed family of sets ShP,V i = {NhP,V i }i∈I Y where I = SelhP,V i . For each tuple (Xi )i∈I such that Xi ∈ T Y,i NhP,V i , we define the intersection of its sets as i∈I Xi . We denote by SIntYhP,V i the set of all such intersections. by the varying A. To this end, we first characterize all the different total HT-models of P , namely, for each Y ⊆ A\V : Y SelhP,V i = {A ⊆ V | hY ∪ A, Y ∪ Ai ∈ HT (P )}. Example 3. Given the HT-models (Ex. 2) for P of Ex. 1 and {a} ∅ V = {p, q}, we obtain SelhP,V i = ∅, SelhP,V i = {{p}}, {b} {a,b} SelhP,V i = {{q}}, and SelhP,V i = {{p}, {q}, {p, q}}. Clearly, the total models to be considered in the result Y of forgetting should be restricted to those Y s.t. SelhP,V i is non-empty. But not all these sets should be considered. Example 4. Let P be a program over A = {a, b, p, q} s.t. its HT-models of the form hX, {a, b}∪Ai with A ⊆ V = {p, q} are hab, abpi, habp, abpi, habp, abpqi, and habpq, abpqi. 
We {a,b} have that SelhP,V i = {{p}, {p, q}}. Nevertheless, the nontotal models hab, abpi and habp, abpqi do not allow {a, b, p} and {a, b, p, q} to be answer sets of P ∪ R, for any R over {a,b} A\V = {a, b}. So, although SelhP,V i 6= ∅, the set {a, b} should not be a possible answer set of the forgetting result.6 Taking this observation into account, we define the set of total models for the result of forgetting V from P : The resulting intersections indeed correspond to sets X pointed out in the preceding discussion. Therefore, we obtain the definition of FUP by combining the total models based on ThP,V i and the non-total ones based on SIntYhP,V i , but naturally restricted to those cases where the corresponding total model exists. Definition 2 (UP-Forgetting). Let FUP be the class of forgetting operators defined as: Y ThP,V i = {Y ⊆ A\V | there exists A ∈ SelhP,V i s.t. hY ∪ A′ , Y ∪ Ai ∈ / HT (P ) for every A′ ⊂ A}. Example 5. Based on the HT-models of P listed in Ex. 2, the Y sets SelhP,V i identified in Ex. 3, and V = {p, q}, we observe that ThP,V i = {{a}, {b}, {a, b}}. In each of the three cases, the condition in the definition of ThP,V i is satisfied by some Y element of SelhP,V i . For Y = {a, b} in particular, the set A can be either {p} or {q}, but not {p, q}. Given ThP,V i , we expect three total HT-models for the result of forgetting {p, q} from P , i.e., the ones indicated in Ex. 2 for P ′ . The crucial question now is how to extract the non-total HT-models for the result of forgetting in general. For this Y purpose, for each A ∈ SelhP,V i , we first consider the nontotal HT-models of P of the form hX, Y ∪ Ai: {f |HT (f(P, V )) = ({hY, Y i | Y ∈ ThP,V i } ∪ {hX, Y i | Y ∈ ThP,V i and X ∈ SIntYhP,V i }) for all P ∈ C(f) and V ⊆ A}. Example 7. Recall P from Ex. 1. Following the discussion after Ex. 6, we can verify that the result of forgetting about V = {p, q} from P according to FUP has the expected HTmodels (cf. Ex. 
2): ha, ai, hb, bi, h∅, abi, and hab, abi. The definition of FUP characterizes the HT-models of a result of forgetting for any f ∈ FUP , but not an actual program. This may raise the question whether there actually is such an operator, and we can answer this question positively. To this end, we recall the necessary notions and results related to countermodels in here-and-there (Cabalar and Ferraris 2007), which have been used previously in a similar manner for computing concrete results of forgetting for classes of forgetting operators based on HTmodels (Wang, Wang, and Zhang 2013; Wang et al. 2014; Gonçalves et al. 2020). Essentially, the HT-interpretations that are not HT-models of P (hence the name countermodels) can be used to determine rules, that, if conjoined, result in a program P ′ that is strongly equivalent to P . Let P be a program and X ⊆ Y ⊆ A. An HT-interpretation hX, Y i is an HT-countermodel of P if hX, Y i 6|= P . We also define the following rules: Y,A NhP,V i = {X \V | hX, Y ∪Ai ∈ HT (P ) and X 6= Y ∪A}. Example 6. Continuing Ex. 5, these non-total models, in particular those relevant for the desired result h∅, abi, are: {a,b},{p} NhP,V i {a,b},{q} = {{a}}, NhP,V i {a,b},{p,q} NhP,V i = {{b}}, and = {∅, {a}, {b}, {a, b}}. Now, since HT-models of facts never include h∅, Y i for any Y , we know that any HT-model h∅, Y i of P will not occur in HT (P ∪ R) for any (non-empty) set of facts R. Y,A Hence, either one of the NhP,V i is empty, in which case P itself has an answer set Y modulo V and the result of forY,A getting should have an answer set Y , or ∅ ∈ NhP,V i for any A results in an HT-model h∅, Y i for the result of forgetting, 6 Similar considerations have been used in the context of relativized equivalence (Eiter, Fink, and Woltran 2007) and in forgetting (Gonçalves, Knorr, and Leite 2016b). 
rX,Y = (Y \X) ← X, not (A\Y ), not not (Y \X) (2) rY,Y = ∅ ← Y, not (A\Y ) (3) The relation between these rules and HT-countermodels has been established as follows. 192 Lemma 1 ((Cabalar and Ferraris 2007)). Let X ⊂ Y ⊆ A and U ⊆ V ⊆ A. (W) F satisfies Weakening if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have P |=HT f(P, V ). (PP) F satisfies Positive Persistence if, for each f ∈ F, P ∈ C(f) and V ⊆ A: if P |=HT P ′ , with P ′ ∈ C(f) and A(P ′ ) ⊆ A\V , then f(P, V ) |=HT P ′ . (SI) F satisfies Strong (addition) Invariance if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have f(P, V ) ∪ R ≡ f(P ∪ R, V ) for all programs R ∈ C(f) with A(R) ⊆ A\V . (EC ) F satisfies Existence for C, i.e., F is closed for a class of programs C if there exists f ∈ F s.t. f is closed for C. (CP) F satisfies Consequence Persistence if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have AS(f(P, V )) = AS(P )kV . (wC) F satisfies weakened Consequence if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have AS(P )kV ⊆ AS(f(P, V )). Note that P |=HT P ′ holds if HT (P ) ⊆ HT (P ′ ), in which case P ′ is said to be a HT-consequence of P . We obtain that FUP satisfies the following properties. Proposition 3. FUP satisfies (sC), (wE), (SE), (CP), (wC), (ECe ), (ECH ), but not (W), (PP), (SI), (SP), (ECd ), (ECn ). Given the close connection between the class FUP and uniform equivalence (cf. Thm. 3), it is not surprising that some properties of forgetting that are closely connected to strong equivalence are not satisfied by FUP , notably (PP) and (SI), which are satisfied by the class of forgetting operators defined for forgetting w.r.t. (SP) when forgetting is possible (Gonçalves, Knorr, and Leite 2016b). Finally, we obtain that deciding whether a program is the result of forgetting for f ∈ FUP is in ΠP 3. Theorem 4. Given programs P , Q, and V ⊆ A, deciding whether P ≡ f(Q, V ) for f ∈ FUP is in ΠP 3. 
Note that the same problem for the classes of forgetting operators that approximate forgetting under (SP) is ΠP 3complete (Gonçalves et al. 2017; Gonçalves et al. 2020). Also, by (Wang et al. 2014) and Prop. 2, if Q is Horn, then this problem is in ΠP 1. (i) hU, V i is an HT-countermodel of rX,Y iff U = X and V =Y. (ii) hU, V i is an HT-countermodel of rY,Y iff V = Y . This allows us to determine a program for a set of HTmodels provided such program exists. Recall that not all sets of HT-interpretations correspond to the set of HT-models of some program. A set of HT-interpretations S is HTexpressible iff hX, Y i ∈ S implies hY, Y i ∈ S. In this case, we are able to determine a corresponding program. Proposition 1 ((Cabalar and Ferraris 2007)). Let M be a set of HT-interpretations which is HT-expressible and define the program PM as PM = {rX,Y | hX, Y i ∈ / M and hY, Y i ∈ M } ∪ {rY,Y | hY, Y i ∈ / M }. Then, HT (PM ) = M . Note that according to Def. 2, the HT-models of the forgetting result for any operator in FUP are HT-expressible. Thus, based on these ideas, we can define a concrete operator that belongs to the class FUP . Theorem 2. There exists f such that f ∈ FUP . While the definition of UP-Forgetting itself is certainly non-trivial, it turns out that for the case of Horn programs, a considerably simpler definition can be used. Proposition 2. Let f be in FUP . Then, for every V ⊆ A: HT (f(P, V )) = HT (P )kV for each P ∈ CH . This result serves as further indication that UP-Forgetting is well-defined, given that essentially all classes of forgetting operators coincide with this definition for the class of Horn programs (Goncalves, Knorr, and Leite 2016a). We are able to show that FUP indeed satisfies (UP) which guarantees that, unlike for the property (SP), it is always possible to forget satisfying (UP). Theorem 3. FUP satisfies (UP). 
Despite (SP) being the property that best captures the essence of forgetting in ASP in general, of which (UP) is the weaker version that is sufficient when dealing with modules, other properties have been investigated in the literature (cf. (Gonçalves, Knorr, and Leite 2016a)), which we recall in the following. Let F be a class of forgetting operators.

(sC) F satisfies strengthened Consequence if, for each f ∈ F, P ∈ C(f) and V ⊆ A, we have AS(f(P, V)) ⊆ AS(P)∥V.
(wE) F satisfies weak Equivalence if, for each f ∈ F, P, P′ ∈ C(f) and V ⊆ A, we have AS(f(P, V)) = AS(f(P′, V)) whenever AS(P) = AS(P′).
(SE) F satisfies Strong Equivalence if, for each f ∈ F, P, P′ ∈ C(f) and V ⊆ A: if P ≡ P′, then f(P, V) ≡ f(P′, V).

4 Forgetting in Modules

We now turn our attention to the use of FUP to forget in modules, i.e., ELP-functions. Towards characterizing results of forgetting in modules, the notion of equivalence between ELP-functions – modular equivalence (Janhunen et al. 2009) – first needs to be adapted, since it is too strong: it requires the existence of a bijection between stable models of different ELP-functions, which is not possible in general when reducing the language, as illustrated by the next example.

Example 8. Take Π = ⟨{a ←; b ← not not b}, ∅, {a}, {b}⟩ with SM(Π) = {{a}, {a, b}}. Forgetting b should yield, e.g., Π′ = ⟨{a ←}, ∅, {a}, ∅⟩ with SM(Π′) = {{a}}, but then no bijection between SM(Π) and SM(Π′) is possible.

Therefore, we introduce a novel notion of equivalence for program modules according to which two modules are V-equivalent if they coincide on I and O ignoring V, and if their stable models coincide ignoring V.

Definition 3 (V-Equivalence). Let Π1 and Π2 be ELP-functions, and V a set of atoms. Then, Π1 and Π2 are V-equivalent, denoted by Π1 ≡V Π2, iff
1. Ai(Π1)\V = Ai(Π2)\V and Ao(Π1)\V = Ao(Π2)\V;
2. SM(Π1)∥V = SM(Π2)∥V.

Forgetting from each of the pairwise disjoint sets of atoms considered in a module – input, output and hidden – needs to be characterised in turn. Additionally, in the case of input and output atoms, we also consider hiding them – useful when atoms are not declaratively meaningful outside the module, or should not be shown – and discuss its difference with respect to forgetting them.

We start by showing that the hidden atoms of an ELP-function can be forgotten without affecting its behavior perceived in terms of visible atoms, ensuring that we can deal with cases when we are not allowed to express a certain piece of information in terms of our hidden atoms, or do not want to show it to someone who wants to visualize the program of a module.

Theorem 5 (Forgetting hidden atoms). Given a set V ⊆ H of hidden atoms to forget, an ELP-function Π = ⟨P, I, O, H⟩ is V-equivalent to any ELP-function Π′ = ⟨f(P, V), I, O, H\V⟩ based on a uniformly persistent forgetting operator f ∈ FUP.

Such V-equivalent ELP-functions can also be constructed for input atoms using input generators (cf. (Oikarinen and Janhunen 2006, Thm. 4)), easily encodable with extended rules, and we can forget about input atoms from ELP-functions as follows.

Theorem 8 (Forgetting input atoms). Given a set V ⊆ I of input atoms to forget, an ELP-function Π = ⟨P, I, O, H⟩ is V-equivalent to any Π′ = ⟨f(P ∪ {a ← not not a | a ∈ V}, V), I\V, O, H⟩ based on a uniformly persistent forgetting operator f ∈ FUP.

This construction of Π′ can also be used to hide input atoms.

Theorem 9 (Hiding input atoms). Given a set V ⊆ I of input atoms to hide, an ELP-function Π = ⟨P, I, O, H⟩ is V-equivalent to Π′ = ⟨P ∪ {a ← not not a | a ∈ V}, I\V, O, H ∪ V⟩.

Combining these results, we can now define a general notion of a module resulting from forgetting elements of single parts of a module's interface. From now on, we assume that some forgetting operator f ∈ FUP has been fixed.

Definition 4. Given an ELP-function Π = ⟨P, I, O, H⟩ and a set V of atoms to forget, the ELP-function resulting from forgetting V, also denoted Π\V, is defined as
⟨f(P ∪ {a ← not not a | a ∈ I ∩ V}, V), I\V, O\V, H\V⟩.
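To make V-equivalence (Def. 3) and the hiding construction (Thm. 9) concrete, here is a small Python sketch. The encodings are our own, not from the paper: a module is a tuple (P, I, O, H), stable models are frozensets of atoms, and the input-generator rule a ← not not a is represented symbolically as ("gen", a).

```python
def restrict(models, V):
    # S ∥ V: the models with the atoms in V projected away
    return {frozenset(m - V) for m in models}

def v_equivalent(mod1, sm1, mod2, sm2, V):
    """Def. 3: interfaces and stable models coincide ignoring V."""
    (_, I1, O1, _), (_, I2, O2, _) = mod1, mod2
    return (I1 - V == I2 - V and O1 - V == O2 - V
            and restrict(sm1, V) == restrict(sm2, V))

def hide_inputs(module, V):
    """Thm. 9: hide input atoms V by adding input generators a <- not not a."""
    P, I, O, H = module
    assert V <= I, "V must consist of input atoms"
    generators = {("gen", a) for a in V}  # ("gen", a) stands for: a <- not not a
    return (P | generators, I - V, O, H | V)

# Example 8 revisited: forgetting b yields a {b}-equivalent module.
pi = (frozenset({"a.", "b :- not not b."}),
      frozenset(), frozenset({"a"}), frozenset({"b"}))
sm_pi = {frozenset({"a"}), frozenset({"a", "b"})}
pi2 = (frozenset({"a."}), frozenset(), frozenset({"a"}), frozenset())
sm_pi2 = {frozenset({"a"})}
assert v_equivalent(pi, sm_pi, pi2, sm_pi2, frozenset({"b"}))
```

The check succeeds even though there is no bijection between SM(Π) and SM(Π′), which is exactly why Def. 3 compares stable models only up to the forgotten atoms.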
We can show that this notion indeed fits the expectations.

Corollary 1. For an ELP-function Π and a set of atoms V ⊆ A(Π), we have SM(Π\V) = SM(Π)∥V.

And it follows that we can forget sets of atoms iteratively.

Proposition 4. Let Π be an ELP-function and V ⊆ A(Π). Then, if V1 ∪ V2 = V and V1 ∩ V2 = ∅, we have SM(Π\V) = SM((Π\V1)\V2) = SM((Π\V2)\V1).

But forgetting is also applicable to the visible elements of a module. For instance, whenever output atoms are no longer used by other modules, they can effectively be removed without affecting the behavior of the module.

Theorem 6 (Forgetting output atoms). Given a set V ⊆ O of output atoms to forget, an ELP-function Π = ⟨P, I, O, H⟩ is V-equivalent to any ELP-function Π′ = ⟨f(P, V), I, O\V, H⟩ based on a uniformly persistent forgetting operator f ∈ FUP.

In (Janhunen et al. 2009), it is shown, through the module theorem, that the stable model semantics of modules is fully compositional, which should be preserved under forgetting. In the case of two modules Π1 = ⟨P1, I1, O1, H1⟩ and Π2 = ⟨P2, I2, O2, H2⟩ that do not mention each other's hidden atoms and whose join Π1 ⊔ Π2 is defined (and coincides with the composition Π1 ⊕ Π2), the module theorem states that SM(Π) = SM(Π1) ⊲⊳ SM(Π2), where the join of sets of stable models captured by the operator ⊲⊳ contains M1 ∪ M2 whenever M1 ∈ SM(Π1), M2 ∈ SM(Π2), and M1 and M2 are compatible, i.e., M1 ∩ (I2 ∪ O2) = M2 ∩ (I1 ∪ O1), so that M1 and M2 coincide on visible atoms.

Limited to forgetting atoms that are not shared by two modules, if we consider two modules whose join is defined, then the module theorem can be preserved while forgetting.

Theorem 10. If Π is an ELP-function obtained as a join of two ELP-functions Π1 and Π2, and V ⊆ A(Π) is a set of atoms to forget s.t. V ∩ (I1 ∪ O1) ∩ (I2 ∪ O2) = ∅, then SM(Π\V) = SM(Π1\V) ⊲⊳ SM(Π2\V).

We can generalize this result to deal with cases where atoms to be forgotten appear in more than two modules.
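The join ⊲⊳ on sets of stable models can be sketched directly from its definition (Python, our own encoding; visible1 and visible2 stand for I1 ∪ O1 and I2 ∪ O2):

```python
def compatible(m1, m2, visible1, visible2):
    # M1 ∩ (I2 ∪ O2) = M2 ∩ (I1 ∪ O1): the models agree on visible atoms
    return m1 & visible2 == m2 & visible1

def join(sms1, sms2, visible1, visible2):
    """The operator ⋈ on sets of stable models of two joinable modules."""
    return {frozenset(m1 | m2)
            for m1 in sms1 for m2 in sms2
            if compatible(m1, m2, visible1, visible2)}
```

With the data of Example 10 below (after forgetting b), SM(Π1\V) = {∅, {a}} with visible atoms {a} and SM(Π2\V) = {∅, {c}} with visible atoms {c}; their join contains ∅, which is not a stable model of Π\V, illustrating why only one inclusion survives once forgotten atoms are shared.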
Theorem 11. If Π is an ELP-function obtained as a join of n ELP-functions Π1, . . . , Πn, and V ⊆ A(Π) is a set of atoms to forget s.t., for all i, j ∈ {1, . . . , n}, i ≠ j, V ∩ (Ii ∪ Oi) ∩ (Ij ∪ Oj) = ∅, then SM(Π\V) = ⊲⊳ni=1 SM(Πi\V).

An alternative to forgetting output atoms is hiding them. Given an ELP-function Π = ⟨P, I, O, H⟩ and a set V ⊆ O of output atoms, we could create an ELP-function ⟨P, I, O\V, H ∪ V⟩ where the atoms of V are simply hidden. This would be computationally cheap, since P would not change, but could be regarded as insufficient under the strict interpretation of forgetting V, i.e., that the elements of V should not appear in the result at all. Nevertheless, we derive the following counterpart to Thm. 6.

Theorem 7 (Hiding output atoms). Given a set V ⊆ O of output atoms to hide, an ELP-function Π = ⟨P, I, O, H⟩ is V-equivalent to the ELP-function Π′ = ⟨P, I, O\V, H ∪ V⟩.

Thus, both hiding and forgetting output atoms yield V-equivalent ELP-functions. Turning to forgetting (or hiding) of input atoms, no analogous result exists without making changes to the program.

Example 9. Take Π = ⟨{a ← b}, {b}, {a}, ∅⟩. Then, SM(Π) = {∅, {a, b}}, but moving b from I to H yields Π′ with SM(Π′) = {∅}, which is not {b}-equivalent.

Nevertheless, if we allow programs to change, such V-equivalent ELP-functions can be constructed using the idea of an input generator (cf. (Oikarinen and Janhunen 2006, Thm. 4)).

Yet, if we lift the restrictions on where the atoms to forget appear, we lose a full correspondent to the module theorem.

Theorem 12. If Π is an ELP-function obtained as a join of two ELP-functions Π1 and Π2, and V ⊆ A(Π) is a set of atoms to forget, then SM(Π\V) ⊆ SM(Π1\V) ⊲⊳ SM(Π2\V).

Only one of the two inclusions one would expect actually holds, and this is not by chance. In general, it is possible that modules Π1\V and Π2\V possess compatible stable models M1 and M2 such that M = M1 ∪ M2 ∈ SM(Π1\V) ⊲⊳ SM(Π2\V) but M ∉ SM(Π\V), as illustrated next.

Example 10. Let us consider ELP-functions Π1 = ⟨{a ← b}, {b}, {a}, ∅⟩ and Π2 = ⟨{b ← not c}, {c}, {b}, ∅⟩ and their join Π = ⟨P, {c}, {a, b}, ∅⟩ with P = P1 ∪ P2 for the respective sets of rules P1 and P2 of Π1 and Π2. As regards forgetting V = {b}, we have Π1\V = ⟨{a ← not not a}, ∅, {a}, ∅⟩, Π2\V = ⟨∅, {c}, ∅, ∅⟩, and Π\V = ⟨{a ← not c}, {c}, {a}, ∅⟩. It remains to observe that M1 = ∅ ∈ SM(Π1\V), M2 = ∅ ∈ SM(Π2\V), and M1 ∪ M2 ∉ SM(Π\V) = {{a}, {c}}, although M1 and M2 are (trivially) compatible.

For i, j ∈ N = {1, . . . , n}, define i ∼V j iff V ∩ (Ii ∪ Oi) ∩ (Ij ∪ Oj) ≠ ∅. This relation identifies those ELP-functions that share atoms to forget, i.e., that can cause problems with the module theorem. We denote by ∼∗V the reflexive and transitive closure of ∼V on N. Since ∼V is clearly a symmetric relation, its reflexive and transitive closure, ∼∗V, is an equivalence relation on N. We can therefore consider the quotient set N\∼∗V, i.e., the set of equivalence classes defined by ∼∗V on N. We then consider, for each e ∈ N\∼∗V, the ELP-function Πe = ⊔i∈e Πi, the join of those ELP-functions corresponding to the considered equivalence class. This allows us to prove a relaxed version of the module theorem.

Theorem 13. Let Π be an ELP-function obtained as a join of n ELP-functions Π1, . . . , Πn, and V ⊆ A(Π) a set of atoms to forget. Let N = {1, . . . , n}, and consider ∼∗V the equivalence relation on N as defined previously, and N\∼∗V = {e1, . . . , ek} the respective quotient set. Then, SM(Π\V) = ⊲⊳ki=1 SM(Πei\V).

This shows that joining those modules that share atoms to be forgotten allows for the preservation of the module theorem. Joining entire modules, however, is not ideal.
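The quotient set N\∼∗V can be computed with a standard union-find pass over the module interfaces. A Python sketch under our own encoding, where interfaces[i] stands for Ii ∪ Oi:

```python
def forget_classes(interfaces, V):
    """Equivalence classes of the reflexive-transitive closure of ~V on N."""
    n = len(interfaces)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            # i ~V j iff V ∩ (Ii ∪ Oi) ∩ (Ij ∪ Oj) ≠ ∅
            if V & interfaces[i] & interfaces[j]:
                parent[find(i)] = find(j)
    classes = {}
    for i in range(n):
        classes.setdefault(find(i), set()).add(i)
    return sorted(classes.values(), key=min)
```

For interfaces [{a, b}, {b, c}, {d}] and V = {b}, the first two modules end up in one class and the third stands alone, so Theorem 13 would join the first two before forgetting b.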
However, it may happen that only part of a module is relevant to the shared atom to be forgotten, in which case we can use the operation of decomposing (or splitting) modules to perform a more fine-grained recomposition of modules that still preserves the module theorem. Towards this end, we adapt the necessary notions to introduce module decomposition (Janhunen et al. 2009). Given an ELP-function Π = ⟨P, I, O, H⟩, let SCC+(Π) denote the set of strongly connected components of DG+(Π). The dependency relation ≤ can be lifted to SCC+(Π) by setting S1 ≤ S2 iff there are atoms a1 ∈ S1 and a2 ∈ S2 s.t. a1 ≤ a2. It is easy to check that ≤ is well-defined over SCC+(Π), i.e., it does not depend on the chosen a1 ∈ S1 and a2 ∈ S2, and that ⟨SCC+(Π), ≤⟩ is a partially ordered set, i.e., ≤ is reflexive, transitive, and antisymmetric. For each S ∈ SCC+(Π) we consider the ELP-function ΠS = ⟨DefP(S), A(DefP(S))\S, S ∩ O, S ∩ H⟩. Some of these modules ΠS, however, may share hidden atoms, and therefore cannot be joined. To overcome this, such components of SCC+(Π) need to be identified.

The example suggests that it is not safe to use f ∈ FUP to forget shared atoms that inherently change the I/O interface between the modules. The same is also true for hiding.

Example 11. Consider again Ex. 10. We obtain the three modules in each of which b has been hidden as follows: Π′1 = ⟨{a ← b; b ← not not b}, ∅, {a}, {b}⟩, Π′2 = ⟨{b ← not c}, {c}, ∅, {b}⟩ and (Π1 ⊔ Π2)′ = ⟨{a ← b; b ← not c}, {c}, {a}, {b}⟩. But then Π′1 and Π′2 do not respect the input/output interfaces of each other. We could circumvent this by renaming one of the occurrences of b, but we would also lose the prior dependency of a on c.

5 Module Reconfiguration

Preserving the compositionality of stable models of modules while forgetting is desirable by the very idea of modular ASP: we want users to define ASP modules that can be composed into larger programs/modules.
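The module decomposition above relies on SCC+(Π), the strongly connected components of the positive dependency graph. Computing them is standard; a recursive Tarjan sketch in Python (our own helper, with the graph given as an adjacency dict mapping each atom to the atoms it positively depends on):

```python
def tarjan_sccs(graph):
    """Strongly connected components of a (small) directed graph."""
    index, low = {}, {}
    stack, on_stack = [], set()
    comps = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            comps.append(frozenset(comp))

    for v in graph:
        if v not in index:
            visit(v)
    return comps
```

For the graph a ↔ b with c depending on a, this yields the two components {a, b} and {c}, which would then induce the sub-modules ΠS described above.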
However, as we have seen, the module theorem no longer holds entirely whenever some atom to be forgotten is shared by two modules. In such cases, one alternative is to modify the modules so that these atoms cease to occur in the visible components of different modules, i.e., to reconfigure ASP modules by merging and splitting them, so that we can forget while preserving the compositionality of stable models of modules. Of course, for this to be feasible, we must have access to the modules in question (by communication, or because we own the modules). This may require sharing some information about some module, which may not always be desirable, but, arguably, whenever possible, this is a reasonable trade-off for being able to forget atoms from modules while preserving (UP) and the module theorem.

One way to address the problem, provided all involved modules are mutually independent and their composition is defined, is to join all the modules that contain such atoms. Let Π be an ELP-function obtained as a join of n ELP-functions Π1, . . . , Πn, and V ⊆ A(Π) a set of atoms to forget, and consider the relation ∼V on N = {1, . . . , n} introduced above.

Definition 5. Given an ELP-function Π = ⟨P, I, O, H⟩, components S1, S2 ∈ SCC+(Π) do not respect the hidden atoms of each other, denoted by S1 !h S2, if and only if S1 ≠ S2 and (at least) one of the following conditions holds:
1. there is h ∈ Ah(ΠS1) such that h ∈ Ai(ΠS2);
2. there is h ∈ Ah(ΠS2) such that h ∈ Ai(ΠS1);
3. there are h1 ∈ Ah(ΠS1) and h2 ∈ Ah(ΠS2) such that both occur in some integrity constraint of Π.

It is clear that the relation !h is irreflexive and symmetric on SCC+(Π) for every ELP-function Π. If we consider the reflexive and transitive closure of !h, denoted by !∗h, we obtain an equivalence relation.
A repartition of SCC+(Π) can then be obtained by considering the quotient set SCC+(Π)\!∗h, i.e., the set of equivalence classes of !∗h over SCC+(Π), which can be used to decompose Π.

Definition 6. Given an ELP-function Π = ⟨P, I, O, H⟩, the decomposition induced by SCC+(Π) and !∗h includes an ELP-function Π0 = ⟨IC0(P), A(IC0(P)) ∪ (I \ A(P)), ∅, ∅⟩, where IC0(P) = {← B, not C, not not D ∈ P | (B ∪ C ∪ D) ∩ H = ∅}, and, for each equivalence class 𝒮 ∈ SCC+(Π)\!∗h, an ELP-function Π𝒮 = ⟨DefP(S) ∪ IC𝒮(P), A(DefP(S) ∪ IC𝒮(P)) \ S, S ∩ O, S ∩ H⟩, where S = ∪𝒮 and IC𝒮(P) = {← B, not C, not not D ∈ P | (B ∪ C ∪ D) ∩ (S ∩ H) ≠ ∅}.

We showed that, unlike with (SP), it is always possible to forget under (UP). Perhaps surprisingly, we also showed that, in general, none of the operators defined in the literature satisfies this weaker form of persistence, which led us to introduce the class of forgetting operators FUP that we proved to obey (UP), as well as a set of other properties commonly discussed in the literature. We then turned our attention to the application of this class of forgetting operators to forget input, output, and hidden atoms from modules, and related it with the operation of hiding. Despite showing that we can always forget atoms from modules under uniform persistence, we also showed that the important module theorem no longer holds in general, with negative consequences for the compositionality of stable models. Subsequently, after pinpointing the conditions under which the module theorem holds, we proceeded by investigating how the theorem could be "recovered" through a reconfiguration of the modules obtained by suitable decomposition and composition operations. Possible avenues for future work include investigating forgetting in other existing ways to view modular ASP, such as (Dao-Tran et al.
2009; Harrison and Lierler 2016), and the precise relationship of (UP) and UP-Forgetting to the notion of relativized uniform equivalence (Eiter, Fink, and Woltran 2007), as well as obtaining syntactic operators for UP-Forgetting in the line of (Berthold et al. 2019).

The module Π0 keeps track of integrity constraints as well as input atoms that are not mentioned by the rules of P. We can adapt straightforwardly (from (Janhunen et al. 2009)) that this decomposition of an ELP-function is valid.

Proposition 5. Given an ELP-function Π = ⟨P, I, O, H⟩, then Π = Π0 ⊔ (⊔𝒮∈SCC+(Π)\!∗h Π𝒮).

We now show that this decomposition can be used to allow forgetting while still preserving the module theorem. Let Π1 = ⟨P1, I1, O1, H1⟩ and Π2 = ⟨P2, I2, O2, H2⟩ be two ELP-functions such that their join is defined. Since Prop. 4 shows that we can forget a set of atoms by forgetting iteratively every atom in the set, we focus on forgetting a single atom p. Suppose that p is shared by the two modules, i.e., p ∈ (I1 ∪ O1) ∩ (I2 ∪ O2), and recall that we cannot guarantee that forgetting p separately in Π1 and Π2 preserves the module theorem. We first consider the set of components of the decomposition of Π1 that are relevant for atom p, i.e., R(Π1, p) = {S ∈ SCC+(Π1)\!∗h | p ∈ Ao(ΠS) ∪ Ai(ΠS)}. We denote by Πp1 the join of the ELP-functions in R(Π1, p), i.e., Πp1 = ⊔R(Π1, p), by R̄(Π1, p) the set of components of the decomposition of Π1 that are not relevant for p, i.e., R̄(Π1, p) = {S ∈ SCC+(Π1)\!∗h | S ∉ R(Π1, p)}, and by Π̄p1 the join of the ELP-functions in R̄(Π1, p), i.e., Π̄p1 = ⊔R̄(Π1, p). The decomposition of Π1 can then be used to obtain a restricted version of the module theorem.

Acknowledgments

Authors R. Gonçalves, M. Knorr, and J. Leite were partially supported by FCT project FORGET (PTDC/CCI-INF/32219/2017) and by FCT project NOVA LINCS (UIDB/04516/2020). T. Janhunen was partially supported by the Academy of Finland project 251170. S.
Woltran was supported by the Austrian Science Fund (FWF): Y698, P25521.

Theorem 14 (Reconfiguration). Let Π be an ELP-function obtained as a join of two ELP-functions Π1 and Π2, and let p ∈ (Ai(Π1) ∪ Ao(Π1)) ∩ (Ai(Π2) ∪ Ao(Π2)). Then, SM(Π\{p}) = SM(Π̄p1\{p}) ⊲⊳ SM((Π2 ⊔ Πp1)\{p}).

Thus, to allow forgetting in modules and preserve the module theorem, we can essentially decompose certain modules and reconfigure them in such a way that all rules on the considered shared atom occur in a single module.

References

Baral, C.; Dzifcak, J.; and Takahashi, H. 2006. Macros, macro calls and use of ensembles in modular answer set programming. In Etalle, S., and Truszczynski, M., eds., Procs. of ICLP, volume 4079 of LNCS, 376–390. Springer.
Berthold, M.; Gonçalves, R.; Knorr, M.; and Leite, J. 2019. A syntactic operator for forgetting that satisfies strong persistence. Theory Pract. Log. Program. 19(5-6):1038–1055.
Bledsoe, W. W., and Hines, L. M. 1980. Variable elimination and chaining in a resolution-based prover for inequalities. In Bibel, W., and Kowalski, R. A., eds., Procs. of CADE, volume 87 of LNCS, 70–87. Springer.
Cabalar, P., and Ferraris, P. 2007. Propositional theories are strongly equivalent to logic programs. TPLP 7(6):745–759.
Dao-Tran, M.; Eiter, T.; Fink, M.; and Krennwallner, T. 2009. Modular nonmonotonic logic programming revisited. In Hill, P. M., and Warren, D. S., eds., Procs. of ICLP, volume 5649 of LNCS, 145–159. Springer.
Delgrande, J. P., and Wang, K. 2015. A syntax-independent approach to forgetting in disjunctive logic programs. In Bonet, B., and Koenig, S., eds., Procs. of AAAI, 1482–1488. AAAI Press.

6 Conclusions

In this paper, we thoroughly investigated the operation of forgetting in the context of modular ASP. We began by observing that strong persistence (SP) – the property usually taken to best characterize forgetting in ASP, which cannot always be guaranteed – is too strong when we consider modular ASP.
Given the structure of modules in the context of modular ASP, namely their restricted interface, a weaker notion of persistence based on uniform equivalence is sufficient to properly characterise forgetting in this case, which led us to introduce uniform persistence (UP).

Eiter, T., and Fink, M. 2003. Uniform equivalence of logic programs under the stable model semantics. In Palamidessi, C., ed., Procs. of ICLP, volume 2916 of LNCS, 224–238. Springer.
Eiter, T., and Wang, K. 2008. Semantic forgetting in answer set programming. Artif. Intell. 172(14):1644–1672.
Eiter, T.; Fink, M.; and Woltran, S. 2007. Semantical characterizations and complexity of equivalences in answer set programming. ACM Trans. Comput. Log. 8(3).
European Union. 2016. General Data Protection Regulation. Official Journal of the European Union L119:1–88.
Gabbay, D. M.; Schmidt, R. A.; and Szalas, A. 2008. Second Order Quantifier Elimination: Foundations, Computational Aspects and Applications. College Publications.
Gonçalves, R.; Knorr, M.; Leite, J.; and Woltran, S. 2017. When you must forget: Beyond strong persistence when forgetting in answer set programming. TPLP 17(5-6):837–854.
Gonçalves, R.; Janhunen, T.; Knorr, M.; Leite, J.; and Woltran, S. 2019. Forgetting in modular answer set programming. In AAAI, 2843–2850. AAAI Press.
Gonçalves, R.; Knorr, M.; Leite, J.; and Woltran, S. 2020. On the limits of forgetting in answer set programming. Artif. Intell. 286:103307.
Gonçalves, R.; Knorr, M.; and Leite, J. 2016a. The ultimate guide to forgetting in answer set programming. In Baral, C.; Delgrande, J.; and Wolter, F., eds., Procs. of KR, 135–144. AAAI Press.
Gonçalves, R.; Knorr, M.; and Leite, J. 2016b. You can't always forget what you want: on the limits of forgetting in answer set programming. In Fox, M. S., and Kaminka, G. A., eds., Procs. of ECAI, 957–965. IOS Press.
Harrison, A., and Lierler, Y. 2016. First-order modular logic programs and their conservative extensions.
TPLP 16(5-6):755–770.
Janhunen, T.; Oikarinen, E.; Tompits, H.; and Woltran, S. 2009. Modularity aspects of disjunctive stable models. J. Artif. Intell. Res. (JAIR) 35:813–857.
Knorr, M., and Alferes, J. J. 2014. Preserving strong equivalence while forgetting. In Fermé, E., and Leite, J., eds., Procs. of JELIA, volume 8761 of LNCS, 412–425. Springer.
Lang, J.; Liberatore, P.; and Marquis, P. 2003. Propositional independence: Formula-variable independence and forgetting. J. Artif. Intell. Res. (JAIR) 18:391–443.
Lierler, Y., and Truszczynski, M. 2011. Transition systems for model generators - A unifying approach. TPLP 11(4-5):629–646.
Lifschitz, V.; Pearce, D.; and Valverde, A. 2001. Strongly equivalent logic programs. ACM Trans. Comput. Log. 2(4):526–541.
Middeldorp, A.; Okui, S.; and Ida, T. 1996. Lazy narrowing: Strong completeness and eager variable elimination. Theor. Comput. Sci. 167(1-2):95–130.
Moinard, Y. 2007. Forgetting literals with varying propositional symbols. J. Log. Comput. 17(5):955–982.
Oikarinen, E., and Janhunen, T. 2006. Modular equivalence for normal logic programs. In Brewka, G.; Coradeschi, S.; Perini, A.; and Traverso, P., eds., Procs. of ECAI, 412–416.
Oikarinen, E., and Janhunen, T. 2008. Achieving compositionality of the stable model semantics for smodels programs. TPLP 8(5-6):717–761.
Sagiv, Y. 1988. Optimizing datalog programs. In Minker, J., ed., Foundations of Deductive Databases and Logic Programming. Morgan Kaufmann. 659–698.
Wang, Y.; Zhang, Y.; Zhou, Y.; and Zhang, M. 2014. Knowledge forgetting in answer set programming. J. Artif. Intell. Res. (JAIR) 50:31–70.
Wang, Y.; Wang, K.; and Zhang, M. 2013. Forgetting for answer set programs revisited. In Rossi, F., ed., Procs. of IJCAI, 1162–1168. IJCAI/AAAI.
Weber, A. 1986. Updating propositional formulas. In Expert Database Conf., 487–500.
Wong, K.-S. 2009. Forgetting in Logic Programs. Ph.D. Dissertation, The University of New South Wales.
Zhang, Y., and Foo, N. Y. 2006.
Solving logic program conflict through strong and weak forgettings. Artif. Intell. 170(8-9):739–778.

A framework for a modular multi-concept lexicographic closure semantics

Laura Giordano, Daniele Theseider Dupré
DISIT - Università del Piemonte Orientale, Italy
{laura.giordano, dtd}@uniupo.it

which are concerned with subject Ci are admitted in mi. We call a collection of such modules a modular multi-concept knowledge base. This modularization of the defeasible part of the knowledge base does not define a partition of the set D of defeasible inclusions, as an inclusion may belong to more than one module. For instance, the typical properties of employed students are relevant both for the module with subject Student and for the module with subject Employee. The granularity of modularization has to be chosen by the knowledge engineer, who can fix how large or narrow the scope of a module is, and how many modules are to be included in the knowledge base (for instance, whether the properties of employees and students are to be defined in the same module with subject Person or in two different modules). At one extreme, all the defeasible inclusions in D can be put together in a module associated with subject ⊤ (Thing). At the other extreme, which has been studied in (Giordano and Theseider Dupré 2020), a module mi is a defeasible TBox containing only the defeasible inclusions of the form T(Ci) ⊑ D for some concept Ci. In this paper we remove this restriction, considering general modules containing arbitrary sets of defeasible inclusions, intuitively pertaining to some subject. In (Giordano and Theseider Dupré 2020), following Gerard Brewka's framework of Basic Preference Descriptions for ranked knowledge bases (Brewka 2004), we have assumed that a specification of the relative importance of typicality inclusions for a concept Ci is given by assigning ranks to typicality inclusions.
However, for a large module, a specification by hand of the ranking of the defeasible inclusions in the module would be awkward. In particular, a module may include all properties of a class as well as properties of its exceptional subclasses (for instance, the typical properties of penguins, ostriches, etc. might all be included in a module with subject Bird). A natural choice is then to consider, for each module, a lexicographic semantics which builds on the rational closure ranking to define a preference ordering on domain elements. This preference relation corresponds, in the propositional case, to the lexicographic order on worlds in Lehmann's model-theoretic semantics of the lexicographic closure (Lehmann 1995). This semantics already accounts for the specificity relations among concepts inside the module, as the lexicographic closure deals with

Abstract

We define a modular multi-concept extension of the lexicographic closure semantics for defeasible description logics with typicality. The idea is that of distributing the defeasible properties of concepts into different modules, according to their subject, and of defining a notion of preference for each module based on the lexicographic closure semantics. The preferential semantics of the knowledge base can then be defined as a combination of the preferences of the single modules. The range of possibilities, from fine-grained to coarse-grained modules, provides a spectrum of alternative semantics.

1 Introduction

Kraus, Lehmann and Magidor's preferential logics for nonmonotonic reasoning (Kraus, Lehmann, and Magidor 1990; Lehmann and Magidor 1992) have been extended to description logics, to deal with inheritance with exceptions in ontologies, allowing for non-strict forms of inclusions, called typicality or defeasible inclusions, with different preferential and ranked semantics (Giordano et al.
2007; Britz, Heidema, and Meyer 2008) as well as different closure constructions such as the rational closure (Casini and Straccia 2010; Casini et al. 2013; Giordano et al. 2013b; Giordano et al. 2015), the lexicographic closure (Casini and Straccia 2012), the relevant closure (Casini et al. 2014), and the MP-closure (Giordano and Gliozzi 2019). In this paper we define a modular multi-concept extension of the lexicographic closure for reasoning about exceptions in ontologies. The idea is very simple: different modules can be defined starting from a defeasible knowledge base, containing a set D of typicality inclusions (or defeasible inclusions) describing the prototypical properties of classes in the knowledge base. We will represent such defeasible inclusions as T(C) ⊑ D (Giordano et al. 2007), meaning that "typical C's are D's" or "normally C's are D's", corresponding to conditionals C |∼ D in the KLM framework. A set of modules m1, . . . , mn is introduced, each one concerning a subject, and defeasible inclusions belong to a module if they are related to its subject. By subject, here, we mean any concept of the knowledge base. Module mi with subject Ci does not need to contain just typicality inclusions of the form T(Ci) ⊑ D, but all defeasible inclusions in D

specificity, based on the ranking of concepts computed by the rational closure of the knowledge base. Based on the ranked semantics of the single modules, a compositional (preferential) semantics of the knowledge base is defined by combining the multiple preference relations into a single global preference relation <. This gives rise to a modular multi-concept extension of Lehmann's preference semantics for the lexicographic closure.
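In the propositional lexicographic closure, one common way to compare two worlds (or, here, domain elements) is by the number of defaults of each rank they violate, with more specific ranks taking precedence. A Python sketch (our own simplification, assuming the per-rank violation counts have already been computed from the rational closure ranking):

```python
def lex_prefers(viol_x, viol_y):
    """True if x is strictly preferred to y: compare violation counts
    rank by rank, starting from the most specific (highest) rank."""
    for vx, vy in zip(viol_x, viol_y):  # index 0 = most specific rank
        if vx != vy:
            return vx < vy
    return False  # equal profiles: neither is strictly preferred
```

For example, lex_prefers([0, 2], [1, 0]) holds: violating fewer of the more specific defaults outweighs any number of violations of less specific ones, which is exactly how the lexicographic order refines the rational closure ranking.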
When there is a single module, containing all the typicality inclusions in the knowledge base, the semantics collapses to a natural extension to DLs of Lehmann's semantics, which corresponds to Lehmann's semantics for the fragment of ALC without universal and existential restrictions. We introduce a notion of entailment for modular multi-concept knowledge bases, based on the proposed semantics, which satisfies the KLM properties of a preferential consequence relation. This notion of entailment has good properties inherited from the lexicographic closure: it deals properly with irrelevance and specificity, and it is not subject to the "blockage of property inheritance" problem, i.e., the problem that property inheritance from classes to subclasses is not guaranteed, which affects the rational closure (Pearl 1990). In addition, separating defeasible inclusions in different modules provides a simple solution to another problem of the rational closure and its refinements (including the lexicographic closure), that was recognized by Geffner and Pearl (1992), namely, that "conflicts among defaults that should remain unresolved, are resolved anomalously", giving rise to too strong conclusions. The preferential (not necessarily ranked) nature of the global preference relation < provides a simple way out of this problem, when defeasible inclusions are suitably separated in different modules.

The extension function ·I is extended to complex concepts as follows:
⊤I = ∆
⊥I = ∅
(¬C)I = ∆ \ C I
(C ⊓ D)I = C I ∩ DI
(C ⊔ D)I = C I ∪ DI
(∀R.C)I = {x ∈ ∆ | ∀y.(x, y) ∈ RI → y ∈ C I}
(∃R.C)I = {x ∈ ∆ | ∃y.(x, y) ∈ RI & y ∈ C I}

The notion of satisfiability of a KB in an interpretation and the notion of entailment are defined as follows:

Definition 1 (Satisfiability and entailment). Given an ALC interpretation I = ⟨∆, ·I⟩:
- I satisfies an inclusion C ⊑ D if C I ⊆ DI;
- I satisfies an assertion C(a) if aI ∈ C I;
- I satisfies an assertion R(a, b) if (aI, bI) ∈ RI.
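The inductive extension function above translates directly into code over a finite interpretation. A Python sketch (our own encoding: concepts are concept names, 'TOP'/'BOT', or nested tuples; ext_c and ext_r give the extensions of concept and role names):

```python
def ext(concept, dom, ext_c, ext_r):
    """Extension of an ALC concept in a finite interpretation."""
    if concept == 'TOP':
        return set(dom)
    if concept == 'BOT':
        return set()
    if isinstance(concept, str):          # concept name
        return set(ext_c.get(concept, set()))
    op = concept[0]
    if op == 'not':
        return set(dom) - ext(concept[1], dom, ext_c, ext_r)
    if op == 'and':
        return ext(concept[1], dom, ext_c, ext_r) & ext(concept[2], dom, ext_c, ext_r)
    if op == 'or':
        return ext(concept[1], dom, ext_c, ext_r) | ext(concept[2], dom, ext_c, ext_r)
    r = ext_r.get(concept[1], set())      # role extension for 'all'/'some'
    c = ext(concept[2], dom, ext_c, ext_r)
    if op == 'some':                      # ∃R.C
        return {x for x in dom if any((x, y) in r and y in c for y in dom)}
    if op == 'all':                       # ∀R.C
        return {x for x in dom if all((x, y) not in r or y in c for y in dom)}
    raise ValueError(f"unknown constructor: {op}")
```

An inclusion C ⊑ D is then satisfied in the interpretation exactly when ext(C, ...) is a subset of ext(D, ...), matching Definition 1.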
Given a KB K = (T , A), an interpretation I satisfies T (resp., A) if I satisfies all inclusions in T (resp., all assertions in A). I is an ALC model of K = (T , A) if I satisfies T and A. Letting a query F be either an inclusion C ⊑ D (where C and D are concepts) or an assertion (C(a) or R(a, b)), F is entailed by K, written K |=ALC F, if for all ALC models I = ⟨∆, ·I⟩ of K, I satisfies F. Given a knowledge base K, the subsumption problem is the problem of deciding whether an inclusion C ⊑ D is entailed by K. The instance checking problem is the problem of deciding whether an assertion C(a) is entailed by K. The concept satisfiability problem is the problem of deciding, for a concept C, whether C is consistent with K (i.e., whether there exists a model I of K such that C I ≠ ∅). In the following we will refer to an extension of ALC with typicality inclusions, which we will call ALC + T as in (Giordano et al. 2007), and to the rational closure of ALC + T knowledge bases (T , A) (Giordano et al. 2013b; Giordano et al. 2015). In addition to standard ALC inclusions C ⊑ D (called strict inclusions in the following), in ALC + T the TBox T also contains typicality inclusions of the form T(C) ⊑ D, where C and D are ALC concepts. Among all rational closure constructions for ALC mentioned in the introduction, we will refer to the one in (Giordano et al. 2013b), and to its minimal canonical model semantics. Let us recall the notions of preferential, ranked and canonical models of a defeasible knowledge base (T , A), which will be useful in the following.

Definition 2 (Interpretations for ALC + T).
A preferential interpretation N is any structure ⟨∆, <, ·I⟩ where: ∆ is a domain; < is an irreflexive, transitive and well-founded relation over ∆; ·I is a function that maps all concept names, role names and individual names as defined above for ALC interpretations, and provides an interpretation to all ALC concepts as above, and to typicality concepts as follows: (T(C))I = min<(C I), where min<(S) = {u : u ∈ S and there is no z ∈ S s.t. z < u}. When relation < is required to be also modular (i.e., for all x, y, z ∈ ∆, if x < y then x < z or z < y), N is called a ranked interpretation.

Preliminaries: the description logic ALC and its extension with typicality inclusions

Let NC be a set of concept names, NR a set of role names and NI a set of individual names. The set of ALC concepts (or, simply, concepts) can be defined inductively as follows:
• A ∈ NC, ⊤ and ⊥ are concepts;
• if C and D are concepts and R ∈ NR, then C ⊓ D, C ⊔ D, ¬C, ∀R.C, ∃R.C are concepts.
A knowledge base (KB) K is a pair (T, A), where T is a TBox and A is an ABox. The TBox T is a set of concept inclusions (or subsumptions) C ⊑ D, where C, D are concepts. The ABox A is a set of assertions of the form C(a) and R(a, b), where C is a concept, R ∈ NR, and a, b ∈ NI. An ALC interpretation (Baader et al. 2007) is a pair I = ⟨∆, ·I⟩ where: ∆ is a domain (a set whose elements are denoted by x, y, z, . . .) and ·I is an extension function that maps each concept name C ∈ NC to a set C I ⊆ ∆, each role name R ∈ NR to a binary relation RI ⊆ ∆ × ∆, and each individual name a ∈ NI to an element aI ∈ ∆. It is extended to complex concepts as specified above. Returning to the example of a module mi with subject Bird: as penguins are birds, the inclusion T(Penguin) ⊑ Black is also to be included in mi and, if T(Bird) ⊑ FlyingAnimal and T(FlyingAnimal) ⊑ BigWings are defeasible inclusions in the knowledge base, they both may be relevant properties of birds to be included in mi. For this reason we will not put restrictions on the typicality inclusions that can belong to a module.
We will see later that the semantic construction for a module mi will be able to ignore the typicality inclusions which are not relevant for subject Ci, and that there are cases in which not even the inclusions T(C) ⊑ D with C subsumed by Ci are admitted in mi. The modularization m1, . . . , mk of the defeasible part D of the knowledge base does not define a partition of D, as the same inclusion may belong to more than one module mi. For instance, the typical properties of employed students are relevant for both concept Student and concept Employee and should belong to their related modules (if any). Also, a granularity of the modularization has to be chosen and, as we will see, this choice may have an impact on the global semantics of the knowledge base. At one extreme, all the defeasible inclusions in D are put together in the same module, e.g., the module associated with concept ⊤. At the other extreme, which has been studied in (Giordano and Theseider Dupré 2020), a module mi contains only the defeasible inclusions of the form T(Ci) ⊑ D, where Ci is the subject of mi (and in this case, the inclusions T(C) ⊑ D with C subsumed by Ci are not admitted in mi). In this regard, the framework proposed in this paper could be seen as an extension of the proposal in (Giordano and Theseider Dupré 2020) to allow coarser-grained modules, although here we do not allow for user-defined preferences among defaults. (Preferential interpretations for description logics were first studied in (Giordano et al. 2007), while ranked interpretations, i.e., modular preferential interpretations, were first introduced for ALC in (Britz, Heidema, and Meyer 2008).) Let us consider an example of a multi-concept knowledge base. Example 5. Let K be the knowledge base ⟨T, D, m1, m2, m3, A, s⟩, where A = ∅ and T contains the strict inclusions:
A preferential (ranked) model of an ALC + T knowledge base K is a preferential (ranked) ALC + T interpretation N = ⟨∆, <, ·I⟩ that satisfies all inclusions in K, where: a strict inclusion or an assertion is satisfied in N if it is satisfied in the ALC model ⟨∆, ·I⟩, and a typicality inclusion T(C) ⊑ D is satisfied in N if (T(C))I ⊆ DI. Preferential entailment in ALC + T is defined in the usual way: for a knowledge base K and a query F (a strict or defeasible inclusion, or an assertion), F is preferentially entailed by K (K |=ALC+T F) if F is satisfied in all preferential models of K. A canonical model for K is a preferential (ranked) model containing, roughly speaking, as many domain elements as is consistent with the knowledge base specification K. Given an ALC + T knowledge base K = (T, A) and a query F, let us define SK as the set of all ALC concepts (and subconcepts) occurring in K or in F, together with their complements. We consider all the sets of concepts {C1, C2, . . . , Cn} ⊆ SK consistent with K, i.e., s.t. K ⊭ALC+T C1 ⊓ C2 ⊓ · · · ⊓ Cn ⊑ ⊥. Definition 3 (Canonical model). A preferential model M = ⟨∆, <, ·I⟩ of K is canonical with respect to SK if it contains at least one domain element x ∈ ∆ s.t. x ∈ (C1 ⊓ C2 ⊓ · · · ⊓ Cn)I, for each set {C1, C2, . . . , Cn} ⊆ SK consistent with K. For finite, consistent ALC + T knowledge bases, the existence of finite (ranked) canonical models has been proved in (Giordano et al. 2015) (Theorem 1). In the following, as we will only consider finite ALC + T knowledge bases, we can restrict our consideration to finite preferential models.
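As an informal aside (not part of the formal development), satisfaction of a typicality inclusion T(C) ⊑ D in a finite preferential interpretation, where (T(C))I = min<(C I), can be checked directly from the definitions. The sketch below assumes a finite domain given extensionally; all names are illustrative.

```python
# Hedged sketch: checking T(C) ⊑ D in a *finite* preferential
# interpretation, where (T(C))^I = min_<(C^I).

def min_elements(s, less):
    """min_<(S) = {u in S | there is no z in S with z < u}."""
    return {u for u in s if not any(less(z, u) for z in s)}

def satisfies_typicality(ext_c, ext_d, less):
    """N satisfies T(C) ⊑ D iff min_<(C^I) ⊆ D^I."""
    return min_elements(ext_c, less) <= ext_d

# Toy interpretation: domain {a, b}; a < b (a is more typical than b).
order = {("a", "b")}
less = lambda x, y: (x, y) in order

students = {"a", "b"}   # Student^I
young = {"a"}           # Young^I

# a is the only <-minimal student, and a is young:
print(satisfies_typicality(students, young, less))  # → True
```

Well-foundedness of < is immediate here, since the domain is finite, matching the restriction to finite models made above.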
3 Modular multi-concept knowledge bases

Employee ⊑ Adult
Adult ⊑ ∃has SSN.⊤
PhDStudent ⊑ Student
PhDStudent ⊑ Adult
Has no Scolarship ≡ ¬∃hasScolarship.⊤
PrimarySchoolStudent ⊑ Children
PrimarySchoolStudent ⊑ HasNoClasses
Driver ⊑ Adult
Driver ⊑ ∃has DrivingLicence.⊤

In this section we introduce a notion of multi-concept knowledge base, starting from a set of strict inclusions T, a set of assertions A, and a set of typicality inclusions D, each one of the form T(C) ⊑ D, where C and D are ALC concepts.
Definition 4. A modular multi-concept knowledge base K is a tuple ⟨T, D, m1, . . . , mk, A, s⟩, where T is an ALC TBox, D is a set of typicality inclusions such that m1 ∪ . . . ∪ mk = D, A is an ABox, and s is a function associating each module mi with a concept, s(mi) = Ci, the subject of mi.
The defeasible inclusions in D are distributed in the modules m1, m2, m3 as follows. Module m1 has subject Employee, and contains the defeasible inclusions:
(d1) T(Employee) ⊑ ¬Young
(d2) T(Employee) ⊑ ∃has boss.Employee
(d3) T(ForeignerEmployee) ⊑ ∃has Visa.⊤
(d4) T(Employee ⊓ Student) ⊑ Busy
(d5) T(Employee ⊓ Student) ⊑ ¬Young
Module m2 has subject Student, and contains the defeasible inclusions:
(d6) T(Student) ⊑ ∃has classes.⊤
(d7) T(Student) ⊑ Young
The idea is that each mi is a module defining the typical properties of the instances of some concept Ci. The defeasible inclusions belonging to a module mi with subject Ci are the inclusions that intuitively pertain to Ci. We expect all the typicality inclusions T(C) ⊑ D such that C is a subclass of Ci to belong to mi, but not only those. For instance, for a module mi with subject Ci = Bird, the typicality inclusion T(Bird ⊓ Live at SouthPole) ⊑ Penguin, meaning that the birds living at the south pole are normally penguins, is clearly to be included in mi.
As penguins are birds, the inclusion T(Penguin) ⊑ Black is also to be included in mi.
(d8) T(Student) ⊑ Has no Scolarship
(d9) T(HighSchoolStudent) ⊑ Teenager
(d10) T(PhDStudent) ⊑ ∃hasScolarship.Amount
(d11) T(PhDStudent) ⊑ Bright
(d4) T(Employee ⊓ Student) ⊑ Busy
(d5) T(Employee ⊓ Student) ⊑ ¬Young
Module m3 has subject Vehicle, and contains the defeasible inclusions:
(d12) T(Vehicle) ⊑ ∃has owner.Driver
(d13) T(Car) ⊑ ¬SportsCar
(d14) T(SportsCar) ⊑ RunFast
(d15) T(Truck) ⊑ Heavy
(d16) T(Bicycle) ⊑ ¬RunFast
Observe that, in the previous example, (d4) and (d5) belong to both modules m1 and m2. An additional module might be added containing the prototypical properties of Adults.

4 A lexicographic semantics of modular multi-concept knowledge bases

In this section, we define a semantics of modular multi-concept knowledge bases, based on Lehmann's lexicographic closure semantics (1995). (In a model Ni of the projected knowledge base Ki, introduced below, we use <i, instead of <, for the preference relation, for i = 1, . . . , k.) In his seminal work on the lexicographic closure, Lehmann (1995) defines a model-theoretic semantics of the lexicographic closure construction by introducing an order relation among propositional models, considering which defaults are violated in each model, and introducing a seriousness ordering ≺ among sets of violated defaults. For two propositional models w and w′, w ≺ w′ (w is preferred to w′) is defined in (Lehmann 1995) as follows:

w ≺ w′ iff V(w) ≺ V(w′)   (1)

w is preferred to w′ when the defaults V(w) violated by w are less serious than the defaults V(w′) violated by w′. As we will recall below, the seriousness ordering depends on the number of defaults violated by w and by w′ for each rank. In a similar way, in the following, we introduce a ranked relation <i on the domain ∆ of a model of Ki. Let us first define, for a preferential model Ni = ⟨∆, <i, ·I⟩ of Ki, what it means that an element x ∈ ∆ violates a typicality inclusion T(C) ⊑ D in mi.
The idea is that, for each module mi, a semantics can be defined using the lexicographic closure semantics, with some minor modifications. Given a modular multi-concept knowledge base K = ⟨T, D, m1, . . . , mk, A, s⟩, we let rank(C) be the rank of concept C in the rational closure ranking of the knowledge base (T ∪ D, A), according to the rational closure construction in (Giordano et al. 2013b). In the rational closure ranking, concepts with higher ranks are more specific than concepts with lower ranks. While we will not recall the rational closure construction, let us consider again Example 5. In Example 5, the rational closure ranking assigns rank 0 to the concepts Adult, Employee, ForeignerEmployee, Driver, Student, HighSchoolStudent and PrimarySchoolStudent, and rank 1 to the concepts PhDStudent and Employee ⊓ Student. In fact, PhD students are exceptional students, as they have a scholarship, while employed students are exceptional students, as they are not young. Their rank is higher than the rank of concept Student, as they are exceptional subclasses of class Student. Based on the concept ranking, the rational closure assigns a rank to typicality inclusions: the rank of T(C) ⊑ D is equal to the rank of concept C. For each module mi of a knowledge base K = ⟨T, D, m1, . . . , mk, A, s⟩, we aim to define a canonical model, using the lexicographic order based on the rank of the typicality inclusions in mi. In the following we will assume that the knowledge base ⟨T ∪ D, A⟩ is consistent in the logic ALC + T, that is, that it has a preferential model. This also guarantees the existence of (finite) canonical models (Giordano et al. 2015). In the following, as the knowledge base K is finite, we will restrict our consideration to finite preferential and ranked models. Let us define the projection of the knowledge base K on module mi as the knowledge base Ki = ⟨T ∪ mi, A⟩. Ki is an ALC + T knowledge base.
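Since the construction relies on the rational closure ranking, a purely propositional sketch of how such ranks arise (by iterated exceptionality, in the style of Lehmann and Magidor 1992) may help intuition. This is not the DL construction of (Giordano et al. 2013b): defaults are encoded as predicates over truth assignments, and all names are illustrative.

```python
from itertools import product

# Hedged propositional sketch of the rational closure ranking:
# a default C |~ D has materialization C -> D; C is exceptional
# w.r.t. a set E of defaults if no world satisfying all the
# materializations of E satisfies C.

def worlds(atoms):
    for bits in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, bits))

def rc_ranks(defaults, atoms):
    """defaults: list of (antecedent, consequent) predicates on worlds.
    Returns a rank for each default (None = infinite rank)."""
    ranks = [None] * len(defaults)
    current = list(range(len(defaults)))
    level = 0
    while current:
        models = [w for w in worlds(atoms)
                  if all(not defaults[i][0](w) or defaults[i][1](w)
                         for i in current)]
        exceptional = [i for i in current
                       if not any(defaults[i][0](w) for w in models)]
        if len(exceptional) == len(current):  # the rest have rank ∞
            break
        for i in current:
            if i not in exceptional:
                ranks[i] = level
        current = exceptional
        level += 1
    return ranks

# Student |~ Young; Employee ∧ Student |~ ¬Young:
# employed students are an exceptional subclass of students.
atoms = ["S", "E", "Y"]
defaults = [
    (lambda w: w["S"], lambda w: w["Y"]),
    (lambda w: w["S"] and w["E"], lambda w: not w["Y"]),
]
print(rc_ranks(defaults, atoms))  # → [0, 1]
```

As in the text, the more specific default (the one for employed students) receives the higher rank.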
Hence a preferential model Ni = ⟨∆, <i, ·I⟩ of Ki is defined as in Section 2, but with <i in place of < as the preference relation, for i = 1, . . . , k.
Definition 6. Given a module mi of K, with s(mi) = Ci, and a preferential model Ni = ⟨∆, <i, ·I⟩ of Ki, an element x ∈ ∆ violates a typicality inclusion T(C) ⊑ D in mi if x ∈ C I and x ∉ DI.
Notice that the set of typicality inclusions violated by a domain element x in a model only depends on the interpretation ·I of ALC concepts, and on the defeasible inclusions in mi. Let Vi(x) be the set of the defeasible inclusions of mi violated by domain element x, and let Vih(x) be the set of all defeasible inclusions in mi with rank h which are violated by domain element x. In order to compare alternative sets of defaults, in (Lehmann 1995) the seriousness ordering ≺ among sets of defaults is defined by associating with each set of defaults D ⊆ K a tuple of numbers ⟨n0, n1, . . . , nr⟩, where r is the order of K, i.e., the least finite rank such that there is no default with finite rank r or higher (but there is at least one default with rank r − 1). The tuple is constructed considering the ranks of the defaults in the rational closure: n0 is the number of defaults in D with rank ∞ and, for 1 ≤ i ≤ r, ni is the number of defaults in D with rank r − i (in particular, nr is the number of defaults in D with rank 0). Lehmann defines the strict modular order ≺ among sets of defaults from the natural lexicographic order over the tuples ⟨n0, n1, . . . , nr⟩. This order gives preference to those sets of defaults containing a larger number of more specific defaults. As we have seen from equation (1), ≺ is used by Lehmann to compare sets of violated defaults and to prefer the propositional models whose violations are less serious. We use the same criterion for comparing domain elements, introducing a seriousness ordering ≺i for each module mi.
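The tuple-based comparison underlying ≺i, spelled out formally just below, can be sketched as follows; representing Vi(x) by a mapping from finite ranks to violation counts is an illustrative assumption.

```python
# Hedged sketch of the seriousness ordering for one module:
# `violations` maps each finite rank h to |V_i^h(x)|. Tuples list
# counts from the highest finite rank down, so fewer violations
# of more specific (higher-rank) defaults win first.

def violation_tuple(violations, max_rank):
    """(|V^{r-1}(x)|, ..., |V^0(x)|), with max_rank = r - 1."""
    return tuple(violations.get(h, 0) for h in range(max_rank, -1, -1))

def less_serious(vx, vy, max_rank):
    """V_i(x) ≺_i V_i(y): strict lexicographic order on the tuples."""
    return violation_tuple(vx, max_rank) < violation_tuple(vy, max_rank)

# x violates one rank-0 default, y violates one rank-1 default:
# x's violations are less serious.
print(less_serious({0: 1}, {1: 1}, max_rank=1))  # → True
print(less_serious({1: 1}, {0: 1}, max_rank=1))  # → False
```

Python's built-in tuple comparison is exactly the lexicographic order the definition requires, so no explicit loop over ranks is needed.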
Considering that the defaults with infinite rank must be satisfied by all domain elements, we will not need to consider their violation in our definition (that is, we will not consider n0 in the following). The set Vi(x) of defaults from module mi which are violated by x can be associated with a tuple of numbers ti,x = ⟨|Vir−1(x)|, . . . , |Vi0(x)|⟩. Following Lehmann, we let Vi(x) ≺i Vi(y) iff ti,x comes before ti,y in the natural lexicographic order on tuples (restricted to the violations of defaults in mi), that is:

Vi(x) ≺i Vi(y) iff ∃l such that |Vil(x)| < |Vil(y)| and, ∀h > l, |Vih(x)| = |Vih(y)|

Note that, for a single module m1 with subject ⊤, containing all the typicality inclusions in K, the preference relation <1 corresponds to Lehmann's lexicographic closure semantics, as its definition is based on the set of all defeasible inclusions in the knowledge base.

5 The combined lexicographic model of a KB

For multiple modules, each <i determines a ranked preference relation which can be used to answer queries over module mi (i.e., queries whose subject is Ci). If we want to evaluate the query T(C) ⊑ D (are all typical C elements also D elements?) in module mi (assuming that C concerns subject Ci), we can answer the query using the <i relation, by checking whether min<i(C I) ⊆ DI. For instance, in Example 5, the query "are all typical PhD students young?" can be evaluated in module m2. The answer would be positive, as the property of students of being normally young is inherited by PhDStudent. The evaluation of a query in a specific module is something that is considered in context-based formalisms, such as the CKR framework (Bozzato, Eiter, and Serafini 2014), where there is a language construct eval(X, c) for evaluating a concept (or role) X in context c. The lexicographic orders <i and <j (for i ≠ j) do not need to agree.
For instance, in Example 5, for two domain elements x and y, we might have that x <1 y and y <2 x, as x is more typical than y as an employee, but less typical than y as a student. To answer a query T(C) ⊑ D, where C is a concept which is concerned with more than one subject in the knowledge base (e.g., are typical employed students young?), we need to combine the relations <i. A simple way of combining the modular partial order relations <i is to use Pareto combination. Let ≤i be defined as follows: x ≤i y iff y ≮i x. As <i is a modular partial order, ≤i is a total preorder. Given a canonical multi-concept lexicographic model M = ⟨∆, <1, . . . , <k, ·I⟩ of K, we define a global preference relation < on ∆ by condition (∗) below.
Definition 7. A preferential model Ni = ⟨∆, <i, ·I⟩ of Ki = ⟨T ∪ mi, A⟩ is a lexicographic model of Ki if ⟨∆, ·I⟩ is an ALC model of ⟨T, A⟩ and <i satisfies the following condition:

x <i y iff Vi(x) ≺i Vi(y)   (2)

Informally, <i gives higher preference to domain elements violating fewer typicality inclusions of mi with higher rank. In particular, for all x, y ∉ CiI, x ∼i y, i.e., all ¬Ci elements are assigned the same preference wrt <i, the least one, as they trivially satisfy all the typicality properties in mi. As in Lehmann's semantics, in a lexicographic model Ni = ⟨∆, <i, ·I⟩ of Ki, the preference relation <i is a strict modular partial order, i.e., an irreflexive, transitive and modular relation. As well-foundedness trivially holds for finite interpretations, a lexicographic model Ni of Ki is a ranked model of Ki.
Proposition 8. A lexicographic model Ni = ⟨∆, <i, ·I⟩ of Ki = ⟨T ∪ mi, A⟩ is a ranked model of Ki.
A multi-concept model for K can be defined as a multi-preference interpretation with a preference relation <i for each module mi.
Definition 9 (Multi-concept interpretation). Let K = ⟨T, D, m1, . . . , mk, A, s⟩ be a multi-concept knowledge base. A multi-concept interpretation M for K is a tuple ⟨∆, <1, . . .
, <k, ·I⟩ such that, for all i = 1, . . . , k, ⟨∆, <i, ·I⟩ is a ranked ALC + T interpretation, as defined in Section 2.
Definition 10 (Multi-concept lexicographic model). Let K = ⟨T, D, m1, . . . , mk, A, s⟩ be a multi-concept knowledge base. A multi-concept lexicographic model M = ⟨∆, <1, . . . , <k, ·I⟩ of K is a multi-concept interpretation for K such that, for all i = 1, . . . , k, Ni = ⟨∆, <i, ·I⟩ is a lexicographic model of Ki = ⟨T ∪ mi, A⟩.
A canonical multi-concept lexicographic model of K is a multi-concept lexicographic model of K such that ∆ and ·I are the domain and interpretation function of some canonical preferential model of ⟨T ∪ D, A⟩, according to Definition 3.
Definition 11 (Canonical multi-concept lexicographic model). Given a multi-concept knowledge base K = ⟨T, D, m1, . . . , mk, A, s⟩, a canonical multi-concept lexicographic model of K, M = ⟨∆, <1, . . . , <k, ·I⟩, is a multi-concept lexicographic model of K such that there is a canonical ALC + T model ⟨∆, <∗, ·I⟩ of ⟨T ∪ D, A⟩, for some <∗.
Observe that, restricting to the propositional fragment of the language (which does not allow universal and existential restrictions nor assertions), for a knowledge base K without strict inclusions and with a single module m1 with subject ⊤, containing all the typicality inclusions in K, the preference relation <1 corresponds to Lehmann's lexicographic closure semantics. The global preference relation < on ∆ is defined as follows:

x < y iff (i) for some i = 1, . . . , k, x <i y and (ii) for all j = 1, . . . , k, x ≤j y   (∗)

The resulting relation < is a partial order but, in general, modularity does not hold for <.
Definition 12. Given a canonical multi-concept lexicographic model M = ⟨∆, <1, . . . , <k, ·I⟩ of K, the combined lexicographic interpretation of M is a triple MP = ⟨∆, <, ·I⟩, where < is the global preference relation defined by (∗). We call MP a combined lexicographic model of K (shortly, an mcl-model of K).
Proposition 13. A combined lexicographic model MP of K is a preferential interpretation satisfying all the strict inclusions and assertions in K.
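Condition (∗) can be sketched operationally as follows: each modular order <i is modeled as a Boolean "strictly preferred" predicate, and the global relation holds when some module strictly prefers x to y and no module prefers y to x (since x ≤j y iff y ≮j x). The extensional encoding of the orders is an illustrative assumption.

```python
# Hedged sketch of the Pareto combination (*) of the modular orders.

def pareto_less(orders):
    """orders: list of strict-preference predicates <_1, ..., <_k.
    Global relation: x < y iff x <_i y for some i and, for all j,
    not (y <_j x)  (i.e., x <=_j y)."""
    def less(x, y):
        return (any(o(x, y) for o in orders)
                and all(not o(y, x) for o in orders))
    return less

# Bob is preferred as a student (<_1) but John as a young person (<_2):
lt1 = lambda x, y: (x, y) == ("bob", "john")
lt2 = lambda x, y: (x, y) == ("john", "bob")

less = pareto_less([lt1, lt2])
print(less("bob", "john"), less("john", "bob"))  # → False False
```

The disagreement between the two modular orders makes bob and john incomparable in <, mirroring the student/young-person discussion that follows; with a single module, < coincides with that module's ranked order.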
A combined lexicographic model MP of K is a preferential interpretation like those defined for ALC + T in Definition 2 (and, in general, it is not a ranked interpretation). However, the preference relation < in MP is not an arbitrary irreflexive, transitive and well-founded relation: it is obtained by first computing the lexicographic preference relations <i for the modules, and then by combining them into <. As MP satisfies all strict inclusions and assertions in K but is not required to satisfy all typicality inclusions T(C) ⊑ D in K (i.e., it is not required that min<(C I) ⊆ DI for all T(C) ⊑ D ∈ D), MP is not a preferential ALC + T model of K as defined in Section 2. Consider a situation in which there are two concepts, Student and YoungPerson, that are very related, in that students are normally young persons and young persons are normally students (i.e., T(Student) ⊑ YoungPerson and T(YoungPerson) ⊑ Student), and suppose there are two modules m1 and m2 such that s(m1) = Student and s(m2) = YoungPerson. The two classes may have different (and even contradictory) prototypical properties; for instance, normally students are quiet (e.g., when they are in their classrooms), T(Student) ⊑ Quiet, but normally young persons are not quiet, T(YoungPerson) ⊑ ¬Quiet. Considering the preference relations <1 and <2 associated with the two modules in a canonical multi-concept lexicographic model, we may have that, for two young persons Bob and John, who are also students, bob <1 john and john <2 bob, as Bob is quiet and John is not. Then, John and Bob are incomparable in the global relation <. Both of them, depending on the other prototypical properties of students and young persons, might be minimal among students wrt the global preference relation <. Hence, the set min<(Student I) is not necessarily a subset of min<1(Student I).
That is, typical students in the global relation may include instances (e.g., john) which do not satisfy all the typicality inclusions for Student, as they are (globally) incomparable with the elements in min<1(Student I). This implies that the notion of mcl-entailment (defined below) cannot be stronger than preferential entailment in Section 2. However, given the correspondence of mcl-models with the lexicographic closure in the case of a single module with subject ⊤ containing all the typicality inclusions in D, mcl-entailment cannot be weaker than preferential entailment either. In general, for a knowledge base K and a module mi, with s(mi) = Ci, the inclusion min<(CiI) ⊆ min<i(CiI) may not hold and, for this reason, a combined lexicographic interpretation may fail to satisfy all typicality inclusions. In this respect, canonical multi-concept lexicographic models are more liberal than KLM-style preferential models for typicality logics (Giordano et al. 2009), where all the typicality inclusions are required to be satisfied and, in the previous example, min<(Student I) ⊆ Quiet I must hold for the typicality inclusion to be satisfied. In fact, the knowledge base above is inconsistent in the preferential semantics and has no preferential model: from T(Student) ⊑ YoungPerson and T(YoungPerson) ⊑ Student, it follows that T(Student) = T(YoungPerson) should hold in all preferential models of the knowledge base, which is impossible given the conflicting typicality inclusions T(Student) ⊑ Quiet and T(YoungPerson) ⊑ ¬Quiet. To require that all typicality inclusions in K are satisfied in MP, the notion of mcl-model of K can be strengthened to that of a T-compliant mcl-model (Definition 14 below). Observe that an mcl T-model MP = ⟨∆, <, ·I⟩ of K = ⟨T, D, m1, . . . , mk, A, s⟩ is a KLM-style preferential model for the ALC + T knowledge base ⟨T ∪ D, A⟩, as defined in Section 2.
By contrast, the preference relation < in an mcl T-model is not an arbitrary irreflexive, transitive and well-founded relation, but is defined from the lexicographic preference relations <i according to condition (∗). We define a notion of multi-concept lexicographic entailment (mcl-entailment) in the obvious way: a query F is mcl-entailed by K (K |=mcl F) if, for all mcl-models MP = ⟨∆, <, ·I⟩ of K, F is satisfied in MP. Notice that a query T(C) ⊑ D is satisfied in MP when min<(C I) ⊆ DI. Similarly, a notion of mcl T-entailment can be defined: K |=mclT F if, for all mcl T-models MP = ⟨∆, <, ·I⟩ of K, F is satisfied in MP. As, for any multi-concept knowledge base K, the set of mcl T-models of K is a subset of the set of mcl-models of K, and there is some K for which the inclusion is proper (see, for instance, the student and young person example above), mcl T-entailment is stronger than mcl-entailment. It can be proved that both notions of entailment satisfy the KLM postulates of preferential consequence relations, which can be reformulated for a typicality logic, considering that typicality inclusions T(C) ⊑ D (Giordano et al. 2007) stand for conditionals C |∼ D in KLM preferential logics (Kraus, Lehmann, and Magidor 1990; Lehmann and Magidor 1992). See also (Booth et al. 2019) for the formulation of the KLM postulates in Propositional Typicality Logic (PTL). In the following proposition, we let "T(C) ⊑ D" mean that T(C) ⊑ D is mcl-entailed from a given knowledge base K. Proposition 15.
mcl-entailment satisfies the KLM postulates of preferential consequence relations, namely:
(REFL) T(C) ⊑ C
(LLE) If A ≡ B and T(A) ⊑ C, then T(B) ⊑ C
(RW) If C ⊑ D and T(A) ⊑ C, then T(A) ⊑ D
(AND) If T(A) ⊑ C and T(A) ⊑ D, then T(A) ⊑ C ⊓ D
(OR) If T(A) ⊑ C and T(B) ⊑ C, then T(A ⊔ B) ⊑ C
(CM) If T(A) ⊑ D and T(A) ⊑ C, then T(A ⊓ D) ⊑ C
Stated differently, the set of the typicality inclusions T(C) ⊑ D that are mcl-entailed from a given knowledge base K is closed under conditions (REFL)-(CM) above. For instance, (LLE) means that if A and B are equivalent concepts in ALC and T(A) ⊑ C is mcl-entailed from a given knowledge base K, then T(B) ⊑ C is also mcl-entailed from K; similarly for the other conditions (where inclusion C ⊑ D is entailed by K in ALC). It can be proved that mcl T-entailment also satisfies the KLM postulates of preferential consequence relations. It can be shown that both mcl-entailment and mcl T-entailment are not stronger than Lehmann's lexicographic closure in the propositional case.
Definition 14. A T-compliant mcl-model (or mcl T-model) MP = ⟨∆, <, ·I⟩ of K is an mcl-model of K such that all the typicality inclusions in K are satisfied in MP, i.e., for all T(C) ⊑ D ∈ D, min<(C I) ⊆ DI.
Let us consider again Example 5.
Example 16. Let us add another module m4 with subject Citizen to the knowledge base K, plus the following additional axioms in T:
Italian ⊑ Citizen
French ⊑ Citizen
Canadian ⊑ Citizen
Module m4 has subject Citizen, and contains the defeasible inclusions:
(d17) T(Italian) ⊑ DriveFast
(d18) T(Italian) ⊑ HomeOwner
Suppose the following typicality inclusion is also added to module m2:
(d19) T(PhDStudent) ⊑ ¬HomeOwner
What can we conclude about typical Italian PhD students? We can see that neither the inclusion T(PhDStudent ⊓ Italian) ⊑ HomeOwner nor the inclusion T(PhDStudent ⊓ Italian) ⊑ ¬HomeOwner is mcl-entailed by K. In fact, in all canonical multi-concept lexicographic models M = ⟨∆, <1, . . .
, <4, ·I⟩ of K, all elements in min<2((PhDStudent ⊓ Italian)I) (the minimal Italian PhD students wrt <2) have a scholarship, are bright, are not home owners (which are typical properties of PhD students), have classes and are young (which are properties of students not overridden for PhD students). On the other hand, all elements in min<4((PhDStudent ⊓ Italian)I) (i.e., the minimal Italian PhD students wrt <4) have the properties that they drive fast and are home owners. As the <2-minimal and the <4-minimal PhDStudent ⊓ Italian elements are incomparable wrt <, the <-minimal Italian PhD students will include them all. Hence, min<((PhDStudent ⊓ Italian)I) ⊈ HomeOwner I and min<((PhDStudent ⊓ Italian)I) ⊈ (¬HomeOwner)I. Brewka (1989) recognized a similar problem in his logical framework for default reasoning, leading to a generalization of the approach to allow a partial ordering between premises. The example above shows that our approach, using ranked preferences for the single modules but a non-ranked global preference relation < for their combination, does not suffer from this problem, provided a suitable modularization is chosen (in the example above, obtained by separating the typical properties of Italians and those of students in different modules).

6 Further issues: Reasoning with a hierarchy of modules and user-defined preferences

The approach considered in Section 4 does not allow reasoning with a hierarchy of modules; it considers a flat collection of modules m1, . . . , mk, each module concerning some subject Ci. As we have seen, a module mi may contain defeasible inclusions referring to subclasses of Ci, such as PhDStudent in the case of module m2 with subject Student. When defining the preference relation <i, the lexicographic closure semantics already takes into account the specificity relation among concepts within the module (e.g., the fact that PhDStudent is more specific than Student).
However, nothing prevents us from defining two modules mi (with subject Ci) and mj (with subject Cj) such that concept Cj is more specific than concept Ci. For instance, as a variant of Example 5, we might have introduced two different modules, m2 with subject Student and m5 with subject PhDStudent. As concept PhDStudent is more specific than concept Student (in particular, PhDStudent ⊑ Student is entailed from the strict part T of the knowledge base in ALC), the specificity information should be taken into account when combining the preference relations. More precisely, preference <5 should override preference <2 when comparing PhDStudent-instances. This is the principle followed by Giordano and Theseider Dupré (2020) to define a global preference relation, in the case when each module with subject Ci only contains typicality inclusions of the form T(Ci) ⊑ D. There, a more sophisticated way of combining the preference relations <i into a global relation <, refining the Pareto combination by exploiting the specificity relation among concepts, is used to deal with this case. While we refer to that paper for a detailed description of this more sophisticated notion of preference combination, let us observe that this solution could be applied as well to the modular multi-concept knowledge bases considered in this paper, provided an irreflexive and transitive notion of specificity among modules is defined. Another aspect that has been considered in the previously mentioned paper is the possibility of assigning ranks to the defeasible inclusions associated with a given concept. While assigning a rank to all the typicality inclusions in the knowledge base may be awkward, people often have a clear idea about the relative importance of the properties of some specific concept. For instance, we may know that the defeasible property that students are normally young is more important than the property that students normally do not have a scholarship.
For small modules, which only contain typicality inclusions T(Ci) ⊑ D for a concept Ci, the specification of user-defined ranks of Ci's typical properties is a feasible option, and a ranked modular preference relation can be defined from it by using Brewka's # strategy from his framework of Basic Preference Descriptions for ranked knowledge bases (Brewka 2004). This alternative may coexist with the use of the lexicographic closure semantics built from the rational closure ranking for larger modules.
The home owner example is a reformulation of the example used by Geffner and Pearl to show that the rational closure of conditional knowledge bases sometimes gives too strong conclusions, as "conflicts among defaults that should remain unresolved, are resolved anomalously" (Geffner and Pearl 1992). Informally, if defaults (d18) and (d19) are conflicting for Italian PhD students before the addition of any default which makes PhD students exceptional wrt students (in our formalization, default (d10)), they should remain conflicting after this addition. Instead, in the propositional case, both the rational closure (Lehmann and Magidor 1992) and Lehmann's lexicographic closure (1995) would entail that normally Italian PhD students are not home owners. This conclusion is unwanted, and is based on the fact that (d18) has rank 0, while (d19) has rank 1 in the rational closure ranking. On the other hand, T(PhDStudent ⊓ Italian) ⊑ ¬HomeOwner is neither mcl-entailed nor mcl T-entailed from K. Both notions of entailment, when restricted to the propositional case, cannot be stronger than Lehmann's lexicographic closure. Geffner and Pearl's Conditional Entailment (1992) does not suffer from the above-mentioned problem, as it is based on (non-ranked) preferential models. The same problem, which is related to the representation of preferences as levels of reliability, has also been recognized by Brewka (1989).
A mixed approach, integrating user-specified preferences with the rational closure ranking for the same module, might be an interesting alternative. This integration, however, does not necessarily provide a total preorder among typicality inclusions, which is our starting point for defining the modular preferences <i and their combination. Alternative semantic constructions should be considered for dealing with this case. According to the choice of fine-grained or coarse-grained modules, to the choice of the preferential semantics for each module (e.g., based on a user-specified ranking, on Lehmann's lexicographic closure, on the rational closure, etc.), and to the presence of a specificity relation among modules, alternative preferential semantics for modularized multi-concept knowledge bases can emerge. Defeasible extensions of DLs incorporate features from most of the NMR formalisms in the literature. In addition to those already mentioned in the introduction, let us recall the work by Straccia on inheritance reasoning in hybrid KL-One style logics (1993), the work on defaults in DLs (Baader and Hollunder 1995), on description logics of minimal knowledge and negation as failure (Donini, Nardi, and Rosati 2002), on circumscriptive DLs (Bonatti, Lutz, and Wolter 2009; Bonatti, Faella, and Sauro 2011), and the generalization of rational closure to all description logics (Bonatti 2019), as well as the combination of description logics and rule-based languages (Eiter et al. 2008; Eiter et al. 2011; Motik and Rosati 2010; Knorr, Hitzler, and Maier 2012; Gottlob et al. 2014; Giordano and Theseider Dupré 2016; Bozzato, Eiter, and Serafini 2018).
Our multi-preference semantics is related to the multipreference semantics for ALC developed by Gliozzi (2016), which is based on the idea of refining the rational closure construction by considering the preference relations <Ai associated with different aspects; however, we follow a different route concerning the definition of the preference relations associated with modules, and the way of combining them into a single preference relation. In particular, defining a refinement of the rational closure semantics is not our aim in this paper, as we prefer to avoid some unwanted conclusions of the rational and lexicographic closure while exploiting their good inference properties. The idea of having different preference relations, associated with different typicality operators, has been studied by Gil (2014) to define a multipreference formulation of the typicality DL ALC + Tmin mentioned above. As a difference, in this proposal we associate preferences with modules and their subject, and we combine the different preferences into a single global one. An extension of DLs with multiple preferences has also been developed by Britz and Varzinczak (2018; 2019) to define defeasible role quantifiers and defeasible role inclusions, by associating multiple preference relations with roles. The relation of our semantics to the lexicographic closure for ALC by Casini and Straccia (2010; 2013) should be investigated. A major difference is in the choice of the rational closure ranking for ALC, but it would be interesting to check whether their construction corresponds to our semantics in the case of a single module m1 with subject ⊤, when the same rational closure ranking is used. Bozzato et al. present extensions of the CKR (Contextualized Knowledge Repositories) framework by Bozzato et al.
(2014; 2018), in which defeasible axioms are allowed in the global context and exceptions can be handled by overriding and have to be justified in terms of semantic consequence, considering sets of clashing assumptions for each defeasible axiom. An extension of this approach to deal with general contextual hierarchies has been studied by the same authors (Bozzato, Eiter, and Serafini 2019), by introducing a coverage relation among contexts and defining a notion of preference among clashing assumptions, which is used to define a preference relation among justified CAS models, based on which CKR models are selected. An ASP-based reasoning procedure, complete for instance checking, is developed for SROIQ-RL.
7 Conclusions and related work
In this paper, we have proposed a modular multi-concept extension of the lexicographic closure semantics, based on the idea that defeasible properties in the knowledge base can be distributed into different modules, for which alternative preference relations can be computed. Combining multiple preferences into a single global preference allows for a new preferential semantics and a notion of multi-concept lexicographic entailment (mcl-entailment) which, in the propositional case, is not stronger than the lexicographic closure. mcl-entailment satisfies the KLM postulates of a preferential consequence relation. It retains some good properties of the lexicographic closure, being able to deal with irrelevance and with specificity within the single modules, and not being subject to the “blockage of property inheritance” problem. The combination of different preference relations provides a simple solution to a problem, recognized by Geffner and Pearl, that the rational closure of conditional knowledge bases sometimes gives too strong conclusions, as “conflicts among defaults that should remain unresolved, are resolved anomalously” (Geffner and Pearl 1992). This problem also affects the lexicographic closure, which is stronger than the rational closure.
Our approach, using ranked preferences for the single modules but a non-ranked preference < for their combination, does not suffer from this problem, provided a suitable modularization is chosen. Like Geffner and Pearl's Conditional Entailment (Geffner and Pearl 1992), some non-monotonic DLs, such as ALC + Tmin, a typicality DL with a minimal model preferential semantics (Giordano et al. 2013a), and the non-monotonic description logic DLN (Bonatti et al. 2015), which supports normality concepts based on a notion of overriding, do not suffer from the problem above. Reasoning about exceptions in ontologies has led to the development of many non-monotonic extensions of Description Logics (DLs), incorporating non-monotonic features from most of the NMR formalisms in the literature. For the lightweight description logic EL+⊥, an Answer Set Programming (ASP) approach has been proposed (Giordano and Theseider Dupré 2020) for defeasible inference in a multipreference extension of EL+⊥, in the specific case in which each module only contains the defeasible inclusions T(Ci) ⊑ D for a single concept Ci, where the ranking of defeasible inclusions is specified in the knowledge base, following the approach by Gerhard Brewka in his framework of Basic Preference Descriptions for ranked knowledge bases (Brewka 2004). A specificity relation among concepts is also considered. The ASP encoding exploits asprin (Brewka et al. 2015), formulating multipreference entailment as a problem of computing preferred answer sets, which is proved to be Π^p_2-complete. For EL+⊥ knowledge bases, we aim at extending this ASP encoding to deal with the modular multi-concept lexicographic closure semantics proposed in this paper, as well as with a more general framework, allowing for different choices of preferential semantics for the single modules and for different specificity relations for combining them.
For lightweight description logics of the EL family (Baader, Brandt, and Lutz 2005), the ranking of concepts determined by the rational closure construction can be computed in polynomial time in the size of the knowledge base (Giordano and Theseider Dupré 2018; Casini, Straccia, and Meyer 2019). This suggests that we may expect a Π^p_2 upper bound on the complexity of multi-concept lexicographic entailment.
Booth, R.; Casini, G.; Meyer, T.; and Varzinczak, I. 2019. On rational entailment for propositional typicality logic. Artif. Intell. 277.
Bozzato, L.; Eiter, T.; and Serafini, L. 2014. Contextualized knowledge repositories with justifiable exceptions. In DL 2014, volume 1193 of CEUR Workshop Proceedings, 112–123.
Bozzato, L.; Eiter, T.; and Serafini, L. 2018. Enhancing context knowledge repositories with justifiable exceptions. Artif. Intell. 257:72–126.
Bozzato, L.; Eiter, T.; and Serafini, L. 2019. Justifiable exceptions in general contextual hierarchies. In Bella, G., and Bouquet, P., eds., Modeling and Using Context - 11th International and Interdisciplinary Conference, CONTEXT 2019, Trento, Italy, November 20-22, 2019, Proceedings, volume 11939 of Lecture Notes in Computer Science, 26–39. Springer.
Brewka, G.; Delgrande, J. P.; Romero, J.; and Schaub, T. 2015. asprin: Customizing answer set preferences without a headache. In Proc. AAAI 2015, 1467–1474.
Brewka, G. 1989. Preferred subtheories: An extended logical framework for default reasoning. In Proceedings of the 11th International Joint Conference on Artificial Intelligence, Detroit, MI, USA, August 1989, 1043–1048.
Brewka, G. 2004. A rank based description language for qualitative preferences. In Proceedings of the 16th European Conference on Artificial Intelligence, ECAI'2004, Valencia, Spain, August 22-27, 2004, 303–307.
Britz, K., and Varzinczak, I. J. 2018. Rationality and context in defeasible subsumption. In Proc. 10th Int. Symp. on Found.
of Information and Knowledge Systems, FoIKS 2018, Budapest, May 14-18, 2018, 114–132.
Britz, A., and Varzinczak, I. 2019. Contextual rational closure for defeasible ALC (extended abstract). In Proc. 32nd International Workshop on Description Logics, Oslo, Norway, June 18-21, 2019.
Britz, K.; Heidema, J.; and Meyer, T. 2008. Semantic preferential subsumption. In Brewka, G., and Lang, J., eds., Principles of Knowledge Representation and Reasoning: Proceedings of the 11th International Conference (KR 2008), 476–484. Sydney, Australia: AAAI Press.
Casini, G., and Straccia, U. 2010. Rational Closure for Defeasible Description Logics. In Janhunen, T., and Niemelä, I., eds., Proc. 12th European Conf. on Logics in Artificial Intelligence (JELIA 2010), volume 6341 of LNCS, 77–90. Helsinki, Finland: Springer.
Casini, G., and Straccia, U. 2012. Lexicographic Closure for Defeasible Description Logics. In Proc. of Australasian Ontology Workshop, vol. 969, 28–39.
Casini, G., and Straccia, U. 2013. Defeasible inheritance-based description logics. Journal of Artificial Intelligence Research (JAIR) 48:415–473.
Casini, G.; Meyer, T.; Varzinczak, I. J.; and Moodley, K. 2013. Nonmonotonic Reasoning in Description Logics: Rational Closure for the ABox. In 26th International Workshop on Description Logics (DL 2013), volume 1014 of CEUR Workshop Proceedings, 600–615.
Acknowledgement: We thank the anonymous referees for their helpful comments and suggestions. This research is partially supported by INDAM-GNCS Project 2019.
References
Baader, F., and Hollunder, B. 1995. Embedding defaults into terminological knowledge representation formalisms. J. Autom. Reasoning 14(1):149–180.
Baader, F.; Calvanese, D.; McGuinness, D.; Nardi, D.; and Patel-Schneider, P. 2007. The Description Logic Handbook - Theory, Implementation, and Applications, 2nd edition. Cambridge.
Baader, F.; Brandt, S.; and Lutz, C. 2005. Pushing the EL envelope. In Kaelbling, L., and Saffiotti, A., eds., Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI 2005), 364–369.
Edinburgh, Scotland, UK: Professional Book Center.
Bonatti, P. A.; Faella, M.; Petrova, I.; and Sauro, L. 2015. A new semantics for overriding in description logics. Artif. Intell. 222:1–48.
Bonatti, P. A.; Faella, M.; and Sauro, L. 2011. Defeasible inclusions in low-complexity DLs. J. Artif. Intell. Res. (JAIR) 42:719–764.
Bonatti, P. A.; Lutz, C.; and Wolter, F. 2009. The Complexity of Circumscription in DLs. Journal of Artificial Intelligence Research (JAIR) 35:717–773.
Bonatti, P. A. 2019. Rational closure for all description logics. Artif. Intell. 274:197–223.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. L. 2015. Semantic characterization of rational closure: From propositional logic to description logics. Artificial Intelligence 226:1–33.
Gliozzi, V. 2016. Reasoning about multiple aspects in rational closure for DLs. In Proc. AI*IA 2016 - XVth International Conference of the Italian Association for Artificial Intelligence, Genova, Italy, November 29 - December 1, 2016, 392–405.
Gottlob, G.; Hernich, A.; Kupke, C.; and Lukasiewicz, T. 2014. Stable model semantics for guarded existential rules and description logics. In Proc. KR 2014.
Knorr, M.; Hitzler, P.; and Maier, F. 2012. Reconciling OWL and non-monotonic rules for the semantic web. In ECAI 2012, 474–479.
Kraus, S.; Lehmann, D.; and Magidor, M. 1990. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence 44(1-2):167–207.
Lehmann, D., and Magidor, M. 1992. What does a conditional knowledge base entail? Artificial Intelligence 55(1):1–60.
Lehmann, D. J. 1995. Another perspective on default reasoning. Ann. Math. Artif. Intell. 15(1):61–82.
Motik, B., and Rosati, R. 2010. Reconciling Description Logics and rules. Journal of the ACM 57(5).
Pearl, J. 1990. System Z: A Natural Ordering of Defaults with Tractable Applications to Nonmonotonic Reasoning. In Parikh, R., ed., TARK (3rd Conference on Theoretical Aspects of Reasoning about Knowledge), 121–135.
Pacific Grove, CA, USA: Morgan Kaufmann.
Straccia, U. 1993. Default inheritance reasoning in hybrid KL-ONE-style logics. In Bajcsy, R., ed., Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI 1993), 676–681. Chambéry, France: Morgan Kaufmann.
Casini, G.; Meyer, T.; Moodley, K.; and Nortje, R. 2014. Relevant closure: A new form of defeasible reasoning for description logics. In JELIA 2014, LNCS 8761, 92–106. Springer.
Casini, G.; Straccia, U.; and Meyer, T. 2019. A polynomial time subsumption algorithm for nominal safe ELO⊥ under rational closure. Inf. Sci. 501:588–620.
Donini, F. M.; Nardi, D.; and Rosati, R. 2002. Description logics of minimal knowledge and negation as failure. ACM Transactions on Computational Logic (ToCL) 3(2):177–225.
Eiter, T.; Ianni, G.; Lukasiewicz, T.; Schindlauer, R.; and Tompits, H. 2008. Combining answer set programming with description logics for the semantic web. Artif. Intell. 172(12-13):1495–1539.
Eiter, T.; Ianni, G.; Lukasiewicz, T.; and Schindlauer, R. 2011. Well-founded semantics for description logic programs in the semantic web. ACM Trans. Comput. Log. 12(2):11.
Geffner, H., and Pearl, J. 1992. Conditional entailment: Bridging two approaches to default reasoning. Artif. Intell. 53(2-3):209–244.
Gil, O. F. 2014. On the Non-Monotonic Description Logic ALC+Tmin. CoRR abs/1404.6566.
Giordano, L., and Gliozzi, V. 2019. Reasoning about exceptions in ontologies: An approximation of the multipreference semantics. In Symbolic and Quantitative Approaches to Reasoning with Uncertainty, 15th European Conference, ECSQARU 2019, Belgrade, Serbia, September 18-20, 2019, Proceedings, 212–225.
Giordano, L., and Theseider Dupré, D. 2016. ASP for minimal entailment in a rational extension of SROEL. TPLP 16(5-6):738–754. DOI: 10.1017/S1471068416000399.
Giordano, L., and Theseider Dupré, D. 2018.
Defeasible Reasoning in SROEL from Rational Entailment to Rational Closure. Fundam. Inform. 161(1-2):135–161.
Giordano, L., and Theseider Dupré, D. 2020. An ASP approach for reasoning in a concept-aware multipreferential lightweight DL. CoRR abs/2006.04387.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. L. 2007. Preferential Description Logics. In Dershowitz, N., and Voronkov, A., eds., Proceedings of LPAR 2007 (14th Conference on Logic for Programming, Artificial Intelligence, and Reasoning), volume 4790 of LNAI, 257–272. Yerevan, Armenia: Springer-Verlag.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. L. 2009. ALC+T: a preferential extension of Description Logics. Fundamenta Informaticae 96:1–32.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. L. 2013a. A Non-Monotonic Description Logic for Reasoning About Typicality. Artificial Intelligence 195:165–202.
Giordano, L.; Gliozzi, V.; Olivetti, N.; and Pozzato, G. 2013b. Minimal Model Semantics and Rational Closure in Description Logics. In 26th International Workshop on Description Logics (DL 2013), volume 1014, 168–180.

An Approximate Model Counter for ASP∗
Flavio Everardo1,2, Markus Hecher1,3, Ankit Shukla4
1 University of Potsdam, Germany 2 Tecnológico de Monterrey Puebla Campus, Mexico 3 TU Wien, Vienna, Austria 4 JKU, Linz, Austria
flavio.everardo@cs.uni-potsdam.de, mhecher@gmail.com, ankit.shukla@jku.at
Abstract
these cover also applications in machine learning and probabilistic inference (Chavira and Darwiche 2008). In terms of computational complexity, counting has been well-studied since the late 1970s (Durand, Hermann, and Kolaitis 2005; Hemaspaandra and Vollmer 1995; Valiant 1979b; 1979a). There are also results for counting involving projection, where one wants to count only with respect to a given set of projected atoms, which has been established for logic (Aziz et al. 2015; Capelli and Mengel 2019; Fichte et al. 2018; Lagniez and Marquis 2019; Gupta et al.
2019; Sharma et al. 2019), reliability estimation (Dueñas-Osorio et al. 2017), as well as in ASP (Gebser, Kaufmann, and Schaub 2009; Aziz 2015; Fichte and Hecher 2019). Given that in general counting answer sets is rather hard, namely # · coNP-complete (Fichte et al. 2017; Durand, Hermann, and Kolaitis 2005), which further increases to # · Σ^p_2-completeness (Fichte and Hecher 2019) when counting with respect to a projection, a different approach than exact counting seems to be required in practice. Indeed, such approaches were successful for propositional logic (SAT), which is # · P-complete and where reasoning modes like sampling (near-uniform generation) or (approximate) model counting (Gomes, Sabharwal, and Selman 2007a; Chakraborty, Meel, and Vardi 2013a; 2013b; Sharma et al. 2019) were studied. For this purpose, so-called parity (XOR) constraints are used to partition the search space into parts that are preferably of roughly the same size. Parity constraints have recently been accommodated in ASP as the fundamental part of the clingo-based system xorro (Everardo et al. 2019). Among the different solving approaches over parity constraints in xorro, these constraints amount to the classical XOR operator, following an aggregates-like syntax based on theory atoms. The constraints are interpreted as directives and are solved on top of an ASP program, filtering out answer sets that do not satisfy the parity constraints in question. While XOR constraints have mostly been applied in the neighboring area of SAT (Meel 2018), little attention has been paid to parity constraints, and to reasoning modes like sampling or approximate model counting, for ASP.
To this end, we present an extension of xorro towards approximate answer set counting, following the work of (Chakraborty, Meel, and Vardi 2013b) and benefiting from the advanced interfaces of the ASP solver clingo (Gebser et al. 2016).
Answer Set Programming (ASP) is a declarative framework that is well-suited for problems in KR, AI, and other areas with plenty of practical applications, both in academia and in industry. While modern ASP solvers not only compute one solution (answer set) but support different reasoning problems, the problem of counting answer sets has not been subject to intense study. This is in contrast to the neighboring area of propositional satisfiability (SAT), where several applications and problems related to quantitative reasoning trace back to model counting. However, due to the high computational complexity, and depending on the actual application, approximate counting might be sufficient. Indeed, there are plenty of applications where approximate counting for SAT is well-suited. This work deals with establishing approximate model counting for ASP, thereby lifting ideas from SAT to ASP. We present the first approximate counter for ASP by extending the clingo-based system xorro, and also show preliminary experiments for several problems. While we do not have specific guarantees in terms of accuracy, our preliminary results look promising.
1 Introduction
Answer Set Programming (ASP) (Lifschitz 1999; Brewka, Eiter, and Truszczyński 2011; Gebser et al. 2012) is a problem modeling and solving framework that is well-known in the area of knowledge representation and reasoning and artificial intelligence. This framework has been practically applied to several problems in both academia and industry (Balduccini, Gelfond, and Nogueira 2006; Niemelä, Simons, and Soininen 1999; Nogueira et al. 2001; Guziolowski et al. 2013; Schaub and Woltran 2018).1 Recently, there has been growing interest in counting solutions to problems.
Indeed, counting solutions is a well-known task not only in mathematics and computer science, but also in other areas (Chakraborty, Meel, and Vardi 2016; Domshlak and Hoffmann 2007; Gomes, Sabharwal, and Selman 2009; Sang, Beame, and Kautz 2005). Examples of
∗ The work has been supported by the Austrian Science Fund (FWF), Grants Y698 and P32830, and the Vienna Science and Technology Fund, Grant WWTF ICT19-065. It is also accepted for presentation at the ASPOCP'20 workshop (Everardo, Hecher, and Shukla 2020).
1 An incomplete but vast list of ASP applications: https://www.dropbox.com/s/pe261e4qi6bcyyh/aspAppTable.pdf
Our approach also benefits from the sophisticated solving techniques developed in SAT (e.g., the award-winning solver crypto-minisat (Soos, Nohl, and Castelluccia 2009)). While we do not yet have theoretical guarantees on the accuracy of our approach in general, the results look promising, and we hope that this will foster further research in approximate answer set counting.
Example 1 Assume a graph G consisting of vertices V = {a, b, c, d} and edges E = {{a, b}, {a, c}, {b, c}, {c, d}}. Then, the vertex cover problem asks for a set S ⊆ V of vertices such that for each edge e ∈ E we have S ∩ e ≠ ∅. An extension is the subset-minimal vertex cover problem, where we ask only for sets S such that no subset S′ ⊊ S is a vertex cover of G. We elegantly encode the computation of subset-minimal vertex covers into an ASP program Π as follows: for each edge {u, v} ∈ E, program Π contains the rule u ∨ v ←. Observe that the resulting program Π indeed precisely characterizes the subset-minimal vertex covers of G, which are {a, c}, {b, c}, and {a, b, d}.
2 Preliminaries
Computational Complexity.
We assume familiarity with standard notions in computational complexity (Papadimitriou 1994) and use counting complexity classes of the form # · C as defined in the literature (Toda and Watanabe 1992; Durand, Hermann, and Kolaitis 2005; Hemaspaandra and Vollmer 1995). Let Σ and Σ′ be finite alphabets, I ∈ Σ∗ an instance, and ‖I‖ denote the size of I. A witness function W : Σ∗ → 2^(Σ′∗) maps an instance I ∈ Σ∗ to its witnesses. A counting problem L : Σ∗ → N0 is a function that maps a given instance I ∈ Σ∗ to the cardinality of its set of witnesses |W(I)|. Let C be a decision complexity class, e.g., P. Then, # · C denotes the class of all counting problems whose witness function W satisfies (i) there is a function f : N0 → N0 such that for every instance I ∈ Σ∗ and every W ∈ W(I) we have |W| ≤ f(‖I‖), and f is computable in time O(‖I‖^c) for some constant c; and (ii) for every instance I ∈ Σ∗ and every candidate witness W ∈ Σ′∗, the problem of deciding whether W ∈ W(I) indeed holds is in the complexity class C. As a result, # · P is the complexity class consisting of all counting problems associated with decision problems in NP.
Answer Set Programming (ASP). We assume familiarity with propositional satisfiability (SAT) (Kleine Büning and Lettman 1999) and follow standard definitions of propositional ASP (Brewka, Eiter, and Truszczyński 2011). Let m, n, ℓ be non-negative integers such that m ≤ n ≤ ℓ, and let a1, . . ., aℓ be distinct propositional atoms. Moreover, we refer by literal to an atom or the negation thereof. A (logic) program Π is a set of rules of the form a1 ∨ · · · ∨ am ← am+1, . . . , an, ¬an+1, . . . , ¬aℓ. For brevity, we use choice rules (Simons, Niemelä, and Soininen 2002) of the form {a} ←, which is a shortcut for the two rules a ← ¬a′ and a′ ← ¬a, where a′ is a fresh atom. For a rule r, we let Hr = {a1, . . . , am}, Br+ = {am+1, . . . , an}, and Br− = {an+1, . . . , aℓ}.
We denote the sets of atoms occurring in a rule r or in a program Π by at(r) = Hr ∪ Br+ ∪ Br− and at(Π) = ⋃_{r∈Π} at(r). An interpretation I is a set of atoms. I satisfies a rule r if (Hr ∪ Br−) ∩ I ≠ ∅ or Br+ \ I ≠ ∅. I is a model of Π if it satisfies all rules of Π, in symbols I |= Π. The Gelfond-Lifschitz (GL) reduct of Π under I is the program Π^I obtained from Π by first removing all rules r with Br− ∩ I ≠ ∅ and then removing all ¬z where z ∈ Br− from every remaining rule r (Gelfond and Lifschitz 1991). I is an answer set of a program Π if I is a minimal model of Π^I. We denote the set of all answer sets of program Π by Sol(Π). The problem of deciding whether an ASP program has an answer set is called consistency, which is Σ^p_2-complete (Eiter and Gottlob 1995). If no rule uses disjunction, the complexity drops to NP-completeness (Bidoit and Froidevaux 1991; Marek and Truszczyński 1991).
Answer Set Counting (#ASP). The problem #ASP asks, for a given program Π, to compute the number of answer sets of Π. In general, the problem #ASP is # · coNP-complete (Fichte et al. 2017). If we restrict #ASP to normal programs without disjunction, the complexity drops to # · P-completeness, which is easy to see via standard reductions from and to propositional satisfiability (SAT) that preserve the number of answer sets, see, e.g., (Janhunen 2006).
Example 2 Recall the previous example, graph G and example program Π. Since G has 3 subset-minimal vertex covers, the solution to problem #ASP for the given program Π is 3.
3 Parity Constraints, xorro, and Search Space Partition
Towards the definition of parity constraints, let ⊤ and ⊥ stand for the Boolean constants true and false, respectively. Given atoms a1 and a2, the exclusive or (XOR, for short) of a1 and a2 is denoted a1 ⊕ a2, and it is satisfied if either a1 or a2 is true (but not both). Generalizing the idea to n distinct atoms a1, . . . , an, we obtain an n-ary XOR constraint (((a1 ⊕ a2) . . . ) ⊕ an) by multiple applications of ⊕.
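The definitions above are easy to exercise on Examples 1 and 2. The following brute-force sketch (our own illustration in plain Python; the triple encoding of rules and all function names are assumptions, not part of the paper's system) computes the GL reduct, checks minimal modelhood, and recovers the three answer sets of the vertex cover program Π:

```python
from itertools import combinations

# A rule is a triple (H_r, B_r^+, B_r^-) of sets of atoms
# (a hypothetical encoding of the rules defined above).
def reduct(program, I):
    """Gelfond-Lifschitz reduct of `program` under interpretation I."""
    return [(h, pos, set())
            for (h, pos, neg) in program
            if not (neg & I)]          # drop rules with B_r^- ∩ I ≠ ∅

def is_model(I, program):
    # I satisfies r iff (H_r ∪ B_r^-) ∩ I ≠ ∅ or B_r^+ \ I ≠ ∅
    return all((h | neg) & I or pos - I for (h, pos, neg) in program)

def proper_subsets(I):
    return (set(J) for r in range(len(I)) for J in combinations(I, r))

def is_answer_set(I, program):
    red = reduct(program, I)
    # I must be a minimal model of the reduct.
    return is_model(I, red) and not any(is_model(J, red)
                                        for J in proper_subsets(I))

# Π from Example 1: one disjunctive rule u ∨ v ← per edge.
E = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"c", "d"}]
prog = [(set(e), set(), set()) for e in E]
atoms = ["a", "b", "c", "d"]
Sol = [set(I) for r in range(len(atoms) + 1)
       for I in combinations(atoms, r) if is_answer_set(set(I), prog)]
print(len(Sol), Sol)  # 3 answer sets: the subset-minimal vertex covers
```

Since Π contains no negation, the reduct equals Π, and the answer sets coincide with the subset-minimal models, i.e., the three subset-minimal vertex covers of Example 2.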
Since it is satisfied iff an odd number of atoms among a1, . . . , an are true, we simply refer to it as an odd parity constraint, and it can be written as a1 ⊕ . . . ⊕ an due to associativity. Analogously, an even parity (or XOR) constraint is defined by a1 ⊕ . . . ⊕ an ⊕ ⊤, as it is satisfied iff an even number of atoms among a1, . . . , an hold. Then, e.g., a1 ⊕ a2 ⊕ ⊤ is satisfied iff none or both of a1 and a2 hold. Similarly, an even parity constraint can be represented in terms of an odd parity constraint with an uneven number of negated literals. For instance, ¬a1 ⊕ a2 is equivalent to a1 ⊕ a2 ⊕ ⊤, and pairs of negated literals cancel parity inversion; for example, ¬a1 ⊕ ¬a2 is equivalent to a1 ⊕ a2. Finally, XOR constraints of the forms a ⊕ ⊥ and a ⊕ ⊤ are called unary. To accommodate parity constraints in ASP's input language, we rely on clingo's theory language extension (Gebser et al. 2016), following the common syntax of aggregates (Gebser et al. 2015):
&even{1:p(1);4: not p(4);5:p(5)}.
&odd{2:not p(2);5: not p(5);6:p(6)}.
&odd{X:p(X), X > 2}.
&even{X:p(X), X < 3}.
&odd{5:p(5)}.
That is, we must partition Sol(Π) into cells with a roughly equal number of answer sets. The difficulty is resolved by the use of universal hashing (Carter and Wegman 1977). Universal hashing selects a hash function randomly from a family of hash functions with a low probability of collision between any two keys. By choosing a random hash function from the family, we randomize over arbitrary input distributions. This guarantees a low number of collisions in expectation, irrespective of how the data is chosen.
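The rewriting rules for negated literals stated above can be confirmed by a quick truth-table check (a minimal sketch; `xor` here is ordinary odd parity, not part of xorro):

```python
from itertools import product

# Truth-table check of the parity rewriting rules: one negated literal
# flips the parity of an odd constraint, and two negations cancel.
def xor(*vals):
    """Odd parity over the arguments (⊤ can be passed as the constant 1)."""
    return sum(vals) % 2 == 1

for a1, a2 in product([0, 1], repeat=2):
    # ¬a1 ⊕ a2 is equivalent to a1 ⊕ a2 ⊕ ⊤
    assert xor(1 - a1, a2) == xor(a1, a2, 1)
    # ¬a1 ⊕ ¬a2 is equivalent to a1 ⊕ a2
    assert xor(1 - a1, 1 - a2) == xor(a1, a2)
print("parity rewriting rules verified")
```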
That is, xorro extends the input language of clingo by the aggregate names &even and &odd, which are followed by a set whose elements are terms conditioned by conjunctions of literals separated by commas.2 From a search space of 64 answer sets in the context of the choice rule {p(1..6)}, the parity constraints shown above amount to the XOR operations p(1) ⊕ p(4) ⊕ p(5), p(2) ⊕ p(5) ⊕ p(6), p(3) ⊕ p(4) ⊕ p(5) ⊕ p(6), p(1) ⊕ p(2) ⊕ ⊤, and p(5) ⊕ ⊥, yielding the answer sets {p(5)} and {p(1),p(2),p(4),p(5),p(6)}. This means that each parity constraint divides the search space (roughly) in half, and for this symmetric search space example with five parity constraints (m = 5), we have 2^m cells, each with two answer sets. Currently, these constraints are interpreted as directives, filtering answer sets that do not satisfy the parity constraint in question.3 Hence, the first two constraints hold for any combination of an uneven number of literals from p(1), p(4), p(5) and p(2), p(5), p(6), respectively. The third constraint holds if an odd number of literals from p(3) to p(6) are true, while the fourth constraint requires that either none or both of the atoms p(1) and p(2) are included. The last parity constraint discards any answer set not containing the atom p(5). The solver xorro handles parity constraints in six different ways, switching between ASP encodings of parity constraints (eager approaches) and the use of theory propagators within clingo's Python interface (lazy approaches).4
Definition 1 A family of functions H = {h : U → [m]} is called a universal family if, for all x, y ∈ U with x ≠ y: Pr_{h∈H}[h(x) = h(y)] ≤ 1/m.
That is, if the hash function h is drawn randomly from H, the probability that any two distinct keys of the universe U collide is at most 1/m. This probability of collision is what we would expect if the hash function assigned a truly random hash code h(x) to every key x ∈ U.
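The worked example can be checked by brute force: with the unary constraint read as a constraint on p(5), as the surrounding text indicates, exactly two of the 64 interpretations induced by {p(1..6)} survive all five parity constraints. The sketch below is our own illustration:

```python
from itertools import product

# Brute-force check of the example: only two of the 64 interpretations
# induced by the choice rule {p(1..6)} satisfy all five parity
# constraints (the unary one forces p(5) to be true).
def odd(*v):
    return sum(v) % 2 == 1

def even(*v):
    return sum(v) % 2 == 0

surviving = []
for bits in product([0, 1], repeat=6):
    p1, p2, p3, p4, p5, p6 = bits
    if (odd(p1, p4, p5) and odd(p2, p5, p6) and odd(p3, p4, p5, p6)
            and even(p1, p2) and odd(p5)):
        surviving.append({i + 1 for i, v in enumerate(bits) if v})

print(surviving)  # [{5}, {1, 2, 4, 5, 6}]
```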
For model counting with strong guarantees this is not enough: we not only need every input to be hashed uniformly, but also independently. We need a family of k-wise independent hash functions (Wegman and Carter 1979). A random hash function h is k-wise independent if for all choices of distinct x1, . . . , xk the values h(x1), . . . , h(xk) are independent.
Definition 2 Let H = {h : U → [m]} be a family of hash functions. We call H k-wise independent if for any distinct x1, . . . , xk ∈ U and any y1, . . . , yk ∈ [m] we have Pr_{h∈H}[h(x1) = y1 ∧ . . . ∧ h(xk) = yk] = 1/m^k.
4 Universal hashing and XORs
One approach to solving #ASP is to count all the answer sets by (almost) full enumeration, known as exact counting. We focus instead on the problem of approximate model counting. In applications of model counting like probabilistic reasoning, planning with uncertainty, etc., it may be sufficient to approximate the solution count and avoid the overhead of exact model counting. Approximate counting computes the number of answer sets approximately using a probabilistic algorithm ApproxASP(Π, ǫ, δ). The algorithm takes a program Π together with a tolerance ǫ > 0 and a confidence 0 < δ ≤ 1 as input. The output is an estimate mc based on the parameters ǫ and δ. Proving theoretical bounds, and adapting the algorithm accordingly, is ongoing work. The central idea of the approximate model counting approach is the use of hash functions to partition the set Sol(Π) of answer sets of a program Π into roughly equal small cells. We then pick a random cell and scale its count by the number of cells to obtain an ǫ-approximate estimate of the model count. Note that a priori we do not know the distribution of the solution set Sol(Π) of answer sets.
We have to perform good hashing without knowledge of the distribution of the answer sets. Intuitively, a family H = {h : U → [m]} of hash functions is k-wise independent if for any distinct keys x1, . . . , xk ∈ U, the hash codes h(x1), . . . , h(xk) are independent random variables, and for any fixed x, h(x) is uniformly distributed in [m]. Let n, m and k be positive integers; we use H(n, m, k) to denote a family of k-wise independent hash functions mapping {0, 1}^n to {0, 1}^m. The canonical construction of a k-wise independent family is based on polynomials of degree k − 1. Let p ≥ |U| be prime. Picking random a0, . . . , ak−1 ∈ {0, . . . , p − 1}, the hash function is defined by:
h(x) = ((a_{k−1} x^{k−1} + . . . + a_1 x + a_0) mod p) mod m
For p ≫ m, the hash function is statistically close to k-wise independent. The 2-wise independent hash function is h(x) = ((a_1 x + a_0) mod p) mod m. The higher the k, the stronger the guarantee on the range of cell sizes. To encode k-wise independence we require a polynomial of degree k − 1; but the higher the k, the harder it is to solve the formula with these constraints. To balance this trade-off we use 3-wise independent hash functions. If we used k-wise independence for large k, all cells would be small and we would get uniform generation. Using 3-wise independence, we achieve that a random cell is small with high probability, known as almost-uniform generation.
2 In turn, multiple conditional terms within an aggregate are separated by semicolons.
3 XOR constraints cannot occur either in the bodies or heads of rules.
4 The distinction of eager and lazy approaches follows the methodology in Satisfiability modulo theories (Barrett et al. 2009).
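The polynomial construction above can be sketched in a few lines (the parameters p, m, k and all names below are chosen only for illustration):

```python
import random

# Sketch of the canonical construction above: a random polynomial of
# degree k-1 over GF(p), reduced mod m (all parameters illustrative).
def random_kwise_hash(p, m, k, rng):
    coeffs = [rng.randrange(p) for _ in range(k)]   # a_0, ..., a_{k-1}
    def h(x):
        acc = 0
        for a in reversed(coeffs):                  # Horner evaluation
            acc = (acc * x + a) % p
        return acc % m
    return h

p, m, k = 10007, 16, 3   # p >> m: statistically close to 3-wise independence
h = random_kwise_hash(p, m, k, random.Random(42))
counts = [0] * m
for x in range(1024):
    counts[h(x)] += 1
# the 1024 keys spread over the 16 cells, roughly 64 keys per cell
```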
Algorithm 2: CountAS(Π, S, pivot)
1 /* Assume z1, . . . , zn are the atoms of Π */
2 i = l = ⌊log2(pivot) − 1⌋
3 while ¬(1 ≤ |S| ≤ pivot) ∧ (i < n) do
4   i = i + 1
5   h ← Hxor(n, i − l, 3)
6   α ← {0, 1}^(i−l)
7   S = xorro(Π ∧ (h(z1, . . . , zn) = α), pivot + 1)
8 if |S| > pivot or |S| = 0 then
9   return ⊥
10 else
11   return |S| · 2^(i−l)

Algorithm 1: ApproxASP
1 ApproxASP(Π, ǫ, δ);
Result: Approximate number of answer sets, or ⊥
2 counter = 0; C = [ ];
3 pivot = 2 × ⌈3e^(1/2) (1 + 1/ǫ)^2⌉;
4 S = xorro(Π, pivot + 1);
5 if |S| ≤ pivot then
6   return |S|
7 else
8   iter = ⌈27 log2(3/δ)⌉;
9   while counter < iter do
10    c = CountAS(Π, S, pivot);
11    counter = counter + 1;
12    if c ≠ ⊥ then
13      AddToList(C, c)
14    end
15  end
16 end
17 return FindMean(C);

The resulting non-⊥ estimates of the ASP model count c returned by CountAS are appended to the list C. The final estimate of the model count returned by ApproxASP is the mean of the estimates stored in C, computed using FindMean(C). Algorithm CountAS takes as input an ASP program Π and a threshold pivot. It returns an ǫ-approximate estimate of the model count of the program Π. The algorithm uses random hash functions from Hxor(n, i − l, 3) to partition the model space of the program Π. This is done by choosing a random hash function h (line 5) and choosing the bits of α uniformly at random (line 6). Then it conjoins the chosen XOR constraints with the program Π and uses xorro to check whether it has at most pivot + 1 models. This process repeats (lines 5-7), and the loop terminates if either a randomly chosen cell is found to be small (|S| ≤ pivot) and non-empty, or if the number of cells generated exceeds 2^(n+1)/pivot. We scale the size of S by the number of cells generated by the corresponding hash function to compute an estimate of the model count. If all randomly chosen cells were either empty or not small, we return ⊥ and report a counting error.
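Under this reading of the algorithms, the CountAS/ApproxASP loop can be illustrated end-to-end on a toy search space. The sketch below replaces xorro by a brute-force filter over an explicit list of solutions and draws plain random XOR constraints instead of members of Hxor; all names are ours, and no accuracy guarantees are claimed:

```python
import math
import random
from itertools import product

def solve_upto(solutions, xors, bound):
    """Toy stand-in for xorro(Π ∧ XORs, bound): filter and truncate."""
    out = []
    for s in solutions:
        if all(sum(s[i] for i in idx) % 2 == rhs for idx, rhs in xors):
            out.append(s)
            if len(out) == bound:
                break
    return out

def approx_count(solutions, n, eps=0.8, delta=0.2, seed=0):
    rng = random.Random(seed)
    pivot = 2 * math.ceil(3 * math.e ** 0.5 * (1 + 1 / eps) ** 2)
    S = solve_upto(solutions, [], pivot + 1)
    if len(S) <= pivot:
        return len(S)                       # small enough: exact count
    estimates = []
    for _ in range(math.ceil(27 * math.log2(3 / delta))):
        i = 0
        while True:
            i += 1                          # add one more random XOR
            xors = [([j for j in range(n) if rng.random() < 0.5],
                     rng.randrange(2)) for _ in range(i)]
            S = solve_upto(solutions, xors, pivot + 1)
            if 1 <= len(S) <= pivot or i == n:
                break
        if 1 <= len(S) <= pivot:
            estimates.append(len(S) * 2 ** i)   # scale cell by #cells
    return sum(estimates) / len(estimates) if estimates else None

# 256 "answer sets": all assignments over 8 atoms.
sols = list(product([0, 1], repeat=8))
est = approx_count(sols, 8)
# est should lie in the vicinity of the exact count 256
```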
Algorithm We use an already evolved algorithm from the case of propositional satisfiability (Chakraborty, Meel, and Vardi 2013b) and lift it to ASP. For our work we use a specific family of hash functions denoted by Hxor (n, m, 3), to partition the set of models of an input formula into “small” cells. This family of hash functions has been used in (Gomes, Sabharwal, and Selman 2006; Chakraborty, Meel, and Vardi 2013b), and is shown to be 3-wise independent in (Gomes, Sabharwal, and Selman 2007b). Please refer to (Gomes, Sabharwal, and Selman 2006; 2007b; Chakraborty, Meel, and Vardi 2013b) for details. Our resulting system extends xorro (Everardo et al. 2019) and is called xampler5 . We assume that xampler has access to xorro (Everardo et al. 2019) that takes as input an ASP program Π′ possibly in conjunction with XOR constraints, as well as a bound b ≥ 0. The function xorro(Π′ , b) returns a set of models S of Π′ such that |S| = min(b, #Π′ ). The system xampler implements algorithm ApproxASP, which takes as input ASP program Π, a tolerance ǫ (0 < ǫ ≤ 1) and a confidence δ (0 < δ ≤ 1) as an input. It computes a threshold pivot that depends on ǫ to determine the chosen value of the size of a small cell. Then it checks if the input program Π has at least a pivot number of answer sets. It uses xorro to check if the input program has at least b (b = pivots + 1) answer sets. If the total number of answer sets S of Π is less than or equals to b the algorithm returns the answer set count |S|. Otherwise, the algorithm continues and calculates a parameter iter(≥ 1) that determines the number of times CountAS is invoked. Note that iter depends only on δ. Next, there are at most iter number of calls that are made to CountAS. The resulted in non-⊥ estimates of the 6 Experiments To test the algorithms above, we benchmarked our resulting approximate counter xampler, which extends xorro by these algorithms. 
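Before turning to the experiments, the parameter choices and a single counting round of ApproxASP can be sketched in Python. This is our own toy stand-in: it hashes an explicit set of integer "models" with random XOR rows instead of calling xorro, and names such as approx_count are ours, not xampler's:

```python
import math
import random
from statistics import mean

def pivot(eps):
    """Cell-size threshold of Algorithm 1 (line 2): 2*ceil(3*sqrt(e)*(1 + 1/eps)^2)."""
    return 2 * math.ceil(3 * math.sqrt(math.e) * (1 + 1 / eps) ** 2)

def count_round(models, n, piv, rng):
    """One CountAS round over a set of n-bit models (toy stand-in for xorro calls)."""
    l = i = math.floor(math.log2(piv) - 1)
    while i < n:
        i += 1
        # one random XOR row per constraint: n coefficient bits plus a parity bit
        rows = [rng.getrandbits(n + 1) for _ in range(i - l)]
        alpha = tuple(rng.getrandbits(1) for _ in rows)  # random target cell
        def h(z):
            return tuple(bin(r & ((z << 1) | 1)).count("1") % 2 for r in rows)
        size = sum(1 for z in models if h(z) == alpha)
        if 1 <= size <= piv:
            return size * 2 ** (i - l)  # scale cell size by the number of cells
    return None  # ⊥: no small non-empty cell was found

def approx_count(models, n, eps=0.8, delta=0.2, seed=2020):
    rng = random.Random(seed)
    piv = pivot(eps)
    if len(models) <= piv:
        return len(models)            # lines 4-5 of Algorithm 1: exact count
    iters = math.ceil(27 * math.log2(3 / delta))  # line 7 of Algorithm 1
    ests = [count_round(models, n, piv, rng) for _ in range(iters)]
    return mean(e for e in ests if e is not None)  # FindMean over non-⊥ rounds
```

Running approx_count on, say, 200 models over 8 atoms returns an estimate close to 200, since each surviving cell size is scaled back up by the number of cells.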
For now, we focus only on the quality of the counting, leaving scalability and performance for future work. To test the quality of the counting, we generated ten random instances from different ASP problem classes, aiming at counting graph colorings, subset-minimal vertex covers, solutions (witnesses) to the Schur decision problem, Hamiltonian paths, and subset-minimal independent dominating sets, as well as solving projected model counting on 2-QBFs (Durand, Hermann, and Kolaitis 2005; Kleine Büning and Lettman 1999).6 Note that projected model counting on 2-QBFs is proven to be #·coNP-complete (Durand, Hermann, and Kolaitis 2005). Further, we also suspect that both counting all subset-minimal vertex covers and counting all subset-minimal independent dominating sets are hard for this complexity class. At least there are no known polynomial encodings for SAT

5 The system xampler is open-source and available at https://github.com/flavioeverardo/xampler/.
6 The encodings and instances can be found at: https://tinyurl.com/approx-asp

Table 1: Approximate answer set count over random instances of the Graph Coloring problem.

inst  #Answer Sets   Best Median   Q     Best Mean    Q     Worst Median  Q     Worst Mean   Q
1     262,080        230,400       0.88  253,097      0.97  170,752       0.65  295,872      1.13
2     90,000         77,568        0.86  92,529       1.03  54,272        0.6   95,603       1.06
3     62,400         49,408        0.79  64,201       1.03  36,608        0.59  50,749       0.81
4     37,680         33,536        0.89  38,524       1.02  20,864        0.55  28,517       0.76
5     70,560         64,768        0.92  72,083       1.02  41,216        0.58  89,509       1.27
6     4,800          4,240         0.88  4,869        1.01  3,200         0.67  5,344        1.11
7     20,880         17,600        0.84  20,917       1     13,440        0.64  23,239       1.11
8     9,959,040      5,636,096     0.57  10,146,567   1.02  4,259,840     0.43  12,793,030   1.28
9     13,996,920     10,289,152    0.74  13,373,132   0.96  4,816,896     0.34  17,169,132   1.23
10    5,569,560      4,407,296     0.79  5,840,620    1.05  2,564,096     0.46  4,992,504    0.9
Average                            0.82               1.01                0.55               1.07
Table 2: Approximate answer set count over random instances of the Schur decision problem.

inst  #Answer Sets   Best Median   Q     Best Mean    Q     Worst Median  Q     Worst Mean   Q
1     104,640        90,112        0.86  104,866      1     54,528        0.52  92,076       0.88
2     23,856         14,912        0.63  24,143       1.01  5,344         0.22  16,867       0.71
3     71,136         69,888        0.98  74,317       1.04  55,296        0.78  79,189       1.11
4     1,537,680      1,327,104     0.86  1,592,656    1.04  1,048,576     0.68  1,378,273    0.9
5     104,640        89,088        0.85  107,800      1.03  59,392        0.57  83,326       0.8
6     608,400        552,960       0.91  585,332      0.96  312,320       0.51  740,732      1.22
7     2,530,080      1,941,504     0.77  2,592,145    1.02  1,314,816     0.52  2,122,486    0.84
8     1,261,008      686,080       0.54  1,268,456    1.01  332,800       0.26  1,081,334    0.86
9     23,756,544     12,713,984    0.54  23,627,783   0.99  8,650,752     0.36  25,093,716   1.06
10    8,406,048      4,554,752     0.54  8,339,475    0.99  2,314,240     0.28  7,244,022    0.86
Average                            0.75               1.01                0.47               0.92

Table 3: Approximate answer set count over random instances of the Hamiltonian Path problem.

inst  #Answer Sets   Best Median   Q     Best Mean    Q     Worst Median  Q     Worst Mean   Q
1     480,163        304,128       0.63  478,946      1     202,752       0.42  584,700      1.22
2     14,439         10,192        0.71  14,944       1.03  6,176         0.43  15,377       1.06
3     362,880        221,184       0.61  395,350      1.09  89,088        0.25  652,665      1.8
4     74,156         39,424        0.53  77,848       1.05  25,408        0.34  98,129       1.32
5     63,861         44,800        0.7   63,243       0.99  30,528        0.48  95,465       1.49
6     20,705         14,816        0.72  21,980       1.06  6,088         0.29  15,884       0.77
7     19,020         15,488        0.81  22,224       1.17  10,592        0.56  30,914       1.63
8     653,487        443,392       0.68  659,262      1.01  234,496       0.36  874,308      1.34
9     49,837         49,408        0.99  47,274       0.95  18,816        0.38  65,238       1.31
10    1,271,017      872,448       0.69  1,196,522    0.94  453,632       0.36  2,055,348    1.62
Average                            0.71               1.03                0.39               1.36
Table 4: Approximate answer set count over random instances of the Subset-Minimal Vertex Cover problem.

inst  #Answer Sets   Best Median   Q     Best Mean    Q     Worst Median  Q     Worst Mean   Q
1     46,737         46,592        1     45,171       0.97  36,224        0.78  50,838       1.09
2     58,538         55,296        0.94  60,324       1.03  40,448        0.69  49,216       0.84
3     6,405,610      6,193,152     0.97  6,449,549    1.01  5,701,632     0.89  7,413,423    1.16
4     5,330,500      5,210,112     0.98  5,174,303    0.97  3,604,480     0.68  4,619,493    0.87
5     7,460,775      7,536,640     1.01  7,446,894    1     6,717,440     0.9   6,787,805    0.91
6     5,733,125      5,734,400     1     5,811,551    1.01  4,620,288     0.81  5,966,710    1.04
7     187,928        183,296       0.98  188,198      1     155,648       0.83  195,252      1.04
8     919,808        909,312       0.99  916,914      1     962,560       1.05  963,723      1.05
9     493,431,189    492,830,720   1     494,853,532  1     353,370,112   0.72  440,323,668  0.89
10    5,785,344      5,799,936     1     5,868,650    1.01  6,307,840     1.09  6,407,616    1.11
Average                            0.99               1                   0.84               1

Table 5: Approximate answer set count over random instances of the Subset-Minimal Independent Dominating Set problem.

inst  #Answer Sets   Best Median   Q     Best Mean    Q     Worst Median  Q     Worst Mean   Q
1     47,730         51,200        1.07  87,210       1.83  40,960        0.86  104,170      2.18
2     41,113         40,960        1     40,715       0.99  24,320        0.59  60,945       1.48
3     118,985        137,216       1.15  237,918      2     147,456       1.24  363,503      3.06
4     133,564        143,360       1.07  232,852      1.74  154,112       1.15  287,719      2.15
5     33,792         57,344        1.7   79,616       2.36  65,536        1.94  80,630       2.39
6     12,800         20,480        1.6   20,293       1.59  22,528        1.76  39,082       3.05
7     99,215         71,680        0.72  143,568      1.45  60,416        0.61  201,910      2.04
8     103,471        85,632        0.83  79,759       0.77  25,984        0.25  167,701      1.62
9     48,970         49,152        1     72,946       1.49  38,912        0.79  85,419       1.74
10    57,266         61,440        1.07  103,509      1.81  63,488        1.11  117,394      2.05
Average                            1.12               1.6                 1.03               2.18
Table 6: Approximate answer set count over random instances of Projected Model Counting on 2-QBFs.

inst  #Answer Sets   Best Median   Q     Best Mean    Q     Worst Median  Q     Worst Mean   Q
1     32,716         32,768        1     53,620       1.64  65,536        2     55,050       1.68
2     65,320         65,536        1     96,483       1.48  65,536        1     102,221      1.56
3     130,854        131,072       1     179,361      1.37  131,072       1     283,704      2.17
4     260,463        262,144       1.01  400,777      1.54  522,240       2.01  522,240      2.01
5     928,467        1,638,400     1.76  3,167,573    3.41  3,014,656     3.25  4,969,813    5.35
6     1,045,622      1,048,576     1     1,530,013    1.46  1,048,576     1     1,643,081    1.57
7     4,168,467      6,291,456     1.51  7,689,557    1.81  8,323,072     2     7,542,101    1.84
8     8,294,512      8,388,608     1.01  4,882,538    0.59  131,072       0.02  12,558,336   1.51
9     8,346,882      8,388,608     1     12,582,912   1.51  8,388,608     1     14,198,285   1.7
10    522,316        524,288       1     664,462      1.27  786,432       1.51  785,521      1.5
Average                            1.13               1.61                1.48               2.09

that precisely capture the solutions to these problems. Hence, it is unlikely that one can easily approximate the number of solutions by means of approximate SAT counting. Also, to keep track of the counting, we made sure that these instances were "easy to solve" for clingo, meaning that clingo must enumerate all answer sets within a 600-second timeout (without printing). To get a feel for our initial counting experiments, we tried different values for both the tolerance and the confidence, exploring different cell sizes and numbers of iterations, as set in lines 2 and 7 of Algorithm 1, respectively. It is worth noting that these parameters directly affect the density of the parity constraints (lines 5 and 6 of Algorithm 2). These constraints follow the syntax and principles discussed in Sections 3 and 4, respectively. As part of the experimental setup, and for comparison, we also asked xorro (Everardo et al. 2019) to estimate the count by calculating the median, taken from the original ApproxMC algorithm in (Chakraborty, Meel, and Vardi 2013b).
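Incidentally, whether m sampled parity constraints really induce 2^m cells depends on their linear independence over GF(2). A quick rank check looks as follows; this is our own illustration, and the constraint encoding is hypothetical:

```python
def gf2_rank(rows):
    """Rank over GF(2) of XOR constraints encoded as integer bit vectors."""
    basis = {}  # highest set bit -> stored row
    for row in rows:
        while row:
            hb = row.bit_length() - 1
            if hb not in basis:
                basis[hb] = row
                break
            row ^= basis[hb]  # eliminate the leading bit and keep reducing
    return len(basis)

# Encode each constraint as coefficient bits plus a right-hand-side bit.
# With atoms a = bit 0, b = bit 1 and the RHS at bit 2 (here all RHS are 0,
# since "a XOR true" forces a = 0, and so on):
#   a forced 0 -> 0b001,  b forced 0 -> 0b010,  a XOR b forced 0 -> 0b011
xors = [0b001, 0b010, 0b011]
rank = gf2_rank(xors)   # 2: the third constraint is a linear combination
cells = 2 ** rank       # only 4 effective cells instead of the intended 8
```

A dependent XOR halves the effective number of cells, which is exactly the over-approximation effect discussed in the analysis of the results.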
The experiments were run sequentially under Ubuntu-based Elementary OS on a laptop with 16 GB of memory and a 2.60 GHz dual-core Intel Core i7 processor, using Python 3.7.6. Each benchmark instance (in smodels output format, generated offline with the grounder gringo, which is part of clingo (Gebser et al. 2016)) was run five times without any time restriction. As shown in Algorithm 1, a run finishes in one of two ways: either xorro returns the approximate answer set count, or it reports unsatisfiability. The results of our experiments are summarized in Tables 1-6, listing for each problem class the instances and their number of answer sets in the first two columns. The remainder of each table is divided into the best and the worst of the five runs. For both the median and the mean counts, we add a quality factor (Q) estimating the closeness to the total number of answer sets. The last row of each table displays the average Q for each count. In the first three tables, we see the pattern that the mean count achieves better results, even in its worst case. The medians, on the other hand, under-approximate the counts. For instance, in the Schur problem, the last three instances were under-approximated by almost 50%, lowering the average in the bottom line. However, for the subset-minimal vertex covers, both counts were almost exact on average; in this example, even the worst cases are close to an exact count. For the most complex problems, shown in Tables 5 and 6, the average counts over-approximate the number of solutions. However, the median's best case is close to an exact count: in Table 6, a Q of 1 was obtained in six out of ten instances. In these problems, the mean count over-approximates the number of answer sets, giving no proper estimates; the best-case scenario is about 60 percent above the desired number.
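The quality factor is simply the ratio between an aggregated estimate and the exact count. A small sketch (the run values below are made up, not taken from the tables):

```python
from statistics import mean, median

def quality(estimates, exact):
    """Quality factor Q for the median and the mean of a series of runs;
    Q close to 1 means the aggregated estimate is close to the exact count."""
    return (round(median(estimates) / exact, 2),
            round(mean(estimates) / exact, 2))

# five hypothetical runs on an instance with 262,080 answer sets
runs = [230400, 253097, 262144, 245760, 270336]
median_q, mean_q = quality(runs, 262080)
```

Q < 1 thus indicates under-approximation and Q > 1 over-approximation, matching how the table columns are read.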
It is also noticeable that in most cases the median count under-approximates the number of answer sets, while the opposite happens with the mean (over-approximation). The large deviations between the best and the worst cases correspond to one of two possible scenarios. If the count is an under-approximation, the partition was not well distributed, and some cells had too few or too many answer sets. In the opposite case, an over-approximation, our set of XORs contains linear combinations, i.e., linearly dependent equations, meaning that the partitioning does not correspond to the number of XORs. For instance, the conjunction of the XOR constraints a ⊕ ⊤ ∧ b ⊕ ⊤ ∧ a ⊕ b ⊕ ⊤ can be equivalently reduced to a ⊕ ⊤ ∧ b ⊕ ⊤. Returning to our example in Section 3, instead of counting |S| · 2^5 with |S| = 2, one linear combination doubles the number of answer sets in the resulting cell, so in this case |S| = 4 and the approximate count is 1024 instead of 64. As mentioned above, performance was not examined in this paper, so it is worth considering in further experiments testing all the different approaches of xorro. For the experiments above, we ran xorro with the lazy counting approach, which achieved the highest overall performance score among the six implementations. However, the random parity constraints generated during each counting iteration were quite small, meaning that other approaches, such as the Unit Propagation approach (Everardo et al. 2019), would benefit more from these XOR densities.

7 Conclusion and Future work

This paper discusses an extension of the ASP system xorro towards approximate answer set counting. This is established by lifting ideas from existing techniques for SAT to the formalism of ASP. While our preliminary results are promising and show that approximate counting does work for ASP, there is still potential for future work.
On the one hand, we aim to study and establish proper guarantees proving that our results are accurate with high probability and do not deviate far from the actual result. Further, we plan additional tuning and improvements of our preliminary implementation concerning several aspects, such as parity constraint solving, scalability, and the elimination of linear combinations. Regarding scalability, we need to perform more experiments and algorithmic revisions in order to outperform clingo's (clasp's) enumeration. We hope that this work fosters applications and further research on quantitative reasoning for ASP, e.g., (Kimmig et al. 2011; Tsamoura, Gutiérrez-Basulto, and Kimmig 2020).

References
Aziz, R. A.; Chu, G.; Muise, C.; and Stuckey, P. 2015. #(∃)SAT: Projected Model Counting. In Heule, M., and Weaver, S., eds., Proceedings of the 18th International Conference on Theory and Applications of Satisfiability Testing (SAT’15), 121–137. Austin, TX, USA: Springer.
Aziz, R. A. 2015. Answer Set Programming: Founded Bounds and Model Counting. Ph.D. Dissertation, Department of Computing and Information Systems, The University of Melbourne.
Balduccini, M.; Gelfond, M.; and Nogueira, M. 2006. Answer set based design of knowledge systems. Ann. Math. Artif. Intell. 47(1-2):183–219.
Barrett, C.; Sebastiani, R.; Seshia, S.; and Tinelli, C. 2009. Satisfiability modulo theories. In Biere, A.; Heule, M.; van Maaren, H.; and Walsh, T., eds., Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications. IOS Press. chapter 26, 825–885.
Bidoit, N., and Froidevaux, C. 1991. Negation by default and unstratifiable logic programs. Theoretical Computer Science 78(1):85–112.
Brewka, G.; Eiter, T.; and Truszczyński, M. 2011. Answer set programming at a glance. Communications of the ACM 54(12):92–103.
Capelli, F., and Mengel, S. 2019. Tractable QBF by knowledge compilation.
In Niedermeier, R., and Paul, C., eds., STACS 2019, volume 126 of LIPIcs, 18:1–18:16. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik.
Carter, L., and Wegman, M. N. 1977. Universal classes of hash functions (extended abstract). In Hopcroft, J. E.; Friedman, E. P.; and Harrison, M. A., eds., Proceedings of the 9th Annual ACM Symposium on Theory of Computing, May 4-6, 1977, Boulder, Colorado, USA, 106–112. ACM.
Chakraborty, S.; Meel, K.; and Vardi, M. 2013a. A scalable and nearly uniform generator of SAT witnesses. In Sharygina, N., and Veith, H., eds., Proceedings of the Twenty-fifth International Conference on Computer Aided Verification (CAV’13), volume 8044 of LNCS, 608–623. Springer.
Chakraborty, S.; Meel, K. S.; and Vardi, M. Y. 2013b. A scalable approximate model counter. In Schulte, C., ed., Principles and Practice of Constraint Programming - 19th International Conference, CP 2013, Uppsala, Sweden, September 16-20, 2013. Proceedings, volume 8124 of Lecture Notes in Computer Science, 200–216. Springer.
Chakraborty, S.; Meel, K. S.; and Vardi, M. Y. 2016. Improving approximate counting for probabilistic inference: From linear to logarithmic SAT solver calls. In Kambhampati, S., ed., Proceedings of 25th International Joint Conference on Artificial Intelligence (IJCAI’16), 3569–3576. New York City, NY, USA: The AAAI Press.
Chavira, M., and Darwiche, A. 2008. On probabilistic inference by weighted model counting. Artificial Intelligence 172(6–7):772–799.
Domshlak, C., and Hoffmann, J. 2007. Probabilistic planning via heuristic forward search and weighted model counting. J. Artif. Intell. Res. 30.
Dueñas-Osorio, L.; Meel, K. S.; Paredes, R.; and Vardi, M. Y. 2017. Counting-based reliability estimation for power-transmission grids. In Singh, S. P., and Markovitch, S., eds., Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI’17), 4488–4494. San Francisco, CA, USA: The AAAI Press.
Durand, A.; Hermann, M.; and Kolaitis, P. G. 2005.
Subtractive reductions and complete problems for counting complexity classes. Theoretical Computer Science 340(3):496–513. Eiter, T., and Gottlob, G. 1995. On the computational cost of disjunctive logic programming: Propositional case. Ann. Math. Artif. Intell. 15(3–4):289–323. Everardo, F.; Janhunen, T.; Kaminski, R.; and Schaub, T. 2019. The return of xorro. In Balduccini, M.; Lierler, Y.; and Woltran, S., eds., Logic Programming and Nonmonotonic Reasoning - 15th International Conference, LPNMR 2019, Philadelphia, PA, USA, June 3-7, 2019, Proceedings, volume 11481 of Lecture Notes in Computer Science, 284–297. Springer. Everardo, F.; Hecher, M.; and Shukla, A. 2020. Extending xorro with Approximate Model Counting. In ASPOCP@ICLP. Fichte, J. K., and Hecher, M. 2019. Treewidth and counting projected answer sets. In LPNMR’19, volume 11481 of LNCS, 105–119. Springer. Fichte, J. K.; Hecher, M.; Morak, M.; and Woltran, S. 2017. Answer set solving with bounded treewidth revisited. In LPNMR, volume 10377 of Lecture Notes in Computer Science, 132–145. Springer. Fichte, J. K.; Hecher, M.; Morak, M.; and Woltran, S. 2018. Exploiting treewidth for projected model counting and its limits. In SAT’18, volume 10929 of LNCS, 165–184. Springer. Gebser, M.; Kaminski, R.; Kaufmann, B.; and Schaub, T. 2012. Answer Set Solving in Practice. Morgan & Claypool. Gebser, M.; Harrison, A.; Kaminski, R.; Lifschitz, V.; and Schaub, T. 2015. Abstract gringo. TPLP 15(4-5):449–463. Gebser, M.; Kaminski, R.; Kaufmann, B.; Ostrowski, M.; Schaub, T.; and Wanko, P. 2016. Theory solving made easy with clingo 5. In Carro, M.; King, A.; Saeedloei, N.; and Vos, M. D., eds., Technical Communications of the 32nd International Conference on Logic Programming, ICLP 2016 TCs, October 16-21, 2016, New York City, USA, volume 52 of OASICS, 2:1–2:15. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik. Gebser, M.; Kaufmann, B.; and Schaub, T. 2009. Solution enumeration for projected boolean search problems. 
In van Hoeve, W.-J., and Hooker, J. N., eds., Proceedings of the 6th International Conference on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems (CPAIOR’09), volume 5547 of LNCS, 71–86. Berlin: Springer.
Gelfond, M., and Lifschitz, V. 1991. Classical negation in logic programs and disjunctive databases. New Generation Comput. 9(3/4):365–386.
Gomes, C. P.; Sabharwal, A.; and Selman, B. 2006. Model counting: A new strategy for obtaining good bounds. In AAAI, 54–61.
Gomes, C.; Sabharwal, A.; and Selman, B. 2007a. Near-uniform sampling of combinatorial spaces using XOR constraints. In Schölkopf, B.; Platt, J.; and Hofmann, T., eds., Proceedings of the Twentieth Annual Conference on Neural Information Processing Systems (NIPS’06), 481–488. MIT Press.
Gomes, C. P.; Sabharwal, A.; and Selman, B. 2007b. Near-uniform sampling of combinatorial spaces using XOR constraints. In Advances In Neural Information Processing Systems, 481–488.
Gomes, C. P.; Sabharwal, A.; and Selman, B. 2009. Chapter 20: Model counting. In Biere, A.; Heule, M.; van Maaren, H.; and Walsh, T., eds., Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications. Amsterdam, Netherlands: IOS Press. 633–654.
Gupta, R.; Sharma, S.; Roy, S.; and Meel, K. S. 2019. WAPS: Weighted and projected sampling. In Vojnar, T., and Zhang, L., eds., Proceedings of the 25th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’19), 59–76. Prague, Czech Republic: Springer. Held as part of the European Joint Conferences on Theory and Practice of Software.
Guziolowski, C.; Videla, S.; Eduati, F.; Thiele, S.; Cokelaer, T.; Siegel, A.; and Saez-Rodriguez, J. 2013. Exhaustively characterizing feasible logic models of a signaling network using answer set programming. Bioinformatics 29(18):2320–2326. Erratum see Bioinformatics 30, 13, 1942.
Hemaspaandra, L. A., and Vollmer, H. 1995.
The satanic notations: Counting classes beyond #P and other definitional adventures. SIGACT News 26(1):2–13. Janhunen, T. 2006. Some (in)translatability results for normal logic programs and propositional theories. Journal of Applied Non-Classical Logics 16(1-2):35–86. Kimmig, A.; Demoen, B.; Raedt, L. D.; Costa, V. S.; and Rocha, R. 2011. On the implementation of the probabilistic logic programming language problog. Theory Pract. Log. Program. 11(2-3):235–262. Kleine Büning, H., and Lettman, T. 1999. Propositional logic: deduction and algorithms. Cambridge University Press, Cambridge. Lagniez, J.-M., and Marquis, P. 2019. A recursive algorithm for projected model counting. In Hentenryck, P. V., and Zhou, Z.-H., eds., Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI’19). Lifschitz, V. 1999. Answer set planning. In ICLP, 23–37. MIT Press. Marek, W., and Truszczyński, M. 1991. Autoepistemic logic. J. of the ACM 38(3):588–619. Meel, K. S. 2018. Constrained counting and sampling: Bridging the gap between theory and practice. CoRR abs/1806.02239. Niemelä, I.; Simons, P.; and Soininen, T. 1999. Stable model semantics of weight constraint rules. In LPNMR’99, volume 1730 of LNCS, 317–331. Springer. Nogueira, M.; Balduccini, M.; Gelfond, M.; Watson, R.; and Barry, M. 2001. An A-Prolog decision support system for the Space Shuttle. In PADL’01, volume 1990 of LNCS, 169–183. Springer. Papadimitriou, C. H. 1994. Computational Complexity. Addison-Wesley. Sang, T.; Beame, P.; and Kautz, H. 2005. Performing bayesian inference by weighted model counting. In Veloso, M. M., and Kambhampati, S., eds., Proceedings of the 29th National Conference on Artificial Intelligence (AAAI’05). The AAAI Press. Schaub, T., and Woltran, S. 2018. Special issue on answer set programming. KI 32(2-3):101–103. Sharma, S.; Roy, S.; Soos, M.; and Meel, K. S. 2019. Ganak: A scalable probabilistic exact model counter. 
In Kraus, S., ed., Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI-19, 1169–1176. IJCAI. Simons, P.; Niemelä, I.; and Soininen, T. 2002. Extending and implementing the stable model semantics. Artif. Intell. 138(1-2):181–234. Soos, M.; Nohl, K.; and Castelluccia, C. 2009. Extending SAT solvers to cryptographic problems. In Kullmann, O., ed., Theory and Applications of Satisfiability Testing - SAT 2009, 12th International Conference, SAT 2009, Swansea, UK, June 30 - July 3, 2009. Proceedings, volume 5584 of Lecture Notes in Computer Science, 244–257. Springer. Toda, S., and Watanabe, O. 1992. Polynomial time 1-turing reductions from #ph to #p. Theor. Comput. Sci. 100(1):205– 221. Tsamoura, E.; Gutiérrez-Basulto, V.; and Kimmig, A. 2020. Beyond the grounding bottleneck: Datalog techniques for inference in probabilistic logic programs. In AAAI, 10284– 10291. AAAI Press. Valiant, L. G. 1979a. The complexity of computing the permanent. Theoretical Computer Science 8(2):189–201. Valiant, L. 1979b. The complexity of enumeration and reliability problems. SIAM J. Comput. 8(3):410–421. Wegman, M. N., and Carter, L. 1979. New classes and applications of hash functions. In 20th Annual Symposium on Foundations of Computer Science, San Juan, Puerto Rico, 29-31 October 1979, 175–182. IEEE Computer Society. 216 A Survey on Multiple Revision Fillipe Resina∗ , Renata Wassermann Universidade de São Paulo {fmresina, renata}@ime.usp.br Abstract 2012). Another possible scenario occurs when an agent receives a set of new beliefs and, based on its previous knowledge, selects the most reliable ones to incorporate. In some situations, it may be possible to reduce multiple revision to single revision, for example, taking the conjunction of all new sentences, but it is not always feasible. Thus, a framework for multiple changes is needed. 
In addition, it is not the same as applying revision in an iterated way (Darwiche and Pearl 1997), taking the input set and revising sequentially, one by one. In Multiple Revision, it is assumed that there is no preference over the input sentences, i.e., all of them have equal priority and should be processed at the same time. Besides that, since the order in which you would process the sentences can make a difference in the final result, working with iterated revision may cause an asymmetry. In (Delgrande and Jin 2012) it is also observed that, in many approaches to iterated revision, if an agent revises by a sentence and then by a sentence that is inconsistent with the previous one, then the agent’s beliefs are precisely the same as if only the second revision had been performed. We are also not going to address belief merging, a kind of change operation in which preceding and new beliefs play symmetric roles. For more information refer to (Fuhrmann 1997; Konieczny and Pérez 2002; Falappa et al. 2012). Given the importance of Multiple Revision, the purpose of this paper is to summarize the literature on the field, providing unified terminology and notation for readers looking for an overview. We also identify some limitations of the models and present some comparisons between them. Belief Revision deals with the problem of how a rational agent should proceed in face of new information. A revision occurs when an agent receives new information possibly inconsistent with its epistemic state and has to change it in order to accommodate the new belief in a consistent way. However, this new information may come as a set of beliefs (instead of a single one), a problem known as Multiple Revision. Unlike Iterated Revision, in Multiple Revision all information is processed simultaneously. The purpose of this survey is to bring and organize the state-of-the-art in the area, showing the different approaches developed since 1988 and the open problems that still exist. 
1 Introduction According to Gärdenfors (1988), it is not very useful to know how to represent knowledge if at the same time we do not know how to change it when we receive new information. The motivation of this idea is that knowledge is not static, what means that we need to be able to deal with its dynamics. That is the context of the studies in the area of Belief Revision, which aims to handle the problem of adding or removing new information to a knowledge base in a consistent way. Most of the literature about Belief Revision is based on the AGM paradigm, named after the authors of the seminal paper (Alchourrón, Gärdenfors, and Makinson 1985). In the AGM paradigm, given a set of beliefs, there are three possible changes in relation to a new belief: expansion, contraction and revision. Expansion occurs when the base simply absorbs the information without loss. A contraction consists in retracting beliefs from the base until the specified information is not derivable. Finally, revision happens when the new belief is added in a consistent way, possibly demanding a repair in order to eliminate inconsistency. In this survey, we are going to focus on this last operation. In the original framework, the new belief is assumed to be represented by a single formula. Nevertheless, there are situations in which the information by which we are going to revise comes in block, that is, a concurrent acceptance of a (possibly infinite) set of beliefs. Besides, when we deal with a multi-agent context it may be necessary to revise a belief state by another belief state, as pointed out in (Falappa et al. ∗ Notation In this article, we assume a formal language L and use Cn to represent an operator that returns the set of logical consequences of the input set. For atomic formulas, we use lowercase Greek letters (α, β, ...) and for sets of formulas, uppercase Latin letters (A, B, C,...). K is reserved to represent a belief V set (i.e. K = Cn(K)). We use ⊥ for the falsity constant. 
A stands for set conjunction. We denote the power set of S by P(S). Organization of the article Section 2 presents singleton revision. Section 3 shows the first works developed on Multiple Revision. Section 4 explores the context when the input is infinite, while Section 5 studies further approaches using systems of spheres. Section 6 gathers approaches using direct construction in package revision and Section 7 gath- Supported by the Brazilian funding agency CAPES 217 ers approaches in non-prioritized revision. Section 8 synthesizes alternative constructions based on core beliefs of a belief state. Finally, some conclusions and open problems are discussed. 2 The following representation theorem connects the construction to the rationality postulates: Theorem 1. (Alchourrón, Gärdenfors, and Makinson 1985) Let K be a belief set. An operator ∗ on K is a partial meet revision function iff ∗ satisfies (K ∗ 1) − (K ∗ 6). The Revision Operation For belief bases, the same construction of internal partial meet revision can be used (Hansson 1993). When an agent needs to accept new information inconsistent with its previous beliefs, it may be necessary to give up one or more beliefs to avoid the inconsistency. Aiming at information economy, only what is needed should be changed. This kind of change is known as revision. AGM revision (∗) receives a set K of beliefs, a new sentence α and returns a new set K ∗ α in which α was consistently added. The Levi Identity (Gärdenfors 1988) relates revision to contraction (−) and expansion (+): K ∗ α = (K − ¬α) + α Hansson (1993) called this operation internal revision, and proposed external revision as a reverse operation: K ± α = (K + α) − ¬α External revision does not usually work for belief sets, as if α is inconsistent with K, the expansion of K by α trivializes the set. Therefore, this kind of operation was introduced for belief bases (sets of sentences not necessarily closed under logical consequence). 
In (Hansson 1999b) the difference between internal and external revision is fully explored. The AGM theory defines some rationality postulates that a revision should obey: (K*1) K ∗ α is a belief set (K*2) α ∈ K ∗ α (K*3) K ∗ α ⊆ K ∪ {α} (K*4) If ¬α ∈ / Cn(K), then K ∗ α = Cn(K ∪ {α}) (K*5) If ¬α ∈ / Cn(∅), then K ∗ α is consistent under Cn (K*6) If Cn(α) = Cn(β), then K ∗ α = K ∗ β (K*7) K ∗ (α ∧ β) ⊆ Cn((K ∗ α) ∪ {β}) (K*8) Cn((K ∗ α) ∪ {β}) ⊆ K ∗ (α ∧ β), provided that ¬β ∈ / K ∗α These properties were later generalized for belief bases as will be seen in Section 3.3. 2.1 Kernel Revision Hansson (1994) proposed kernel contraction as an alternative construction in which we remove from a belief base B at least one element of each minimal subset of B that implies α (B ⊥⊥ α), obtaining a belief base that does not imply α. To perform these removals of elements, we use an incision function σ, i.e., a function that selects at least one sentence from each kernel. From the definition of kernel contraction one can obtain a kernel revision: Definition 2. The kernel revision on B based on an incision function σ is the operator ∗σ such that for all sentences α: B ∗σ α = (B \ σ(B ⊥⊥ ¬α)) ∪ {α} Systems of Spheres Grove (1988) proposed a construction for revision based on sets of possible worlds, defined as maximal consistent subsets of L. In what follows, the set of all possible worlds will be represented by ML . For any set R ⊆ ML , [R] denotes the set of possible worlds that contain R, i.e., [R] = {M ∈ ML : R ⊆ M }. If R is inconsistent this will be the empty set. The elements of [R] are the R-worlds. For a sentence ϕ ∈ L, [ϕ] is an abbreviation of [{ϕ}]. The elements of [ϕ] are the ϕ-worlds. Definition 3. (Grove 1988) Let X be a subset of ML . A system of spheres centered on X is a collection S ⊆ P(ML ), that satisfies the following conditions: (S1) S is totally ordered with respect to set inclusion; that is, if U , V ∈ S, then U ⊆ V or V ⊆ U. 
(S2) X ∈ S, and if U ∈ S then X ⊆ U.
(S3) ML ∈ S (and so it is the largest element of S).
(S4) For every ϕ ∈ L, if there is any element in S intersecting [ϕ], then there is also a smallest element in S intersecting [ϕ].
Constructions
We quickly recall three different constructions for revision:
Partial Meet Revision This operation was first suggested in (Alchourrón and Makinson 1982), explored in more detail in (Alchourrón, Gärdenfors, and Makinson 1985) and later generalized for belief bases, as can be seen in (Hansson 1999b). Given a set K and a formula α, the remainder set of K in relation to α (K⊥α) is formed by the maximal subsets of K that do not imply α. A selection function γ selects at least one element of K⊥α if it is not empty; otherwise, γ selects {K}. Finally, the partial meet contraction on K generated by γ is defined as K −γ α = ⋂γ(K⊥α). Partial meet revision is obtained by applying the Levi Identity to partial meet contraction:
Definition 1. (Alchourrón, Gärdenfors, and Makinson 1985) Let γ be a selection function for K. The operator ∗γ of (internal) partial meet revision for K is defined as: K ∗γ α = (K −γ ¬α) + α
The elements of S are called spheres. For any consistent sentence ϕ ∈ L, the smallest sphere in S intersecting [ϕ] is denoted by CS(ϕ), and fS(ϕ) denotes the set of ϕ-worlds closest to X, i.e., fS(ϕ) = [ϕ] ∩ CS(ϕ). A revision based on a system of spheres S is defined as the intersection of the ϕ-worlds closest to K:
Definition 4. (Grove 1988) Let S be a system of spheres centered on [K]. Then K ∗S ϕ = ⋂fS(ϕ) if [ϕ] ≠ ∅, and K ∗S ϕ = L otherwise.
2.2 Non-prioritized Revision
Non-prioritized revision is a class of revision operations in which the input is not always accepted, i.e., success is not a necessary property.
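Definition 4 can be made concrete for a finite language: worlds are sets of true atoms, a system of spheres is a nested list ordered from the innermost sphere outwards, and revision returns the closest ϕ-worlds (whose shared theory is the revised belief set). The two-atom language and the example spheres are assumptions of this sketch:

```python
from itertools import product

ATOMS = ("p", "q")  # assumed toy language
WORLDS = [frozenset(a for a, bit in zip(ATOMS, bits) if bit)
          for bits in product([True, False], repeat=len(ATOMS))]

def sat(world, formula):
    return eval(formula, {}, {a: (a in world) for a in ATOMS})

def worlds_of(formula):
    """[ϕ]: the ϕ-worlds."""
    return {w for w in WORLDS if sat(w, formula)}

def grove_revision(spheres, phi):
    """fS(ϕ) = [ϕ] ∩ CS(ϕ): the ϕ-worlds in the smallest sphere meeting [ϕ].

    spheres: nested sets of worlds, listed innermost first (per (S1)-(S4))."""
    target = worlds_of(phi)
    for sphere in spheres:
        hit = target & sphere
        if hit:
            return hit          # the revised belief set K ∗S ϕ is the theory of hit
    return set()                # [ϕ] = ∅: the revision collapses to all of L

# Spheres centred on the worlds of K = Cn({p, q}):
spheres = [worlds_of("p and q"), worlds_of("p"), set(WORLDS)]
closest = grove_revision(spheres, "not p")
```

With these spheres, revising by not p forces the search out to the outermost sphere and returns both ¬p-worlds, while revising by q stays in the innermost sphere.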
The idea is that a new belief should not always have primacy over previous beliefs, i.e., it may be rejected if it conflicts with more valuable previous beliefs. Hansson (1997) proposed semi-revision as an operation of non-prioritized revision that can be based on the idea of consolidation, i.e., extracting a consistent subset of an inconsistent belief base (essentially a contraction by ⊥). Semi-revision (denoted by ?) lies in the expansion + consolidation variety of non-prioritized belief revision (Hansson 1999a). For possibilities of interdefinability between semi-revision and consolidation, see (Hansson 1997). Fermé and Hansson (1999) explored a further possibility of non-prioritized revision which incorporates only part of the input belief, calling it selective revision. Many models were developed to work with two options: complete acceptance of the input information or full rejection of it. The selective revision model aims at an in-between approach. In order to achieve partial acceptance, they propose applying a transformation function f to the input α, aiming to extract, roughly speaking, its most trustworthy part. Given an AGM revision ∗ and a transformation function f, the selective revision ◦ is given by K ◦ α = K ∗ f(α). Selective revision lies in the decision + revision variety of non-prioritized belief revision (Hansson 1999a).
Package revision can be defined from package contraction via a generalized version of the Levi Identity:
K ∗p A = (K −p ¬A) + A
Theorem 2. (Fuhrmann 1988) If the operation of package revision is defined via the Levi Identity from the operation of package contraction, then the following conditions hold:
(closure) K ∗p A is a theory;
(success) A ⊆ K ∗p A;
(inclusion) K ∗p A ⊆ K + A;
(consistency) if A ⊬ ⊥, then K ∗p A is consistent.
(vacuity) if K ∩ ¬A = ∅, then K + A ⊆ K ∗p A;
(extensionality) if Cn(A) = Cn(A′), then K ∗p A = K ∗p A′;
(conjunctive inclusion) K ∗p (A′ ∪ A) ⊆ (K ∗p A′) + A;
(conjunctive vacuity) if K ∩ ¬A = ∅, then (K ∗p A′) + A ⊆ K ∗p (A′ ∪ A).
Fuhrmann claims that the operation of choice revision is less intuitive than package revision, making it hard to find practical applications. If in a choice revision K ∗c A an agent has to add at least one sentence from A, which ones should it choose? If [K −c ¬A] represents the choice contraction of K by those elements of ¬A that are "easiest to retract", then, using the Levi Identity, we have:
K ∗c A = [K −c ¬A] + (A ∩ {α : ¬α ∉ [K −c ¬A]})
3 Early Steps on Multiple Operations
In (Fuhrmann 1988) we have a first picture of a change operation whose input is not necessarily a single sentence. The author claims that sometimes we need to withdraw more than one proposition from a belief set at the same time, proposing the name Multiple Contraction for this case. Let K be a belief set and A a set of sentences to be retracted from K. If an agent wants no sentence of A to be implied by K − A, i.e., Cn(K − A) ∩ A = ∅, we have a (multiple) package contraction1 (denoted by −p). On the other hand, if an agent simply does not want the whole of A to be among the consequences of K − A, i.e., A ⊈ Cn(K − A), we have a (multiple) choice contraction (here denoted by −c). For further information about multiple contraction see (Fuhrmann and Hansson 1994).
3.2 Generalizing Grove's result
Lindström (1991) introduced a set of operations called infinitary belief revision. In his article, he explores non-monotonic inference operations and draws a connection with AGM revision. Nevertheless, in order to achieve total interdefinability between belief revision and non-monotonic inference, it was necessary to support possibly infinite sets of propositions as input.
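The package/choice distinction can be checked mechanically on finite bases: package success demands that every input sentence end up implied, choice success only that some input sentence does. The helper below also includes the element-wise set negation ¬A = {¬α : α ∈ A} used by these generalizations; the two-atom language and entailment check are assumptions of this sketch:

```python
from itertools import product

ATOMS = ("p", "q")  # assumed toy language

def implies(base, formula):
    """base ⊢ formula, by brute-force truth tables over ATOMS."""
    for bits in product([True, False], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, bits))
        if all(eval(b, {}, v) for b in base) and not eval(formula, {}, v):
            return False
    return True

def neg_set(A):
    """Element-wise set negation: ¬A = {¬α : α ∈ A}."""
    return {f"not ({a})" for a in A}

def package_success(result, A):
    """Package reading: A ⊆ Cn(result) — every input sentence is implied."""
    return all(implies(result, a) for a in A)

def choice_success(result, A):
    """Choice reading: Cn(result) ∩ A ≠ ∅ — some input sentence is implied."""
    return any(implies(result, a) for a in A)

# A result that accepted only part of the input A = {p, q}:
partial_result = {"p", "not q"}
```

Here choice_success(partial_result, {"p", "q"}) holds while package_success fails, which is exactly the gap between ∗c and ∗p.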
The axioms proposed are direct generalizations of the basic revision axioms presented in (Gärdenfors 1988), as was also done in (Fuhrmann 1988). The only difference is that Lindström joins the inclusion and vacuity postulates into a new one called expansion:
(expansion) if K ∪ A is consistent, then K ∗ A = K + A.
Compared to AGM, Lindström makes weaker assumptions about the underlying logic. He only assumes that it is a deductive logic, while the AGM framework also requires supraclassicality and satisfaction of the deduction theorem4.
The following theorem characterizes infinitary belief revision in terms of Grove's systems of spheres (Grove 1988):
3.1 Multiple Revision
Fuhrmann (1988) discussed properties of revision operations that receive as input more than one sentence simultaneously, named Multiple Revision. Analogously to Multiple Contraction, when the result of K ∗ A should imply everything in A (A ⊆ Cn(K ∗ A)) we have (multiple) package revision2 (denoted by ∗p), while when the result of K ∗ A should contain some elements (but not necessarily all) of A (Cn(K ∗ A) ∩ A ≠ ∅) we have (multiple) choice revision3 (here denoted by ∗c). In order to proceed with the generalization, we need the definition of set negation. Fuhrmann defined it this way: if A is a set of formulas, ¬A = {¬α : α ∈ A}.
Theorem 3. (Lindström 1991) Let K be a belief set and ∗ be any (multiple) belief revision operation on K. Then, for all A, ∗ is a system-of-spheres-based revision iff it satisfies closure, success, extensionality, inclusion, vacuity, consistency and conjunctive vacuity.
1. In (Fuhrmann 1988), this was called meet contraction, receiving the name package only in (Fuhrmann and Hansson 1994).
2. In (Fuhrmann 1988) this operation was denominated meet revision and, in (Rott 2001), bunch revision.
3. Rott (2001) called this one pick revision.
4. β ∈ Cn(A ∪ {α}) iff α → β ∈ Cn(A).
3.3 Internal and External Revision
Definition 6.
(Fuhrmann 1997) X ∈ K ↓ A iff X ⊆ K, X ∪ A ⊬ ⊥ and, for all X′ such that X ⊂ X′ ⊆ K, X′ ∪ A ⊢ ⊥.
Hansson (1993), when extending the AGM framework to belief bases, chose to generalize it to the multiple case at the same time. Considering the operation of revision obtained from the Levi Identity, in order to proceed with the generalization we need, for sets, an equivalent way of negating the input:
Definition 5. (Hansson 1999b) Let X be a finite set of sentences. Then neg(X) (the sentential negation of X) is defined as follows:
1. neg(∅) = ⊥;
2. if X = {α}, then neg(X) = ¬α;
3. if X = {α1, ..., αn} for some n > 1, then neg(X) = ¬α1 ∨ ¬α2 ∨ ... ∨ ¬αn.
From this definition of set negation, he defines internal revision as B ∗γ A = ⋂γB(B⊥{neg(A)}) ∪ A. Then, it is possible to characterize it axiomatically:
Theorem 4. (Hansson 1993) The operator ∗ is an operator of multiple internal partial meet revision for a belief base B iff it satisfies:
(success) A ⊆ B ∗ A
(inclusion) B ∗ A ⊆ B ∪ A
(consistency) B ∗ A is consistent if A is consistent
(uniformity) If for all B′ ⊆ B, B′ ∪ A is inconsistent iff B′ ∪ C is inconsistent, then B ∩ (B ∗ A) = B ∩ (B ∗ C).
(relevance) If α ∈ B and α ∉ B ∗ A, then there is some B′ such that B ∗ A ⊆ B′ ⊆ B ∪ A, B′ is consistent but B′ ∪ {α} is inconsistent.
The postulates for consistency, inclusion and success are direct generalizations of the corresponding Gärdenfors postulates for singleton revision. Now we can define multiple external partial meet revision as B ±γ A = ⋂γB∪A((B ∪ A)⊥{neg(A)}).
Theorem 5. (Hansson 1993) The operator ∗ is an operator of multiple external partial meet revision iff it satisfies consistency, inclusion, relevance, success and, in addition:
(weak uniformity) If A and C are subsets of B and it holds for all B′ ⊆ B that B′ ∪ A is inconsistent iff B′ ∪ C is inconsistent, then B ∗ A = B ∗ C.
(pre-expansion) (B + A) ∗ A = B ∗ A
After defining the generalized remainder set, a selection function γ selects from K ↓ A the preferred elements6. Multiple package partial meet revision is defined as: K ∗p A = ⋂γ(K ↓ A) ∪ A.
The following result was obtained for this package revision:
Theorem 6. (Fuhrmann 1997) The operation ∗p just defined satisfies the following conditions:
(success) A ⊆ K ∗ A
(inclusion) K ∗ A ⊆ K ∪ A
(consistency) If K ∗ A is inconsistent, then A is inconsistent
(congruence) If ¬A ≡K ¬C, then K ∗ A = K ∗ C7
(relevance) If α ∈ K \ K ∗ A, then there exists X such that (K ∗ A) ∩ K ⊆ X ⊆ K, X ⊬ ¬A and X ∪ {α} ⊢ ¬A
Fuhrmann also explored the relation between this construction and one using remainders:
Theorem 7. (Fuhrmann 1997) If the operation ∗ satisfies the five conditions from Theorem 6, then there exists a selection function γK for K such that: K ∗ A = ⋂γ(K⊥¬A) ∪ A
According to the principle of categorial matching, when applying revision, belief sets should map onto belief sets and belief bases onto belief bases. As defined so far, the operation does not in general preserve closure. Fuhrmann called the operation defined so far pre-revision and established an operation of matching revision in accordance with the principle of categorial matching: A ⋆ B = Cn(A ∗ B) if A = Cn(A), and A ⋆ B = A ∗ B otherwise.
The following observation indicates that the operation ⋆ is in practice a simple adaptation of pre-revision.
Observation 1. (Fuhrmann 1997) If ∗ satisfies the conditions from Theorem 6, then ⋆ satisfies the five conditions and also closure: if K = Cn(K), then K ⋆ A = Cn(K ⋆ A).
One of the questions that emerges is whether it is possible to reduce multiple operations to operations by singletons. For package revision, Fuhrmann obtained the following result:
Lemma 1.
(Fuhrmann 1997) For finite A, K ∗ A = K ∗ ⋀A.
3.4 Multiple Package Partial Meet Revision
As shown in the second condition of Theorem 2, the generalization of the revision operation inherits an important characteristic already present in the original definition: the input set has priority over the sentences to be revised. Following this approach, we have in (Fuhrmann 1997) a further exploration of the topic, which brings us a construction. Fuhrmann addresses the issue focusing on the package variety and shows how the operations of revision and contraction can be interdefined. Following the partial meet approach, when we have arbitrary sets of sentences K and A and need to revise K in order to consistently incorporate all elements of A, we can first find all maximal subsets of K that are consistent with A, which form a generalized version of the remainder set for multiple operations in revision5:
5. However, there is a logical drawback for this reduction. For comparison purposes, suppose that, instead of revising by a set {α, β} (a collection of items of information), the agent decides to revise by the conjunction α ∧ β (a single item of information). As observed in (Delgrande and Jin 2012), although the two options have the same logical content (since they imply precisely the same formulas), revising by {α, β} should result in a belief state such that, if there is no known link between α and β and β were afterwards found to be untrue, then α should still be considered true.
6. In (Fuhrmann 1997), K ↓ A is called "K open for A" (K op A), and this selection function is called a choice function.
7. ¬A ≡K ¬C iff for all X ⊆ K: X ⊢ ¬A iff X ⊢ ¬C.
4 Infinitary Belief Change
Definition 9. (Zhang 1996) A NOP Σ = (K, P, <) of K is called a nice-well-ordering partition (NWOP) of K if < is a well-ordering relation on P.
A contraction generated by an NWOP was presented to establish a more computation-oriented method to deal with general belief revision. In (Zhang et al.
1997a), the authors claim that (Zhang 1996) offers a complete extension of AGM's postulates for belief change but without a representation theorem for the proposed framework. So, in their paper, they provide two representation theorems for general contraction and, in addition, a new property called the Limit Postulate in order to specify properties of infinite belief changes. They also developed a partial meet model for general contraction. When the input is not finite, the postulates for general contraction are not enough to characterize NOP contraction by infinite sets of sentences. A plausible idea is to treat the contraction by an infinite set as a limit of the contractions by its finite subsets. Let Ā ⊆f A denote a finite subset of an infinite set A. The Limit Postulate for general contraction can then be stated as:
(K ⊖ LP) K ⊖ A = ⋃_{Ā ⊆f A} ⋂_{Ā ⊆ Ā′ ⊆f Cn(A)} K ⊖ Ā′
Zhang (1996) observed that there was still a need for a general framework for belief revision by sets of sentences, especially infinite sets, a topic taken up again in (Zhang et al. 1997a; Zhang and Foo 2001). The Levi identity, for example, does not hold for infinite inputs. So the purpose of those articles was to extend the AGM framework (its axiomatization and modeling) to a more general one that includes revision by any set of sentences.
4.1 General contraction
In the traditional AGM account of contraction, contracting K by A means removing enough from K so that A is no longer implied. However, according to Zhang (1996), this would break the connection between revision and contraction when A is not finite. He proposed a new operator called general contraction (denoted by ⊖ and also called set contraction in (Zhang and Foo 2001)) whose purpose is to delete sentences from K so that the remaining set is consistent with A and logically closed.
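A full-meet toy sketch of set contraction — remove just enough from K that the remainder is consistent with A — and of a revision built from it by simply adding A afterwards can be given for finite bases. The two-atom language and the omission of the logical-closure step are assumptions of this sketch:

```python
from itertools import combinations, product

ATOMS = ("p", "q")  # assumed toy language

def consistent(formulas):
    """Satisfiability by truth tables over ATOMS (toy-sized bases only)."""
    return any(all(eval(f, {}, dict(zip(ATOMS, bits))) for f in formulas)
               for bits in product([True, False], repeat=len(ATOMS)))

def maximal_consistent_with(K, A):
    """K ↓ A: the maximal subsets of K that are consistent with A."""
    fam = []
    for r in range(len(K), -1, -1):
        for s in combinations(sorted(K), r):
            if consistent(set(s) | set(A)) and not any(set(s) <= m for m in fam):
                fam.append(set(s))
    return fam

def set_contraction(K, A):
    """Full-meet sketch of ⊖: keep what every element of K ↓ A agrees on.
    If A is already consistent with K, nothing is removed."""
    fam = maximal_consistent_with(K, A)
    return set.intersection(*fam) if fam else set(K)

def set_revision(K, A):
    """Levi-style composition without the closure step: (K ⊖ A) ∪ A."""
    return set_contraction(K, A) | set(A)
```

For example, set_revision({"p", "q"}, {"not p"}) gives up p, keeps q, and accepts the input, while contracting by something already consistent with K leaves K untouched.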
Zhang and Foo (2001) observe that, even though it differs from the initial purpose of contraction, this new operator elucidates a significant intuition about contraction: we only give up our beliefs when they conflict with some new information. General contraction comes down to an auxiliary tool to build revision: it amounts to the first step of internal revision and can be constructed using Definition 6. Zhang and Foo showed that the relations between AGM revision and contraction are, with suitable adaptations, also valid between multiple revision and general contraction. According to the authors, that is why they focused on set contraction in their article, i.e., multiple revision can be derived through the Levi identity.
Zhang (1996) proposed a structure called a total-ordering partition (TOP), which is logically independent. If the partition is rearranged in order to satisfy some logical constraints, we have a nice-ordering partition (NOP)8.
Theorem 8. (Zhang et al. 1997a) If ⊖ is a NOP contraction function, then ⊖ satisfies (K ⊖ LP).
Theorem 9. (Zhang et al. 1997a) Let ⊖ be a general contraction function over K. If ⊖ satisfies (K ⊖ LP), then there exists a NOP Σ = (K, P, <) of K such that ⊖ is exactly the NOP contraction generated by Σ.
4.2 General revision
General revision (or set revision), denoted by ⊕ as in (Lindström 1991), can be defined in terms of general contraction analogously to the Levi Identity:
(Def ⊕) K ⊕ A = Cn((K ⊖ A) ∪ A)
The eight postulates proposed for general revision are the same as those given by Lindström (1991) and Peppas (2004), denoted by (K⊕1)–(K⊕8). Considering infinite inputs, according to (Zhang et al. 1997a) a corresponding assumption for general revision can be obtained in terms of (Def ⊕):
(K ⊕ LP) K ⊕ A = ⋃_{Ā ⊆f A} ⋂_{Ā ⊆ Ā′ ⊆f Cn(A)} K ⊕ Ā′
Definition 7. (Zhang 1996) For any belief set K, let P be a partition of K and < a total-ordering relation on P. The triple Σ = (K, P, <) is called a TOP of K.
For any p ∈ P, if A ∈ p, then p is called the rank of A, denoted by b(A). A TOP Σ = (K, P, <) is a NOP if it satisfies the following logical constraint: if A1, ..., An ⊢ B, then sup{b(A1), ..., b(An)} ≥ b(B).
Using these new concepts, we are given an explicit construction for multiple contraction functions:
Definition 8. (Zhang 1996) Let Σ = (K, P, <) be a NOP of a belief set K. The NOP contraction ⊖ is such that:
• if A ∪ K is consistent, then K ⊖ A = K;
• otherwise, B ∈ K ⊖ A iff B ∈ K and there exists C ∈ K such that A ⊢ ¬C and: ∀D ∈ K (C ⊢ D ∧ A ⊢ ¬D → (b(C ∨ B) < b(D) ∨ ⊢ C ∨ B))
In (Zhang and Foo 2001) it is proven that the Limit Postulate suffices to complete the full characterization of general belief change operations.
Proposition 1. (Zhang and Foo 2001) If ⊕ satisfies the postulates (K⊕1)–(K⊕8), then (K ⊕ LP) is equivalent to the following condition: K ⊕ A = ⋂_{Ā ⊆f Cn(A)} ((K ⊕ Ā) + A)
Zhang (1996) provided a set of rationality postulates for NOP contraction, as well as another constructive approach:
Constructing revision
The results presented in (Zhang et al. 1997a), both for general contraction and revision, give a groundwork for exploring the link between non-monotonic reasoning and multiple
8. NOP generalizes epistemic entrenchment (Gärdenfors 1988).
Theorem 10. (Peppas 2004) Let K be a theory of L and ⊕ a function satisfying (K⊕1)–(K⊕8). Then there exists a well ranked system of spheres S centered on [K] such that, for all nonempty Γ ⊆ L, condition (S⊕) is satisfied.
On the connection between multiple revision and AGM sentence revision, the author introduces the notions of restriction and extendability:
Definition 14. (Peppas 2004) For a multiple revision ⊕, the restriction of ⊕ to sentences is the function ∗ defined such that, for all theories K and ϕ ∈ L, K ∗ ϕ = K ⊕ {ϕ}. An AGM revision function ∗ is extendable iff there exists a multiple revision function ⊕ whose restriction to sentences is ∗.
Based on the results from (Lindström 1991), Peppas states that the class of extendable revision functions corresponds to the family of revision functions that correspond, by means of (S∗), to well ranked systems of spheres. As a consequence, all well behaved revision functions are extendable. As for the possibility of reducing multiple revision to sentence revision: if the input is finite, the reduction is as already shown in the previous section, i.e., K ⊕ Γ = K ∗ ⋀Γ. If, on the other hand, the input is infinite, a possible reduction is proposed in the form of a theorem that works for sets Γ of arbitrary size. Nevertheless, it depends on a boundedness condition. A multiple revision function ⊕ is bounded iff there exists a system of spheres S corresponding to ⊕ by means of (S⊕) that has only finitely many spheres. Let Z(Γ) = {⋀∆ : ∆ ⊆f Γ}. Then:
Theorem 11. (Peppas 2004) Let ⊕ be a bounded multiple revision function and ∗ its restriction to sentences. Then, for any theory K and any set of sentences Γ of L, the condition (K ⊕ F) holds as follows: K ⊕ Γ = ⋂_{ϕ∈Z(Γ)} (K ∗ ϕ) + Γ.
The link between non-monotonic reasoning and multiple belief revision is investigated in (Zhang et al. 1997b), where the authors proposed a rational non-monotonic system and provided two representation theorems relating multiple revision to their system. According to (Zhang and Foo 2001), the Limit Postulate for revision, due to its equivalence to a property of non-monotonic reasoning called Finite Supracompactness (Zhang et al. 1997b), can be called the compactness of belief revision.
5 More on Systems of Spheres
After the initial work developed by Lindström (1991), Peppas (2004) studied some smoothness conditions on systems of spheres and their connection with multiple revision, also giving a constructive model for multiple revision based on systems of spheres along with a representation result.
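For finite input, the reduction K ⊕ Γ = K ∗ ⋀Γ can be checked on a toy full-meet construction: revising by the set and by the single conjunction yields logically equivalent (though syntactically different) results. The construction below is an assumption of this sketch, not Peppas's sphere-based one:

```python
from itertools import combinations, product

ATOMS = ("p", "q")  # assumed toy language

def models(formulas):
    """The valuations satisfying every formula (the logical content)."""
    out = set()
    for bits in product([True, False], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, bits))
        if all(eval(f, {}, v) for f in formulas):
            out.add(bits)
    return out

def revise(K, Gamma):
    """Full-meet multiple revision sketch: intersect the maximal subsets of K
    consistent with Gamma, then add Gamma."""
    fam = []
    for r in range(len(K), -1, -1):
        for s in combinations(sorted(K), r):
            if models(set(s) | set(Gamma)) and not any(set(s) <= m for m in fam):
                fam.append(set(s))
    core = set.intersection(*fam) if fam else set(K)
    return core | set(Gamma)

def conj(Gamma):
    """⋀Γ as a single sentence."""
    return " and ".join(f"({g})" for g in sorted(Gamma)) if Gamma else "True"

K, Gamma = {"p", "q"}, {"not p", "q"}
same_content = models(revise(K, Gamma)) == models(revise(K, {conj(Gamma)}))
```

The two results are different as sets of sentences ({q, not p} versus {q, (not p) and (q)}) but have the same models, which echoes the caveat from (Delgrande and Jin 2012) noted earlier: the reduction preserves logical content, not the structure of the input.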
A specific smoothness condition satisfied by a total preorder on possible worlds is called the limit assumption, which, in the definition of a system of spheres (Section 2.1), corresponds to condition (S4). Its central function is to ensure that for every consistent sentence ϕ there is always a 'most plausible' ϕ-world. The smoothness conditions considered by Peppas in his article are actually variants of the limit assumption.
5.1 Well behaved functions
Peppas analyzed aspects of the well-orderedness of spheres:
Definition 10. (Peppas 2004) A system of spheres S is well ordered with respect to set inclusion iff it satisfies the (SW) property: every nonempty subset of S has a smallest element with respect to ⊆.
(SW) is stronger than (S4). With this definition, it is possible to define a class of revision functions:
Definition 11. (Peppas 2004) A revision ∗ is well behaved iff it can be constructed from well ordered systems of spheres.
(K ⊕ F) defines a reduction that starts with a revision of K by every finite conjunction ϕ of sentences in Γ. Then, each such revised theory K ∗ ϕ is expanded by Γ and, finally, all the expanded theories (K ∗ ϕ) + Γ are intersected.
A set V of consistent complete theories is elementary iff V = [⋂V]. In words, V is elementary if no world outside V is compatible with the theory ⋂V. Consider the following condition on a system of spheres S:
(SF) For every G ⊆ S, ⋃G is elementary.
It is exactly the smoothness condition needed for the reduction of multiple revision to sentence revision in the spirit of (K ⊕ F):
Theorem 12. (Peppas 2004) Let K be a theory of L, ⊕ a multiple revision function and ∗ its restriction to sentences. Moreover, let S be a well ranked system of spheres centered on [K] that corresponds to ⊕ by means of (S⊕). Then, S satisfies (SF) iff ⊕ satisfies (K ⊕ F).
The extension of the construction based on systems of spheres to multiple revision can be defined as follows:
Definition 12.
(Peppas 2004) Let K be a theory of L and S a system of spheres centered on [K]. The multiple revision of K by Γ is:
(S⊕) K ⊕ Γ = ⋂fS(Γ) if [Γ] ≠ ∅, and K ⊕ Γ = L otherwise.
However, Peppas observes that if S is restricted only by conditions (S1)–(S4), we cannot assume that for a set of sentences Γ there is always a smallest sphere intersecting [Γ], even for consistent inputs. Thus, an extra constraint on S is needed:
Definition 13. (Peppas 2004) A system of spheres S is called a well ranked system of spheres if it satisfies the (SM) property: for every consistent set of sentences Γ there exists a smallest sphere in S intersecting [Γ].
For studying multiple revision, the author restricted the systems of spheres considered to the family of well ranked ones. He recalls the postulates for multiple revision as defined by Lindström (1991), calling a function ⊕ that obeys this set of postulates a multiple revision function.
5.2 Extra constraints
Peppas, Koutras and Williams (2012) observed that the limit postulate demanded additional constraints on systems of spheres, and that its relationship with the condition defined in Proposition 1 was still an open problem.
(Uniformity 2) Given two consistent sets A and A′, for every subset X of B, if X ∪ A ⊢ ⊥ iff X ∪ A′ ⊢ ⊥, then B ∩ (B ∗ A) = B ∩ (B ∗ A′).
Theorem 13. (Peppas, Koutras, and Williams 2012) There exists a consistent theory K and a multiple revision function ⊕ satisfying (K⊕1)–(K⊕8) such that ⊕ satisfies (K⊕LP) but violates (K⊕F) at K.
From the theorem above we can conclude that (K⊕LP) is strictly weaker than (K⊕F).
A sphere U ∈ S is said to be proper if it contains at least one world outside all spheres smaller than U. The core of U, denoted by Uᶜ, is the set Uᶜ = ⋃{V ∈ S : V ⊂ U}. So, a sphere U ∈ S is proper iff U ≠ Uᶜ.
Considering proper spheres, it is possible to add an extra restriction to them:
(EL) All proper spheres in S are elementary.
There is a dissimilarity between (K⊕LP) and (EL):
Lemma 2. (Peppas, Koutras, and Williams 2012) There is a consistent theory K and a well ranked system of spheres S centered on [K] such that S satisfies (EL) and yet the multiple revision function ⊕ induced from S violates (K⊕LP).
This last result shows that (EL) is not enough to characterize (K⊕LP) and, at the same time, (SF) is too strong, so something in between is needed. Before that, we need the definition of a finitely reachable sphere:
Definition 15. (Peppas, Koutras, and Williams 2012) Let K be a theory and S a system of spheres centered on [K]. A sphere V is finitely reachable in S iff there exists a consistent sentence ϕ ∈ L such that CS(ϕ) = V.
(Relevance) If α ∈ B \ (B ∗ A), then there is a set C such that B ∗ A ⊆ C ⊆ B ∪ A, C is consistent with A but C ∪ {α} is inconsistent with A.
(Core-retainment) If α ∈ B \ (B ∗ A), then there is a set C such that C ⊆ B ∪ A, C is consistent with A but C ∪ {α} is inconsistent with A.
Except for Weak Success and Uniformity 1, the postulates above were adapted from similar postulates for singleton revision (Hansson 1999b). By generalizing the techniques from classical belief base revision, the authors defined two kinds of construction: Package Kernel Revision and Package Partial Meet Revision9.
Definition 16. (Falappa et al. 2012) Let B be a belief base and σ an incision function. The package kernel revision on B generated by σ is the operator ∗σ such that, for all sets A: B ∗σ A = (B \ σ(B ↓↓ A)) ∪ A if A is consistent, and B ∗σ A = B otherwise, where B ↓↓ A stands for the set of minimal subsets of B inconsistent with A.
Theorem 15. (Falappa et al. 2012) An operator ∗ is a package kernel revision for B iff it satisfies inclusion, consistency, weak success, vacuity 1, uniformity 1 (and uniformity 2), and core-retainment.
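Definition 16's direct construction can be sketched for finite bases: compute the minimal subsets of B that clash with A, cut them with an incision function, then add A wholesale. The two-atom language and the default incision policy (cut every kernel element) are assumptions of this sketch:

```python
from itertools import combinations, product

ATOMS = ("p", "q")  # assumed toy language

def consistent(formulas):
    """Satisfiability by truth tables over ATOMS (toy-sized bases only)."""
    return any(all(eval(f, {}, dict(zip(ATOMS, bits))) for f in formulas)
               for bits in product([True, False], repeat=len(ATOMS)))

def lower_kernels(B, A):
    """B ↓↓ A: the minimal subsets of B that are inconsistent with A."""
    found = []
    for r in range(len(B) + 1):
        for s in combinations(sorted(B), r):
            if not consistent(set(s) | set(A)) and not any(k <= set(s) for k in found):
                found.append(set(s))
    return found

def package_kernel_revision(B, A, incision=None):
    """B ∗σ A = (B \ σ(B ↓↓ A)) ∪ A for consistent A, else B (vacuity 1).

    The default incision cuts every sentence of every kernel; any function
    selecting at least one element per kernel would also do."""
    if not consistent(A):
        return set(B)
    if incision is None:
        incision = lambda ks: set().union(*ks) if ks else set()
    return (set(B) - incision(lower_kernels(B, A))) | set(A)
```

Revising {p, q} by {not p} cuts the kernel {p} and accepts the whole input, while an inconsistent input leaves the base untouched — the only non-prioritized trait of these operators, as noted below.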
Consider the following restrictions on a system of spheres S, where V is an arbitrary sphere in S:
(R1) If V is finitely reachable, then V is elementary.
(R2) If V is finitely reachable, then Vᶜ is elementary.
(R3) If Vᶜ ≠ V, then [⋂Vᶜ] ⊆ V.
Definition 17. (Falappa et al. 2012) Let B be a set of sentences and γ a selection function. The package partial meet revision on B generated by γ is the operator ∗γ such that, for all sets A: B ∗γ A = ⋂γ(B ↓ A) ∪ A if A is consistent, and B ∗γ A = B otherwise.
Theorem 14. (Peppas, Koutras, and Williams 2012) Let K be a theory, S a well-ranked system of spheres centered on [K] and ∗ the multiple revision induced from S on K via (S⊕). Then ∗ satisfies (K⊕LP) iff S satisfies (R1)–(R3).
Theorem 16. (Falappa et al. 2012) An operator ∗ is a package partial meet revision for B iff it satisfies inclusion, consistency, weak success, vacuity 1, uniformity 2 (and uniformity 1), and relevance.
6 Direct Constructions
Fuhrmann (1988), based on the Levi Identity, states that revision is clearly a compound operation, using this argument to defend that, given the not-very-complex nature of expansion operations, a theory of belief change should concentrate on the contraction operation. However, it is interesting to study the definition of revision operations in a direct way, i.e., without using contraction as an intermediate step. This is one of the main goals of the work developed in (Falappa et al. 2012; Valdez and Falappa 2016) for belief bases. Let B, A and A′ be sets of sentences and ∗p be a package revision operator on belief bases. Falappa et al. (2012) defined postulates for this operation:
The only non-prioritized aspect of these operators is that no change is performed when the incoming information is inconsistent. In a very similar way to what was done in (Falappa et al. 2012), Valdez and Falappa (2016) proposed two constructions for multiple package revision in Horn logic, one based on kernel and the other based on partial meet.
Both of them were characterized axiomatically.
(Inclusion) B ∗ A ⊆ B ∪ A.
(Weak Success) If A is consistent, then A ⊆ B ∗ A.
(Consistency) If A is consistent, then B ∗ A is consistent.
(Vacuity 1) If A is inconsistent, then B ∗ A = B.
(Uniformity 1) Given two consistent sets A and A′, for every subset X of B, if X ∪ A ⊢ ⊥ iff X ∪ A′ ⊢ ⊥, then B \ (B ∗ A) = B \ (B ∗ A′).
9. Originally, Package Revision was called Prioritized Change.
7 Non-prioritized Multiple Revision
As seen in Section 3.1, multiple revision comes in two flavors. Choice revision is a non-prioritized kind of (multiple) revision, as the input does not have priority with respect to the initial beliefs: the agent can incorporate part of the new beliefs whilst ignoring the rest. (Falappa et al. 2012) and (Zhang 2018) also note that choice revision cannot be reduced to selective revision (Fermé and Hansson 1999).
7.1 The Levi Identity model
Due to the use of set negation to perform the contraction part of the choice operation, the approach proposed in (Zhang 2018) depends on negation and on disjunction of sentences. Zhang (2018) proposed two types of choice revision, one following the contraction + expansion approach and the other the expansion + contraction one. Both were axiomatically characterized. Before defining the operation, the author introduces an auxiliary operation called partial expansion (∔), which is a generalization of the traditional expansion operation. Given two sets of sentences B and A, B ∔ A contains the whole of B plus some part of A. Zhang also adapted a definition of negation set given in (Hansson 1993):
Definition 18. (Zhang 2018) Let A be some set of sentences. Then the negation set neg(A) of A is defined as follows:
1. neg(∅) = ⊤;
2. neg(A) = ⋃n≥1 {¬ϕ1 ∨ ¬ϕ2 ∨ · · · ∨ ¬ϕn | ϕi ∈ A for every i such that 1 ≤ i ≤ n}.
Finally, two constructions are shown: one for internal revision and one for external revision, both depending on contraction.
Definition 19.
An operator ∗c is an internal choice revision iff there exists a choice contraction −c and a consistency-preserving partial expansion ∔ such that, for all sets B and A, B ∗c A = (B −c neg(A)) ∔ A.
Theorem 17. ∗c is an internal choice revision on consistent belief bases with finite inputs iff it satisfies the following conditions: for every consistent B and finite A and A′,
(∗c-inclusion) B ∗c A ⊆ B ∪ A
(∗c-success) If A ≠ ∅, then A ∩ (B ∗c A) ≠ ∅
(∗c-iteration) B ∗c A = (B ∩ (B ∗c A)) ∗c A
(∗c-consistency) If A ≢ {⊥}, then B ∗c A ⊬ ⊥
(∗c-coincidence) If A ∩ B ≠ ∅ and A ⊆ A′ ⊆ A ∪ B, then B ∗c A = B ∗c A′
(∗c-uniformity) If it holds for all B′ ⊆ B that B′ ∪ {ϕ} ⊢ ⊥ for some ϕ ∈ A iff B′ ∪ {ψ} ⊢ ⊥ for some ψ ∈ A′, then B ∩ (B ∗c A) = B ∩ (B ∗c A′)
(∗c-relevance) If ϕ ∈ B \ B ∗c A, then there is some B′ with B ∩ (B ∗c A) ⊆ B′ ⊆ B such that B′ ∪ {ψ} ⊬ ⊥ for some ψ ∈ A and B′ ∪ {ϕ} ∪ {λ} ⊢ ⊥ for every λ ∈ A
Definition 20. An operator ∗c is an external choice revision iff there exists a package contraction −p and a partial expansion ∔ such that, for all B and A, B ∗c A = (B ∔ A) −p neg(A′), where A′ = (B ∔ A) \ B.
Theorem 18. ∗c is an external choice revision iff, for all B, B1, B2, A and A′, it satisfies ∗c-inclusion, success, coincidence and:
(∗c-confirmation) If A ∩ (B ∗c A) ⊆ B, then B ∗c A = B
(∗c-consistency) If (B ∗c A) \ B ≠ ∅, then (B ∗c A) \ B ⊬ ⊥
(∗c-uniformity) If B1 ≠ ((B1 ∗c A) ∪ B1) = B = ((B2 ∗c A′) ∪ B2) ≠ B2 and it holds for all B′ ⊆ B that B′ ∪ ((B1 ∗c A) \ B1) ⊬ ⊥ iff B′ ∪ ((B2 ∗c A′) \ B2) ⊬ ⊥, then B1 ∗c A = B2 ∗c A′
(∗c-relevance) If ϕ ∈ B \ B ∗c A, then there is some B′ with B ∗c A ⊆ B′ ⊆ B ∪ (B ∗c A) such that B′ ⊬ ⊥ and B′ ∪ {ϕ} ⊢ ⊥
7.2 The Descriptor Revision approach
Zhang (2019) also proposed two types of choice revision, but based on a different approach to belief change named Descriptor Revision (Hansson 2014).
This approach applies a “select-direct” procedure: there is a set of belief sets regarded as the possible results of belief change, and the change is implemented through a direct choice among these possible outcomes. Both types were characterized axiomatically through a set of postulates and a representation theorem, under the assumption of a finite language. It is important to observe that, in this approach, revision was explored without taking into account its connection with contraction, i.e., it was defined without using contraction as an intermediate step. Zhang (2019) shows that, in general, choice revision by a finite set A cannot be reduced to selective revision by ∨A (the disjunction of all the sentences of A) and, similarly, it is not possible to perform choice revision by an AGM revision using a disjunction of all the sentences of the input.

7.3 The Multiple Believability Relations approach

Zhang (2019) also proposed a second modelling for choice revision, based on multiple believability relations and without assuming a finite language. Zhang defines a believability relation ⪯ as a binary relation representing that “the subject is at least as prone to believing ϕ as to believing ψ” (ϕ ⪯ ψ). A multiple believability relation ⪯∗ is a binary relation on finite sets of formulas satisfying ϕ ⪯ ψ iff {ϕ} ⪯∗ {ψ}. One way of carrying out the generalization described above is to define choice multiple believability relations (⪯c). A ⪯c B indicates that it is easier for the agent to absorb the plausible information in A than that in B. A construction for this operation was proposed and axiomatically characterized.

7.4 The Semi-Revision approach

The operators defined in (Falappa, Kern-Isberner, and Simari 2002) work with partial acceptance in the following way: given a belief set K and an input set A, the incoming set is initially accepted and, then, all possible inconsistencies of K ∪ A are removed; it is thus a kind of external revision. However, the input sets considered are explanations.
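The accept-then-consolidate recipe behind semi-revision can be sketched concretely. The toy below is not the construction of (Falappa, Kern-Isberner, and Simari 2002); it assumes a deliberately simplified model in which beliefs are propositional literals encoded as strings ("p", "~p"), so that the ⊥-kernels are exactly the complementary pairs, and a hypothetical incision function discards the lower-priority member of each kernel.

```python
def kernels(base):
    """Minimal inconsistent subsets. In the literal model these are
    exactly the complementary pairs {x, ~x}."""
    return [{x, "~" + x} for x in base
            if not x.startswith("~") and "~" + x in base]

def semi_revise(B, A, priority):
    """Semi-revision sketch: accept all of A (expansion), then
    consolidate by removing, from each kernel, the literal ranked
    worst in `priority` (an ordered list, most entrenched first)."""
    base = set(B) | set(A)                            # accept everything
    incision = {max(k, key=priority.index) for k in kernels(base)}
    return base - incision                            # remove chosen literals

# Partial acceptance: depending on entrenchment, either the old belief
# or the conflicting part of the input is given up.
print(semi_revise({"p", "q"}, {"~p", "r"}, ["q", "~p", "r", "p"]))
print(semi_revise({"p", "q"}, {"~p", "r"}, ["p", "q", "r", "~p"]))
```

With the first priority ordering the agent keeps the input "~p" and gives up the old belief "p"; with the second it rejects the conflicting part of the input instead — partial acceptance in both cases.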
An explanation contains an explanans (the beliefs that support a consequence) and an explanandum (the final consequence). Each explanation is a set of sentences with some restrictions. The idea is that it does not seem rational for an agent to absorb an external belief without an explanation supporting it, especially if the new information is inconsistent with the agent's own set of beliefs. The authors generalized the framework from (Hansson 1997) to define an operator ◦ that accepts sets of sentences (explanations) as input. Two constructions were proposed, one based on kernel sets and the other based on remainder sets, and both were characterized axiomatically.

8 The core beliefs approach

In the literature, we find two approaches for multiple revision that are based on the concept of core beliefs. The belief set to be revised has a subset taken as its core, which is entirely preserved independently of the new information. Both approaches are characterized axiomatically and also receive two constructions: one based on kernels and another based on remainders.

Theorem 19. (Yuan, Ju, and Wen 2014) Let (B, A) ∈ B, A = Cn(∅) ∩ B and ⊲ be an EMR operator for (B, A). Then:
1. If ⊲ is a KEMR operator, then ∗⊲ is a multiple package kernel revision operator.
2. If ⊲ is a PMEMR operator, then ∗⊲ is a multiple package partial meet revision operator.

EMR is also similar to selective revision, as in both approaches the input is treated by a separate mechanism before effectively being used to perform revision. Nevertheless, while the transformation function from Selective Revision usually returns logical consequences of the input, the decision module from EMR produces subsets of the incoming set. In addition, Selective Revision does not protect core beliefs. Hence, EMR cannot be considered a generalization of Selective Revision.

Both approaches use the concept of belief state, defined as follows:

Definition 21.
(Yuan, Ju, and Wen 2014) A belief state is a pair S = (B, A) satisfying: A ⊆ B ⊆ L, A is consistent, and A is logically closed within B, i.e., Cn(A) ∩ B ⊆ A. The set of all belief states is denoted by B. For every (B, A) ∈ B, B is called the belief base and A the set of core beliefs.

As common properties, both operators satisfy three principles: minimal change (the agent should preserve as many old beliefs as possible), consistency (the resulting belief state should be consistent after revision) and protection (core beliefs should always be preserved).

8.1 Rational Metabolic Revision

There are some contexts where the agent cannot initially identify the implausible part of the incoming information. One option is to incorporate all the new beliefs (expansion), which may lead to conflicts that are then useful to detect the implausible information and consolidate the belief state. Yuan (2017) proposed a new multiple revision operator along this line: metabolic revision. This operator lies in the expansion + consolidation variety of non-prioritized belief revision (Hansson 1999a). The name metabolic revision is due to an analogy with body metabolism: if an animal finds some food and considers it good to eat, the animal ingests it, and later its body eliminates harmful substances or waste through the digestive system. The idea is for the operator to work in a similar manner with new information. The metabolic revision operator is represented by ♦ and maps a belief state (B, A) and a set of beliefs D to a new belief state (B′, A′). Yuan proposed two constructions: one based on kernels (♦σ) and the other on partial meet (♦γ). Both were characterized axiomatically. As observed by Yuan, semi-revision (Hansson 1997) is not a particular case of metabolic revision when A is empty and D is a singleton. While it is possible to establish an interrelation between two semi-revisions of different belief bases, metabolic revision is defined for a fixed belief state, i.e., properties for the interrelation between two metabolic revisions on different belief states were not defined.

8.2 Evaluative Multiple Revision

Evaluative Multiple Revision (EMR) is an operation through which the new information, instead of being directly handled, is pre-processed in an evaluation step that takes into account the core beliefs of the agent; only then is the revision performed. It is therefore considered a sort of non-prioritized multiple revision, as the new information is not necessarily incorporated as a whole. EMR falls into the decision + revision variety of non-prioritized belief revision (Hansson 1999a), i.e., a two-phase revision process. The new information is first submitted to a decision module which, using the core beliefs as criteria, performs an evaluation and produces two disjoint sets, one for plausible information and another for implausible information:

Definition 22. (Yuan, Ju, and Wen 2014) Given a belief state (B, A), an A-evaluation is a pair of sets of formulas in L, denoted by I|P, satisfying: (i) I ∪ P ≠ ∅, (ii) A ∪ P ⊬ ⊥ and (iii) Cn(A ∪ P) ∩ I = ∅. I is the set of implausible new information while P is the set of plausible new information. The set of all A-evaluations is denoted by A.

Differently from other frameworks, the revision module does not receive a single set of sentences to perform the revision operation: it receives the pair of sets produced by the previous module. The idea is to revise the agent's beliefs by the plausible set and, at the same time, contract them by the implausible set. So, the EMR operator ⊲ maps a belief state (B, A) and an A-evaluation I|P to a new belief state; that is, the result of (B, A) ⊲ I|P is a pair as well.
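The decision + revision pipeline of EMR can be illustrated in a tiny model. The sketch below is hypothetical (it is not the construction of Yuan, Ju, and Wen 2014): beliefs are propositional literals, consistency is the absence of a complementary pair, and Cn(A ∪ P) restricted to literals is approximated by A ∪ P itself; `decide` is a toy decision module producing an A-evaluation I|P from a stream of inputs.

```python
def consistent(s):
    """Toy consistency test: beliefs are literals ("p", "~p");
    a set is inconsistent iff it contains a complementary pair."""
    return not any(("~" + x) in s for x in s if not x.startswith("~"))

def is_a_evaluation(A, I, P):
    """Conditions (i)-(iii) of Definition 22, with Cn(A ∪ P)
    restricted to literals approximated by A ∪ P itself."""
    return bool(I | P) and consistent(A | P) and not (A | P) & I

def decide(A, new_info):
    """Hypothetical decision module: greedily classify each incoming
    belief as plausible (P) or implausible (I) w.r.t. the core A."""
    I, P = set(), set()
    for x in new_info:
        (P if consistent(A | P | {x}) else I).add(x)
    return I, P

I, P = decide({"p"}, ["q", "~p", "r"])
assert (I, P) == ({"~p"}, {"q", "r"})
assert is_a_evaluation({"p"}, I, P)
```

The input "~p" contradicts the protected core {p} and therefore ends up in I, while "q" and "r" are classified as plausible; the revision module would then revise by P and contract by I.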
The authors proposed two constructions: one based on the kernel operation (KEMR, denoted by ⊲σ) and another based on the partial meet operation (PMEMR, denoted by ⊲γ). Both of them were characterized axiomatically. EMR was compared with the operations of multiple package revision defined in (Falappa et al. 2012) (shown in Section 6). Roughly, the operations ∗σ and ∗γ are special cases of ⊲σ and ⊲γ, respectively, when I is empty (Theorem 19).

9 Conclusion and Open Problems

We presented a literature overview covering several models of Multiple Revision. We did not include the works on Iterated Revision, since our focus was on models in which the beliefs of the incoming set are processed simultaneously. One of our aims was to unify the terminology and the symbols used in the area. Another goal was to give an overview of the Multiple Revision literature, briefly describing different works and comparing some of them. When applicable, we observed the possibility or not of reduction to singleton revision.

The operations described in this paper involve basically two main models of epistemic states: sentential models (belief sets, belief bases and belief states) and possible worlds. Most of the operations work with package revision, but there are also a few for choice revision.

We can now list some open problems. Regarding the operators defined in (Falappa et al. 2012), the interrelations between package kernel and partial meet revision are not clear. From (Yuan, Ju, and Wen 2014), a possible future work is the characterization of non-prioritized multiple revision in a unified way, without dividing it into two modules. From (Yuan 2017), further exploration includes the definition of consolidation based on core beliefs and its relation with metabolic revision. For the choice revision operators defined in (Zhang 2018), it remains to define their respective constructions and also to study and establish the differences and connections between these operators and the one based on Descriptor Revision from (Zhang 2019). Choice Revision could also be defined and constructed without using contraction as an intermediate step, as well as investigated with infinite inputs. In relation to other approaches, Selective Revision could be generalized to the multiple case, and non-prioritized multiple revision operators could be investigated in their relation with merge operators (Fuhrmann 1997; Falappa et al. 2012). Regarding the underlying logic of each model presented here, as most of them were developed for propositional logic, an important future work is to investigate how they can be adapted to work with other logics.

References

Grove, A. 1988. Two modellings for theory change. Journal of Philosophical Logic 157–170.
Hansson, S. O. 1993. Reversing the Levi identity. Journal of Philosophical Logic 22(6):637–669.
Hansson, S. O. 1994. Kernel contraction. Journal of Symbolic Logic 59(3):845–859.
Hansson, S. O. 1997. Semi-revision. Journal of Applied Non-Classical Logics 7(1-2):151–175.
Hansson, S. O. 1999a. A survey of non-prioritized belief revision. Erkenntnis 50(2-3):413–427.
Hansson, S. O. 1999b. A Textbook of Belief Dynamics. Kluwer Academic Publishers.
Hansson, S. O. 2014. Descriptor revision. Studia Logica 102(5):955–980.
Konieczny, S., and Pérez, R. P. 2002. Merging information under constraints: a logical framework. Journal of Logic and Computation 12(5):773–808.
Lindström, S. 1991. A semantic approach to nonmonotonic reasoning: inference operations and choice. Uppsala Prints and Preprints in Philosophy 6.
Peppas, P.; Koutras, C. D.; and Williams, M.-A. 2012. Maps in multiple belief change. ACM Trans. Comput. Logic 13(4):30:1–30:23.
Peppas, P. 2004. The limit assumption and multiple revision. Journal of Logic and Computation 14(3):355–371.
Rott, H. 2001. Change, Choice and Inference. Oxford University Press.
Valdez, N. J., and Falappa, M. A. 2016. Multiple revision on Horn belief bases. In XXII Congreso Argentino de Ciencias de la Computación (CACIC 2016).
Yuan, Y.; Ju, S.; and Wen, X. 2014. Evaluative multiple revision based on core beliefs. Journal of Logic and Computation 25(3):781–804.
Yuan, Y. 2017. Rational metabolic revision based on core beliefs. Synthese 194(6):2121–2146.
Zhang, D., and Foo, N. 2001. Infinitary belief revision. Journal of Philosophical Logic 30(6):525–570.
Zhang, D.; Chen, S.; Zhu, W.; and Chen, Z. 1997a. Representation theorems for multiple belief changes. In IJCAI, 89–94.
Zhang, D.; Chen, S.; Zhu, W.; and Li, H. 1997b. Nonmonotonic reasoning and multiple belief revision. In IJCAI, 95–101.
Zhang, D. 1996. Belief revision by sets of sentences. Journal of Computer Science and Technology 11(2):108–125.
Zhang, L. 2018. Choice revision on belief bases. arXiv preprint arXiv:1805.01325.
Zhang, L. 2019. Choice revision. Journal of Logic, Language and Information 28(4):577–599.
Alchourrón, C., and Makinson, D. 1982. On the logic of theory change: Contraction functions and their associated revision functions. Theoria 48(1):14–37.
Alchourrón, C.; Gärdenfors, P.; and Makinson, D. 1985. On the logic of theory change. Journal of Symbolic Logic 50:510–530.
Darwiche, A., and Pearl, J. 1997. On the logic of iterated belief revision. Artificial Intelligence 89(1–2):1–29.
Delgrande, J., and Jin, Y. 2012. Parallel belief revision. Artificial Intelligence 176(1):2223–2245.
Falappa, M.; Kern-Isberner, G.; Reis, M.; and Simari, G. 2012. Prioritized and non-prioritized multiple change on belief bases. Journal of Philosophical Logic 41(1):77–113.
Falappa, M. A.; Kern-Isberner, G.; and Simari, G. R. 2002. Explanations, belief revision and defeasible reasoning. Artificial Intelligence 141(1-2):1–28.
Fermé, E., and Hansson, S. O. 1999. Selective revision. Studia Logica 63(3):331–342.
Fuhrmann, A., and Hansson, S. O. 1994. A survey of multiple contractions. Journal of Logic, Language and Information 3(1):39–75.
Fuhrmann, A. 1988. Relevant Logics, Modal Logics and Theory Change. Ph.D.
Dissertation, Australian National University.
Fuhrmann, A. 1997. An Essay on Contraction. FOLLI.
Gärdenfors, P. 1988. Knowledge in Flux: Modeling the Dynamics of Epistemic States. MIT Press.

A Principle-based Approach to Bipolar Argumentation

Liuwen Yu
University of Luxembourg, Luxembourg; University of Bologna, Italy; University of Turin, Italy

Leendert van der Torre
University of Luxembourg, Luxembourg; Zhejiang University, China

Abstract

Support relations among arguments can induce various kinds of indirect attacks corresponding to deductive, necessary or evidential interpretations. These different kinds of indirect attacks have been used in meta-argumentation, to define reductions of bipolar argumentation frameworks to traditional Dung argumentation frameworks, and to define constraints on the extensions of bipolar argumentation frameworks. In this paper, we give a complete analysis of twenty-eight bipolar argumentation framework semantics and fourteen principles. Twenty-four of these semantics are for deductive and necessary support and defined using a reduction, and four other semantics are defined directly. We consider five principles directly corresponding to the different kinds of indirect attack, three basic principles concerning conflict-freeness and the closure of extensions under support, three dynamic principles, a generalized directionality principle, and two supported argument principles. We show that two principles can be used to distinguish all reductions, and that some principles do not distinguish any reductions. Our results can be used directly to obtain a better understanding of the different kinds of support, to choose an argumentation semantics for a particular application, and to guide the search for new argumentation semantics of bipolar argumentation frameworks. Indirectly, they may be useful for the search for a structured theory of support, and the design of algorithms for bipolar argumentation.

keywords: Abstract argumentation, support, principle-based approach, bipolar argumentation framework

Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Introduction

In his requirements analysis for formal argumentation, Gordon (2018) proposes the following definition, covering more clearly argumentation in deliberation as well as persuasion dialogues: “Argumentation is a rational process, typically in dialogues, for making and justifying decisions of various kinds of issues, in which arguments pro and con alternative resolutions of the issues (options or positions) are put forward, evaluated, resolved and balanced.” At an abstract level, it seems that these pro and con arguments can be represented more easily in so-called bipolar argumentation frameworks (Cayrol and Lagasquie-Schiex 2005; 2009; 2010; 2013), containing besides attack also a support relation among arguments.

The concept of support has attracted quite some attention in the formal argumentation literature, maybe because it remains controversial how to use support relations to compute extensions. Most studies distinguish deductive support, necessary support and evidential support. Deductive support (Boella et al. 2010) captures the intuition that if a supports b, then the acceptance of a implies the acceptance of b, and as a consequence the non-acceptance of b implies the non-acceptance of a. Evidential support (Besnard and others 2008; Oren, Luck, and Reed 2010) distinguishes prima-facie from standard arguments, where prima-facie arguments do not require any support from other arguments to stand, while standard arguments must be supported by at least one prima-facie argument. Necessary support (Nouioua and Risch 2010) captures the intuition that if a supports b, then the acceptance of a is necessary to get the acceptance of b, or equivalently the acceptance of b implies the acceptance of a. Despite this diversity, the study of support in abstract argumentation seems to agree on the following three points.

Relation support and attack: The role of support among arguments has often been defined as subordinate to attack, in the sense that in deductive and necessary support, if there are no attacks then there is no effect of support. On the contrary, in the evidential approach, without support there is no accepted argument, even if there is no attack.

Diversity of support: Different interpretations for the notion of support can be distinguished, such as deductive (Boella et al. 2010), necessary (Nouioua and Risch 2011; Nouioua 2013) and evidential support (Besnard and others 2008; Oren, Luck, and Reed 2010; Polberg and Oren 2014).

Structuring support: Whereas attack has been further structured into rebutting attack, undermining attack and undercutting attack, the different kinds of support have not yet led to a structured argumentation theory for bipolar argumentation frameworks.

The picture that emerges from the literature is that the notion of support is much more diverse than the notion of attack. Whereas there is a general agreement in the formal argumentation community on how to interpret attack, even when different kinds of semantics have been defined, there is much less consensus on the interpretation of support. Moreover, it seems that each variant of support can be used for different applications. This paper contributes to a further understanding of the concept of support using a principle-based analysis. Some of the fourteen principles we study in this paper turn out to discriminate the various reduction-based semantics of bipolar argumentation frameworks, and they can therefore be used to choose one semantics over another. Some other principles always hold, or never hold, and can therefore guide the search for new semantics of bipolar argumentation frameworks. Principles and axioms can be used in many ways.
Often, they conceptualize the behavior of a system at a higher level of abstraction. Moreover, in the absence of a standard approach, principles can be used as a guideline for choosing the appropriate definitions and semantics depending on various needs. In formal argumentation, principles therefore tend to have a technical nature: the most discussed principles are admissibility, directionality and SCC-decomposability. In this paper, starting from these, we study a generalized notion of directionality, taking into account not only the directionality of the attacks but also of the supports.

The layout of this paper is as follows. In Section 2 we introduce the four kinds of indirect attack corresponding to the deductive and necessary interpretations discussed in the literature on bipolar argumentation. In Section 3 we introduce the four atomic reductions corresponding to these four kinds of indirect attack, and two iterated reductions. In Section 4 we introduce the fragment of bipolar argumentation frameworks with evidential support. In Section 5 we introduce the new principles and we give an analysis of which properties are satisfied by which reduction. Section 6 is devoted to related work and to some concluding remarks.

We distinguish several definitions of extension, each corresponding to an acceptability semantics that formally rules the argument evaluation process.

Definition 3 (Acceptability semantics (Dung 1995)) Let ⟨A, R⟩ be an AF:
• E ⊆ A is admissible iff it is conflict-free and defends all its elements.
• A conflict-free E ⊆ A is a complete extension iff E = {a | E defends a}.
• E ⊆ A is the grounded extension iff it is the smallest (for set inclusion) complete extension.
• E ⊆ A is a preferred extension iff it is a largest (for set inclusion) complete extension.
• E ⊆ A is a stable extension iff it is a preferred extension that defeats all arguments in A \ E.
Example 1 (Four arguments) The argumentation framework visualized on the left-hand side of Figure 1 is defined by AF = ⟨{a, b, c, d}, {(a, b), (b, a), (c, d), (d, c)}⟩. There are four preferred extensions, {a, c}, {b, c}, {a, d}, {b, d}, and they are all stable extensions.

Figure 1: An argumentation framework (AF, left) and a bipolar argumentation framework (BAF, right)

A bipolar argumentation framework is an extension of Dung's framework. It is based on a binary attack relation between arguments and a binary support relation over the set of arguments.

Definition 4 (Bipolar argumentation framework (Cayrol and Lagasquie-Schiex 2005)) A bipolar argumentation framework (BAF, for short) is a 3-tuple ⟨A, R, S⟩ where A is a set of arguments, R ⊆ A × A is a binary attack relation, S ⊆ A × A is a binary support relation, and R ∩ S = ∅. Thus, an AF is a special BAF of the form ⟨A, R, ∅⟩. A BAF can be represented as a directed graph. Given a, b, c ∈ A, (a, b) ∈ R means a attacks b, noted a → b; (b, c) ∈ S means b supports c, noted b ⇢ c.

Example 2 (Four arguments, continued) The bipolar argumentation framework visualized on the right-hand side of Figure 1 extends the argumentation framework in Example 1 such that a supports c.

The fact that support relations only influence the extensions when there are also attacks leads to the study of the interactions between attack and support. In the literature, the different kinds of relations between support and attack have been studied as different notions of indirect attack.

Indirect Attacks in Bipolar Argumentation Frameworks

This section gives a brief summary of the concept of indirect attack in bipolar argumentation. Dung's argumentation framework (Dung 1995) consists of a set of arguments and a relation between arguments, called attack.

Definition 1 (Argumentation framework (Dung 1995)) An argumentation framework (AF) is a tuple ⟨A, R⟩ where A is a set of arguments and R ⊆ A × A is a binary attack relation over A.
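Definitions 1–3 are small enough to check by exhaustive search over subsets of A. The sketch below is an illustrative brute-force checker (exponential, so only for toy frameworks; it is not an algorithm from the paper) and re-derives the four preferred extensions of Example 1.

```python
from itertools import combinations

def dung_semantics(args, atk):
    """Brute-force complete/grounded/preferred/stable extensions
    of the AF (args, atk), following Definitions 2 and 3."""
    args = set(args)
    def conflict_free(E):
        return not any((x, y) in atk for x in E for y in E)
    def defends(E, c):
        return all(any((a, b) in atk for a in E)
                   for b in args if (b, c) in atk)
    subsets = [frozenset(s) for r in range(len(args) + 1)
               for s in combinations(sorted(args), r)]
    # Complete: conflict-free E with E = {a | E defends a}.
    complete = [E for E in subsets if conflict_free(E)
                and E == frozenset(x for x in args if defends(E, x))]
    return {
        "complete": complete,
        "grounded": min(complete, key=len),          # least complete extension
        "preferred": {E for E in complete            # maximal complete
                      if not any(E < F for F in complete)},
        "stable": {E for E in complete               # attacks everything outside
                   if all(any((a, b) in atk for a in E)
                          for b in args - E)},
    }

# Example 1: two symmetric attack pairs a <-> b and c <-> d.
sem = dung_semantics({"a", "b", "c", "d"},
                     {("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")})
assert sem["preferred"] == {frozenset("ac"), frozenset("ad"),
                            frozenset("bc"), frozenset("bd")}
assert sem["stable"] == sem["preferred"] and sem["grounded"] == frozenset()
```

The assertions reproduce the example: four preferred extensions, all stable, and an empty grounded extension (every argument is attacked).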
An AF can be represented as a directed graph, where the nodes represent arguments and the edges represent the attack relation: given a, b ∈ A, (a, b) ∈ R stands for a attacks b, noted a → b.

Definition 2 (Conflict-freeness & Defense (Dung 1995)) Let ⟨A, R⟩ be an AF:
• E ⊆ A is conflict-free iff ∄a, b ∈ E such that (a, b) ∈ R.
• E ⊆ A defends c iff ∀b ∈ A with (b, c) ∈ R, ∃a ∈ E such that (a, b) ∈ R.

Definition 5 (Four indirect attacks (Polberg 2017)) Let BAF = ⟨A, R, S⟩ be a BAF and a, b ∈ A. There is:
• a supported attack from a to b in BAF iff there exists an argument c s.t. there is a sequence of supports from a to c and c attacks b, represented as (a, b) ∈ R^sup;
• a mediated attack from a to b in BAF iff there exists an argument c s.t. there is a sequence of supports from b to c and a attacks c, represented as (a, b) ∈ R^med;
• a secondary attack from a to b in BAF iff there exists an argument c s.t. there is a sequence of supports from c to b and a attacks c, represented as (a, b) ∈ R^sec;
• an extended attack from a to b in BAF iff there exists an argument c s.t. there is a sequence of supports from c to a and c attacks b, represented as (a, b) ∈ R^ext.

Such indirect attacks are built from the combination of the direct attacks and the supports; then, from the obtained indirect attacks and the supports, we can build additional indirect attacks, and so on.

Definition 7 (Tiered indirect attacks (Polberg 2017)) Let BF = (A, R, S) be a BAF. The tiered indirect attacks of BF are as follows:
• R_0^ind = ∅
• R_1^ind = {R_∅^sup, R_∅^sec, R_∅^med, R_∅^ext}
• R_i^ind = {R_E^sup, R_E^sec, R_E^med, R_E^ext | E ⊆ R_{i−1}^ind} for i > 1, where:
– R_E^sup = {(a, b) | there exists an argument c s.t. there is a sequence of supports from a to c and (c, b) ∈ R ∪ ⋃E}
– R_E^sec = {(a, b) | there exists an argument c s.t. there is a sequence of supports from c to b and (a, c) ∈ R ∪ ⋃E}
– R_E^med = {(a, b) | there exists an argument c s.t.
there is a sequence of supports from b to c and (a, c) ∈ R ∪ ⋃E}
– R_E^ext = {(a, b) | there exists an argument c s.t. there is a sequence of supports from c to a and (c, b) ∈ R ∪ ⋃E}
With R^ind we will denote the collection of all sets of indirect attacks, ⋃_{i=0}^∞ R_i^ind.

Figure 2: Four kinds of indirect attack: (a) supported attack, (b) mediated attack, (c) secondary attack, (d) extended attack

Definition 6 (Super-mediated attack (Cayrol and Lagasquie-Schiex 2013)) Let BAF = ⟨A, R, S⟩ be a BAF and a, b ∈ A. There is a super-mediated attack from a to b in BAF iff there exists an argument c s.t. there is a sequence of supports from b to c and a attacks c directly or by a supported attack, represented as (a, b) ∈ R_{R^sup}^med.

Deductive and necessary support

In this section we rephrase the different kinds of indirect attacks as an intermediate step towards semantics for bipolar argumentation frameworks. The reductions can be used together with Definitions 2 and 3 to define the extensions of a bipolar argumentation framework. The notion of conflict-freeness does not change, in the sense that the conflict-free principle for bipolar frameworks is defined in the same way as the related principle for Dung's theory, though now also indirect attacks are taken into account. For example, support relations can help arguments to defend against other arguments, and in general support relations can influence the acceptability of arguments in various ways. If we have a bipolar argumentation framework without support relations, we would like to recover Dung's Definitions 2 and 3, such that bipolar argumentation is a proper extension of Dung's argumentation. Moreover, if we have a bipolar argumentation framework without attack relations, then we would like to accept all arguments, for all semantics. The idea of the atomic reductions is that we interpret all the support relations of the framework according to one of the types of support.
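The four indirect attacks of Definition 5 can be computed by first closing the support relation under composition. The following sketch (illustrative, not from the paper) treats a “sequence of supports” as a support path of length at least one, and is run on the BAF of Example 2 (attacks a ↔ b and c ↔ d, support a ⇢ c).

```python
def support_closure(S):
    """Pairs (x, y) linked by a support path of length >= 1."""
    closure = set(S)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in S:
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def indirect_attacks(R, S):
    """Supported, mediated, secondary and extended attacks (Definition 5)."""
    P = support_closure(S)
    sup, med, sec, ext = set(), set(), set(), set()
    for (x, y) in P:              # support sequence from x to y
        for (u, v) in R:          # direct attack u -> v
            if u == y: sup.add((x, v))   # x supports attacker y of v
            if v == y: med.add((u, x))   # u attacks y, which x supports
            if v == x: sec.add((u, y))   # u attacks supporter x of y
            if u == x: ext.add((y, v))   # supporter x of y attacks v
    return sup, med, sec, ext

R = {("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")}
S = {("a", "c")}
sup, med, sec, ext = indirect_attacks(R, S)
assert sup == {("a", "d")}   # a supports c, and c attacks d
assert med == {("d", "a")}   # d attacks c, which a supports
assert sec == {("b", "c")}   # b attacks a, which supports c
assert ext == {("c", "b")}   # a supports c, and a attacks b
```

These four induced attacks are exactly the edges added by the atomic reductions discussed next.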
This will help us in the analysis of the behavior of the different kinds of support. We can obtain various kinds of indirect attacks according to different interpretations of the support relation.

Figure 3: Super-mediated attack

Definition 8 (Existing reductions of BAF to AF) Given a BAF = ⟨A, R, S⟩, ∀a, b, c ∈ A:
• SupportedReduction (Cayrol and Lagasquie-Schiex 2013) (RS for short): R^sup is the collection of supported attacks, where (a, b) ∈ R^sup iff (a, c) ∈ S and (c, b) ∈ R; RS(BAF) = (A, R ∪ R^sup).
• MediatedReduction (Cayrol and Lagasquie-Schiex 2013) (RM for short): R^med is the collection of mediated attacks, where (a, b) ∈ R^med iff (b, c) ∈ S and (a, c) ∈ R; RM(BAF) = (A, R ∪ R^med).
• SecondaryReduction (Cayrol and Lagasquie-Schiex 2013) (R2 for short): R^sec is the collection of secondary attacks, where (a, b) ∈ R^sec iff (c, b) ∈ S and (a, c) ∈ R; R2(BAF) = (A, R ∪ R^sec).
• ExtendedReduction (Cayrol and Lagasquie-Schiex 2013) (RE for short): R^ext is the collection of extended attacks, where (a, b) ∈ R^ext iff (c, a) ∈ S and (c, b) ∈ R; RE(BAF) = (A, R ∪ R^ext).
• DeductiveReduction (Polberg 2017) (RD for short): let R′ = R^sup ∪ R_{R^sup}^med ⊆ ⋃R^ind be the collection of supported and super-mediated attacks in BF; RD(BAF) = (A, R ∪ R′).
• NecessaryReduction (Polberg 2017) (RN for short): let R′ = R^sec ∪ R^ext ⊆ ⋃R^ind be the collection of secondary and extended attacks in BF; RN(BAF) = (A, R ∪ R′).

In general, we write E(BAF) for the extensions of a BAF, which is characterized by a reduction and a Dung semantics. We write E_S(BAF) for the extensions of the bipolar framework using Dung semantics S.

Example 3 (Six reductions, continued) The reduction of the bipolar argumentation framework in Example 2 is visualized in Figure 4. The reductions lead to the following extensions.
• After RS, we get the associated AF with the added attack (a, d); the preferred extensions are {a, c}, {b, c}, {b, d}.
• After RM, we get the associated AF with the added attack (d, a); the preferred extensions are {a, c}, {b, c}, {b, d}.
• After R2, we get the associated AF with the added attack (b, c); the preferred extensions are {a, c}, {a, d}, {b, d}.
• After RE, we get the associated AF with the added attack (c, b); the preferred extensions are {a, c}, {a, d}, {b, d}.
• After RD, we get the associated AF with the added attacks (a, d) and (d, a); the preferred extensions are {a, c}, {b, c}, {b, d}.
• After RN, we get the associated AF with the added attacks (b, c) and (c, b); the preferred extensions are {a, c}, {a, d}, {b, d}.

Figure 4: The initial BAF with the associated AFs after the reductions

It should be noted that these atomic reductions can be combined in different ways into more complex notions of reduction. For example, it is common practice to add the indirect attacks of two types together, and the order in which the attacks are added can also have an impact.

Evidential support

Evidential support is usually defined for a more general bipolar framework in which sets of arguments can attack or support other arguments. To keep our presentation uniform and to compare evidential support to deductive and necessary support, we only consider the fragment of bipolar argumentation frameworks where individual arguments attack or support other arguments. This also simplifies the following definitions. Moreover, evidential support contains special arguments which do not need to be supported by other arguments. Such arguments may have to satisfy other constraints, for example that they cannot be attacked by ordinary arguments, or that they cannot attack ordinary arguments. To keep our analysis uniform, we do not explicitly distinguish such special arguments, but encode them implicitly: if an argument supports itself, then it is such a special argument. This leads to the following definition of an evidential sequence for an argument.

Definition 9 (Evidential sequence) Given a BAF = ⟨A, R, S⟩, a sequence (a_0, . . . , a_n) of elements of A is an evidential sequence for argument a_n iff (a_0, a_0) ∈ S and, for 0 ≤ i < n, (a_i, a_{i+1}) ∈ S.

Definition 10 (e-Defense & e-Admissible) Given a BAF = ⟨A, R, S⟩, a set of arguments S ⊆ A e-defends argument a ∈ A iff for every evidential sequence (a_0, . . . , a_n) where a_n attacks a, there is an argument b ∈ S attacking one of the arguments of the sequence. Moreover, a set of arguments S is e-admissible iff
• for every argument a ∈ S there is an evidential sequence (a_0, . . . , a) such that each a_i ∈ S (a is e-supported by S),
• S is conflict-free, and
• S e-defends all its elements.

In line with Dung's definitions, a set of arguments is called an e-complete extension if it is e-admissible and contains all arguments it e-defends; it is the e-grounded extension iff it is a minimal e-complete extension; and it is e-preferred iff it is a maximal e-admissible extension. Moreover, it is e-stable if
As this reduction is not directly relevant to the concerns of this paper, we refer the reader to the relevant literature (Polberg 2017).

Figure 5: The counterexample of Proof 1 (the initial BAF and BAF′)

Table 1: Comparison among the reductions and the proposed principles. We refer to Dung's semantics as follows: complete (C), grounded (G), preferred (P), stable (S). When a principle is never satisfied by a certain reduction for any semantics, we use the × symbol. P1 refers to Principle 1, and the same holds for the others.

Red.  P1    P2    P3    P4    P5
RS    CGPS  CGPS  ×     ×     ×
RM    CGPS  ×     CGPS  ×     ×
R2    CGPS  ×     ×     CGPS  ×
RE    CGPS  ×     ×     ×     CGPS
RD    CGPS  CGPS  ×     CGPS  ×
RN    CGPS  ×     CGPS  ×     CGPS
REv   ×     ×     ×     ×     ×

Principle-based analysis of indirect attacks and properties
In this section we introduce principles corresponding to the different notions of indirect attack. They correspond to the constraints TRA, nATT and n+ATT of Cayrol et al. (Cayrol and Lagasquie-Schiex 2015). Basically, these properties correspond to the interpretations underlying the different kinds of support.

Principle 1 (Transitivity) For each BAF = ⟨A, R, S⟩, if aSb and bSc, then E⟨A, R, S⟩ = E⟨A, R, S ∪ {aSc}⟩.

Principle 2 (Supported attack) For each BAF = ⟨A, R, S⟩, if aSc and cRb, then E⟨A, R, S⟩ = E⟨A, R ∪ {aRb}, S⟩.

Principle 3 (Mediated attack) For each BAF = ⟨A, R, S⟩, if bSc and aRc, then E⟨A, R, S⟩ = E⟨A, R ∪ {aRb}, S⟩.

Principle 6 (Conflict-free) Given a BAF = ⟨A, R, S⟩, if (a, b) ∈ R, then ∄E ∈ E(BAF) s.t. a, b ∈ E.

The important principle of closure of an extension under supported arguments was introduced by Cayrol et al. (Cayrol and Lagasquie-Schiex 2015).

Principle 7 (Closure) Given a BAF = ⟨A, R, S⟩, for all extensions E in E and all a, b ∈ A, if aSb and a ∈ E, then b ∈ E.

The following propositions show that closure under supported arguments holds only for some reductions.

Proposition 2 RS and RM satisfy Principle 7 for all the semantics.
Proof 2 We prove Proposition 2 by contradiction. Let E ⊆ A be a complete extension of the AF associated with a BAF after RM. Assume RM does not satisfy Principle 7 for the complete semantics, i.e., there are a ∈ E and b ∈ A\E such that (a, b) ∈ S. Since b ∉ E, there is some c ∈ A with (c, b) ∈ R but no d ∈ E that defends b, i.e., that attacks c. Since c attacks b and a supports b, c mediated-attacks a; as no d ∈ E attacks c, a is not defended, so E is not admissible. This contradicts E being a complete extension. Therefore, RM satisfies Principle 7 for the complete semantics.

Polberg (Polberg 2017) introduces a variant of closure, called inverse closure.

Principle 8 (Inverse Closure) Given a BAF = ⟨A, R, S⟩, for all extensions E in E and all a, b ∈ A, if aSb and b ∈ E, then a ∈ E.

Principle 4 (Secondary attack) For each BAF = ⟨A, R, S⟩, if cSb and aRc, then E⟨A, R, S⟩ = E⟨A, R ∪ {aRb}, S⟩.

Principle 5 (Extended attack) For each BAF = ⟨A, R, S⟩, if cSa and cRb, then E⟨A, R, S⟩ = E⟨A, R ∪ {aRb}, S⟩.

Proposition 1 REv does not satisfy Principle 1 for all the semantics.

Proof 1 We use a counterexample to prove that REv does not satisfy Principle 1 for the e-complete semantics. Assume a BAF = ⟨A, R, S⟩ in which A = {a, b, c, d}, R = {(d, b)}, S = {(a, a), (a, b), (b, c), (d, d)}; the e-complete extension of this BAF is {a, d}. Because a supports b and b supports c, transitivity requires that a supports c, so we obtain BAF′ = ⟨{a, b, c, d}, {(d, b)}, {(a, a), (a, b), (b, c), (d, d), (a, c)}⟩, whose e-complete extension is {a, c, d}; see Figure 5.

Table 1 shows the correspondence between the reductions and the first five principles. We omit the straightforward proofs.

Basic principles
We start with the basic property from Baroni's classification (Baroni and Giacomin 2007): conflict-freeness. Since we only add attack relations, and all of Dung's semantics satisfy the conflict-free principle, conflict-freeness is trivially satisfied for all reductions.
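To make the reduction mechanics concrete, here is a minimal Python sketch (illustrative names of our own choosing, not code from the paper) of the R2 reduction together with a brute-force computation of preferred extensions; it reproduces the counterexample used later in Proof 4. The enumeration is exponential and only meant for toy frameworks.

```python
from itertools import combinations

def r2_reduction(attacks, supports):
    """R2 (secondary attack): if c supports b and a attacks c, add (a, b)."""
    out = set(attacks)
    changed = True
    while changed:                      # iterate to a fixpoint for support chains
        changed = False
        for (c, b) in supports:
            for (a, target) in list(out):
                if target == c and (a, b) not in out:
                    out.add((a, b))
                    changed = True
    return out

def conflict_free(ext, attacks):
    return all((x, y) not in attacks for x in ext for y in ext)

def admissible(ext, args, attacks):
    """Conflict-free, and every attacker of ext is counter-attacked from ext."""
    if not conflict_free(ext, attacks):
        return False
    return all(any((d, c) in attacks for d in ext)
               for c in args for e in ext if (c, e) in attacks)

def preferred(args, attacks):
    """Maximal (w.r.t. set inclusion) admissible sets, by brute force."""
    adm = [set(s) for n in range(len(args) + 1)
           for s in combinations(sorted(args), n)
           if admissible(set(s), args, attacks)]
    return [e for e in adm if not any(e < f for f in adm)]

# The counterexample of Proof 4: R = {(b,a),(a,c)}, S = {(c,b)}
args = {"a", "b", "c"}
af = r2_reduction({("b", "a"), ("a", "c")}, {("c", "b")})
print(sorted(map(sorted, preferred(args, af))))  # [['a'], ['b', 'c']]
```

R2 here adds the attack (a, b) because c supports b and a attacks c, which yields the two preferred extensions {a} and {b, c}.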
The following proposition shows that the reductions that do not satisfy closure satisfy the inverse closure principle instead. Consequently, closure and inverse closure are good principles to distinguish the behavior of the reductions.

Proposition 3 R2 and RE satisfy Principle 8 for all the semantics.

Proof 3 We prove Proposition 3 by contradiction. Let E ⊆ A be a complete extension of the AF associated with a BAF after R2. Assume R2 does not satisfy Principle 8, i.e., there are b ∈ E and a ∉ E such that (a, b) ∈ S. Since a ∉ E, there is some c ∈ A with (c, a) ∈ R but no d ∈ E that defends a, i.e., that attacks c. Since c attacks a and a supports b, c secondary-attacks b; as no d ∈ E attacks c, b is not defended, so E is not admissible. This contradicts E being a complete extension. Therefore, R2 satisfies Principle 8 for the complete semantics.

Principle 10 (Addition persistence) Suppose E is an extension of a BAF = ⟨A, R, S⟩ and a, b ∈ E, and let BAF′ = ⟨A, R, S ∪ {(a, b)}⟩ be the framework with the addition of a support relation from a to b. Then E is also an extension of BAF′.

Proposition 5 All the reductions satisfy Principle 10 for all the semantics.

Proof 6 Due to lack of space we only provide a proof sketch. When two arguments are already in an extension E of a BAF and we add a support relation between them, there are three situations: first, no new attack needs to be added; second, a new attack is added from an argument inside the extension to an argument outside it, which has no influence on the extension; third, a new attack is added from an outside argument to an inside argument, but the attacked argument is still defended. Thus E is still an extension of BAF′.

As expected, addition persistence holds for all reductions. However, Principle 9 only holds for the grounded semantics. Below is the proof that R2 does not satisfy Principle 9 for the preferred semantics; we omit the other proofs due to lack of space.
Dynamic principles
Dynamic properties often give insight into the behavior of semantics. A more fine-grained analysis is based on dynamic properties that consider the addition or removal of relations in certain cases. Principle 9 says that adding support relations can only lead to a decrease in the number of extensions, as in Example 3.

Principle 9 (Number of extensions) |ES(A, R, S ∪ S′)| ≤ |ES(A, R, S)|.

Proposition 4 None of the reductions satisfies Principle 9 for any of the semantics except the grounded semantics.

Proof 4 We use a counterexample to prove that R2 does not satisfy Principle 9 for the preferred semantics. Assume a BAF = ⟨A, R, S⟩ in which A = {a, b, c}, R = {(b, a), (a, c)}, S = ∅; the preferred extension of this BAF is {b, c}. Let c support b; then we have BAF′ = ⟨{a, b, c}, {(b, a), (a, c)}, {(c, b)}⟩, whose preferred extensions are {a} and {b, c}.

Figure 6: The counterexample of Proof 4 ((a) the initial BAF; (b) BAF′ after the addition of support)

Figure 7: The counterexample of Proof 5

The previous principle considered the addition of support relations among arguments which are both accepted (Principle 10). Along the same lines, the following principle considers the removal of support relations in specific cases.

Principle 11 (Removal persistence) Suppose E is an extension of a BAF and a, b, c ∈ A, where a supports c and c attacks b, with a ∈ E and b ∉ E. Let BAF′ be the framework that removes the support relation from a to c, i.e., with support relation S \ {(a, c)}; then E is also an extension of BAF′. Alternatively, suppose E is an extension of a BAF and a, b, c ∈ A, where c supports a and c attacks b, with a ∈ E and b ∉ E. Let BAF′ be the framework that removes the support relation from c to a, i.e., with support relation S \ {(c, a)}; then E is also an extension of BAF′.

Principle 11 only holds for some reductions.

Proposition 6 RM and R2 do not satisfy Principle 11 for all the semantics.
Proof 5 We use a counterexample to prove that REv does not satisfy Principle 9 for the e-complete semantics. Assume a BAF = ⟨A, R, S⟩ in which A = {a, b, c, d}, R = {(a, b), (b, a)}, S = {(c, c), (c, a)}; the e-complete extension of this BAF is {a, c}. Let d support d; then we have BAF′ = ⟨{a, b, c, d}, {(a, b), (b, a)}, {(c, c), (c, a), (d, d)}⟩, whose e-complete extensions are {a, c, d} and {b, c, d}.

Proof 7 We use a counterexample to prove that RM and R2 do not satisfy Principle 11 for the preferred semantics. The preferred extension of Figure 2(b) is {a}; if we delete the support relation from b to c, the preferred extension becomes {a, b}. The preferred extension of Figure 2(c) is {a}; if we delete the support relation from c to b, the preferred extension becomes {a, b}.

Directionality
Directionality can be generalized to bipolar argumentation as follows.

Definition 11 (Unattacked and unsupported arguments in a BAF) Given a BAF = ⟨A, R, S⟩, a set U is unattacked and unsupported if and only if there exists no a ∈ A\U such that a attacks U or a supports U. The set of unattacked and unsupported sets in a BAF is denoted US(BAF) (U for short).

Principle 12 (BAF Directionality) A BAF semantics σ satisfies the BAF directionality principle iff for every BAF and every U ∈ US(BAF), it holds that σ(BAF↓U) = {E ∩ U | E ∈ σ(BAF)}, where for BAF = ⟨A, R, S⟩, BAF↓U = ⟨U, R ∩ (U × U), S ∩ (U × U)⟩ is a projection and σ(BAF↓U) denotes the extensions of the projection.

In (Baroni and Giacomin 2007), the authors showed that the stable semantics violates directionality; we therefore omit the proof that none of the reductions satisfies Principle 12 for the stable semantics.

Figure 8: The counterexample for Proof 10

Proposition 10 RM does not satisfy Principle 12 for the grounded, complete and preferred semantics.

Proof 11 We use a counterexample to prove that RM does not satisfy Principle 12 for the preferred semantics, which is shown in Figure 9. Due to lack of space, we omit the details here.
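Definition 11 can be checked mechanically. The function below is a naive sketch under our own naming (not the paper's code), enumerating all candidate sets and keeping those that receive neither an attack nor a support from outside.

```python
from itertools import combinations

def unattacked_unsupported(args, attacks, supports):
    """All sets U with no attack or support arriving from outside U (Def. 11)."""
    result = []
    for n in range(len(args) + 1):
        for cand in combinations(sorted(args), n):
            U = set(cand)
            outside = args - U
            if not any((a, b) in attacks or (a, b) in supports
                       for a in outside for b in U):
                result.append(U)
    return result

# Toy check: with a attacking b, the sets receiving no outside arrow
# are the empty set, {a}, and {a, b} (but not {b}, which a attacks).
us = unattacked_unsupported({"a", "b"}, {("a", "b")}, set())
print(sorted(map(sorted, us)))  # [[], ['a'], ['a', 'b']]
```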
Proposition 7 RS satisfies Principle 12 for the grounded, complete and preferred semantics.

Proof 8 We prove Proposition 7 by contradiction. Assume RS does not satisfy Principle 12. Let U1 be an unattacked and unsupported set, let U2 = A\U1, and assume an AF semantics that satisfies directionality for the grounded, complete and preferred semantics. From the above, there are arguments such that a supports c and c attacks b, so that a supported-attacks b, with a ∈ U2 and b ∈ U1. But if b ∈ U1, then c must be in U1, and if c ∈ U1, then a must be in U1. Contradiction.

Proposition 8 R2 satisfies Principle 12 for the grounded, complete and preferred semantics.

Proof 9 We prove Proposition 8 by contradiction. Assume R2 does not satisfy Principle 12. Let U1 be an unattacked and unsupported set, let U2 = A\U1, and assume an AF semantics that satisfies directionality for the grounded, complete and preferred semantics. From the above, there are arguments such that c supports b and a attacks c, so that a secondary-attacks b, with a ∈ U2 and b ∈ U1. But if b ∈ U1, then c must be in U1, and if c ∈ U1, then a must be in U1. Contradiction.

Figure 9: The counterexample for Proof 11

Supported arguments
Principle 13 (Global support) Given a BAF = ⟨A, R, S⟩, for all extensions E in E, if a ∈ E, then there must be an argument b such that b ∈ E and b supports a.

Principle 14 (Grounded support) Given a BAF = ⟨A, R, S⟩, for all extensions E in E, if a ∈ E, then there must be an argument b ∈ E with (b, b) ∈ S (or (a, a) ∈ S) such that there is a support sequence (b, a0, . . . , an, a) with all ai ∈ E.

Proposition 11 All the reductions except REv do not satisfy Principles 13 and 14 for all the semantics.

Proof 12 We can simply use a counterexample to prove Proposition 11. Assume we have a BAF = ⟨{a}, ∅, ∅⟩. Then Ecomplete(RS(BAF)) = Ecomplete(RM(BAF)) = Ecomplete(R2(BAF)) = Ecomplete(RE(BAF)) = Ecomplete(RD(BAF)) = Ecomplete(RN(BAF)) = {{a}}. However, no argument supports a.
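The e-support requirement of Definitions 9 and 10 (and the support sequence of Principle 14) amounts to reachability from self-supported arguments along the support relation. A small sketch of that check, with function names of our own choosing:

```python
def e_supported(x, E, supports):
    """Is there an evidential sequence for x using only arguments in E?
    Sequences start from a self-supported argument (a, a) in S (Def. 9)."""
    frontier = {a for a in E if (a, a) in supports}
    seen = set(frontier)
    while frontier:
        frontier = {b for (a, b) in supports
                    if a in frontier and b in E and b not in seen}
        seen |= frontier
    return x in seen

# The BAF of Proof 1: S = {(a,a),(a,b),(b,c),(d,d)}
S = {("a", "a"), ("a", "b"), ("b", "c"), ("d", "d")}
print(e_supported("c", {"a", "b", "c", "d"}, S))  # True
print(e_supported("c", {"a", "c", "d"}, S))       # False (b is missing)
```

This matches the discussion of Proof 1: in the e-complete extension {a, d}, argument c is excluded precisely because its only evidential route runs through b.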
Proposition 9 RE does not satisfy Principle 12 for the grounded, complete and preferred semantics.

Proof 10 We use a counterexample to prove that RE does not satisfy Principle 12 for the preferred semantics. Assume we have the BAF visualized on the left of Figure 8; argument c supports a, so in the associated AF, visualized in the middle of Figure 8, we add an extended attack from a to b and likewise from a to d. From the initial BAF, we have an unattacked and unsupported set U = {b, c, d}; the right of Figure 8 visualizes BAF↓U. The preferred extensions are σ(BAF) = σ(AF) = {{a, c}} and σ(BAF↓U) = {{c}, {b, d}}, while {E ∩ U | E ∈ σ(BAF)} = {{c}} ≠ σ(BAF↓U). Thus, the extensions of BAF↓U are not the projection of those of the whole framework.

The following table summarizes the results of this section.

Table 2: Comparison among the reductions and the proposed principles.

Red.  P6    P7    P8    P9   P10   P11   P12   P13   P14
RS    CGPS  GCPS  ×     G    CGPS  GCPS  CGP   ×     ×
RM    CGPS  GCPS  ×     G    CGPS  ×     ×     ×     ×
R2    CGPS  ×     GCPS  G    CGPS  ×     CGP   ×     ×
RE    CGPS  ×     GCPS  G    CGPS  GCPS  ×     ×     ×
RD    CGPS  GCPS  ×     G    CGPS  ×     ×     ×     ×
RN    CGPS  ×     GCPS  G    CGPS  ×     ×     ×     ×
REv   CGPS  ×     ×     G    CGPS  ×     ×     GCPS  GCPS

Concluding remarks, related and future work
There is a gap between the formal analysis of bipolar frameworks, i.e., knowledge reasoning, and their informal representation, i.e., knowledge representation. In (Cayrol and Lagasquie-Schiex 2013), the authors give the following example written in natural language: “a bachelor degree supports a scholarship”. The interpretation of this sentence is subjective. One can either give support the necessary interpretation: “A bachelor degree is necessary for a scholarship, so if someone does not have a bachelor degree, one does not get a scholarship”; or give it a deductive interpretation: “A bachelor degree is sufficient for a scholarship, so if one does not get a scholarship, one does not have a bachelor degree”. This translation from natural language to a formal one is standard pragmatics, i.e.
whether “A supports B” means “A implies B” (sufficient reason) or “B implies A” (necessary reason), or mixing them to get a more complicated relation. As a result, different agents have different interpretations, and formal argumentation may play the role of a meta-dialogue to settle this issue, for example when we adopt it for legal interpretation. However, the considerations above do not invalidate our work on the principle-based approach for bipolar argumentation; on the contrary, because of the ambiguity at the pragmatic and semantic levels, a principle-based approach can be very useful to better understand the choices of a particular formalization. In this paper, we have proposed an axiomatic approach to bipolar argumentation framework semantics, which is summarized in Tables 1 and 2 of this paper. We considered seven reductions from bipolar argumentation frameworks to Dung-like abstract argumentation frameworks, four standard semantics to compute the set of accepted arguments, and fourteen principles to study the considered reductions. This work can be extended by considering more reductions, more semantics, and more principles. Our principles are all independent of which admissibility-based semantics is used, though some principles do not hold for the semi-stable semantics. Moreover, they do not hold for some of the naive-based semantics. Some general insights can be extracted from the tables. Our principles P6, P11 and P12 can be used to distinguish among different kinds of reductions, and can be used to choose a reduction for a particular application. Principles like P9 which never hold can be used in the further search for semantics. We can also define new semantics that directly associate extensions with bipolar argumentation frameworks, i.e., without using a reduction. The results of this paper give rise to many new research questions. We intend to analyze the similarity between reductions for preference-based argumentation frameworks and for bipolar argumentation frameworks.
In both frameworks, the support relation and the preference relation can be both added and removed. In this way, the theory of reductions for preference-based argumentation and bipolar argumentation is closely related to dynamic principles for AFs (Rienstra, Sakama, and van der Torre 2015), which can be a source of further principles. Similarly, as in preference-based argumentation, symmetric attack can be studied. Furthermore, the first volume of the Handbook of Formal Argumentation (Baroni, Gabbay, and Giacomin 2018) surveys the definitions, computation and analysis of abstract argumentation semantics depending on different criteria to decide the sets of acceptable arguments, and various extensions of Dung's framework have been proposed. There are many topics where bipolar argumentation could be used, and such uses could inspire new principles. Gordon's (Gordon 2018) requirements analysis for formal argumentation suggests that attack and support should be treated as equals in formal argumentation, which is also suggested by applications like DebateGraph. The handbook also discusses many topics where the theory of bipolar argumentation needs to be further developed. A structured theory of argumentation seems to be needed most. For example, maybe the most natural kind of support is a lemma supporting a proof. This corresponds to the idea of a sub-argument supporting its super-arguments. In Toulmin's argument structure, support arguments could be used as a warrant. Moreover, the role of support in dialogue needs to be clarified. Prakken argues that besides argumentation as inference, there is also argumentation as dialogue; several chapters of the handbook are concerned with this, such as those on argumentation schemes. The core of that theory is a set of critical questions, which can be interpreted as attacks. Maybe the answers to the critical questions can be modeled as support? Finally, like Doutre et al. (P. et al.
2017), we believe that the scope of the “principle-based approach” to argumentation semantics (Baroni and Giacomin 2007) can be widened. In the manifesto (Gabbay et al. 2018), it is argued that axioms are a way to relate formal argumentation to other areas of reasoning, e.g., social choice.

Acknowledgement
This project has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie ITN EJD grant agreement No 814177. We acknowledge Dr. Srdjan Vesic and Dr. Tjitze Rienstra for giving valuable advice.

References
Baroni, P., and Giacomin, M. 2007. On principle-based evaluation of extension-based argumentation semantics. Artificial Intelligence 171(10-15):675–700.
Baroni, P.; Gabbay, D.; and Giacomin, M. 2018. Handbook of Formal Argumentation. College Publications.
Besnard, P., et al. 2008. Semantics for evidence-based argumentation. Computational Models of Argument: Proceedings of COMMA 2008 172:276.
Boella, G.; Gabbay, D. M.; van der Torre, L.; and Villata, S. 2010. Support in abstract argumentation. In Proceedings of the Third International Conference on Computational Models of Argument (COMMA'10), 40–51. Frontiers in Artificial Intelligence and Applications, IOS Press.
Rienstra, T.; Sakama, C.; and van der Torre, L. 2015. Persistence and monotony properties of argumentation semantics. In International Workshop on Theory and Applications of Formal Argumentation, 211–225. Springer.
Cayrol, C., and Lagasquie-Schiex, M.-C. 2005. On the acceptability of arguments in bipolar argumentation frameworks. In European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty, 378–389. Springer.
Cayrol, C., and Lagasquie-Schiex, M.-C. 2009. Bipolar abstract argumentation systems. In Argumentation in Artificial Intelligence. Springer. 65–84.
Cayrol, C., and Lagasquie-Schiex, M.-C. 2010. Coalitions of arguments: A tool for handling bipolar argumentation frameworks.
International Journal of Intelligent Systems 25(1):83–109. Cayrol, C., and Lagasquie-Schiex, M.-C. 2013. Bipolarity in argumentation graphs: Towards a better understanding. International Journal of Approximate Reasoning 54(7):876– 899. Cayrol, C., and Lagasquie-Schiex, M.-C. 2015. An axiomatic approach to support in argumentation. In International Workshop on Theory and Applications of Formal Argumentation, 74–91. Springer. Dung, P. M. 1995. On the acceptability of arguments and its fundamental role in non-monotonic reasoning, logic programming and n-person games. Artificial Intelligence 77:321–357. Gabbay, D. M.; Giacomin, M.; Liao, B.; and van der Torre, L. W. N. 2018. Present and future of formal argumentation (dagstuhl perspectives workshop 15362). Dagstuhl Manifestos 7(1):69–95. Gordon, T. F. 2018. Towards requirements analysis for formal argumentation. Handbook of formal argumentation 1:145–156. Nouioua, F., and Risch, V. 2010. Bipolar argumentation frameworks with specialized supports. In 2010 22nd IEEE International Conference on Tools with Artificial Intelligence, volume 1, 215–218. IEEE. Nouioua, F., and Risch, V. 2011. Argumentation frameworks with necessities. In International Conference on Scalable Uncertainty Management, 163–176. Springer. Nouioua, F. 2013. Afs with necessities: further semantics and labelling characterization. In International Conference on Scalable Uncertainty Management, 120–133. Springer. Oren, N.; Luck, M.; and Reed, C. 2010. Moving between argumentation frameworks. In Proceedings of the 2010 International Conference on Computational Models of Argument. IOS Press. P., B.; David, V.; S., D.; and D., L. 2017. Subsumption and incompatibility between principles in ranking-based argumentation. In Proc. of the 29th IEEE International Conference on Tools with Artificial Intelligence ICTAI 2017. Polberg, S., and Oren, N. 2014. Revisiting support in abstract argumentation systems. In COMMA, 369–376. Polberg, S. 2017. 
Intertranslatability of abstract argumentation frameworks. Technical Report DBAI-TR-2017-104, Institute for . . . .

Discursive Input/Output Logic: Deontic Modals, Norms, and Semantic Unification

Ali Farjami
University of Luxembourg
ali.farjami@uni.lu

Abstract

However, its most developed formulation is the Kratzerian framework (Kratzer 2012). For Kratzer, the semantics of deontic modals has two contextual components: a set of accessible worlds and an ordering of those worlds. In the Kratzerian framework, each contextual component is given as a set of propositions. Formally, these are both functions, called conversational backgrounds, from evaluation worlds to sets of propositions. The modal base determines the set of accessible worlds and the ordering source induces the ordering on worlds (Von Fintel 2012). The other paradigm, namely norm-based semantics, was originally offered by David Makinson (Makinson 1999). He drew attention to a semantics for obligations and permissions where deontic operators are evaluated not with reference to a set of possible worlds but with reference to a set of norms. This set of norms cannot meaningfully be termed true or false. The logic developed by Makinson and van der Torre (2000) is known as Input/Output (I/O) logic. I/O logic is a fruitful framework for the theoretical study of deontic reasoning (Makinson and van der Torre 2001; Parent and van der Torre 2017b) and has strong connections to nonmonotonic logic (Makinson and van der Torre 2001), the other main approach for normative reasoning (Nute 2012). More examples of norm-based logics include the theory of reasons (Horty 2012), which is based on Reiter's default logic, and the logic for prioritized conditional imperatives (Hansen 2008). The question of this paper is: how can we integrate the norm-based approach in the sense of Makinson (1999) with the classic semantics in the sense of the Kratzerian framework?
To achieve a more uniform semantics (Horty 2014; Fuhrmann 2017; Parent and van der Torre 2017b) for deontic modals, we build I/O operations on top of Boolean algebras for deriving permissions and obligations. The approach is close to the work done by Gabbay, Parent, and van der Torre (2019): a geometrical view of I/O logic. For defining the I/O framework over an algebraic setting, they use the algebraic counterpart of the propositional-logic consequence relation (“Cn(A)”) within lattices, namely the upward-closed set of the infimum of A. They have characterized only the simple-minded output operation. We show that by choosing the “Up” operator,1 the upward-closed set, as the alge- The so-called modal logic and norm-based paradigms in deontic logic investigate the logical relations among normative concepts such as obligation, permission, and prohibition. The paper unifies these two paradigms by introducing an algebraic framework, called discursive input/output logic. In this framework, deontic modals are evaluated with reference both to a set of possible worlds and to a set of norms. The distinctive feature of the new framework is the non-adjunctive definition of input/output operations. This brings us the advantage of modeling discursive reasoning.

1 Introduction

The paper introduces a new logical framework for normative reasoning. It is a unification of the two main paradigms for deontic logic: the “modal logic” and “norm-based” paradigms. Each paradigm has its advantages. An advantage of the modal logic paradigm is the capability to extend the logic with other modalities, such as epistemic or temporal operators; advantages of the norm-based paradigm include the ability to explicitly represent normative codes, such as legal systems, and to use non-monotonic-logic techniques of common-sense reasoning. Unifying these two paradigms will provide us with a framework with all of these advantages simultaneously. For example, we can design a normative temporal system which changes over time.
The temporal reasoning comes from the advantage of the modal logic part, and the change operators (expansion, contraction) from the norm-based part. There are other frameworks, such as adaptive logic (Straßer 2013; Straßer, Beirlaen, and van de Putte 2016), that combine modal logic and norm-based approaches. The novelty of our approach is semantical unification. The unification is based on bringing the core semantical elements of both approaches into a single unit. Before introducing the suggested framework, we will briefly discuss the paradigms mentioned above in turn. The classic semantics for deontic modality was developed as a branch of modal logic in variants by Danielsson (1968), Hansson (1969), Føllesdal and Hilpinen (1970), van Fraassen (1973a; 1973b) and Lewis (2013; 1974), among others. Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. 1 Thanks to Majid Alizadeh for this suggestion. systems, we show how it is possible to add AND and other rules required for obligation (Makinson and van der Torre 2000) to the proof systems and find I/O operations for them. The introduced I/O operations admit normative conflicts and could receive technical benefits from the constrained version of I/O logics (Makinson and van der Torre 2001) for resolving normative conflicts. The introduced framework is a form of paraconsistent logic for admitting normative conflicts (see Subsection 6.1, (Goble 2013)). Moreover, we use Stone's representation theorem for Boolean algebras for integrating input/output logic with possible-world semantics.5 The I/O operations presented here are Tarskian closure operators over a set of conditional norms, so that they can be used as logical operators for reasoning about normative systems.
The algebrization of the I/O framework shows more similarity with the theory of joining-systems (Lindahl and Odelstad 2013), an algebraic approach for studying normative systems over Boolean algebras. We can say that norms in the I/O framework play the same role as joinings (Sun 2018) in the theory of Lindahl and Odelstad (2013). The article is structured as follows: Section 2 is about integrating the norm-based approach into the Kratzerian framework for deontic modals. Sections 3 and 4 give the soundness and completeness results of I/O operations for deriving permissions and obligations. Section 5 generalizes the I/O operations over any abstract logic. Section 6 concludes the paper. braic counterpart of the “Cn” operator, and by using the reversibility of inference rules in the I/O proof system, we can characterize all the previously studied I/O systems and find many more new logical systems. This suggested framework has a significant difference from other types of input/output logics. In contrast to the earlier input/output logics, we define non-adjunctive input/output operations. Non-adjunctive logical systems are those where deriving the conjunctive formula ϕ ∧ ψ from the set {ϕ, ψ} fails (Ciuciura 2013; Costa 2005). These systems are especially suited for modeling discursive reasoning. In fact, the first non-adjunctive system in the literature was proposed by Jaśkowski (1969) for discursive systems. “[...] such a system, which cannot be said to include theses that express opinions in agreement with one another, be termed a discursive system. To bring out the nature of the theses of such a system it would be proper to precede each thesis by the reservation: in accordance with the opinion of one of the participants of the discourse [...].
Hence the joining of a thesis to a discursive system has a different intuitive meaning than has assertion in an ordinary system.” (Jaśkowski 1969)
We build two groups of I/O operations for deriving permissions and obligations over Boolean algebras. The main difference between the two operations is similar to the possible-worlds characterization of box and diamond, where box is closed under AND ((□A ∧ □B) → □(A ∧ B)) and diamond is not.2 For each deontic modal of permission and obligation, a primitive operation3 is defined in the strong sense (Alchourrón and Bulygin 1971).4 The “Up” operator, for a given set A, sees all the elements that are greater than or equal to the elements of A. This operator, unlike the “Cn” operator, is not closed under conjunction, so that we do not have a ∧ ¬a ∈ Up(a, ¬a). Consequently, the new I/O operations defined by the “Up” operator instead of “Cn” operate on inputs independently and do not derive joint outputs (they are not closed under AND). According to the reversibility of inference rules in the I/O proof

2 Norms and Deontic Modals

Before going to our discussion, consider the following basic logical notions:
• W is a finite set of possible worlds, W = {w1, ..., wn}.
• P(W) is the set of propositions. A proposition x is true in a world w if and only if w ∈ x.
• If A is a set of propositions,
  – ∩A ≠ ∅ means that A is consistent;
  – ∩A ⊆ x means that x follows from A;
  – ∩A ∩ x ≠ ∅ means that x is compatible with A (A ∪ {x} is consistent).
• f is a function from W to P(P(W)), which is termed the modal base. f assigns to every possible world w the set of propositions f(w), called the premise set, that are known in w by us. We use the same formal definition for the ordering source function g.
• A normative system N ⊆ P(W) × P(W) denotes a set of norms (a, x), whose body and head are propositions. More explicitly, N^O denotes a set of obligatory norms and N^P a set of permissive norms. If (a, x) ∈ N^O, it means that “given a, it is obligatory that x”, and if (a, x) ∈ N^P, it means that “given a, it is permitted that x.”
• x ∈ out(N^O, A) means that, given normative system N^O and input set A (a state of affairs), x (an obligation) is in the output (a similar definition works for permission: x ∈

2 In the main literature of input/output logic, developed by Makinson and van der Torre (2000), Parent and van der Torre (2014; 2014; 2017a; 2018a), and Stolpe (2008b; 2008a; 2015), at least one form of the AND inference rule is present. Sun (2016) analyzed the norm-derivation rules of input/output logic in isolation. Still, it is not clear how we can combine them and build new logical systems, specifically systems that do not admit the rule of AND. For building a primitive operation for producing permissible propositions, we need to remove the AND rule from the proof system.
3 Von Wright (1951) defined permission as the primitive concept and obligation as its dual. Later, in the central literature of deontic logic, obligation was introduced as the primitive concept and permission defined as the dual concept, as well as in the earlier input/output logic for permission (Makinson and van der Torre 2003). Moreover, in the I/O literature, permission based on derogation is studied by Stolpe (2010b; 2010a) and permission based on constraints by Boella and van der Torre (2008).
4 For example, Alchourrón and Bulygin (1971) define strong permission as: “To say that p is strongly permitted in the case q by the system α means that a norm to the effect that p is permitted in q is a consequence of α”.
5 Another possible-worlds semantics of I/O logic is studied by Bochman (2005) for causal reasoning. It has no direct connection to the operational semantics (see Subsection 2.4, (Parent and van der Torre 2013)).
In the classic semantics, modals are quantifiers over possible worlds. Deontic modals are quantifiers over the best worlds in the domain of accessible worlds, represented as ∩f(w) in the Kratzerian framework: ought or have-to are necessity modals, whose prejacent (i.e. the proposition under the modal operator) is true in all of the best worlds, and may or can are possibility modals, whose prejacent is true in some of the best worlds (Von Fintel and Heim 2011; Von Fintel 2012). We can define deontic modals in the Kratzerian framework as follows (Von Fintel 2012):

[[be-allowed-to]]^{w,f,g} = λx (Best_{g(w)}(∩f(w)) ∩ x ≠ ∅)

[[have-to]]^{w,f,g} = λx (Best_{g(w)}(∩f(w)) ⊆ x)

where Best_{g(w)}(∩f(w)) is given as follows:

{w′ ∈ ∩f(w) : ¬∃w′′ ∈ ∩f(w) such that ∃y ∈ g(w) : w′′ ∈ y and w′ ∉ y}

In the definition, the domain of quantification is selected by a modal base and an ordering source for deriving deontic modals. Moreover, there are two ways of quantification: compatibility and entailment. Instead of quantification, we employ the modal base and ordering source functions from the Kratzerian framework (Kratzer 2012) and the detachment approach (Parent and van der Torre 2013) from the I/O framework. As an advantage of the detachment approach, we can characterize derivation systems that do not admit, for example, weakening of the output (WO) or strengthening of the input (SI). In Section 3 and Section 4, we develop various detachment methods using different I/O operations, in turn, for permission and obligation.

In input/output logic, the main semantical construct for normative propositions is the output operation, which represents the set of normative propositions related to the normative system N, regarding the state of affairs A, namely out(N, A). Detachment is the basic idea of the semantics of input/output logic (Parent and van der Torre 2013). The interpretation of "x is obligatory if a" is that "x can be detached in context a". In a discourse, the context is represented by a modal base or an ordering source in the Kratzerian framework.

In this case, deontic modals are evaluated with reference to a set of propositions given by the modal base and a normative system in each possible world. In the same way, in the world w, we can detach what we are allowed to do or have to do as the output of what we prefer (as the input set), represented as ∩g(w), and the corresponding normative systems N. The modal bases are always factual. Whenever there are possible inconsistencies, we can take the content as an ordering source (Kratzer, Pires de Oliveira, and Pessotto 2014). If the set g(w) is not consistent, we can draw conclusions by looking at maximal consistent subsets.6

Inconsistent premise sets: Suppose ∩g(w) = ∅, and

Maxfamily(g(w)) = {∩A | A ⊆ g(w) and A is consistent and maximal}

[[be-allowed-to]]^{w,g} = λN^P λx (x ∈ out(N^P, Maxfamily(g(w))))

[[have-to]]^{w,g} = λN^O λx (x ∈ out(N^O, Maxfamily(g(w))))

Both introduced modals ([[be-allowed-to]] and [[have-to]]) are in the strong sense (Alchourrón and Bulygin 1971). For each one, we can define a weak sense of modality using the dual operator, which means x ∈ [[be-allowed-to]]_{Weak-sense} if and only if ¬x ∉ [[have-to]]_{Strong-sense}, and x ∈ [[have-to]]_{Weak-sense} if and only if ¬x ∉ [[be-allowed-to]]_{Strong-sense}.

We present a family of output operations that derive different sets of permissions in Section 3. In Section 4, we define more complicated output operations for deriving obligations. As the distinctive feature, the output operations for obligations are closed under AND, which means: if x ∈ out(N^O, A) and y ∈ out(N^O, A), then x ∧ y ∈ out(N^O, A).
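The Best and Maxfamily constructions above can be computed directly over finite sets of worlds. The following is a sketch under my own encoding (propositions as frozensets of worlds; function and variable names are mine), reading the displayed definitions literally:

```python
from functools import reduce
from itertools import combinations

def inter(props, universe):
    """Intersection of a collection of propositions (sets of worlds)."""
    return reduce(lambda a, b: a & b, props, universe)

def best(domain, g):
    """Best_g(domain): worlds not 'beaten' by another world in the domain
    that satisfies an ordering-source proposition they fail."""
    return frozenset(
        w1 for w1 in domain
        if not any(any(w2 in y and w1 not in y for y in g) for w2 in domain)
    )

def maxfamily(g, universe):
    """Maxfamily(g): intersections of the maximal consistent subsets of g."""
    g = list(g)
    subsets = [s for r in range(len(g), 0, -1) for s in combinations(g, r)]
    consistent = [s for s in subsets if inter(s, universe)]  # nonempty meet
    maximal = [s for s in consistent
               if not any(set(s) < set(t) for t in consistent)]
    return {inter(s, universe) for s in maximal}

W = frozenset({"w1", "w2", "w3"})
p = frozenset({"w1", "w2"})

# With ordering source {p}, the best worlds are exactly the p-worlds:
assert best(W, [p]) == p

# An inconsistent ordering source {shA, shB} with shA ∩ shB = ∅ yields
# the two maximal consistent "standpoints" shA and shB:
shA, shB = frozenset({"w1"}), frozenset({"w2"})
assert maxfamily([shA, shB], W) == {shA, shB}
```

The second assertion mirrors the inconsistent-premise-set case: rather than exploding, Maxfamily hands each maximal consistent subset to the output operation separately.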
To unify the norm-based approach with the classic semantics, in each world w, we can detach what we are allowed to do or have to do as the output of what we know (as the input set), represented as ∩f(w), the intersection of the propositions given by the modal base, and the corresponding normative systems N.

Consistent premise sets: Suppose ∩f(w) ≠ ∅.

[[be-allowed-to]]^{w,f} = λN^P λx (x ∈ out(N^P, {∩f(w)}))

[[have-to]]^{w,f} = λN^O λx (x ∈ out(N^O, {∩f(w)}))

6 In the original input/output logic we have β ∈ out(N, {γ, ¬γ}) for all (α, β) ∈ deriv(N). So when the input set is inconsistent, we have explosion in the original input/output logic. Reasoning from an inconsistent premise set represented as a set of logical formulas is an important issue for deontic modals (see Chapter 1, (Kratzer 2012)).

3 Permissive Norms: Input/Output Operations

The term "input/output logic" is used broadly for a family of related systems such as simple-minded, basic and reusable (Parent and van der Torre 2018b; Makinson and van der Torre 2000). In this section, we use a similar terminology and introduce some input/output systems for deriving permissions over Boolean algebras. Each derivation system is closed under a set of rules. Moreover, we define systems that are closed only under weakening of the output (WO) or strengthening of the input (SI). We use a bottom-up approach for characterizing the different derivation systems. The rule of AND, for the output, is absent in the derivation systems presented in this section.

Outline of proof for soundness: for the input set A ⊆ B, we show that if (A, x) ∈ derive_0^B(N), then x ∈ out_0^B(N, A). By definition, (A, x) ∈ derive_0^B(N) iff (a, x) ∈ derive_0^B(N) for some a ∈ A. By induction on the length of derivation and the following theorem we have (a, x) ∈ derive_0^B(N) iff x ∈ out_0^B(N, {a}). Then by the definition of out_0^B we have x ∈ out_0^B(N, A). If A = {}, then by definition (A, x) ∉ derive_0^B(N).
The outline works for the soundness of the other systems presented in the paper as well.

Definition 1 (Boolean algebra) A structure B = ⟨B, ∧, ∨, ¬, 0, 1⟩ is a Boolean algebra iff it satisfies the following identities:

• x ∨ y = y ∨ x, x ∧ y = y ∧ x
• x ∨ (y ∨ z) = (x ∨ y) ∨ z, x ∧ (y ∧ z) = (x ∧ y) ∧ z
• x ∨ 0 = x, x ∧ 1 = x
• x ∨ ¬x = 1, x ∧ ¬x = 0
• x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z), x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)

The elements of a Boolean algebra are ordered as a ≤ b iff a ∧ b = a.

Definition 2 (Upward-closed set) Given a Boolean algebra B, a set A ⊆ B satisfying the following property is called upward-closed:

For all x, y ∈ B, if x ≤ y and x ∈ A, then y ∈ A.

We denote the least upward-closed set which includes A by Up(A). The Up operator satisfies the following properties:

• A ⊆ Up(A) (Inclusion)
• A ⊆ B ⇒ Up(A) ⊆ Up(B) (Monotony)
• Up(A) = Up(Up(A)) (Idempotence)

An operator that satisfies these properties is called a closure operator.

Let N(A) = {x | (a, x) ∈ N for some a ∈ A} and Eq(X) = {x | ∃y ∈ X, x = y}.

Zero Boolean I/O operation

Definition 3 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the zero Boolean operation as follows:

out_0^B(N, A) = Eq(N(Eq(A)))

We put out_0^B(N) = {(A, x) : x ∈ out_0^B(N, A)}.7

Definition 4 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive_0^B(N) if and only if (a, x) is derivable from N using the rules {EQI, EQO}:8

EQI: from (a, x) and a = b, derive (b, x)
EQO: from (a, x) and x = y, derive (a, y)

Given a set A ⊆ B, (A, x) ∈ derive_0^B(N) whenever (a, x) ∈ derive_0^B(N) for some a ∈ A.9 Put derive_0^B(N, A) = {x : (A, x) ∈ derive_0^B(N)}.

Theorem 1 (Soundness) out_0^B(N) validates EQI and EQO.

Proof 1
− EQI: We need to show that from x ∈ Eq(N(Eq(a))) and a = b it follows that x ∈ Eq(N(Eq(b))). If x ∈ Eq(N(Eq(a))), then there are t1 and t2 such that t1 = a, t2 = x and (t1, t2) ∈ N. If a = b then t1 = b. Hence, by definition, x ∈ Eq(N(Eq(b))).
− EQO: We need to show that from x ∈ Eq(N(Eq(a))) and x = y it follows that y ∈ Eq(N(Eq(a))). If x ∈ Eq(N(Eq(a))), then there are t1 and t2 such that t1 = a, t2 = x and (t1, t2) ∈ N. If x = y then t2 = y. Hence, by definition, y ∈ Eq(N(Eq(a))).

Theorem 2 (Completeness) out_0^B(N) ⊆ derive_0^B(N).10

Proof 2 We show that if x ∈ Eq(N(Eq(A))), then (A, x) ∈ derive_0^B(N). Suppose x ∈ Eq(N(Eq(A))); then there are t1 and t2 such that t1 = a for some a ∈ A, and t2 = x, such that (t1, t2) ∈ N. From (t1, t2), by EQO (t2 = x) we get (t1, x), and by EQI (t1 = a) we get (a, x). Thus, x ∈ derive_0^B(N, a) and then x ∈ derive_0^B(N, A).

Two basic subsystems: We can construct two simple subsystems: out_R^B(N, A) = Eq(N(A)) and out_L^B(N, A) = N(Eq(A)). We define (a, x) ∈ derive_R^B(N) ((a, x) ∈ derive_L^B(N)) if and only if (a, x) is derivable from N using the rule {EQO} ({EQI}). By rewriting the definition of out_0^B(N) for out_R^B(N) and out_L^B(N), and the definition of derive_0^B(N) for derive_R^B(N) and derive_L^B(N), we have:

out_R^B(N) = derive_R^B(N)
out_L^B(N) = derive_L^B(N)

7 Sometimes we write Up(a, b, ...) (Eq(a, b, ...)) instead of Up({a, b, ...}) (Eq({a, b, ...})), as well as out(N, a) (derive(N, a)) instead of out(N, {a}) (derive(N, {a})).
8 EQI stands for equivalence of the input and EQO stands for equivalence of the output.
9 In the original input/output logic (Makinson and van der Torre 2000), it is for some conjunction a of elements in A.
10 For the completeness proofs, if A = {}, then by the definitions Eq({}) = {} and Up({}) = {} we have x ∉ out_i^B(N, {}) = {}.

Example 2: For the conditionals N = {(⊤, g), (g, t)} and the input set A = {}, out_II^B(N, A) = {}, and for the input set C = {g} we have out_II^B(N, C) = Up(t).
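The Up operator and the zero operation can be checked on a concrete powerset algebra. This is a sketch under my own encoding (the algebra is the powerset of a two-world set, with ≤ as subset; in such a concrete algebra Eq is just identity, so out_0 reduces to N(A)):

```python
from itertools import combinations

W = ("u", "v")  # two worlds; the carrier is the powerset of W

B = [frozenset(c) for r in range(len(W) + 1) for c in combinations(W, r)]
top = frozenset(W)
bot = frozenset()

def up(A):
    """Least upward-closed superset of A: all elements above some a in A."""
    return {x for x in B for a in A if a <= x}  # <= is the subset order

a = frozenset({"u"})
not_a = top - a

# Up is a closure operator:
assert {a} <= up({a})                 # Inclusion
assert up({a}) <= up({a, not_a})      # Monotony
assert up(up({a})) == up({a})         # Idempotence

# Unlike Cn, Up is not closed under conjunction:
assert a & not_a == bot               # a AND not-a is the bottom element
assert bot not in up({a, not_a})      # so {a, not-a} does not "explode"

def out0(N, A):
    """Zero operation Eq(N(Eq(A))); Eq is identity in a concrete algebra."""
    return {x for (b, x) in N if b in A}
```

The second group of assertions is exactly the motivation given earlier for using Up instead of Cn: an inconsistent input set does not force the bottom element into the upward closure.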
Simple-I Boolean I/O operation

Definition 5 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the simple-I Boolean operation as follows:

out_I^B(N, A) = Eq(N(Up(A)))

We put out_I^B(N) = {(A, x) : x ∈ out_I^B(N, A)}.

Definition 6 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive_I^B(N) if and only if (a, x) is derivable from N using the rules {SI, EQO}:

SI: from (a, x) and b ≤ a, derive (b, x)
EQO: from (a, x) and x = y, derive (a, y)

Given a set A ⊆ B, (A, x) ∈ derive_I^B(N) whenever (a, x) ∈ derive_I^B(N) for some a ∈ A.11 Put derive_I^B(N, A) = {x : (A, x) ∈ derive_I^B(N)}.

Theorem 3 (Soundness) out_I^B(N) validates SI and EQO.

Proof 3 The proof is similar to Theorems 1 and 7.

Theorem 4 (Completeness) out_I^B(N) ⊆ derive_I^B(N).

Proof 4 The proof is similar to Theorems 2 and 8.

Example 1: For the conditionals N = {(⊤, g), (g, t)} and the input set A = {}, out_I^B(N, A) = {}, and for the input set C = {g} we have out_I^B(N, C) = Eq(g, t).

Simple-II Boolean I/O operation

Definition 7 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the simple-II Boolean operation as follows:

out_II^B(N, A) = Up(N(Eq(A)))

We put out_II^B(N) = {(A, x) : x ∈ out_II^B(N, A)}.

Definition 8 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive_II^B(N) if and only if (a, x) is derivable from N using the rules {WO, EQI}:

WO: from (a, x) and x ≤ y, derive (a, y)
EQI: from (a, x) and a = b, derive (b, x)

Given a set A ⊆ B, (A, x) ∈ derive_II^B(N) whenever (a, x) ∈ derive_II^B(N) for some a ∈ A. Put derive_II^B(N, A) = {x : (A, x) ∈ derive_II^B(N)}.

Theorem 5 (Soundness) out_II^B(N) validates WO and EQI.

Proof 5 The proof is similar to Theorems 1 and 7.

Theorem 6 (Completeness) out_II^B(N) ⊆ derive_II^B(N).

Proof 6 The proof is similar to Theorems 2 and 8.

Simple-minded Boolean I/O operation

Definition 9 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the simple-minded Boolean operation as follows:

out_1^B(N, A) = Up(N(Up(A)))

We put out_1^B(N) = {(A, x) : x ∈ out_1^B(N, A)}.

Definition 10 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive_1^B(N) if and only if (a, x) is derivable from N using the rules {SI, WO}. Given a set A ⊆ B, (A, x) ∈ derive_1^B(N) whenever (a, x) ∈ derive_1^B(N) for some a ∈ A. Put derive_1^B(N, A) = {x : (A, x) ∈ derive_1^B(N)}.

Theorem 7 (Soundness) out_1^B(N) validates SI and WO.

Proof 7
− SI: We need to show that from x ∈ Up(N(Up(a))) and b ≤ a it follows that x ∈ Up(N(Up(b))). Since b ≤ a we have Up(a) ⊆ Up(b). Hence, N(Up(a)) ⊆ N(Up(b)) and therefore Up(N(Up(a))) ⊆ Up(N(Up(b))).
− WO: We need to show that from x ∈ Up(N(Up(a))) and x ≤ y it follows that y ∈ Up(N(Up(a))). Since Up(N(Up(a))) is upward-closed and x ≤ y, we have y ∈ Up(N(Up(a))).

Counter-example for AND: We can show that AND is not valid:

AND: from (a, x) and (a, y), derive (a, x ∧ y)

Consider the normative system N = {(a, x), (a, y)}: we have x ∈ Up(N(Up({a}))) and y ∈ Up(N(Up({a}))), but x ∧ y ∉ Up(N(Up({a}))) by the definition of Up(X).

Theorem 8 (Completeness) out_1^B(N) ⊆ derive_1^B(N).

Proof 8 We show that if x ∈ Up(N(Up(A))), then (A, x) ∈ derive_1^B(N). Suppose x ∈ Up(N(Up(A))); then there is y1 such that y1 ∈ N(Up(A)) and y1 ≤ x, and there is t1 such that (t1, y1) ∈ N and a ≤ t1 for some a ∈ A. From (t1, y1), by SI (a ≤ t1) we get (a, y1), and by WO (y1 ≤ x) we get (a, x). Thus, x ∈ derive_1^B(N, a) and then x ∈ derive_1^B(N, A).

Example 3: For the conditionals N = {(g, t), (¬g, ¬t), (a, b)} and the input set A = {g, ¬g}, we have out_1^B(N, A) = Up(t, ¬t).

11 In the original input/output logic (Makinson and van der Torre 2000), it is for some conjunction a of elements in A.
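The simple-minded operation and the failure of AND can be verified mechanically on a small powerset algebra. A sketch under my own encoding (three worlds, ≤ as subset; the particular elements a, x, y are hypothetical):

```python
from itertools import combinations

W = ("u", "v", "t")
B = [frozenset(c) for r in range(len(W) + 1) for c in combinations(W, r)]

def up(A):
    """Upward closure in the powerset algebra ordered by subset."""
    return {z for z in B for a in A if a <= z}

def out1(N, A):
    """Simple-minded Boolean operation: Up(N(Up(A)))."""
    return up({h for (b, h) in N if b in up(A)})

a = frozenset({"u"})
x = frozenset({"u", "v"})
y = frozenset({"u", "t"})
N = {(a, x), (a, y)}

out = out1(N, {a})
assert x in out and y in out       # both outputs are detached...
assert (x & y) not in out          # ...but their conjunction is not (no AND)
```

This reproduces the counter-example above: x and y are each in the output for input {a}, while x ∧ y is not, since x ∧ y lies below both x and y and upward closure never adds elements downward.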
Basic Boolean I/O operation

Definition 11 (Saturated set) A set V is saturated in a Boolean algebra B iff
• If a ∈ V and b ≥ a, then b ∈ V;
• If a ∨ b ∈ V, then a ∈ V or b ∈ V.

Definition 12 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the basic Boolean operation as follows:

out_2^B(N, A) = ∩{Up(N(V)) : A ⊆ V, V is saturated}

We put out_2^B(N) = {(A, x) : x ∈ out_2^B(N, A)}.

Definition 13 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive_2^B(N) if and only if (a, x) is derivable from N using the rules of derive_1^B(N) along with OR:

OR: from (a, x) and (b, x), derive (a ∨ b, x)

Given a set A ⊆ B, (A, x) ∈ derive_2^B(N) if (a, x) ∈ derive_2^B(N) for some a ∈ A. Put derive_2^B(N, A) = {x : (A, x) ∈ derive_2^B(N)}.

Theorem 9 (Soundness) out_2^B(N) validates SI, WO and OR.

Proof 9 − OR: We need to show that from x ∈ out_2^B(N, {a}) and x ∈ out_2^B(N, {b}) it follows that x ∈ out_2^B(N, {a ∨ b}). Suppose {a ∨ b} ⊆ V; since V is saturated we have a ∈ V or b ∈ V. Suppose a ∈ V; in this case, since out_2^B(N, {a}) ⊆ Up(N(V)), we have x ∈ out_2^B(N, {a ∨ b}).

Theorem 10 (Completeness) out_2^B(N) ⊆ derive_2^B(N).

Proof 10 Suppose x ∉ derive_2^B(N, A); then by monotony of the derivability operation there is a maximal set V such that A ⊆ V and x ∉ derive_2^B(N, V). V is saturated because:
(a) Suppose a ∈ V and a ≤ b; we have (a, x) ∉ derive_2^B(N). We need to show that x ∉ derive_2^B(N, b), and since V is maximal we have b ∈ V. Suppose (b, x) ∈ derive_2^B(N); then from (b, x), by SI (a ≤ b), we get (a, x). That is a contradiction with (a, x) ∉ derive_2^B(N).
(b) Suppose a ∨ b ∈ V; we have x ∉ derive_2^B(N, a ∨ b). We need to show that x ∉ derive_2^B(N, a) or x ∉ derive_2^B(N, b). Suppose x ∈ derive_2^B(N, a) and x ∈ derive_2^B(N, b); then from (a, x) and (b, x), by OR, we get (a ∨ b, x). That is a contradiction with x ∉ derive_2^B(N, a ∨ b).
Therefore, we have x ∉ Up(N(V)) (= out_1^B(N, V)) and so x ∉ out_2^B(N, A).

Reusable Boolean I/O operation

Definition 14 (Semantics) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the reusable Boolean operation as follows:

out_3^B(N, A) = ∩{Up(N(V)) : A ⊆ V = Up(V) ⊇ N(V)}

We put out_3^B(N) = {(A, x) : x ∈ out_3^B(N, A)}.

Definition 15 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive_3^B(N) if and only if (a, x) is derivable from N using the rules of derive_1^B(N) along with T:12

T: from (a, x) and (x, y), derive (a, y)

Given a set A ⊆ B, (A, x) ∈ derive_3^B(N) if (a, x) ∈ derive_3^B(N) for some a ∈ A. Put derive_3^B(N, A) = {x : (A, x) ∈ derive_3^B(N)}.

Theorem 11 (Soundness) out_3^B(N) validates SI, WO and T.

Proof 11 − T: We need to show that from x ∈ out_3^B(N, {a}) and y ∈ out_3^B(N, {x}) it follows that y ∈ out_3^B(N, {a}). Suppose that X is the smallest set such that {a} ⊆ X = Up(X) ⊇ N(X). Since x ∈ out_3^B(N, {a}) we have x ∈ X, and from y ∈ out_3^B(N, {x}) we have y ∈ X. Thus, y ∈ out_3^B(N, {a}).

Theorem 12 (Completeness) out_3^B(N) ⊆ derive_3^B(N).

Proof 12 Suppose x ∉ derive_3^B(N, a); we need to find B such that a ∈ B = Up(B) ⊇ N(B) and x ∉ Up(N(B)). Put B = Up({a} ∪ derive_3^B(N, a)). We show that N(B) ⊆ B. Suppose y ∈ N(B); then there is b ∈ B such that (b, y) ∈ N. We show that y ∈ B. Since b ∈ B there are two cases:
• b ≥ a: in this case we have (a, y) ∈ derive_3^B(N), since from (b, y), by SI (a ≤ b), we get (a, y).
• ∃z ∈ derive_3^B(N, a) with z ≤ b: in this case, from (b, y), by SI (z ≤ b), we get (z, y); then from (a, z) and (z, y), by T, we get (a, y).
We only need to show that x ∉ Up(N(B)) = out_1^B(N, {a} ∪ derive_3^B(N, a)). Suppose x ∈ Up(N(B)); then there is y1 such that x ≥ y1 and ∃t1, (t1, y1) ∈ N and t1 ∈ Up({a} ∪ derive_3^B(N, a)). There are two cases:
• t1 ≥ a: in this case, from (t1, y1), by SI (a ≤ t1), we get (a, y1), and by WO (y1 ≤ x) we get (a, x).
• ∃z1 ∈ derive_3^B(N, a) with z1 ≤ t1: in this case, from (t1, y1), by SI (z1 ≤ t1), we get (z1, y1); from (a, z1) and (z1, y1), by T, we get (a, y1); and by WO (y1 ≤ x) we get (a, x).
Thus, in both cases (a, x) ∈ derive_3^B(N) and then x ∈ derive_3^B(N, a), which is a contradiction.

Example 4: For the conditionals N = {(⊤, g), (g, t), (¬g, ¬t), (a, b)} and the input set A = {¬g}, we have out_3^B(N, A) = Up(g, t, ¬t).

12 T stands for transitivity.

4 Obligatory Norms: Input/Output Operations

In this section, we add the rule of AND and cumulative transitivity (CT) to our introduced derivation systems. We aim to rebuild the derivation systems introduced by Makinson and van der Torre (2000) for deriving obligations.

Definition 16 (Proof system) Given a Boolean algebra B and a normative system N ⊆ B × B, we define (a, x) ∈ derive_i^X(N) if and only if (a, x) is derivable from N using EQI, EQO, SI, WO, OR, AND, CT as follows:

derive_II^AND: {WO, EQI, AND}
derive_1^AND: {SI, WO, AND}
derive_2^AND: {SI, WO, OR, AND}
derive_I^CT: {SI, EQO, CT}
derive_II^CT: {WO, EQI, CT}
derive_1^CT: {SI, WO, CT}
derive_1^CT,AND: {SI, WO, CT, AND}

where CT is the rule: from (a, x) and (a ∧ x, y), derive (a, y).

Given a set A ⊆ B, (A, x) ∈ derive_i^X(N) whenever (a, x) ∈ derive_i^X(N) for some a ∈ A. Put derive_i^X(N, A) = {x : (A, x) ∈ derive_i^X(N)}.

Definition 17 (Semantics out_i^AND) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the AND operation as follows:

out_i^{AND,0}(N, A) = out_i^B(N, A)
out_i^{AND,n+1}(N, A) = out_i^{AND,n}(N, A) ∪ {y ∧ z : y, z ∈ out_i^{AND,n}(N, {a}), a ∈ A}
out_i^{AND}(N, A) = ∪_{n∈N} out_i^{AND,n}(N, A)

We put out_i^{AND}(N) = {(A, x) : x ∈ out_i^{AND}(N, A)}.

Definition 18 (Semantics out_i^CT) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the CT operation as follows:

out_i^{CT,0}(N, A) = out_i^B(N, A)
out_i^{CT,n+1}(N, A) = out_i^{CT,n}(N, A) ∪ {x : y ∈ out_i^{CT,n}(N, {a}) and x ∈ out_i^{CT,n}(N, {a ∧ y}), a ∈ A}
out_i^{CT}(N, A) = ∪_{n∈N} out_i^{CT,n}(N, A)

We put out_i^{CT}(N) = {(A, x) : x ∈ out_i^{CT}(N, A)}.

Definition 19 (Semantics out_i^CT,AND) Given a Boolean algebra B, a normative system N ⊆ B × B and an input set A ⊆ B, we define the CT,AND operation as follows:

out_i^{CT,AND,0}(N, A) = out_i^{CT}(N, A)
out_i^{CT,AND,n+1}(N, A) = out_i^{CT,AND,n}(N, A) ∪ {y ∧ z : y, z ∈ out_i^{CT,AND,n}(N, {a}), a ∈ A}
out_i^{CT,AND}(N, A) = ∪_{n∈N} out_i^{CT,AND,n}(N, A)

We put out_i^{CT,AND}(N) = {(A, x) : x ∈ out_i^{CT,AND}(N, A)}.

Theorem 13 Given a Boolean algebra B, for every normative system N ⊆ B × B we have out_i^{AND}(N) = derive_i^{AND}(N), i ∈ {II, 1, 2}; out_i^{CT}(N) = derive_i^{CT}(N), i ∈ {I, II, 1}; and out_1^{CT,AND}(N) = derive_1^{CT,AND}(N).

Proof 13 The proof is based on the reversibility of inference rules, which was studied by Makinson and van der Torre (2000).

Lemma 1 Let D be any derivation using at most EQI, SI, WO, OR, AND, CT; then there is a derivation D′ of the same root from a subset of the leaves that applies AND only at the end.

Proof 14 See observation 18 (Makinson and van der Torre 2000). The main point of the observation is that we can reverse the order of rules AND, WO to WO, AND; AND, SI to SI, AND; AND, OR to OR, AND; and finally AND, CT to SI, CT or CT, AND. Also, we can reverse the order of rules AND and EQI: an application of AND to (a, x) and (a, y) followed by EQI (a = b), yielding (b, x ∧ y), can be replaced by applying EQI (a = b) to (a, x) and to (a, y) separately, obtaining (b, x) and (b, y), and then applying AND to get (b, x ∧ y).

Hence, in each of the systems {WO, EQI, AND}, {SI, WO, AND} and {SI, WO, OR, AND} we can apply the AND rule just at the end. Thus, we can characterize derive_i^{AND}(N) using the fact derive_i^B(N) = out_i^B(N) and the iteration of AND.
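The staged AND construction can be computed as a fixpoint on a finite algebra. A sketch under my own encoding (powerset algebra, singleton input, so the per-input detail of Definition 17 coincides with closing the whole output under meets):

```python
from itertools import combinations

W = ("u", "v", "t")
B = [frozenset(c) for r in range(len(W) + 1) for c in combinations(W, r)]

def up(A):
    return {z for z in B for a in A if a <= z}

def out1(N, A):
    """Simple-minded operation Up(N(Up(A)))."""
    return up({h for (b, h) in N if b in up(A)})

def out1_and(N, A):
    """Iterate AND on top of out1 until a fixpoint is reached,
    stage by stage, as in the out_i^{AND,n} construction."""
    current = out1(N, A)
    while True:
        extended = current | {y & z for y in current for z in current}
        if extended == current:
            return current
        current = extended

a = frozenset({"u"})
x = frozenset({"u", "v"})
y = frozenset({"u", "t"})
N = {(a, x), (a, y)}

assert (x & y) not in out1(N, {a})    # AND fails for the permission operation
assert (x & y) in out1_and(N, {a})    # but holds after iterating AND
```

Termination is guaranteed here because the algebra is finite and each stage only grows the output, mirroring the union over n in Definition 17.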
It is easy to check that we can reverse CT with SI, EQO, WO, and EQI; by this fact we can similarly characterize derive_i^{CT}(N). Finally, since AND can be reversed with SI, WO and CT, we can characterize derive_1^{CT,AND}(N) by applying the iteration of AND over out_1^{CT}(N); that means out_1^{CT,AND}(N) = derive_1^{CT,AND}(N).

5 I/O Mechanism over Abstract Logic

An abstract logic (Font and Jansana 2017) is a pair A = ⟨L, C⟩ where L = ⟨L, ...⟩ is an algebra and C is a closure operator defined on the power set of its universe; that means, for all X, Y ⊆ L:
• X ⊆ C(X)
• X ⊆ Y ⇒ C(X) ⊆ C(Y)
• C(X) = C(C(X))
The elements of an abstract logic can be ordered as a ≤ b if and only if b ∈ C({a}). Similarly, we can define the operator Up for X ⊆ L.

Definition 20 (Semantics) Given an abstract logic A, a normative system N ⊆ L × L and an input set A ⊆ L, we define the I/O operations as follows:
• out_0^A(N, A) = Eq(N(Eq(A)))
• out_I^A(N, A) = Eq(N(Up(A)))
• out_II^A(N, A) = Up(N(Eq(A)))
• out_1^A(N, A) = Up(N(Up(A)))
• out_2^A(N, A) = ∩{Up(N(V)) : A ⊆ V, V is saturated}13
• out_3^A(N, A) = ∩{Up(N(V)) : A ⊆ V = Up(V) ⊇ N(V)}
We put out_i^A(N) = {(A, x) : x ∈ out_i^A(N, A)}.

Definition 21 (Proof system) Given an abstract logic A and a normative system N ⊆ L × L, we define (a, x) ∈ derive_0^A(N) (derive_I^A(N), derive_II^A(N), derive_1^A(N), derive_2^A(N), derive_3^A(N)) if and only if (a, x) is derivable from N using the rules {EQI, EQO} ({SI, EQO}, {WO, EQI}, {SI, WO}, {SI, WO, OR}, {SI, WO, T}). Put derive_i^A(N, A) = {x : (A, x) ∈ derive_i^A(N)}.

Theorem 14 (Soundness and Completeness) out_i^A(N) = derive_i^A(N).

Proof 15 The proofs are the same as the soundness and completeness theorems in Section 3.

Similarly, we can define the out_i^{OR}(N) operation and characterize some other proof systems:

derive_I^OR: {SI, EQO, OR}
derive_I^CT,OR: {SI, EQO, CT, OR}
derive_1^CT,OR: {SI, WO, CT, OR}
derive_1^CT,OR,AND: {SI, WO, CT, OR, AND}

Four systems, derive_1^{AND}, derive_2^{AND} (or derive_1^{OR,AND}), derive_1^{CT,AND} and derive_1^{CT,OR,AND}, were introduced by Makinson and van der Torre (2000) for reasoning about obligatory norms.

A logical system L = ⟨L, ⊢L⟩ straightforwardly provides an equivalent abstract logic ⟨Fm_L, C_⊢L⟩. Therefore, we can build the I/O framework over different types of logic, including first-order logic, simple type theory, description logic, and different kinds of modal logics that are expressive for intensional concepts such as belief and time.

The miners paradox

To illustrate the advantage and difference of the newly proposed semantics with respect to the classic semantics, we focus on the miners paradox, discussed by Kolodny and MacFarlane (2010). Ten miners are trapped either in shaft A or in shaft B, but we do not know which. Flood waters threaten to flood the shafts. We have enough sandbags to block one shaft, but not both. If we block one shaft, all the water will go into the other shaft, killing any miners inside it. If we block neither shaft, both shafts will fill halfway with water, and just one miner, the lowest in the shaft, will be killed. So, in our deliberation, it seems that the following are true:

1. Either the miners are in shaft A or in shaft B.
2. If the miners are in shaft A, we should block shaft A.
3. If the miners are in shaft B, we should block shaft B.
4. We should block neither shaft.

In principle, it would be best to save all the miners by blocking the right shaft (sentences 2 and 3). There is another way of representing sentence 1, by means of the ordering source: in the case of g(w) = {shA, shB}, as a set of possibly inconsistent information, we have blA, blB, ¬blA ∧ ¬blB ∈ out(N^O, {shA, shB}).
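As a sanity check, both readings of sentence 1 can be computed in a small finite model. This is a sketch under my own encoding of the scenario (worlds are pairs of the shaft the miners are in and the shaft we block; the SI-satisfying output operation is taken to be the simple-minded Up(N(Up(A)))):

```python
from itertools import combinations, product

# Worlds: (shaft the miners are in, shaft we block).
worlds = tuple(product(("A", "B"), ("A", "B", "none")))
B = [frozenset(c) for r in range(len(worlds) + 1)
     for c in combinations(worlds, r)]

def prop(pred):
    return frozenset(w for w in worlds if pred(w))

shA = prop(lambda w: w[0] == "A")
shB = prop(lambda w: w[0] == "B")
blA = prop(lambda w: w[1] == "A")
blB = prop(lambda w: w[1] == "B")
block_neither = prop(lambda w: w[1] == "none")  # ¬blA ∧ ¬blB
top = frozenset(worlds)

# Sentences 2-4 as obligatory norms:
N = {(shA, blA), (shB, blB), (top, block_neither)}

def up(A):
    return {z for z in B for a in A if a <= z}

def out1(N, A):
    """Simple-minded output Up(N(Up(A)))."""
    return up({h for (b, h) in N if b in up(A)})

# Sentence 1 as fact: the input {shA ∨ shB} (true at every world here)
# triggers only the unconditional norm, so we ought to block neither shaft.
factual = out1(N, {shA | shB})
assert block_neither in factual
assert blA not in factual and blB not in factual

# Sentence 1 as (inconsistent) ordering source: the two standpoints
# shA and shB each detach their own obligation, as in the text.
ordering = out1(N, {shA, shB})
assert {blA, blB, block_neither} <= ordering
```

The two runs reproduce the two representations discussed in this subsection: the factual reading detaches only ¬blA ∧ ¬blB, while the ordering-source reading detaches blA, blB and ¬blA ∧ ¬blB.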
Sentence 4 is correct since there is a fifty-fifty chance of blocking the right shaft, given that we do not know the shaft in which the miners are; this sentence guarantees that we save nine of the ten miners according to the scenario (Von Fintel 2012). The paradox is that the four sentences jointly are inconsistent in classical logic. Moreover, they are inconsistent in the context of the Kratzerian baseline view (Cariani To appear).

Here, we analyze this paradox in our setting. Suppose that the set of norms N = {(shA, blA), (shB, blB), (⊤, ¬blA ∧ ¬blB)} represents sentences 2–4. We choose one of the output operations for deriving obligation that satisfies the rule of SI. There are two ways of representing sentence 1. For the case f(w) = {shA ∨ shB}, as a set of factual information, we have ¬blA ∧ ¬blB ∈ out(N^O, {shA ∨ shB}).

Example 5: In the modal logic system KT, for the conditionals N = {(p, ✷q), (q, r), (s, t)} and the input set A = {p} we have out_3^{KT}(N, A) = Up(✷q, r). Moreover, we can add other rules such as AND and CT to the systems, in the same way as in the last section.

13 For this case, the abstract logic A = ⟨L, C⟩ should include ∨, that is, a binary operation symbol, either primitive or defined by a term, and we have a ∨ b, b ∨ a ∈ C({a}).

Acknowledgments

I thank two anonymous reviewers for valuable comments. I would like to thank Majid Alizadeh, who during my visit to the University of Tehran in the summer of 2019 dedicated much time and attention to my work, and who gave me essential advice for this work. I thank Leon van der Torre, Dov Gabbay, Xavier Parent, and Seyed Mohammad Yarandi for comments that greatly improved the manuscript.

6 Conclusion
In summary, we have characterized a class of proof systems over Boolean algebras for a set of explicitly given norms as follows:

derive_R^B: {EQO}
derive_L^B: {EQI}
derive_0^B: {EQI, EQO}
derive_I^B: {SI, EQO}
derive_II^B: {WO, EQI}
derive_1^B: {SI, WO}
derive_2^B: {SI, WO, OR}
derive_3^B: {SI, WO, T}

Each proof system is sound and complete for an I/O operation. For each of the introduced I/O operations, we can define a new version that allows inputs to reappear as outputs. Let N+ = N ∪ I, where I is the set of all pairs (a, a) for a ∈ B. We define out_i^{B+}(N, A) = out_i^B(N+, A); the characterization is the same as for out_i^B. It is interesting to compare the introduced systems with Minimal Deontic Logic (Goble 2013) and similar approaches, such as (Ciabattoni, Gulisano, and Lellmann 2018), that do not have deontic aggregation principles. Moreover, we have shown how we can add the two rules AND and CT to the proof systems and find representation theorems for them.

Input/output logic is inspired by a view of logic as a "secretarial assistant", preparing inputs before they go into the motor engine and unpacking outputs, rather than of logic as an "inference motor". The only input/output logics investigated so far in the literature are built on top of classical propositional logic and intuitionistic propositional logic. The algebraic construction shows how we can build the input/output version of any abstract logic. Furthermore, we can build the I/O framework over posets ⟨A, ≤⟩, where A is a set and ≤ is a reflexive, antisymmetric and transitive binary relation. The monotony property of closure has no role in the proofs, and we can build I/O operations over non-monotonic relations. We pose combining I/O operations with consequence relations that do not satisfy the inclusion or idempotence property (Makinson 2005) as a further research question. For example, combining the original I/O framework with a consequence relation that does not satisfy inclusion, where the consequence relation is an input/output closure (A ∈ out(N, A) is not necessary), was explored by Sun and van der Torre (2014). Moreover, we could present an embedding of the new I/O operations in HOL (Benzmüller, Farjami, and Parent 2019; Benzmüller et al. 2019; Benzmüller, Farjami, and Parent 2018). Finally, we have introduced a unification of possible-worlds and norm-based semantics for reasoning about permission and obligation; it is worth exploring the philosophical and conceptual advantages of integrating norm-based semantics into the classic semantics (Von Fintel 2012; Horty 2014). As an advantage, we discussed the miners paradox (Kolodny and MacFarlane 2010).

References

Alchourrón, C., and Bulygin, E. 1971. Normative Systems. Springer-Verlag, Wien/New York.
Benzmüller, C.; Farjami, A.; Meder, P.; and Parent, X. 2019. I/O logic in HOL. Journal of Applied Logics – IfCoLoG Journal of Logics and their Applications (Special Issue on Reasoning for Legal AI) 6(5):715–732.
Benzmüller, C.; Farjami, A.; and Parent, X. 2018. A dyadic deontic logic in HOL. In Broersen, J.; Condoravdi, C.; Nair, S.; and Pigozzi, G., eds., Deontic Logic and Normative Systems — 14th International Conference, DEON 2018, Utrecht, The Netherlands, 3–6 July, 2018, volume 9706, 33–50. College Publications.
Benzmüller, C.; Farjami, A.; and Parent, X. 2019. Åqvist's dyadic deontic logic E in HOL. Journal of Applied Logics – IfCoLoG Journal of Logics and their Applications (Special Issue on Reasoning for Legal AI) 6(5):733–755.
Bochman, A. 2005. Explanatory Nonmonotonic Reasoning. World Scientific.
Boella, G., and van der Torre, L. 2008. Institutions with a hierarchy of authorities in distributed dynamic environments. Artificial Intelligence and Law 16(1):53–71.
Cariani, F. To appear. Deontic logic and natural language. In Gabbay, D.; Horty, J.; Parent, X.; van der Meyden, R.; and van der Torre, L., eds., Handbook of Deontic Logic, volume 2. College Publications.
Ciabattoni, A.; Gulisano, F.; and Lellmann, B. 2018. Resolving conflicting obligations in Mīmāṃsā: a sequent-based approach. In Broersen, J.; Condoravdi, C.; Nair, S.; and Pigozzi, G., eds., Deontic Logic and Normative Systems — 14th International Conference, DEON 2018, Utrecht, The Netherlands, 3–6 July, 2018, 91–109.
Ciuciura, J. 2013. Non-adjunctive discursive logic. Bulletin of the Section of Logic 42(3/4):169–181.
Costa, H. A. 2005. Non-adjunctive inference and classical modalities. Journal of Philosophical Logic 34(5-6):581–605.
Danielsson, S. 1968. Preference and Obligation: Studies in the Logic of Ethics. Ph.D. Dissertation, Filosofiska Föreningen.
Føllesdal, D., and Hilpinen, R. 1970. Deontic logic: An introduction. In Deontic Logic: Introductory and Systematic Readings. Springer. 1–35.
Font, J. M., and Jansana, R. 2017. A General Algebraic Semantics for Sentential Logics. Cambridge University Press.
Fuhrmann, A. 2017. Deontic modals: Why abandon the default approach. Erkenntnis 82(6):1351–1365.
Gabbay, D.; Parent, X.; and van der Torre, L. 2019. A geometrical view of I/O logic. arXiv preprint arXiv:1911.12837.
Goble, L. 2013. Prima facie norms, normative conflicts, and dilemmas. Handbook of Deontic Logic 1:499–544.
Hansen, J. 2008. Imperatives and deontic logic: on the semantic foundations of deontic logic.
Hansson, B. 1969. An analysis of some deontic logics. Noûs 373–398.
Horty, J. F. 2012. Reasons as Defaults. Oxford University Press.
Horty, J. 2014. Deontic modals: why abandon the classical semantics? Pacific Philosophical Quarterly 95(4):424–460.
Jaśkowski, S. 1969. Propositional calculus for contradictory deductive systems (communicated at the meeting of March 19, 1948). Studia Logica: An International Journal for Symbolic Logic 24:143–160.
Kolodny, N., and MacFarlane, J. 2010. Ifs and oughts. The Journal of Philosophy 107(3):115–143.
Kratzer, A.; Pires de Oliveira, R.; and Pessotto, A. L. 2014. Talking about modality: an interview with Angelika Kratzer. ReVEL, especial (8). Kratzer, A. 2012. Modals and conditionals: New and revised perspectives. Oxford University Press. Lewis, D. 1974. Semantic analyses for dyadic deontic logic. In Logical theory and semantic analysis. Springer. 1–14. Lewis, D. 2013. Counterfactuals. John Wiley & Sons. Lindahl, L., and Odelstad, J. 2013. The theory of joining-systems. Handbook of Deontic Logic 1:545–634. Makinson, D., and van der Torre, L. 2000. Input/output logics. Journal of Philosophical Logic 29(4):383–408. Makinson, D., and van der Torre, L. 2001. Constraints for input/output logics. Journal of Philosophical Logic 30(2):155–185. Makinson, D., and van der Torre, L. 2003. Permission from an input/output perspective. Journal of Philosophical Logic 32(4):391–416. Makinson, D. 1999. On a fundamental problem of deontic logic. Norms, logics and information systems. New studies on deontic logic and computer science 29–54. Makinson, D. 2005. Bridges from classical to nonmonotonic logic. King’s College. Nute, D. 2012. Defeasible deontic logic, volume 263. Springer Science & Business Media. Parent, X., and van der Torre, L. 2013. Input/output logic. Handbook of Deontic Logic 1:499–544. Parent, X., and van der Torre, L. 2014. Sing and dance! In International Conference on Deontic Logic in Computer Science, 149–165. Springer. Parent, X., and van der Torre, L. 2017a. The pragmatic oddity in norm-based deontic logics. In Proceedings of the 16th edition of the International Conference on Artificial Intelligence and Law, 169–178. Parent, X., and van der Torre, L. 2017b. Detachment in normative systems: Examples, inference patterns, properties. IfCoLog Journal of Logics and their Applications 4(9):2295–3039. Parent, X., and van der Torre, L. W. 2018a. I/O logics with a consistency check.
In Broersen, J.; Condoravdi, C.; Nair, S.; and Pigozzi, G., eds., Deontic Logic and Normative Systems — 14th International Conference, DEON 2018, Utrecht, The Netherlands, 3-6 July, 2018, 285–299. College Publications. Parent, X., and van der Torre, L. 2018b. Introduction to deontic logic and normative systems. College Publications. Parent, X.; Gabbay, D.; and van der Torre, L. 2014. Intuitionistic basis for input/output logic. In David Makinson on Classical Methods for Non-Classical Problems. Springer. 263–286. Stolpe, A. 2008a. Norms and norm-system dynamics. Department of Philosophy, University of Bergen, Norway. Stolpe, A. 2008b. Normative consequence: The problem of keeping it whilst giving it up. In International Conference on Deontic Logic in Computer Science, 174–188. Springer. Stolpe, A. 2010a. Relevance, derogation and permission. In International Conference on Deontic Logic in Computer Science, 98–115. Springer. Stolpe, A. 2010b. A theory of permission based on the notion of derogation. Journal of Applied Logic 8(1):97–113. Stolpe, A. 2015. A concept approach to input/output logic. Journal of Applied Logic 13(3):239–258. Straßer, C.; Beirlaen, M.; and van de Putte, F. 2016. Adaptive logic characterizations of input/output logic. Studia Logica 104(5):869–916. Straßer, C. 2013. Adaptive logics for defeasible reasoning: Applications in argumentation, normative Reasoning and default Reasoning. Springer Publishing Company, Incorporated. Sun, X., and van der Torre, L. 2014. Combining constitutive and regulative norms in input/output logic. In International Conference on Deontic Logic in Computer Science, 241–257. Springer. Sun, X. 2016. Logic and games of norms: a computational perspective. Ph.D. Dissertation, University of Luxembourg, Luxembourg. Sun, X. 2018. Proof theory, semantics and algebra for normative systems. Journal of logic and computation 28(8):1757–1779. Van Fraassen, B. C. 1973a. The logic of conditional obligation. In Exact Philosophy. 
Springer. 151–172. Van Fraassen, B. C. 1973b. Values and the heart’s command. The Journal of Philosophy 70(1):5–19. Von Fintel, K., and Heim, I. 2011. Intensional semantics. Unpublished Lecture Notes. Von Fintel, K. 2012. The best we can (expect to) get? challenges to the classic semantics for deontic modals. In Central meeting of the American Philosophical Association, February, volume 17. Von Wright, G. H. 1951. Deontic logic. Mind 60(237):1–15. Kratzer Style Deontic Logics in Formal Argumentation Huimin Dong¹, Beishui Liao¹, Leendert van der Torre¹,² ¹Department of Philosophy, Zhejiang University, China ²Department of Computer Science, University of Luxembourg, Luxembourg huimin.dong@xixilogic.org, baiseliao@zju.edu.cn, leon.vandertorre@uni.lu Abstract logic (van Benthem, Grossi, and Liu 2014), and formal argumentation (Liao et al. 2019). Yet considering one instance of the problem of detachment, the warning sign (Prakken and Sergot 1996), the usual non-monotonic tools, for instance abnormality or priority among obligations, do not seem to be explicitly expressed in the deontic modals: Kratzer introduced an ordering source to define obligations, while Poole, Makinson, and others showed how to build non-monotonic logics using a similar approach of ordering default assumptions. In this paper, we present three ways to use the ordering source in order to capture various defeasible deontic logics. We prove representation theorems showing how these defeasible deontic logics can be explained in terms of formal argumentation. We illustrate the logics by various benchmark examples of contrary-to-duty obligation.
• “There must be no dog.” • “If there is no dog, there must be no warning sign.” • “If there is a dog, there must be a warning sign.” • “There is a dog.” 1 Motivation As some linguists argue, the problem of contrary-to-duty concerns whether we can successfully recognize what is ideal from what is actually true (Arregui 2010), and the two can be linked in different ways, depending on how the premises are presumed. So the problem of detachment asks when we may adopt a hypothesis that corresponds to less-than-ideal circumstances, and why. These questions are answered by the hypothetical reasoning patterns involved. We adopt Kratzer’s linguistic theory to build up various hypothetical reasoning mechanisms to analyze contrary-to-duty obligations. To uniformly explore the logical nature of the linguistic expressions of contrary-to-duty, and then solve the paradox, we propose a family of Kratzer style deontic logics (Kratzer 1981; Horty 2014). We take two variants of standard deontic logic as the basic logics for hypothetical reasoning in a modal language for obligation and ability. The interpretation offered by the Kratzer style inferences is twofold, as hypothetical reasoning is. To derive a conclusion from the above-mentioned premises, we not only have to check whether it comes from hypotheses but also examine whether the rules inferring it are defeasible. “Working hypotheses” are common in normative and legal reasoning (Makinson 1994), for instance the presumption of innocence in criminal law and that of capacity in civil code. This intuition is captured by the notion of ordering source proposed by Kratzer (1981). Compared to the formalisms proposed by Poole, Makinson, and others (Poole 1988; Makinson 1994; Freund 1998), our Kratzer ordering source can redefine the priorities among presumptions in compositional semantics.
Ordering source can be well deployed as a function to define preference over presumptions (Horty 2014). Reasoning with contrary-to-duty obligations has been widely studied in philosophy, linguistics, law, computer science, and artificial intelligence (Chisholm 1963; Prakken and Sergot 1996; Pigozzi and van der Torre 2017). A contrary-to-duty obligation says what should be the case if a violation occurs: “If someone violates the parking regulation, she should pay the compensation.” As first discussed by Chisholm (1963), when the statements of a contrary-to-duty obligation and its violation are taken together, it is inevitable to have either inconsistency or logical dependency in standard deontic logic (von Wright 1951). This is called the Chisholm Paradox. It is also called the problem of detachment in the follow-up research (Prakken and Sergot 1996; Parent and van der Torre 2017). Many formal tools in non-monotonic logics have been proposed to represent variants of contrary-to-duty and to represent the paradoxes in a consistent way (Straßer 2014; Governatori and Rotolo 2006). There are two main accounts to avoid inconsistency in the reasoning of contrary-to-duty. The first account explicitly expresses the paraconsistent structures of normative exceptions and throws them away when facing conflicts. For instance, in adaptive logic (Goble 2014; Straßer 2014) each model has a set of abnormal propositions to remove the deontic conflicts according to certain inferential rules. In contrast, the paraconsistent representation of obligation can also be expressed in terms of a sequential operator in substructural logic (Governatori and Rotolo 2006). The second account replaces paraconsistency with priority to handle normative conflicts, including methods from default logic (Horty 1994), defeasible deontic logic (Governatori 2018), and preference deontic logic. As a case study, we connect our deontic logics to the formal argumentation theory ASPIC+.
We chose ASPIC+ to transform our defeasible deontic logics into a formal argumentation approach, because defeasible knowledge and defeasible rules properly correspond to hypotheses, while the preference can be well defined by ordering source. We prove this connection by several representation theorems. This paper is structured as follows. We first present two deontic logics motivated by an example of contrary-to-duty in Section 2. We then propose the modeling of ordering source in Section 3. In Section 4 we define the possible defeasible deontic inferences in Kratzer’s sense. We also present the logical properties satisfied in these Kratzer style deontic logics. Section 5 instantiates ASPIC+ in accordance with ordering source. In Section 6 we prove the representation theorems connecting defeasible deontic logics to ASPIC+. The related work is discussed in Section 7. We conclude in Section 8.

(PL) all propositional tautologies
(MP) ϕ, ϕ → ψ / ψ
(NEC∆) ϕ / ∆ϕ
(RE□) ϕ ↔ ψ / □ϕ ↔ □ψ
(Dual□) ◇ϕ ↔ ¬□¬ϕ
(DualO) Pϕ ↔ ¬O¬ϕ
(U) □ϕ → Oϕ
(OiC) Oϕ → ◇ϕ
(D) ¬(Oϕ ∧ O¬ϕ)
(M□) □(ϕ ∧ ψ) → (□ϕ ∧ □ψ)
(C∆) (∆ϕ ∧ ∆ψ) → ∆(ϕ ∧ ψ)
(R) □(ϕ → ψ) → (Oϕ → Oψ)
(T) □ϕ → ϕ
(B) ϕ → □◇ϕ
(4) □ϕ → □□ϕ
(5) ◇ϕ → □◇ϕ

Table 1: Logic D+ of obligation and necessity, where ∆ ∈ {□, O}.

We define the notion of Hilbert style derivation based on modal logic in the usual way; see e.g. the textbook of modal logic (Blackburn, De Rijke, and Venema 2002). Note that modal logic provides two related kinds of derivation according to the application of necessitation and uniform substitution: necessitation and uniform substitution can only be applied to theorems, not to an arbitrary set of formulas. Our connection from defeasible deontic logic to formal argumentation theory will use both notions.

2 Deontic Logics

We present two deontic logics based on standard deontic logic which are motivated to capture derivations of contrary-to-duty. The modal language we use is simple.
We only have two monadic operators: one for obligation (O) and another for necessity (□), for the agent’s ability of “seeing to it that.” We could introduce indices for agents, but for simplicity we omit them here. Definition 1 (Deontic Language). Let p be any element of a given (countable) set Prop of atomic propositions. The deontic language L of modal formulas is defined as follows: ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | Oϕ | □ϕ. The disjunction ∨, the material conditional →, weak permission P, and possibility ◇ are defined as usual: ϕ ∨ ψ := ¬(¬ϕ ∧ ¬ψ), ϕ → ψ := ¬(ϕ ∧ ¬ψ), Pϕ := ¬O¬ϕ, and ◇ϕ := ¬□¬ϕ. The so-called upper-bound logic is presented in Definition 2 as our strongest monotonic deontic logic, to which we refer using the symbol “D” to stand for “deontic.” The axiomatization in Table 1 is a variant of standard deontic logic. Notice that the axiom MO for obligation expansion can be derived by the rule of necessitation NEC□ and the axiom R, while the rule REO for obligation is derivable from the axioms M□ and R. As Chellas (1980) shows, instead of having axiom K, it is possible to replace it with the axioms M and C together with the rule RE. So the logic D+ is a possible representation of obligation and ability with which to uniformly handle non-monotonic reasoning later. The upper-bound logic D+ includes axioms and rules for obligation and for ability of agency, as well as their interactions. The axiom OiC is the principle of “ought implies can” and the axiom U is for obligation enforcing. The axiom R illustrates this intuition: “When ϕ enforces ψ, if ϕ is obligatory then ψ is also obligatory.” The axiom T for agents’ abilities indicates that “the enforcement by the agent makes it the case.” Definition 2 (Upper-bound logic D+). The deontic logic D+ is the system including all axioms and rules in Table 1. Definition 3 (Derivations without Premises). Let D be a deontic logic. A derivation for φ in D is a finite sequence φ1, . . .
, φn−1, φn such that φ = φn and every φi (1 ≤ i ≤ n) in this sequence is either an instance of one of the axioms in D, or the result of applying one of the rules in D to formulas appearing before φi. We write ⊢D φ if there is a derivation for φ in D. We say φ is a theorem of D, or that D proves φ. We write Cn(D) for the set of all theorems of D. Definition 4 (Derivations from Premises). Let D be a deontic logic. Given a set Γ of formulas, a derivation for φ from Γ in D is a finite sequence φ1, . . . , φn−1, φn such that φ = φn and every φi (1 ≤ i ≤ n) in this sequence is either in Cn(D) ∪ Γ, or the result of applying one of the rules (which is neither RE nor NEC) to formulas appearing before φi. We write Γ ⊢D φ if there is a derivation for φ from Γ in D.¹ We say that φ is derivable in D from Γ. We write CnD(Γ) for the set of formulas derivable in D from Γ. A system D is consistent iff ⊥ ∉ Cn(D); otherwise, it is inconsistent. A set Γ is D-consistent iff ⊥ ∉ CnD(Γ); otherwise, it is inconsistent. A set Γ′ ⊆ Γ is a maximally D-consistent subset of Γ, denoted Γ′ ∈ MCSD(Γ), iff Γ′ is D-consistent and there is no D-consistent Γ″ with Γ′ ⊂ Γ″ ⊆ Γ. Now we formalize the reasoning of the scenario Instructions of a Party Host in the logic D+. ¹ Alternatively, Γ ⊢D φ can be seen as the theorem ⊢D ⋀Γ → φ, by the deduction theorem. Example 1 (Instructions of a Party Host). This scenario is interpreted as a factual detachment paradox, or a contrary-to-duty paradox, proposed by Prakken and Sergot (Prakken and Sergot 1996). not meet and be forced to embrace”. Worse, given the indefeasible axiom OiC, it infers the ability of being forced to embrace when they do not meet, ◇(e ∧ ¬m), which, however, is not consistent with assumption (4). As Prakken and Sergot (1996) point out, the obligations O¬m and Oe are temporally independent, and obligation aggregation should not be applied to them. This analysis then offers a reason why the application of the axiom CO is defeasible. Of course, we can take a step back and instead consider the axiom T defeasible, because not every act can be achieved. (1) “John and Kevin should not meet.” O¬m (2) “If John and Kevin meet, they should be forced to embrace.” □(m → Oe) (3) “John and Kevin meet.” m (4) “John and Kevin cannot be forced to embrace if they do not meet.” ¬◇(e ∧ ¬m) where m stands for “John and Kevin meet” and e for “John and Kevin are forced to embrace.” Intuitively, the four statements should be jointly satisfiable. But they are inconsistent in derivations with them as premises in D+.

Oe: 2, 3, T
O¬e: 1, 4, R
Om: 2, 3, 4, Dual□, T, R
O(e ∧ m): 2, 3, 4, CO
O(¬e ∧ m): 1, 2, 3, 4, CO
O¬m: 1
O(e ∧ ¬m): 1, 2, 3, CO
O(¬e ∧ ¬m): 1, 4, CO

Table 2: The derived formulas are on the left side of the colon, while the premises, axioms, and rules used to derive them are on the right side.

4. If the premise (1) is defeasible and defeated by the conclusion Om, we now consider whether the application of axiom R is defeasible. Suppose not. In addition, we consider the premises (1) and (2) to be defeasible, but neither (3) nor (4). Now we have the conclusions Oe and O¬e, which both come from defeasible premises. Assume that the application of the axiom T is defeasible. This gives us a reason to prefer O¬e over Oe. Even if we do not consider the result O(¬e ∧ ¬m) given by the aggregation axiom, we already obtain something far from well-behaved: John and Kevin should meet, Om, and they should not be forced to embrace, O¬e. This does not sound like a good guideline. To keep consistency in this case of contrary-to-duty, there are two possible ways to obtain the hypothetical reasoning patterns (Makinson 1994) for defeasible derivations. We can hypothesize certain premises as normal assumptions, which can be challenged when their derived results are inconsistent with them.
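The notion of derivation from premises (Definitions 3 and 4) can be made concrete with a small checker. The following is a hedged Python sketch, not from the paper: formulas are modelled as nested tuples such as ("->", "p", "q"), the set `base` plays the role of Cn(D) ∪ Γ, and only modus ponens is checked as a rule.

```python
# Minimal sketch (illustrative, not the paper's code): checking a
# Hilbert-style derivation from premises. Each step must be in
# Cn(D) ∪ Γ (here: `base`) or follow by modus ponens from earlier steps.

def is_mp_result(phi, earlier):
    """phi follows by modus ponens if some psi and (psi -> phi) occur earlier."""
    return any(
        imp[0] == "->" and imp[2] == phi and imp[1] in earlier
        for imp in earlier
        if isinstance(imp, tuple) and len(imp) == 3
    )

def check_derivation(seq, base):
    """Verify that seq is a legal derivation sequence over `base`."""
    for i, phi in enumerate(seq):
        earlier = seq[:i]
        if phi not in base and not is_mp_result(phi, earlier):
            return False
    return True

# Derive q from the premises {p, p -> q}:
base = {"p", ("->", "p", "q")}
print(check_derivation(["p", ("->", "p", "q"), "q"], base))  # True
print(check_derivation(["q"], base))                         # False
```

A full checker for D+ or D− would also recognize axiom instances and block RE/NEC on non-theorems, as Definition 4 requires; the sketch only illustrates the sequence-checking shape of the definition.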
In this case, the derivations with defeasible premises are also defeasible. We can also hypothesize that certain axioms or inferential rules are used in a defeasible manner, in order to make their derivations provisional. Given these basic ideas, the reasons to defeasibly derive the normative statements regarding e and m in Table 2 can be analyzed as follows: we can consider defeasible either the derivations with deontic premises or those necessarily using one of the axioms CO, T, and R. In particular, the three axioms have defeasible readings in natural language. We can read the axiom CO as “If ϕ is obligatory and ψ is obligatory then, normally, ϕ ∧ ψ is obligatory.” The axiom T is read as “The enforcement by the agent, normally, makes it the case.” Similarly, the axiom R is read as “When ϕ enforces ψ, if ϕ is obligatory then, normally, ψ is also obligatory.” Taking different patterns of hypothetical reasoning, either by the premises or by the applications of axioms or rules, we have different ways to resolve the inconsistency in Example 1. In Section 4 this method will be used to show how to handle the problem of detachment according to how hypotheses are made. Now we define the so-called lower-bound logic D− in Definition 5 as the weakest deontic logic used for non-monotonic reasoning later. As discussed in Example 1, it is defined by removing some inference rules and axioms from the upper-bound logic. Roughly, the formulas derived by using these removed rules and axioms will become less prioritized. In terms of formal argumentation, they are the results derived by the defeasible principles of our defeasible deontic logic. For instance, D− removes the axiom R, so it contains neither NECO nor the controversial principle MO of obligation expansion derived by R. Nor does it have the obligation aggregation represented by the axiom CO in Example 1. We also remove the axiom T to obtain a weaker sense of the agent’s “seeing to it that”. 1.
As the axiom D indicates, the derived result Om is not consistent with the given premise (1) O¬m, nor are the derived results Oe and O¬e consistent with each other. 2. When the premise (1) O¬m is accepted, to remain consistent, we suppose the conclusion Om is defeasible and can then be defeated by the premise (1). (a) In this case, the defeasibility of Om can possibly be traced back to the premises, if at least one of them is defeasible. Let us presume the premise (2) is defeasible while the premises (3) and (4) are not, because the former is considered a deontic statement while the latter are factual statements. (b) We can also consider defeasible one of the axioms Dual□, T, or R used to derive this result. Here we only consider the axiom T or the axiom R to be defeasible, because Dual□ is common in modal logics. Definition 5 (Lower-bound logic D−). The deontic logic D− is the system including all axioms and rules in D+ except R, CO, and T.² 3. Suppose no premises are defeasible, nor the derivations necessarily using axiom T for their conclusions, such as the one deriving Oe. In such a case, by using the aggregation axiom CO, we conclude “John and Kevin should ² This deontic logic D− then reduces to a minimal deontic logic Although the lower-bound logic does not accept obligation aggregation, it does admit ¬O⊥ by the axiom OiC and ¬(Oϕ ∧ O¬ϕ) by the axiom D. Notice that none of the four aggregated obligations in Table 2 is derived in D−. derivation is stronger than its contrary from the defeasible one? What if the defeasible derivation is firm? To keep the comparison between formulas consistent, our ordering source simply stipulates the priority over formulas, from those derivable in the strongest derivations to those in the weakest: firm and strict, denoted Dfs; firm and defeasible, denoted Dfd; plausible and strict, denoted Dps; plausible and defeasible, denoted Dpd. Observe that CnD+(Γf ∪ Γp) = Dfs ∪ Dfd ∪ Dps ∪ Dpd. We follow Kratzer and define the ordering source as a function g, which maps a pair (Γf, Γp) of assumptions to a set of subsets of all derivation results Dpd ∪ Dps ∪ Dfd ∪ Dfs. Precisely, this intuition stipulates g(Γf, Γp) as follows: 3 Defining Ordering Source Our Kratzer style ordering source is defined from the perspective of hypotheses. As discussed in Example 1, the hypothetical reasoning of contrary-to-duty can differentiate derivations either by the premises or by the logics: • Firm Derivation: Premises are firm and certain in the conversational background, and this background information cannot be changed. If we take ¬◇(e ∧ ¬m) as background information, then all derivations from this firm premise are also firm, as is the one deriving O¬(e ∧ ¬m). {Dpd, Dpd ∪ Dps, Dpd ∪ Dps ∪ Dfd, Dpd ∪ Dps ∪ Dfd ∪ Dfs}. Then g(Γf, Γp) consists of elements from the set of formulas derivable in the weakest derivations up to the set of those derivable in all of them, including the strongest. So the Kratzer style ordering ≤g(Γf,Γp) induced by g(Γf, Γp) is defined as follows: ϕ ≤g(Γf,Γp) ψ iff for all X ∈ g(Γf, Γp), ψ ∈ X implies ϕ ∈ X. In other words, a formula ψ is at least as strong as a formula ϕ when every element of g(Γf, Γp) that contains ψ also contains ϕ. The following example shows how this classification works as a guideline for interpreting the reasoning of contrary-to-duty. Simply speaking, our ordering source respects the derived results of the firm derivations first, and then considers the logics used in the derivations. Example 3 (Instructions of a Party Host, continued). Let Γf = {m, ¬◇(e ∧ ¬m)} be the set of firm premises and Γp = {□(m → Oe), O¬m} be the set of plausible premises.
Now we know the following results: • Dfs contains: m, ¬◇(e ∧ ¬m), O¬(e ∧ ¬m), □(e → m), O(e → m) • Dfd = ∅ • Dps contains: □(m → Oe), O¬m • Dpd contains: Om, Oe, O¬e, O(e ∧ m), O(e ∧ ¬m), O(¬e ∧ m), O(¬e ∧ ¬m), ◇(e ∧ ¬m) Then we observe that the contraries of Om, O(e ∧ m), O(e ∧ ¬m), O(¬e ∧ m), and ◇(e ∧ ¬m) are in the higher priorities. Given the stipulation of priority by the ordering source, we now say that the Kratzer style deontic consequences will exclude them. Importantly, we can find the reasons in the conversation for each selection. We remove ◇(e ∧ ¬m) because it comes from a derivation with a defeasible rule and a hypothetical assumption, while its contrary is a firm assumption beyond doubt in the conversational background. Further, the obligation of being forced to embrace, Oe, and the obligation of not being forced to embrace, O¬e, are both of the same priority and inconsistent with each other. We shall decide when to choose Oe and when to choose O¬e, as well as O(¬e ∧ ¬m), to construct the Kratzer style consistent sets. • Plausible Derivation: Premises are hypothetical and plausible as beliefs, and so introduce uncertainty into the conversation. This plausible information can be changed if its contraries are firm. If we consider the premises O¬m, □(m → Oe) as plausible and m as firm, then the derivation of the formula O(e ∧ ¬m) with all these premises is also plausible. • Strict Derivation: The derivations in the lower-bound logic are considered to be strict. For instance, the derivation of O¬(e ∧ ¬m) from the premise ¬◇(e ∧ ¬m) is strict. • Defeasible Derivation: The derivations in the upper-bound logic but not in the lower-bound logic are considered to be defeasible. So the derivations of O(e ∧ ¬m) and ◇(e ∧ ¬m) from O¬m, □(m → Oe), m are defeasible. Therefore, firm and plausible derivations are mutually disjoint, and so are the strict and defeasible derivations. Definition 6 (Types of Formulas).
Given two sets of formulas Γf and Γp, where Γf is a set of firm premises and Γp is a set of plausible premises. A formula ϕ is derivable in a firm derivation iff there is Γ ⊆ Γf such that Γ ⊢ ϕ. A formula ϕ is derivable in a plausible derivation iff there is a Γ with Γ ∩ Γp ≠ ∅ such that Γ ⊢ ϕ. A formula ϕ is derivable in a strict derivation iff there is a Γ ⊆ L such that Γ ⊢D− ϕ. A formula ϕ is derivable in a defeasible derivation iff there is a Γ ⊆ L such that Γ ⊢D+ ϕ but there is no Γ ⊆ L such that Γ ⊢D− ϕ. As Kratzer (1981) required, the comparison brought by the ordering source should follow the principle of consistency. Saying that a formula derivable in a strict derivation is stronger than one derivable in a defeasible derivation easily leads to inconsistency, as the following example shows. Example 2. It is possible that a strict derivation for O¬(e ∧ ¬m) is plausible, if the premise ¬◇(e ∧ ¬m) is considered to be plausible. Notice that the derivation for O(e ∧ ¬m) is defeasible. Can we say that O¬(e ∧ ¬m) from a strict in neighborhood models (Chellas 1980). According to the Kratzer style full layers, we define two types of defeasible deontic inferences. They are widely investigated in the literature of non-monotonic reasoning: the so-called skeptical defeasible inference and the so-called credulous defeasible inference (Horty 1994). The former takes the intersection of all full layers into consideration to define consequences, while the latter takes their union. As shown in Example 4, fixing the premises, it is possible to have more than one Kratzer style maximal consistent set of formulas. So these two constructions provide different Kratzer style deontic consequences. Definition 8 (Skeptical Defeasible Inferences). Given two sets of formulas Γ, Γ′ ⊆ L such that Γ ∩ Γ′ = ∅, we define the closure operator D∀Γ′(Γ) = ⋂ L(Γ, Γ′). Now we turn to the further discussion on Kratzer style deontic inferences.
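The induced ordering ≤g can be prototyped directly from its definition. The following Python sketch is purely illustrative (not the paper's code): the ordering source g is modelled as the nested family {Dpd, Dpd ∪ Dps, Dpd ∪ Dps ∪ Dfd, Dpd ∪ Dps ∪ Dfd ∪ Dfs}, and formulas are abbreviated strings from the party-host example.

```python
# Illustrative sketch: the ordering source g as a list of nested sets,
# from the weakest layer Dpd up to the full union of derivation results.
# leq(phi, psi, g) implements: phi <=_g psi iff every member of g that
# contains psi also contains phi.

Dpd = {"Om", "Oe", "O~e"}        # plausible & defeasible formulas
Dps = {"[](m -> Oe)", "O~m"}     # plausible & strict
Dfd = set()                      # firm & defeasible (empty in the example)
Dfs = {"m", "~<>(e & ~m)"}       # firm & strict

g = [Dpd, Dpd | Dps, Dpd | Dps | Dfd, Dpd | Dps | Dfd | Dfs]

def leq(phi, psi, g):
    """phi <=_g psi: every element of g containing psi also contains phi."""
    return all(phi in X for X in g if psi in X)

print(leq("Om", "m", g))   # True: the firm premise m sits above Om
print(leq("m", "Om", g))   # False: Om does not sit above m
```

Formulas in the same layer, such as Oe and O~e, come out as equally strong, which matches the discussion of Example 3.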
4 Kratzer Style Deontic Inferences All Kratzer style deontic inferences in this paper are defined by maximal consistent sets of formulas according to the ordering source. As shown in Section 3, we propose to consider first a maximally consistent subset of the formulas in the strongest Dfs, and then a consistent subset of the weaker Dfd such that it is also consistent with Dfs and maximal; and we repeat this process of “keeping consistency with the previous layer as much as possible” for Dps and the weakest Dpd. We call this set a full layer and define it as follows. Definition 7 (Full Layers). Given two sets of formulas Γ, Γ′ ⊆ L such that Γ ∩ Γ′ = ∅, we denote S0 = S2 = D− and S1 = S3 = D+. We recursively define a layer L(Σi), where i ∈ {0, 1, 2, 3}, as follows: • L(Σ0) = CnD−(Σ0); • L(Σi+1) is defined as follows: (i) L(Σi+1) ⊆ CnSi+1(Σi+1); (ii) L(Σi+1) is consistent with ϕ w.r.t. Si, where ϕ ∈ L(Σi), and (iii) there is no ∆ ⊃ L(Σi+1) such that ∆ ⊆ CnSi+1(Σi+1) and for all ϕ ∈ L(Σi) the set ∆ ∪ {ϕ} is Si-consistent; where Σ0 ∈ MCSD−(Γ), Σ1 ∈ MCSD+(L(Σ0)), Σ2 ∈ MCSD−(L(Σ1) ∪ Γ′), and Σ3 ∈ MCSD+(L(Σ2)). We then define: • a full layer F(Σ0) from the pair (Γ, Γ′) starting at Σ0 ∈ MCSD−(Γ) as ⋃i∈{0,1,2,3} L(Σi); The defeasible inference |∼∀Γ′ corresponding to this closure operator is defined as follows: Γ |∼∀Γ′ ϕ iff ϕ ∈ D∀Γ′(Γ). We write |∼∀Γ′ ϕ when ∅ |∼∀Γ′ ϕ. Example 5 (Instructions of a Party Host, continued). Given the two sets Γ = {m, ¬◇(e ∧ ¬m)} and Γ′ = {□(m → Oe), O¬m}. Then m, ¬◇(e ∧ ¬m), O¬(e ∧ ¬m), □(e → m), O(e → m), □(m → Oe), O¬m ∈ D∀Γ′(Γ). Meanwhile, we also have Oe ∨ O¬e ∈ D∀Γ′(Γ). Definition 9 (Credulous Defeasible Inferences). Given two sets of formulas Γ, Γ′ ⊆ L such that Γ ∩ Γ′ = ∅, we define the closure operator D∃Γ′(Γ) = ⋃ L(Γ, Γ′). The defeasible inference |∼∃Γ′ corresponding to this closure operator is defined as follows: Γ |∼∃Γ′ ϕ iff ϕ ∈ D∃Γ′(Γ). We write |∼∃Γ′ ϕ when ∅ |∼∃Γ′ ϕ.
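The skeptical and credulous closures (Definitions 8 and 9) can be sketched over concrete full layers. A hedged Python illustration, not the paper's code: the two sets below stand in for the two full layers of the party-host example (Example 4), with formulas written as plain strings.

```python
# Illustrative sketch: skeptical consequence intersects all full layers,
# credulous consequence unites them. The layer contents mirror Example 4.

common = {"m", "~<>(e & ~m)", "O~(e & ~m)", "[](e -> m)",
          "O(e -> m)", "[](m -> Oe)", "O~m"}
layer_1 = common | {"Oe"}
layer_2 = common | {"O~e", "O(~e & ~m)"}
full_layers = [layer_1, layer_2]

def skeptical(layers):
    """Formulas accepted in every full layer (Definition 8)."""
    return set.intersection(*layers)

def credulous(layers):
    """Formulas accepted in at least one full layer (Definition 9)."""
    return set.union(*layers)

print("Oe" in skeptical(full_layers))   # False: only one layer accepts Oe
print("Oe" in credulous(full_layers))   # True
print("O~m" in skeptical(full_layers))  # True: shared by every full layer
```

This also makes Proposition 1 plausible at a glance: the skeptical set is contained in the credulous one by construction.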
Example 6 (Instructions of a Party Host, continued). Given the two sets Γ = {m, ¬◇(e ∧ ¬m)} and Γ′ = {□(m → Oe), O¬m}. Now we have this result: D∀Γ′(Γ) ∪ {Oe, O¬e, O(¬e ∧ ¬m)} ⊆ D∃Γ′(Γ). We present an observation about the connection between these two Kratzer style deontic inferences and the classical consequences as follows. • a set L(Γ, Γ′) of full layers as {F(Σ0) | Σ0 ∈ MCSD−(Γ)}. A layer is a consistent subset such that all its elements are consistent with the previous, stronger layer as much as possible. We can see that a layer L(Σ0) is a consistent subset of Dfs, a layer L(Σ1) is a consistent subset of Dfs ∪ Dfd, a layer L(Σ2) is a consistent subset of Dfs ∪ Dfd ∪ Dps, and a layer L(Σ3) is a consistent subset of Dfs ∪ Dfd ∪ Dps ∪ Dpd. Example 4 (Instructions of a Party Host, continued). Given Γ = {m, ¬◇(e ∧ ¬m)} and Γ′ = {□(m → Oe), O¬m}. Following the classification given by the ordering source in Example 3, for each group we put inconsistent formulas of the same priority into different full layers. So we have two full layers according to Dpd: • One full layer includes: m, ¬◇(e ∧ ¬m), O¬(e ∧ ¬m), □(e → m), O(e → m), □(m → Oe), O¬m, Oe; • Another full layer includes: m, ¬◇(e ∧ ¬m), O¬(e ∧ ¬m), □(e → m), O(e → m), □(m → Oe), O¬m, O¬e, O(¬e ∧ ¬m). We see that inconsistency arises in Dpd but not in the other three sets of Example 3. In other words, the inconsistency is brought up by the axioms and rules of the upper-bound logic from the plausible premises. Therefore each full layer contains all elements of Dfs ∪ Dfd ∪ Dps; they differ only in Dpd. Proposition 1. ⊢D− ⊆ |∼∀Γ′ ⊆ |∼∃Γ′ ⊆ ⊢D+. Now we consider the third type of Kratzer style deontic inferences, which focuses on derivations rather than the resulting consequences of derivations. Definition 10 (All-Things-Considered Defeasible Inferences).
Given two sets of formulas Γ, Γ′ ⊆ L such that Γ ∩ Γ′ = ∅, we define the closure operator DΓ′(Γ) = {ϕ | (∆, ϕ) ∈ ⋂ D(Γ, Γ′)}, where the set D(Γ, Γ′) of all derivations from the set of full layers is defined as: D(Γ, Γ′) = {(∆, ϕ) | ∆ ⊆ Γ ∪ Γ′ and ϕ ∈ F ∈ L(Γ, Γ′)}. The defeasible inference |∼Γ′ corresponding to this closure operator is defined as follows: Γ |∼Γ′ ϕ iff ϕ ∈ DΓ′(Γ). We write |∼Γ′ ϕ when ∅ |∼Γ′ ϕ. Example 7 (Instructions of a Party Host, continued). Remember that we have Oe ∨ O¬e ∈ D∀Γ′(Γ) in Example 5. This is not the case in the all-things-considered defeasible inference, because this result cannot be derived from the same premises. More precisely, Oe ∨ O¬e ∉ DΓ′(Γ). There are two main ways in the formal argumentation literature to make a monotonic logic like D+ defeasible: using either a defeasible knowledge base (a kind of assumptions) or defeasible rules. ASPIC+ (Modgil and Prakken 2018) is one of the few approaches which combines both ways. This has been criticized by proponents of other approaches, who suggest that one of these ways can be reduced to the other. We do not take a stance in this discussion; we just observe that the availability of both defeasible knowledge and defeasible rules, as well as the possibility to define preferences over arguments, can be used to capture well the deontic arguments of the Kratzer style deontic inferences. In the following, we define deontic arguments based on the lower-bound and upper-bound logics. Following the spirit of ordering source in Section 3, ASPIC+ style arguments (Modgil and Prakken 2018) are defined in terms of strict and defeasible rules, and knowledge. The knowledge base can be defeasible or not, and this does affect the definition of arguments. Our instantiation of ASPIC+ is twofold in terms of Kratzer’s deontic modals (Kratzer 1981; Horty 2014). The knowledge base is divided into strict and defeasible knowledge, i.e.
Ks and Kd, corresponding to the firm and plausible derivations, while the inference rules are categorized as strict and defeasible rules, i.e. Rs and Rd, corresponding to the strict and defeasible derivations. Preference in ASPIC+ is defined by the ordering source as discussed in Section 3. In addition, in many approaches to formal argumentation arguments are similar to derivations, but in our approach they are not the same. Although each argument corresponds to a derivation determined by a top rule, the former explicitly records each step of this derivation as a finite sequence.

Definition 11 (Inference Rules and Arguments). Let K = Ks ∪ Kd ⊆ L be a knowledge base such that Ks ∩ Kd = ∅, and let R = Rs ∪ Rd be a set of rules such that

• Rs = {φ1, . . . , φn ↦ φ | {φ1, . . . , φn} ⊢D⁻ φ} is the set of strict rules, and
• Rd = {φ1, . . . , φn ⇒ φ | {φ1, . . . , φn} ⊢D⁺ φ and {φ1, . . . , φn} ⊬D⁻ φ} is the set of defeasible rules.

Given each n ∈ ℕ, the set An is defined by induction as follows:

Proposition 2. ⊢D⁻ ⊈ |∼Γ′ and |∼Γ′ ⊆ ⊢D⁺.

Next, we check whether the Kratzer-style deontic inferences satisfy some important properties regarding nonmonotonicity.

Proposition 3. Let |∼ ∈ {|∼∀Γ′, |∼∃Γ′, |∼Γ′}, and define:
1. Reflexivity: Γ |∼ φ where φ ∈ Γ
2. Cut: If Γ ∪ {ψ} |∼ χ and Γ |∼ ψ then Γ |∼ χ
3. Cautious Monotony: If Γ |∼ ψ and Γ |∼ χ then Γ ∪ {ψ} |∼ χ
4. Left Logical Equivalence: If CnD⁺(Γ) = CnD⁺(∆) and Γ |∼ χ then ∆ |∼ χ
5. Right Weakening: If ⊢D⁺ φ → ψ and Γ |∼ φ then Γ |∼ ψ
6. OR: If Γ |∼ φ and ∆ |∼ φ then Γ ∪ ∆ |∼ φ
7. AND: If Γ |∼ ψ and Γ |∼ χ then Γ |∼ ψ ∧ χ
8. Rational Monotony: If Γ |∼ χ and Γ ̸|∼ ¬ψ then Γ ∪ {ψ} |∼ χ

The satisfaction results are shown in Table 3.

Property                   |∼∀Γ′   |∼∃Γ′   |∼Γ′
Reflexivity                ✓*      ✓       ✓*
Cut                        ✓       ✓       ✓
Cautious Monotony          ✓       ✓       ✓
Left Logical Equivalence   ✓       ✓       ✓
Right Weakening            ✓       ✓       ✓
OR                         No      No      No
AND                        ✓       ✓       ✓
Rational Monotony          ✓       No      ✓

Table 3: The symbol ✓* indicates that the property is satisfied when the given knowledge base is consistent in D⁻.
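The ✓* entries in Table 3 can be made concrete with a small executable sketch. The toy below is our own illustration, not the paper's logics D⁻/D⁺: formulas are literals, consistency means no atom occurs together with its negation, and the inference keeps whatever lies in every maximal consistent subset. Reflexivity then holds exactly when the knowledge base itself is consistent.

```python
# A minimal sketch (NOT the paper's logic D-/D+): formulas are literal
# strings such as "p" and "~p"; a set is consistent iff it never contains
# both an atom and its negation. Skeptical inference keeps what every
# maximal consistent subset (MCS) contains, illustrating the Table 3
# footnote: Reflexivity fails once the base is inconsistent.
from itertools import combinations

def consistent(s):
    return not any(("~" + l) in s for l in s if not l.startswith("~"))

def mcs(gamma):
    """All maximal consistent subsets of gamma (largest first)."""
    gamma = set(gamma)
    out = []
    for k in range(len(gamma), -1, -1):
        for sub in combinations(sorted(gamma), k):
            sub = set(sub)
            if consistent(sub) and not any(sub < m for m in out):
                out.append(sub)
    return out

def skeptical(gamma, phi):
    """gamma |~ phi iff phi belongs to every MCS of gamma."""
    return all(phi in m for m in mcs(gamma))

# Consistent base: Reflexivity holds for every premise.
assert skeptical({"p", "q"}, "p")
# Inconsistent base: p lies in some but not all MCSs, Reflexivity fails;
# the untouched premise q still follows.
assert not skeptical({"p", "~p", "q"}, "p")
assert skeptical({"p", "~p", "q"}, "q")
```

The MCS construction here plays the role that MCSD⁻ plays in the definition of full layers, with a trivial notion of consistency standing in for D⁻-consistency.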
Example 8 (Miss Manners). This counterexample to OR illustrates the so-called Miss Manners case discussed by Horty (2014) for deontic detachment. Let Γ = {O¬f, On}, ∆ = {O(a → f), On}, and Γ′ = {Pa}, where f stands for "eating with fingers", a for "being served asparagus", and n for "putting on the napkin". We then have the following results for |∼ ∈ {|∼∀Γ′, |∼∃Γ′, |∼Γ′}:

• Γ |∼ Pa and ∆ |∼ Pa, but Γ ∪ ∆ ̸|∼ Pa.

A0 = K

An+1 = An ∪ {B1, . . . , Bm ⇝ ψ | Bi ∈ An for all i ∈ {1, . . . , m} and ψ ∈ L}

where for an element B = B1, . . . , Bm ⇝ ψ:

• If B ∈ K, then Prem(B) = {ψ}, Conc(B) = ψ, Sub(B) = {ψ}, Rulesd(B) = ∅, and TopRule(B) = undefined, where ψ ∈ K.
• If B = B1, . . . , Bm ⇝ ψ where ⇝ is ↦, then Conc(B1), . . . , Conc(Bm) ↦ ψ ∈ Rs with Prem(B) = Prem(B1) ∪ . . . ∪ Prem(Bm), Conc(B) = ψ, Sub(B) = Sub(B1) ∪ . . . ∪ Sub(Bm) ∪ {B}, Rulesd(B) = Rulesd(B1) ∪ . . . ∪ Rulesd(Bm), and TopRule(B) = Conc(B1), . . . , Conc(Bm) ↦ ψ.

Example 9 (Instructions of a Party Host, continued). We now provide a counterexample to Rational Monotony for |∼∃Γ′. Assume that Γ = {m}, Γ′ = {¬◇(e ∧ m), O¬m}, χ = Oe ∨ O(¬e ∧ m), and ψ = □(m → Oe). We then have the following results:

• Γ |∼∃Γ′ χ and Γ ̸|∼∃Γ′ ¬ψ;
• However, Γ ∪ {ψ} ̸|∼∃Γ′ χ.

• A ∈ 𝒜 is acceptable w.r.t. E iff whenever B ∈ 𝒜 is such that (B, A) ∈ 𝒟, there is a C ∈ E such that (C, B) ∈ 𝒟.
• E is an admissible set iff E is conflict-free and if A ∈ E then A is acceptable w.r.t. E.
• E is a complete extension iff E is admissible and if A ∈ 𝒜 is acceptable w.r.t. E then A ∈ E.
• E is a stable extension iff E is conflict-free and for all B ∉ E there is an A ∈ E such that (A, B) ∈ 𝒟.
• If B = B1, . . . , Bm ⇝ ψ where ⇝ is ⇒, then each condition is similar to the previous item, except that the rule is defeasible and Rulesd(B) = Rulesd(B1) ∪ . . . ∪ Rulesd(Bm) ∪ {Conc(B1), . . . , Conc(Bm) ⇒ ψ}.
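The inductive construction of the argument sets A0 ⊆ A1 ⊆ . . . in Definition 11 can be sketched as a fixpoint computation. The sketch below is ours, not the paper's implementation: two hypothetical rules stand in for Rs and Rd, and formulas are opaque strings rather than the modal language L.

```python
# A toy fixpoint construction in the spirit of Definition 11. An argument
# is a tuple (conclusion, sub-arguments, top rule), where knowledge-base
# arguments have no sub-arguments and an undefined (None) top rule.
# The rules below are hypothetical stand-ins for Rs ("s") and Rd ("d").
from itertools import product

K = {"m", "dia(e^m)", "box(m->Oe)", "O~m"}
RULES = [
    (("dia(e^m)",), "O(e^m)", "s"),    # stand-in for a strict top rule
    (("m", "box(m->Oe)"), "Oe", "d"),  # stand-in for a defeasible top rule
]

def premise_tuples(args, body):
    # all ways to pick one already-built argument per premise in the body
    pools = [[a for a in args if a[0] == phi] for phi in body]
    return product(*pools) if all(pools) else []

def build_arguments(k, rules):
    # A0 = K; A(n+1) extends An by one rule application per step,
    # iterated until no new argument appears (the union over all n).
    args = {(phi, (), None) for phi in k}
    while True:
        new = set(args)
        for body, head, kind in rules:
            for subs in premise_tuples(args, body):
                new.add((head, subs, (body, head, kind)))
        if new == args:
            return args
        args = new

args = build_arguments(K, RULES)
# the conclusions are the knowledge base plus the two derived formulas
assert {a[0] for a in args} == K | {"O(e^m)", "Oe"}
```

Because arguments are plain tuples, Prem, Conc, Sub, and TopRule from Definition 11 can be read off each tuple by structural recursion.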
We define 𝒜 = ⋃n∈ℕ An as the set of arguments on the basis of K, and define Conc(E) = {Conc(A) | A ∈ E} where E ⊆ 𝒜. We define the set of formulas associated with a given argument as F(D) = Prem(D) ∪ {Conc(D)} where D ∈ 𝒜, and let F(E) = ⋃{F(D) | D ∈ E ⊆ 𝒜}.

Example 11 (Instructions of a Party Host, continued). In Example 10, the arguments are ordered as D2, D3, D4, D5 ≤ C, D ≤ D1, A, B. The defeats include (B, D5), (D1, D4), (D2, D3), (D3, D2), as illustrated in Figure 1.

The following example illustrates the arguments in the running example.

Example 10 (Instructions of a Party Host, continued). We take the knowledge base constructed by Ks = {m, ◇(e ∧ m)} and Kd = {□(m → Oe), O¬m}. The knowledge base yields four arguments: the strict knowledge A = m and B = ◇(e ∧ m), as well as the defeasible knowledge C = □(m → Oe) and D = O¬m. The arguments leading to the conclusions O(e ∧ m), Oe, and O¬e are as follows:

1. The arguments whose top rules are strict rules:
• D1 = B ↦ O(e ∧ m)
The premise of argument D1 is a piece of strict knowledge.
2. The arguments whose top rules are defeasible rules:
• D2 = A, C ⇒ Oe
• D3 = B, D ⇒ O¬e
• D4 = A, C, D ⇒ O(¬e ∧ m)
• D5 = A, C, D ⇒ ¬◇(e ∧ m)
All four arguments have at least one piece of defeasible knowledge among their premises.

[Figure 1 shows the defeat graph over the arguments B, D1, D2, D3, D4, D5.]
Figure 1: The closed circle represents the argument with a strict rule as its top rule, and the dashed circles represent the arguments with defeasible rules as their top rules. The arrows represent defeat relations.

6 From Deontic Logic to ASPIC+

One contribution of this paper is to instantiate ASPIC+ in terms of Kratzer-style consistency. We prove that the notion of full layer in Section 4 characterizes the notion of stable extension in Section 5. This is no coincidence. The compositional idea proposed by Kratzer provides a process that captures the intuition behind stability: "all outsiders are defeated." The ordering source, as a function, induces a preference.
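The defeat relation of Example 11 is finite, so the stability condition ("all outsiders are defeated") can be checked by brute force. The following sketch is our own illustration, working directly on the defeat pairs listed in Example 11 rather than through the full ASPIC+ machinery:

```python
# Brute-force enumeration of stable sets over the defeat graph of
# Example 11 / Figure 1: conflict-free sets that defeat every outsider.
# Argument names follow Example 10; the graph is taken as given, so
# the ordering-source preferences are already folded into the defeats.
from itertools import combinations

ARGS = {"A", "B", "C", "D", "D1", "D2", "D3", "D4", "D5"}
DEFEATS = {("B", "D5"), ("D1", "D4"), ("D2", "D3"), ("D3", "D2")}

def conflict_free(e):
    return not any((a, b) in DEFEATS for a in e for b in e)

def stable(e):
    # conflict-free, and every argument outside e is defeated by a member
    return conflict_free(e) and all(
        any((a, b) in DEFEATS for a in e) for b in ARGS - e
    )

extensions = [
    set(c)
    for k in range(len(ARGS) + 1)
    for c in combinations(sorted(ARGS), k)
    if stable(set(c))
]
# exactly two stable sets: the undefeated arguments plus D2 or D3
assert {"A", "B", "C", "D", "D1", "D2"} in extensions
assert {"A", "B", "C", "D", "D1", "D3"} in extensions
```

The two resulting sets mirror the mutual defeat between D2 and D3: each stable set keeps the undefeated arguments and exactly one of the two rivals, matching the conflicting full layers of Example 4.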
A full layer then recursively defines a mechanism that removes all inconsistent and less prioritized results, which is exactly what a stable extension requires. This connection is shown by the following representation theorem for stable extensions.

We define such a preference among arguments by the ordering source.

Definition 12 (Argument Properties). Let A, B be arguments. Then A is strict if Rulesd(A) = ∅; defeasible if Rulesd(A) ≠ ∅; firm iff Prem(A) ⊆ Ks; plausible iff Prem(A) ∩ Kd ≠ ∅. The preference ≤ is defined as: A ≤ B iff

• A and B are both firm or both plausible, and A is defeasible;
• otherwise, A is plausible and B is firm.

The preference indeed divides arguments into four groups: firm and strict, firm and defeasible, plausible and strict, and plausible and defeasible.

Proposition 4. Let K = Ks ∪ Kd be a knowledge base and 𝒜 a set of arguments on the basis of K. Let F ∈ L(Ks, Kd) be as defined in Definition 7, and define E = {D ∈ 𝒜 | F(D) ⊆ F}. Then E is a stable extension.

Definition 13 (Defeats). We define 𝒟 as the set of pairs of arguments such that argument A defeats argument B iff:

• either Conc(A) = ¬φ for some B′ ∈ Sub(B) with TopRule(B′) ∈ Rd and Conc(B′) = φ, and A ⊀ B′;
• or Conc(A) = ¬φ for some defeasible knowledge φ ∈ Prem(B) ∩ Kd of B, and A ⊀ φ.

Proof. First, we prove that E is conflict-free. Suppose otherwise: there are A, B ∈ E such that (A, B) ∈ 𝒟. This implies that F(A) ∪ F(B) ⊢ ⊥, where ⊢ is either ⊢D⁻ or ⊢D⁺. Because each F ∈ L(Ks, Kd), we may suppose there is some Σ0 ∈ MCSD⁻(Ks) such that F = F(Σ0) = ⋃0≤i≤3 L(Σi) is a full layer from the pair (Ks, Kd) starting at Σ0, where each layer L(Σi) satisfies Definition 7. Because each L(Σi) (0 ≤ i ≤ 3) is constructed from a consistent set, A and B must be elements of different layers L(Σi), i ∈ {0, 1, 2, 3}. Assume that F(A) ⊆ L(Σ1) and

Definition 14 (Dung Extensions).
Let 𝒜 be a set of arguments, 𝒟 a set of defeats, and E ⊆ 𝒜 a set of arguments. Then

• E is conflict-free iff for all A, B ∈ E we have (A, B) ∉ 𝒟.

F(B) ⊆ L(Σ2). But then this contradicts condition (ii). The other cases are similar. So E is conflict-free.

Now we prove that E is a stable extension. For each B ∉ E we need to find an A ∈ E such that (A, B) ∈ 𝒟. Given B ∉ E, we know that F(B) fails some condition of L(Σi). There are four cases to consider.

• Γ |∼∀Γ′ φ iff every stable extension on the basis of K contains an argument A with Conc(A) = φ.

Proposition 6. Let K = Ks ∪ Kd be a knowledge base and 𝒜 a set of arguments on the basis of K, such that Ks = Γ and Kd = Γ′. We have:

• Γ |∼∃Γ′ φ iff some stable extension on the basis of K contains an argument A with Conc(A) = φ.

1. If F(A) ⊆ L(Σ0). Because B ∉ E, F(B) cannot be contained in L(Σ0), so F(A) ∪ F(B) ⊢D⁻ ⊥. Because TopRule(A) ∈ Rs and F(A) ⊆ L(Σ0), according to the priority defined by the ordering source, B ≤ A. So there is an A ∈ E such that (A, B) ∈ 𝒟.
2. If F(A) ⊆ L(Σ1). We only consider the case F(A) ∪ F(B) ⊢D⁺ ⊥; otherwise, as in the previous step, there is an argument in L(Σ0) whose set of formulas is not consistent with F(B).
• If A < B, then B is strict and firm. So F(B) is not D⁻-consistent with the elements of L(Σ0), otherwise F(B) ⊆ L(Σ0). As in the previous step, there is an A′ with F(A′) ⊆ L(Σ0) and A′ ≥ B. Thus there is an A′ ∈ E such that (A′, B) ∈ 𝒟.
• If A ≥ B, then there is likewise an A ∈ E such that (A, B) ∈ 𝒟.
3. If F(A) ⊆ L(Σ2). We only consider the case F(A) ∪ F(B) ⊢D⁻ ⊥; otherwise, the proof is similar to the previous two steps.
• If A < B, then either B is strict and firm or B is defeasible and firm.
It follows that either B is strict and firm and F(B) is not D⁻-consistent with L(Σ0), or B is defeasible and firm and F(B) is not D⁺-consistent with L(Σ1); otherwise F(B) ⊆ L(Σ0) or F(B) ⊆ L(Σ1). Either case implies that there is an A′ with F(A′) ⊆ L(Σi), F(A′) ∪ F(B) ⊢Si ⊥, and A′ ≥ B (i ∈ {0, 1}). So there is an A′ ∈ E such that (A′, B) ∈ 𝒟.
• If A ≥ B, then clearly (A, B) ∈ 𝒟.
4. If F(A) ⊆ L(Σ3). We only consider the case F(A) ∪ F(B) ⊢D⁺ ⊥; otherwise, the proof is similar to the previous three steps.
• If A < B, then B is either strict and firm, defeasible and firm, or strict and plausible. Using the same strategy as before, we can always find an A′ with F(A′) ⊆ L(Σi), F(A′) ∪ F(B) ⊢Si ⊥, and A′ ≥ B (i ∈ {0, 1, 2}). So there is an A′ ∈ E such that (A′, B) ∈ 𝒟.
• If A ≥ B, then clearly (A, B) ∈ 𝒟.

Proposition 7. Let K = Ks ∪ Kd be a knowledge base and 𝒜 a set of arguments on the basis of K, such that Ks = Γ and Kd = Γ′. We have:

• Γ |∼Γ′ φ iff there is an argument A contained in every stable extension on the basis of K such that Conc(A) = φ.

7 Related Work

A well-known approach to hypothetical reasoning in the form of maximally consistent subsets has been studied in nonmonotonic reasoning (Poole 1988; Makinson 1994; Freund 1998; Makinson and van der Torre 2001). One of the early proposals is due to Poole (1988), in which an assumption-based consequence relation is constructed on top of default logic following certain principles of consistency. Along the same lines, Freund (1998) proposes preferential inferences for hypothetical reasoning, in which the preference is induced by the hypotheses: one proposition is preferred to another iff it is more likely, in the sense that it satisfies a larger subset of the presumed premises. In contrast, the preference in Kratzer's account is induced by the ordering source.
In our case, it takes two kinds of premises as well as two categories of inference rules into consideration. In contrast, Makinson and van der Torre (2001) study a variety of constraints on input/output pairs to model the inferential relation between premises and conclusions. These constraints, which decide which conclusions are accepted, can, we conjecture, be represented by the Kratzer ordering source. We leave this to further research.

The study of logic-based instantiations of argumentation frameworks (da Costa Pereira et al. 2017; Beirlaen, Heyninck, and Straßer 2018) is an active area connecting logics to formal argumentation. It usually investigates how to apply the standard method of constructing maximal consistent sets to argumentation systems (Amgoud and Besnard 2013; Arieli, Borg, and Straßer 2018). This basic idea can be traced back to the work of Benferhat et al. (Benferhat, Dubois, and Prade 1995) and Cayrol (1995). Benferhat et al. propose the concept of a "level of paraconsistency" to characterize preference in argumentation theory. Cayrol (1995) links the construction of stable extensions to maximally consistent sets in classical logic. Recently this research area has put its emphasis on modal logics (Beirlaen, Heyninck, and Straßer 2018; Liao et al. 2019). Beirlaen et al. (2018) define argumentation systems for conditional obligations, in which preference is indexed by the modal language. In contrast, the rule-based argumentation systems developed by Liao et al. (2019) provide total orderings to prioritize norms in the semantics. In both accounts, preferences are given rather than induced. Dong et al. (2019) instantiate ASPIC+ on

We can now conclude that E is a stable extension.

Given the above representation theorem, we then have the following representation theorems for the Kratzer-style deontic inferences in terms of argumentation.

Proposition 5.
Let K = Ks ∪ Kd be a knowledge base and 𝒜 a set of arguments on the basis of K, such that Ks = Γ and Kd = Γ′. We have:

agents' abilities. A contrary-to-duty obligation can be understood via different intuitions, for instance factual detachment (Straßer 2014) or deontic detachment (Prakken and Sergot 1996) about norm violations, or a compensational norm linking in a computational way (Governatori and Rotolo 2006). Several formal systems have been proposed to deal with the various variants of contrary-to-duty (Prakken and Sergot 1996; Makinson and van der Torre 2001; Parent and van der Torre 2014; Beirlaen, Heyninck, and Straßer 2018). As discussed by Prakken and Sergot (1996), and more recently by Pigozzi and van der Torre (2017), the challenge of representing norm violation as distinct from norm exception remains open. Whether there are other linguistic features that distinguish a violation from an exception is a question we leave to future work.

the basis of a modal deontic logic for obligation and strong permission. They define preferences either by the language types of the premises or by the inference rules but, contrary to our work, have not considered both at the same time.

We now turn to the fruitful work on defeasible deontic logic (Nute 1997). The main idea is to define defeasibility either by consistency governed by a set of formulas combined with a set of inference rules (Goble 2014; Straßer 2014; Governatori and Rotolo 2006), or by providing a priority to override less normal conclusions (Horty 1994; Governatori 2018). For instance, Goble (2014) provides an adaptive logic to handle different kinds of normative conflicts via the notion of abnormality: a formula follows from a set of formulas iff it is satisfied in every reliable and normal model. This inference relation thus depends heavily on the sets of abnormalities and the inference rules over them.
Straßer (2014) follows Goble's work and investigates the dynamics of adaptive reasoning, while Governatori and Rotolo (2006) propose that the multi-layered consistency of conditional obligations is captured by sequential operators that compute norms and their violations. In contrast, Horty (1994) and Governatori (2018) define defeasible consequences by priorities over default rules. Both define priorities among default rules rather than over arguments. We can therefore apply Kratzer's method to define another kind of ordering source to model these preferences and their default constructions.

Acknowledgments

The authors thank the two anonymous reviewers of NMR 2020 for their useful comments. Huimin Dong is supported by the China Postdoctoral Science Foundation funded project [No. 2018M632494] and the National Science Centre of Poland [No. UMO-2017/26/M/HS1/01092]. Beishui Liao and Huimin Dong are supported by the Convergence Research Project for Brain Research and Artificial Intelligence, Zhejiang University. Leendert van der Torre and Beishui Liao have received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 690974 for the project "MIREL: MIning and REasoning with Legal texts".

8 Conclusion

This paper employs Kratzer's compositional method, the ordering source, to define preferences associated with hypothetical reasoning. By doing so, we can explicitly interpret why a conclusion is drawn: by the linguistic types of the premises or by the inference rules, defeasible or not. This is proven by the four representation theorems in Section 6 connecting defeasible deontic logics to ASPIC+. It clarifies the kernel of constructing preferences and arguments by the Kratzer maximal-consistency method. We believe that this Kratzer method can be used to redefine other nonmonotonic formal tools. We leave this to further research.
We observe that the ordering source provided in this paper highlights the principle of "premises first": it always considers the derivations with firm premises first and only then categorizes derivations according to which inference rules they use. In other words, it regards background information as more important than the logical rules. This is reflected in the notion of full layers. In future work we can explore an alternative ordering source associated with the principle of "rules first" and examine the logical structure behind it. Furthermore, given full consideration of "premises first" and "rules first", it becomes possible to answer the linguistic question in a general way: whether a factual detachment or a deontic detachment is satisfied or valid (Arregui 2010) depends on what kind of logical structure is used for the hypothetical reasoning.

We can also investigate alternative formats of contrary-to-duty obligation. We have studied an example of contrary-to-duty regarding obligation violation interacting with the agents' abilities.

References

Amgoud, L., and Besnard, P. 2013. Logical limits of abstract argumentation frameworks. Journal of Applied Non-Classical Logics 23(3):229–267.
Arieli, O.; Borg, A.; and Straßer, C. 2018. Reasoning with maximal consistency by argumentative approaches. Journal of Logic and Computation 28(7):1523–1563.
Arregui, A. 2010. Detaching if-clauses from should. Natural Language Semantics 18(3):241–293.
Beirlaen, M.; Heyninck, J.; and Straßer, C. 2018. Structured argumentation with prioritized conditional obligations and permissions. Journal of Logic and Computation 29(2):187–214.
Benferhat, S.; Dubois, D.; and Prade, H. 1995. A local approach to reasoning under inconsistency in stratified knowledge bases. In European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty, volume 946, 36–43. Springer.
Blackburn, P.; De Rijke, M.; and Venema, Y. 2002. Modal Logic, volume 53. Cambridge University Press.
Cayrol, C.
1995. On the relation between argumentation and non-monotonic coherence-based entailment. In International Joint Conference on Artificial Intelligence, volume 95, 1443–1448.
Chellas, B. F. 1980. Modal Logic: An Introduction. Cambridge University Press.
Chisholm, R. M. 1963. Contrary-to-duty imperatives and deontic logic. Analysis 24(2):33–36.
da Costa Pereira, C.; Liao, B.; Malerba, A.; Rotolo, A.; Tettamanzi, A. G. B.; van der Torre, L.; and Villata, S. 2017. Handling norms in multi-agent systems by means of formal argumentation. IfCoLog Journal of Logics and Their Applications 4(9):3039–3073. Also in Handbook of Normative Multi-agent Systems.
Dong, H.; Liao, B.; Markovich, R.; and van der Torre, L. 2019. From classical to non-monotonic deontic logic using ASPIC+. In International Workshop on Logic, Rationality and Interaction, 71–85. Springer.
Freund, M. 1998. Preferential reasoning in the perspective of Poole default logic. Artificial Intelligence 98(1-2):209–235.
Goble, L. 2014. Deontic logic (adapted) for normative conflicts. Logic Journal of the IGPL 22(2):206–235.
Governatori, G., and Rotolo, A. 2006. Logic of violations: A Gentzen system for reasoning with contrary-to-duty obligations. The Australasian Journal of Logic 4.
Governatori, G. 2018. Practical normative reasoning with defeasible deontic logic. In Reasoning Web International Summer School, 1–25. Springer.
Horty, J. F. 1994. Moral dilemmas and nonmonotonic logic. Journal of Philosophical Logic 23(1):35–65.
Horty, J. 2014. Deontic modals: why abandon the classical semantics? Pacific Philosophical Quarterly 95(4):424–460.
Kratzer, A. 1981. The notional category of modality. Words, Worlds, and Contexts: New Approaches in Word Semantics 6:38.
Liao, B.; Oren, N.; van der Torre, L.; and Villata, S. 2019. Prioritized norms in formal argumentation. Journal of Logic and Computation 29(2):215–240.
Makinson, D., and van der Torre, L. 2001. Constraints for input/output logics.
Journal of Philosophical Logic 30:155–185.
Makinson, D. 1994. General patterns in nonmonotonic reasoning. In Handbook of Logic in Artificial Intelligence and Logic Programming (vol. 3). 35–110.
Modgil, S., and Prakken, H. 2018. Abstract rule-based argumentation. In Baroni, P.; Gabbay, D.; Giacomin, M.; and van der Torre, L., eds., Handbook of Formal Argumentation. College Publications. 287–364.
Nute, D., ed. 1997. Defeasible Deontic Logic. Kluwer.
Parent, X., and van der Torre, L. 2014. Sing and dance! In International Conference on Deontic Logic in Computer Science, 149–165. Springer.
Parent, X., and van der Torre, L. 2017. Detachment in normative systems: Examples, inference patterns, properties. IfCoLog Journal of Logics and Their Applications 4(9):2995–3038. Also in Handbook of Normative Multi-agent Systems.
Pigozzi, G., and van der Torre, L. 2017. Multiagent deontic logic and its challenges from a normative systems perspective. IfCoLog Journal of Logics and Their Applications 4(9):2929–2993. Also in Handbook of Normative Multi-agent Systems.
Poole, D. 1988. A logical framework for default reasoning. Artificial Intelligence 36(1):27–47.
Prakken, H., and Sergot, M. 1996. Contrary-to-duty obligations. Studia Logica 57(1):91–115.
Straßer, C. 2014. A deontic logic framework allowing for factual detachment. In Adaptive Logics for Defeasible Reasoning. Springer. 297–333.
van Benthem, J.; Grossi, D.; and Liu, F. 2014. Priority structures in deontic logic. Theoria 80(2):116–152.
von Wright, G. H. 1951. Deontic logic. Mind 60(237):1–15.