2013, Lecture Notes in Computer Science
We propose a set of algebraic laws for reasoning with sequential imperative programs that use object references like in Java. The theory is based on previous work by adding laws to cope with object references. The incrementality of the algebraic method is fundamental; with a few exceptions, existing laws for copy semantics are entirely reused, as they are not affected by the proposed laws for reference semantics. As an evidence of relative completeness, we show that any program can be transformed, through the use of our laws, to a normal form which simulates it using an explicit heap with copy semantics.
The inherent difficulty of reasoning with pointers has been successfully addressed using different techniques for describing spatial separation of pointers, see for example [19,7,13,3]. However, there have been few initiatives using algebraic approaches [12,20], despite their well-known advantages. Transformations are used in compilers, but these rely on context conditions presented algorithmically or by means of logic (as discussed in [4]); in many cases they apply only to intermediate representations. No comprehensive set of algebraic laws has been proposed to support transformations of source programs involving references.
In [10], Hoare and Staden highlight, among other advantages, the incrementality of the algebraic method. When a new programming language concept or design pattern is added, new axioms can be introduced, keeping intact at least some axioms and theorems of the existing theory of the language. Even when the new language features have an impact on the original ones, this tends to be controllable, affecting only a few laws. On the other hand, a pure algebraic presentation is based on postulating algebraic laws, which raises the questions of consistency and completeness of the proposed set of laws.
In this paper, we explore the incrementality of algebra to extend with object references a simple non-deterministic imperative language similar to that given in the seminal "Laws of Programming" by Hoare et al. [9] (LoP for short). Our language deals with references in the same way as Java: aliasing (sharing) may only occur for fields of objects, since the only operations on references are field accesses for reading and writing. There is no arithmetic on references, nor are there operations returning addresses of variables.
Based on LoP, we generalize some laws to deal with references, in particular those related to assignments to object fields. The main difficulties are due to aliasing. We tackle them using techniques inspired by the works of Morris [17] and Bornat [7] on program logic. We also propose new laws to deal with object creation and the manipulation of assertions. Dynamic allocation poses challenges in semantic modeling, and thus in semantic notions of completeness [21]. As in LoP, we address the completeness of the laws by showing that any program can be transformed, by using the laws, to a normal form. Soundness of the laws is tackled in an extended report [15] by proving the validity of the laws with respect to a naive denotational semantics (which suffices because we only consider first-order programs).
Our work is in the wider context of defining algebraic theories for reasoning about object-oriented programs. Previous work [6,8] defines theories for object-orientation useful for proving transformations of programs such as refactorings. However, the language used in these works has copy semantics, lacking the concept of reference and thus restricting the refactorings that can be characterized. By "copy semantics" we mean that only simple variables are mutable; objects are immutable records (i.e., functional maps with update by copy), so aliasing cannot occur.
Our laws are intended to support program transformation and verification. They can be used, for instance, in the design of correct compilers and optimizers. Together with laws of object-orientation established in previous works, which are also valid in the context of references, they can be applied to a wider collection of refactorings and patterns which depend on the concept of reference.
In the next Section we show the abstract syntax of the language and briefly explain its constructions. Section 3 discusses how aliasing complicates the algebraic reasoning with programs and describes a substitution mechanism that deals with it. The laws are given in Section 4 and a notion of relative completeness for the proposed set of laws is shown in Section 5. Section 6 presents final considerations, including other notions of completeness, and discusses related and future works. Proofs and further details appear in the long version of this paper [15].
The programming language we consider is sequential and imperative. It extends that in LoP by including object references as values and assignments to object fields. The language is statically typed, thus each variable and field has a statically declared type. Formalization of types is routine and omitted.
The abstract syntax is in Figure 1. In this grammar x, X, f, and K range over given sets representing names for variables, recursive commands, fields of objects, and classes, respectively. The non-terminal symbols cd, T, c, le and e are used for class declarations, types, commands, left expressions and expressions, respectively. As a convention, a line above syntactic elements denotes a list of them; thus, for example, an overlined e abbreviates a list e1, . . . , en of expressions, for some n, and bears no relation to the plain identifier e, which represents a single expression not necessarily in the list. The empty list is written "()". A class declaration defines a named record type, i.e., class K {f : T} declares a class with name K and fields f : T. There are no methods and all the fields in the class are public. In our laws of commands, we assume there is a fixed set of class declarations, which may be mutually recursive.

Figure 1. Abstract syntax of the language.
Like in Java, variables or fields of the primitive types bool and int store values of the corresponding types. On the other hand, a variable or field whose declared type is some class K does not store an object instance of K but a reference to one. This is the main difference between our language and those of [6,8], where copy semantics is adopted and variables/fields hold full objects.
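To make the contrast concrete, the following plain Java fragment (our own illustration, not part of the paper's formal language; the class K and the variable names are only for this example) shows reference semantics at work: a field update made through one variable is visible through any alias.

```java
// Class-typed variables hold references, so assignment copies the reference,
// not the object: x and y below denote the same heap cell.
class K {
    int f;
}

public class AliasingDemo {
    public static void main(String[] args) {
        K x = new K();            // x holds a reference to a fresh K object
        K y = x;                  // y is an alias: it refers to the same object
        y.f = 42;                 // update through y ...
        System.out.println(x.f);  // ... is observable through x: prints 42
        // Under copy semantics, as in [6,8], y would hold an independent copy
        // of the record and x.f would still be 0.
    }
}
```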
The expression null.f is allowed because it may arise from the application of laws; evaluating it always aborts.
The language has all the programming commands given in LoP, extended with commands for creation of objects, assertions and richer left expressions for assignments to fields. We omit specification constructs like angelic choice. We use the same syntax as in LoP, e.g., b ◁ e ▷ c and e ∗ c are the conditional and iteration commands corresponding to the if and while of Java, respectively. In both cases e is the guard condition. For recursion we use the notation µ X • c, which binds X and defines a command where recursive calls can happen within c. The while command can be defined in terms of the µ construct: e ∗ c = µ X • ((c; X) ◁ e ▷ skip).
The non-deterministic command b ∪ c denotes the demonic choice between b and c. The command ⊥, also known as abort, represents the behaviour of a failing program. It is the most non-deterministic program since its execution may result in any state or may even fail to terminate.
The simultaneous assignment le := e requires the same size for both lists, and, of course, the same type for each corresponding left and right expression. The assignment is executed by evaluating all the expressions in e and then assigning each resulting value to the corresponding left expression in le. The assignment command in our language differs from that of LoP in some important aspects. First, it is allowed to have empty lists at each side of ":=". Indeed, we can define skip as () := (), whose execution always ends successfully without performing any change. Second, left expressions are richer than those in LoP where only variables are allowed. Here, left expressions can also be fields of objects or even conditional left expressions.
Notably, repeated left expressions are allowed in the left-hand side list of ":=". When this happens, the execution is non-deterministic. For example, the execution of x, x := 1, 2 assigns either 1 or 2 to x. This kind of non-determinism can arise even with syntactically distinct left expressions when the left-hand side list includes aliased fields, so we might as well allow it for variables too.
The command x ← new K creates in the heap a new instance of the class K and assigns its reference to x . The fields of the new instance are initialized by default with 0 for int, false for bool and null for objects.
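This mirrors Java's own behaviour, as the tiny fragment below (illustrative only) shows: instance fields of a freshly allocated object carry the default values for their types.

```java
// Fields of a newly created object get default values, matching the
// initialization described above (and law (35) later on).
class K {
    int i;        // defaults to 0
    boolean b;    // defaults to false
    K next;       // defaults to null
}

public class NewDefaults {
    public static void main(String[] args) {
        K x = new K();
        System.out.println(x.i + " " + x.b + " " + x.next);  // prints "0 false null"
    }
}
```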
Assertions are denoted by [e], where e is a boolean expression.
[e] is a command that behaves like skip when e evaluates to true and like ⊥ otherwise. Our assertions use only program expressions and can be defined in terms of more basic operators: [e] = skip ◁ e ▷ ⊥. However, assertions may refer to a distinguished variable, alloc, which in any state holds the set of currently allocated references. This device has been used in some program logics (e.g., [3]).
As usual we model input and output by the global variables of the main program. These global variables are assumed to have implicit type declarations.
We use a, b, c to stand for commands, d , e for expressions, f , g, k for field names, m for case lists (defined in Subsection 3.2), p, q, r stand for left expressions, x , y, z stand for variables, K stands for class names and T stands for type names, i.e. primitive types and classes.
A fundamental idea behind algebra of programs [9,16,1], as well as in Hoare Logic and many other formalisms, is that variable substitution can capture the effects that assignments produce on expressions. However, when the language has references, several obstacles emerge. The main difficulty resides in the possibility of aliasing, which happens when two or more expressions or left expressions point to a single object field.
Consider the following law, as originally presented in LoP, where aliasing is not possible. It combines two sequential assignments, x := e; x := d, into a single assignment of the form x := d^x_e, where the notation d^x_e denotes the simultaneous substitution in d of each variable of the list x by the corresponding expression of e. We refer to this law as (*). Just like in LoP, we assume that for every expression e there is an expression De for checking whether e is defined, and De itself is always defined. For example, D(x/y) is given by y ≠ 0. Considering the left-hand side of (*), if e is undefined it behaves as ⊥; this justifies a check for definedness of e on the right-hand side of the law.
Under the conditions given in LoP, where the left-hand side x is a list of variables without duplicates, the law is valid. But our language allows duplicates on left-hand sides, and in that case the law does not make sense, since the substitution d^x_e is not well defined. Moreover, because of the possibility of aliasing, we get into trouble if we apply the law to assignments with fields as left expressions. Note that a variable always denotes the same memory cell in both the first and the second assignment on the left-hand side of the law. This is not true for fields. For example, in the command x, x.f := a, b; x, x.f := x, x.f + 1, the left expression x.f in the second assignment may refer to a different cell from the one it refers to in the first, since x may have changed to reference the object referenced by a.
It is well known that, with pointers or arrays, such a law is not valid if naive substitution is used for left expressions. For example, for the command p.f := p.f + 1, naive substitution applied to the expression p.f + q.f always gives (p.f + 1) + q.f, ignoring the possibility of aliasing between p.f and q.f.
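The failure of naive substitution can be observed directly in Java; the following fragment (our own illustration, with hypothetical names) evaluates p.f + q.f after p.f = p.f + 1 and shows that the outcome depends on whether p and q are aliased, which the substituted expression (p.f + 1) + q.f cannot express.

```java
// After p.f = p.f + 1, the value of p.f + q.f is NOT always
// (old p.f + 1) + old q.f: when p and q are aliased, q.f is incremented too.
class K {
    int f;
}

public class NaiveSubstitution {
    static int run(K p, K q) {
        p.f = p.f + 1;
        return p.f + q.f;
    }

    public static void main(String[] args) {
        K a = new K(), b = new K();        // a.f == b.f == 0
        System.out.println(run(a, b));     // distinct objects: prints 1  ( (0+1) + 0 )
        K c = new K();
        System.out.println(run(c, c));     // aliased arguments: prints 2 ( (0+1) + (0+1) )
    }
}
```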
In order to generalize law (*) above for our language, we prefix a guard asserting that the left expressions of the first assignment are pairwise disjoint and, furthermore, that the left expressions of both assignments refer to the same cells. Also, we need to use a substitution mechanism that deals with aliasing and simultaneous assignments. In the following subsection we give a precise definition of substitution. We give the generalization of law (*) in Section 4 -see law (33).
Like in Java, aliasing may only occur for fields of objects. More precisely, field accesses such as p.f and p.g, where f and g are different field names, can never have the same lvalues, i.e., they can never be aliased. On the other hand, p.f and q.f are aliased if and only if p == q, meaning that p and q hold the same value (object address). For simplicity we consider field names to be globally unique, i.e., distinct classes use distinct field names.
We define a state predicate alias[p, q] that is true if and only if p and q are aliased. For convenience, we consider that any left expression is aliased with itself. We use square brackets to stress that the arguments of alias are not semantic values but syntactic elements; note, however, that the aliasing check is state dependent. The predicate is defined by recursion on the structure of left expressions: for variables, alias[x, y] holds iff x ≡ y; for field accesses, alias[p.f, q.g] holds iff f ≡ g and p == q; a variable and a field access are never aliased. We write ≡ to mean syntactically the same.
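In Java terms, the interesting case of the definition, alias[p.f, q.f] iff p == q, is just reference equality of p and q; the following fragment (illustrative only) makes this explicit.

```java
// p.f and q.f denote the same memory cell exactly when p and q hold the
// same reference, which is what == tests on object types in Java.
class K {
    int f;
}

public class AliasCheck {
    // alias[p.f, q.f] for two field accesses p.f and q.f
    static boolean aliased(K p, K q) {
        return p == q;
    }

    public static void main(String[] args) {
        K a = new K();
        System.out.println(aliased(a, a));        // true: same cell
        System.out.println(aliased(a, new K()));  // false: distinct cells
    }
}
```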
For substitution we borrow ideas from Bornat [7], who defines a field substitution operator for languages that, like ours, treat references as Java does. As we do in this paper, he also assumes that distinct types of objects use distinct field names, which simplifies the definition. We write e^f_{{f : p → d}} to denote the expression obtained by syntactically replacing occurrences of the field name f by the conditional field {f : p → d}. A conditional field access e1.{f : p → d} is interpreted as d ◁ e1 == p ▷ e1.f.
As an illustrative example, consider the effect caused by the assignment x.f := e on the expression x.f + y.g + z.f. This effect is given by the following field substitution:

(x.f + y.g + z.f)^f_{{f : x → e}} = e + y.g + (e ◁ z == x ▷ z.f)
As expected, observe that, in contrast to the initial expression, in the resulting expression x.f has been replaced by e, and z.f by a conditional indicating that if x.f and z.f are aliased then z.f is also replaced by e, whereas if there is no such aliasing z.f is kept intact. Field substitution also works for expressions containing nested field accesses. For example, it is easy to see that

(x.f.f)^f_{{f : y → e}} = e ◁ (e ◁ x == y ▷ x.f) == y ▷ (e ◁ x == y ▷ x.f).f
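As a concrete reading of the operator, here is a small Java sketch (our own, with hypothetical node names; it is not taken from the paper) of an expression AST whose substField method implements exactly this rewriting: every access t.f is turned into the conditional d ◁ t == p ▷ t.f, after substituting inside t, which is the behaviour illustrated above.

```java
// Bornat-style field substitution e^f_{{f : p -> d}} over a tiny expression AST.
interface Expr {
    Expr substField(String f, Expr p, Expr d);
}

record Var(String name) implements Expr {
    public Expr substField(String f, Expr p, Expr d) { return this; }
    public String toString() { return name; }
}

record FieldAccess(Expr target, String field) implements Expr {
    public Expr substField(String f, Expr p, Expr d) {
        Expr t = target.substField(f, p, d);        // substitute inside the target first
        Expr access = new FieldAccess(t, field);
        if (!field.equals(f)) return access;        // a different field name is never aliased
        return new Cond(d, new Eq(t, p), access);   // d <| t == p |> t.f
    }
    public String toString() { return target + "." + field; }
}

record Eq(Expr left, Expr right) implements Expr {
    public Expr substField(String f, Expr p, Expr d) {
        return new Eq(left.substField(f, p, d), right.substField(f, p, d));
    }
    public String toString() { return "(" + left + " == " + right + ")"; }
}

// then <| guard |> els: "then" when the guard holds, "els" otherwise
record Cond(Expr then, Expr guard, Expr els) implements Expr {
    public Expr substField(String f, Expr p, Expr d) {
        return new Cond(then.substField(f, p, d), guard.substField(f, p, d),
                        els.substField(f, p, d));
    }
    public String toString() { return "(" + then + " <| " + guard + " |> " + els + ")"; }
}

public class FieldSubstDemo {
    public static void main(String[] args) {
        // Effect of the assignment x.f := e on the expression z.f:
        Expr zf = new FieldAccess(new Var("z"), "f");
        System.out.println(zf.substField("f", new Var("x"), new Var("e")));
        // prints (e <| (z == x) |> z.f): z.f becomes e exactly when z and x are aliased
    }
}
```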
Because our language has simultaneous assignments, we need to extend Bornat's operator to work with simultaneous substitutions of fields and variables. We start by defining some syntactic sugar. A case expression case e of e1 → d1, . . . , en → dn else d evaluates to di, where i is the least index such that e == ei, and to d if there is no such index. The order of the items in the case list therefore matters: the first matching branch is chosen. We use m to represent a case list e1 → d1, . . . , en → dn. We then extend the notion of conditional field by allowing case lists, so that e1.{f : m} is interpreted as case e1 of m else e1.f. We relax the notation to allow duplication of fields, but not of variables, in the list of substitution targets; in this case the corresponding conditional fields are concatenated into a single one. For example, e^{x,f,y,f,g}_{d1,{f:m1},d2,{f:m2},{g:m3}} (with f and g distinct) can be written as e^{x,f,y,g}_{d1,{f:m1,m2},d2,{g:m3}}. Note that in the conditional field {f : m1, m2} the cases in m1 have priority over those in m2, since the order in a case list is relevant. Our simultaneous substitution thus prioritizes the first listed field substitutions, which may appear arbitrary. However, in our laws all uses of simultaneous substitution are for disjoint assignments, i.e., guarded by assertions that ensure the left expressions reference distinct locations.
Field substitution can also be applied to left expressions. However, a slightly different notion is needed, because left values and right values must be treated differently. We write le /^f_{{f : m}} to denote a field substitution on the left expression le. The idea is that a field substitution applied to a left expression like x.f1.f2 . . . fn always ignores the last field, keeping it in the result, i.e., (le.g) /^f_{{f : m}} = (le^f_{{f : m}}).g, even when f and g are the same field name. The field substitution on the right-hand side of this equation is the one described above for expressions.
In this section we give an algebraic presentation of our imperative language. In general, our laws are either the same as, or generalizations of, those in LoP. Furthermore, as one might expect, the only laws that need to be generalized are those related to assignment. We also establish a few orthogonal laws concerned only with references, for example one that allows a left expression to be exchanged for another that is aliased with it. Our laws can be proved sound in a simple denotational model like that described in LoP. The main difference is that our program states include the heap as well as a valuation of the variables in scope. A command denotes a relation from initial states to outcomes, where an outcome is an ordinary state or the fictitious state ⊥ that represents both divergence and runtime error. For any initial state s, if the related outcomes include ⊥ then s must also relate to all states; otherwise, the image of s is finite and non-empty. In this semantics, refinement is simply inclusion of relations, and equations mean equality of denotations. The semantics is parameterized on an arbitrary allocator that, for any state, returns a finite non-empty set of unused references (so new is boundedly non-deterministic).
A shortcoming of this simple semantics is that it does not validate laws like x ← new K = x ← new K; x ← new K, for which equality of heaps is too strong. An appropriate notion of program equality considers heaps up to bijective renaming of references (as in [2]). In this paper we do not need such laws, because the normal form reduction preserves allocation order. Figure 2 lists some laws given in LoP that do not depend on references and therefore remain intact; an example is law (1): p, q, r := e1, e2, e3 = q, p, r := e2, e1, e3. For lack of space, we do not explain these laws in detail here. Readers not familiar with them are referred to [9].
Figure 2. Laws from LoP that do not depend on references.
The refinement relation ⊑ is defined by b ⊑ c = ((b ∪ c) = b). The set of programs with ⊑ is a semi-lattice, where ∪ is the meet and ⊥ the bottom, and all the command constructors are monotonic (indeed, continuous). Law (3) on recursion says that µ X • F(X) is a fixed point, and law (4) that it is the least fixed point.
We will be especially concerned with assertions that enable reasoning about aliasing. In Figure 3 we provide a set of laws that allow commands to be decorated with assertions. Some of these laws add information brought from the conditions of conditional and iteration commands. Others propagate assertions forward through commands, deducing them from assertions already known at the beginning. Assertions are defined as syntactic sugar, and all these laws are derived, except for law (22), which is proved directly in the semantics using the definition of alias.
Figure 3
[e]; c = c ◁ e ▷ ⊥ (10)
[d ∧ e] = [d]; [e] (11)
b ◁ e ▷ c = ([De]; b) ◁ e ▷ c (12)
b ◁ e ▷ c = b ◁ e ▷ ([De]; c) (13)
b ◁ e ▷ c = ([e]; b) ◁ e ▷ c (14)
b ◁ e ▷ c = b ◁ e ▷ ([not e]; c) (15)
[e]; (b ◁ d ▷ c) = ([e]; b) ◁ d ▷ ([e]; c) (16)
[e]; (b ∪ c) = ([e]; b) ∪ ([e]; c) (17)
e ∗ c = e ∗ ([De]; c) (18)
e ∗ c = (e ∗ c); [De] (19)
e ∗ c = e ∗ ([e]; c) (20)
e ∗ c = (e ∗ c); [not e] (21)
[alias[q, r]]; p, q := e, d = [alias[q, r]]; p, r := e, d (22)
If [d ∧ e]; c = [d ∧ e]; c; [d] then
  [d]; e ∗ c = [d]; e ∗ ([d]; c) (23)
  [d]; e ∗ c = [d]; (e ∗ c); [d] (24)
Laws for assertions.
Laws (10)-(21) for assertions should be familiar. Law (22) enables one alias to be replaced by another. The hypothesis for laws (23) and (24) may be better understood if we interpret it through partial correctness assertions (triples) from Hoare logic: an equation of the form [e1]; c = [e1]; c; [e2] says that, after executing c from a state satisfying e1, the trailing assertion [e2] must be equivalent to skip, which is the same as saying that e2 must hold. That is exactly the meaning intended for a triple {e1}c{e2} in Hoare logic (cf. [14]). Laws (23) and (24) state that any invariant is satisfied at the beginning of each iteration and at the end of the loop.
We give an additional law for assertions that expresses the axiom for assignment in Hoare logic. For this we need some definitions. A path is a left expression that does not use conditionals, i.e., a path is a variable x or a sequence of field accesses using the dot notation, like e.f.g, for example. Given an assignment to paths and a postcondition [d], law (25) states that the assertion obtained by applying to d the substitution induced by the assignment must be a precondition.
Many of the laws for assignment (Figure 4) are guarded by assertions that say the assigned locations are disjoint.
Figure 4
Laws for Assignment
Law (26) stipulates that attempting to evaluate the right-hand side outside its domain results in wholly arbitrary behaviour. We express this by prefixing the assignment with the assertion [De]. Because left expressions may also be undefined, we also have law (27). The definition of Dp is straightforward if, for this purpose, we regard left expressions as expressions. In Section 4.3 we determine when a field access e.f is defined. Law (28) characterizes the behaviour of simultaneous assignments to repeated left expressions as non-deterministic: in a simultaneous assignment, if the same left expression q simultaneously receives the values of expressions d1 and d2, a non-deterministic choice is made between the two values and one of them is assigned to q.
Law (29) is a generalization of a similar one given in LoP, but it is conceived to deal with non-determinism. It establishes that a simple assignment of the value of a left expression back to itself has no effect, when the expression is defined. We can add (or eliminate) such a dummy assignment to a simultaneous assignment whenever no non-determinism is introduced (or eliminated). Note that if p, q consists only of non-repeated variables then the assertion is trivially true, and law (29) becomes the same as the one established in LoP.
Law (31), when used from left to right, eliminates conditionals in left expressions. If we have a conditional left expression, we can pass the responsibility for checking the condition to a conditional command. This law is the counterpart of assignment law (2) in Figure 2, which does the same for conditional expressions on the right-hand side of assignments.
Law (34) below can be used together with (31) to transform any assignment into an equivalent one whose left expressions contain no conditionals. This transformation can be useful to enable the use of law (25).

(p ◁ e ▷ q).f = p.f ◁ e ▷ q.f   (34)
Law (32) states that when we have a disjoint assignment to paths followed by a conditional command, the assignment can be distributed rightward through the conditional, changing occurrences of the assigned paths in the condition to reflect the effects of the assignment. Note that the assertion ensures that the assignment is disjoint, and therefore the substitution applied to the condition d is well defined. Law (33) is our version of the law (*) already discussed in Subsection 3.1. This law permits merging two successive assignments to the same locations (variables or fields). The first conjunct in the assertion guarantees that the first assignment is disjoint, so the substitutions are well defined; the substituted forms of q and d denote, respectively, the left value of q and the value of d after the execution of the assignment p := e. The second conjunct in the assertion states that the left expressions p and q refer to the same locations, and thus the two assignments can be combined into a single one.
The new construct in our language is a command. It cannot be handled as an expression, because it alters the state of the program and our approach assumes side-effect-free expressions. The laws for new are given in Figure 5. Recall that there is a distinguished variable alloc which does not occur in commands except in assertions; its value is always the set of allocated references. Any attempt to access a field of a non-allocated reference is undefined. Thus, by definition, De.f = De ∧ e ∈ alloc ∧ f ∈ fields(e), where fields(e) = fields(type(e)) and type(e) returns the class name of e. Recall that our laws are in an implicit context cd of class declarations and a context Γ of variable types, so the type of e is statically determined.
Figure 5
if x does not occur in p, e, then
  x ← new K; p := e = p := e; x ← new K   (37)
if x does not occur in d, then
  x ← new K; (b ◁ d ▷ c) = (x ← new K; b) ◁ d ▷ (x ← new K; c)   (38)
Laws for new.
Law (35) determines that any field of a new object is initialized with the default value. The value of default(f) is 0, false or null, depending on the type of f declared in K. Law (36) establishes that x ← new K assigns to x a fresh reference of type K and adds it to alloc. Law (37) allows the order of a new followed by an assignment to be exchanged, provided the assignment does not depend on the newly created object. Finally, law (38) distributes a new command to the right, inside a conditional, provided the condition does not depend on the newly created object.
Our main result says that any command can be reduced to an equivalent one that simulates the original command by using a temporary variable representing explicitly the heap through copy semantics. The simulating program never accesses fields in the heap, neither for writing nor for reading. Instead, it just uses the explicit heap where any update is done by means of copy semantics. Reduction of a program to this form is used as a measure of the comprehensiveness of the proposed set of laws.
The explicit representation of the heap is given by a mapping from references to explicit objects and, in turn, every explicit object is represented by a mapping from field names to values. We assume there is a type, Object, of all object references, and a type, Value, of all primitive values (including null) and elements of Object. We also assume the existence of a type Heap whose values are mappings with the signature Object → (FieldName → Value). FieldName is another primitive type whose values are names of fields. We assume the expression language includes functional updates of finite maps; we use Z notation so h ⊕ {x → e} denotes the map like h but with x mapped to the value of e.
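As an illustration of this representation, the following Java sketch (our own, with hypothetical names; references are modelled as integers for simplicity) captures the type Heap and the functional update ⊕ used in the normal form: write returns a fresh map rather than mutating the old one, which is exactly copy semantics.

```java
import java.util.HashMap;
import java.util.Map;

// Heap maps references to objects, and each object maps field names to values.
final class Heap {
    private final Map<Integer, Map<String, Object>> cells;   // reference -> (field -> value)

    Heap(Map<Integer, Map<String, Object>> cells) { this.cells = cells; }

    static Heap empty() { return new Heap(new HashMap<>()); }

    // h(x)(f); accessing a non-allocated reference is undefined (fails here).
    Object read(int ref, String field) {
        return cells.get(ref).get(field);
    }

    // h ⊕ {x → (h(x) ⊕ {f → v})}: a fresh heap, the original is left untouched.
    Heap write(int ref, String field, Object value) {
        Map<Integer, Map<String, Object>> newCells = new HashMap<>(cells);
        Map<String, Object> obj = new HashMap<>(newCells.getOrDefault(ref, new HashMap<>()));
        obj.put(field, value);
        newCells.put(ref, obj);
        return new Heap(newCells);
    }
}
```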
The reduction is made in two stages. In the first stage, an arbitrary command is transformed (using the laws) into an equivalent one whose assignments are all disjoint and have paths as left expressions. The proof uses the following ideas. In order to have just paths as left expressions, we can systematically use assignment laws (34) and (31) to eliminate conditionals. For example, it is easy to prove

x.f, (y ◁ e1 ▷ y.g).f := e2, e3 = (x.f, y.f := e2, e3) ◁ e1 ▷ (x.f, y.g.f := e2, e3)   (†)
After the elimination of conditionals, we can systematically use the derived law stated below to transform all the assignments into disjoint ones. Roughly speaking, the second stage of the reduction transforms the command in the intermediate form (obtained from the first stage, with disjoint assignments) into an equivalent one that first loads the implicit heap into an explicit heap h : Heap, then simulates the original command always using h instead of object fields and finally, when it finishes, restores the contents of h back to the implicit heap, i.e., updates all the object fields according to their representation in h. The domain of h is the entire set of allocated references, i.e., alloc. To keep track of this domain we use variables mirroring alloc before and after the transformed command.
Following our example, suppose that e2 is z + x.f. The assignment x.f := e2 is simulated, using the explicit heap h, by

h := h ⊕ {x → (h(x) ⊕ {f → z + h(x)(f)})}

where h is updated by copying: the new mapping is equal to the original except for the value of h(x), which is updated accordingly. Note that h(x) represents the object referenced by x, and h(x)(f) represents the value of the field f of this object.
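Using the hypothetical Heap sketch above, this simulation step reads as follows (again only an illustration; the reference held by x is an arbitrary integer here).

```java
// Simulating x.f := z + x.f with the explicit heap: a pure copy-update of h,
// leaving every "real" object field untouched.
public class SimulateAssignment {
    public static void main(String[] args) {
        int x = 1;                                        // reference held by variable x
        int z = 5;
        Heap h = Heap.empty().write(x, "f", 10);          // initial heap: x.f == 10
        // h := h ⊕ {x → (h(x) ⊕ {f → z + h(x)(f)})}
        h = h.write(x, "f", z + (Integer) h.read(x, "f"));
        System.out.println(h.read(x, "f"));               // prints 15
    }
}
```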
For any c, we define a command S[c][h, al] that simulates c using the explicit heap h and the alloc mirror al. We also define a command load[h, al] that loads the contents of the objects into the explicit heap h, and a command store[h, al] that restores h back into the objects. In these terms we can state our main result. Because the variable h is free on the right-hand side, we need the load[h, al1] after c on the left-hand side; this accords with the standard interpretation that free variables are the input and output of commands. The variables al0 and al1 can be interpreted as holding the value of alloc at the points of the assertions. An alternative would be to introduce h, al0 and al1 on the right-hand side as local variables; then the assertions could be removed from both sides. But, for brevity, in this paper we omit local variable blocks.
We have established a comprehensive algebraic presentation for programs that deal with references in the way Java does. Taking as a basis the classical work in LoP [9], we explored the incrementality of algebra by generalizing laws of assignment and formulating others. Our normal form theorem gives a sense in which the laws are complete. Perhaps a more standard notion of completeness is whether every semantically true equation is provable. What we know is that our denotational model is not fully abstract with respect to contextual equivalence, owing to the missing laws about new mentioned in Section 4. For our first-order language those laws can be validated by quotienting modulo renaming, e.g., by using FM-sets, though for higher-order programs that would be unsatisfactory [21].

In [20] some laws are presented as an initial attempt to characterize references. However, they are not comprehensive; the law for combining assignments does not consider the possibility of aliasing on left expressions. Furthermore, the laws depend on provisos that are hard to verify statically. Other initiatives, like abstract separation algebras [11], do not deal with a concrete notation for the manipulation of fields. Transformations used in compilers are conditioned on alias analyses that are not expressed algebraically. Staton proves Hilbert-Post completeness for an equational theory of ML-style references. That language lacks field update and null (and includes higher-order functions and an unusual operator ref n); its laws capture commutativity of allocations and aspects of locality, but the step from this work to handling refactorings in object-oriented programs is a big one.
A fundamental notion for our laws is field substitution. Inspired by the works of Bornat [7] and Morris [17] on program correctness, we adapt Bornat's definition for our language by extending it to deal with simultaneous assignments.
The problem approached in this paper is related to the difficulty of defining algebraic theories like those in [9,16,1] for a language with references. Like [7], we only tackle the problem of pointer aliasing. Other kinds of aliasing, like parameter aliasing or aliasing through higher-order references, are outside the scope of our work. For higher-order programs, more complicated machinery is needed, e.g., content quantification as proposed in [5].
Another difficulty not addressed in this paper is caused by aliasing in combination with recursive data structures. This usually requires dealing with assertions containing inductive formulas, and reasoning with substitution and aliasing can explode into numerous conditionals [7]. Note, however, that the assertions in our laws use no inductive predicates, only boolean expressions involving pointer equalities. We intend to provide mechanisms for local reasoning, such as those proposed in [19,7,13,3], in an extension of our theory dealing with a more expressive object-oriented language. In particular, we would like to connect the present work with that of Silva et al. [18], where denotational semantics is used to prove refactoring laws from which others are derived. The goal is algebraic proof of refactoring laws based ultimately on basic laws like our extension of LoP, just as the works [6,8] do for object-oriented programs with copy semantics. In future work we will explore other laws for new, seeking to avoid the use of the implicit variable alloc in assertions.