Algebraic Reasoning About Timeliness

Peter Van Roy

Algebraic Reasoning About Timeliness

Peter Van Roy

Electronic Proceedings in Theoretical Computer Science

visibility

…

description

20 pages

link

1 file

Designing distributed systems to have predictable performance under high load is difficult because of resource exhaustion, non-linearity, and stochastic behaviour. Timeliness, i.e., delivering results within defined time bounds, is a central aspect of predictable performance. In this paper, we focus on timeliness using the ∆Q Systems Development paradigm (∆QSD, developed by PNSol), which computes timeliness by modelling systems observationally using so-called outcome expressions. An outcome expression is a compositional definition of a system's observed behaviour in terms of its basic operations. Given the behaviour of the basic operations, ∆QSD efficiently computes the stochastic behaviour of the whole system including its timeliness. This paper formally proves useful algebraic properties of outcome expressions w.r.t. timeliness. We prove the different algebraic structures the set of outcome expressions form with the different ∆QSD operators and demonstrate why those operators do not form richer structures. We prove or disprove the set of all possible distributivity results on outcome expressions. On our way for disproving 8 of those distributivity results, we develop a technique called properisation, which gives rise to the first body of maths for improper random variables. Finally, we also prove 14 equivalences that have been used in the past in the practice of ∆QSD. An immediate benefit is rewrite rules that can be used for design exploration under established timeliness equivalence. This work is part of an ongoing project to disseminate and build tool support for ∆QSD. The ability to rewrite outcome expressions is essential for efficient tool support. Algebraic Reasoning About Timeliness This paper defines and proves algebraic properties of the ∆QSD operators w.r.t. timeliness, i.e., delivering outcomes within the acceptable time-frames. In this paper, our sole resource of concern is time, although ∆QSD includes other types of resources and their interaction. This theoretical work is part of an ongoing project to disseminate and build tool support for ∆QSD, to make it available to the wide community of system engineers. We base our work on the ∆QSD formalisation given by Haeri et al. [11], which defines outcome expressions and their semantics, and gives a real-world example of ∆QSD taken from the blockchain domain. Contributions This paper gives a firm mathematical foundation for ∆QSD, and uses this to establish important algebraic properties of the ∆QSD operators with respect to timeliness, i.e., when the relevant resource is time. This paper is based on a general model theory of resource analysis for systems specified using outcome expressions [12]. That model theory is the first of its kind and we specialise it using the timeliness analysis recipe that is commonly used in ∆QSD (Definition 3). • We show that the set of outcome expressions forms different algebraic structures with the different ∆QSD operators (Theorems 1-4). • We establish 3 distributivity results in Section 7 about the ∆QSD operators (Theorem 6). • We rule out the formation of certain richer algebraic structures by the set of outcome expressions and the current ∆QSD operators (Remarks 2, 3, and 4).

Algebraic Reasoning About Timeliness Seyed Hossein HAERI Peter W. THOMPSON Peter VAN ROY IOG, Belgium PNSol, UK Université catholique de Louvain, Belgium University of Bergen, Norway Peter.Thompson@pnsol.com pvr@info.ucl.ac.be hossein.haeri@iohk.io Magne HAVERAAEN Neil J. DAVIES Mikhail BARASH University of Bergen, Norway PNSol, UK University of Bergen, Norway Magne.Haveraaen@uib.no Neil.Davies@pnsol.com mikhail.barash@uib.no Kevin HAMMOND James CHAPMAN IOG, UK IOG, UK kevin.hammond@iohk.io james.chapman@iohk.io Designing distributed systems to have predictable performance under high load is difficult because of resource exhaustion, non-linearity, and stochastic behaviour. Timeliness, i.e., delivering results within defined time bounds, is a central aspect of predictable performance. In this paper, we focus on timeliness using the ∆Q Systems Development paradigm (∆QSD, developed by PNSol), which computes timeliness by modelling systems observationally using so-called outcome expressions. An outcome expression is a compositional definition of a system’s observed behaviour in terms of its basic operations. Given the behaviour of the basic operations, ∆QSD efficiently computes the stochastic behaviour of the whole system including its timeliness. This paper formally proves useful algebraic properties of outcome expressions w.r.t. timeliness. We prove the different algebraic structures the set of outcome expressions form with the different ∆QSD operators and demonstrate why those operators do not form richer structures. We prove or disprove the set of all possible distributivity results on outcome expressions. On our way for disproving 8 of those distributivity results, we develop a technique called properisation, which gives rise to the first body of maths for improper random variables. Finally, we also prove 14 equivalences that have been used in the past in the practice of ∆QSD. An immediate benefit is rewrite rules that can be used for design exploration under established timeliness equivalence. This work is part of an ongoing project to disseminate and build tool support for ∆QSD. The ability to rewrite outcome expressions is essential for efficient tool support. 1 Introduction Designing distributed systems to have predictable performance under high load is difficult. At high load, resources such as network, memory, storage, or CPU capacity will be exhausted, causing a dramatic effect on performance. Prediction is difficult because the behaviour of system components and their interactions are both nonlinear and stochastic. For over 20 years, a small group of people associated with the company PNSol has worked on diagnosing and designing systems to predict and correct performance problems [17]. PNSol has developed the ∆Q Systems Development paradigm (∆QSD) as part of this work. ∆QSD has been used in areas as diverse as telecommunications [20] [19] [6], WiFi [14], and distributed ledgers [5]. ∆QSD has been applied to many large industrial systems, with clients including BT, Vodafone, Boeing Space and Defence, and IOG (formerly IOHK). C. Aubert, C. Di Giusto, S. Fowler & L. Safina (Eds.): 16th Interaction and Concurrency Experience (ICE 2023) EPTCS 383, 2023, pp. 35–54, doi:10.4204/EPTCS.383.3 © Haeri et al. This work is licensed under the Creative Commons Attribution License. 36 Algebraic Reasoning About Timeliness This paper defines and proves algebraic properties of the ∆QSD operators w.r.t. timeliness, i.e., delivering outcomes within the acceptable time-frames. In this paper, our sole resource of concern is time, although ∆QSD includes other types of resources and their interaction. This theoretical work is part of an ongoing project to disseminate and build tool support for ∆QSD, to make it available to the wide community of system engineers. We base our work on the ∆QSD formalisation given by Haeri et al. [11], which defines outcome expressions and their semantics, and gives a real-world example of ∆QSD taken from the blockchain domain. Contributions This paper gives a firm mathematical foundation for ∆QSD, and uses this to establish important algebraic properties of the ∆QSD operators with respect to timeliness, i.e., when the relevant resource is time. This paper is based on a general model theory of resource analysis for systems specified using outcome expressions [12]. That model theory is the first of its kind and we specialise it using the timeliness analysis recipe that is commonly used in ∆QSD (Definition 3). • We show that the set of outcome expressions forms different algebraic structures with the different ∆QSD operators (Theorems 1–4). • We establish 3 distributivity results in Section 7 about the ∆QSD operators (Theorem 6). • We rule out the formation of certain richer algebraic structures by the set of outcome expressions and the current ∆QSD operators (Remarks 2, 3, and 4). • We develop two new techniques for analysing the validity of algebraic equivalences: a new technique that we call Properisation (Section 8) and another based on counterexamples (Section 7.2). We use those techniques to refute the remaining possible distributivity results in their full generality: 8 using properisation (Theorem 9) and 4 using counterexamples (Theorem 7). From a mathematical viewpoint, properisation is an important contribution of ours. As far as we know, properisation is the first body of maths developed for improper random variables [21]. • We provide guidelines for studying the necessary/sufficient conditions for the distributivity results we refute the generality of (Section 7.1). • We establish 14 equivalences that have been used in the past in the practice of ∆QSD (Section 6). Full proofs can be found in the accompanying technical report [12], which also shows how Fig. 2 can be further elaborated using code running in a Jupyter notebook. The primary practical results of this paper are to establish distributive properties of ∆QSD operators and other equivalences that are useful for rewriting outcome expressions. These enable common subexpressions to be moved, for example, to reduce representational complexity, with associated gains in tool performance. Rewriting can also be used to produce normal forms, and, in particular, to extract reliability/failure probabilities without fully evaluating the outcome expression. More generally, it can be used to establish equivalences between different designs with respect to their timeliness, even though their usage of other resources might differ, thereby allowing design exploration under equivalence. 2 Motivating Example: Cache Memory We give an example of a memory system consisting of a local cache with a remote main memory. This example serves two purposes: First, it shows how outcome diagrams can be used to model nontrivial Haeri et al. 37 read mread miss Cache return Main memory Network hit Timeout mreturn Figure 1: Block Diagram for a Cache with Networked Main Memory read ⇋ hit [95%] [5%] miss c-miss return c-hit ∃ mread net ⇋ [1 - 10-16] mreturn main net ECC fail t-out [10-16] Figure 2: Outcome Diagram for the Cache of Figure 1 systems. Second, it shows the usefulness of the algebraic transformations of this paper. We give the block diagram and the outcome diagram for this example. We show how to rewrite (a simplified version of) the outcome diagram to swiftly compute failure (and, hence, success) rate of this design, giving it a ‘back of an envelope’ feasibility test. As we revisit this example later on, we will see that all this is possible because of the algebraic results proved in this paper. Fig. 1 gives the block diagram of the memory system. A read message enters the cache; a cache hit – when the memory word is in the cache – results in an immediate return message; a cache miss – when the memory word is not in the cache – results in a main memory read. The main memory is across a network, so accessing it requires communication in both directions. Main memory access is guarded by a timeout in case of communication failure. The cache miss initialises the timeout timer; the mreturn message is passed through if it occurs before the timeout; otherwise, a timeout message is passed instead. Furthermore, there is a small probability that the remote main memory read fails. Outcome Diagram for the Cache with Networked Memory Fig. 2 shows the outcome diagram for the memory system. We can define an outcome as what the system obtains by performing one of its tasks. Outcomes are shown using orange circles in the outcome diagrams. When there is a left-to-right path from one outcome to another, the right one is causally dependent on the left one. Small square boxes show the starting and terminating sets of events of the corresponding outcomes. Large square boxes are operators. In Fig. 2 there are two probabilistic choices, “⇋”, and one first-to-finish synchronisation, “∃”. We assume that the cache hit rate is 95%. That is modelled using the leftmost probabilistic choice with two paths, one to each outcome (“cache hit” and “cache miss”), decorated with their corresponding probabilities. Timeout is modelled by a first-to-finish relationship between the main memory read and the timer. We assume that the main memory uses Error-Correction Codes (ECC) to catch bit errors, but nevertheless account for the possibility that a main memory access fails (e.g. because of hardware 38 Algebraic Reasoning About Timeliness failure) by giving it a failure rate of 10−16 . This assumption is modelled in Fig. 2 as a probabilistic choice between the “main” and “ECC fail” outcomes. Failure Rate Let us now compute the failure rate by doing algebraic transformations as defined [5%] in this paper. Without loss of generality, we can c-miss assume that the network has zero delay and the main timeout is infinite. One can then simplify Figure 2 to the outcome diagram at the top of Figure 3. In that diagram, the ECC failure is hidden in the failure rate assigned to main in the timeliness analysis of the diagram. However, as we will prove in c-hit ⇋ [95%] this paper, one can also properise main and explic[5%] itly demonstrate that failure rate as a probabilistic c-miss ⇋ [1 - 10 ] main choice, whilst retaining the level of timeliness. [10 ] The result of that properisation is shown in the ⊥ second diagram from the top, where “⊥” represents (unconditional) failure. According to the developments of this paper, one can rewrite the second diagram from the top to the third and then to the bottom one, again, whilst retaining the level of timeliness. What c-hit ⇋ [95%] is important about the bottom diagram of Fig. 3 [5%] is that it comprises of a probabilistic choice bemain ⇋ [1 - 10 ] c-miss tween failure and everything else in the diagram. [10 ] As will be proved later, for some p, and for q = ⊥ (1−0.05×10−16 ) = 0.999999999999999995, we have swiftly obtained the failure rate. Those numbers immediately tell the system engineer that, under the current assumptions about cache hit and main memory failure rates, every implementation will be infeasible if the overall success rate must ⊥ be greater than q. [p] The techniques used for this example generc-hit ⇋ [q] ⇋ alise in a straightforward fashion to any system modelled using an outcome diagram. c-miss main In the remainder of this paper, Examples 1–6 will come back to the developments of this section Figure 3: Steps for Swiftly Calculating the Failure by supplying syntax, semantics, timeliness analyRate sis, and authorising the rewrite steps taken here. [95%] c-hit Properisation ⇋ -16 = -16 -16 = -16 Closing Remarks on the Example Realistic cache memories are often more complex than this example, which gives rise to more complicated outcome diagrams in which “⊥” will appear at multiple depths. Thanks to the results we prove in this paper, techniques such as that of this section can be used to accumulate those ⊥s. Haeri et al. 39 While the probabilities in this example may seem small, they can combine with probabilities from other parts of the system, and it is important to be able to keep track of them. Dismissing them as ‘minimal’ risks missing potentially serious failures when many ‘small’ probabilities aggregate. 3 Background Figure 4: A Component’s Operation and its Cumulative Delay Function Percentage Outcome and Quality Attenuation Consider a component C which inputs message min and outputs message mout after a delay d. Doing this many times will usually give different delays. We define a cumulative delay function so that p percent of delays are less or equal to d. Figure 4 gives an illustration. The ∆QSD paradigm generalises this simple measurement. We measure delay not only for messages, but for all system behaviours that have a starting event and a terminat(a) Failure is modelled as a quality attenuation ing event. Given a starting event ein and a terminating event whose limit is less than 1. eout , what the system gains within the (ein , eout ) time frame ∆obs ≤ ∆req Intangible mass is called an instance of an outcome. We also generalise the 100 property that we measure: we measure not only delay, but 80 any property that makes the system less than perfect. The cu∆req ∆obs mulative distribution function of the property is then called 60 a quality attenuation and is denoted by a ∆Q. In what fol40 lows, we will consistently use the terms outcome and quality attenuation. 20 0 2 4 6 8 10 12 14 16 Failure It is straightforward to generalise the quality attenDelay uation to model both delay and failure. It suffices to allow the cumulative delay function’s limit to be less than 1. Figure (b) Timeliness: ∆Qobs (the observed quality 5a illustrates this possibility. There is an f percent probabil- attenuation ∆Q) is always to the left and above ity that the delay is infinite, which corresponds precisely to a ∆Qreq (the required ∆Q). failure. For the component, it means simply that there is an Figure 5: Failure and Timeliness 40 Algebraic Reasoning About Timeliness input message min with no corresponding output message mout . Mathematically, the delay is modelled by a random variable that is allowed to be improper: The probability that it is infinite can be greater than 0. This probability is called the intangible mass of the Improper Random Variable (IRV) [21]. The ability to model delay and failure as a single quantity is a key strength of ∆QSD. It makes it easy to explore trade-offs between delay and failure in the system design. This ability shows up clearly in the algebra presented in this paper. Timeliness We define timeliness as a relation (defined in [20]) between an observed ∆Qobs and a required ∆Qreq . We say that the system satisfies timeliness for a given outcome if ∆Qobs ≤ ∆Qreq . Figure 5b illustrates this condition. Outcome Expressions For a system consisting of multiple interconnected components, one can define a graph that combines all the components’ outcomes. This graph defines the causal relationships between the outcomes and is called an outcome diagram. Each outcome diagram has a corresponding outcome expression – a mathematical description of the diagram 1 . Given an outcome expression and the quality attenuations of all its components, it is possible to compute the quality attenuation of the complete system. The reverse process can also be fruitful: given an outcome expression and the required quality attenuation of the complete system, one can estimate the required quality attenuations of its components. This gives the system designer a powerful tool for both design and diagnosis. Outcome expressions can be manipulated according to algebraic rules, in particular those presented in this paper, which are useful to system designers using ∆QSD. As part of an ongoing project, we are building software tools to support ∆QSD, which can use the algebraic rules presented here for symbolic manipulation of outcome expressions. ∆QSD ∆QSD is a systems development paradigm that is able to compute many system properties early on in the design process, such as performance (latency and throughput), timeliness, resource consumption, risks, and feasibility. ∆QSD is used both for diagnosis and design: • System Diagnosis. ∆QSD can analyse an existing system, to pinpoint anomalous behaviours so their origin can be found and the system can be corrected. • System Design. ∆QSD can estimate performance trade-offs during the design process. At every step of the design process, performance of the complete system can be estimated by a computation on the partial design. This computation also determines whether or not the system is feasible, i.e., whether it can or cannot meet the requirements. While historically ∆QSD has primarily been used to diagnose and correct problems in large industrial systems, PNSol has recently used ∆QSD to design the Shelley block diffusion algorithm as used in the Cardano blockchain [11]. More information on ∆QSD can be found in a tutorial given at HiPEAC 2023 [22]. 1 In this paper, we take the equivalence between the outcome expressions and outcome diagrams for granted. That equivalence is not the focus of this paper. Haeri et al. 4 4.1 41 An Algebraic Perspective on Timeliness Syntax of Outcome Expressions Definition 1 (Haeri et al. [11]). Assume a set B of primitive outcomes. We use variables β ∈ B to represent individual primitive outcomes. We define the abstract syntax of outcome expressions as follows: O ∋ o ::= β primitive outcome | o •→ −• o′ sequential composition | (o ∥∀ o′ ) all-to-finish (a.k.a. last-to-finish) m | o ⇋′ o′ probabilistic choice | (o ∥∃ o′ ) any-to-finish (a.k.a. first-to-finish). m This defines outcome expressions as combinations of primitive outcomes β and four composition operators. In the case of probabilistic choice, m and m′ are numeric weights which give the probabilities of choosing the left or right alternative, respectively. For convenience, we also introduce another notation [p] ′ o⇋ o where the probability (1 − p) for the right alternative is implied. We distinguish two constant outcomes: ⊤ for “perfection” and ⊥ for “unconditional failure.” Note that the operator “∃" in the outcome diagrams is “∥∃ " in the outcome expressions. That is to signify that when two outcomes are connected by first-to-finish, they are performed concurrently; hence the “∥" sign. One need not emphasise that concurrency in the outcome diagrams because our left-to-right directional convention on causal dependency already implies concurrency when forking off an “∃” in the outcome diagrams. Similarly, for “∥∀ ” in the outcome expressions, the sign in the outcome diagrams is simply “∀”. Example 1. Getting back to our motivating example, we can now transcribe the outcome diagram of Fig. 2 into an outcome expression: −16 ] c-hit [95%] −• ((net •→ −• (main [1−10 ⇋ (c-miss •→ ⇋ We will use this outcome expression in further examples. 4.2 ⊥) •→ −• net) ∥∃ t-out)). (1) □ Timeliness Semantics for Outcome Expressions Let ∆Q(x) denote the probability that an outcome occurs in a time t ≤ x. In order to represent both delay and failure in a single quantity, a ∆Q is represented by an improper random variable (IRV), allowing the total probability not to reach 100% [21]. The intangible mass of such an IRV is ℑ(∆Q) = 1 − limx→∞ ∆Q(x). For a given ∆Q, the intangible mass ℑ(∆Q) encodes the probability of exceptions or failure occurring. Denote the set I of all IRVs that are differentiable and the values of which are always greater than or equal to zero. Statistically speaking, every ι ∈ I can be represented both using its Probability Density Function (PDF) or its Cumulative Distribution Function (CDF), where the former is the derivative of the latter. For convenience, we will freely switch between the two representations as the need rises. Fix a countable set of ∆Q variables ∆v . We define ∆ = ∆v ∪ I to denote both IRVs and ∆Q variables. When δ ∈ ∆ is in its CDF representation, we write δ ′ for its derivative, which is the PDF representation. We first define a mapping between primitive outcomes B and ∆Qs. Definition 2. We call a function ∆◦ [[.]] : B → ∆ a basic assignment when ∆◦ [[⊤]] = 1 and ∆◦ [[⊥]] = 0, where 1 and 0 are the functions always returning the constants 1 and 0, respectively. 42 Algebraic Reasoning About Timeliness Example 2. Every timeliness analysis of Fig. 2 à la ∆QSD needs a basic assignment that at least has mappings for the five individual outcomes in Equation (1) (namely, c-hit, c-miss, t-out, net, and main) so that ∆Qc-hit , ∆Qc-miss , ∆Qt-out , ∆Qnet , and ∆Qmain , are known initially. That is generally possible because: the first two are properties of the cache;the timeout is chosen by the designer; the network performance is known; and the main memory read time (and failure rate) is also known. □ We now define the semantics of an outcome expression as a mapping between the outcome expression and an IRV, for a given basic assignment. Definition 3 (Haeri et al. [11]). Given a basic assignment ∆◦ [[.]] : B → ∆, define ∆Q[[.]]∆◦ : O → I such that 1 when ∆◦ [[β ]] ∈ /I ∆Q[[β ]]∆◦ = ∆◦ [[β ]] otherwise ∆Q[[o •→ −• o′ ]]∆◦ = ∆Q[[o]]∆◦ ∗ ∆Q[[o′ ]]∆◦ m m′ m ′ ∆Q[[o ⇋′ o′ ]]∆◦ = m+m ′ ∆Q[[o]]∆◦ + m+m′ ∆Q[[o ]]∆◦ m ∀ ′ ′ ∆Q[[o ∥ o ]]∆◦ = ∆Q[[o]]∆◦ × ∆Q[[o ]]∆◦ ∆Q[[o ∥∃ o′ ]]∆◦ = ∆Q[[o]]∆◦ + ∆Q[[o′ ]]∆◦ − ∆Q[[o]]∆◦ × ∆Q[[o′ ]]∆◦ Here, the notation ∗ denotes the convolution of two ∆Qs. In the above formulae, the random variables are always represented using their CDFs except for sequential composition, where the representation is PDFs on both sides. Note that the PDF of ⊤ is the Dirac δ function. In what follows, we will drop ∆◦ whenever the basic assignment is fixed throughout a computation. One way to interpret Definition 3 is that ∆Q[[.]]∆◦ is a homomorphism from the term algebra of outcome expressions O to an algebra of probability distributions I. Remark 1. Note that, according to Definition 3, we get ∆Q[[o1 •→ −• o2 ]] = ∆Q[[o2 •→ −• o1 ]]. This may seem counter-intuitive because o1 •→ −• o2 ̸= o2 •→ −• o1 . ∆Q[[o1 •→ −• o2 ]] = ∆Q[[o2 •→ −• o1 ]] is, nonetheless, valid because, intuitively, o1 •→ −• o2 is just as timely as o2 •→ −• o1 . See the proof of Theorem 2 [12] for the □ mathematical justification of that intuition. 4.3 Motivating Example: Timeliness Analysis Example 3. Given the developments of Example 2, here is how to work out the quality attenuation −16 ] of Fig. 2 using Definition 3: Take mem = net •→ −• (main [1−10 −• net to be the outcome of the ⇋ ⊥) •→ networked main memory read. We start by computing ∆Qmem : ∆Qmem = ∆Qnet ∗ ((1 − 10−16 ) × ∆Qmain + 10−16 × ∆Q⊥ ) ∗ ∆Qnet (2) which, because ∆Q⊥ = 0, we can simplify to: ∆Qmem = ∆Qnet ∗ (1 − 10−16 ) × ∆Qmain ∗ ∆Qnet (3) The overall ∆Qobs is then given by: ∆Qobs = 0.95 × ∆Qc-hit + 0.05 × (∆Qc-miss ∗ (∆Qmem + ∆Qt-out − ∆Qmem × ∆Qt-out )). (4) This computation gives us the CDF for the execution time of a memory read. The numeric computation is easily performed by a software tool. For readers interested in seeing fully worked-out numerical examples, we recommend looking up the tutorial [22]. □ Haeri et al. 43 Recall that, in Section 3, we defined timeliness as ∆Qobs ≤ ∆Qreq (this relation is a partial order, defined in [11]). Definition 3 gives this more context. Using Definition 3, the systems engineer can work out the ∆Qobs of an outcome so they can compare the result against the required ∆Qreq . Example 4. Given the developments of Example 3, we can now get back to the plots in Fig. 5b. Taking the blue plot for ∆Qreq , the cache outcome diagrams developed in Section 2 are timely so long as 50% of the queries submitted to the cache can be handled within 5 units of time, 95% of them in 10 units, and 97% in 15 units. Furthermore, the cache is fine to drop 3% of the queries.2 Taking the black plot in Fig. 5b as that of Equation (4) after insertion of real numbers, our designed cache is timely enough because it always handles the queries within the acceptable time frame and drops less queries than the acceptable maximum. In other words, it has less delay and less failure rate. Visually, that amounts to the black plot always being to the left and above the blue plot. □ 4.4 Connecting Algebra to Timeliness In our accompanying technical report [12], we give a model theoretic formulation for studying the algebraic properties of resource consumption. This paper focuses on time as its sole resource of interest and uses that formulation for time exclusively without getting into the technical details of the formulation itself. An algebraic structure often consists of a carrier set, a few operations on the carrier set, and a finite set of identities that those operations need to satisfy. Given our focus on timeliness à la ∆QSD, the carrier set will always be O in this paper. The full set of operators on O is {•→ −•, ∥∀ , ∥∃ , ⇋}. However, most algebraic structures do not need all those operators. Different structures work with different number of operations; for example, a monoid works with only one operation; whilst a group works with two. Finally, the identities are of the form ol = or . We take ∆Q[[.]] (Definition 3) to be the model of time consumption for O. We write • ⊙⊙ time ⊨ ol = or when ∆Q[[ol ]] = ∆Q[[or ]]. That is when ol and or are as timely. • ⊙⊙ time ⊨ (O, P) : s for an algebraic structure s and a set of ∆QSD operators P when ⊙⊙ time ⊨ ol = or , for every equation ol = or – that is constructed using the operators in P, and – that is required for the formation of s. With time being our solo resource of interest in this paper, we will drop the initial “⊙⊙ time ⊨” from the above formulation hereafter. 5 Algebraic Structures This section establishes several important properties on O: • probabilistic choice forms a magma (Theorem 1); • sequential composition forms a commutative monoid with ⊤ and ⊥ as the identity and absorbing elements (Theorem 2); • all-to-finish forms a commutative monoid with ⊤ and ⊥ as the identity and absorbing elements (Theorem 3); 2 This is an instance of the “better-than” part of the Quantitative Timeliness Agreement (QTA) [20]. 44 Algebraic Reasoning About Timeliness • any-to-finish forms a commutative monoid with ⊥ and ⊤ as the identity and absorbing elements (Theorem 4); and • neither all-to-finish nor any-to-finish nor their combination form the familiar richer algebraic structures (Remarks 2, 3, and 4). Theorem 1. (O, ⇋) forms a magma when observing time. A magma is the weakest algebraic structure. That is because ⇋ is not even associative. Despite this, expressions containing two consecutive occurrences of ⇋ can still be re-associated. However, in this case the coefficients will change. Lemmas 2 and 3 give the exact formulae. Theorem 2. ⊙⊙ time ⊨ (O, •→ −•) : f orms a commutative monoid with ⊤ and ⊥ as the identity and absorbing elements, respectively. Theorem 3. ⊙⊙ time ⊨ (O, ∥∀ ) : f orms a commutative monoid with ⊤ and ⊥ as the identity and absorbing elements, respectively. Remark 2. It is important to notice that, when observing time, (O, ∥∀ ) does not form a group. That is because, in general, an outcome has no inverse element - intuitively, one can never undo an outcome! In order to prove that claim formally, suppose otherwise. That is, suppose that there exist a pair of outcomes o1 and o2 such that o1 ∥∀ o2 = ⊤. Then, ∆Q[[o1 ∥∀ o2 ]] = ∆Q[[⊤]] which implies δ1 × δ2 = 1 ⇒ δ2 = δ11 . However, given that δ1 ≤ 1, we get δ2 ≥ 1. The latter inequality can only be satisfied when □ o1 = ⊤. Restricting the application of ∆QSD to perfection is not practical. Theorem 4. ⊙⊙ time ⊨ (O, ∥∃ ) : f orms a commutative monoid with ⊥ and ⊤ as the identity and absorbing elements, respectively. Remark 3. Similar to the case for ∥∀ , it is important to note that, when observing time, (O, ∥∃ ) does not form a group. Again, it is the lack of an inverse element that is causing the trouble. Following our previous result, suppose that there exist a pair of outcomes o1 and o2 such that o1 ∥∃ o2 = ⊥. Then, 1 ∆Q[[o1 ∥∃ o2 ]] = ∆Q[[⊥]] which implies δ1 + δ2 − δ1 × δ2 = 0 ⇒ δ2 = δ1δ−1 . However, because δ1 ≤ 1, we get δ2 ≤ 0. But, only ⊥ can satisfy the latter inequality. There is no reason to develop a system in which □ all the outcomes will fail unconditionally! Having established that both (O, ∥∀ ) and (O, ∥∃ ) form commutative monoids for time, a natural question is whether (O, ∥∀ , ∥∃ ) or (O, ∥∃ , ∥∀ ) form semi-rings. This is not the case, since they do not distribute over one another. Lemma 1 helps Remark 4 show how the desirable distributivities fail. Lemma 1. ⊙⊙ time ⊨ o1 ∥∃ o2 = ⊤ implies o1 = ⊤ and o2 = ⊤. Remark 4. Neither (O, ∥∀ , ∥∃ ) nor (O, ∥∃ , ∥∀ ) form a semi-ring when observing time: for this to be the case, ∥∀ and ∥∃ would need to distribute over one another. The first distributivity requirement is: ? o1 ∥∃ (o2 ∥∀ o3 ) = (o1 ∥∃ o2 ) ∥∀ (o1 ∥∃ o3 ) (5) Equating ∆Q[[.]]s of the two sides, one eventually makes it to the requirement that either δ1 = 0 or ∆Q[[(o1 ∥∃ o3 ) ∥∃ o2 ]] = ⊤. In other words, it follows by Lemma 1 that Equation (5) can only hold under the trivial conditions when either o1 = ⊥ or o1 = o2 = o3 = ⊤. The second distributivity requirement is ? o1 ∥∀ (o2 ∥∃ o3 ) = (o1 ∥∀ o2 ) ∥∃ (o1 ∥∀ o3 ) (6) Again, equating ∆Q[[.]]s of the two sides, one eventually comes to observe that Equation (6) only holds □ when δ1 = 1 ∧ δ2 ̸= 0 ∧ δ3 ̸= 0, i.e., when o1 = ⊤ ∧ o2 ̸= ⊥ ∧ o3 ̸= ⊥. Haeri et al. ⊥⇋⊥ = ⊥ ⊥ •→ −• o = ⊥ ⊤∥∀ o = o ⊥∥∃ o = o 45 −• o2 = (o1 •→ −• o2 ) ⇋ ⊥ (o1 ⇋ ⊥) •→ o1 •→ −• (o2 ⇋ ⊥) = (o1 •→ −• o2 ) ⇋ ⊥ −• o2 = (o1 •→ −• oh2 ) ⇋ o2i (o1 ⇋ ⊤) •→ [p] [q] o1 ⇋ (o2 ⇋ ⊤) = o2 [q(1−p)] ⇋ (o1 p 1−q(1−p) ⇋ o •→ −• ⊥ = ⊥ ⊤⇋⊤ = ⊤ ⊤ •→ −• o = o o •→ −• ⊤ = o −• o2 ) ⇋ o1 o1 •→ −• (o2 ⇋ ⊤) = (o1 •→ [p] [q] ⊥⇋ (⊥ ⇋ o) = ⊥ [p+(1−p)q] ⇋ o ⊤) Figure 6: Equivalences Containing ⊤ and ⊥ 6 Equivalences Containing Constant Outcomes ∆QSD is already in use by its practitioners, who, amongst other usages, simplify outcome expressions according to their timeliness analysis. In particular, Figure 6 distils a list of equivalences that are used in such simplifications. Those equivalences all contain constant outcomes (⊤ or ⊥). Equivalences of Figure 6 provide the basis for rewrite rules that are useful for construction of normal forms, such as expressing a given system as a convolution of probabilistic choices or a probabilistic choice of convolutions. Such rewriting allows for: extraction of common sub-expressions permitting aggregation of failure rates (distinguishing between conditional and non-conditional failure); identifying minimal delays; and highlighting branching probabilities to identify issues of relative criticality. This is useful for quickly assessing whether a particular outcome decomposition is feasible without having to compute the complete ∆Q. See Section 2, for example. In addition, the equivalences of Figure 6 are very handy in the proofs of properties such as those established in this paper. Two examples, amongst many, are the proofs of Theorem 7 and Lemma 5. Before we delve into Figure 6, we prove a result about re-associating probabilistic choice. Given an expression with two consecutive probabilistic choices, one of which wrapped inside a pair of parentheses, the ∆QSD practitioner might be interested in wrapping the other two inside a pair of parentheses – reassociating the probabilistic choices, in effect. Lemmata 2 and 3 give the conditions on the coefficients of those probabilistic choices. ′ ′ [p] [q] ] [q ] ′ Lemma 2. o1 ⇋ (o2 ⇋ o3 ) = (o1 [p ⇋ o2 ) ⇋ o3 iff p = ′ ′ p 1−(1−p)(1−q) [p] [q] ] [q ] ′ ′ Lemma 3. (o1 ⇋ o2 ) ⇋ o3 = o1 [p ⇋ (o2 ⇋ o3 ) iff p = pq and q = and q′ = 1 − (1 − p)(1 − q). q(1−p) 1−pq . Theorem 5. The equivalences in Fig. 6 are correct. m1 Proof. We will only present the proof of ⊥ ⇋ ⊥ = ⊥ here. The rest of the equivalences are proved m2 similarly: m1 m2 m1 0+ 0 = 0 = ∆Q[[⊥]]. ∆Q[[⊥ ⇋ ⊥]] = m2 m1 + m2 m1 + m2 ■ Remark 5. The very last equivalence in Fig. 6 was incorrectly formulated (though never published) prior to this paper. Thanks to the formalisation developed in [11], that mistake was corrected, and the □ equivalences have been given a sound footing. 46 6.1 Algebraic Reasoning About Timeliness Motivating Example: Correctness of the Three Bottom Rewrites Example 5. We are now in position to confirm the steps taken in Fig. 3. Note first that, after dismissing the back-and-forth network connections and the timeout, Equation (1) simplifies to −16 ] c-hit [95%] −• (main [1−10 ⇋ (c-miss •→ ⇋ ⊥)) (7) which, according to Theorem 5, is equivalent to −16 ] c-hit [95%] −• main) [1−10 ⇋ ((c-miss •→ ⇋ ⊥) (8) which, again, can be rewritten using Lemma 2 as [.] [q] (c-hit ⇋ (c-miss •→ −• main)) ⇋ ⊥ (9) for q = (1 − 0.05 × 10−16 ) = 0.999999999999999995. Equations (7), (8), and (9) are the outcome expressions for the bottom three outcome diagrams of Fig. 3, respectively. □ 7 Distributivity In this section, we consider the distributivity results between the ∆QSD operators. Recall that out of the four P operators, three are commutative (i.e., •→ −•, ∥∀ , and ∥∃ ) and one is not (i.e., ⇋). Hence, it is only possible forright and left-distributivity to differ when ⇋ is the outermost operator. That gives rise to 2 × 31 + 31 31 = 15 possible ways for distributing P operators over each other. Theorem 6 establishes 3 of those 15. In Section 7.1, we show how the routine technique for examining the equivalence of expressions (i.e., equating the ∆Q[[.]] of the two sides) is not that helpful for the study of the remaining 12 distributivity results. That leads to Sections 7.2 and 8, which disprove the generality of 4 and 8 distributivity results using counterexamples (Theorem 7) and properisation (Theorem 9), respectively. We use the following syntactic convention: when, in an equivalence, two ⇋s are used without weights, each on precisely one side of the equivalence, we will assume that the weights of those ⇋s are the same. We therefore do not bother to repeat those weights. For example, in the theorem below, m2 m2 (o1 •→ −• o3 ), but we omit these. there exist weights m2 and m3 such that o2 ⇋ o and (o1 •→ −• o2 ) ⇋ m m 3 3 3 Theorem 6. Let o1 , o2 , o3 ∈ O and p ∈ {•→ −•, ∥∀ , ∥∃ }. Then, • ⊙⊙ time ⊨ o1 p (o2 ⇋ o3 ) = (o1 p o2 ) ⇋ (o1 p o3 ), and • ⊙⊙ time ⊨ (o1 ⇋ o2 ) p o3 = (o1 p o3 ) ⇋ (o2 p o3 ). 7.1 Potential Distributivity As we are going to see in Sections 7.2 and 8, the remaining 12 potential distributivity results do not hold in general. Nevertheless, this section uses the routine technique for studying the equivalence of expressions: Equating the ∆Q[[.]] of the two sides. That is important because: • firstly, it shows why the routine technique does not help, thereby motivating the next sections; • secondly, it presents some of the necessary conditions for those distributivity results to hold. Although pretty immature, such conditions help the ∆QSD practitioner to verify, under special circumstances, whether their given IRVs can satisfy the provided conditions. Haeri et al. 47 We do not know of better necessary conditions for the remaining 12 results (if indeed they are soluble at all). In this section, we demonstrate the necessary conditions of one distributivity result out the 12. We begin by Proposition 1, which is a simple yet handy result. R Proposition 1. Suppose that o1 = o2 •→ −• o3 . Then, ⊙⊙ time ⊨ δ1 (t) = (δ2′ ∗ δ3′ )(t) dt. When observing time, for m ? m m m m −• (o2 ⇋′ o3 ) (o1 •→ −• o2 ) ⇋′ o3 = (o1 ⇋′ o3 ) •→ m (10) to hold, according to Proposition 1, m m m′ ′ ′ (δ ∗ δ )(t) dt + δ3 1 2 m + m′ m + m′ ZZ m m′ ′ ′ = δ (τ)δ (t − τ) dτ dt + δ3 1 2 m + m′ m + m′ ∆Q[[(o1 •→ −• o2 ) ⇋′ o3 ]] = m Z (11) and Z m m m m′ m′ m ′ ′ ′ ′ −• (o2 ⇋′ o3 )]] = ∆Q[[(o1 ⇋′ o3 ) •→ δ + δ δ + δ ∗ (t) dt m m m + m′ 1 m + m′ 3 m + m′ 2 m + m′ 3 ZZ m′ m m′ m ′ ′ ′ ′ δ (t) + δ (t) × δ (t − τ) + δ (t − τ) dτ dt. (12) = m + m′ 1 m + m′ 3 m + m′ 2 m + m′ 3 For Equation (10) to hold, the right-hand-sides of Equations (11) and (12) need to be equal. That is, ZZ m′ m ′ ′ δ (τ)δ (t − τ) dτ dt + δ3 = 1 2 m + m′ m + m′ ZZ ′ m m m m′ ′ ′ ′ ′ δ (t) + δ (t) × δ (t − τ) + δ (t − τ) dτ dt (13) m + m′ 1 m + m′ 3 m + m′ 2 m + m′ 3 This is a differential equation for which we do not know a general solution. Given particular values for δ1 , δ2 , and δ3 , however, the ∆QSD practitioner might be able to solve it. 7.2 Counterexamples As will be worked out in Remark 6, properisation does not quite work for outcome expressions containing ∥∃ because ⊥ is not compositional under ∥∃ . In this section, we present a less advanced yet effective technique for refuting distributivity results: counterexamples. A single counterexample suffices to refute an equivalence. That is how Theorem 7 refutes 4 distributivity results out of the questionable 12 (in their full generality). Theorem 7. For every o1 , o2 , o3 ∈ O, o1 ⇋ (o2 ∥∃ o3 ) ̸= (o1 ⇋ o2 )∥∃ (o1 ⇋ o3 ) o1 ∥∃ (o2 •→ −• o3 ) ̸= (o1 ∥∃ o2 ) •→ −• (o1 ∥∃ o3 ) (o1 ∥∃ o2 ) ⇋ o3 ̸= (o1 ⇋ o3 )∥∃ (o2 ⇋ o3 ) o1 •→ −• (o2 ∥∃ o3 ) ̸= (o1 •→ −• o2 )∥∃ (o1 •→ −• o3 ). Proof. We only prove the last item here. The other inequalities can be proved similarly using the same technique. Take o2 = o3 = ⊤ and let ∆Q[[o1 ]] = δ1 . By Theorem 5, o1 •→ −• (o2 ∥∃ o3 ) = o1 •→ −• (⊤∥∃ ⊤) = o1 •→ −• ⊤ = o1 . Therefore, ∆Q[[o1 •→ −• (o2 ∥∃ o3 )]] = δ1 . (14) On the other hand, by Theorem 5, (o1 •→ −• o2 )∥∃ (o1 •→ −• o3 ) = (o1 •→ −• ⊤)∥∃ (o1 •→ −• ⊤) = o1 ∥∃ o1 . Thus, ∆Q[[(o1 •→ −• o2 )∥∃ (o1 •→ −• o3 )]] = δ1 + δ1 − δ1 δ1 . (15) Equations (14) and (15) together imply δ1 = 2δ1 − δ12 ⇒ δ1 = 0 ∨ δ1 = 1 ⇒ o1 = ⊥ ∨ o1 = ⊤. The result follows because, for any other o1 and o2 = o3 = ⊤, the two sides will not be equal. ■ 48 8 Algebraic Reasoning About Timeliness Properisation when dom(ι) = dom(ι ′ ) and ι ′ (x) = 1 1−i ι(x) ) = 0, for all ι ∈ I. →| Proposition 2. ℑ(ι →| Definition 4. For an ι ∈ I such that ℑ(ι) = i, write ι ′ = ι for every x ∈ dom(ι). Call ι ′ the properisation of ι. →| →| →| →| This section sets the stage using Theorem 8 for a technique that we call properisation and use for disproving equivalences (in their full generality). Properisation is based on the following important observation: if two outcomes do not fail similarly, they are not equivalent. Properisation is an algebraic technique for swiftly extracting the failure behaviour of outcomes via rewriting but without assessing the rest of their timeliness behaviour. Once the failure parts of the timeliness behaviours are at hand for the two sides, one can check whether they are equal, and if they are not, deduce that the outcomes in question are therefore unequal. Our intuition for the choice of name “properisation” for this technique follows: recall that ∆Qs are CDFs (or PDFs) of improper random variables. Properisation is a technique based on making the ∆Q of an outcome o proper (by scaling it) and restoring its amount of improperness – i.e., o’s intangible mass, denoted by ℑ(∆Q(o)) – as a probabilistic choice (of the right weights) between o and ⊥. That is also the intention behind the symbol we use for properisation: “ .” As one can see in Figure 5a, the CDF of an improper random variable needs not to make it to the “ceiling” (i.e., 1). The symbol “ ” that we use is intended to resemble the act of ‘sticking the CDF to the ceiling’ (represented by the horizontal bar at the top of “ ”)! Now, the formal definitions of properisation. Intuitively, for IRVs, “. ” produces a scaled random variable with no intangible mass. β →| →| →| Definition 5. Fix two basic assignments ∆, ∆′ and a base variable β such that ∆(β ) = ι. Write ∆′ = ∆ when ∆′ (β ′ ) = ∆(β ′ ) for β ′ ̸= β ∆′ (β ′ ) = ι otherwise. β is the result of properisation of β in ∆. →| We say ∆ β produces a new basic assignment that is the as same ∆ everywhere except β , where Intuitively, ∆ the assigned IRV is propoerised. Notation 1. Write o[o′ /β ] for the familiar λ -Calculus notation for substitutions: o in which every instance of β is replaced by o′ . →| →| →| →| →| →| →| →| →| Definition 6. Fix a basic assignment ∆ and a base variable β such that ∆(β ) = ι where ℑ(ι) = i. Write β β ′ (o, ∆) = (o′ , ∆′ ) when o′ = o[(β [1−i] . We say that o′ is the result properisation ⇋ ⊥)/β ] and ∆ = ∆ of β in o according to ∆. β ,β β β2 β ,β β β2 As a shorthand, we write (o, ∆) 1 2 for (o, ∆) 1 and ∆ 1 2 for ∆ 1 . As one can see from Definition 6, the act of properisation of a base variable β in an outcome o is β according to a given basic assignment ∆. That is, the move from the right-hand-side of (o, ∆) = ′ ′ (o , ∆ ) to its left-hand-side is performed by taking two steps in unison: 1. scaling according to the intangible mass of ∆(β ) so that β is no longer improper in the resulting new basic assignment ∆′ ; and, 2. replacing every occurrence of β in the outcome o with the probabilistic choice that is weighted according to the intangible mass of ∆(β ), resulting in the new outcome o′ . Haeri et al. 49 The idea is that the intangible mass that ∆′ takes away o′ returns, leaving timeliness intact. Lemma 4 utilises that idea. β ,β ,...,β →| Lemma 4. Suppose that (o, ∆) 1 2 n = (o′ , ∆′ ) for some β1 , β2 , . . . , βn ∈ B, o, o′ ∈ O and basic assignments ∆ and ∆′ . Then, ∆Q[[o]]∆ = ∆Q[[o′ ]]∆′ . Theorem 8 utilises Lemma 4 for examining equivalence of pairs of outcome expressions with no properisation relationship. →| →| Theorem 8. Suppose ∆ and ∆′ are two basic assignments. Suppose also that o1 , o′1 , o2 , o′2 ∈ O such β ,β ,...,β β ,β ,...,β that (o′1 , ∆′ ) = (o1 , ∆) 1 2 n and (o′2 , ∆′ ) = (o2 , ∆) 1 2 n , for some β1 , β2 , . . . , βn ∈ B. Then, ∆Q[[o1 ]]∆ = ∆Q[[o2 ]]∆ iff ∆Q[[o′1 ]]∆′ = ∆Q[[o′2 ]]∆′ . 8.1 Motivating Example: Correctness of the Properisation Step Example 6. Recall from Section 2 that we took the failure rate of our ECC to be 10−16 . One way to model that failure rate is to assume a basic assignment ∆ such that ℑ(∆(main)) = 10−16 . Note also that the outcome expression for the top outcome diagram of Fig. 3 is c-hit [95%] −• main). ⇋ (c-miss •→ Furthermore, recall from Example 5 that the outcome expression for the second diagram of Fig. 3 from the top is −16 ] −• (main [1−10 c-hit [95%] ⇋ (c-miss •→ ⇋ ⊥)). main →| Now, suppose another basic assignment ∆′ = ∆ . Observe first that the latter outcome expression above is the properisation of main in the former according to ∆. Finally, thanks to Lemma 4, we know that one can rewrite the former outcome expression to the latter provided that one also replaces ∆ with ∆′ . Hence, timeliness remains intact over taking the properisation step of Fig. 3. □ 8.2 Disproving the Remaining Distributivity Results →| →| →| →| Armed with Theorem 8, we can now outline the properisation technique: Suppose two outcome expressions o and o′ the equivalence of which is to be studied. One begins β ,...,β β ,...,β by studying the equivalence of o 1 n and o′ 1 n for some β1 , . . . , βn ∈ B. Now, suppose that β ,...,β β ,...,β [p] – after the application of algebraic laws – one gets to rewrite o 1 n to (. . . ) ⇋ ⊥ and o′ 1 n to ′] ′ ′ (. . . ) [p ⇋ ⊥. One concludes that o ̸= o if one can show that p ̸= p . We start the application of our properisation technique by obtaining some useful results. Lemma 5 paves the way for the applications of the above technique. They instruct one on how to accumulate failure at the rightmost corner when the operator between two pairs of parentheses is •→ −•, ⇋, and ∥∀ , ∃ respectively. Unfortunately, ∥ has no such property, as will be shown by Remark 6. Lemma 5. For every o1 , o2 , o3 ∈ O, 1] 2] 1 p2 ] ⊥ (o1 [p −• (o2 [p −• o2 ) [p⇋ ⇋ ⊥) •→ ⇋ ⊥) = (o1 •→ pp [p] [p2 ] [q] [r] 1 1] (o1 [p ⇋ ⊥) ⇋ (o2 ⇋ ⊥) = (o1 ⇋ o2 ) ⇋ ⊥ where q = p2 −pp2 +pp1 and r = p2 − pp2 + pp1 [p2 ] [p1 p2 ] ∀ ∀ 1] (o1 [p ⇋ ⊥)∥ (o2 ⇋ ⊥) = (o1 ∥ o2 ) ⇋ ⊥. 50 Algebraic Reasoning About Timeliness Proof. We only prove the first equivalence here. The proof is similar for the other two equivalences. By Theorems 6 and 5, [p1 ] [p2 ] 1] 2] 2] 1] 1 p2 ] ⊥. (o1 [p −• (o2 [p −• o2 ) [p −• o2 ) [p −• o2 ) [p⇋ ⇋ ⊥) •→ ⇋ ⊥) = ((o1 ⇋ ⊥) •→ ⇋ ⊥ = ((o1 •→ ⇋ ⊥) ⇋ ⊥ = (o1 •→ ■ Remark 6. Interestingly enough, there is no p such that the following holds in its full generality: ? ∃ ∃ [p2 ] [p] 1] (o1 [p ⇋ ⊥)∥ (o2 ⇋ ⊥) = (o1 ∥ o2 ) ⇋ ⊥. →| Suppose there were such a p. One gets to observe after some calculations that equating the ∆Q[[.]] of the [p2 ] ∃ ∃ 1] two sides implies p = p1 = p2 = 1 or p = p1 = p2 = 0. When (o1 [p ⇋ ⊥)∥ (o2 ⇋ ⊥) is o1 ∥ o2 , in which □ o1 and o2 are being properised, that is either when o1 = o2 = ⊤ or o1 = o2 = ⊥. [.] [p] Hereafter, we will write o1 ⇋ o2 to mean o1 ⇋ o2 for some unimportant p. The desirable inequalities in Theorem 9 are all of the form ol ̸= or , with the outcome variables in ol and or being o1 , o2 , and o3 . In order to show ol ̸= or , we proceed by properisation of o1 , o2 , and o3 in ol and or . To that end, we fix a basic assignment ∆, such that ∆Q[[ok ]]∆ = δk and ℑ(δk ) = ik for k ∈ {1, 2, 3}. o ,o ,o Then, we take pk = 1 − ik for k ∈ {1, 2, 3}, (o′k , ∆′ ) = (ok , ∆) 1 2 3 for k ∈ {l, r}. We show that ∆Q[[o′l ]]∆′ ̸= ∆Q[[o′r ]]∆′ to conclude that ∆Q[[ol ]]∆ ̸= ∆Q[[or ]]∆ by Theorem 8 and the result follows. Theorem 9. For every o1 , o2 , o3 ∈ O, (o1 •→ −• o2 ) ⇋ o3 ̸= (o1 ⇋ o3 ) •→ −• (o2 ⇋ o3 ) (o1 ∥∀ o2 ) ⇋ o3 ̸= (o1 ⇋ o3 )∥∀ (o2 ⇋ o3 ) (o1 ∥∀ o2 ) •→ −• o3 ̸= (o1 •→ −• o3 )∥∀ (o2 •→ −• o3 ) o1 ⇋ (o2 •→ −• o3 ) ̸= (o1 ⇋ o2 ) •→ −• (o1 ⇋ o3 ) o1 ⇋ (o2 ∥∀ o3 ) ̸= (o1 ⇋ o2 )∥∀ (o1 ⇋ o3 ) o1 •→ −• (o2 ∥∀ o3 ) ̸= (o1 •→ −• o2 )∥∀ (o1 •→ −• o3 ). Proof. We only prove [p] [p] [p] o3 ̸= (o1 ⇋ o3 ) •→ −• (o2 ⇋ o3 ) (o1 •→ −• o2 ) ⇋ (16) →| →| for a given p here. The rest can be proved similarly using Lemma 5. Fix a basic assignment ∆, such that ℑ(∆Q[[ok ]]∆ ) = ik for k ∈ {1, 2, 3}. Take pk = 1 − ik for k ∈ o ,o ,o [p] [p] {1, 2, 3}. Pick o′l and o′r such that (o′l , ∆′ ) = ((o1 •→ −• o2 ) ⇋ o3 , ∆) 1 2 3 and (o′r , ∆′ ) = ((o1 ⇋ o3 ) •→ −• o1 ,o2 ,o3 [p] ′ ′ ′ (o2 ⇋ o3 ), ∆) , for some basic assignment ∆ . Our target inequality now becomes ol ̸= or , where [p] ′ [p3 ] 1] 2] • o′l is ((o′1 [p −• (o′2 [p ⇋ ⊥) •→ ⇋ ⊥)) ⇋ (o3 ⇋ ⊥), and [p] [p] ′ [p3 ] ′ [p3 ] 1] 2] • o′r is ((o′1 [p −• ((o′2 [p ⇋ ⊥) ⇋ (o3 ⇋ ⊥)) •→ ⇋ ⊥) ⇋ (o3 ⇋ ⊥)). [.] ′ [q] One can rewrite o′l using Lemma 5 as ((o′1 •→ −• o′2 ) ⇋ o3 ) ⇋ ⊥, where q = p3 − pp3 + pp1 p2 . Likewise, o′r [.] ′ [.] ′ [r1 r2 ] o3 )•→ −•(o′2 ⇋ o3 )) ⇋ ⊥, where r1 = p3 − pp3 + pp1 and r2 = p3 − pp3 + pp2 . can be rewritten as ((o′1 ⇋ ′ ′ Should ol ̸= or not hold, one gets q = r1 r2 . That is p3 − pp3 + pp1 p2 = (p3 − pp3 + pp1 )(p3 − pp3 + pp2 ). But, that is not an equation that holds in general. Inequality (16) follows by Theorem 8. ■ 9 Related Work ∆QSD has been used in practice by a small group of practitioners for a couple of decades now [20, 19, 6, 14, 5]. The first formalisation of ∆QSD was, however, done quite recently by Haeri et al. [11]. We use that formalisation as a foundation. Haeri et al. 51 Teigen et al [14] use ∆Q to develop a novel model of WiFi performance that produces complete latency distributions. The model is validated by comparison with previous modeling work and real-world measurements. It would be very interesting to apply ∆QSD to an outcome description of the protocol to see if this can replicate the same results. Elsewhere, Gajda [10] attempts to model latency distributions but allows operations that do not preserve total probability, hence, leading to incorrect conclusions about failure probabilities. Business Process Modelling and Notation (BPMN) [18] is a diagram scheme which is closely related to Outcome Diagrams (although with some details that are not considered relevant to ∆QSD). BPMN supports all ∆QSD operators except probabilistic choice. The closest operator is their “xor” gateway, which is essentially [0.5] ⇋ . It is less expressive to the extent that it makes it impossible to consider systems such as the example in Section 2. Of the attempts for formalising BPMN, those of Wong and Gibbons [23, 24] are the most related to our work. Wong and Gibbons use the CSP process algebra for that purpose and further develop it to enable the specification of timing constraints on concurrent systems. Their developments allow mechanical verification of behavioural properties of BPMN diagrams using the FDR2 [15] refinement checker. Whilst Wong and Gibbons prove many interesting properties of their BPMN instances, they do not consider algebraic equivalences or algebraic structures for BPMN as we do in this work for ∆QSD. A less related BPMN formalisation work is that of El Hichami et al. [8], which provides a denotational semantics based on the Max+ algebra as an execution model for BPMN. They list a handful of algebraic equivalences in Max+ only axiomatically. Nevertheless, El Hichami et al. make no attempt to study the equivalence of BPMN diagrams based on their Max+ semantics. When it comes to timeliness analysis, an important advantage of outcome diagrams over BPMNs is Definition 3, which formally defines the timeliness analysis of outcome diagrams. Definition 3 is fundamental to the applicability of the model theory we employ in this paper (Section 4.4). We are not aware of any formally defined recipe for timeliness analysis of BPMNs. The two closest attempts that we could find are the following two: Friedenstab et al. [9] borrow constructs from Business Activity Monitoring [4] to augment BPMN with a graphical notation for describing certain timeliness matters. Likewise, Morales [16] informally describes how to transform BPMN diagrams to timed automata networks, suggesting qualitative analysis of timeliness. Performance Evaluation Process Algebra (PEPA) [13] is an algebraic language for performance modelling of systems. PEPA is successful and well-published with a rich family of formalisations with various interesting theoretical properties. However, PEPA suffers from several shortcomings that make it difficult to apply to real-world software systems. For example, PEPA does not model open or partiallyspecified systems; every detail of the system needs to be determined in advance. Since PEPA does not allow goals and objectives to be specified, it offers no assistance when comparing the predicted performance with the requirements. PEPA also suffers from state explosion, rapidly making it impractical, although more recent PEPA technology employs continuous approximations of the states, which contain some of the state explosion. This is similar to the use of IRVs in ∆QSD but rather ad hoc compared with the systematic use of ∆Qs in ∆QSD. Less conservative alternatives to PEPA like SCEL [7] allow open systems but suffer from even more state explosion. CARMA [2] addresses a lot of the problems with PEPA, using a fluid approximation to manage the state explosion. PerformERL [3] is an Erlang toolset, which focuses on monitoring the relationship between load repeatability and internal resource allocation. The authors advertise their toolset as an assistant for making early stage performance decisions, but it is unclear how it does this. Uunlike ∆QSD, monitoring (like testing) requires implementation of the system specification up to a certain level. The closer the implementation is to the full specification, the more reliable the monitoring will become, but the analysis is then no longer early-stage. Less accurate monitoring, on the other hand, is not reliable for decision 52 Algebraic Reasoning About Timeliness making. The closest PerformERL gets to the work described in this paper is its lightweight theoretical work out of the monitoring overhead it imposes to the system under development. Finally, Failure Modes Effects Analysis [1] (FMEA) considers how failures propagate through a system but, unlike ∆QSD, does not model delays. We are not aware of any formalisation of FMEA that can serve algebraic developments like those on failure in this paper. 10 Conclusion and Future Work This paper lays down model-theoretic foundations for timeliness analysis à la ∆QSD. It establishes time as a resource that is consumed by outcomes. In doing so, it enables timeliness analysis via the study of quality attenuation, simultaneously capturing both delay and failure. With our focus being exclusively on timeliness, we discuss the algebraic structures that the ∆QSD operators form with outcome expressions (Theorems 1–4). We refute the formation of richer algebraic structures by the ∆QSD operators and outcome expressions (Remarks 2, 3, and 4). We consider the 15 distributivity results about the ∆QSD operators. We prove 3 (Theorem 6) and disprove 8 (Theorem 9) using the newly formalised technique developed in this paper called properisation (Theorem 8) and 4 using counterexamples (Theorem 7). We also provide guidelines for studying the existence of potential distributivity (Section 7.1). Finally, we establish 14 important equivalences that have already been used in the practice of ∆QSD over the past few decades (Lemmas 2–3 and Theorem 5). Our immediate future work is to study the algebraic properties of other resources à la ∆QSD, with the eventual goal of providing an algebraic categorisation of resources. A sound theoretical foundation is essential for the construction of robust tool support, which is, in turn, a prerequisite for wider application of the ∆QSD paradigm. Currently, there is a numerically-based tool prototype. However, to deal effectively with large complex systems, this needs to be made more symbolic. The aim is for the expressions to be simplified before calculation, and to be able to represent performance unknowns. Algebraic structures are essential for correctly manipulating and simplifying expressions. This work informs both ongoing practical work and tool development. Conversely, consideration of specific aspects of system design and operation will inform the most productive directions for the theoretical developments. To conclude, this paper has introduced a number of important algebraic properties for ∆QSD outcome expressions. These properties have a highly practical application in the analysis of timeliness and resource consumption. For the first time, we have shown distributivity of the ∆QSD operators over probabilistic choice, and placed a set of ‘folklore’ equivalences (Theorem 5) that are in common usage for ∆QSD on a sound footing. These equivalences are essential for rapid recognition of infeasibility and for sound manipulation of outcome expressions to reduce computational complexity. Acknowledgements This research is funded by IOG, Singapore as a part of an ongoing project for incorporating performance as a first-class factor of the software development life cycle. When the routine proof technique did not work for distributivity, Andre Knispel (of IOG) suggested that we could utilise easier properties to obtain the disproofs using contrapositive reasoning. We would like to thank him for that suggestion. Haeri et al. 53 References [1] (1980): MIL-STD-1629A – Procedures for Performing a Failure Mode Effect and Criticality Analysis. Technical Report, United States Department of Defense. [2] L. Bortolussi, R. De Nicola, V. Galpin, S. Gilmore, J. Hillston, D. Latella, M. Loreti & M. Massink (2015): CARMA: Collective Adaptive Resource-sharing Markovian Agents. In N. Bertrand & M. Tribastone, editors: Proc. 13th W. Quant. Aspects of Prog. Lang. and Sys., EPTCS 194, pp. 16–31, doi:10.4204/EPTCS.194.2. [3] W. Cazzola, F. Cesarini & L. Tansini (2022): PerformERL: A Performance Testing Framework for Erlang. Distributed Comp. 35(5), pp. 439–454, doi:10.1007/s00446-022-00429-7. [4] C. Costello & O. Molloy (2008): Towards a Semantic Framework for Business Activity Monitoring and Management. In: AAAI Spring Symposium: AI meets business rules and process management, pp. 17–27. [5] D. Coutts, N. Davies, M. Szamotulski & P. Thompson (2020): Introduction to the Design of the Data Diffusion and Networking for Cardano Shelley. Technical Report, IOHK. Available at https://hydra.iohk. io/build/20405228/download/1/network-design.pdf. [6] N. Davies, P. Thompson, G. Young, J. Newton, B. Teigen & M. Olden (2021): Measuring Network Impact on Application Outcomes Using Quality Attenuation. In: Measuring Network Quality for End-Users, Internet Architecture Board, pp. 43–52. Available at https://www.iab.org/wp-content/IAB-uploads/2021/ 09/PNSol-et-al-Submission-to-Measuring-Network-Quality-for-End-Users-1.pdf. [7] R. De Nicola, D. Latella, A. L. Lafuente, M. Loreti, A. Margheri, M. Massink, A. Morichetta, R. Pugliese, F. Tiezzi & A. Vandin (2015): The SCEL Language: Design, Implementation, Verification, pp. 3–71. Springer, doi:10.1007/978-3-319-16310-9_1. [8] O. El Hichami, M. Naoum, M. Al Achhab, I. Berrada & B. E. El Mohajir (2015): An Algebraic Method for Analysing Control Flow of BPMN Models. iJES 3(3), pp. 20––26, doi:10.3991/ijes.v3i3.4862. Available at https://online-journals.org/index.php/i-jes/article/view/4862. [9] J.-P. Friedenstab, C. Janiesch, M. Matzner & O. Muller (2012): Extending BPMN for Business Activity Monitoring. In: 45th HICSS, pp. 4158–4167, doi:10.1109/HICSS.2012.276. [10] M. J. Gajda (2020): Curious Properties of Latency Distributions. CoRR abs/2011.05219, doi:10.1007/9783-031-10461-9_10. Available at https://arxiv.org/abs/2011.05219. [11] S. H. Haeri, P. Thompson, N. Davies, P. Van Roy, K. Hammond & J. Chapman (2022): Mind Your Outcomes: The ∆QSD Paradigm for Quality-Centric Systems Development and Its Application to a Blockchain Case Study. Computers 11(3), p. 45, doi:10.3390/computers11030045. Available at https://www.mdpi.com/ 2073-431X/11/3/45. [12] S. H. Haeri, P. W. Thompson, P. Van Roy, M. Haveraaen, N. J. Davies, M. Barash & J. Chapman (2023): On the Algebraic Properties of Timeliness. Technical Report, IOG. Available at http://www.pnsol.com/ public/Algebraic-Timeliness-TR.pdf. [13] J. Hillston (1996): A Compositional Approach to Performance Modelling. Cambridge University Press, doi:10.1017/CBO9780511569951. [14] B. Ivar Teigen, N. Davies, K. Olav Ellefsen, T. Skeie & J. Torresen (2022): Quantifying the Quality Attenuation of WiFi. In S. Oteafy, E. Bulut & F. Tschorsch, editors: IEEE 47th LCN, IEEE, pp. 189–197, doi:10.1109/LCN53696.2022.9843690. [15] Formal Systems (Europe) Ltd (2012): Failures-Divergence Refinement: FDR2 User Manual. Available at https://www.cs.ox.ac.uk/projects/concurrency-tools/download/fdr2manual-2.94.pdf. [16] L. E. M. Morales (2014): Specifying BPMN Diagrams with Timed Automata: Proposal of Some Mapping Rules. In: 9th CISTI, pp. 1–6, doi:10.1109/CISTI.2014.6876897. [17] Predictable Network Solutions Ltd (PNSol) (2022): Available at http://www.pnsol.com. 54 Algebraic Reasoning About Timeliness [18] K. J. Sherry (2012): Business Process Modelling with BPMN: Modelling and Designing Business Processes Course Book using The Business Process Model and Notation Specification Version 2.0. CreateSpace Independent Publishing Platform. [19] P. Thompson (2022): TR-452.2 Quality Attenuation Measurements using Active Test Protocols. Technical Report, The Broadband Forum. [20] P. Thompson & R. Hernadaz (2020): Quality Attenuation Measurement Architecture and Requirements. Technical Report TR-452.1, Broadband Forum. Available at https://www.broadband-forum.org/ download/TR-452.1.pdf. [21] K. S. Trivedi (2002): Probability and Statistics with Reliability, Queuing, and Computer Science Applications, 2 edition. Wiley, New York, NY, USA. [22] P. Van Roy, N. Davies, P. Thompson & S. H. Haeri (2023): ∆QSD: Designing Systems with Predictable Latency at High Load. Tutorial, HiPEAC 2023 (Conf. High Perf. Emb. Arch. & Compil.). Available at shorturl.at/dmKSW. [23] P. Y. H. Wong & J. Gibbons (2011): Formalisations and Applications of BPMN. SCP 76(8), pp. 633–650, doi:10.1016/j.scico.2009.09.010. Available at https://www.sciencedirect.com/science/article/ pii/S0167642309001282. [24] P. Y. H. Wong & J. Gibbons (2011): Property Specifications for Workflow Modelling. SCP 76(10), pp. 942–967, doi:10.1016/j.scico.2010.09.007. Available at https://www.sciencedirect.com/science/ article/pii/S0167642310001735.

Log In

Algebraic Reasoning About Timeliness

Sign up to get access to over 50M papers

Related papers

Related papers

Related topics