A Distributed Graph Algorithm: Knot Detection

1982, ACM Transactions on Programming Languages and Systems

J. MISRA and K. M. CHANDY
University of Texas at Austin

A knot in a directed graph is a useful concept in deadlock detection. A distributed algorithm for identifying a knot in a graph by using a network of processes is presented. The algorithm is based on the work of Dijkstra and Scholten.

Categories and Subject Descriptors: C.2.4 [Computer-Communication Systems]: Distributed Systems--distributed applications, network operating systems; D.1.3 [Programming Techniques]: Concurrent Programming; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems--sequencing and scheduling; G.2.2 [Discrete Mathematics]: Graph Theory--graph algorithms, network problems

General Terms: Algorithms

Additional Key Words and Phrases: Distributed algorithms, message communication, knot

Supported in part by the Air Force under grant AFOSR 81-0205.
Authors' address: Department of Computer Sciences, University of Texas, Austin, TX 78712.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1982 ACM 0164-0925/82/1000-0678 $00.75
ACM Transactions on Programming Languages and Systems, Vol. 4, No. 4, October 1982, Pages 678-686.

1. INTRODUCTION

A vertex v_i in a directed graph is in a knot if for every vertex v_j reachable from v_i, v_i is reachable from v_j. Chang [2] shows that knot is a useful concept in deadlock detection. Dijkstra [3] has proposed a distributed algorithm for detecting whether a given process in a network of processes is in a knot. His algorithm is based on his previous work with Scholten [4] on termination detection of diffusing computations. We propose an algorithm for knot detection which is also based on [4] but is conceptually simpler. We also discuss the extensions of our algorithm to a more general class of problems.

2. MODEL OF A NETWORK OF COMMUNICATING PROCESSES

A process is a sequential program which can communicate with other processes by sending/receiving messages. Two processes P and Q are said to be neighbors if they can communicate directly with one another without having messages go through intermediate processes. We assume that communication channels are bidirectional: if P can send messages to Q, then Q can send messages to P. A process knows its neighbors but is otherwise ignorant of the general communication structure of the network.

We assume a very simple protocol for message communication; this protocol is equivalent to the one used by Dijkstra and Scholten [4]. Every process has an input buffer of unbounded length. If process P sends a message to a neighbor process Q, then the message gets appended at the end of the input buffer of Q after a finite, arbitrary delay.
We assume that (1) messages are not lost or altered during transmission, (2) messages sent from P to Q arrive at Q's input buffer in the order sent, and (3) two messages arriving simultaneously at an input buffer are ordered arbitrarily and appended to the buffer. A process receives a message by removing it from its input buffer.

The assumption of unbounded length buffers is for ease of exposition. We show in Section 5.1 that the input buffer length of process Q can be bounded by a constant multiple of the number of neighbors of Q.

3. A DISTRIBUTED ALGORITHM FOR KNOT DETECTION

Consider a network of processes corresponding to a given directed graph G: there is a one-to-one correspondence between processes in the network and vertices in the graph; a process p_i in the network represents vertex v_i in G, for all i; and p_i, p_j are neighbors if edge (v_i, v_j) or (v_j, v_i) exists in G. Process p_1 initiates a computation to determine if v_1 is in a knot.

3.1 Local Variables of Processes

Every process p_i maintains the following variables.

succeeding(i): This Boolean variable is set true when p_i determines that v_i is reachable from v_1. Initially this variable is false for all p_i, i ≠ 1, and is true for p_1. Eventually, succeeding(i) will be true if and only if v_i is reachable from v_1.

preceding(i): Same as above, except that preceding(i) represents whether v_1 is reachable from v_i.

subordinate(i): This is integer valued and will be set to 1 if and only if succeeding(i) and not preceding(i); else it will be set to 0. v_1 is in a knot if and only if subordinate(i) is eventually zero for every process i.

cs(i): This is an integer-valued variable which keeps the partial sum of some subordinate variables.

A goal of the program is to establish the following at termination:

\[ \mathit{cs}(1) = \sum_{i} \mathit{subordinate}(i). \]

Therefore v_1 is in a knot if and only if cs(1) = 0 at termination.

We discuss in Section 3.2 the different types of messages sent among processes.
In short, a process p_i may send a message to p_j, and p_j sends an acknowledgment (ack) to p_i for every message that p_j receives from p_i. We introduce the following variables related to message and ack transmission.

num(i): This is the number of unacknowledged messages, that is, the number of messages sent by this process p_i for which acks have not been received so far.

father(i): This is a process from which p_i, i ≠ 1, received a message when its num(i) was last zero. father(i) is undefined initially.

Our goal is to maintain a rooted tree structure at all times over processes whose num > 0; father will denote the parent in this tree structure, and p_1 the root.

3.2 Messages Sent Among Processes

There are two types of messages sent between neighbors in this algorithm.

(i) Structure message, or message, has two components, (type, p), where type = suc or pre and p is the identity of the sender process. Process p_i sends (suc, p_i) to p_j if there is a path from v_1 to v_j in which v_i is the prefinal vertex. Process p_i sends (pre, p_i) to p_j if there is a path from v_j to v_1 in which v_i follows v_j in the path.

(ii) Acknowledgment message, or ack, is of the form (ack, c), where c is an integer. Acks are used to update cs and num.

The entire computation terminates when process p_1 receives acks for all messages that it sent, that is, when num(1) is decremented to zero. Acks for all messages are sent back as soon as the messages are received, except for messages received from father; an ack to a father is sent only when num next becomes zero.

Convention. It is convenient for purposes of proof to define an atomic action within which invariant assertions may be temporarily violated and outside of which the invariants must hold. We write ⟨A_1; A_2; ...; A_n⟩ to show that executions of statements A_1, A_2, ..., A_n must be considered as an atomic action. We use PASCAL-like notation with the added commands send and receive to write our programs.

3.3 Knot-Detection Algorithm

Convention. We write succeeding, preceding, etc., for succeeding(i), preceding(i), etc., when the context is clear.

Overview of the Algorithm. As stated earlier, one goal of the algorithm is to maintain a rooted directed tree structure over the set of processes p_i whose num(i) > 0. The root of the tree will be p_1, and father(i) will be the parent in the tree for p_i, i ≠ 1. In order to maintain the tree structure, we must ensure that (1) a process p_i, i ≠ 1, acquires a father only if it does not have one currently: this is guaranteed, since a process acquires a father only when its num(i) becomes nonzero; and (2) a process p_i can be removed from the tree (i.e., set its num(i) = 0) only if it is a leaf node: this will be guaranteed by every process sending its last ack to its father. Computation terminates when the tree is empty.

We will also maintain the invariant (1) given in Lemma 4.2, which states that the sum of cs over all processes plus the c's in the acks in transit equals the sum of subordinates over all processes. The algorithm will ensure that if num(i) = 0 and i ≠ 1, then cs(i) = 0. Therefore, when the tree is empty, cs(i) = 0 for all i, i ≠ 1, and hence

\[ \mathit{cs}(1) = \sum_{i} \mathit{subordinate}(i). \]

Process p_1 is in a knot if and only if cs(1) = 0.
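
As a point of reference for the value that cs(1) is meant to reach, the sum of subordinate(i) can be computed centrally from the definitions in Section 3.1. The sketch below is not part of the distributed algorithm; it assumes the whole edge list is available in one place, and the names reachable, subordinate_sum, and in_knot are illustrative choices rather than names from the paper.

```python
from collections import deque


def reachable(adj, start):
    """Return the set of vertices reachable from start (start included), by BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def subordinate_sum(edges, root):
    """Count vertices with succeeding = true and preceding = false.

    This is the value sum_i subordinate(i) that cs(1) should equal at termination.
    """
    succ, pred = {}, {}
    for u, w in edges:
        succ.setdefault(u, []).append(w)
        pred.setdefault(w, []).append(u)
    succeeding = reachable(succ, root)  # vertices reachable from the root
    preceding = reachable(pred, root)   # vertices from which the root is reachable
    return len(succeeding - preceding)


def in_knot(edges, root):
    """The root is in a knot exactly when no vertex is subordinate."""
    return subordinate_sum(edges, root) == 0
```

The root itself belongs to both sets, which matches the initialization succeeding(1) = preceding(1) = true, so it never counts as subordinate.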

Algorithm for p_1

Initialization
    begin
      father is undefined; subordinate := 0; cs := 0; num := 0;
      ⟨succeeding := true;
       num := num + number of successors of v_1;
       send (suc, p_1) to all successors⟩;
      ⟨preceding := true;
       num := num + number of predecessors of v_1;
       send (pre, p_1) to all predecessors⟩
    end

Upon receiving a structure message (type, p)
    send (ack, 0) to p                                                  (M1)

Upon receiving an acknowledgment (ack, c)
    begin
      cs := cs + c; num := num - 1;
      if num = 0 then terminate computation
      {v_1 is in a knot if cs = 0}
    end                                                                 (M2)

Algorithm for p_i, i ≠ 1

Initialization
    begin
      father is undefined; subordinate := 0; cs := 0; num := 0;
      succeeding := false; preceding := false
    end

Upon receiving a message (type, p)
    begin
      {update father or send an ack immediately}
      if num = 0 then father := p
      else begin ⟨send (ack, cs) to p; cs := 0⟩ end;                    (L1)

      {update succeeding and preceding if necessary}
      if type = suc and not succeeding
        {For the first time p_i has determined that v_i is reachable from v_1}
      then begin
        ⟨succeeding := true;
         num := num + number of successors of v_i;
         send (suc, p_i) to all successors⟩
      end;
      if type = pre and not preceding
        {For the first time p_i has determined that v_1 is reachable from v_i}
      then begin
        ⟨preceding := true;
         num := num + number of predecessors of v_i;
         send (pre, p_i) to all predecessors⟩
      end;

      {update subordinate if necessary. Also update cs to maintain the
       invariant in Lemma 4.2}
      if succeeding and not preceding
      then begin ⟨cs := cs - subordinate + 1; subordinate := 1⟩ end     (L2)
      else begin ⟨cs := cs - subordinate + 0; subordinate := 0⟩ end;    (L3)

      {send ack to father if num = 0}
      if num = 0 then
        begin ⟨send (ack, cs) to father; cs := 0⟩ end                   (L4)
    end

Upon receiving an acknowledgment (ack, c)
    begin
      cs := cs + c; num := num - 1;                                     (L5)
      if num = 0 then
        begin ⟨send (ack, cs) to father; cs := 0⟩ end                   (L6)
    end

4. PROOF OF CORRECTNESS

LEMMA 4.1. At any point in the computation, the set of processes with num > 0 form a rooted tree with p_1 as the root and the parent relation specified by the local variable "father."

PROOF. The lemma holds vacuously initially. num(i) and father(i) may be changed only upon receipt of a message or an ack by process i. If a process with num > 0 receives a message, then it does not alter its father, thus preserving the tree property. Similarly, if a process has num > 0 after processing an ack, it does not alter the tree structure. If a process p_j changes num(j) from zero, then it must have received a message from some other process p_i on the tree and must have set father(j) = i, thus preserving the tree property. We now show that only a leaf node can decrement its num to zero. If p_i is on the tree and is not a leaf, then there is a process p_j with num(j) > 0 and father(j) = i; then p_j will not return an ack to p_i while p_j remains on the tree, and hence num(i) > 0 while p_j remains on the tree. Therefore only a leaf node can decrement its num to 0, which preserves the tree property. This completes the proof.
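
The two process programs of Section 3.3 can be exercised with a small single-machine simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: each ordered pair of neighbors gets a FIFO queue, a seeded random choice among nonempty queues stands in for the arbitrary finite delay of Section 2, and each handler runs to completion as one atomic step. The names Process, knot_detection, spread, and ack_father are hypothetical. After every delivery the sketch also asserts invariant (1) of Lemma 4.2.

```python
import random
from collections import deque


class Process:
    """Local state of one process p_i, named after the variables of Section 3.1."""

    def __init__(self, pid, succs, preds, is_root):
        self.pid, self.succs, self.preds, self.is_root = pid, succs, preds, is_root
        self.father = None                      # undefined initially
        self.subordinate = self.cs = self.num = 0
        self.succeeding = self.preceding = is_root


def knot_detection(edges, root, seed=0):
    """Simulate the algorithm and return cs(root) at termination (0 means "in a knot")."""
    rng = random.Random(seed)
    vertices = {root} | {v for edge in edges for v in edge}
    succs = {v: [] for v in vertices}
    preds = {v: [] for v in vertices}
    for u, w in edges:
        succs[u].append(w)
        preds[w].append(u)
    procs = {v: Process(v, succs[v], preds[v], v == root) for v in vertices}
    channels = {}                               # (src, dst) -> FIFO queue of messages

    def send(src, dst, msg):
        channels.setdefault((src, dst), deque()).append(msg)

    def spread(p, kind, targets):
        """Send a structure message to every target and count it as unacknowledged."""
        p.num += len(targets)
        for t in targets:
            send(p.pid, t, (kind, p.pid))

    def ack_father(p):
        send(p.pid, p.father, ("ack", p.cs))
        p.cs = 0

    def on_message(p, kind, sender):
        if p.is_root:
            send(p.pid, sender, ("ack", 0))     # M1
            return
        if p.num == 0:
            p.father = sender
        else:
            send(p.pid, sender, ("ack", p.cs))  # L1
            p.cs = 0
        if kind == "suc" and not p.succeeding:
            p.succeeding = True
            spread(p, "suc", p.succs)
        if kind == "pre" and not p.preceding:
            p.preceding = True
            spread(p, "pre", p.preds)
        new_sub = 1 if p.succeeding and not p.preceding else 0
        p.cs += new_sub - p.subordinate         # L2 / L3
        p.subordinate = new_sub
        if p.num == 0:
            ack_father(p)                       # L4

    def on_ack(p, c):
        p.cs += c                               # L5 (M2 for the root)
        p.num -= 1
        if p.num == 0 and not p.is_root:
            ack_father(p)                       # L6

    r = procs[root]
    spread(r, "suc", r.succs)
    spread(r, "pre", r.preds)
    while r.num > 0:                            # the root terminates when num(1) = 0
        src, dst = rng.choice([k for k, q in channels.items() if q])
        tag, value = channels[(src, dst)].popleft()
        if tag == "ack":
            on_ack(procs[dst], value)
        else:
            on_message(procs[dst], tag, value)
        in_transit = sum(m[1] for q in channels.values() for m in q if m[0] == "ack")
        assert (sum(p.cs for p in procs.values()) + in_transit
                == sum(p.subordinate for p in procs.values()))  # invariant (1)
    return r.cs
```

Calling knot_detection with different seed values changes only the delivery order; the returned cs(1) should not depend on it. A root with no neighbors sends nothing, so the loop never runs and the sketch reports cs(1) = 0, a case the paper does not discuss explicitly.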

Let T, at any point in computation, denote the set of ack messages which are in transit, that is, which have been sent but have not yet been received.

LEMMA 4.2. The following is an invariant:

\[ \sum_{i} \mathit{cs}(i) + \sum_{(\mathit{ack},\, c) \in T} c = \sum_{i} \mathit{subordinate}(i). \tag{1} \]

PROOF. The lemma holds initially, since all the terms in the equation are zero. For p_i, i ≠ 1, the terms in the equation are modified only at program points L1 through L6, and for p_1, these terms can be modified only at M1 or M2. The reader may easily convince himself that the equation is left invariant by the execution of the statements at these program points. □

THEOREM 4.3. Assume that process p_1 terminates computation (in step M2). cs(1) = 0 if and only if v_1 is in a knot.

PROOF. We first show that when p_1 terminates computation (i) cs(i) = 0 for i ≠ 1, (ii) subordinate(i) is correctly set, and (iii) the set T is empty. The theorem follows directly from the invariant proved in Lemma 4.2.

(i) When p_1 terminates computation in step M2, num(1) = 0. Then the tree is empty, since p_1 was the root of the tree. Therefore num(i) = 0 for all i. If num(i) = 0, then cs(i) = 0 for all i, i ≠ 1, because every change to num(i) is followed by the code to set cs(i) to 0 if num(i) is 0 (steps L4 and L6).

(ii) If v_i is reachable from v_1, it follows by induction on path length to v_i that p_i will eventually receive a message which will result in succeeding(i) set true; succeeding(i) remains true thereafter. Similarly for preceding(i). Therefore subordinate(i) will eventually be set to its correct value.
When assignment is made to succeeding(i) or preceding(i), p_i has not returned an ack to its father, and hence the computation could not be over. Therefore these variables are assigned their correct values before the termination of computation.

(iii) Since the tree is empty, every process must have received acks corresponding to all messages sent. Therefore there can be no ack in transit, that is, set T is empty. □

LEMMA 4.4. p_1 will terminate computation in finite time.

PROOF. A process p_i sends at most two messages (type, p_i) to any other process p_j because (1) a message is sent only when succeeding or preceding is set to true, and (2) succeeding and preceding are never reset to false. Because the graph is finite, the total number of messages sent is bounded. Hence the total number of acks sent is also bounded. Observe that every process must send or receive either a message or an ack every time it starts to execute. Therefore a process can switch from idle to executing only a finite number of times. There are no loops in the program; therefore every executing process will become idle in finite time. Hence every process in the network will cease to execute in finite time, and no more messages or acks will be sent or received from then on. We now show that the tree must be empty at this point. If not, let p_i be a leaf node of the tree; num(i) > 0, since p_i is on the tree. There is no p_j on the tree for which father(j) = p_i, and hence p_i must have received all its outstanding acks; therefore num(i) = 0, a contradiction! □

5. NOTES ON THE KNOT-DETECTION ALGORITHM

5.1 Bounding the Buffer Size

We assumed earlier for purposes of exposition that buffers are of unbounded length. In the knot-detection algorithm a process sends at most two messages to any neighbor process, and therefore no process sends more than two acks to any other process.
Hence the buffer length for any process need not exceed four times the number of neighbors of the process.

5.2 Efficiency

This algorithm is superior to the brute-force algorithm in which process p_1 (1) computes successor*, the set of vertices reachable from v_1; (2) computes predecessor*, the set of vertices that can reach v_1; and (3) then declares that v_1 is in a knot if and only if successor* ⊆ predecessor*. The computation of successor* (predecessor*) can be done by using an algorithm similar to the one proposed here: every ack carries with it a set of successors (predecessors). Therefore a successor at distance d from v_1 will have its identity transmitted through d processes to reach v_1. The total message length will be at least O(N^2) for an N-vertex graph, as opposed to O(E) for our algorithm, where E is the number of edges.

6. EXTENSIONS

We show in this section that the ideas in the knot-detection algorithm can be extended to solve a very general class of problems. Consider a distributed computation which is initiated by process p_1 sending messages to some of its neighbors. Any other process can send messages only after receiving a message. The computation terminates when no process has any more messages to send and all messages that have been sent have been received. Dijkstra and Scholten [4] were the first to identify this class of computations, which they call diffusing computations. They proposed an algorithm, using the growing and shrinking tree, to detect termination of diffusing computations. Our contribution is to show how the same idea may be exploited to compute a networkwide function of locally computed results.

Let local-result(i) denote some computed result at process p_i, at termination of the entire computation.
It is required to compute global-result at the termination of computation, where

\[ \text{global-result} = f(\text{local-result}(i),\ \text{for all } i), \tag{2} \]

f being any arbitrary computable function. The knot-detection algorithm computed the global-result cs(1), with local-result(i) ≡ subordinate(i), and

\[ \mathit{cs}(1) = \sum_{i} \mathit{subordinate}(i), \tag{3} \]

that is, f ≡ Σ. We propose two schemes for computing networkwide functions.

Note that our algorithm can be used to develop distributed algorithms according to the following methodology. In order to compute some global-result, invent a function f and local-result(i) satisfying (2) and then design a distributed algorithm to compute local-result(i) at process p_i, for all i. Then superimpose our algorithm to compute the global-result. A variation of this idea appears in [1], where a number of other problems amenable to this approach are listed.

One difficulty with a straightforward implementation is that a process cannot know when network computation has terminated. Process p_i knows that network computation can have terminated only when num(i) = 0; however, p_i cannot assert the converse: network computation may not have terminated even if num(i) = 0. Hence p_i must send back its current value of local-result(i) to its father every time that it decrements num(i) to zero. This causes a problem: p_i may send back a local-result to its father and subsequently get another message which causes it to compute a new local-result. Therefore p_i must cancel the old local-result value. We propose two mechanisms for canceling out-of-date local results: bags and time stamps.

To simplify exposition in our discussion of cancellation schemes, we will assume that there is no delay between sending and receiving a message, that is, that there is never any message in transit.
The reader can easily convince himself that the arguments also apply when the transmission delay is not zero.

6.1 Bags

Each process p_i maintains two bags, all(i) and canceled(i). Each bag element is of the form (j, local-result(j)). If (j, x) is an element in canceled(i), then process p_j has definitely canceled an out-of-date local-result x. If (j, x) is an element of all(i), then at some time p_j posted a local-result x. The elements in all(i) are not necessarily current. Every local-result that p_j has posted appears in the union of bags all(i), for every i. Similarly, all local-results that p_j has canceled appear in the union of canceled(i), for every i. Therefore p_j's current local-result is in the difference of these two bag unions. In other words, the goal is to maintain the following invariant. Let r(j) denote the current local-result of process j, and let ∪ denote the union operation over bags. Then

\[ \bigcup_{j} \{(j, r(j))\} = \bigcup_{i} \mathit{all}(i) - \bigcup_{i} \mathit{canceled}(i). \]

Initially, all(i) holds the initial local-result of p_i, and canceled(i) is empty. To post a current local-result x and cancel the previous local-result y, process p_i adds (i, x) to all(i) and (i, y) to canceled(i).

Two bags, abag and cbag, are returned with every ack in the form (ack, abag, cbag). When p_j sends an ack, it takes the elements out of bag all(j) and puts them into abag, and similarly puts elements from canceled(j) into cbag, and then sends abag and cbag along with the ack. If p_i receives (ack, abag, cbag), it adds the contents of abag to all(i) and cbag to canceled(i).

At termination, all(i) and canceled(i) will be empty for i ≠ 1, canceled(1) will contain tuples corresponding to all canceled local-results, and all(1) will contain tuples corresponding to all local-results, current and canceled. By removing the canceled results (i.e., elements of canceled(1)) from all(1), p_1 can determine the current local-results for all processes.
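
The bag bookkeeping described above can be sketched with Python's collections.Counter, which behaves as a multiset: update adds multiplicities and the - operator keeps only positive counts. This is an illustrative sketch under the zero-delay assumption stated earlier; the class BagState and the names post, make_ack, receive_ack, and current_results are hypothetical, not taken from the paper.

```python
from collections import Counter


class BagState:
    """Bags all(i) and canceled(i) for one process p_i (hypothetical helper)."""

    def __init__(self, pid, initial):
        self.pid = pid
        self.current = initial
        self.all = Counter({(pid, initial): 1})  # all(i) holds the initial result
        self.canceled = Counter()                # canceled(i) starts empty

    def post(self, new_result):
        """Post new_result and cancel the previously posted local-result."""
        self.canceled[(self.pid, self.current)] += 1
        self.all[(self.pid, new_result)] += 1
        self.current = new_result

    def make_ack(self):
        """Move the contents of both bags into an ack (ack, abag, cbag)."""
        abag, cbag = self.all, self.canceled
        self.all, self.canceled = Counter(), Counter()
        return ("ack", abag, cbag)

    def receive_ack(self, ack):
        """Add abag to all(i) and cbag to canceled(i)."""
        _, abag, cbag = ack
        self.all.update(abag)
        self.canceled.update(cbag)


def current_results(root):
    """Bag difference all(1) - canceled(1), returned as {process: local-result}."""
    return dict((root.all - root.canceled).keys())
```

Multiplicities matter here: a process that posts x, then y, then x again leaves two copies of (j, x) in the all bags but only one in the canceled bags, so a plain set difference would wrongly discard its current result.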

The knot-detection algorithm of Section 3 uses the bag idea; the information in the two bags has been condensed into a single integer cs. Adding an element (j, x) to all(i) is implemented by incrementing cs(i) by x. Adding an element (j, y) to canceled(i) is achieved by decrementing cs(i) by y.

Efficiency. The sizes of the bags returned with acks can be reduced by having each process p_i remove all elements common to all(i) and canceled(i) from both all(i) and canceled(i).

6.2 Time Stamps

Each process p_i maintains a set S(i) of triples of the form (j, n(j), local-result(j)), where n(j) is a time stamp local to process p_j. When a process p_i wishes to post a new local-result x (and cancel an out-of-date result), it increments n(i) and adds (i, n(i), x) to S(i). When p_i sends an ack, it sends (ack, S(i)) and then sets S(i) to empty. Upon receiving an ack, (ack, B), p_i sets S(i) to the union of S(i) and B. Upon termination, S(i) will be empty for all i ≠ 1, and S(1) will contain all tuples (i, n(i), local-result(i)) that have been sent. p_1 can identify the current local-results because they will be associated with the latest time stamps.

Efficiency. The sizes of the sets returned with acks can be reduced by having each process p_i discard all elements in S(i) that it can identify as being out of date.

ACKNOWLEDGMENTS

We gratefully acknowledge the suggestions of E. W. Dijkstra and C. S. Scholten, on whose work this paper is based. We are also grateful to two anonymous referees for their valuable comments.

REFERENCES

1. CHANDY, K.M., AND MISRA, J. Distributed computation on graphs: Shortest path algorithms. Commun. ACM 25, 11 (Nov. 1982).
2. CHANG, E. Decentralized deadlock detection in distributed systems. Tech. Rep., Univ. of Victoria, Victoria, B.C., Canada.
3. DIJKSTRA, E.W. In reaction to Ernest Chang's Deadlock Detection. EWD702, Plataanstraat 5, 5671 AL Nuenen, The Netherlands, Feb. 21, 1979.
4. DIJKSTRA, E.W., AND SCHOLTEN, C.S. Termination detection for diffusing computations. Inf. Process. Lett. 11, 1 (Aug. 1980), 1-4.

Received September 1981; revised May 1982; accepted May 1982