Lecture-04. DS Models New
Lecture-04. DS Models New
Models of Distributed
Computation, Causality &
Logical Time
Rajiv Misra
Dept. of Computer Science & Engineering
Indian Institute of Technology Patna
rajivm@iitp.ac.in
Preface
• The send and the receive events signify the flow of information
between processes and establish causal dependency from the sender
process to the receiver process.
• A relation →msg that captures the causal dependency due to
message exchange, is defined as follows. For every message m that is
exchanged between two processes, we have
The ordered pair <A, R> is called a poset (partially ordered set) when R is
a partial order.
• For example, in Figure 4.1 event e26 has the knowledge of all other events
shown in the figure.
Space-time diagram e11→ e33 e33 → e26
• For example, in Figure 4.1, e33 e24 and e24 e15, however, e33 e15
• For any two events ei and ej in a distributed execution,
ei → ej or ej → ei , or ei ej .
Space-time diagram e13 e33 e24 e31
• For example, in Figure 4.1, events in the set {e13,e24 ,e33 } are
logically concurrent, but they occurred at different instants in
physical time. However, note that if processor speed and
message delays had been different, the execution of these
events could have very well coincided in physical time.
R1: Before executing an event (send, receive, or internal), process pi executes the
following:
Ci := Ci + d (d > 0)
• In general, every time R1 is executed, d can have a different value; however, typically
d is kept at 1.
Scalar Time
R2: Each message piggybacks the clock value of its sender at sending time.
When a process pi receives a message with timestamp Cmsg , it executes the following
actions:
1 2 3 8 9
p1
9
2
1 4 5 7 11
p2
3 10
4
1 b
p3
5 6 7
Total Ordering
1 2 3 8 9
p1
9
2
1 4 5 7 11
p2
3 10
4
1 b
p3
5 6 7
• Process identifiers are linearly ordered and tie among events with identical scalar
timestamp is broken on the basis of their process identifiers.
• The lower the process identifier in the ranking, the higher the priority.
• The timestamp of an event is denoted by a tuple (t, i ) where t is its time of
occurrence and i is the identity of the process where it occurred.
• The total order relation ≺ on two events x and y with timestamps (h,i) and (k,j),
respectively, is defined as follows:
x ≺ y ⇔ (h < k or (h = k and i < j ))
Properties…
Event counting
•If the increment value d is always 1, the scalar time has the following interesting
property: if event e has a timestamp h, then h-1 represents the minimum logical
duration, counted in units of events, required before producing the event e;
• We call it the height of the event e.
• In other words, h-1 events have been produced sequentially before the event e
regardless of the processes that produced these events.
•For example, in Figure 4.2, five events precede event b on the longest causal
path ending at b.
Five events precede event b on the longest causal path ending at b
1 2 3 8 9
p1
9
2
1 4 5 7 11
p2
3 10
4
1 b
p3
5 6 7
•The system of scalar clocks is not strongly consistent; that is, for two events
ei and ej, C(ei ) < C(ej ) ⇒ ei eij
•For example, in Figure 4.2, the third event of process P1 has smaller scalar
timestamp than the third event of process P2 .However, the former did not
happen before the latter.
•The reason that scalar clocks are not strongly consistent is that the logical local
clock and logical global clock of a process are squashed into one, resulting in the
loss causal dependency information among events at different processes.
•For example, in Figure 4.2, when process P2 receives the first message from
process P1, it updates its clock to 3, forgetting that the timestamp of the latest
event at P1 on which it depends is 2.
smaller scalar timestamp
Clock Updation
1 2 3 8 9
p1
9
2
1 4 5 7 11
p2
3 10
4
1 b
p3
5 6 7
•Execute R1
•Deliver the message m
Vector Time
•The timestamp of an event is the value of the vector clock of its process
when the event is executed.
•Figure 4.3 shows an example of vector clocks progress with the increment
value d=1.
2 5
3 5
0 4
0 2 2 2
0 3 3 3
1 2 3 4
p3
vh = vk ⇔ ∀x : vh[x ] = vk [x ]
vh ≤ vk ⇔ ∀x : vh[x ] ≤ vk [x ]
vh < vk ⇔ vh ≤ vk and ∃x : vh[x ] < vk [x ]
vh || vk ⇔ ¬(vh < vk ) ∧ ¬(vk < vh)
• If the process at which an event occurred is known, the test to compare two
timestamps can be simplified as follows: If events x and y respectively occurred at
processes pi and pj and are assigned timestamps vh and vk, respectively, then
x → y⇔ vh[i ] ≤ vk [i ]
x || y ⇔ vh[i ] > vk [i ] ∧ vh[j ] < vk [j ]
Properties of Vector Time
Isomorphism
Event Counting
• If d=1 (in rule R1), then the i th component of vector clock at process pi , vt i [i ],
denotes the number of events that have occurred at pi until that instant.
• So, if an event e has timestamp vh, vh[j ] denotes the number of events executed by
process pj that causally precede e. Clearly, ∑ vh[j ] − 1 represents the total number
of events that causally precede e in the distributed computation.
Conclusion
• In a distributed system, a set of processes communicate by exchanging messages
over a communication network. A distributed computation is spread over
geographically distributed processes. The processes do not share a common global
memory or a physical global clock, to which processes have instantaneous access.
• In this lecture we have presented the idea of logical time that was proposed by
Lamport in 1978 in an attempt to order events in distributed systems.
• We have discussed two systems of logical clocks, namely, scalar and vector clocks
to capture causality between events of a distributed computation.
• In upcoming lecture, we will discuss about the size of vector clock, matrix clocks,
virtual time and physical clock synchronization.