DC - Unit 2 Finalmost

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 69

UNIT II – LOGICAL TIME AND GLOBAL STATE

UNIT II LOGICAL TIME AND GLOBAL STATE

Logical Time: Physical Clock Synchronization: NTP – A Framework for a


System of Logical Clocks – Scalar Time – Vector Time; Message Ordering
and Group Communication: Message Ordering Paradigms -Asynchronous
Execution with Synchronous Communication – Synchronous Program
Order on Asynchronous System – Group Communication – Causal Order –
Total Order; Global State and Snapshot Recording Algorithms: Introduction
– System Model and Definitions – Snapshot Algorithms for FIFO Channels.
Physical clock
 Physical Clock :
It measures the
• Time taken between the events occurring
• Time taken for process scheduling
• Time taken to maintain synchronization within the system

 Logical clocks are preferred over Physical clock in distributed system


Physical clock
 Why not Physical Clock in distributed system ?
1. Clock drift

Different systems will have their own local clock which will drift apart over
time due to difference in hardware, temperature and other factors.

2. Network latency and delay

Time taken by the messages to travel between nodes will vary due to which
synchronizing the events across the entire distributed system becomes a
challenge.
Physical clock
3. Faults and Failures
- Physical clock accuracy is disrupted by hardware failures, software errors
and network issues.
- When a node failure occurs maintaining a consistent global time becomes
more challenging

4. Scalability
In large scale distributed systems with numerous interconnected nodes
maintaining a single global clock across the entire system become more
complex and impractical.
Logical clock
Logical clock
• It refers to a mechanism used to establish an ordering of events that occurs across the
different nodes or processes in a distributed system.
• It satisfies causal relationship between events
• Useful for
a. coordinating activities
b. Maintaining consistency
c. Enabling the implementation of various distributed algorithms and protocols
d. Ensuring proper ordering of events
• Every event is assigned a timestamp and the causality relationship between events can
be inferred from their timestamps
Logical clock

Logical clock
• The timestamp assigned to events obey the fundamental monotonicity
property.
ie if an event a causally affects event b then the timestamp of a is smaller than
the timestamp of b

Methods of representing Logical time


• In the first method, Lamport’s scalar clocks, the time is represented by non-
negative integers;
• In the second method, the time is represented by a vector of non-negative
integers;
• In the third method, the time is represented as a matrix of non-negative
integers
Benefits of causal precedence

 Distributed algorithms design: The knowledge of the causal precedence


relation among events
• helps ensure liveness and fairness in mutual exclusion algorithms
• helps maintain consistency in replicated databases
• helps design correct deadlock detection algorithms to avoid phantom and
undetected deadlocks.
 Tracking of dependent events: In distributed debugging, the knowledge of the
causal dependency among events
• helps construct a consistent state for resuming reexecution
• helps in failure recovery
Benefits of causal precedence

• helps build a checkpoint


• Helps in replicated databases
• aids in the detection of file inconsistencies in case of a network partitioning.

 Knowledge about the progress: The knowledge of the causal dependency


among events
• helps measure the progress of processes in the distributed computation.
• useful in discarding obsolete information, garbage collection, and termination
detection.
Benefits of causal precedence

 Concurrency measure:
• The knowledge of how many events are causally dependent is useful in
measuring the amount of concurrency in a computation.
• All events that are not causally related can be executed concurrently.
• Thus, an analysis of the causality in a computation gives an idea of the
concurrency in the program
Physical clock synchronization : NTP

 Need of Knowing Time:


• The time of the day at which an event happened on a specific machine in the
network.
• The time interval between two events that happened on different machines in
the network.
• The relative ordering of events that happened on different machines in the
network.
Physical clock synchronization : NTP

 Need for synchronization


• In database systems, the order in which processes perform updates on a
database is important to ensure a consistent, correct view of the database.To
ensure the right ordering of events, a common notion of time between co-
operating processes becomes imperative.
• Improves the performance of distributed algorithms by replacing
communication with local computation.
• Distributed protocols use timeouts, and their performance depends on how
well physically dispersed processors are time-synchronized. Design of such
applications is simplified when clocks are synchronized.
Physical clock synchronization : NTP

 Clock synchronization
• It is the process of ensuring that physically distributed processors have a
common notion of time.
• It has a significant effect on many problems like secure systems, fault
diagnosis and recovery, scheduled operations, database systems, and real-
world clock values.
• Clocks are synchronized to an accurate real-time standard like UTC (Universal
Coordinated Time).
Physical clock synchronization : NTP
 Definitions
Ca and Cb are any two clocks.
• Time: The time of a clock in a machine p is given by the function Cp(t), where
Cp(t) = t for a perfect clock.

• Frequency : Frequency is the rate at which a clock progresses. The frequency at


time t of clock Ca is C ′ a (t).

• Offset: Clock offset is the difference between the time reported by a clock and
the real time.
 The offset of the clock Ca is given by Ca(t) − t.
 The offset of clock Ca relative to Cb at time t ≥ 0 is given by Ca(t) − Cb(t).
Physical clock synchronization : NTP

• Skew: The skew of a clock is the difference in the frequencies of the clock and
the perfect clock.
 The skew of a clock Ca relative to clock Cb at time t is (C ′ a (t) − C ′ b (t)).
 If the skew is bounded by ρ, then as per Equation 1 − ρ ≤ dC/dt ≤ 1 + ρ clock
values are allowed to diverge at a rate in the range of 1 − ρ to 1 + ρ.

• Drift (rate): The drift of clock Ca is the second derivative of the clock value with
respect to time, namely, C ′′ a (t).
 The drift of clock Ca relative to clock Cb at time t is C ′′ a (t) − C ′′ b (t).
Physical clock synchronization : NTP

 The behavior of fast, slow, and perfect clocks with respect to UTC.
Physical clock synchronization : NTP

 Offset delay estimation method using NTP


NTP : Network Time Protocol
• widely used for clock synchronization on the internet
• Calculates Offset by performing various trials and choosing the trial with
minimum delay.
• The design of NTP involves a hierarchical tree of time servers
-The primary server at the root synchronizes with the UTC.
-The next level contains secondary servers, which act as a backup to the
primary server
- At the lowest level is the synchronization subnet which has the clients.
Physical clock synchronization : NTP
 Offset delay estimation using NTP

• The clock offset θ of A relative to B and B relative to A is given by


Clock offset Ɵ = (a + b)/2 (Network delay difference : a = T1 – T3 and b = T2 – T4)
• The Roundtrip delay δ of B relative to A at time T4 are approximately given by
the following.
Roundtrip delay δ = a − b ( a = T2 – T1 and b = T4 – T3 )
Physical clock synchronization : NTP
• Each NTP message includes the latest three timestamps T1, T2 and T3, while T4
is determined upon arrival
• Both peers A and B can independently calculate delay and offset using a single
bidirectional message stream as shown below
Physical clock synchronization : NTP
A Framework for System of Logical clocks

 Definition
• A system of logical clocks consists of a time domain T and a logical clock C

• The logical clock C is a function that maps an event e in a distributed system to


an element in the time domain T, denoted as C(e) and called the timestamp of
e, and is defined as follows:
C : H → T such that the following property is satisfied:
(a) Consistency : for two events ei and ej , ei → ej =: C(ei) < C(ej )
(b) strongly consistent : for two events ei and ej , ei → ej ⇔ C(ei) < C(ej )
A Framework for System of Logical clocks
 Implementing Logical clocks
• Every process needs data structure (storing logical time) and protocol
(updation maintaining consistency)
(a) Data structure
• Each process pi maintains data structures that allow it the following two
capabilities:
• A local logical clock, denoted by lci , that helps process pi measure its own
progress.
• A logical global clock, denoted by gci , that is a representation of process pi’s
local view of the logical global time. It allows this process to assign consistent
A Framework for System of Logical clocks

(b) Protocol
The protocol consists of the following two rules:
• R1: This rule governs how the local logical clock is updated by a process when
it executes an event (send, receive, or internal).

• R2: This rule governs how a process updates its global logical clock to update
its view of the global time and global progress.

It dictates what information about the logical time is piggybacked in a


message and how this information is used by the receiving process to
update its view of the global time.
Scalar Time

• Proposed by Lamport
• The logical local clock of a process pi and its local view of the global time are
squashed into one integer variable Ci
• Rules R1 and R2 are used to update the clocks
R1: Before executing an event (send, receive, or internal), process pi executes
the following: Ci := Ci + d (d > 0)
Ci  Logical local time of a process
d  add on value which is mostly equal to 1
• Every time when rule 1 is executed local time of each event will have a
different value
Scalar Time
R2 : Applicable only at receiver end
Each message piggybacks the clock value of its sender at sending time
When a process pi receives a message with timestamp Cmsg, it executes the
following actions:
– Ci := max(Ci,Cmsg)
– Execute R1.
– Deliver the message.
Scalar Time
Properties of Scalar time
1. Consistency
scalar clocks satisfy the monotonicity and hence the consistency property:
for two events ei and ej , ei → ej =: C(ei) < C(ej ).
2. Total Ordering
case 1: all timestamps are unique
Events are ordered according to the timestamp
case 2: Timestamps are not unique and there is repetition
e11 = 4 and e41 = 4 ( Lowest process id event will be executed)
3. Event counting
h – 1 events has occurred
eg : e43 = 7 then 6 events would have occured
Scalar Time

4. No strong consistency
ei → ej ⇔ C(ei) < C(ej )
Vector Time

• Proposed by Fidge, Mattern and Schmuck

• Representation : array of non negative integers

• Size of array = number of processes (‘n’)

• The timestamp associated with an event is the value of the vector clock of its
process when the event is executed.

• Each process pi maintains a vector vti[1..n]


- vti[i] is the local logical clock of pi and describes the logical time
progress at process pi.
- vti[j] represents process pi’s latest knowledge of process pj local
time.
Vector Time
R1:
Before executing an event, process pi updates its local logical time as
follows:
vti[i] := vti[i] + d (d > 0)
• R2:
Each message m is piggybacked with the vector clock vt of the sender
process at sending time. On the receipt of such a message (m,vt), process pi
executes the following sequence of actions:
Update its global logical time as follows:
• 1 ≤ k ≤ n : vti[k] := max(vti[k], vt[k])
• – Execute R1.
• – Deliver the message m.
Vector Time

The following relations are defined to compare two vector timestamps, vh and
vk:
• vh = vk ⇔ ∀x : vh[x] = vk[x]
• vh ≤ vk ⇔ ∀x : vh[x] ≤ vk[x]
• vh < vk ⇔ vh ≤ vk and ∃x : vh[x] < vk[x]
• vh || vk ⇔ ¬ (vh < vk) ∧ ¬ (vk < vh)
Vector Time

Example of vector clocks progress with the increment value d=1. Initially, a
vector clock is [0, 0, 0, ...., 0].
Vector Time
Properties of Vector time
1. Isomorphism
If two events x and y have timestamps vh and vk, respectively, then
x → y ⇔ vh < vk
x || y ⇔ vh || vk.
2. Strongly Consistent
x → y ⇔ vh < vk
3. Event counting
• If an event e has timestamp vh, vh[j] denotes the number of events executed by
process pj that causally precede e.
• The total number of events that causally precede e in the distributed
computation.
Σvh[j] − 1
Vector Time
Size of Vector clock

Case 1: Progress of process can be tracked only if array size is ‘n’

Case 2: Whenever there is a causality between pair of events, array size will be
equal to (E,≺) and need not be ‘n’.

Applications of vector clock


used in
• distributed debugging
• implementations of causal ordering communication and causal distributed
shared memory
• establishment of global breakpoints
• in determining the consistency of checkpoints in optimistic recovery.
Vector Time

Efficient Implementations of Vector Clocks

Need:
• If the number of processes in a distributed computation is large, then vector
clocks will require piggybacking of huge amount of information in messages.

• The message overhead grows linearly with number of processors in the system
and when there are thousands of processors in the system, the message size
becomes huge even if there are only a few events occurring in few processors.

Method:
Singhal – KshemKalyani’s Differential technique
Global state
Why finding Global state is difficult?
• No shared memory between systems
• No common global clock
• Difficult to synchronize different local clocks
Global state computation
• Process state – Local memory and history of Activity
• Channel state – Message sent / received
Need of global state
• Deadlock detection
• Termination
• Failure recovery
• Debugging
Global state
Effect of state computation at different instant

Banking example:

• S1 - sites of a distributed system which maintains bank accounts A and S2 be


bank accounts A
• S2 - sites of a distributed system which maintains bank accounts A and S2 be
bank accounts B
• C12 - communication channels from site S1 to
site S2
• C21 - communication channels from site S2 to
site S1
Global state
• time t0: Initially,
Account A = $600, Account B = $200, C12 = $0, C21 = $0

• time t1: Site S1 initiates a transfer of $50 from Account A to Account B. Account
A is decremented by $50 to $550 and a request for $50 credit to Account B is
sent on Channel C12 to site S2.
Account A = $550, Account B = $200, C12 = $50, C21 = $0.

• time t2: Site S2 initiates a transfer of $80 from Account B to Account A. Account
B is decremented by $80 to $120 and a request for $80 credit to Account A is
sent on Channel C21 to site S1.
Account A = $550, Account B = $120, C12 = $50, C21 = $80.
Global state
• time t3: Site S1 receives the message for a $80 credit to Account A and
updates Account A.
Account A = $630, Account B = $120, C12 = $50, C21 = $0.
• time t4: Site S2 receives the message for a $50 credit to Account B and
updates Account B.
Account A = $630, Account B = $170, C12 = $0, C21 = $0.
System model and definitions
• The system consists of a collection of n processes, p1, p2, ..., pn, that are
connected by channels.
• There is no globally shared memory and physical global clock and processes
communicate solely by passing messages.

• Cij denote the channel from process pi to process pj and its state is denoted by
Scij

• The actions performed by a process are modeled as three types of events,


namely, internal events, message send events, and message receive events.

• For a message mij that is sent by process pi to process pj , let send(mij) and
rec(mij) denote its send and receive events, respectively
System model and definitions
• At any instant, the state of process pi, denoted by LSi, is a result of the sequence
of all the events executed by pi till that instant

• For an event e and a process state LSi, e ∈ LSi iff e belongs to the sequence of
events that have taken process pi to state LSi.

• For an event e and a process state LSi, e ∈ LSi iff e does not belong to the
sequence of events that have taken process pi to state LSi.

• For a channel Cij , the following set of messages can be defined based on the
local states of the processes pi and pj
System model and definitions
• Different snapshot algorithms have assumed different models of
communication.

• In FIFO model, each channel acts as a firstin first-out message queue and thus,
message ordering is preserved by a channel.

• In non-FIFO model, a channel acts like a set in which the sender process adds
messages and the receiver process removes messages from it in a random
order.
• A system that supports causal delivery of messages satisfies the following
property:
“ For any two messages mij and mkj , if send(mij) → send(mkj),
then rec(mij) → rec(mkj) ".
System model and definitions

Consistent Global state

• The global state of a distributed system is a collection of the local states of the
processes and the channels. Notationally, global state GS is defined as,
System model and definitions
Interpretation in terms of cuts

• A cut is a line joining an arbitrary point on each process line that slices the
space-time diagram into a PAST and a FUTURE

• Consistent cut : A consistent global state corresponds to a cut in which every


message received in the PAST of the cut has been sent in the PAST of that cut.
Such a cut is known as a consistent cut.

• For example, consider the space-time diagram for the computation illustrated in
the below Figure
System model and definitions
system model and definitions
• Cut C1 is inconsistent because message m1 is flowing from the FUTURE to the
PAST.
• Cut C2 is consistent and message m4 must be captured in the state of channel
C21.
Issues in recording a global state
The following two issues need to be addressed:

I1: How to distinguish between the messages to be recorded in the snapshot


(either in a channel state or a process state) from those not to be recorded.

From C1 - Any message that is sent by a process before recording its


snapshot, must be recorded in the global snapshot
System model and definitions

From C2 - Any message that is sent by a process after recording its snapshot,
must not be recorded in the global snapshot

I2: How to determine the instant when a process takes its snapshot.

From C2 - A process pj must record its snapshot before processing a message mij
that was sent by process pi after recording its snapshot.
Snapshot Recording Algorithm for FIFO channels
Chandy-Lamport Algorithm
• uses a control message, called a marker whose role is to separate messages in
the channel in FIFO system.

• A marker separates the messages in the channel into those to be included in


the snapshot (i.e., channel state or process state) from those not to be
recorded in the snapshot.

• After a site has recorded its snapshot, it sends a marker along all of its outgoing
channels before sending out any more messages.

• A process must record its snapshot no later than when it receives a marker on
any of its incoming channels.
Snapshot Recording Algorithm for FIFO channels
• The algorithm can be initiated by any process by executing the “Marker
Sending Rule“ by which it records its local state and sends a marker on each
outgoing channel.

• A process executes the “Marker Receiving Rule” on receiving a marker.

Case 1: If the process has not recorded its local state , it records the state of the
channel on which the marker is received as empty and executes the
“Marker Sending Rule“ to record its local state.

Case 2: If the process has recorded its local state , it records the state of the
channel as corresponding message between last snapshot and current
signal.
Snapshot Recording Algorithm for FIFO channels

• The algorithm terminates after each process receives a marker on all of its
incoming channels.

• All the local snapshots will get disseminated to all other processes and all the
processes can determine the global state.
Snapshot Recording Algorithm
Snapshot Recording Algorithm for FIFO channels
• Correctness
• To prove the correctness of the algorithm, a recorded snapshot must satisfy
conditions C1 and C2.

• Due to FIFO property of channels, it follows that no message sent after the
marker on that channel is recorded in the channel state. Thus, condition C2 is
satisfied.

• When a process pj receives message mij that precedes the marker on channel
Cij , it acts as follows:
If process pj has not taken its snapshot yet, then it includes mij in its
recorded snapshot.
Otherwise, it records mij in the state of the channel Cij . Thus, condition C1
Snapshot Recording Algorithm for FIFO channels
• Complexity

The recording part of a single instance of the algorithm requires O(e) messages
and O(d) time, where e is the number of edges in the network and d is the
diameter of the network.
Snapshot Recording Algorithm for FIFO channels
Properties of Recorded Global state
• Consider two possible executions of the snapshot algorithm
1. Markers shown using dashed-and-dotted arrows
• Let site S1 initiate the algorithm just after t1.

• Site S1 records its local state (Account A = $550) and sends a marker to site S2.
The marker is received by site S2 after t4.

• When site S2 receives the marker, it records its local state (Account B = $170),
the state of channel C12 as $0, and sends a marker along channel C21.
Snapshot Recording Algorithm for FIFO channels
• When site S1 receives this marker, it records the state of Channel C21 as $80.
The $800 amount in the system is conserved in the recorded global state,
A = $550,B = $170,C12 = $0,C21 = $80
2. Markers shown using dotted arrows
• Let site S1 initiate the algorithm just after t0 and before sending the $50 for S2.
• Site S1 records its local state (Account A = $600) and sends a marker to site S2.
• The marker is received by site S2 between t2 and t3.
• When site S2 receives the marker, it records its local state (Account B = $120),
the state of channel C12 as $0, and sends a marker along channel C21.
• When site S1 receives this marker, it records the state of Channel C21 as $80.
The $800 amount in the system is conserved in the recorded global state,
A = $600,B = $120,C12 = $0,C21 = $80
Snapshot Recording Algorithm for FIFO channels
Snapshot Recording Algorithm for FIFO channels
• The recorded global state may not correspond to any of the global states that
occurred during the computation

• This happens because a process can change its state asynchronously before the
markers it sent are received by other sites and the other sites record their states.

• The system could have passed through the recorded global states in some
equivalent executions

• The recorded global state is a valid state in an equivalent execution and if a


stable property (i.e., a property that persists such as termination or deadlock)
holds in the system before the snapshot algorithm begins, it holds in the
recorded global snapshot.

• Therefore a recorded global state is useful in detecting stable properties.


Message ordering paradigms
 Four types of ordering
1. Non FIFO ( sequence number is attached with msg for ordering)
2. FIFO
3. Causal ordering
4. Synchronous order

5. Non FIFO
• Asynchronous execution : An asynchronous execution (or A-execution) is an
execution (E,≺) for which the causality relation is a partial order.

• physical link typically delivers the messages sent on it in FIFO order due to the
physical properties of the medium
Message ordering paradigms

• On any logical link between two nodes in the system, messages may be
delivered in any order, not necessarily First-In First-Out. Such executions are
also known as non-FIFO executions.
Message ordering paradigms
2. FIFO
• Asynchronous execution : A FIFO execution is an A-execution in which:
for all (s, r) and (s′, r′) ∈ T , (s ∼ s′ and r ∼ r′ and s ≺ s′) =: r ≺ r′
Message ordering paradigms

• How logical link (Non FIFO is converted to FIFO)


The sender assigns and appends a (sequence_num, connection_id ) tuple to each
message. The receiver uses a buffer to order the incoming messages as per the
sender’s sequence numbers, and accepts only the “next” message in sequence.
Message ordering paradigms
3. Causal order
• CO execution is an A-execution in which, for all (s, r) and (s′, r′) ∈ T , (r ∼ r′ and
s ≺ s′) =:r ≺ r′
Message ordering paradigms
Delivery event : Specific message being processed only after all other messages
arrives.The time at which the specific message is processed is delivery event.

To enforce CO, message m3 should be kept pending in the local buffer after it
arrives at P1, until m1 arrives and m1 is delivered
Message ordering paradigms
OTHER DEFINITIONS
(a) causal order (CO) for implementations
If send(m1) ≺ send(m2) then for each common destination d of messages m1
and m2, deliverd(m1) ≺ deliverd(m2) must be satisfied.

(b) Message order (MO)


• A MO execution is an A-execution in which,
for all (s, r) and (s′, r′) ∈ T , s ≺ s′ =: ¬ (r′ ≺ r)

(c) Empty Interval Execution (IE)


• An execution (E,≺) is an Empty-Interval (EI) execution
if for each pair of events (s, r) ∈ T , the open interval set {x ∈ E | s ≺ x ≺
r} in the partial order is empty.
Message ordering paradigms
4. Synchronous Execution
• When all the communication between pairs of processes uses synchronous
send and receive primitives, the resulting order is the synchronous order.
• In a timing diagram, the “instantaneous” message communication can be
shown by bidirectional vertical message lines.
Message ordering paradigms
4. Synchronous Execution
• The “instantaneous communication” property of synchronous executions
requires a modified definition of the causality relation because for each (s, r) ∈
T , the send event is not causally ordered before the receive event.

• The synchronous causality relation ≪ on E is the smallest transitive relation


that satisfies the following.
S1. If x occurs before y at the same process, then x ≪ y
S2. If (s, r) ∈ T , then for all x ∈ E, [(x ≪ s ⇐:x ≪ r) and (s ≪ x ⇐:r ≪ x)]
S3. If x ≪ y and y ≪ z, then x ≪ z
Message ordering paradigms
4. Synchronous Execution
• Timestamping a synchronous execution
An execution (E,≺) is synchronous if and only if there exists a mapping from E to
T (scalar timestamps) such that
for any message M, T(s(M)) = T(r(M))
for each process Pi, if ei ≺ ei′ then T(ei) < T(ei′)
By assuming that a send event and its corresponding receive event are viewed
atomically, i.e., s(M) ≺ r(M) and r(M) ≺ s(M), it follows that for any events ei and
ej that are not the send event and the receive event of the same message, ei ≺ ej
=: T(ei) < T(ej).
Asynchronous execution with synchronous communication
(a) Intro
• When all the communication between pairs of processes is by using
synchronous send and receive primitives, the resulting order is synchronous
order
• If a program is written for an asynchronous system and is implemented into
synchronous system there are chances for deadlock to occur.
Asynchronous execution with synchronous communication
A-synchronous execution ending up in deadlock when run in synchronous
system.
Asynchronous execution with synchronous communication
(b) Executions Realizable with Synchronous Communication (RSC)
In an A-execution, the messages can be made to appear instantaneous if there
exists a linear extension of the execution, such that each send event is
immediately followed by its corresponding receive event in this linear extension.
Such an A-execution can be realized under synchronous communication
and is called a Realizable with Synchronous Communication (RSC) execution.

You might also like