Chapter 3. Synchronization
Kanchan K. Doke
Computer Engineering Department, BVCOE
Outline
• Clock synchronization
▫ Physical clock
▫ Logical clock
• Election algorithm
▫ Ring algorithm
▫ Bully algorithm
• Mutual exclusion
▫ Centralized
▫ Distributed
• Deadlock
▫ Centralized
▫ Distributed
Introduction
A distributed system consists of a collection of distinct processes that are
spatially separated and run concurrently.
Sharing of resources may be cooperative or competitive.
The rules for enforcing correct interaction are implemented in the
form of synchronization.
Because the number of resources in a computing system is restricted, one
process must influence the actions of other concurrently executing
processes as it competes for resources.
The synchronization mechanisms suitable for distributed systems
are
1. Clock synchronization
▫ Physical clocks
▫ Logical clocks
2. Event ordering
3. Mutual exclusion
4. Election algorithms
Global Time
• Global Time is utilized to provide timestamps
for processes and data.
▫ Physical clock: concerned with “People” time
▫ Logical clock: concerned with relative time and
maintaining logical consistency
Clock synchronization
• Why synchronization:-
▫ An application may have processes that concurrently
run on multiple nodes of the system
▫ It enables measuring the duration of distributed
activities that start on one node and terminate on
another
• Issues :-
▫ Practically no two clocks can be perfectly synchronized
▫ Clock synchronization requires each node to read the other nodes’ clocks
▫ Time must never run backward
Physical Clocks
• There are two aspects:
▫ Obtaining an accurate value for physical time
▫ Synchronizing the physical clocks of different nodes with one another
Berkeley Algorithm
• Worked example: the time server’s own clock contributes an offset of 0,
and the two polled machines report offsets of -10 and +25, so the average
offset is (0 + (-10) + 25) / 3 = 5; the slow machine is told to advance by
15, the fast one to fall back by 20, and the server itself adjusts by +5.
a) The time server asks all the other machines for their clock values using
a polling mechanism and providing “current time”
b) The machines answer, indicating how they differ from the time sent.
c) The time server tells everyone how to adjust their clock based on a
calculation of the “average time value”.
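The averaging step above can be sketched in a few lines (the function name and data shapes are illustrative, not from any library):

```python
def berkeley_adjustments(client_offsets):
    """client_offsets: list of (node, offset_from_server) pairs,
    where the offset is how far a node's clock is ahead of the server's."""
    # The server includes itself in the average with offset 0.
    offsets = [0] + [off for _, off in client_offsets]
    avg = sum(offsets) / len(offsets)
    # Each node is told to adjust by (average - its own offset).
    return {node: avg - off for node, off in client_offsets}, avg

# Example as above: offsets 0 (server), -10 and +25 average to 5,
# so the slow node advances by 15 and the fast one falls back by 20.
adjust, avg = berkeley_adjustments([("B", -10), ("C", 25)])
```

Note that the server sends each machine its adjustment rather than an absolute time, so one-way message delay matters less.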
Event Ordering
For most applications it is not necessary to keep the clocks
synchronized; it is sufficient that all processes agree on the order in
which events occur.
Event Ordering (cont.)
Happened-Before relation
The happened-before relation (→) on a set of events
satisfies
1. If a and b are events in the same process and a
occurs before b, then a → b.
2. If a is the event of sending a message and b is the
event of receiving that same message, then a → b.
3. If a → b and b → c, then a → c (transitive law).
4. If neither a → b nor b → a holds, then a and b are
said to be concurrent.
Logical Ordering
Conditions
C1: If a and b are two events within the same process Pi
and a occurs before b, then Ci(a) < Ci(b).
C2: If a is the sending of a message by process Pi and b is
the receipt of that message by process Pj, then Ci(a) < Cj(b).
[Figure: events A, B on Process 1, C, D on Process 2, and E, F on
Process 3, with clock values ordered A < B < C < D < F, and E
concurrent with the others.]
Lamport’s Algorithm
• To implement Lamport’s logical clocks, each
process Pi maintains a local counter Ci that is
updated as follows:
Step 1. Before executing an event, Pi executes
Ci ← Ci +1
Step 2. When process Pi sends a message m to Pj , it sets m’s
timestamp ts(m) = Ci after having executed the previous step.
Step 3. Upon the receipt of a message m, process Pj adjusts its own
local counter as
Cj ← max{Cj ,ts(m)} +1
after which it then executes the first step and delivers the
message to the application.
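The three steps can be sketched as a tiny counter class (class and method names are illustrative):

```python
class LamportClock:
    """A per-process Lamport logical clock."""
    def __init__(self):
        self.c = 0

    def tick(self):
        # Step 1: increment the counter before every event.
        self.c += 1
        return self.c

    def send(self):
        # Step 2: a message carries the sender's clock as its timestamp.
        return self.tick()

    def receive(self, ts):
        # Step 3: take the max of local clock and timestamp, then tick.
        self.c = max(self.c, ts)
        return self.tick()

p1, p2 = LamportClock(), LamportClock()
m = p1.send()        # p1's clock becomes 1; the message is timestamped 1
t = p2.receive(m)    # p2's clock becomes max(0, 1) + 1 = 2
```

The `max` in `receive` is what guarantees that a receive event is always timestamped later than the corresponding send.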
Totally-Ordered Multicasting
[Figure: updating a replicated database and leaving it in an
inconsistent state; timestamped events A(1.1), B(2.1) on Process 1,
C(3.2), D(4.2) on Process 2, and E(1.3), F(5.3) on Process 3.]
Election Algorithms
• Many distributed systems need one process to act as a
coordinator to initiate actions or to resolve conflicts.
• Assumptions:
▫ We will assume that each process has a unique process number id
so we can attempt to locate the highest numbered process to act
as the coordinator.
▫ Every process knows the id numbers of all other processes.
▫ A process does not know which of the other processes are up or
down.
• Goal: When an election starts, it ends with all processes
agreeing on who the coordinator is.
Bully Algorithm
• Messages used: Election, Coordinator, OK
[Figure: The bully election algorithm among processes 1-7, with
process 7 crashed.
(a) Process 4 holds an election by sending Election to the
higher-numbered processes.
(b) Processes 5 and 6 respond with Ok, telling 4 to stop.
(c) Processes 5 and 6 each hold an election.
(d) Process 6 tells 5 to stop.
(e) Process 6 wins and tells everyone (Coordinator).]
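The outcome of an election can be computed with a small recursive sketch (a real implementation exchanges ELECTION/OK/COORDINATOR messages and uses timeouts; here the set `alive` stands in for failure detection):

```python
def bully_election(initiator, alive):
    """Return the coordinator chosen when `initiator` starts an election.
    alive: set of process ids currently up (includes the initiator)."""
    higher = [p for p in alive if p > initiator]   # these answer OK
    if not higher:
        return initiator        # nobody higher responded: initiator wins
    # Each higher live process holds its own election in turn; the
    # highest live id eventually sends COORDINATOR to everyone.
    return bully_election(max(higher), alive)

# With process 7 crashed, an election started by process 4 elects 6.
coordinator = bully_election(4, {1, 2, 3, 4, 5, 6})
```

The recursion mirrors step (c) of the figure: every process that answered OK starts its own election until no higher process is alive.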
Ring Algorithm
• The ring algorithm uses the same ring arrangement as in the token
ring mutual exclusion algorithm, but does not employ a token.
• When a process notices that the coordinator is down, it builds an
election message containing its own process ID and sends it to its
nearest neighbor in the ring.
• If the neighbor is down, the process skips over it and sends the
message to the next process in the ring. This step is repeated
until a running process is located.
• At each step, the process adds its own process ID to the list in the message
and sends the message to its living neighbor.
• Eventually, the election message comes back to the process that started it.
• The process realizes this because it gets an election message with its own
process ID at the head of the list.
▫ The list of processes in the message represents the list of all live processes in the
system.
• The process then picks either the highest or lowest process ID in the list
and sends out a message to the group informing them of the new
coordinator.
• The method for picking the leader has to be consistent so that even if
multiple election messages circulated, each process that started an election
will come to the same decision on who the coordinator is.
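The steps above can be simulated in a single-threaded sketch (illustrative; a real implementation sends the growing ID list around the ring as a message):

```python
def ring_election(initiator, ring, alive):
    """Simulate one election pass around `ring` (a list of ids in ring
    order), skipping processes that are not in `alive`."""
    n, i, ids = len(ring), ring.index(initiator), []
    while True:
        p = ring[i % n]
        if p in alive:
            if ids and p == initiator:
                break            # the message has come back around
            ids.append(p)        # p appends its id and forwards the message
        i += 1                   # dead neighbours are skipped
    return max(ids)              # consistent rule: highest live id wins

# Matches the figure: initiator 2 collects [2,3,4,5,6,0,1] and elects 6.
coordinator = ring_election(2, [0, 1, 2, 3, 4, 5, 6], {0, 1, 2, 3, 4, 5, 6})
```

Picking `max` is the consistent rule the text requires: even if several elections circulate at once, every initiator computes the same coordinator.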
[Figure: Election algorithm using a ring. Two processes initiate
elections concurrently; one election message accumulates the ID list
[2,3,4,5,6,0,1] and the other [5,6,0,1,2,3,4]. Both initiators pick the
highest ID and circulate [Coordinator, 6].]
Mutual Exclusion
• Distributed systems:
▫ Both the shared resources and the users may be
distributed, and shared memory does not exist
▫ Approaches based on message passing must be used
Performance Measures
• Synchronization delay (sd):
▫ the time between one site leaving the CS and the next site
entering it; should be minimized.
• Response time:
▫ the time interval between the transmission of a request message
and the requesting site's exit from the CS.
• E: the average CS execution time.
• System throughput:
▫ the rate at which the system executes requests for the CS; should be
maximized.
▫ system throughput = 1 / (sd + E).
Classification of Mutual Exclusion Algorithms
1. Centralized approach
• There is single coordinator which handles all the requests to access
the shared data.
• Every process asks the coordinator for permission before accessing the shared data
2. Distributed approach
1. Non-token based:
• A site/process can enter a critical section when an assertion (condition) becomes
true.
• Algorithm should ensure that the assertion will be true in only one site/process.
2. Token based:
• A unique token (also known as the PRIVILEGE message) is shared among the cooperating
sites/processes.
• Possessor of the token has access to critical section.
• Need to take care of conditions such as
•loss of token,
•crash of token holder,
•possibility of multiple tokens, etc.
Centralized approach
• One of the processes in the system is chosen to coordinate the entry
to the critical section
• A process that wants to enter its critical section sends a request
message to the coordinator
• The coordinator decides which process can enter the critical section
next, and it sends that process a reply message
• When the process receives a reply message from the coordinator, it
enters its critical section
• After exiting its critical section, the process sends a release message
to the coordinator and proceeds with its execution
• This scheme requires three messages per critical-section entry:
▫ request
▫ reply
▫ release
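The coordinator's bookkeeping can be sketched with a FIFO queue of deferred requests (class and method names are illustrative):

```python
from collections import deque

class Coordinator:
    """Grants the critical section to one process at a time."""
    def __init__(self):
        self.holder = None        # process currently in the CS
        self.queue = deque()      # deferred requests, in arrival order

    def request(self, pid):
        """Handle a request message; True means REPLY is sent at once."""
        if self.holder is None:
            self.holder = pid
            return True
        self.queue.append(pid)    # defer the reply until a release arrives
        return False

    def release(self, pid):
        """Handle a release message; returns the next process granted."""
        assert pid == self.holder
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder
```

Because the queue is FIFO, requests are granted strictly in arrival order, so no process waits forever.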
[Figure: centralized mutual exclusion with coordinator Pc and processes
P1, P2, P3. 1: P1 sends Request; 2: Pc sends Reply to P1; 3: P2 sends
Request; 4: P3 sends Request; 5: P1 sends Release; 6: Pc sends Reply to
P2; 7: P2 sends Release; 8: Pc sends Reply to P3; 9: P3 sends Release.]
• Advantages:
▫ Easy to implement
▫ Guaranteed Mutual Exclusion
▫ Requests are granted in the order they are received
▫ No process ever waits forever (no starvation)
• Drawbacks:
▫ A single point of failure: the control site (coordinator).
▫ The communication links near the control site are
likely to be congested and become a bottleneck.
Distributed approach :-
Non-token Based Algorithms
• Notations:
▫ Si: Site i
▫ Ri: Request set, containing the ids of all Si’s from
which permission must be received before
accessing CS.
▫ Compose message containing:
Identifier (machine ID, process ID)
Name of resource
Timestamp (totally-ordered Lamport)
▫ Send request to all processes in group
1. Lamport’s Algorithm
• Lamport’s Algorithm
▫ Ri = {S1, S2, …, Sn}, i.e., all sites.
▫ Request queue: maintained at each Si. Ordered by
time stamps.
▫ Assumption: message delivered in FIFO.
Lamport’s Algorithm
• Requesting CS :
▫ Site Si :Send REQUEST(tsi, i) message to all sites in its Ri.
Place REQUEST in request_queuei.
▫ Site Sj: on receiving the message, Sj sends a time-stamped REPLY
message to Si; Si’s request is placed in request_queuej.
• Executing CS:
▫ Si has received a message with time stamp larger than (tsi,i) from
all other sites.
▫ Si’s request is the top most one in request_queuei.
• Releasing CS:
▫ Exiting CS: send a time stamped RELEASE message to all sites in
its request set.
▫ Receiving RELEASE message: Sj removes Si’s request from its
queue.
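One site's state can be sketched as follows (illustrative names; FIFO delivery is assumed as above, and timestamps are (clock, site-id) pairs compared lexicographically):

```python
class LamportSite:
    """Request-queue bookkeeping for one site Si in Lamport's algorithm."""
    def __init__(self, i, n):
        self.i, self.n = i, n
        self.queue = []      # pending (timestamp, site) requests, sorted
        self.last = {}       # site -> timestamp of last message heard

    def on_request(self, ts, j):        # REQUEST(ts, j) arrives (or own)
        self.queue.append((ts, j))
        self.queue.sort()
        self.last[j] = max(self.last.get(j, -1), ts)

    def on_reply(self, ts, j):          # time-stamped REPLY from Sj
        self.last[j] = max(self.last.get(j, -1), ts)

    def on_release(self, j):            # RELEASE: drop Sj's request
        self.queue = [(t, s) for t, s in self.queue if s != j]

    def can_enter(self, my_ts):
        """Si may enter the CS when its request heads the queue and it
        has heard a later-timestamped message from every other site."""
        heads = bool(self.queue) and self.queue[0] == (my_ts, self.i)
        heard = all(self.last.get(j, -1) > my_ts
                    for j in range(self.n) if j != self.i)
        return heads and heard
```

Both conditions in `can_enter` correspond to the two "Executing CS" bullets above.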
[Example: Step 1: S1 and S2 request the CS with timestamps (2,1) and
(1,2); every site’s request queue orders (1,2) ahead of (2,1).
Step 2: S2 enters the CS. Step 3: S2 leaves the CS and sends RELEASE;
its request is removed from all queues and S1’s request (2,1) moves to
the top.]
Lamport’s Algorithm
• Performance.
▫ 3(N-1) messages per CS invocation.
1. (N - 1) REQUEST
2. (N - 1) REPLY
3. (N - 1) RELEASE messages.
▫ Synchronization delay: T (the average message delay)
2. Ricart-Agrawala Algorithm
• Distributed algorithm using reliable multicast
and logical clocks
• A process that wants to enter its critical section:
▫ Multicasts a timestamped REQUEST to all other processes
▫ Waits until everyone gives permission
▫ Enters the critical section / uses the resource
• RD[i]:- Request Deferred vector
• Use two messages:
• REQUEST
• REPLY
• Notes:
▫ When a site receives a message, it updates its clock
using the timestamp in the message.
▫ When a site takes up a request for the CS for
processing, it updates its local clock and assigns a
timestamp to the request.
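The reply-or-defer decision at the heart of the algorithm can be sketched like this (the RD vector from the notes above becomes the `deferred` list; names are illustrative):

```python
class RASite:
    """Reply/defer logic for one site in Ricart-Agrawala."""
    def __init__(self, i):
        self.i = i
        self.my_req = None    # (timestamp, id) while requesting, else None
        self.deferred = []    # RD: sites whose REPLY is withheld

    def on_request(self, ts, j):
        """True if a REPLY goes out now, False if it is deferred."""
        if self.my_req is not None and self.my_req < (ts, j):
            self.deferred.append(j)   # our own request has priority
            return False
        return True

    def exit_cs(self):
        """Leaving the CS releases all deferred replies."""
        replies, self.deferred, self.my_req = self.deferred, [], None
        return replies
```

Ties are broken by the site id inside the (timestamp, id) pair, so two sites can never both defer each other.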
Ricart-Agrawala: Example
[Example: Step 1: S1 requests with timestamp (2,1) and S2 with (1,2);
S3 is not requesting and replies to both immediately. Step 2: S1 replies
to S2 because (1,2) < (2,1), while S2 defers its reply to S1; S2 enters
the CS. Step 3: when S2 leaves the CS it sends the deferred REPLY, and
S1 enters the CS.]
Request Subsets
• Example k = 2; (N = 3).
▫ R1 = {1, 2}; R3 = {1, 3}; R2 = {2, 3}
• Example k = 3; N = 7.
▫ R1 = {1, 2, 3}; R4 = {1, 4, 5}; R6 = {1, 6, 7};
▫ R2 = {2, 4, 6}; R5 = {2, 5, 7}; R7 = {3, 4, 7};
▫ R3 = {3, 5, 6}
Protocol
Each process pi is associated with a voting set Vi
(of processes)
Guarantees safety
• Releasing CS
▫ Send RELEASE(i) to all sites in Vi
▫ Any Sj after receiving RELEASE message, send REPLY
message to the next request in queue.
▫ If queue empty, update status indicating receipt of RELEASE.
Token-based Algorithms
• Unique token circulates among the participating sites.
• A site can enter CS if it has the token.
• Token-based approaches use sequence numbers instead
of time stamps.
▫ A request for the token contains a sequence number.
▫ The sequence numbers of different sites advance independently.
▫ A site increments its sequence number each time it makes a request
for the token.
• Correctness is straightforward, since only one token is
present -> only one site can enter the CS.
• Deadlock and starvation issues must be addressed.
• Design issues:
▫ Distinguishing outdated REQUEST messages.
 Format: REQUEST(j,n) -> the jth site making its nth request.
 Each site keeps RNi[1..N], where RNi[j] is the largest sequence
number received from Sj (i.e. RNi[j] = max(RNi[j], n)).
 A message is outdated if RNi[j] > n.
• Executing the CS
▫ Site Si executes the CS when it has received the token
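The RN/LN bookkeeping can be sketched as follows (a sketch using the data structures described above; the token is modeled as the pair (LN, queue), and names are illustrative):

```python
from collections import deque

def on_request(RN, j, n):
    """Process REQUEST(j, n) at a site; True means the message is outdated."""
    outdated = RN[j] > n
    RN[j] = max(RN[j], n)
    return outdated

def on_release(RN, LN, queue, i):
    """Site Si leaves the CS while holding the token (LN, queue).
    Returns the next token holder, or None if nobody is waiting."""
    LN[i] = RN[i]                   # Si's latest request is now served
    for j in range(len(RN)):
        # a site waits iff its request number exceeds its served number
        if RN[j] == LN[j] + 1 and j not in queue:
            queue.append(j)
    return queue.popleft() if queue else None
```

Comparing RN[j] against LN[j] is what lets the token holder tell a pending request from one that was already served.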
Suzuki-Kasami: Example
Step 1: S1 has token, S3 is in queue
Site Seq. Vector RN Token Vect. LN Token Queue
S1 10, 15, 9 10, 15, 8 3
S2 10, 16, 9
S3 10, 15, 9
Step 2: S3 gets token, S2 in queue
Site Seq. Vector RN Token Vect. LN Token Queue
S1 10, 16, 9
S2 10, 16, 9
S3 10, 16, 9 10, 15, 9 2
Step 3: S2 gets token, queue empty
Site Seq. Vector RN Token Vect. LN Token Queue
S1 10, 16, 9
S2 10, 16, 9 10, 16, 9 <empty>
S3 10, 16, 9
4. Raymond’s Algorithm
• Sites are arranged in a logical directed tree. Root: token
holder. Edges: directed towards root.
• Every site has a variable holder that points to an
immediate neighbor node, on the directed path towards
root. (Root’s holder point to itself).
[Figure: sites arranged in a tree with root S1 (the token holder),
children S2 and S3, and leaves S4-S7; every site keeps a request queue
and a holder pointer directed toward the root.]
Raymond’s Algorithm
• Requesting CS
▫ If Si does not hold the token and requests the CS, it sends a
REQUEST upwards provided its request_q is empty. It then adds its
request to request_q.
▫ Non-empty request_q -> REQUEST message for top entry in q
(if not done before).
▫ Site on path to root receiving REQUEST -> propagate it up, if its
request_q is empty. Add request to request_q.
▫ Root on receiving REQUEST -> send token to the site that
forwarded the message. Set holder to that forwarding site.
▫ Any Si receiving token -> delete top entry from request_q, send
token to that site, set holder to point to it. If request_q is non-
empty now, send REQUEST message to the holder site.
• Executing CS:
▫ Si enters the CS when it receives the token and its own entry is at
the top of request_q; it deletes the top entry of request_q and enters
the CS.
Raymond’s Algorithm …
• Releasing CS
▫ If request_q is non-empty, delete top entry from q, send token
to that site, set holder to that site.
▫ If request_q is non-empty now, send REQUEST message to the
holder site.
• Performance
▫ Average messages: O(log N) as average distance between 2
nodes in the tree is O(log N).
▫ Synchronization delay: (T log N) / 2, as average distance
between 2 sites to successively execute CS is (log N) / 2.
▫ Greedy approach: Intermediate site getting the token may enter
CS instead of forwarding it down. Affects fairness, may cause
starvation.
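One site's behaviour can be sketched as a small state machine (illustrative; the actual sending of messages between sites is elided):

```python
from collections import deque

class RaymondSite:
    """holder points toward the token; request_q holds pending requesters."""
    def __init__(self, i, holder):
        self.i = i
        self.holder = holder      # equals self.i when this site is the root
        self.request_q = deque()

    def on_request(self, requester):
        """Queue a request; True means a REQUEST must go to `holder`."""
        send_up = not self.request_q and self.holder != self.i
        self.request_q.append(requester)
        return send_up

    def on_token(self):
        """Token arrives: forward it to (or keep it for) the queue head.
        Returns (new holder, whether a follow-up REQUEST is needed)."""
        nxt = self.request_q.popleft()
        self.holder = nxt
        return nxt, bool(self.request_q)
```

Note that a REQUEST is only sent upward when the queue was empty, which is what keeps the message count at O(log N).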
Raymond’s Algorithm: Example
[Figure: token movement in the tree of sites S1-S7. A REQUEST travels
up the holder pointers toward the token holder; the token then travels
down the reversed path, each site on the way updating its holder
pointer, until the requesting site becomes the new token holder.]
Comparison (ll = light load, hl = heavy load)
Algorithm       Resp. Time (ll)   Sync. Delay     Messages (ll)   Messages (hl)
Suzuki-Kasami   2T + E            T               N               N
Raymond         T log N + E       (T log N) / 2   log N           4
Deadlock: Introduction
What is a deadlock?
• Formal definition:
“A set of processes is deadlocked if each process in
the set is waiting for an event that only another
process in the set can cause.”
• Usually, the event is release of a currently held
resource
• In deadlock, none of the processes can
▫ Run
▫ Release resources
▫ Be awakened
• Communication Deadlock
▫ A process waits for certain messages before it can proceed.
▫ Example: two processes each send the other a long message, but
each socket buffer is smaller than the message, so both block on the
send and neither ever receives.
[Figure: two processes connected by sockets, each blocked because both
socket buffers are full.]
AND, OR Models
• AND Model
▫ A process/transaction can simultaneously request for multiple
resources.
▫ Remains blocked until it is granted all of the requested
resources.
• OR Model
▫ A process/transaction can simultaneously request for multiple
resources.
▫ Remains blocked until any one of the requested resources is
granted.
Deadlock modeling
• Deadlocks can be modeled using directed graphs.
• Some terminology:
▫ Directed graph: a pair (N,E), where N is a nonempty set of nodes
and E is a set of directed edges.
▫ Path: a sequence of nodes (a,b,c,…,f) connected by directed edges.
▫ Cycle: a path whose first and last nodes are the same.
▫ Reachable set of a node a: the set of nodes to which a path exists
from a.
▫ Knot: a set of nodes from which every reachable node lies within
the set; a knot contains one or more cycles.
[Figure: a directed graph on nodes a-f illustrating a path and a cycle.]
[Figure: resource-allocation graph notation: an edge from process Pi to
resource Rj means Pi requests Rj; an edge from Rj to Pi means Rj is held
by Pi. The example shows P2 requesting R3, and processes P1-P4 holding
and requesting R2 and R4.]
Wait-for graph
• Used to represent the waiting relationships between current
transactions.
• In a wait-for graph the nodes represent transactions and
the edges represent wait for relationships between
transactions.
• P1 -> P2 implies P1 is waiting for a resource from P2.
• Transaction-wait-for Graphs (TWF): WFG in databases.
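Deadlock detection on a wait-for graph reduces to cycle detection; a depth-first-search sketch (illustrative):

```python
def has_deadlock(wfg):
    """wfg maps each process to the list of processes it waits for.
    Returns True iff the wait-for graph contains a cycle."""
    visited, on_stack = set(), set()

    def dfs(p):
        visited.add(p)
        on_stack.add(p)
        for q in wfg.get(p, []):
            if q in on_stack:                # back edge: a cycle
                return True
            if q not in visited and dfs(q):
                return True
        on_stack.discard(p)
        return False

    return any(dfs(p) for p in wfg if p not in visited)

# P1 -> P2 -> P3 -> P1 is a cycle, hence a deadlock.
deadlocked = has_deadlock({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]})
```

The `on_stack` set distinguishes a true cycle from a node merely reached twice along different paths.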
[Figure: a wait-for graph over transactions P1-P5.]
▫ Drawback: needs special care while arranging the sites into a
hierarchy.
[Figure: per-site wait-for graphs for processes A, B, C and resources
R, S, T (holds/wants edges), and the combined graph obtained by merging
them.]
• Advantages:
▫ Simple and easy to implement
• Disadvantages:
▫ Single point of failure
▫ Network congestion issues
▫ False deadlock detection (phantom deadlocks)
Phantom Deadlocks
[Figure: processes P0, P1, P2 and resources R, S, T distributed over
System A and System B; out-of-date information at the detector can
produce a cycle that never actually existed, a phantom deadlock.]
▫ Terms used:
 Pj is dependent on Pk if a sequence Pj, Pi1, …, Pim, Pk exists in
which each process is blocked waiting on the next.
 Pj is locally dependent on Pk if, in addition, Pj and Pk are on the
same site.
 Each process Pi maintains an array dependenti:
dependenti(j) = true if Pi knows that Pj is dependent on it
(initially set to false for all i and j).
Chandy-Misra-Haas’s Algorithm
Sending the probe:
if Pi is locally dependent on itself, then declare deadlock.
else, for all Pj and Pk such that
(a) Pi is locally dependent upon Pj, and
(b) Pj is waiting on Pk, and
(c) Pj and Pk are on different sites, send probe(i,j,k) to the home
site of Pk.
Chandy-Misra-Haas’s Algorithm
Receiving the probe:
…….
else for all Pm and Pn such that
(a’) Pk is locally dependent upon Pm, and
(b’) Pm is waiting on Pn, and
(c’) Pm and Pn are on different sites, send probe(i,m,n)
to the home site of Pn.
end.
Performance:
• For a deadlock that spans m processes over n sites, m(n-1)/2
messages are needed.
• Size of the message 3 words.
• Delay in deadlock detection O(n).
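The probe computation can be simulated centrally to see when a probe returns to its initiator (a sketch; a real deployment sends each probe(i, j, k) as a message to Pk's home site, and site boundaries are elided here for brevity):

```python
def cmh_detect(init, waits_for):
    """Simulate Chandy-Misra-Haas probes started by `init` (AND model).
    waits_for maps each process to the processes it is blocked on.
    Returns True iff a probe (init, j, init) comes back to the initiator."""
    sent = set()
    frontier = [(init, init, k) for k in waits_for.get(init, [])]
    while frontier:
        i, j, k = frontier.pop()
        if k == i:
            return True                  # probe returned: deadlock
        for n in waits_for.get(k, []):   # Pk forwards probe(i, k, n)
            if (k, n) not in sent:
                sent.add((k, n))
                frontier.append((i, k, n))
    return False
```

The `sent` set plays the role of the dependent array: each (sender, target) probe is forwarded at most once, which bounds the message count.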
Chandy-Misra-Haas’s Algorithm: Example
[Figure: P1 initiates deadlock detection. Site 1 holds P1 and P2,
Site 2 holds P3, P4, P5, and Site 3 holds P6 and P7. Probes (1,2,3)
and (1,2,4) go from Site 1 to Site 2, probe (1,5,6) goes from Site 2
to Site 3, and probe (1,7,1) returns from Site 3 to P1’s site,
signalling a deadlock.]