Course No.
CSSE - 601
Distributed System
Course No. CSSE - 601
Distributed
Deadlock Detection
Objective
• Introduction • Deadlock Avoidance
• Resource vs. Communication • Deadlock Detection and
Deadlocks Recovery
• A Graph-Theoretic Model • Control Organizations
• Resource Allocation Graph • Centralized Deadlock-
• Wait-For Graph Detection
• Deadlock handling strategies in • Distributed Deadlock
DS Detection Algorithms
• Deadlock Prevention • Algorithms Unleashed
Perspectives
Introduction
• In distributed systems, a process can request and release resources
(local or remote) in any order, which may not be known a priority
and a process can request some resources while holding others
• If the sequence of the allocation of resources to processes is not
controlled in such environments, deadlocks can occur.
• The problem of deadlocks has been generally studied in distributed
systems under the following model:
• The systems have only reusable resources.
• Processes are allowed only exclusive access to resources.
• There is only one copy of each resource.
Introduction
• A process can be in two states:
• running or blocked
• In the running state (also called the active state), a process has all the needed
resources and is either executing or is ready for execution.
• In the blocked state, a process is waiting to acquire some resources.
• Deadlock is a situation in which a set of processes is blocked waiting for other
process in the set to release the resource
• Following conditions should hold simultaneously for deadlock to occur:
1. Mutual Exclusion 2. No preemption
3. Hold and wait 4. Circular Wait
DDD: Resource vs.
Communication Deadlocks
• Two types of deadlocks have been discussed:
• Resource deadlock
• Processes can simultaneously wait for several resources and cannot proceed until they
have acquired all those resources
• A set of processes is resource-deadlock if each process in the set requests resources held
by another process in the set and it must receive all of the requested resources before it
can become unblocked
• Communication deadlock
• Processes wait to communicate with other processes among a set of processes
• A waiting process can unblock on receiving a communication from any one of these
processes
• A set of processes is communication deadlocked if each process in the set is waiting to
communicate with another process in the set and no process in the set ever initiates any
further communication until it receives the communication for which it is waiting
DDD: A Graph-Theoretic Model
• The state of process-resource interaction in distributed systems can be
modeled by a bi-partite directed graph called a resource allocation graph.
• The nodes of this graph are processes and resources of a system, and the
edges of the graph depict assignments or pending requests.
• A pending request is represented by a request edge directed from the node
of a requesting process to the node of the requested resource.
• A resource assignment is represented by an assignment edge directed from
the node of an assigned resource to the node of the assigned process.
• A system is deadlocked if its resource allocation graph contains a directed
cycle or a knot.
DDD: Resource Allocation Graph
P1
R1 R4
P4 P2 Assignment Edge
Request Edge
R2 P3 R3
DDD: Wait-For Graph
• Wait-For Graphs:
• In distributed systems, the system state can be modeled or
represented by a directed graph, called a wait-for graph
(WFG)
• In a WFG, nodes are processes and there is a directed edge
from node P1 to node P2 if P1 is blocked and is waiting for
P2 to release some resource
• A system is deadlocked if and only if there is a directed cycle
or not (depending upon the underlying model) in the WFG
DDD: Wait-For Graph
P1 P2
P4 P3
DDD: Deadlock handling
strategies in DS
• There are three strategies to handle deadlock
• Deadlock Prevention
• Deadlock Avoidance
• Deadlock Detection and Recovery
• Deadlock handling is complicated to implement in DS
because no one site has accurate knowledge of the
current state of the system and because every inter-site
communication involves a finite and unpredictable delay
DDD: Deadlock Prevention
• Deadlock can be prevented in a DS through following approaches:
• Deadlock can be prevented in a DS through linear ordering of the resources, simple to implement with
less overhead.
• Can be achieved either by having a process acquired all the needed resources simultaneously before it
begins execution or by preempting a process that holds the needed resources.
• Can use time-stamping and priority with resource preemption.
• To control the preemption each process is assigned a unique priority.
• These values are used to decide whether a process will wait for Pj if Pi has a priority higher than Pj,
otherwise Pi is rolled back.
• It prevents deadlock because for any edge Pi-> Pj, in the wait-for-graph, Pi has a higher priority then Pj,
then a cycle can’t exist
• Low priority process will always be rolled back
• It can be avoided by the use of timestamps, two schemes can be used as following:
• The Wait-Die Scheme.
• The Wound-Wait Scheme.
DDD: Deadlock Avoidance
• A resource is granted to a process if the resulting global state is safe
( a global state includes all the processes and resources of the DS)
• Deadlock is practically impossible to implement because
• Every site has to maintain information on the global state of the system,
which translates into huge storage requirements an extensive
communication costs
• The process of checking for a safe global state must be mutually exclusive,
because if several sites concurrently perform checks for a safe state they
may all find the state safe but the net global state may not be safe
• Due to the large number of processes and resources it will be
computationally expensive to check for a safe state
DDD: Deadlock Detection and
Recovery
• Requires an examination of the status of process-
resource interaction for the presence of cyclical wait
• Two conditions exist in the DS:
• Once a cycle is formed in the WFG, it persists until its is
detected and broken, and
• Cycle detection can proceed concurrently with the normal
activities of a system
• We’ll study the techniques to detect deadlock in a DS
instead of trying to prevent or avoid it
DDD: Deadlock Detection and
Recovery
• Deadlock detection and resolution entails addressing two basic issues:
• First, detection of existing deadlocks and
• Second resolution of detected deadlocks.
• The detection of deadlocks involves two Issues:
• Maintenance of the WFG and
• Search of the WFG for the presence of cycles (or knots)
• In distributed systems, a cycle may involve several sites, so the search for
cycles greatly depends upon how the WFG of the system is represented
across the system
• Depending upon the manner in which WFG information is maintained and
the search for cycles is carried out, there are centralized, distributed, and
hierarchical algorithms for deadlock detection in distributed systems.
DDD: Deadlock Detection and
Recovery
• Deadlock resolution involves breaking existing wait-for
dependencies in the system WFG to resolve the deadlock
• It involves rolling back one or more processes that are deadlocked
and assigning their resources to blocked processes in the
deadlock so that they can resume execution
DDD: Control Organizations
• Centralized Control
• Provided with a control site responsible of constructing WFG
• However, a single point of failure, congested links, continuous message generation for deadlock
detection, are the demerits.
• Distributed Control
• Detection of a global deadlock is shared equally
• Deadlock detection is initiated only when a waiting process is suspected to be a part of deadlock
cycle.
• However, such systems are difficult to design, several sites may initiate detection for the same
deadlock, proof of correctness is difficult, deadlock resolution is cumbersome.
• Hierarchical Control
• Site detects deadlocks involving only its descendent sites
• Exploits access patterns local to a cluster of sites to efficiently detect deadlocks.
• However, the control is defeated if most deadlocks span several clusters.
Centralized Deadlock-Detection
• The Completely Centralized Algorithm:
• A designated site called the Control Site is provided which maintains the
WFG of the entire system
• It checks the WFG for the existence of deadlock cycles whenever a request
edge is added to the WFG.
• Sites request or release through Request and Release message for all
resources, whether local or remote.
• However, it is highly inefficient due to concentration of all messages.
• It imposes larger delays, large communication overhead, and the congestion
of communication links.
• Moreover, the reliability is poor due to single point of failure.
Centralized Deadlock-Detection
• The Ho-Ramamoorthy Algorithms:
• Two proposals to resolve the problems in Centralized algorithm
• The Two-Phase Algorithm
• Every site maintains a status table containing status of all the processes initiated at that site.
• A designated site requests the status table from all sites, periodically.
• Two received reports are matched and a WFG is generated based on the differences.
• The drawback is it may report a false deadlock.
• The One-Phase Algorithm
• It request one status report from each site, however, each site maintains two status table, i.e. resource &
process.
• Resource table keeps track of transactions, whereas the process table keeps track of resources locked.
• Periodically, a designated site requests both the tables from every site, constructs a WFG using the
information provided by both tables.
• If no cycle is found, then the system is not deadlocked.
Distributed Deadlock Detection
Algorithms
• All sites collectively cooperate to detect a cycle in the state graph that is
likely to be distributed over several sites of the system.
• The algorithm can be initiated whenever a process is forced to wait.
• The algorithm can be initiated either by the local site of the process or
by the site where the process waits.
• These algorithm can be divided into four classes,
• Path-pushing algorithm
• Edge-chasing algorithm
• Diffusion computation algorithm
• Global state detection algorithm
Distributed Deadlock Detection
Algorithms
Edge-Chasing Algorithm
• If Pi is locally dependent on itself then • then
declare deadlock • begin
• Else for all Pj and Pk such that Pi is locally • dependentk(i)=true;
dependent upon Pj, and Pj is waiting on Pk,
• if k=I then declare that Pi is deadlocked
and Pj and Pk are on different sites,
• else for all Pm and Pn such that Pk is locally
• Send probe(i,j,k) to the home site of Pk
dependent upon Pm, and Pm is waiting on
• On receipt of probe(i,j,k), the site takes the Pn and Pm and Pn are on different sites,
following actions:
• send probe(i,m,n) to the home sites of Pn
• If Pk is blocked, and dependentk(i) is false,
and Pk has not replied to all requests of Pj • end
Distributed Deadlock Detection
Algorithms
• Example:
P2
P3
P1
Site-1 P 4
P9
P8 P6
P5
P10
P7
Site-3 Site-2
Distributed Deadlock Detection
Algorithms
• Example:
P2
P3
P1
Site-1 P 4
P9
P8 P6
P5
P10
P7
Site-3 Site-2
Distributed Deadlock Detection
Algorithms
• Example:
,2 ) P2
(1,1
e
rob
P
P3
P1
Site-1 P 4
P9
P8 P6
P5
P10
P7
Site-3 Site-2
Distributed Deadlock Detection
Algorithms
• Example: Pr
2 ) ob
, P2 e
(1,1 (1
,2,
e 3
rob )
P
P3
P1
Site-1 P 4
P9
P8 P6
P5
P10
P7
Site-3 Site-2
Distributed Deadlock Detection
Algorithms
• Example: Pr
2 ) ob
, P2 e
(1,1 (1
,2
e ,3)
rob
P
P3
P1 Prob
e (1
,3,4 )
Site-1 P 4
P9
P8 P6
P5
P10
P7
Site-3 Site-2
Distributed Deadlock Detection
Algorithms
• Example: Pr
2 ) ob
, P2 e
(1,1 (1
,2
e ,3)
rob
P
P3
P1 Prob
e (1
,3,4 )
Site-1 Pr
ob
P 4 e
P9 (1
,4 ,5)
P8 P6
P5
P10
P7
Site-3 Site-2
Distributed Deadlock Detection
Algorithms
• Example: Pr
2 ) ob
, P2 e
(1,1 (1
,2
e ,3)
rob
P
P3
P1 Prob
e (1
,3,4 )
Site-1 Pr
ob
P 4 e
P9 (1
,4 ,5)
P8 P6
Probe (1,5,6) P5
P10
(1 ,5,7)
e
P7 Prob
Site-3 Site-2
Distributed Deadlock Detection
Algorithms
• Example: Pr
2 ) ob
, P2 e
(1,1 (1
,2
e ,3)
rob
P
P3
P1 Prob
e (1
,3,4 )
Site-1 Pr
ob
P 4 e
P9 (1
,4 ,5)
P8 P6
Probe (1,6,8) Probe (1,5,6) P5
P10
(1 ,5,7)
e
Probe (1,7,10) P7 Prob
Site-3 Site-2
Distributed Deadlock Detection
Algorithms
• Example: Pr
2 ) ob
, P2 e
(1,1 (1
,2
e ,3)
rob
P
P3
P1 Prob
e (1
,3,4 )
Site-1 Pr
ob
P 4 e
P9 (1
,4 ,5)
Prob P6
e (1 ,8,9)
P8
Probe (1,6,8) Probe (1,5,6) P5
P10
(1 ,5,7)
e
Probe (1,7,10) P7 Prob
Site-3 Site-2
Distributed Deadlock Detection
Algorithms
• Example: Pr
2 ) ob
, P2 e
(1,1 (1
,2
e ,3)
rob
P
P3
P1 Prob
1 ,9 ,1) e (1
,3,4
ob e( )
Pr
Site-1 Pr
ob
P 4 e
P9 (1
,4 ,5)
Prob P6
e (1 ,8,9)
P8
Probe (1,6,8) Probe (1,5,6) P5
P10
(1 ,5,7)
e
Probe (1,7,10) P7 Prob
Site-3 Site-2
Hierarchical Deadlock Detection
Algorithm
• In hierarchical deadlock detection algorithms, sites are arranged in a hierarchical
fashion, and a site detects deadlocks involving only its descendant sites
• Hierarchical algorithms exploit access patterns local to a cluster of sites to
efficiently detect deadlocks
• However, hierarchical deadlock detection algorithms require special care while
arranging the sites in a hierarchy
• For efficiency, most deadlocks should be localized to as few clusters as possible -
the objective of hierarchical control is defeated if most deadlocks span several
clusters
• These algorithms can be divided into two classes,
• The Menasce-Muntz Algorithm
• The Ho-Ramamoorthy Algorithm
Hierarchical Deadlock Detection
Algorithm
• The Ho-Ramamoorth Algorithm:
• Sites are grouped into several disjoint clusters.
• Periodically, a site is chosen as a central control site, which dynamically
chooses a control site for each cluster.
• The central control site requests from every control site their intercluster
transaction status information and wait-for relation.
• A control site collects status tables from all the site in its cluster and applies
the one-phase deadlock detection algorithm.
• It then sends status information and wait-for relation to the central control
site.
• Finally, the central site constructs a system WFG and searches it for cycles.
Hierarchical Deadlock Detection
Algorithm Control Site
Central Site
Control Site Control Site
Algorithms Unleashed
Perspectives
• Theory of Correctness
• A formal proof of the correctness is nontrivial
• TWF graph and deadlock cycles can form in innumerable ways and it is difficult to imagine
• Deadlock is very sensitive to the timing of requests
• Message delays are unpredictable
• Performance
• The number of messages exchanges may not be the true indicator as it varies in algorithms
• The persistence of deadlocks results in wasteful utilization of resources, therefore, the average
time a deadlock persist can be an important measure
• Other measures include, storage overhead, processing overhead, resource holding time, etc.
• Deadlock Resolution
• A deadlock is resolved by aborting at least one process and granting the released resources to
other processes.
Thanks
Email: mnaseem105@gmail.com