Distributed vs Parallel Primitives in System Algorithms
Distributed algorithms for core system primitives (mutual exclusion, deadlock detection, consensus) must handle the absence of shared memory and global clocks, using only message passing. This imposes fundamental trade-offs: advantages include strong consistency and fault tolerance across nodes; challenges include unpredictable network delays, partial knowledge of system state, and communication overhead [1][2]. In parallel (shared-memory) systems, primitives rely on hardware synchronization (locks, atomic operations), which gives easier global control but confines the algorithm to one machine. We discuss each primitive point by point, citing theoretical foundations and real-world uses.
Mutual Exclusion
Distributed mutual exclusion ensures that only one node enters its critical section at a time across a network [3].
Distributed Algorithms (Mutual Exclusion)
• Advantages: Provides global exclusive access with no single point of failure (e.g. token-based schemes ensure safety via possession of a unique token [1]). Fairness can be achieved via logical timestamps or rotating tokens. Algorithms can adapt to dynamic networks.
• Challenges: With no shared memory or clock, processes must coordinate via message passing and logical clocks [1]. This causes high message complexity (e.g. Lamport’s or Ricart–Agrawala’s algorithms need O(N) messages per entry; see the sketch after this list) and exposes the algorithm to unpredictable network delays. Ensuring liveness (no deadlock or starvation) is hard when nodes fail or the token is lost [2]. Designing an algorithm that works with incomplete system state (no global view) is complex [1].
• Applications: Used in distributed databases and file systems to serialize access. For example, updates to a distributed directory or lock manager must be atomic [3]. Cloud services (e.g. Google’s Chubby lock service) use leader-based mutual exclusion for coordination [4].
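
To make the message pattern concrete, here is a minimal Python sketch of the Ricart–Agrawala request logic under simplifying assumptions: the network transport and failure handling are omitted, and `send_reply` is a hypothetical hook standing in for real messaging.

```python
# Minimal sketch of Ricart–Agrawala request handling (illustration only).
from dataclasses import dataclass, field

@dataclass
class RANode:
    node_id: int
    clock: int = 0                 # Lamport logical clock
    requesting: bool = False       # currently requesting the critical section?
    my_request: tuple = None       # (timestamp, node_id) of our outstanding request
    deferred: list = field(default_factory=list)  # replies withheld until we exit

    def request_cs(self):
        """Broadcast a timestamped request; enter CS once all N-1 replies arrive."""
        self.clock += 1
        self.requesting = True
        self.my_request = (self.clock, self.node_id)
        # broadcast REQUEST(self.my_request) to all other nodes (not shown)

    def on_request(self, ts, sender_id):
        """Reply immediately unless our own pending request has priority."""
        self.clock = max(self.clock, ts) + 1
        if self.requesting and self.my_request < (ts, sender_id):
            self.deferred.append(sender_id)   # we go first; defer the reply
        else:
            self.send_reply(sender_id)        # requester has priority (or we're idle)

    def release_cs(self):
        """On exit, release all deferred replies."""
        self.requesting = False
        for peer in self.deferred:
            self.send_reply(peer)
        self.deferred.clear()

    def send_reply(self, peer):
        pass  # placeholder: real code would send a REPLY message over the network
```

The tuple comparison on (timestamp, node_id) gives a total order on requests, which is what makes the priority rule deterministic even when timestamps tie.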
Parallel Algorithms (Mutual Exclusion)
• Advantages: Shared memory allows fast locks and semaphores. Hardware support (atomic instructions) and OS scheduling simplify exclusion with no message-passing overhead. Performance is high when contention is low.
• Challenges: Limited to one machine’s cores; locks can cause contention and priority inversion. Deadlocks and livelocks can occur if locks are acquired in inconsistent order or otherwise misused.
• Applications: Multicore programming uses mutexes/spinlocks to protect shared data such as shared counters (a minimal example follows this list). Classic parallel constructs (e.g. barrier synchronization) are built on mutual exclusion primitives.
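
As a standard illustration, the following Python snippet protects a shared counter with a mutex; without the lock, the concurrent read-modify-write updates could interleave and lose increments.

```python
# Minimal sketch: protecting a shared counter with a mutex in Python.
import threading

counter = 0
lock = threading.Lock()

def worker(n_increments):
    global counter
    for _ in range(n_increments):
        with lock:            # critical section: only one thread mutates at a time
            counter += 1

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # always 400000; without the lock, updates could be lost
```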
Figure: Paxos consensus roles (Proposers, Acceptors, Learners) in a distributed system. Consensus algorithms such as Paxos and Raft build on these interactions [5][4].
Deadlock Detection
Deadlock occurs when processes cyclically wait for each other’s resources [2].
Distributed Algorithms (Deadlock Detection)
• Advantages: Can detect and resolve deadlocks that span multiple machines. Edge-chasing algorithms (e.g. Chandy–Misra–Haas) explicitly find global cycles by circulating probe messages [6]. Hierarchical or hybrid schemes can localize detection to sub-networks.
• Challenges: The lack of a global view means no single node knows all wait dependencies [2]. Detecting deadlocks requires significant communication (processes exchange state and probes) [7], which incurs high overhead and latency, especially as the system grows (scalability issues [8]). Algorithms must avoid reporting “false” deadlocks when their information is stale or partial. Message complexity can be high (in the AND model, detection can use O(mn) messages for m processes and n nodes [6]; a probe-forwarding sketch follows this list). Implementing these algorithms under failures and imperfect failure detection is also difficult.
• Applications: Distributed databases and transaction managers use deadlock detection to abort or
roll back cycles in two-phase locking. Distributed operating systems detect global resource waits.
Large-scale workflow systems or grid schedulers employ deadlock detection to maintain liveness.
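
The following Python sketch illustrates the probe-forwarding idea behind Chandy–Misra–Haas in the AND model. It is illustrative only: wait-for edges and message delivery are simulated in-process, whereas a real system would send probe(initiator, sender, receiver) messages between sites.

```python
# Minimal sketch of Chandy–Misra–Haas edge-chasing (AND model), illustration only.
def detect_deadlock(initiator, wait_for):
    """Return True if `initiator` is on a cycle in the wait-for relation.

    wait_for maps each process to the set of processes it is blocked on.
    A probe (initiator, sender, receiver) is forwarded along wait-for edges;
    if a probe ever arrives back at its initiator, a cycle (deadlock) exists.
    """
    probes = [(initiator, initiator, blocked_on)
              for blocked_on in wait_for.get(initiator, ())]
    seen = set()  # avoid re-sending the same probe (bounds the message count)
    while probes:
        init, sender, receiver = probes.pop()
        if receiver == init:
            return True  # probe returned to its initiator: deadlock
        if (sender, receiver) in seen:
            continue
        seen.add((sender, receiver))
        # the blocked receiver forwards the probe to everyone it waits for
        probes.extend((init, receiver, nxt) for nxt in wait_for.get(receiver, ()))
    return False

# Example: P1 -> P2 -> P3 -> P1 forms a cycle spanning three sites.
print(detect_deadlock(1, {1: {2}, 2: {3}, 3: {1}}))  # True
print(detect_deadlock(1, {1: {2}, 2: {3}}))          # False: P3 is not blocked
```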
Parallel Algorithms (Deadlock Detection)
• Advantages: In a single shared-memory system, the OS or runtime has a complete system view. Deadlocks among threads can be detected via a global wait-for graph or avoided via lock ordering. Communication overhead is minimal (in-process).
• Challenges: Detection still requires tracking resource ownership; heavyweight schemes (like the Banker’s algorithm) are rarely used at runtime. Prevention (e.g. lock hierarchies) is more common in practice.
• Applications: Operating systems detect thread deadlocks (e.g. via wait-for graphs for single-instance resources; a DFS-based sketch follows this list). Multithreaded programs use lock ordering or runtime checks to avoid deadlock.
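
Because a single-node runtime can see every thread’s wait-for edge, detection reduces to a cycle search. Here is a minimal DFS-based sketch in Python, assuming a snapshot of the wait-for graph is available:

```python
# Minimal sketch: single-node deadlock detection via DFS over a complete
# wait-for graph (possible here because the runtime sees all threads).
def find_cycle(wait_for):
    """Return a deadlocked cycle as a list of threads, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {t: WHITE for t in wait_for}
    path = []

    def dfs(t):
        color[t] = GRAY
        path.append(t)
        for nxt in wait_for.get(t, ()):
            if color.get(nxt, WHITE) == GRAY:        # back edge: cycle found
                return path[path.index(nxt):]
            if color.get(nxt, WHITE) == WHITE:
                cycle = dfs(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[t] = BLACK
        return None

    for t in list(wait_for):
        if color[t] == WHITE:
            cycle = dfs(t)
            if cycle:
                return cycle
    return None

print(find_cycle({"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}))  # ['T1', 'T2', 'T3']
print(find_cycle({"T1": ["T2"], "T2": []}))                    # None
```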
Agreement (Consensus)
Consensus algorithms guarantee that all non-faulty nodes agree on a single value despite failures [5].
Distributed Algorithms (Consensus)
• Advantages: Provides strong consistency for critical decisions. Essential for state machine replication: all replicas process the same commands in the same order. Fault tolerance is built in (algorithms tolerate some crashes and still agree on values) [5]. Ensures data consistency in replicated databases, even under node failures.
• Challenges: The FLP impossibility result shows that no deterministic algorithm can guarantee consensus in a fully asynchronous network if even one process may crash [9]. Practical algorithms (Paxos, Raft) circumvent this with leader election, timeouts, or randomization, at the cost of potential delays or unavailability during network trouble. Multiple communication rounds (prepare/accept) add latency and message cost (a simplified proposer sketch follows this list). Network partitions force a trade-off between consistency and availability (the CAP theorem). Designing for both safety (agreement) and liveness (eventual decision) under real-world constraints is non-trivial [9][4].
• Applications: Distributed databases and coordination services. For example, Google’s Chubby lock service and the Spanner database use Paxos for consistent replication [4]. NoSQL stores (Cassandra, DynamoDB) use Paxos or Paxos-inspired protocols for lightweight transactions [10]. Blockchains rely on consensus (Proof-of-Work or Proof-of-Stake) to agree on ledger state. Cluster managers (e.g. Kubernetes) use consensus to elect leaders and maintain state.
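
As an illustration of the two-phase structure, here is a heavily simplified single-decree Paxos sketch in Python. Acceptors are simulated in-process and failures are ignored; in a real deployment, prepare/accept would be RPCs and a rejected proposer would retry with a higher ballot.

```python
# Simplified single-decree Paxos proposer and acceptors, illustration only.
class Acceptor:
    def __init__(self):
        self.promised = -1          # highest ballot promised
        self.accepted = None        # (ballot, value) last accepted, if any

    def prepare(self, ballot):
        """Phase 1b: promise not to accept lower ballots."""
        if ballot > self.promised:
            self.promised = ballot
            return ("promise", self.accepted)
        return ("nack", None)

    def accept(self, ballot, value):
        """Phase 2b: accept unless a higher ballot was promised meanwhile."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return "accepted"
        return "nack"

def propose(acceptors, ballot, value):
    """Run both phases against a majority; return the chosen value or None."""
    majority = len(acceptors) // 2 + 1
    promises = [a.prepare(ballot) for a in acceptors]
    granted = [prev for kind, prev in promises if kind == "promise"]
    if len(granted) < majority:
        return None                 # retry later with a higher ballot
    # Paxos rule: adopt the value of the highest-ballot acceptance seen, if any
    prior = [p for p in granted if p is not None]
    if prior:
        value = max(prior)[1]
    acks = [a.accept(ballot, value) for a in acceptors]
    return value if acks.count("accepted") >= majority else None

acceptors = [Acceptor() for _ in range(5)]
print(propose(acceptors, ballot=1, value="commit-tx"))  # 'commit-tx'
```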
Parallel Algorithms (Consensus)
• Advantages: Shared memory with hardware atomics makes consensus straightforward in practice: threads can agree by writing to a common variable or by using a barrier. No multi-round messaging is needed.
• Challenges: Essentially a non-issue: consensus reduces to atomic memory operations (for example, all threads set a flag or use a barrier to synchronize).
• Applications: Parallel computing uses barrier synchronization (all threads reach a point) or atomic operations (e.g. compare-and-swap on a shared flag) to achieve consensus or coordinated decisions among threads (a short example follows this list).
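
A minimal sketch of both mechanisms in Python follows. CPython exposes no raw compare-and-swap, so a lock emulates the atomic “decide if undecided” step that hardware CAS would provide; the barrier then ensures every thread observes the decision.

```python
# Minimal sketch: shared-memory "consensus" via first-write-wins, illustration only.
import threading

decided = None
decide_lock = threading.Lock()
barrier = threading.Barrier(4)   # all threads proceed once everyone arrives

def participant(proposal):
    global decided
    with decide_lock:            # atomically: if undecided, decide my proposal
        if decided is None:
            decided = proposal
    barrier.wait()               # synchronize: after this, all threads agree
    print(f"proposed {proposal}, agreed on {decided}")

threads = [threading.Thread(target=participant, args=(v,)) for v in "ABCD"]
for t in threads:
    t.start()
for t in threads:
    t.join()
```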
Summary Table: Distributed vs Parallel Algorithms
| Primitive | Algorithm Type | Advantages | Challenges | Example Applications |
|---|---|---|---|---|
| Mutual Exclusion | Distributed | Enforces a single writer at any time; fault-tolerant token or quorum schemes [1]; fairness via timestamps | No shared memory/clock, so heavy messaging [1]; complex recovery (lost token, failed coordinator); needs logical clocks and asynchrony management | Distributed databases (global locks); distributed filesystems; leader election services |
| Mutual Exclusion | Parallel | Fast hardware locks/atomics; OS-enforced exclusivity; low overhead within one machine | Limited to one node; lock contention; potential deadlocks if misused | Multithreaded programs; kernel/OS mutexes; spinlocks for shared data |
| Deadlock Detection | Distributed | Can identify global cycles via probe-based algorithms [6]; supports cyclic-wait resolution across machines | Lack of global state view [2]; high communication overhead and latency [7]; complexity grows with system size [8]; false positives | Distributed transaction managers; grid computing resource schedulers; distributed OS resource managers |
| Deadlock Detection | Parallel | Global OS visibility (shared memory); simpler graph-based detection for single-instance resources | Tracking multiple resources still required; usually avoided by design or OS checks | OS thread deadlock detection; parallel libraries ensuring lock ordering |
| Consensus | Distributed | Ensures agreement even under failures [5]; foundational for replication (safety) | FLP impossibility, so leaders/timeouts needed [9]; multiple phases incur latency; network partitions complicate progress | Distributed databases (e.g. Spanner, Cassandra) [4][10]; coordination services (ZooKeeper/Chubby) |
| Consensus | Parallel | Trivial via shared memory (atomic write or barrier); very low latency | Essentially no special challenge beyond basic synchronization | Multithreaded consensus (barriers); GPU warp vote synchronization |
Alignment with Course Outcomes
• CO2 – Algorithm Selection & Application: This survey highlights when to use distributed vs. parallel approaches. For example, distributed mutual exclusion algorithms are chosen for multi-node systems with no shared memory [1], whereas parallel mutual exclusion (mutexes, spinlocks) suffices on a single multicore machine. Similarly, consensus algorithms like Paxos/Raft are applied in networked databases [4], while in shared-memory parallel code a simple atomic flag or barrier does the job. Understanding these contexts enables correct algorithm choice.
• CO3 – Synchronization Mechanisms: The discussion emphasizes proper synchronization: distributed primitives rely on message-based locks, tokens, and logical clocks [1][3], whereas parallel primitives use hardware locks and barriers. For example, Lamport’s and Ricart–Agrawala’s algorithms use message passing with FIFO channels and timestamps [1][3]. Understanding these mechanisms is crucial for implementing safe coordination.
• CO4 – Design under Constraints: We analyze the key constraints (asynchrony, failures, message costs) affecting design. For instance, distributed mutual exclusion must tolerate unpredictable delays and node failures [1], and consensus algorithms must respect the FLP limit [9]. Performance implications (message complexity, latency) are highlighted (e.g. Chandy–Misra–Haas uses O(m·n) messages [6]). These insights help in designing distributed systems that meet correctness, performance, and reliability requirements.
Summary: Distributed algorithms for mutual exclusion, deadlock detection, and consensus provide critical synchronization across networked nodes, but face the challenges of no shared state, unpredictable delays, and failure handling. Parallel algorithms (in shared memory) are simpler but limited in scope. The table above and the discussion outline the advantages, trade-offs, and real-world uses of each approach, equipping students to choose and apply the right synchronization primitive under given design constraints.
Sources: Authoritative texts and articles were used to extract algorithm properties and use cases [1][2][5][6][4], ensuring up-to-date and accurate coverage.
[1] Chapter 9: Distributed Mutual Exclusion Algorithms. https://www.cs.uic.edu/~ajayk/Chapter9.pdf
[2][7][8] Deadlock detection in Distributed systems, GeeksforGeeks. https://www.geeksforgeeks.org/deadlock-detection-in-distributed-systems/
[3] Mutual exclusion in distributed system, GeeksforGeeks. https://www.geeksforgeeks.org/mutual-exclusion-in-distributed-system/
[4][5][9][10] Paxos (computer science), Wikipedia. https://en.wikipedia.org/wiki/Paxos_(computer_science)
[6] Chandy–Misra–Haas algorithm resource model, Wikipedia. https://en.wikipedia.org/wiki/Chandy%E2%80%93Misra%E2%80%93Haas_algorithm_resource_model