Distributed Shared Memory
Distributed shared memory
The DSM paradigm provides processes with a shared address
space.
Primitives for shared memory:
1. Read(address)
2. Write(address, data)
The shared-memory paradigm gives the system the illusion of
physically shared memory.
DSM refers to shared memory paradigm applied to
loosely coupled distributed memory systems.
Cont….
Shared memory exists only virtually.
Similar concept to virtual memory.
DSM is also known as DSVM (distributed shared virtual memory).
DSM provides a virtual address space shared among
processes on loosely coupled processors.
DSM is basically an abstraction that integrates the local
memory of different machines into a single logical entity
shared by cooperating processes.
DSM Architecture
Each node of the system consists of one or more CPUs
and a memory unit.
Nodes are connected by a high-speed communication
network.
A simple message-passing system allows nodes to exchange
information.
Main memory of individual nodes is used to cache pieces
of shared memory space.
Reduces network latency
Cont….
A memory-mapping manager routine maps local memory
onto the shared virtual memory space.
Shared memory space is partitioned into blocks.
Data caching is used in DSM system to reduce network
latency.
The basic unit of caching is a memory block.
Cont….
If the data is not available in local memory, a network
block fault is generated.
On a network block fault:
The missing block is migrated from the remote node to the
client process’s node and OS maps it into the application’s
address space.
Data blocks keep migrating from one node to another on
demand but no communication is visible to the user processes.
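The fault-and-migrate behavior described above can be sketched as follows. This is an illustrative toy model, not a real DSM implementation; the `Node` class, `directory`, and method names are all hypothetical.

```python
# Sketch of block migration on a network block fault (hypothetical names).

class Node:
    def __init__(self, name, owned_blocks):
        self.name = name
        self.memory = dict(owned_blocks)  # block_id -> data held locally

    def read(self, block_id, directory):
        """Return data for block_id, migrating the block on a fault."""
        if block_id not in self.memory:           # network block fault
            owner = directory[block_id]           # locate the current owner
            self.memory[block_id] = owner.memory.pop(block_id)  # migrate block
            directory[block_id] = self            # ownership moves with the block
        return self.memory[block_id]

a = Node("A", {1: "alpha"})
b = Node("B", {2: "beta"})
directory = {1: a, 2: b}

print(b.read(1, directory))  # fault on B: block 1 migrates from A to B
print(1 in a.memory)         # False: A no longer holds block 1
```

The migration itself stays hidden from the application: the process on node B simply issues a read and gets the data.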
Design and implementation issues
1. Granularity
2. Structure of Shared memory
3. Memory coherence and access synchronization
4. Data location and access
5. Replacement strategy
6. Thrashing
7. Heterogeneity
Granularity
Refers to the block size of the DSM system:
the unit of sharing and the unit of data transfer across the
network when a network block fault occurs.
Possible units are a few words, a page, or a few pages.
Memory Coherence & Access Synchronization
Replicated and shared data items may
simultaneously be available in the main memories of
a number of nodes.
Memory Coherence Problem
Deals with the consistency of a piece of shared data lying
in the main memories of two or more nodes.
Replacement strategy
If the local memory of a node is full, a cache miss at that
node implies not only a fetch of the accessed data block from
a remote node but also a replacement:
an existing data block must be evicted to make room for the new one.
Thrashing
Data blocks migrate between nodes on demand.
Therefore, if two nodes compete for write access to a
single data item, the corresponding data block may be
transferred back and forth in quick succession.
Granularity
Most visible parameter in the design of DSM system is
block size.
Sending a large packet of data is not much more expensive
than sending a small one, which favors larger block sizes.
Cont…
Factors influencing block size selection:
1. Paging overhead
2. Directory size
3. Thrashing
4. False sharing
Paging overhead
A process is likely to access a large region of its
shared address space in a small amount of time.
The paging overhead is less for a large block size than
for a small block size.
Directory size
The larger the block size, the smaller the
directory.
Result: reduced directory-management overhead
for larger block sizes.
Thrashing
The problem of thrashing may occur:
when data items in the same data block are
updated by multiple nodes at the same time.
It is possible with any block size, but more likely with
larger block sizes.
False sharing
Occurs when two different processes access
two unrelated variables that reside in the
same data block.
The larger the block size, the higher the probability of
false sharing.
False sharing of a block may lead to a thrashing problem.
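A small simulation illustrates the effect. This sketch merely counts how often a block would have to migrate under an alternating access trace; the trace, block placement, and function names are all illustrative, not measurements of a real system.

```python
# Sketch: why false sharing causes block ping-pong (illustrative only).

def count_transfers(accesses, block_of):
    """Count block migrations for an access trace.
    accesses: list of (node, variable); block_of: variable -> block id."""
    holder = {}        # block id -> node currently holding the block
    transfers = 0
    for node, var in accesses:
        blk = block_of[var]
        if holder.get(blk) not in (None, node):
            transfers += 1          # block must move to the accessing node
        holder[blk] = node
    return transfers

# Node 1 only ever touches x, node 2 only ever touches y.
trace = [(1, "x"), (2, "y")] * 4

shared = count_transfers(trace, {"x": 0, "y": 0})    # same block: ping-pong
separate = count_transfers(trace, {"x": 0, "y": 1})  # different blocks
print(shared, separate)  # 7 0
```

Even though the two variables are unrelated, placing them in one block forces a transfer on almost every access.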
Possible Solutions
Using page size as block size
The relative advantages and disadvantages of small
and large block sizes make it difficult for a DSM
designer to decide on a proper block size.
On an Intel Core 2 Duo 64-bit system, the page size
is 4 KB, which is typical of desktop
PC architectures.
Linux kernel 2.6.x versions also support large pages
(4 MB).
Advantages of using page size as
block size
Use of existing page-fault schemes to trigger a DSM page
fault.
Allows access-right control.
Page size does not impose undue communication overhead
at the time of a network page fault.
Page size is a suitable data-entity unit with respect to
memory contention.
Structure of Shared-Memory Space
Structure defines the abstract view of the shared
memory space for application programmers.
The structure and granularity of a DSM system are
closely related.
Consistency Models
Consistency requirements vary from application to
application.
A consistency model refers to the degree of consistency
that has to be maintained for the shared-memory data.
If a system supports a stronger consistency model,
weaker consistency model is automatically supported
but the converse is not true.
Types of Consistency Models
1. Strict Consistency model
2. Sequential Consistency model
3. Causal consistency model
4. Pipelined Random Access Memory consistency
model (PRAM)
5. Processor Consistency model
6. Weak consistency model
7. Release consistency model
Strict Consistency Model
The strongest form with most stringent consistency
requirement.
Value returned by a read operation on a memory
address is always the same as the value written by
the most recent write operation to that address.
All writes instantaneously become visible to all
processes.
Implementation requires the existence of an absolute
global time to synchronize clocks of all nodes.
Practically impossible.
Sequential Consistency Model
Proposed by Lamport in 1979.
All processes see the same order of all memory access
operations on the shared memory.
The exact order in which the access operations are
interleaved does not matter.
Example :
Operations performed in order
1. Read(r1)
2. write(w1)
3. Read(r2)
Cont…
The only acceptable ordering
for a strictly consistent memory is
(r1, w1, r2)
For the sequential consistency model
any of the orderings (r1,w1,r2), (r1,r2,w1), (w1,r1,r2),
(w1,r2,r1), (r2,r1,w1), (r2,w1,r1) is correct,
provided all processes see the same ordering.
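The six orderings listed above are exactly the 3! permutations of the three operations; a short sketch enumerates them. The operation labels r1, w1, r2 are taken from the example; the code itself is illustrative.

```python
# Enumerate all interleavings of the example's three operations.
from itertools import permutations

ops = ("r1", "w1", "r2")
orderings = list(permutations(ops))
print(len(orderings))                   # 6: every interleaving is a candidate
print(("r1", "w1", "r2") in orderings)  # True: the strictly consistent order
```

Sequential consistency accepts any one of these six, as long as every process observes the same one; strict consistency accepts only the real-time order (r1, w1, r2).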
Cont…
The consistency requirement of the sequential
consistency model is weaker than that of the strict
consistency model.
A sequentially consistent memory provides one-copy /
single-copy semantics.
Acceptable to most applications.
Causal Consistency Model
Proposed by Hutto and Ahamad in 1990.
All processes see memory reference operations that are
potentially causally related (PCR) in the same correct
order.
Operations that are not causally related may be seen by
different processes in different orders.
Requires constructing and maintaining dependency graphs
for memory access operations.
Pipelined Random Access Memory
(PRAM) Consistency model
Proposed by Lipton and Sandberg in 1988.
Provides weaker consistency semantics.
Ensures that all write operations performed by a single
process are seen by all other processes in the order in
which they were performed, as if all the write operations
of a single process were in a pipeline.
PRAM
P1 – W11 & W12
P2 – W21 & W22
P3 can see these as ((W11,W12), (W21,W22))
P4 can see these as ((W21,W22), (W11,W12))
Advantages:
Simple and easy to implement.
Good performance
Limitation:
All processes may not agree on the same order of memory
reference operations.
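The example above can be checked mechanically: a view is PRAM-consistent if each process's own writes appear in their issue order, even though different observers may disagree on the overall interleaving. The checker below is an illustrative sketch; its names (`pram_valid`, `issued`) are hypothetical.

```python
# Sketch: check whether an observed order is PRAM-consistent.

def pram_valid(observed, per_process_writes):
    """True if 'observed' preserves each process's own write order."""
    for writes in per_process_writes.values():
        positions = [observed.index(w) for w in writes]
        if positions != sorted(positions):
            return False
    return True

issued = {"P1": ["W11", "W12"], "P2": ["W21", "W22"]}

p3_view = ["W11", "W12", "W21", "W22"]
p4_view = ["W21", "W22", "W11", "W12"]

print(pram_valid(p3_view, issued))  # True
print(pram_valid(p4_view, issued))  # True: PRAM permits the disagreement
print(pram_valid(["W12", "W11", "W21", "W22"], issued))  # False: P1's order broken
```

Both P3's and P4's views are valid under PRAM, which is exactly the limitation stated above: the processes need not agree on a single global order.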
Processor Consistency model
Proposed by Goodman in 1989.
Very similar to the PRAM model, with the additional
restriction of memory coherence.
Memory coherence:
For any memory location, all processes agree on the same order
of all write operations to that location.
Processor consistency ensures that all write operations
performed on the same location are seen by all
processes in the same order.
Weak Consistency Model
Many applications do not require that the change made
by every write operation be immediately visible to other
processes.
Idea of weak consistency:
Better performance can be achieved if consistency
is enforced on a group of memory reference
operations rather than on individual memory
reference operations.
Uses a special variable called a
synchronization variable to synchronize
memory.
Cont…
Requirements:
1. All accesses to synchronization variables must obey
sequential consistency semantics.
2. All previous write operations must be completed
everywhere before an access to a synchronization
variable is allowed.
3. All previous accesses to synchronization variables must
be completed before access to a non-synchronization
variable is allowed.
The model offers better performance at the cost of putting an
extra burden on the programmer.
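The idea of deferring visibility until a synchronization point can be sketched as follows. This is a toy single-process model, not a real DSM: the class and method names are hypothetical, and `shared` stands in for what other processes can see.

```python
# Sketch of weak consistency: writes are buffered locally and only become
# visible at an access to the synchronization variable (hypothetical names).

class WeaklyConsistentStore:
    def __init__(self):
        self.shared = {}    # globally visible memory (other processes' view)
        self.pending = {}   # local writes not yet propagated

    def write(self, addr, value):
        self.pending[addr] = value      # local write, not yet visible

    def read_shared(self, addr):
        return self.shared.get(addr)    # what another process would read

    def synchronize(self):
        """Access the synchronization variable: complete all prior writes."""
        self.shared.update(self.pending)
        self.pending.clear()

m = WeaklyConsistentStore()
m.write("x", 1)
print(m.read_shared("x"))  # None: the write is not yet visible
m.synchronize()
print(m.read_shared("x"))  # 1: visible after synchronization
```

Batching many writes behind one synchronization access is exactly where the performance gain of the model comes from.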
Release consistency model
Enhancement of weak consistency model.
Use of two synchronization variables:
Acquire
Release
Cont…
Acquire
Used to tell the system that a process is
entering a critical section.
Results in propagating changes made by other
nodes to the process's node.
Release
Used to tell the system that a process has just
exited a critical section.
Results in propagating changes made by the
process to other nodes.
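The acquire/release pair can be sketched as a pull-then-push protocol. This is an illustrative toy model; `home` stands in for the other nodes' view of memory, and all names are hypothetical.

```python
# Sketch of release consistency: acquire pulls in remote changes,
# release pushes local changes out (hypothetical names).

class ReleaseConsistentNode:
    def __init__(self, home):
        self.home = home      # shared "master" copy held elsewhere
        self.local = {}
        self.dirty = {}

    def acquire(self):
        """Enter critical section: propagate others' changes inward."""
        self.local = dict(self.home)

    def write(self, addr, value):
        self.local[addr] = value
        self.dirty[addr] = value

    def release(self):
        """Exit critical section: propagate own changes outward."""
        self.home.update(self.dirty)
        self.dirty.clear()

home = {"x": 0}
n = ReleaseConsistentNode(home)
n.acquire()
n.write("x", 42)
print(home["x"])   # 0: the write is not yet released
n.release()
print(home["x"])   # 42: visible to other nodes after release
```

Under lazy release consistency, the `release` step would only record the changes and send them on demand, rather than pushing them immediately.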
Cont…
A variation of release consistency is lazy
release consistency proposed by Keleher in
1992.
Modifications are not sent to other nodes at the
time of release but only on demand.
Better performance.
Implementing Seq. Consistency Model
The protocol for implementing sequential consistency
depends on whether the DSM system allows
replication and/or migration
of shared-memory data blocks.
Strategies:
1. Nonreplicated, Nonmigrating blocks (NRNMB)
2. Nonreplicated, Migrating blocks (NRMB)
3. Replicated, migrating blocks (RMB)
4. Replicated, Nonmigrating blocks (RNMB)
NRNMB
Simplest strategy.
Each block of the shared memory has a single
copy whose location is always fixed.
Cont…
Enforcing sequential consistency is trivial.
Method is simple and easy to implement.
Drawbacks:
Serializing data access creates a bottleneck.
Parallelism is not possible.
Cont…
Data locating in the NRNMB strategy:
There is a single copy of each block in the entire
system, and the location of a block never changes.
Hence, a simple mapping function is sufficient to map a
block to a node.
NRMB
Each block of the shared memory has a single
copy in the entire system.
Migration is allowed.
The owner node of a block changes as soon as the
block migrates to a new node.
Only the processes executing on one node
can read or write a given data item at any
time.
Cont…
Advantages:
No communication costs are incurred when a
process accesses data currently held locally.
Exploits the locality of data access.
Drawbacks:
Prone to thrashing problem.
No parallelism.
Data locating in the NRMB strategy
There is a single copy of each block, but the location
of a block keeps changing dynamically.
Cont…
The following methods may be used:
1. Broadcasting
2. Centralized server algorithm
3. Fixed distributed server algorithm
4. Dynamic distributed server algorithm
Broadcasting
Each node maintains an owned block table
that contains an entry for each block for
which the node is the current owner.
Cont…
On a fault,
Fault handler of the faulting node broadcasts a
read/write request on the network.
Disadvantage:
Not scalable.
Centralized server algorithm
Cont…
A centralized server maintains a block table
that contains the location information for all
blocks in the shared-memory space.
Drawbacks:
A centralized server serializes location
queries, reducing parallelism.
The failure of the centralized server will cause
the DSM system to stop functioning.
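The server's role can be sketched in a few lines. This is an illustrative model only; the class and method names are hypothetical, and a real implementation would involve messages between nodes.

```python
# Sketch of the centralized-server algorithm: one block table maps each
# block to its current owner; faults are resolved by querying the server.

class CentralServer:
    def __init__(self, block_table):
        self.block_table = dict(block_table)     # block id -> owner node name

    def locate_and_migrate(self, block_id, requester):
        """Answer a faulting node's query and record the migration."""
        owner = self.block_table[block_id]
        self.block_table[block_id] = requester   # ownership moves on migration
        return owner

server = CentralServer({1: "A", 2: "B"})
print(server.locate_and_migrate(1, "C"))  # "A": fetch block 1 from node A
print(server.block_table[1])              # "C": node C is the new owner
```

Every lookup goes through this one object, which is precisely the serialization bottleneck and single point of failure noted above.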
Fixed distributed server algorithm
Cont..
A direct extension of the centralized server
scheme.
A block manager runs on each of several nodes.
Each block manager is given a predetermined
subset of data blocks to manage.
On a fault, a mapping function is used to find the
block manager responsible for the currently accessed block.
Dynamic distributed server algorithm
Cont…
Does not use any block manager; instead it
attempts to keep track of the ownership
information of all blocks at each node.
Each node has a block table that
contains the ownership information for all blocks.
Uses the notion of a probable owner for each block.
When a fault occurs, the faulting node extracts the
probable-owner information from its block table and sends its
request to that node; the request is forwarded along the chain
of hints until the true owner is reached.
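The probable-owner chase can be sketched as follows. This is an illustrative model in which each node's hints are a dictionary; the function name and the hint-update step (updating stale hints after a lookup) are assumptions in the spirit of the algorithm, not a definitive implementation.

```python
# Sketch of the probable-owner chase in the dynamic distributed-server
# algorithm (illustrative names; hint update is an assumed optimization).

def find_owner(block, probable, start):
    """Follow probable-owner hints from 'start' until the chain ends."""
    path, node = [], start
    while probable[node][block] != node:   # an owner points to itself
        path.append(node)
        node = probable[node][block]
    for visited in path:                   # update stale hints along the path
        probable[visited][block] = node
    return node

# A thinks B owns block 7, B thinks C does; C actually owns it.
probable = {"A": {7: "B"}, "B": {7: "C"}, "C": {7: "C"}}
print(find_owner(7, probable, "A"))  # "C"
print(probable["A"][7])              # "C": A's hint now names the true owner
```

Updating the hints on the way back keeps later chases short, much like path compression in union-find.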
RMB - Replicated, migrating blocks
Used to increase parallelism.
Read operations are carried out in parallel at multiple
nodes by accessing the local copy of the data.
Replication tends to increase the cost of write
operations.
It becomes a problem to maintain consistency among the copies.
Replication is an extra expense if the read/write ratio is small.
Cont…
Protocols to ensure sequential consistency:
1. Write-invalidate
2. Write-update
Write-invalidate
All copies of a piece of data except one are invalidated
before a write can be performed on it.
After invalidation of a block, only the node that performs
the write operation on the block holds the modified
version of the block.
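The invalidation step can be sketched as follows. This is a toy model: the per-node memories are dictionaries, and the function name is illustrative.

```python
# Sketch of write-invalidate: before a write, every other copy of the
# block is invalidated, so only the writer holds the modified version.

def write_invalidate(copies, writer, block, value):
    """copies: node name -> {block: value}. Invalidate, then write locally."""
    for node, mem in copies.items():
        if node != writer:
            mem.pop(block, None)        # invalidate every remote copy
    copies[writer][block] = value       # only the writer has the new version

copies = {"A": {1: "old"}, "B": {1: "old"}, "C": {1: "old"}}
write_invalidate(copies, "B", 1, "new")
print(copies["B"][1])    # "new"
print(1 in copies["A"])  # False: A's copy was invalidated
```

Nodes A and C will take a block fault on their next access and re-fetch the modified block from B, which is when the update actually propagates.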
Cont…
Write-update
A write operation is carried out by updating all copies of
the data on which the write is performed.
On a write fault, the fault handler copies the accessed
block from one of the block's current nodes to its own
node.
The write operation completes only after all the copies of
the block have been successfully updated.
Cont…
A global sequencer assigns a sequence number to each
modification and multicasts it to all the nodes where a
replica of the data block to be modified is located.
The write operations are processed at each node in
sequence-number order.
If the verification fails, the node requests the sequencer
to retransmit the missing modification.
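A replica's side of this scheme can be sketched as follows. The class and method names are illustrative, and real systems would carry the retransmission over the network; the gap check is the "verification" referred to above.

```python
# Sketch of the global-sequencer scheme for write-update: replicas apply
# modifications strictly in sequence-number order and detect gaps.

class Replica:
    def __init__(self):
        self.memory = {}
        self.expected = 1    # next sequence number this replica will apply

    def receive(self, seq, addr, value):
        """Apply in order; report a gap so the sender can retransmit."""
        if seq != self.expected:
            return "retransmit"          # a modification was missed
        self.memory[addr] = value
        self.expected += 1
        return "applied"

r = Replica()
print(r.receive(1, "x", 10))  # applied
print(r.receive(3, "y", 20))  # retransmit: update 2 was missed
print(r.receive(2, "z", 30))  # applied
```

Because every replica applies the same modifications in the same sequence-number order, all copies converge to the same state, which is what makes the scheme sequentially consistent.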
Cont…
The write-update approach is very expensive.
In the write-invalidate approach, updates are propagated
only when data is read, and several updates
can take place before communication is necessary.
A status tag associated with each block
indicates whether the block is valid, shared, read-only, or writable.
Data Locating in the RMB strategy
The following algorithms may be used:
1. Broadcasting
2. Centralized-server algorithm
3. Fixed distributed-server algorithm
4. Dynamic distributed-server algorithm
Replicated, Non-Migrating Block
A shared-memory block may be replicated at multiple
nodes of the system, but the location of each replica is
fixed.
All replicas of a block are kept consistent by updating
them all in case of a write access.
Sequential consistency is ensured by using a global
sequencer to sequence the write operations of all nodes.
Data locating in the RNMB strategy
Following characteristics:
1. The replica locations of a block never change.
2. All replicas of a data block are kept consistent.
3. A read request can be sent directly to one of the
nodes holding a replica, but all write requests have to
be sent to the sequencer.