Distributed Shared Memory
Distributed Shared Memory
Distributed Shared Memory
Memory Basics
• Physical memory Alternatively referred to as the physical storage or the real
storage, physical memory is a term used to describe the total amount of memory
installed in the computer. For example, if the computer has two 64MB memory
modules installed, it has a total of 128MB of physical memory.
• Virtual Memory is a part of the hard disk which is used as a memory. It has a set of
memory addresses and stores the instructions or the data.
• Shared Memory Global physical memory equally accessible to all processors
• Programming ease and portability
• Increased contention and limit scalability
• The private part is divided up into regions so that each machine has a piece for its
stacks and other unshared data and code. The shared part is common to all
machines The machines in a ring-based multiprocessor can be much more loosely
coupled.
• All the machines in Memnet are connected together in a modified token passing
ring. The ring consists of 20 parallel wires, which together allow 16 data bits and 4
control bits to be sent every 100 nsec, for a data rate of 160 Mbps.
Switched Multiprocessors in
Shared Memory
• If one inter cluster bus proves to be inadequate, add a second inter cluster bus, or
arrange the clusters in a tree or grid.
• If still more bandwidth is needed, collect a bus, tree, or grid of clusters together
into a super-cluster, and break the system into multiple super clusters.
• The super clusters can be connected by a bus, tree, or grid shows a system with
three levels of buses.
Multiprocessor Architecture
• No global memory
• The local memories form a global, virtual memory
• Connection is relatively slow and loose (LAN, MAN …)
• Scalability
• Transparency of
• information sharing
• communication
• Cluster based architecture
• Why DSM?
• Direct information sharing programming
paradigm (transparency)
• Multilevel memory access (locality)
• Wealth of existing programs (portability)
• Large physical memory
• Scalable multiprocessor system
DSM
• Distributed shared memory (DSM) tries to combine the advantages of two
classes of systems: shared memory systems, having a single global physical
memory, equally accessible to all processors, and distributed memory
systems, that consist of multiple processing nodes communicating by
means of message passing.
• Distributed Memory
• Scales well
• Difficult to program
• Message passing or RPC based
• Solution: DSM (Distributed Shared Memory)
• Logically shared memory
• Physically distributed local memories
• Page based
• Shared pages
• Demand paging between nodes
• Variable based
• Shared Variables
• Object based
• Shared encapsulation units
• Easier to optimize
Page Based Distributed
Shared Memory
• Basic Design- Distributed Shared Memory emulate the cache of a multiprocessor
using the MMU( Memory Management Unit) and operating system software.
• In a DSM system, the address space is divided up into chunks, with the chunks being
• spread over all the processors in the system.
• When a processor references an address that is not local, a trap occurs, and the DSM
software fetches the chunk containing the address and restarts the faulting
instruction, which now completes successfully.
• Replication- To replicate chunks that are read only, for example, program text, read
only constants, or other read-only data structures.
• (a) Chunks of address space distributed among four machines.
• (b) Situation after CPU 1 references chunk 10.
• (c) Situation if chunk 10 is read only and replication is used.
Granularity
• The problem of false sharing is that although the variables are unrelated, since
they appear by accident on the same page, when a process uses one of them, it
also gets the other.
• The larger the effective page size, the more often false sharing will occur, and
conversely, the smaller the effective page size, the less often it will occur.
• Clever compilers that understand the problem and place variables in the address
space accordingly can help reduce false sharing and improve performance
• Finding the owner- how to find the owner of the page.
First solution is by doing a broadcast, asking for the owner of the specified page to
respond.
Second solution is that, one process is designated as the page manager. It is the
job of the manager to keep track of who owns each page.
• When a process, P, wants to read a page it does not have or wants to write a page
it does not own, it sends a message to the page manager telling which operation it
wants to perform and on which page.
• The manager then sends back a message telling who the owner is. P now contacts
the owner to get the page and/or the ownership, as required.
• Two approaches to find ownership location using four message protocol & using
three message protocol
• Finding the Copies-The detail of how all the copies are found when they must be
invalidated. Two possibilities present.
one is to broadcast a message giving the page number and ask all processors
holding the page to invalidate it. This approach works only if broadcast messages
are totally reliable and can never be lost.
Second possibility is to have the owner or page manager maintain a list or copyset
telling which processors hold which pages.
• When a page must be invalidated, the old owner, new owner, or page manager
sends a message to each processor holding the page and waits for an
acknowledgement. When each message has been acknowledged, the invalidation
is complete.
• Page Replacement-In a DSM system, it can happen that a page is needed but there is no
free page frame in memory to hold it. When this situation occurs, a page must be evicted
from memory to make room for the needed page.
• Two sub problems immediately arise: which page to evict and where to put it.
• If one process, A, is inside the critical region and another process, B, (on a different machine)
wants to enter it, B will sit in a tight loop testing the variable, waiting for it to go to zero.
• Use synchronization manager (or managers) that accept messages asking to enter and leave
critical regions, lock and unlock variables, and so on, sending back replies when the work is
done.
• When a region cannot be entered or a variable cannot be locked, no reply is sent back
immediately, causing the sender to block. When the region becomes available or the variable
can be locked, a message is sent back.
Shared variable Distributed
Shared Memory
• Munin- Munin is a DSM system that is fundamentally based on software objects.
The goal of the Munin project is to take existing multiprocessor programs, make
minor changes to them, and have them run efficiently on multicomputer systems
using a form of DSM.
• Synchronization for mutual exclusion is handled in a special way. Lock variables may
be declared, and library procedures are provided for locking and unlocking them.
Barriers, condition variables, and other synchronization variables are also supported.
• If it does and the lock is free, the request is granted. If the lock is not local, it is
located using the synchronization directory, which keeps track of the probable
owner. If the lock is free, it is granted. If it is not free, the requester is added to the
tail of the queue.
Comparison of IPC paradigms
Advantages & Disadvantages of DSM
Advantages of DSM:-
• Simpler abstraction - programmer does not have to worry about data movement
• Easier portability - sequential programs can run directly on DSM systems gives
possibly better performance
• Locality of data - data moved in large blocks which helps programs with good
locality of reference having on-demand data movement
• Flexible communication - no need for sender and receiver to exist, can join and
leave DSM system without affecting the others
• By having control to DSM manager software, programmers cannot use their own
message-passing solutions.
• Synchronous Messages
• Synchronous messaging involves a client that waits for the server to respond to a
message. Messages are able to flow in both directions, to and from. Essentially it means
that synchronous messaging is a two way communication. i.e. Sender sends a message to
receiver and receiver receives this message and gives reply to the sender. Sender will not
send another message until get reply from receiver.
• Asynchronous Messages
• Asynchronous messaging involves a client that does not wait for a message from the
server. An event is used to trigger a message from a server. So even if the client is
down , the messaging will complete successfully. Asynchronous Messaging means
that, it is a one way communication and the flow of communication is one way only.
Design Issues In DSM
5 key design issues when accessing data in
the DSM address space
• DSM Implementation
• Thrashing
• Replication/Migration
- Advantages
• Takes advantage of the locality of reference
• DSM can be integrated with VM ( virtual memory) at each node
- Make DSM page multiple of VM page size
- A locally held shared memory can be mapped into the VM page address space
- If page not local, fault-handler migrates page and removes it from address space at remote
node
- To locate a remote data object:
• Use a location server
• Maintain hints at each node
• Broadcast query
- Issues
• Only one node can access a data object at a time
• Thrashing can occur: to minimize it, set minimum time data object resides at a node
Algorithms for implementing DSM
3. The Read-Replication Algorithm
– Replicates data objects to multiple nodes
– DSM keeps track of location of data objects
– Multiple nodes can have read access or one node write access (multiple
readers-one writer protocol)
– After a write, all copies are invalidated or updated
– DSM has to keep track of locations of all copies of data objects.
Examples of implementations:
• IVY: owner node of data object knows all nodes that have copies
• PLUS: distributed linked-list tracks all nodes that have copies
– Advantage
• The read-replication can lead to substantial performance improvements if
the ratio of reads to writes is large
Algorithms for implementing DSM
4. The Full–Replication Algorithm
- Extension of read-replication algorithm: multiple nodes can read
and multiple nodes can write (multiple-readers, multiple-writers
protocol)
- Issue: consistency of data for multiple writers
- Solution: use of gap-free sequencer
• All writes sent to sequencer
• Sequencer assigns sequence number and sends write request
to all sites that have copies
• Each node performs writes according to sequence numbers
• A gap in sequence numbers indicates a missing write request:
node asks for retransmission of missing write requests
DSM Memory coherence
• DSM are based on
- Replicated shared data objects
- Concurrent access of data objects at many nodes
or
• 2. When blocks with read only permissions are repeatedly invalidated soon
after they are replicated.
Techniques to reduce thrashing
• 1.Providing application – controlled locks : Locking data to
prevent other nodes from accessing that for a short period of
time can reduce thrashing. An application controlled lock can
be associated with each data block to implement this method.
Replicates
Fixed Distributed manager algorithm
Dynamic distributed manager algorithm
Distributed File System(DFS)
• Distributed file systems support the sharing of information in the form of files and hardware
resources.
• File Properties
• Byte sequence vs. data structure- A file can be structured as a sequence of records. The operating system
either maintains the file as a B-tree or other suitable data structure, or uses hash tables to locate records
quickly.
• Attributes (owner, creation/modified date, size, permissions)- A files can have attributes, which are pieces
of information about the file typical attributes are the owner, size, creation date, and access permissions.
• Immutable vs. mutable files-Once a file has been created, it cannot be changed. Such a file
is said to be immutable.
• Having files be immutable makes it much easier to support file caching and replication
because it eliminates all the problems associated with having to update all copies of a file
whenever it changes.
• Protection via Capabilities vs. Protection via Access Control Lists- Protection in distributed
systems uses : capabilities and access control lists.
• With capabilities, each user has a kind of ticket, called a capability for each object to which it
has access.
• The capability specifies which kinds of accesses are permitted (e.g., reading is allowed but
writing is not).
• All access control list schemes associate with each file a list of users who may access the file
and how.
• The UNIX scheme, with bits for controlling reading, writing, and executing each file separately
for the owner, the owner's group, and everyone else is a simplified access control list.
• The File Service Model
• (a) Upload/download model- In the upload/download model, the file service
provides only two major operations: read file and write file. The read file operation
transfers an entire file from one of the file servers to the requesting client. The
write file operation transfers an entire file the other way, from client to server
• Advantage
Simple
• Problems
Wasteful: what if client needs small piece?
Problematic: what if client doesn’t have enough space?
Consistency: what if others need to modify the same file?
• (b) Remote access model- In this model, the file service provides a large number
of operations for opening and closing files, reading and writing parts of files,
moving around within files, examining and changing file attributes, and so on.
• Advantages:
Client gets only what’s needed
Server can manage coherent view of file system
• Problem:
Possible server and network congestion
Same data may be requested repeatedly
File Service Model
Directory Server Interface in DFS Design
DFS Implementation
• File usage
• System Structure
• Caching
• Replication
• Sun's Network File System
File usage
• Stateless Server
Server maintains no information on client access
Each request must identify file and offsets
Server can crash and recover (No state to lose)
Client can crash and recover (No open/close needed)
No server space used for state of the client
No limits on number of open files
No problems if a client crashes
Best known examples?
• The UNIX NFS file system is stateless.
Bill Joy: “Once they replaced my file server during the evening
while my machine was idle. The next day I resumed work
right where I had left off, and didn’t even notice the change!”
• Caching Location- It refers to the place where the cached data is stored.
Whether it is server’s main memory or client disk or client’s main memory.
• Delayed-write
To split the workload over multiple servers. As the system grows in size, having all
the files on one server can become a performance bottleneck. By having files
replicated on two or more servers, the least heavily loaded one can be used.
• Three ways of replication
Lazy replication- In this approach only one copy of each file is created, on
some server. Later, the server itself makes replicas on other servers
automatically, without the programmer's knowledge.
• NFS Protocols-A protocol is a set of requests sent by clients to servers, along with
the corresponding replies sent by the servers back to the clients.
• The locating facility, which is an integral part of the naming facility, maps an object's name
to the object‘s location in a distributed system.
• The naming and locating facilities jointly form a naming system that provides the users with
an abstraction of an object that hides the details of how and where an object is actually
located in the network.
• It provides a further level of abstraction when dealing with object replicas. Given an object
name, it returns a set of the locations of the object's replicas.
• The naming system plays a very important role in achieving the goal of
location transparency,
facilitating transparent migration and replication of objects
object sharing
DESIRABLE FEATURES
OF A GOOD NAMING SYSTEM
• 1. Location transparency. An object's name should be independent of the physical
connectivity or topology of the system, or the current location of the object.
• 2. Location independency. The name of an object need not be changed when the object's
location changes. Furthermore, a user should be able to access an object by its same name
irrespective of the node from where he or she accesses it (user migration).
• 3. Scalability. Distributed systems vary in size ranging from one with a few nodes to one
with many nodes , a change in the system scale should not require any change in the
naming or locating mechanisms.
• 4. Uniform naming convention. A good naming system should use the same naming
convention for all types of objects in the system.
• 5. Multiple user-defined names for the same object. a naming system must provide the
flexibility to assign multiple user-defined names to the same object. In this case, it should
be possible for a user to change or delete his or her name for the object without affecting
those of other users.
• 6. Group naming. A naming system should allow many different objects to be
identified by the same name. Such a facility is useful to support broadcast facility
or to group objects for conferencing or other applications.