CSCE 613: Interlude: Distributed Shared Memory
• Consistency Models
• Reading (old!):
– Coulouris: Distributed Systems, Addison-Wesley, Chapter 17
– Tanenbaum: Distributed Operating Systems, Prentice Hall, 1995, Chapter 6
– Tanenbaum, van Steen: Distributed Systems, Prentice Hall, 2002, Chapter 6.2
– M. Stumm and S. Zhou: "Algorithms Implementing Distributed Shared Memory," IEEE Computer, vol. 23, pp. 54-64, May 1990
• Vanilla implementation (see the sketch after this list):
– references to local pages are handled entirely in hardware.
– a reference to a remote page causes a hardware page fault; the OS traps,
fetches the page from the remote node, maps it locally, and restarts the
faulting instruction.
• Optimizations:
– share only selected portions of memory.
– replicate shared variables on multiple machines.
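A minimal sketch of the vanilla page-fault path described above. All helper names (page_owner, fetch_remote_page, map_page, alloc_frame) and the entry point dsm_fault are hypothetical stand-ins for the OS fault path, not a real API:

#include <stdint.h>

#define PAGE_SIZE 4096

typedef uintptr_t vaddr_t;

/* Hypothetical helpers supplied by the DSM runtime. */
extern int   this_node(void);
extern int   page_owner(vaddr_t page);                 /* node currently holding the page */
extern void *alloc_frame(void);                        /* free local page frame */
extern void  fetch_remote_page(int owner, vaddr_t page, void *frame);
extern void  map_page(vaddr_t page, void *frame, int writable);

/* Invoked by the OS when a reference traps because the page is not mapped locally. */
void dsm_fault(vaddr_t addr, int is_write)
{
    vaddr_t page = addr & ~(vaddr_t)(PAGE_SIZE - 1);
    int owner = page_owner(page);

    if (owner == this_node())
        return;                               /* page arrived meanwhile; just retry */

    void *frame = alloc_frame();
    fetch_remote_page(owner, page, frame);    /* pull a copy over the network */
    map_page(page, frame, is_write);          /* install the mapping in the MMU */
    /* On return, the OS restarts the faulting instruction. */
}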
Shared Memory
• DSM in the context of shared memory for multiprocessors:
[Figure: spectrum of shared-memory implementations: consistency managed by the MMU (multiprocessor hardware), by the OS (page-based DSM), or by the language runtime system (shared variables and objects).]
Example: a critical section protecting a shared counter, with no memory barrier before the releasing store:

/* lock(mutex) */
< implementation of lock would come here >
/* counter++ */
load  r1, counter
add   r1, r1, 1
store r1, counter
/* unlock(mutex) */
store zero, mutex    /* on a weakly ordered CPU, this store may become
                        visible before the store to counter! */
Consistency Models
• Strict consistency
• Sequential consistency
• Causal consistency
• PRAM (pipeline RAM) consistency
• Weak consistency
• Release consistency
Strict Consistency
• Most stringent consistency model:
Any read to a memory location x returns the value stored
by the most recent write operation to x.
• Strict consistency is what simple uniprocessor systems exhibit.
• It has come to be expected by uniprocessor programmers,
– but it is very unlikely to be supported by any multiprocessor.
• Two scenarios:
• In this example, at least one CPU must load the other's new value.
Sequential Consistency
• Strict consistency is impossible to implement in a distributed system: it presumes a global notion of the "most recent" write.
• Programmers can manage with weaker models.
• Sequential consistency [Lamport 79]
The result of any execution is the same as if the operations of
all processors were executed in some sequential order, and the
operations of each individual processor appear in this sequence
in the order specified by its program.
• Memory accesses of different CPUs are "sequentialised": any valid
interleaving is acceptable, but all processes must see the same
sequence of memory references.
• Scenarios:
(a) sequentially consistent: P3 and P4 see the two writes in the same order.
P1: W(x)1
P2: W(x)0
P3: R(x)0 R(x)1
P4: R(x)0 R(x)1
(b) not sequentially consistent: P3 and P4 see the writes in different orders, so no single interleaving explains both.
P1: W(x)1
P2: W(x)0
P3: R(x)0 R(x)1
P4: R(x)1 R(x)0
The same critical section, now with a store barrier: the STBAR guarantees that the store to counter becomes visible before the store that releases the mutex.

/* lock(mutex) */
< implementation of lock would come here >
/* counter++ */
load  r1, counter
add   r1, r1, 1
store r1, counter
/* MEMORY BARRIER */
STBAR
/* unlock(mutex) */
store zero, mutex
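For comparison, a sketch of the same release path in portable C11 atomics (variable names are illustrative): the release store plays the role of STBAR followed by the store to mutex, ordering the critical-section write before the lock becomes visibly free.

#include <stdatomic.h>

atomic_int mutex = 0;     /* 0 = free, 1 = held */
int counter = 0;          /* protected by mutex */

void increment(void)
{
    /* lock(mutex): spin until we swap 0 -> 1 (acquire ordering) */
    int expected = 0;
    while (!atomic_compare_exchange_weak_explicit(
               &mutex, &expected, 1,
               memory_order_acquire, memory_order_relaxed))
        expected = 0;     /* a failed CAS overwrites expected; reset it */

    counter++;            /* the critical-section write */

    /* unlock(mutex): the release store orders the counter store
       before the mutex is seen as free by other CPUs */
    atomic_store_explicit(&mutex, 0, memory_order_release);
}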
Causal Consistency
• Weaken sequential consistency by distinguishing between events that
are potentially causally related and events that are not.
• Distributed forum scenario: propagation delays may let a reply arrive
before the posting it responds to, violating causality.
• Causal consistency:
Writes that are potentially causally related must be seen by
all processes in the same order. Concurrent writes may be
seen in a different order on different machines.
• Scenarios (a sketch of causality tracking follows below):
(a) allowed under causal consistency (but not sequential): W(x)2 and W(x)3 are concurrent, so P3 and P4 may see them in different orders; both see the causally earlier W(x)1 first.
P1: W(x)1         W(x)3
P2:   R(x)1 W(x)2
P3:   R(x)1       R(x)3 R(x)2
P4:   R(x)1       R(x)2 R(x)3
(b) violates causal consistency: W(x)2 is causally after W(x)1 (P2 read x=1 before writing), yet P3 sees 2 before 1.
P1: W(x)1
P2:   R(x)1 W(x)2
P3:         R(x)2 R(x)1
P4:         R(x)1 R(x)2
(c) allowed: without the intervening read, W(x)1 and W(x)2 are concurrent, so different machines may see them in different orders.
P1: W(x)1
P2: W(x)2
P3: R(x)2 R(x)1
P4: R(x)1 R(x)2
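"Potentially causally related" is commonly tracked with vector timestamps; a minimal sketch, where the fixed process count NPROC and the representation are assumptions, not from the slides:

#define NPROC 4

typedef struct { int c[NPROC]; } vclock_t;

/* v1 happened-before v2 iff every component is <= and at least one is <. */
int happened_before(const vclock_t *v1, const vclock_t *v2)
{
    int strictly_less = 0;
    for (int i = 0; i < NPROC; i++) {
        if (v1->c[i] > v2->c[i]) return 0;
        if (v1->c[i] < v2->c[i]) strictly_less = 1;
    }
    return strictly_less;
}

/* Two writes are concurrent iff neither happened before the other; only
   causally ordered writes must be applied in the same order everywhere. */
int concurrent(const vclock_t *w1, const vclock_t *w2)
{
    return !happened_before(w1, w2) && !happened_before(w2, w1);
}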
Weak Consistency
• PRAM (pipelined RAM) consistency: writes done by a single process are
seen by all other processes in the order they were issued, but writes
from different processes may be seen in different orders by different
processes.
• Even PRAM consistency is unnecessarily restrictive for many
applications: it requires that writes originating in a single process
be seen everywhere in order.
• Example:
– reading and writing of shared variables in a tight loop inside a
critical section.
– other processes are not supposed to touch the variables, but the
writes are propagated to all memories anyway.
• Weak consistency: introduce synchronization variables; ordinary writes
only have to become visible everywhere when a synchronization variable
is accessed.
Release Consistency
• Problem with weak consistency:
– when a synchronization variable is accessed, the memory system cannot
tell whether the process has finished writing shared variables or is
about to start reading them.
– it must therefore both propagate all local writes to other machines
and gather all writes from other machines.
• Operations (see the sketch after the scenario below):
– acquire (critical section is about to be entered):
• make sure the local copies of shared variables are consistent with
the remote ones.
– release (critical section has just been exited):
• propagate the shared variables to the other machines.
– operations may apply to only a subset of the shared variables.
• Scenario (a valid event sequence under release consistency):
P1: Acq(L) W(x)1 W(x)2 Rel(L)
P2:                     Acq(L) R(x)2 Rel(L)
P3:                     R(x)1
– P2 acquires L and therefore must see x = 2; P3 reads without
acquiring, so it may still see the stale value x = 1.
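A minimal sketch of release-consistent lock operations in a DSM runtime; all helper names (dsm_lock_acquire, dsm_pull_updates, ...) are hypothetical:

extern void dsm_lock_acquire(int lock_id);   /* obtain the lock token */
extern void dsm_lock_release(int lock_id);
extern void dsm_pull_updates(int lock_id);   /* fetch writes made under this lock */
extern void dsm_push_updates(int lock_id);   /* propagate local writes to replicas */

void acquire(int lock_id)
{
    dsm_lock_acquire(lock_id);
    dsm_pull_updates(lock_id);   /* make local copies consistent before entering c.s. */
}

void release(int lock_id)
{
    dsm_push_updates(lock_id);   /* propagate shared-variable writes first... */
    dsm_lock_release(lock_id);   /* ...then let the next acquirer proceed */
}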
Page-Based DSM
• NUMA multiprocessor:
– a processor can directly reference local and remote memory locations
– no software intervention needed
• Workstations on a network:
– can only reference local memory
• Goal of DSM:
– add software to allow networks of workstations (NOWs) to run
multiprocessor code
– simplicity of programming
– the "dusty deck" problem: run existing shared-memory programs unchanged
Basic Design
• Emulate the cache of a multiprocessor using the MMU and system software.
[Figure: a shared address space of 16 pages (0..15) spread across four machines; each page resides on exactly one node at a time.]
Design Issues
• Replication
– replicate read-only portions
– replicate read and write portions
• Granularity
– restriction: shared memory portions are multiples of pages
– pros of large portions:
• amortize protocol overhead
• exploit locality of reference
– cons of large portions:
• false sharing! (one remedy is sketched below)
[Figure: false sharing: variables A and B lie on the same page; Processor 1's code uses only A and Processor 2's only B, yet every write makes the whole page ping-pong between the two nodes.]
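One common remedy at the program level is to pad or align unrelated variables so they land on different pages; a hedged sketch, assuming 4 KiB pages and a compiler that honors extended alignment:

#include <stdalign.h>

#define PAGE_SIZE 4096

struct shared {
    alignas(PAGE_SIZE) int a;   /* used only by processor 1 */
    alignas(PAGE_SIZE) int b;   /* used only by processor 2: forced onto its own page */
};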
• Write-Update:
– writes made locally are multicast to all copies of the data item.
– multiple writers can share the same data item.
– consistency depends on the multicast protocol:
• e.g., sequential consistency is achieved with a totally ordered
multicast.
– reads are cheap.
• Write-Invalidate (sketched below):
– distinguish between read-only items (multiple copies possible) and
writable items (only one copy possible).
– when a process attempts to write to a remote or replicated item, it
first multicasts a message to invalidate all other copies and, if
necessary, obtains a copy of the item.
– ensures sequential consistency.
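A minimal sketch of the write-fault path under write-invalidate; the state machine and all helper names are hypothetical:

typedef enum { INVALID, READ_ONLY, WRITABLE } page_state_t;

extern page_state_t state_of(int page);
extern void fetch_copy(int page);              /* get current contents from the owner */
extern void multicast_invalidate(int page);    /* tell replicas to drop their copies */
extern void set_protection(int page, page_state_t s);

/* Called on a write fault: promote the local copy to the single writable one. */
void write_fault(int page)
{
    if (state_of(page) == INVALID)
        fetch_copy(page);                /* need the current contents first */

    multicast_invalidate(page);          /* all other copies become INVALID */
    set_protection(page, WRITABLE);      /* ours is now the only copy; write proceeds */
}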
[Figure: processes holding read (R) and write (W) copies of a shared item before and after a write, under write-update and write-invalidate.]
[Figure: page ownership among nodes A..E: when A issues a write, the page and its ownership migrate to A; when A issues a read, A receives a copy from the owner; alternatively, ownership can migrate to A on a read as well.]
Shared-Variable DSM
• Is it necessary to share the entire address space?
• Share individual variables instead:
– more variety is possible in the update algorithms for replicated
variables
– opportunity to eliminate false sharing
• Synchronization:
– lock variables
– barriers
– condition variables
• Release consistency
• Multiple update protocols
• Directories for data location
lock(L);
a = 1;    /* changes to  */
b = 2;    /*   shared    */
c = 3;    /*   variables */
unlock(L);
[Figure: at unlock(L), the updated values of a, b, c are propagated to the other replicas.]
Multiple Protocols
• Annotations for shared-variable declarations:
– read-only
• do not change after initialization; no consistency problems
• protected by MMU
– migratory
• not replicated; migrate from machine to machine as critical
regions are entered
• associated with a lock
– write-shared
• safe for multiple processes to write to it concurrently
• use a "diff" protocol to resolve multiple writes to the same
variable (see the sketch below)
– conventional
• treated as in conventional page-based DSM: only one copy
of writeable page; moved between processors.
– others…
[Figure: diff protocol: a twin (pristine copy) of the page is saved before writing; when word 4 changes from 6 to 8, comparing the page with its twin yields the diff (word 4: 6 -> 8), which is all that needs to be sent.]
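A sketch of the twin/diff mechanism for write-shared pages; the page layout and function names are illustrative assumptions:

#include <string.h>
#include <stdint.h>

#define WORDS_PER_PAGE 1024

/* Before the first write to a write-shared page, save a twin (pristine copy). */
void make_twin(const uint32_t page[WORDS_PER_PAGE],
               uint32_t twin[WORDS_PER_PAGE])
{
    memcpy(twin, page, sizeof(uint32_t) * WORDS_PER_PAGE);
}

/* At release or barrier time, compare page and twin word by word; the list
   of (index, new value) pairs is the diff that gets sent to other replicas. */
size_t compute_diff(const uint32_t page[WORDS_PER_PAGE],
                    const uint32_t twin[WORDS_PER_PAGE],
                    uint32_t idx_out[WORDS_PER_PAGE],
                    uint32_t val_out[WORDS_PER_PAGE])
{
    size_t n = 0;
    for (size_t i = 0; i < WORDS_PER_PAGE; i++) {
        if (page[i] != twin[i]) {
            idx_out[n] = (uint32_t)i;   /* e.g., word 4 */
            val_out[n] = page[i];       /* e.g., 6 -> 8 */
            n++;
        }
    }
    return n;   /* number of changed words */
}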
[Figure: write-shared protocol with barriers: P1 assigns a[0], a[2], ..., a[n-1] while P2 assigns a[1], a[3], ..., a[n]; at the barrier, each process sends its changes to the barrier manager.]