
Cache Coherence and Synchronization

In this chapter, we will discuss cache coherence protocols that cope with multicache inconsistency problems.

The Cache Coherence Problem


In a multiprocessor system, data inconsistency may occur among adjacent levels or within the same
level of the memory hierarchy. For example, the cache and the main memory may have inconsistent
copies of the same object.

Because multiple processors operate in parallel and independently, multiple caches may hold different copies of the same memory block; this creates the cache coherence problem. Cache coherence schemes help to avoid this problem by maintaining a uniform state for each cached block of data.

Let X be an element of shared data that has been referenced by two processors, P1 and P2. In the beginning, the three copies of X are consistent. If processor P1 writes new data X1 into its cache, then under a write-through policy the same copy is written immediately into the shared memory; in this case, inconsistency occurs between P2's cache memory and the main memory. When a write-back policy is used, the main memory is updated only when the modified data in the cache is replaced or invalidated, so until then the main memory also holds a stale copy.
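
To make the difference concrete, the following is a minimal sketch in C of how a write under each policy affects main memory. The single-line cache and the variable main_memory_x are illustrative assumptions, not part of the original text:

#include <stdbool.h>
#include <stdio.h>

/* A hypothetical single-line cache holding the shared element X,
 * used only to illustrate the two write policies. */
typedef struct {
    int  value;
    bool dirty;        /* set only under write-back */
} cache_line;

int main_memory_x = 0;  /* the shared-memory copy of X */

/* Write-through: the cache and main memory are updated together. */
void write_through(cache_line *c, int new_value) {
    c->value      = new_value;
    main_memory_x = new_value;   /* memory stays consistent with this cache */
}

/* Write-back: only the cache is updated; memory is refreshed later,
 * when the line is replaced or invalidated. */
void write_back(cache_line *c, int new_value) {
    c->value = new_value;
    c->dirty = true;             /* memory is now stale until eviction */
}

void evict(cache_line *c) {
    if (c->dirty) {
        main_memory_x = c->value;  /* deferred update on replacement */
        c->dirty = false;
    }
}

int main(void) {
    cache_line p1 = { .value = 0, .dirty = false };
    write_back(&p1, 42);
    printf("cache=%d, memory=%d\n", p1.value, main_memory_x);  /* 42 vs 0 */
    evict(&p1);
    printf("after evict: memory=%d\n", main_memory_x);         /* 42 */
    return 0;
}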

In general, there are three sources of the inconsistency problem −


Sharing of writable data
Process migration
I/O activity

Snoopy Bus Protocols

Snoopy protocols achieve data consistency between the cache memory and the shared memory
through a bus-based memory system. Write-invalidate and write-update policies are used for
maintaining cache consistency.

In this case, we have three processors, P1, P2, and P3, each holding a consistent copy of data element ‘X’ in its local cache memory and in the shared memory (Figure-a). Processor P1 writes X1 into its cache memory using the write-invalidate protocol, so all other copies are invalidated via the bus. An invalidated copy, denoted by ‘I’ (Figure-b), must not be used until it is refreshed. The write-update protocol instead updates all the cache copies via the bus; if write-through caches are used, the memory copy is also updated (Figure-c).
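
To contrast the two policies concretely, here is a minimal sketch in C, assuming a bus with three caches modeled as a plain array; every name is illustrative:

#define NCACHES 3

typedef struct { int value; int valid; } copy_t;

/* Write-invalidate: the writer keeps its copy; all others are marked 'I'. */
void write_invalidate(copy_t caches[], int writer, int new_value) {
    for (int i = 0; i < NCACHES; i++)
        if (i != writer)
            caches[i].valid = 0;        /* invalidate via the bus */
    caches[writer].value = new_value;
}

/* Write-update: the new value is broadcast, so every copy stays usable. */
void write_update(copy_t caches[], int writer, int new_value) {
    (void)writer;
    for (int i = 0; i < NCACHES; i++)
        caches[i].value = new_value;    /* update via the bus */
}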

Cache Events and Actions

Following events and actions occur on the execution of memory-access and invalidation commands (a minimal state-machine sketch of these transitions follows the list) −

Read-miss − When a processor wants to read a block that is not in its cache, a read-miss occurs. This initiates a bus-read operation. If no dirty copy exists, then the main memory, which has a consistent copy, supplies a copy to the requesting cache memory. If a dirty copy exists in a remote cache memory, that cache inhibits the main memory and sends the copy to the requesting cache memory. In both cases, the cache copy enters the valid state after a read-miss.

Write-hit − If the copy is in the dirty or reserved state, the write is done locally and the new state is dirty. If the copy is in the valid state, a write-invalidate command is broadcast to all caches, invalidating their copies; the shared memory is written through, and the resulting state is reserved after this first write.

Write-miss − When a processor wants to write a block that is not in its cache, a write-miss occurs. The copy must come either from the main memory or from a remote cache memory holding a dirty block. This is done by sending a read-invalidate command, which invalidates all other cache copies. The local copy is then updated and ends in the dirty state.

Read-hit − A read-hit is always performed in the local cache memory without causing a state transition or using the snoopy bus for invalidation.

Block replacement − When a copy is dirty, it must be written back to the main memory by the block replacement method. However, when the copy is in the valid, reserved, or invalid state, no write-back takes place on replacement.
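
The sketch below renders these transitions as a C state machine, assuming the four states named above and reducing the bus operations to comments; it models a single cache's view of one block and is an illustration, not a complete protocol:

typedef enum { INVALID, VALID, RESERVED, DIRTY } block_state;

/* Read-miss: a bus-read fetches the block from memory, or from a
 * remote cache holding a dirty copy; the copy becomes valid. */
block_state on_read_miss(void) {
    return VALID;
}

/* Write-hit: dirty/reserved copies are written locally; a valid copy
 * triggers a write-invalidate broadcast and a write-through to memory. */
block_state on_write_hit(block_state s) {
    switch (s) {
    case DIRTY:
    case RESERVED:
        return DIRTY;      /* local write, no bus traffic */
    case VALID:
        return RESERVED;   /* first write: invalidate others, write through */
    default:
        return s;          /* an invalid copy cannot produce a hit */
    }
}

/* Write-miss: a read-invalidate fetches the block and invalidates
 * all other copies; the local copy ends up dirty. */
block_state on_write_miss(void) {
    return DIRTY;
}

/* Block replacement: only a dirty copy is written back to memory. */
block_state on_replace(block_state s) {
    if (s == DIRTY) {
        /* write_back_to_memory(); -- hypothetical helper */
    }
    return INVALID;
}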

Directory-Based Protocols

When a multistage network is used to build a large multiprocessor with hundreds of processors, the snoopy cache protocols need to be modified to suit the network capabilities. Because broadcasting is very expensive to perform in a multistage network, the consistency commands are sent only to those caches that keep a copy of the block. This is the reason for the development of directory-based protocols for network-connected multiprocessors.

In a directory-based protocol system, data to be shared are placed in a common directory that maintains the coherence among the caches. Here, the directory acts as a filter where the processors ask permission to load an entry from the primary memory to their cache memories. If an entry is changed, the directory either updates it or invalidates the other caches holding that entry.
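
As a rough illustration, a full-map directory entry might pair a bit vector of sharers with a dirty flag, so that consistency commands go only to caches whose bit is set. The sketch assumes at most 64 processors and uses illustrative names throughout:

#include <stdint.h>

typedef struct {
    uint64_t sharers;  /* bit i set => cache i holds a copy of the block */
    int      dirty;    /* nonzero => one cache holds a modified copy     */
} dir_entry;

/* On a write by processor p, invalidate only the caches whose bit is
 * set in the sharer vector, instead of broadcasting to every cache. */
void dir_write(dir_entry *e, int p) {
    uint64_t others = e->sharers & ~(1ULL << p);
    for (int i = 0; i < 64; i++) {
        if (others & (1ULL << i)) {
            /* send_invalidate(i); -- point-to-point, hypothetical helper */
        }
    }
    e->sharers = 1ULL << p;  /* the writer becomes the sole holder */
    e->dirty   = 1;
}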

Hardware Synchronization Mechanisms

Synchronization is a special form of communication where, instead of data, control information is exchanged between communicating processes residing in the same or different processors.

Multiprocessor systems use hardware mechanisms to implement low-level synchronization operations. Most multiprocessors have hardware mechanisms to impose atomic operations such as memory read, write, or read-modify-write operations to implement synchronization primitives. Other than atomic memory operations, some inter-processor interrupts are also used for synchronization purposes.
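
For example, a spinlock can be built from an atomic read-modify-write operation. The sketch below uses the C11 atomic_flag test-and-set primitive; it is one plausible realization of such a primitive, not the mechanism of any particular machine:

#include <stdatomic.h>

/* usage: spinlock l = { ATOMIC_FLAG_INIT }; */
typedef struct { atomic_flag locked; } spinlock;

void lock(spinlock *l) {
    /* test-and-set is an atomic read-modify-write: it returns the old
     * value and writes 'set' in one indivisible step */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;  /* spin until the flag was previously clear */
}

void unlock(spinlock *l) {
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}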

Cache Coherency in Shared Memory Machines

Maintaining cache coherency is a problem in a multiprocessor system when the processors contain local cache memories. Data inconsistency between different caches easily arises in such a system.
The major concern areas are −
Sharing of writable data
Process migration
I/O activity

Sharing of writable data

When two processors (P1 and P2) hold the same data element (X) in their local caches and one process on P1 writes to X, then, because the caches are write-through, the main memory is also updated. But when P2 subsequently tries to read X, it gets an outdated value, because the copy of X in P2's cache was never updated.


Process migration

In the first stage, the cache of P1 has data element X, whereas P2 has nothing. A process on P2 first writes X and then migrates to P1. Now the process starts reading data element X, but since processor P1 holds an outdated copy in its cache, the process reads stale data. Similarly, a process on P1 may write to data element X (with a write-back cache) and then migrate to P2; after migration, the process starts reading data element X on P2 but finds an outdated version of X in the main memory, because the modified copy was never written back.

I/O activity

As illustrated in the figure, an I/O device is added to the bus in a two-processor multiprocessor architecture. In the beginning, both caches contain the data element X. When the I/O device receives a new element X, it stores the new element directly in the main memory. Now, when either P1 or P2 (assume P1) tries to read element X, it gets an outdated copy from its cache. Conversely, if P1 writes to element X and the I/O device then tries to transmit X, it gets an outdated copy from memory.


Uniform Memory Access (UMA)

Uniform Memory Access (UMA) architecture means the shared memory is the same for all processors
in the system. Popular classes of UMA machines, which are commonly used for (file-) servers, are the
so-called Symmetric Multiprocessors (SMPs). In an SMP, all system resources like memory, disks,
other I/O devices, etc. are accessible by the processors in a uniform manner.

Non-Uniform Memory Access (NUMA)


In NUMA architecture, there are multiple SMP clusters, each having an internal indirect/shared network, which are connected by a scalable message-passing network. NUMA is thus a logically shared, physically distributed memory architecture.

In a NUMA machine, the cache controller of a processor determines whether a memory reference is local to the SMP's memory or remote. To reduce the number of remote memory accesses, NUMA architectures usually equip the processors with caches that can also hold remote data. But when caches are involved, cache coherency needs to be maintained, so these systems are also known as CC-NUMA (Cache Coherent NUMA).
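
As an illustration, a cache controller could classify an address as local or remote by the node id encoded in its high-order bits. The bit layout below (node id in bits 36 and up, i.e. 64 GiB of physical memory per node) is purely an assumption for the sketch:

#include <stdint.h>

#define NODE_SHIFT 36  /* assumption: node id in bits 36 and up */

/* Returns nonzero if the physical address belongs to this node's memory. */
int is_local(uint64_t phys_addr, unsigned my_node) {
    return (unsigned)(phys_addr >> NODE_SHIFT) == my_node;
}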

Cache Only Memory Architecture (COMA)


COMA machines are similar to NUMA machines, with the only difference that the main memories of COMA machines act as direct-mapped or set-associative caches. Data blocks are hashed to a location in the DRAM cache according to their addresses. Data that is fetched remotely is actually stored in the local main memory. Moreover, data blocks do not have a fixed home location; they can move freely throughout the system.

COMA architectures mostly have a hierarchical message-passing network. A switch in such a tree contains a directory of the data elements in its sub-tree. Since data has no home location, it must be explicitly searched for: a remote access requires a traversal along the switches in the tree, searching their directories for the required data. If a switch receives multiple requests from its sub-tree for the same data, it combines them into a single request that is sent to the switch's parent. When the requested data returns, the switch sends a copy of it down each requesting branch of its sub-tree (a sketch of this combining step follows).
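
Here is a minimal sketch in C of the combining step at one switch; the structure and the commented-out helpers are illustrative assumptions, not part of the original text:

#include <stdbool.h>

#define MAX_CHILDREN 8

typedef struct {
    unsigned long block;            /* block id being requested          */
    bool pending;                   /* already forwarded to the parent?  */
    bool waiting[MAX_CHILDREN];     /* children awaiting a reply         */
} combine_entry;

/* Child c requests `block`; only the first request goes up the tree,
 * later requests for the same block are combined with it. */
void request(combine_entry *e, unsigned long block, int c) {
    e->waiting[c] = true;
    if (!e->pending) {
        e->pending = true;
        e->block   = block;
        /* forward_to_parent(block); -- hypothetical helper */
    }
}

/* The data for e->block came back: copy it down every waiting branch. */
void reply(combine_entry *e) {
    for (int c = 0; c < MAX_CHILDREN; c++) {
        if (e->waiting[c]) {
            /* send_copy_down(c, e->block); -- hypothetical helper */
            e->waiting[c] = false;
        }
    }
    e->pending = false;
}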


COMA versus CC-NUMA


Following are the differences between COMA and CC-NUMA.

COMA tends to be more flexible than CC-NUMA because COMA transparently supports the migration and replication of data without the need for OS involvement.

COMA machines are expensive and complex to build because they need non-standard memory management hardware, and the coherency protocol is harder to implement.

Remote accesses in COMA are often slower than those in CC-NUMA, since the tree network needs to be traversed to find the data.
