MCP ppt
Multicore Programming
K. Nagalakshmi, ASP/IT
Department of Information Technology
E.G.S. Pillay Engineering Technology
UNIT I
INTRODUCTION TO MULTICORE
PROCESSORS
Course objective: Understand the recent trends in the field of computer
architecture and identify performance-related parameters
Course outcome: Understand the limitations of ILP and the need for
multicore architectures
UNIT I
INTRODUCTION TO MULTICORE
PROCESSORS
• Scalable design principles
• Principles of processor design
• Instruction Level Parallelism
• Thread level parallelism.
• Parallel computer models
• Symmetric and distributed shared memory architectures
• Multi-core Architectures
• Software and hardware multithreading
• SMT and CMP architectures
• Design issues
• Case studies
• Intel Multicore architecture
• SUN CMP architecture
Multi-core Architectures
• An architecture in which a single physical processor
incorporates the core logic of more than one
processor.
• A single integrated circuit (the die) packages
these cores.
• Benefits: better performance and reduced energy
consumption.
• Two or more cores run concurrently as a single system.
• Multicore processors are used in mobile
devices, desktops, workstations and servers.
Types of Multicore Processors
Flynn’s Taxonomy
Parallel Computer Models
Symmetric and distributed shared
memory architectures
• Symmetric shared memory architecture: several processors share a
single physical memory through a shared bus.
• Distributed shared memory architecture: each processor has its own
local memory.
Symmetric Multiprocessor (SMP) architectures
or
Symmetric shared memory architectures
• Shared memory parallel computers vary widely, but generally have in
common the ability for all processors to access all memory as a global address
space.
• Multiple processors can operate independently but share the same memory
resources.
• Changes in a memory location effected by one processor are visible to all
other processors.
• Shared memory machines can be divided into two main classes based upon
memory access times: UMA and NUMA.
• Uniform Memory Access (UMA)
• Non-Uniform Memory Access (NUMA)
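The global address space described above can be illustrated with threads inside one process, which all see the same memory. This is only a minimal sketch: the names `shared` and `worker` are made up for illustration, and Python threads stand in for the processors of a real shared memory machine.

```python
import threading

# One list plays the role of the global address space: every thread sees it.
shared = [0] * 4


def worker(idx: int) -> None:
    # Each thread writes only its own slot, so no locking is needed here.
    shared[idx] = idx * idx


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread reads every write directly; no data was copied or sent.
print(shared)  # [0, 1, 4, 9]
```

Because every thread writes a different slot, the result is deterministic; writes to the same location would need synchronization.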
Uniform Memory Access (UMA) SMP architectures
• Most commonly represented today by SMP machines with identical processors
• Equal access and access times to memory
• Sometimes called CC-UMA (Cache Coherent UMA). Cache coherent means that if one
processor updates a location in shared memory, all the other processors know
about the update. Cache coherency is accomplished at the hardware level.
• The source processing element (PE) writes data to global memory (GM) and the
destination PE retrieves it.
• Easy to build; conventional operating systems written for SISD machines can
be ported easily.
• Limitations: reliability and expandability.
• A failure of a memory component or of any processor
affects the whole system.
• Increasing the number of processors leads to memory contention.
• Example: Silicon Graphics supercomputers.
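To make the cache coherence bullet concrete, here is a toy software model of two caches on a shared bus. It assumes a write-through, write-invalidate scheme, which is one common approach and is not stated in these slides; real coherence is done in hardware, and the `Bus` and `Cache` classes are hypothetical names used only for this sketch.

```python
class Bus:
    """Shared bus connecting every cache to one global memory."""

    def __init__(self):
        self.memory = {}   # global memory: address -> value
        self.caches = []   # all caches attached to the bus

    def invalidate_others(self, writer, addr):
        # Tell every cache except the writer to drop its stale copy.
        for cache in self.caches:
            if cache is not writer:
                cache.lines.pop(addr, None)


class Cache:
    """Private cache of one processor."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}    # locally cached copies
        self.misses = 0
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:
            # Miss: fetch the current value from global memory.
            self.misses += 1
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.memory[addr] = value            # write-through to memory
        self.bus.invalidate_others(self, addr)   # keep other caches coherent


bus = Bus()
p0, p1 = Cache(bus), Cache(bus)
p0.read("x")
p1.read("x")          # both processors now cache x = 0
p0.write("x", 5)      # p1's copy of x is invalidated
print(p1.read("x"))   # 5
```

Without the invalidation step, `p1` would keep returning its stale copy of `x`, which is exactly the situation cache coherence is meant to prevent.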
Non-Uniform Memory Access (NUMA) SMP
architectures
• Often made by physically linking two or more SMPs
• One SMP can directly access memory of another SMP
• Not all processors have equal access time to all memories
• Memory access across the link is slower
• If cache coherency is maintained, the machine may also be called CC-NUMA
(Cache Coherent NUMA)
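The cost of slower remote accesses can be estimated with a simple weighted average. The sketch below is a toy model, not a measurement: the latencies of 100 ns (local) and 300 ns (remote) are hypothetical values chosen for illustration, as is the function name `average_access_time`.

```python
def average_access_time(local_fraction: float, local_ns: float, remote_ns: float) -> float:
    """Mean memory access time when only part of the accesses stay local."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


LOCAL_NS = 100.0    # hypothetical latency of memory on the same SMP
REMOTE_NS = 300.0   # hypothetical latency of memory reached across the link

for fraction in (1.0, 0.75, 0.5):
    print(fraction, average_access_time(fraction, LOCAL_NS, REMOTE_NS))
```

In this model the average rises from 100 ns when all accesses are local to 200 ns when half of them cross the link, which is why data placement matters on NUMA machines.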
Shared Memory: Pros and Cons
Advantages
• Global address space provides a user-friendly programming perspective to memory
• Data sharing between tasks is both fast and uniform due to the proximity of
memory to CPUs
Disadvantages
• Lack of scalability between memory and CPUs. Adding more CPUs can geometrically
increase traffic on the shared memory-CPU path and, for cache coherent systems,
geometrically increase traffic associated with cache/memory management.
• The programmer is responsible for synchronization constructs that ensure "correct"
access to global memory.
• It is increasingly difficult and expensive to design and produce shared memory
machines with ever increasing numbers of processors.
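The synchronization responsibility mentioned above can be sketched with a lock around a shared counter. This is a minimal illustration using Python's standard `threading.Lock`; the `Counter` class and the thread and iteration counts are assumptions made for the example.

```python
import threading


class Counter:
    """A shared counter whose updates are protected by a lock."""

    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def increment(self) -> None:
        # The read-modify-write of `value` is a critical section:
        # only one thread at a time may execute it.
        with self.lock:
            self.value += 1


counter = Counter()


def add_many(n: int) -> None:
    for _ in range(n):
        counter.increment()


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 40000
```

Without the lock, two threads could read the same old value and both write back the same new one, losing an update; the lock is the programmer-supplied construct that rules this out.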
Distributed Memory Architecture
• Processors have their own local memory. Memory addresses
in one processor do not map to another processor, so there
is no concept of global address space across all processors.
• Distributed memory systems require a communication network to connect
inter-processor memory.
• Because each processor has its own local memory, it
operates independently.
• Changes it makes to its local memory have no effect on the
memory of other processors. Hence, the concept of cache
coherency does not apply.
• When a processor needs access to data in another processor,
it is usually the task of the programmer to explicitly define
how and when data is communicated. Synchronization
between tasks is likewise the programmer's responsibility.
• The network "fabric" used for data transfer varies widely,
though it can be as simple as Ethernet.
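The explicit communication described above can be imitated inside one process. The sketch below is a simulation under stated assumptions: two threads play the role of nodes, each keeps a private dictionary as its local memory, and `queue.Queue` objects stand in for the network. A real distributed memory program would typically use a message-passing library such as MPI. The names `node`, `inboxes` and `results` are hypothetical.

```python
import queue
import threading


def node(rank, inboxes, results):
    # Private "local memory": no other node ever reads or writes this dict.
    local = {"value": (rank + 1) * 10}
    if rank == 0:
        # Explicit send: node 0 ships a copy of its value to node 1.
        inboxes[1].put(("value", local["value"]))
    else:
        # Explicit receive: node 1 blocks until the message arrives.
        _key, received = inboxes[rank].get()
        local["from_node0"] = received
    # Report the final local memory so the example can display it.
    results[rank] = local


inboxes = [queue.Queue(), queue.Queue()]   # one incoming channel per node
results = [None, None]
threads = [threading.Thread(target=node, args=(r, inboxes, results)) for r in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results[0])  # {'value': 10}
print(results[1])  # {'value': 20, 'from_node0': 10}
```

Node 1 learns node 0's value only because the program sends it; nothing in node 0's local memory becomes visible to node 1 on its own.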
Distributed Memory: Pros and Cons
Advantages
• Memory is scalable with number of processors. Increase the number of processors and the
size of memory increases proportionately.
• Each processor can rapidly access its own memory without interference and without the
overhead incurred with trying to maintain cache coherency.
• Cost effectiveness: can use commodity, off-the-shelf processors and networking.
Disadvantages
• The programmer is responsible for many of the details associated with data communication
between processors.
• It may be difficult to map existing data structures, based on global memory, to this memory
organization.
• Non-uniform memory access times: data residing on a remote node takes longer to access
than data in local memory.
Hybrid Distributed-Shared Memory
• The largest and fastest computers in the world today employ both shared and
distributed memory architectures.
• The shared memory component is usually a cache coherent SMP machine.
Processors on a given SMP can address that machine's memory as global.
• The distributed memory component is the networking of multiple SMPs. SMPs
know only about their own memory - not the memory on another SMP. Therefore,
network communications are required to move data from one SMP to another.
• Current trends seem to indicate that this type of memory architecture will
continue to prevail and increase at the high end of computing for the foreseeable
future.
• Advantages and Disadvantages: whatever is common to both shared and
distributed memory architectures.
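The two levels of a hybrid machine can be sketched as a two-stage sum. This is a simulation under stated assumptions, not how a real cluster is programmed: threads inside `node_sum` share one total (the shared memory level), while each node passes its result onward only as a message on a queue (the distributed level). All names and the sample data are hypothetical.

```python
import queue
import threading


def node_sum(chunks):
    """Shared memory level: threads on one node add into one shared total."""
    total = [0]
    lock = threading.Lock()

    def work(chunk):
        partial = sum(chunk)
        with lock:               # synchronize access to the shared total
            total[0] += partial

    threads = [threading.Thread(target=work, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def run_node(chunks, network):
    # Distributed level: the node's result leaves only as an explicit message.
    network.put(node_sum(chunks))


data_per_node = [
    [[1, 2], [3, 4]],   # node 0, two threads
    [[5, 6], [7, 8]],   # node 1, two threads
]
network = queue.Queue()
nodes = [threading.Thread(target=run_node, args=(chunks, network)) for chunks in data_per_node]
for n in nodes:
    n.start()
for n in nodes:
    n.join()

# Collect one message per node and combine them.
grand_total = sum(network.get() for _ in data_per_node)
print(grand_total)  # 36
```

The sketch inherits the costs of both models: the lock inside each node and the explicit messages between nodes.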