Intro - HPC Cluster Computing v2 PDF

This document introduces high performance computing (HPC) and cluster computing. It discusses how HPC is needed to solve problems that require large amounts of computational resources. It then provides an overview of parallel computing and how cluster computing uses multiple interconnected computers to work in parallel. Examples of parallel processing and I/O are also presented.


Introduction to
High Performance Computing:
Cluster Computing

Heru Suhartanto,
Fakultas Ilmu Komputer UI,
heru@cs.ui.ac.id

http://bit.ly/hs-wshop-hpc
Outline
 Why high performance computing (HPC) is needed: problems that demand large computational resources
 Introduction to HPC - parallel computing
 Cluster Computing

Resource Hungry Applications
• Solving grand challenge applications using computer modeling, simulation and analysis: see the tumor detection and in-silico drug design examples in separate slides

(Figure: application domains - Aerospace, Internet & E-commerce, Life Sciences, CAD/CAM, Digital Biology, Military Applications)
Examples of parallel processing (figures only)

Parallel I/O (figures only; a minimal MPI-IO sketch follows below)
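The slides above present parallel I/O only as figures. As a minimal sketch (assuming an MPI implementation with MPI-IO, such as MPICH or Open MPI, is available on the cluster; the file name and block size are made up for illustration), each process writes its own block of one shared file, so the writes proceed in parallel:

/* parallel_io.c - each MPI rank writes its own block of one shared file. */
#include <mpi.h>

#define BLOCK 1024

int main(int argc, char **argv)
{
    int rank, i;
    int buf[BLOCK];
    MPI_File fh;
    MPI_Offset offset;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < BLOCK; i++)        /* fill the local block with rank-specific data */
        buf[i] = rank;

    MPI_File_open(MPI_COMM_WORLD, "shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    offset = (MPI_Offset)rank * BLOCK * sizeof(int);   /* each rank owns its own region */
    MPI_File_write_at(fh, offset, buf, BLOCK, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}

Compiled with mpicc and run with, for example, mpirun -np 4 ./a.out, this would produce one file containing four rank-ordered blocks.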
The Area/research topics
 At applications / problems to be solved
 At cluster computing problems

Windows of Opportunities
 Parallel Processing
 Use multiple processors to build MPP/DSM-like systems for
parallel computing
 Network RAM
 Use memory associated with each workstation as aggregate
DRAM cache
 Software RAID
 Redundant array of inexpensive disks
 Use the arrays of workstation disks to provide cheap, highly available, & scalable file storage
 Possible to provide parallel I/O support to applications
 Multipath Communication
 Use multiple networks for parallel data transfer between nodes

How to Run Applications Faster?

There are three ways to improve performance:
 Work harder
 Work smarter
 Get help

Computer analogy:
 Work harder - use faster hardware
 Work smarter - use optimized algorithms and techniques to solve computational tasks
 Get help - use multiple computers to solve a particular task
Era of Computing

 Rapid technical advances
 the recent advances in VLSI technology
 software technology
 OS, PL, development methodologies, & tools
 grand challenge applications have become the main driving force
 Parallel computing
 one of the best ways to overcome the speed bottleneck of a single processor
 good price/performance ratio of a small cluster-based parallel computer
In Summary
 Need more computing power
 Improve the operating speed of processors &
other components
 constrained by the speed of light,
thermodynamic laws, & the high financial costs
for processor fabrication

 Connect multiple processors together & coordinate their computational efforts
 parallel computers
 allow the sharing of a computational task among multiple processors
Technology Trends...

 Performance of PC/workstation components has almost reached the performance of those used in supercomputers…
 Microprocessors (50% to 100% per year);
 Networks (Gigabit SANs);
 Operating systems (Linux, ...);
 Programming environments (MPI, ...);
 Applications (.edu, .com, .org, .net, .shop, .bank);
 The rate of performance improvement of commodity systems is much more rapid than that of specialized systems.
Technology Trends (figure)
Trend
 [Traditional usage] Workstations with Unix for science & industry vs PC-based machines for administrative work & word processing
 [Trend] A rapid convergence in processor performance and kernel-level functionality of Unix workstations and PC-based machines
Rise and Fall of Computer
Architectures
 Vector Computers (VC) - proprietary systems:
 provided the breakthrough needed for the emergence of computational science, but they were only a partial answer.
 Massively Parallel Processors (MPP) - proprietary systems:
 high cost and a low performance/price ratio.
 Symmetric Multiprocessors (SMP):
 suffer from limited scalability.
 Distributed Systems:
 difficult to use and hard to extract parallel performance.
 Clusters - gaining popularity:
 High Performance Computing - commodity supercomputing
 High Availability Computing - mission-critical applications
The Dead Supercomputer Society
http://www.paralogos.com/DeadSuper/
 ACRI
 Alliant
 American Supercomputer
 Ametek
 Applied Dynamics
 Astronautics
 BBN
 CDC
 Convex (C4600)
 Cray Computer
 Cray Research (SGI? Tera)
 Culler-Harris
 Culler Scientific
 Cydrome
 Dana/Ardent/Stellar
 Elxsi
 ETA Systems
 Evans & Sutherland Computer Division
 Floating Point Systems
 Galaxy YH-1
 Goodyear Aerospace MPP
 Gould NPL
 Guiltech
 Intel Scientific Computers
 Intl. Parallel Machines
 KSR
 MasPar
 Meiko
 Myrias
 Saxpy
 Scientific Computer Systems (SCS)
 Soviet Supercomputers
 Suprenum
 Thinking Machines
Computer Food Chain: Causing the Demise of Specialized Systems

• Demise of mainframes, supercomputers, & MPPs

Towards Clusters

The promise of supercomputing to the average PC user?
Towards Commodity Parallel
Computing
 linking together two or more computers to jointly solve computational problems
 since the early 1990s, an increasing trend to move away from expensive and specialized proprietary parallel supercomputers towards clusters of workstations
 hard to find money to buy expensive systems
 the rapid improvement in the availability of commodity high-performance components for workstations and networks
 low-cost commodity supercomputing
 from specialized traditional supercomputing platforms to cheaper, general-purpose systems consisting of loosely coupled components built up from single or multiprocessor PCs or workstations
History: Clustering of Computers for Collective Computing

(Figure: timeline of clustering for collective computing, from 1960 through the 1980s, 1990, 1995+ and 2000+, ending with clusters of PDAs.)
Why PC/WS Clustering Now?
 Individual PCs/workstations are becoming increasingly powerful
 Commodity network bandwidth is increasing and latency is decreasing
 PC/workstation clusters are easier to integrate into existing networks
 Typical low user utilization of PCs/WSs
 Development tools for PCs/WSs are more mature
 PC/WS clusters are cheap and readily available
 Clusters can be easily grown

What is a Cluster?
 A cluster is a type of parallel or distributed processing system which consists of a collection of interconnected stand-alone computers cooperatively working together as a single, integrated computing resource.
 A node
 a single or multiprocessor system with memory, I/O facilities, & OS
 generally 2 or more computers (nodes) connected together
 in a single cabinet, or physically separated & connected via a LAN
 appear as a single system to users and applications
 provide a cost-effective way to gain features and benefits
Cluster Architecture

(Figure: layered cluster architecture. Sequential and parallel applications run on top of a parallel programming environment and cluster middleware that provide a single system image and availability infrastructure; each PC/workstation node runs communications software over its network interface hardware, and the nodes are linked by a cluster interconnection network/switch.)
Cluster Components

Prominent Components of
Cluster Computers (I)
 Multiple high-performance computers
 PCs
 Workstations
 SMPs (CLUMPS)
 Distributed HPC systems leading to Grid Computing

System CPUs
 Processors
 Intel x86 Processors
 Pentium Pro and Pentium Xeon
 AMD x86, Cyrix x86, etc.
 Digital Alpha
 Alpha 21364 processor integrates processing, memory controller, and network interface into a single chip
 IBM PowerPC
 Sun SPARC
 SGI MIPS
 HP PA
System Disk
 Disk and I/O
 Overall improvement in disk access time has been less than 10% per year
 Amdahl's law
 Speed-up obtained from faster processors is limited by the slowest system component (see the worked example below)
 Parallel I/O
 Carry out I/O operations in parallel, supported by a parallel file system based on hardware or software RAID
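A worked example of Amdahl's law (the standard formula, added here for clarity; it is not spelled out on the slide): if a fraction s of a program is inherently serial, the speed-up on N processors is at most

    Speedup(N) = 1 / (s + (1 - s) / N)

With s = 0.05 and N = 64, Speedup = 1 / (0.05 + 0.95/64), which is roughly 15.4, and even as N grows without bound the speed-up stays below 1/s = 20. The same bound applies when the serial fraction is dominated by disk access, which is why parallel I/O matters.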
Commodity Components for
Clusters (II): Operating Systems
 Operating Systems
 2 fundamental services for users
 make the computer hardware easier to use
 create a virtual machine that differs markedly from the real machine
 share hardware resources among users
 Processor - multitasking
 The new concept in OS services
 support multiple threads of control in a process itself
 parallelism within a process
 multithreading
 POSIX thread interface is a standard programming environment
 Trend
 Modularity - MS Windows, IBM OS/2
 Microkernel - provides only essential OS services
 high-level abstraction of OS, portability
Prominent Components of
Cluster Computers
 State-of-the-art operating systems
 Linux (MOSIX, Beowulf, and many more)
 Microsoft NT (Illinois HPVM, Cornell Velocity)
 SUN Solaris (Berkeley NOW, C-DAC PARAM)
 IBM AIX (IBM SP2)
 HP UX (Illinois PANDA)
 Mach (microkernel-based OS) (CMU)
 Cluster operating systems (Solaris MC, SCO Unixware, MOSIX (academic project))
 OS gluing layers (Berkeley GLUnix)
Commodity Components for
Clusters
 Operating Systems
 Linux
 Unix-like OS
 Runs on cheap x86 platforms, yet offers the power and flexibility of Unix
 Readily available on the Internet and can be downloaded without cost
 Easy to fix bugs and improve system performance
 Users can develop or fine-tune hardware drivers which can easily be made available to other users
 Features such as preemptive multitasking, demand-paged virtual memory, multiuser, and multiprocessor support
Commodity Components for
Clusters
 Operating Systems
 Solaris
 UNIX-based multithreading and multiuser OS
 supports Intel x86 & SPARC-based platforms
 Real-time scheduling feature critical for multimedia applications
 Supports two kinds of threads
 Light Weight Processes (LWPs)
 User-level threads
 Supports both BSD and several non-BSD file systems
 CacheFS
 AutoClient
 TmpFS: uses main memory to contain a file system
 Proc file system
 Volume file system
 Supports distributed computing & is able to store & retrieve distributed information
 OpenWindows allows applications to be run on remote systems
Commodity Components for
Clusters
 Operating Systems
 Microsoft Windows NT (New Technology)
 Preemptive, multitasking, multiuser, 32-bit OS
 Object-based security model and special file system (NTFS) that allows permissions to be set on a file and directory basis
 Supports multiple CPUs and provides multitasking using symmetrical multiprocessing
 Supports different CPUs and multiprocessor machines with threads
 Has the network protocols & services integrated with the base OS
 several built-in networking protocols (IPX/SPX, TCP/IP, NetBEUI) & APIs (NetBIOS, DCE RPC, Window Sockets (Winsock))
Prominent Components of
Cluster Computers (III)
 High Performance Networks/Switches
 Ethernet (10 Mbps)
 Fast Ethernet (100 Mbps)
 Gigabit Ethernet (1 Gbps)
 SCI (Scalable Coherent Interface; 12 µs MPI latency)
 ATM (Asynchronous Transfer Mode)
 Myrinet (1.2 Gbps)
 QsNet (Quadrics Supercomputing World, 5 µs latency for MPI messages)
 Digital Memory Channel
 FDDI (Fiber Distributed Data Interface)
 InfiniBand
Prominent Components of
Cluster Computers (IV)
 Fast Communication Protocols
and Services (User Level
Communication):
 Active Messages (Berkeley)
 Fast Messages (Illinois)
 U-net (Cornell)
 XTP (Virginia)
 Virtual Interface Architecture (VIA)

Cluster Interconnects: Comparison (created in 2000)

Bandwidth (MBytes/s): Myrinet 140 (33 MHz) / 215 (66 MHz); QSnet 208; Giganet ~105; ServerNet2 165; SCI ~80; Gigabit Ethernet 30-50
MPI latency (µs): Myrinet 16.5 (33 MHz) / 11 (66 MHz); QSnet 5; Giganet ~20-40; ServerNet2 20.2; SCI 6; Gigabit Ethernet 100-200
List price/port: Myrinet $1.5K; QSnet $6.5K; Giganet ~$1.5K; ServerNet2 $1.5K; SCI ~$1.5K; Gigabit Ethernet ~$1.5K
Hardware availability: Now for all, except ServerNet2 (Q2 '00)
Linux support: Now for all, except ServerNet2 (Q2 '00)
Maximum #nodes: 1000's for all, except SCI (64K)
Protocol implementation: firmware on adapter for Myrinet, QSnet, Giganet and SCI; implemented in hardware for ServerNet2 and Gigabit Ethernet
VIA support: Myrinet soon; QSnet none; Giganet NT/Linux; ServerNet2 done in hardware; SCI software; Gigabit Ethernet NT/Linux (TCP/IP, VIA)
MPI support: Myrinet 3rd party; QSnet Quadrics/Compaq; Giganet 3rd party; ServerNet2 Compaq/3rd party; SCI 3rd party; Gigabit Ethernet MPICH - TCP/IP
Commodity Components for
Clusters
 Cluster Interconnects
 Communicate over high-speed networks using a standard networking protocol such as TCP/IP or a low-level protocol such as AM
 Standard Ethernet
 10 Mbps
 cheap, easy way to provide file and printer sharing
 bandwidth & latency are not balanced with the computational power
 Ethernet, Fast Ethernet, and Gigabit Ethernet
 Fast Ethernet - 100 Mbps
 Gigabit Ethernet
 preserves Ethernet's simplicity
 delivers a very high bandwidth to aggregate multiple Fast Ethernet segments
Commodity Components for
Clusters
 Cluster Interconnects
 Myrinet
 1.28 Gbps full-duplex interconnection network
 Uses low-latency cut-through routing switches, which are able to offer fault tolerance by automatic mapping of the network configuration
 Supports both Linux & NT
 Advantages
 Very low latency (5 µs, one-way point-to-point)
 Very high throughput
 Programmable on-board processor for greater flexibility
 Disadvantages
 Expensive: $1500 per host
 Complicated scaling: switches with more than 16 ports are unavailable
Prominent Components of
Cluster Computers (V)
 Cluster Middleware
 Single System Image (SSI)
 System Availability (SA) Infrastructure
 Hardware
 DEC Memory Channel, DSM (Alewife, DASH), SMP techniques
 Operating System Kernel/Gluing Layers
 Solaris MC, Unixware, GLUnix, MOSIX
 Applications and Subsystems
 Applications (system management and electronic forms)
 Runtime systems (software DSM, PFS, etc.)
 Resource management and scheduling (RMS) software
 SGE (Sun Grid Engine), LSF, PBS, Libra: Economy Cluster Scheduler, NQS, etc.
Advanced Network Services/
Communication SW
 Communication infrastructure support protocol for
 Bulk-data transport
 Streaming data
 Group communications
 Communication services provide the cluster with important QoS parameters
 Latency
 Bandwidth
 Reliability
 Fault-tolerance
 Jitter control
 Network services are designed as a hierarchical stack of protocols with relatively low-level communication APIs, providing means to implement a wide range of communication methodologies
 RPC
 DSM
 Stream-based and message-passing interfaces (e.g., MPI, PVM)
Prominent Components of
Cluster Computers (VI)
 Parallel Programming Environments and Tools
 Threads (PCs, SMPs, NOW, ...)
 POSIX Threads
 Java Threads
 MPI (Message Passing Interface)
 Linux, NT, on many supercomputers
 PVM (Parallel Virtual Machine)
 Parametric Programming
 Software DSMs (Shmem)
 Compilers
 C/C++/Java
 Parallel programming with C++ (MIT Press book)
 RAD (rapid application development) tools
 GUI-based tools for PP modeling
 Debuggers
 Performance Analysis Tools
 Visualization Tools
Prominent Components of
Cluster Computers (VII)
 Applications
 Sequential
 Parallel / Distributed (cluster-aware applications)
 Grand challenge applications
 Weather forecasting
 Quantum chemistry
 Molecular biology modeling
 Engineering analysis (CAD/CAM)
 …
 PDBs, web servers, data mining
Key Operational Benefits of
Clustering
 High Performance
 Expandability and Scalability
 High Throughput
 High Availability

Clusters Classification (I)

Application Target
 High Performance (HP) clusters
 Grand challenge applications
 High Availability (HA) clusters
 Mission-critical applications
Clusters Classification (II)

Node Ownership
 Dedicated clusters
 Non-dedicated clusters
 Adaptive parallel computing
 Communal multiprocessing
Clusters Classification (III)

Node Hardware
 Clusters of PCs (CoPs)
 Piles of PCs (PoPs)
 Clusters of Workstations (COWs)
 Clusters of SMPs (CLUMPs)
Clusters Classification (IV)
 Node Operating System
 Linux clusters (e.g., Beowulf)
 Solaris clusters (e.g., Berkeley NOW)
 NT clusters (e.g., HPVM)
 AIX clusters (e.g., IBM SP2)
 SCO/Compaq clusters (Unixware)
 Digital VMS clusters
 HP-UX clusters
 Microsoft Wolfpack clusters
Clusters Classification (V)

Node Configuration
 Homogeneous clusters
 All nodes have similar architectures and run the same OS
 Heterogeneous clusters
 Nodes have different architectures and run different OSs
Clusters Classification (VI)
 Levels of Clustering
 Group clusters (#nodes: 2-99)
 Nodes are connected by a SAN like Myrinet
 Departmental clusters (#nodes: 10s to 100s)
 Organizational clusters (#nodes: many 100s)
 National metacomputers (WAN/Internet-based)
 International metacomputers (Internet-based, #nodes: 1000s to many millions)
 Grid Computing
 Web-based Computing
 Peer-to-Peer Computing
Cluster Programming

Levels of Parallelism

(Figure: code granularity, from coarse to fine)
 Large grain (task level): whole tasks i-1, i, i+1 - programs distributed with PVM/MPI
 Medium grain (control level): functions func1(), func2(), func3() - executed as threads
 Fine grain (data level): loop iterations such as a(0)=.., a(1)=.., b(0)=.. - handled by compilers
 Very fine grain (multiple issue): individual instructions (load, add, ...) - exploited by hardware

A minimal sketch of these levels in code follows below.
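A minimal, hypothetical C sketch of these grain sizes (all names and the compile line are illustrative assumptions, not from the slides): MPI processes provide task-level parallelism, a POSIX thread provides control-level parallelism, and an OpenMP-style parallel loop stands in for compiler-managed data-level parallelism.

/* granularity.c - one file illustrating three grain sizes.
   Assumed build: mpicc -fopenmp -pthread granularity.c */
#include <mpi.h>
#include <pthread.h>
#include <stdio.h>

#define N 1000000

static double a[N];

/* medium grain (control level): a function executed as a thread */
static void *func1(void *arg)
{
    (void)arg;
    puts("func1 running in its own thread");
    return NULL;
}

int main(int argc, char **argv)
{
    int rank, i;
    pthread_t t;

    /* large grain (task level): each MPI process is an independent task */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("task %d started\n", rank);

    pthread_create(&t, NULL, func1, NULL);
    pthread_join(t, NULL);

    /* fine grain (data level): loop iterations divided among cores */
    #pragma omp parallel for
    for (i = 0; i < N; i++)
        a[i] = i * 0.5;

    /* very fine grain (multiple issue) is exploited inside the CPU itself */
    MPI_Finalize();
    return 0;
}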
Cluster Programming
Environments
 Shared memory based
 DSM
 Threads/OpenMP (enabled for clusters)
 Java threads (IBM cJVM)
 Message passing based
 PVM
 MPI
 Parametric computations
 Nimrod-G and Gridbus Data Grid Broker
 Automatic parallelising compilers
 Parallel libraries & computational kernels (e.g., NetSolve)
Programming Environments and
Tools (I)
 Threads (PCs, SMPs, NOW, ...)
 In multiprocessor systems
 Used to simultaneously utilize all the available processors
 In uniprocessor systems
 Used to utilize the system resources effectively
 Multithreaded applications offer quicker response to user input and run faster
 Potentially portable, as there exists an IEEE standard for the POSIX threads interface (pthreads); see the sketch below
 Extensively used in developing both application and system software
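A minimal POSIX threads sketch in C (illustrative only; the array, sizes and function names are assumptions): two threads each sum half of an array, which is how a multithreaded application keeps both processors of a 2-way SMP node busy.

/* pthreads_sum.c - two POSIX threads each sum half of an array.
   Assumed build: cc -std=c99 -pthread pthreads_sum.c */
#include <pthread.h>
#include <stdio.h>

#define N 1000000
#define NTHREADS 2

static double data[N];
static double partial[NTHREADS];

static void *sum_part(void *arg)
{
    long id = (long)arg;                /* thread index: 0 or 1 */
    long lo = id * (N / NTHREADS);
    long hi = lo + (N / NTHREADS);
    double s = 0.0;

    for (long i = lo; i < hi; i++)
        s += data[i];
    partial[id] = s;                    /* each thread writes its own slot: no lock needed */
    return NULL;
}

int main(void)
{
    pthread_t tid[NTHREADS];
    double total = 0.0;

    for (long i = 0; i < N; i++)
        data[i] = 1.0;

    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, sum_part, (void *)t);
    for (long t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        total += partial[t];
    }

    printf("total = %f\n", total);      /* expect 1000000.000000 */
    return 0;
}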
Programming Environments and
Tools (II)
 Message Passing Systems (MPI and PVM)
 Allow efficient parallel programs to be written for distributed memory systems
 2 most popular high-level message-passing systems: PVM & MPI
 PVM
 both an environment & a message-passing library
 MPI
 a message-passing specification, designed to be a standard for distributed memory parallel computing using explicit message passing
 attempts to establish a practical, portable, efficient, & flexible standard for message passing
 generally, application developers prefer MPI, as it is fast becoming the de facto standard for message passing (a minimal example follows below)
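A minimal MPI example in C (a sketch that should work with any MPI implementation, e.g. MPICH or Open MPI; the message text is arbitrary): rank 0 sends a message to rank 1 using explicit message passing.

/* mpi_hello.c - explicit message passing between two ranks.
   Assumed build/run: mpicc mpi_hello.c && mpirun -np 2 ./a.out */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    int rank, size;
    char msg[64];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        strcpy(msg, "hello from rank 0");
        MPI_Send(msg, (int)strlen(msg) + 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(msg, sizeof(msg), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 of %d received: %s\n", size, msg);
    }

    MPI_Finalize();
    return 0;
}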
Programming Environments and
Tools (III)
 Distributed Shared Memory (DSM) Systems
 Message passing
 the most efficient, widely used programming paradigm on distributed memory systems
 complex & difficult to program
 Shared memory systems
 offer a simple and general programming model
 but suffer from scalability limitations
 DSM on distributed memory systems
 an alternative, cost-effective solution
 Software DSM
 usually built as a separate layer on top of the communication interface
 takes full advantage of the application characteristics: virtual pages, objects, & language types are units of sharing
 TreadMarks, Linda
 Hardware DSM
 better performance, no burden on user & SW layers, fine granularity of sharing, extensions of the cache coherence scheme, & increased HW complexity
 DASH, Merlin
Programming Environments and
Tools (IV)
 Parallel Debuggers and Profilers
 Debuggers
 Very limited
 HPDF (High Performance Debugging Forum) started as a Parallel Tools Consortium project in 1996
 Developed a HPD version specification, which defines the functionality, semantics, and syntax for a command-line parallel debugger
 TotalView
 A commercial product from Dolphin Interconnect Solutions
 The only widely available GUI-based parallel debugger that supports multiple HPC platforms
 Only used in homogeneous environments, where each process of the parallel application being debugged must be running under the same version of the OS
Functionality of Parallel Debuggers
 Managing multiple processes and multiple threads within a process
 Displaying each process in its own window
 Displaying source code, stack trace, and stack frame for one or more processes
 Diving into objects, subroutines, and functions
 Setting both source-level and machine-level breakpoints
 Sharing breakpoints between groups of processes
 Defining watch and evaluation points
 Displaying arrays and their slices
 Manipulating code variables and constants
Programming Environments and
Tools (V)
 Performance Analysis Tools
 Help a programmer understand the performance characteristics of an application
 Analyze & locate parts of an application that exhibit poor performance and create program bottlenecks
 Major components
 A means of inserting instrumentation calls to the performance monitoring routines into the user's applications (see the sketch below)
 A run-time performance library that consists of a set of monitoring routines
 A set of tools for processing and displaying the performance data
 Issue with performance monitoring tools
 Intrusiveness of the tracing calls and their impact on the application performance
 Instrumentation affects the performance characteristics of the parallel application and thus provides a false view of its performance behavior
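A hypothetical sketch of the simplest kind of instrumentation (manual timing calls; this is not how the tools on the next slide are implemented, just the idea they automate): MPI_Wtime() timestamps are inserted around the region of interest, and even these calls perturb the measured code slightly.

/* timing.c - manual instrumentation of a compute region with MPI_Wtime().
   Assumed build: mpicc -std=c99 timing.c */
#include <mpi.h>
#include <stdio.h>

#define N 10000000

int main(int argc, char **argv)
{
    int rank;
    double t0, t1, local = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    t0 = MPI_Wtime();                  /* instrumentation call: start timestamp */
    for (long i = 0; i < N; i++)       /* region being measured */
        local += (double)i * 1e-7;
    t1 = MPI_Wtime();                  /* instrumentation call: end timestamp */

    printf("rank %d: region took %.6f s (result %.3f)\n", rank, t1 - t0, local);

    MPI_Finalize();
    return 0;
}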
Performance Analysis and Visualization Tools

AIMS: instrumentation, monitoring library, analysis. http://science.nas.nasa.gov/Software/AIMS
MPE: logging library and snapshot performance visualization. http://www.mcs.anl.gov/mpi/mpich
Pablo: monitoring library and analysis. http://www-pablo.cs.uiuc.edu/Projects/Pablo/
Paradyn: dynamic instrumentation and running analysis. http://www.cs.wisc.edu/paradyn
SvPablo: integrated instrumentor, monitoring library and analysis. http://www-pablo.cs.uiuc.edu/Projects/Pablo/
Vampir: monitoring library, performance visualization. http://www.pallas.de/pages/vampir.htm
Dimemas: performance prediction for message-passing programs. http://www.pallas.com/pages/dimemas.htm
Paraver: program visualization and analysis. http://www.cepba.upc.es/paraver
Programming Environments and
Tools (VI)
 Cluster Administration Tools
 Berkeley NOW
 Gathers & stores data in a relational DB
 Uses a Java applet to allow users to monitor a system
 SMILE (Scalable Multicomputer Implementation using Low-cost Equipment)
 Called K-CAP
 Consists of compute nodes, a management node, & a client that can control and monitor the cluster
 K-CAP uses a Java applet to connect to the management node through a predefined URL address in the cluster
 PARMON
 A comprehensive environment for monitoring large clusters
 Uses client-server techniques to provide transparent access to all nodes to be monitored
 parmon-server & parmon-client
Cluster Applications

Cluster Applications
 Numerous scientific & engineering applications
 Business applications:
 E-commerce applications (Amazon, eBay);
 Database applications (Oracle on clusters).
 Internet applications:
 ASPs (Application Service Providers);
 Computing portals;
 E-commerce and E-business.
 Mission-critical applications:
 command control systems, banks, nuclear reactor control, star-wars, and handling life-threatening situations.
Some Cluster Systems: Comparison (project - platform - communications - OS - other)

Beowulf - PCs - multiple Ethernet with TCP/IP - Linux and Grendel - MPI/PVM, sockets and HPF
Berkeley NOW - Solaris-based PCs and workstations - Myrinet and Active Messages - Solaris + GLUnix + xFS - AM, PVM, MPI, HPF, Split-C
HPVM - PCs - Myrinet with Fast Messages - NT or Linux connection and global resource manager + LSF - Java-fronted, FM, Sockets, Global Arrays, SHMEM and MPI
Solaris MC - Solaris-based PCs and workstations - Solaris-supported - Solaris + Globalization layer - C++ and CORBA
Cluster of SMPs (CLUMPS)
 Clusters of multiprocessors (CLUMPS)
 To be the supercomputers of the future
 Multiple SMPs with several network interfaces can be connected using high-performance networks
 2 advantages
 Benefit from the high-performance, easy-to-use-and-program SMP systems with a small number of CPUs
 Clusters can be set up with moderate effort, resulting in easier administration and better support for data locality inside a node
Many types of Clusters
 High Performance Clusters
 Linux cluster; 1000 nodes; parallel programs; MPI
 Load-leveling Clusters
 Move processes around to borrow cycles (e.g., Mosix)
 Web-Service Clusters
 LVS/Piranha; load-level TCP connections; replicate data
 Storage Clusters
 GFS; parallel filesystems; same view of data from each node
 Database Clusters
 Oracle Parallel Server
 High Availability Clusters
 ServiceGuard, Lifekeeper, Failsafe, heartbeat, failover clusters
Cluster Design Issues

• Enhanced Performance (performance @ low cost)
• Enhanced Availability (failure management)
• Single System Image (look-and-feel of one system)
• Size Scalability (physical & application)
• Fast Communication (networks & protocols)
• Load Balancing (CPU, Net, Memory, Disk)
• Security and Encryption (clusters of clusters)
• Distributed Environment (social issues)
• Manageability (administration and control)
• Programmability (simple API if required)
• Applicability (cluster-aware and non-aware applications)
Summary: Cluster Advantage
 Price/performance ratio is low compared with a dedicated parallel supercomputer.
 Incremental growth that often matches demand patterns.
 The provision of a multipurpose system
 Scientific, commercial, and Internet applications
 Clusters have become mainstream enterprise computing systems:
 In the 2003 Top 500 Supercomputers list, over 50% of the systems are cluster-based, and many of them are deployed in industry.
References
 Hi
HighhPPerformance
f Cl
Cluster
t C Computing:
ti
Architectures and Systems, Book
Editor: Rajkumar Buyya,
Buyya Slides: Hai
Jin and Raj Buyya
 http://www.buyya.com/cluster
http://www buyya com/cluster
 Bahan kuliah Topik Dalam Komputasi
Paralel Fasilkom UI
Paralel,

http:// bit.ly/hs-wshop-hpc 73
