Cluster Computing
WHITE PAPER
Abstract
In computing, clustering is the use of multiple computers (typically PCs or UNIX
workstations), multiple storage devices, and redundant interconnections to form what appears to
users as a single, highly available system. Cluster computing can be used for load balancing as
well as for high availability. A central idea of cluster computing is that, to the outside
world, the cluster appears to be a single system.
A common use of cluster computing is to load balance traffic on high-traffic Web sites. A
Web page request is sent to a "manager" server, which then determines which of several identical
or very similar Web servers to forward the request to for handling. Having a Web farm (as such a
configuration is sometimes called) allows traffic to be handled more quickly.
Table of Contents
Introduction:
1. Definition:
2. Cluster Categorization:
2.1. High-availability (HA) clusters:
2.2. Load-balancing clusters:
2.3. High-performance computing (HPC) clusters
2.4. High Throughput Clusters
2.5. Grid computing
3. Architecture:
4. Applications:
5. Comparison:
6. Advantages:
7. Conclusion and Future Scope:
References:
Introduction:
Today, a wide range of applications demand more computing power. Although single-processor
PCs and workstations now provide extremely fast processing, the even faster execution that
multiple processors achieve by working concurrently is still needed, and costs are finally
falling as well. Networked clusters of commodity PCs and workstations, built from off-the-shelf
processors and communication platforms such as Fast Ethernet and Gigabit Ethernet, are
becoming increasingly cost-effective and popular. This approach, known as cluster computing,
combines concepts and technologies from the Internet, supercomputing applications, and
distributed and parallel processing.
1. Definition:
A cluster is a collection of connected, independent computers that work together to solve a
problem. The whole cluster can work on a single problem at the same time, or portions of
the cluster can work on different problems at the same time.
The constituent computer nodes are commercial off-the-shelf (COTS) machines, are capable of
full independent operation as is, and are of a type ordinarily employed individually for
standalone mainstream workloads and applications. The nodes may incorporate a single
microprocessor or multiple microprocessors in a symmetric multiprocessor (SMP) configuration.
The interconnection network employs COTS local area network (LAN) or system area network
(SAN) technology that may form a hierarchy or consist of multiple separate network structures.
A cluster network is dedicated to the integration of the cluster compute nodes and is separate
from the cluster's external environment.
2. Cluster Categorization:
3. Architecture:
The Network Interface Hardware is responsible for transmitting and receiving packets of data
between nodes.
The Communication software is responsible for providing an efficient and reliable means of data
communication between nodes and, potentially, outside the cluster.
Various operating systems, including Linux, Solaris, and Windows, can be used for managing
node resources.
System-level middleware is responsible for making sure the computers work together as one
entity. It offers the illusion of a unified system image (Single System Image, SSI) from a
collection of independent but interconnected computers, and it provides high availability
infrastructure for processes, memory, storage, I/O, and networking.
4. Applications:
Clusters have evolved to support applications ranging from supercomputing and mission-critical
software, through web server and e-commerce, to high performance database applications.
Typical uses include numerous scientific and engineering applications as well as business
applications.
5. Comparison:
The terms "grid computing" and "cluster computing" have been used almost interchangeably to
describe networked computers that run distributed applications and share resources.
However, cluster and grid computing represent different approaches to solving performance
problems; although their technologies and infrastructure differ, their features and benefits
complement each other.
Cluster Computing:
1. The computers (or "nodes") on a cluster are networked in a tightly-coupled fashion--they are
all on the same subnet of the same domain, often networked with very high bandwidth
connections.
2. The nodes are homogeneous; they all use the same hardware, run the same software, and are
generally configured identically. Each node in a cluster is a dedicated resource--generally only the
cluster applications run on a cluster node.
3. One advantage available to clusters is the Message Passing Interface (MPI), a
programming interface that allows distributed application instances to communicate with
each other and share information.
4. Dedicated hardware, high-speed interconnects, and MPI provide clusters with the ability to
work efficiently on “fine-grained” parallel problems, including problems with short tasks,
some of which may depend on the results of previous tasks.
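The dependency between short tasks described in point 4 can be illustrated with a small sketch. The example is not MPI: it assumes Python's standard-library threads and queues as stand-ins for cluster nodes and their interconnect, and the names node and pipeline_sum are hypothetical. Each node waits for a partial result from its predecessor, adds its own value, and passes the sum on, so no task can finish before the one ahead of it.

```python
import queue
import threading


def node(local_value, inbox, outbox):
    """Hypothetical cluster node: wait for the upstream partial sum,
    add this node's own value, and pass the result downstream."""
    partial = inbox.get()              # blocks until the previous node sends
    outbox.put(partial + local_value)  # "send" to the next node


def pipeline_sum(values):
    """Chain one worker per value; links[i] is the channel feeding node i."""
    links = [queue.Queue() for _ in range(len(values) + 1)]
    workers = [
        threading.Thread(target=node, args=(v, links[i], links[i + 1]))
        for i, v in enumerate(values)
    ]
    for w in workers:
        w.start()
    links[0].put(0)                    # seed message for the first node
    result = links[-1].get()           # output of the last node in the chain
    for w in workers:
        w.join()
    return result


if __name__ == "__main__":
    print(pipeline_sum([1, 2, 3, 4]))
```

Because every step blocks on a message from its neighbour, the time spent passing messages dominates when the tasks themselves are short, which is why this kind of workload benefits from the dedicated, high-bandwidth interconnects of a cluster.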
Grid Computing:
1. In contrast, the nodes on a grid can be loosely-coupled; they may exist across domains or
subnets.
2. The nodes can be heterogeneous; they can include diverse hardware and software
configurations.
3. Grids typically do not require high-performance interconnects; rather, they usually are
configured to work with existing network connections.
4. As a result, grids are better suited to relatively “coarse-grained” parallel problems,
including problems composed primarily of independent tasks.
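A coarse-grained workload made up of independent tasks can be sketched as follows. The example assumes a local thread pool as a stand-in for grid nodes; simulate and run_independent are hypothetical names, and the work done inside each task is only a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(task_id):
    """Hypothetical independent work unit: needs no input from other tasks."""
    return sum(i * i for i in range(task_id * 1000))


def run_independent(task_ids, workers=4):
    """Hand each task to whichever worker is free; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, task_ids))


if __name__ == "__main__":
    print(run_independent([1, 2, 3]))
```

Since no task waits on another, the only communication is handing out the task and collecting its result, so ordinary network connections between loosely coupled nodes are sufficient.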
6. Advantages:
1. Performance
No matter what measure of performance one is seeking, it is straightforward to claim
that one can get even more of it by using a group of machines.
2. Availability
Clusters offer high availability and resilience to failure.
3. Price/performance
Workstation clusters are a cheap and readily available alternative to specialized High
Performance Computing (HPC) platforms.
Organizations are reluctant to buy large supercomputers, due to the large expense and
short useful life span.
4. Incremental growth
Using clusters of workstations as a distributed compute resource is very cost-effective
because the system can grow incrementally. No large initial investment is needed, which
motivates the use of clusters.
5. Scalability
Clusters offer great scalability, since there is potentially no limit to the number of
machines that can be stacked side by side.
6. Rapid response to technology improvements
References:
www.springer.com/
http://www.rzg.mpg.de/computing/
http://www.bestpricecomputers.co.uk
http://www.buyya.com/cluster/
http://en.wikipedia.org/