Unit 1 CC (R20)
UNIT 1
Introduction to Cloud Computing, Meaning of Cloud and History,
Evolution of Cloud Computing, Cloud Essential Characteristics, Cloud
Computing Architecture: Cloud Service Models (i.e., IaaS, PaaS, and
SaaS), Cloud Deployment Models (i.e., Public, Private, Hybrid, and
Community), System Models for Distributed and Cloud Computing,
Service-Oriented Architecture, Performance, Security and Energy
Efficiency
1960s
One of the renowned names in computer science, John McCarthy,
introduced the whole concept of time-sharing, which enabled enterprises
to share the cost of expensive mainframes. This turned out to be a huge
contribution to the pioneering of the cloud computing concept and the
establishment of the Internet.
1969
With the vision of an interconnected global
space, J.C.R. Licklider introduced the concepts of the “Galactic
Network” and the “Intergalactic Computer Network”, ideas that led to the
development of the Advanced Research Projects Agency Network (ARPANET).
1970
By this era, it had become possible to run multiple operating systems in
isolated environments on the same machine.
1997
Prof. Ramnath Chellappa introduced the term “Cloud Computing” in a
lecture delivered in Dallas.
1999
Salesforce.com pioneered the delivery of enterprise
applications through the medium of a simple website. Along with that, the
services firm also paved the way for experts to deliver
applications via the Internet.
2003
The Virtual Machine Monitor (VMM), which allows multiple guest
operating systems to run on a single physical device, paved the way for
other major innovations.
2006
Amazon started expanding into cloud services. From Elastic Compute Cloud
(EC2) to Simple Storage Service (S3), it introduced the pay-as-you-go
model, which remains standard practice even today.
2013
Driven by IaaS (Infrastructure-as-a-Service), the worldwide public cloud
services market totalled £78bn, making it the fastest-growing
segment of IT services that year.
Infrastructure as a Service (IaaS) means hiring and utilizing the
physical IT infrastructure (network, storage, and servers) of a
third-party provider. The IT resources are hosted on external servers, and
users access them via an Internet connection.
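As a minimal illustration of consuming IaaS programmatically, the Python sketch below provisions a virtual server with the AWS SDK (boto3). It is only a sketch under stated assumptions: the AMI ID and region are placeholders, and it presumes boto3 is installed and AWS credentials are already configured.

# Hedged IaaS sketch: launch one small virtual server on AWS EC2.
# Assumptions: boto3 installed, AWS credentials configured, placeholder AMI ID.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")   # region is an assumption

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder machine image, not a real AMI
    InstanceType="t2.micro",          # small pay-as-you-go instance type
    MinCount=1,
    MaxCount=1,
)
print("Launched IaaS instance:", response["Instances"][0]["InstanceId"])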
The Benefits
o Time and cost savings: no installation and maintenance of IT
hardware in-house.
o Better flexibility: on-demand hardware resources that can be
tailored to your needs.
For Whom?
Platform as a Service (PaaS) means hiring a ready-to-use, provider-managed
environment (runtime, middleware, and development tools) on which you build
and deploy your own applications.
The Benefits
o Focus on development: you can concentrate on the installation and
development of your software applications.
o Time saving and flexibility: no need to manage the implementation
of the platform; it is instantly ready for production.
o Data security: You control the distribution, protection, and backup
of your business data.
For Whom?
Software as a Service (SaaS) means using ready-made applications that are
hosted by the provider and accessed over the Internet, typically through a
web browser.
The Benefits
o You are entirely free from infrastructure management and from
maintaining the software environment: no installation or software
maintenance.
o You benefit from automatic updates with the guarantee that all
users have the same software version.
o It enables easy and quicker testing of new software solutions.
For Whom?
The cloud deployment models describe to what extent the cloud infrastructure
is dedicated to your organization and whether it is managed by yourself or by
your cloud service provider.
⮚ Public cloud
⮚ Private cloud
⮚ Hybrid cloud
⮚ Community cloud
● Public Cloud
Public clouds are managed by third-party providers that deliver cloud services
over the Internet to the general public; these services are typically offered
on pay-as-you-go billing models.
Advantages
● High Scalability
● Cost Reduction
● Reliability and flexibility
● Disaster Recovery
Disadvantages
● Private cloud
Private clouds are distributed systems that run on infrastructure dedicated to
a single organization and provide its users with dynamic provisioning of
computing resources. Instead of pay-as-you-go billing, there may be other
schemes that manage the usage of the cloud and proportionally bill the
different departments or sections of an enterprise. Private cloud
providers include HP Data Centers, Ubuntu, Elastic-Private cloud, Microsoft,
etc.
Advantages
Examples
Red Hat OpenStack, CISCO, DELL, Rackspace, IBM Bluemix Private Cloud,
Microsoft Azure Stack, and VMware Private Cloud
● Hybrid cloud:
A hybrid cloud combines public and private clouds, allowing data and
applications to move between the two environments so that sensitive workloads
stay private while the public cloud is used for scale.
Advantages
Disadvantages
Examples
Netflix, Hulu, Uber, and Airbnb use hybrid platforms such as AWS Outposts,
Azure Stack, Azure Arc, Microsoft Azure VMware Solution, Google Anthos,
Nutanix Cloud Infrastructure, Nutanix Cloud Clusters, VMware Cloud
Foundation, and VMware Cloud on AWS.
● Community Cloud
A community cloud is infrastructure shared by several organizations that have
common concerns, such as mission, security requirements, policy, or
compliance.
Advantages:
● Cost
● Flexible and Scalable
● Security
● Sharing infrastructure
Disadvantages:
Example:
Distributed and cloud computing systems are built over a large number of
autonomous computer nodes. These node machines are interconnected by
SANs, LANs, or WANs in a hierarchical manner.
❖ Computing Clusters
A computing cluster consists of interconnected stand-alone computers that work
cooperatively as a single integrated computing resource, typically connected
through a high-bandwidth LAN or SAN.
❖ Peer-to-Peer Networks
o A distributed system architecture
o Each computer in the network can act as a client or server for other
network computers.
o No centralized control
o Typically many nodes, but unreliable and heterogeneous
o Nodes are symmetric in function
o Take advantage of distributed, shared resources (bandwidth, CPU,
storage) on peer-nodes
o Fault-tolerant, self-organizing
o Operate in a dynamic environment; frequent joins and leaves are the norm.
❖ Computational/Data Grids
Grid technology demands new distributed computing models,
software/middleware support, network protocols, and hardware
infrastructures. National grid projects are followed by industrial grid
platform development by IBM, Microsoft, Sun, HP, Dell, Cisco, EMC,
Platform Computing, and others. New grid service providers (GSPs) and
new grid applications have emerged rapidly, similar to the growth of
the Internet and web services over the past two decades.
❖ Cloud Platforms
Refer to the introduction to cloud computing above.
❖ Service-Oriented Architecture (SOA)
In the case of fault tolerance, the features in the Web Services Reliable
Messaging (WSRM) framework mimic the OSI layer capability (as in TCP
fault tolerance), modified to match the different abstractions (such as
messages versus packets, and virtualized addressing) at the entity levels.
Security is a critical capability that either uses or reimplements the
capabilities seen in concepts such as Internet Protocol Security (IPsec)
and secure sockets in the OSI layers.
In the earlier years, CORBA and Java approaches were used in distributed
systems rather than today’s SOAP, XML, or REST.
▪ SOAP- In Simple Object Access Protocol (SOAP)-based web services, the service
endpoints are the URIs that define the scope and method of the
XML API operations.
▪ REST- REpresentational State Transfer is an architectural style that
defines a set of constraints to be used for creating web services.
A REST API is a simple and flexible way of accessing web services
without the heavy processing overhead of protocols such as SOAP.
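As a hedged sketch of how a client consumes a REST API, the Python snippet below issues an HTTP GET with the requests library; the endpoint URL and the query parameter are hypothetical and only illustrate the pattern.

# Minimal REST client sketch (assumes the 'requests' library is installed).
# The URL and the "status" parameter are hypothetical placeholders.
import requests

response = requests.get(
    "https://api.example.com/v1/instances",  # hypothetical REST endpoint
    params={"status": "running"},
    timeout=10,
)
response.raise_for_status()      # fail loudly on non-2xx HTTP replies
for item in response.json():     # assumes the endpoint returns a JSON list
    print(item)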
❖ Security
Clusters, grids, P2P networks, and clouds demand security and copyright
protection if they are to be accepted in today’s digital society.
Figure 1.25 summarizes various attack types and their potential damage
to users. As the figure shows, information leaks lead to a loss of
confidentiality. Loss of data integrity may be caused by user alteration,
Trojan horses, and service spoofing attacks. A denial of service (DoS)
results in a loss of system operation and Internet connections. Lack of
authentication or authorization leads to attackers’ illegitimate use of
computing resources. Open resources such as data centers, P2P
networks, and grid and cloud infrastructures could become the next
targets. Users need to protect clusters, grids, clouds, and P2P systems.
Otherwise, users should not use or trust them for outsourced work.
Malicious intrusions to these systems may destroy valuable hosts, as well
as network and storage resources. Internet anomalies found in routers,
gateways, and distributed hosts may hinder the acceptance of these
public-resource computing services.
Security Responsibilities
❖ Energy Efficiency
Application Layer
Middleware Layer
The middleware layer acts as a bridge between the application layer and
the resource layer. This layer provides resource broker, communication
service, task analyzer, task scheduler, security access, reliability control,
and information service capabilities. It is also responsible for applying
energy-efficient techniques, particularly in task scheduling. Until recently,
scheduling was aimed at minimizing makespan, that is, the execution
time of a set of tasks. Distributed computing systems necessitate a new
cost function covering both makespan and energy consumption.
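A minimal sketch of such a combined cost function is given below; the weighting factor, the normalization, and the candidate schedules are illustrative assumptions rather than a standard formulation.

# Hedged sketch: rank candidate task schedules by a weighted sum of
# makespan and energy. Weights and sample numbers are assumptions.
def combined_cost(makespan_s, energy_j, weight=0.5):
    # A real scheduler would normalize the two metrics to comparable scales.
    return weight * makespan_s + (1 - weight) * energy_j

candidates = [
    {"name": "fastest",  "makespan_s": 120, "energy_j": 900},
    {"name": "balanced", "makespan_s": 150, "energy_j": 600},
    {"name": "greenest", "makespan_s": 200, "energy_j": 450},
]
best = min(candidates, key=lambda s: combined_cost(s["makespan_s"], s["energy_j"]))
print("Chosen schedule:", best["name"])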
Resource Layer
Network Layer
In Figure 1.23, scalable performance is estimated against the multiplicity of OS images in distributed
systems deployed up to 2010. Scalable performance implies that the system can achieve higher
speed by adding more processors or servers, enlarging the physical node’s memory size, extending
the disk capacity, or adding more I/O channels. The OS image is counted by the number of
independent OS images observed in a cluster, grid, P2P network, or the cloud. An SMP (symmetric
multiprocessor) server has a single system image, which could be a single node in a large cluster.
NUMA (nonuniform memory access) machines are often made out of SMP nodes with distributed,
shared memory. A NUMA machine can run with multiple operating systems, and can scale to a few
thousand processors communicating with the MPI library. For example, a NUMA machine may have
2,048 processors running 32 SMP operating systems, resulting in 32 OS images in the
2,048-processor NUMA system. The cluster nodes can be either SMP servers or high-end machines
that are loosely coupled together. The cloud could be a virtualized cluster. As of 2010, the largest
cloud was able to scale up to a few thousand VMs.
Keeping in mind that many cluster nodes are SMP or multicore servers, the total number of
processors or cores in a cluster system is one or two orders of magnitude greater than the number of
OS images running in the cluster. The grid node could be a server cluster, or a mainframe, or a
supercomputer, or an MPP. Therefore, the number of OS images in a large grid structure could be
hundreds or thousands fewer than the total number of processors in the grid. A P2P network can
easily scale to millions of independent peer nodes, essentially desktop machines.
Amdahl’s Law
Consider the execution of a given program on a uniprocessor workstation with a total execution time
of T minutes. Now, let’s say the program has been parallelized or partitioned for parallel execution on
a cluster of many processing nodes. Assume that a fraction α of the code must be executed
sequentially, called the sequential bottleneck. Therefore, (1 − α) of the code can be compiled for
parallel execution by n processors. The total execution time of the program is calculated by α T + (1 −
α)T/n, where the first term is the sequential execution time on a single processor and the second
term is the parallel execution time on n processing nodes.
All system or communication overhead is ignored here. The I/O time or exception handling time is
also not included in the following speedup analysis. Amdahl’s Law states that the speedup factor of
using the n-processor system over the use of a single processor is expressed by:
Speedup S = T / [αT + (1 − α)T/n] = 1 / [α + (1 − α)/n]
The maximum speedup of n is achieved only if the sequential bottleneck α is reduced to zero or the
code is fully parallelizable with α = 0.
The sequential bottleneck is the portion of the code that cannot be parallelized. For example, the
maximum speedup achieved is 4, if α = 0.25 or 1 − α = 0.75, even if one uses hundreds of processors.
Amdahl’s law teaches us that we should make the sequential bottleneck as small as possible.
Increasing the cluster size alone may not result in a good speedup in this case.
In Amdahl’s law, we have assumed the same amount of workload for both sequential and parallel
execution of the program with a fixed problem size or data set. This was called fixed-workload
speedup. To execute a fixed workload on n processors, parallel processing may lead to a system
efficiency defined as follows:
E = S/n = 1 / [αn + (1 − α)]
Very often the system efficiency is rather low, especially when the cluster size is very large. To
execute the aforementioned program on a cluster with n = 256 nodes, extremely low efficiency E =
1/[0.25 × 256 + 0.75] = 1.5% is observed. This is because only a few processors (say, 4) are kept busy,
while the majority of the nodes are left idling.
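The short Python check below, a sketch using only the formulas above, reproduces these numbers: with α = 0.25 the speedup approaches the limit 1/α = 4, and the fixed-workload efficiency on a 256-node cluster drops to about 1.5%.

# Amdahl's law:        S = 1 / (alpha + (1 - alpha) / n)
# Fixed-workload eff.: E = S / n = 1 / (alpha * n + (1 - alpha))
def amdahl_speedup(alpha, n):
    return 1.0 / (alpha + (1.0 - alpha) / n)

def fixed_workload_efficiency(alpha, n):
    return amdahl_speedup(alpha, n) / n

alpha, n = 0.25, 256
print(round(amdahl_speedup(alpha, n), 2))             # ~3.95, bounded by 1/alpha = 4
print(round(fixed_workload_efficiency(alpha, n), 4))  # ~0.0154, i.e. about 1.5%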
Gustafson’s Law
To achieve higher efficiency when using a large cluster, we must consider scaling the problem size to
match the cluster capability. This leads to the following speedup law proposed by John Gustafson
(1988), referred as scaled-workload speedup. Let W be the workload in a given program. When using
an n-processor system, the user scales the workload to W′ = αW + (1 − α)nW. Note that only the
parallelizable portion of the workload is scaled n times in the second term. This scaled workload W′ is
essentially the sequential execution time on a single processor. The parallel execution time of a
scaled workload W′ on n processors is W. Thus, the scaled-workload speedup is defined as follows:
S′ = W′/W = [αW + (1 − α)nW]/W = α + (1 − α)n
This speedup is known as Gustafson’s law. By fixing the parallel execution time at level W, the
following efficiency expression is obtained:
E′ = S′/n = α/n + (1 − α)
For the preceding program with a scaled workload, we can improve the efficiency of using a
256-node cluster to E′ = 0.25/256 + 0.75 = 0.751. One should apply Amdahl’s law and Gustafson’s law
under different workload conditions. For a fixed workload, users should apply Amdahl’s law. To solve
scaled problems, users should apply Gustafson’s law.
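A quick Python check, again just a sketch of the two formulas above, contrasts the laws for α = 0.25 and n = 256 and reproduces the scaled-workload efficiency of about 0.751.

# Gustafson's law:      S' = alpha + (1 - alpha) * n
# Scaled-workload eff.: E' = S' / n = alpha / n + (1 - alpha)
def gustafson_speedup(alpha, n):
    return alpha + (1.0 - alpha) * n

def scaled_workload_efficiency(alpha, n):
    return gustafson_speedup(alpha, n) / n

alpha, n = 0.25, 256
print(gustafson_speedup(alpha, n))                     # 192.25
print(round(scaled_workload_efficiency(alpha, n), 3))  # 0.751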
In addition to performance, system availability and application flexibility are two other important
design goals in a distributed computing system.
System Availability
HA (high availability) is desired in all clusters, grids, P2P networks, and cloud systems. A system is
highly available if it has a long mean time to failure (MTTF) and a short mean time to repair (MTTR).
System availability is formally defined as follows:
System Availability = MTTF / (MTTF + MTTR)
System availability is attributed to many factors. All hardware, software, and network components
may fail. Any failure that will pull down the operation of the entire system is called a single point of
failure. The rule of thumb is to design a dependable computing system with no single point of failure.
Adding hardware redundancy, increasing component reliability, and designing for testability will help
to enhance system availability and dependability.
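The following one-liner sketches the availability formula; the MTTF and MTTR values are illustrative assumptions, not measurements of any real system.

# System availability = MTTF / (MTTF + MTTR).
def availability(mttf_hours, mttr_hours):
    return mttf_hours / (mttf_hours + mttr_hours)

# Illustrative numbers: a node failing every 1000 hours and repaired in 2 hours.
print(round(availability(1000.0, 2.0), 4))   # 0.998, i.e. roughly 99.8% available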
In Figure 1.24, the effects on system availability are estimated by scaling the system size in terms of
the number of processor cores in the system.
In general, as a distributed system increases in size, availability decreases due to a higher chance of
failure and a difficulty in isolating the failures. Both SMP and MPP are very vulnerable with
centralized resources under one OS. NUMA machines have improved in availability due to the use of
multiple OSes. Most clusters are designed to have HA with failover capability. Meanwhile, private
clouds are created out of virtualized data centers; hence, a cloud has an estimated availability similar
to that of the hosting cluster. A grid is visualized as a hierarchical cluster of clusters. Grids have higher
availability due to the isolation of faults. Therefore, clusters, clouds, and grids have decreasing
availability as the system increases in size. A P2P file-sharing network has the highest aggregation of
client machines. However, it operates independently and with low availability, since many peer nodes
may depart or fail simultaneously.