
CLOUD COMPUTING

Prof.M.VENGATESHWARAN M.E., (Ph.D)


Assistant Professor in CSE
Department of Computer Science and Engineering
Sri Krishna College of Engineering & Technology (Autonomous)
Coimbatore
E-mail:mkvengatesh@gmail.com



PREFACE
This material explores the field of Cloud Computing, building on the reader's knowledge of the database
field. It has been designed to help students understand the theoretical concepts and prepare for
interviews easily.
With clear, straightforward text and appropriate illustrations, this material brings an active style
of learning to the study of Cloud Computing.
Useful suggestions and healthy criticism to improve the material are welcomed and will be thankfully
acknowledged.

Mr.M.VENGATESHWARAN

Designation :Assistant Professor in CSE

Qualification : B.E., M.E., (Ph.D)

Department : CSE

Specialization : Big Data, Machine & Deep Learning, Data mining,


Database, Information Retrieval, Social Network
Analysis.

Publication :
Awards - 06
Book Published – 08
International Journal - 31
International Conference -42
National Conference -32



INTRODUCTION TO CLOUD COMPUTING
 Cloud Computing is the delivery of computing services such as servers, storage, databases,
networking, software, analytics, intelligence, and more, over the Cloud (Internet).
 Cloud computing is a service that allows customers to work over the internet.
 It simply states that cloud computing means storing and accessing the data and programs over the
internet rather than the computer’s hard disk.
 The data can be anything such as music, files, images, documents, and many more.
 The user can access the data from anywhere just with the help of an internet connection.
 To access cloud computing, the user should register and is provided with an ID and password
for security reasons.
 The speed of transfer depends on various factors such as internet speed, the capacity of the server, and
many more.

History:
 Before cloud computing emerged, there was client/server computing: centralized storage in which all
the data, software applications and all the controls reside on the server side.
 If a user wants to run a program or access specific data, he connects to the server, gains appropriate
access, and can do his business. The distributed computing concept came after this, where all the
computers are networked together and resources are shared when needed.
 The Cloud Computing concept came into the picture in the 1950s with the implementation of mainframe
computers, accessible via thin/static clients. Then in 1961, John McCarthy delivered a speech at MIT in
which he suggested that computing could be sold as a utility, like electricity. The idea was great, but it was
much ahead of its time: despite interest in the model, the technology at that time was not ready for it.
 In 1999, Salesforce.com became the first company to enter the cloud arena, pioneering the concept of
providing enterprise-level applications to end users through the Internet.
 Then in 2002, Amazon came up with Amazon Web Services, providing services like computation,
storage, and even human intelligence.
 In 2009, Google Apps and Microsoft’s Windows Azure also started to provide cloud computing
enterprise applications. Other companies like HP and Oracle also joined the stream of cloud
computing, for fulfilling the need for greater data storage.



Definition of Cloud
 Cloud computing is an emerging model through which users can gain access to their resources,
application & services from anywhere at any time on any connected devices.
 Goal: Better use of distributed hardware and software resources.
 High throughput at lowest cost.
 Solve large scale problem in less time
 The National Institute of Standards and Technology (NIST) definition of cloud computing says
that “Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a
shared pool of configurable computing resources (like storage, networks, servers, applications &
services) that can be rapidly provisioned & released with minimal management effort”.



Types of Deployment Models

1. Public cloud:

 Basically runs over the public internet.


 It is designed for the general public, where resources, applications and services are provided over the
internet.
 Managed by Cloud Service Providers (CSPs) and Cloud Service Brokers (CSBs), who give access to
cloud services either free of charge or on a subscription/pay-per-use model.
 Users can adopt this model without purchasing any specialized hardware or software.
 Ex: Amazon Web Services, Google App Engine, Salesforce, etc.
Adv:
 It saves cost
 No need for server administration
 No training is required
 Resources are easily scalable
Dis-Adv:
 Lack of data security

2. Private Cloud
 It is used by an organization internally and is meant for a single organization; anyone within the
organization can access data, services and web applications easily through the local server and local
network, but users outside the organization cannot access them.
 This model runs on the intranet.
 Completely managed by the organization's own people.
Adv:
 Speed of access is high
 More secure
 Does not require internet connection
Dis-Adv:
 Implementation cost is high
 It requires an administrator
 Scalability is very limited
3. Hybrid Cloud

 In a hybrid cloud, applications can be moved easily from one cloud to another. A Hybrid Cloud is a
combination of Public and Private Cloud which supports an organization's requirements for handling its
data.
 It runs both online and offline (over the internet and the intranet).
 A hybrid cloud may lack some of the flexibility, security and certainty of purely in-house applications.
4. Community Cloud
 Companies having similar interests and work can share the same cloud, and this can be done with the
help of a Community Cloud. The initial investment is reduced, as the setup cost is shared among the
member organizations.
 Managed by a third party.

Some of the companies which use Cloud Computing are-

 Netflix
 Pinterest
 Xerox
 Instagram
 Apple
 Google
 Facebook

Types of Service Models:

1. SAAS
 SaaS stands for Software as a Service; it provides a facility for the user to use the software from anywhere
with the help of an internet connection. It is also known as software on demand.
 Remote access is possible because the service providers host the applications and their associated data
at their own location.
 SaaS has various benefits: it is economical, and the user only has to pay for some of the basic costs such
as licensing fees, installation costs, maintenance fees, and support fees.
 Some examples of SaaS are Yahoo! Mail, Hotmail, and Gmail.



2. PAAS

 PaaS stands for Platform as a Service.


 This helps the user by providing the facility to make, publish, and customize the software in the hosted
environment.
 An internet connection is all that is needed. It also has several benefits, such as lower costs; the user
only has to pay for the essential things.
 The host of a PaaS has the hardware and software of its own.
 This frees the user from installing the hardware and software to execute a new application.

3. IAAS
 IaaS stands for Infrastructure as a Service.
 With the help of IAAS, the user can use IT hardware and software just by paying the basic price of it.
 The companies that use IaaS are IBM, Google, and Amazon. With the help of virtualization, the host
can manage and create the infrastructure resources in the cloud.
 For small start-ups and firms, the IaaS has the major advantage as it benefits them with the
infrastructure rather than spending a large amount of money on hardware and infrastructure.
 The reason for choosing IaaS is that it is easier, faster, and cost-efficient, which reduces the burden on
organizations (a small provisioning sketch follows below).
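As an illustration of the IaaS idea of renting infrastructure programmatically, the sketch below requests a virtual machine from a cloud provider's API. It is a minimal, hedged example assuming AWS EC2 via the boto3 Python SDK; the image ID and instance type are placeholders and credentials are assumed to be configured. Other IaaS providers expose similar APIs.

```python
# Minimal sketch: renting a virtual machine through an IaaS API (assumes AWS/boto3).
# The AMI ID below is a placeholder and credentials must already be configured.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder machine image
    InstanceType="t2.micro",          # small, low-cost instance type
    MinCount=1,
    MaxCount=1,
)

instance_id = response["Instances"][0]["InstanceId"]
print("Provisioned instance:", instance_id)

# The same API can later release the rented capacity (pay-per-use):
# ec2.terminate_instances(InstanceIds=[instance_id])
```

The point is not the specific SDK but that compute capacity is created and released on demand, which is what distinguishes IaaS from buying hardware.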

Features of Cloud :



Advantages:

Disadvantages

Applications:



EVOLUTION OF CLOUD COMPUTING

Cloud computing is all about renting computing services. This idea first came in the 1950s.
In making cloud computing what it is today, five technologies played a vital role. These are
distributed systems and its peripherals, virtualization, web 2.0, service orientation, and
utility computing.

 Distributed System

It is a composition of multiple independent systems but all of them are depicted as a


single entity to the users. The purpose of distributed systems is to share resources and
also use them effectively and efficiently. Distributed systems possess characteristics
such as scalability, concurrency, continuous availability, heterogeneity, and
independence in failures. But the main problem with this system was that all the systems
were required to be present at the same geographical location. Thus to solve this
problem, distributed computing led to three more types of computing and they were-
Mainframe computing, cluster computing, and grid computing.
 Mainframe computing:
Mainframes which first came into existence in 1951 are highly powerful and reliable
computing machines. These are responsible for handling large data such as massive
input-output operations. Even today these are used for bulk processing tasks such as
online transactions etc. These systems have almost no downtime with high fault
tolerance. After distributed computing, these increased the processing capabilities of the
system. But these were very expensive. To reduce this cost, cluster computing came as
an alternative to mainframe technology.



 Cluster computing:
In 1980s, cluster computing came as an alternative to mainframe computing. Each
machine in the cluster was connected to each other by a network with high bandwidth.
These were way cheaper than those mainframe systems. These were equally capable of
high computations. Also, new nodes could easily be added to the cluster if it was
required. Thus, the problem of the cost was solved to some extent but the problem
related to geographical restrictions still pertained. To solve this, the concept of grid
computing was introduced.
 Grid computing:
In 1990s, the concept of grid computing was introduced. It means that different systems
were placed at entirely different geographical locations and these all were connected via
the internet. These systems belonged to different organizations and thus the grid
consisted of heterogeneous nodes. Although it solved some problems but new problems
emerged as the distance between the nodes increased. The main problem which was
encountered was the low availability of high bandwidth connectivity and with it other
network-associated issues. Thus, cloud computing is often referred to as the “Successor of
grid computing”.
 Virtualization:
It was introduced nearly 40 years back. It refers to the process of creating a virtual layer
over the hardware which allows the user to run multiple instances simultaneously on the
hardware. It is a key technology used in cloud computing. It is the base on which major
cloud computing services such as Amazon EC2, VMware vCloud, etc work on.
Hardware virtualization is still one of the most common types of virtualization.
 Web 2.0:
It is the interface through which the cloud computing services interact with the clients. It
is because of Web 2.0 that we have interactive and dynamic web pages. It also increases
flexibility among web pages. Popular examples of web 2.0 include Google Maps,
Facebook, Twitter, etc. Needless to say, social media is possible because of this
technology only. It gained major popularity in 2004.
 Service orientation:
It acts as a reference model for cloud computing. It supports low-cost, flexible, and
evolvable applications. Two important concepts were introduced in this computing
model. These were Quality of Service (QoS) which also includes the SLA (Service
Level Agreement) and Software as a Service (SaaS).
 Utility computing:
It is a computing model that defines service provisioning techniques for services such as
compute services along with other major services such as storage, infrastructure, etc
which are provisioned on a pay-per-use basis.



PARALLEL AND DISTRIBUTED COMPUTING
Parallel computing

A parallel system contains more than one processor having direct memory access to the shared memory that
can form a common address space. Usually, a parallel system is of a Uniform Memory Access (UMA)
architecture. In UMA architecture, the access latency (processing time) for accessing any particular location
of a memory from a particular processor is the same. Moreover, the processors are also configured to be in a
close proximity and are connected by an interconnection network. Conventionally, interprocessor
communication between the processors happens through either read or write operations on the shared
memory, even though the use of message passing is also possible (with emulation on the shared memory).
Moreover, the hardware and software are tightly coupled, and usually the processors in such a network are
installed to run on the same operating system. In general, the processors are homogeneous
and are installed within the same container of the shared memory. A multistage switch/bus containing a
regular and symmetric design is used for greater efficiency.

The following diagram represents a UMA parallel system with multiple processors connecting to multiple
memory units through network connection.
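To make the shared-memory idea concrete, the short sketch below splits one computation across several worker processes on a single machine, in the spirit of a UMA parallel system. It is an illustrative example only, using Python's standard multiprocessing module; the data set and chunking scheme are invented for the illustration.

```python
# Minimal sketch: a parallel computation on one machine.
# Several worker processes compute partial sums of the same data set,
# mirroring how processors in a parallel system divide one large problem.
from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker handles one slice of the data independently.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i::4] for i in range(4)]   # split the work for 4 workers

    with Pool(processes=4) as pool:
        partials = pool.map(partial_sum, chunks)

    print("Sum of squares:", sum(partials))
```

Because all workers run on one host, exchanging data is cheap; the later sections contrast this with distributed systems, where the same exchange must cross a network.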

A multicomputer parallel system is another type of parallel system, containing multiple processors configured
without direct access to the shared memory. Moreover, a common address space may or may
not be expected to be formed by the memory of the multiple processors. Hence, computers belonging to this
category are not expected to contain a common clock in practice. The processors are configured in a close
distance, and they are also tightly coupled in general with homogeneous software and hardware. Such
computers are also connected within an interconnected network. The processors can establish a
communication with either of the common address space or message passing options. This is represented in
the diagram below.

A multicomputer system in a Non-Uniform Memory Access (NUMA) architecture is usually configured


with a common address space. In such NUMA architecture, accessing different memory locations in a shared
memory across different processors shows different latency times.

Array processors exchange information by message passing. Array processors have a very small market,
owing to the fact that they perform closely synchronized data processing in which the data is exchanged in
lock step, for applications such as digital signal processing and image processing. Such applications can
also involve large iterations over the data.

Compared to the UMA and array processor architectures, NUMA as well as message-passing multicomputer
systems are less preferred when frequent shared data access and communication are required. The primary
benefit of having parallel systems is to derive better throughput by sharing the computational tasks between
multiple processors. The tasks that can be partitioned into multiple subtasks easily and need little
communication for bringing synchronization in execution are the most efficient tasks to execute on parallel
systems. The subtasks can be executed as a large vector or an array through matrix computations, which are
common in scientific applications. Though parallel computing was much appreciated in research and was
beneficial on legacy architectures, such systems are no longer considered efficient or economical in recent
times, due to the following reasons:
 They need special compiler configurations.
 The market for applications that can attain efficiency through parallel processing is very small.
 The evolution of more powerful and efficient computers at lower costs made it less likely that
organizations would choose parallel systems.

DISTRIBUTED COMPUTING

Distributed computing is the concurrent usage of more than one connected computer to solve a problem
over a network connection. The computers that take part in distributed computing appear as a single machine
to their users.

Distributing computation across multiple computers is a great approach when these computers are observed
to interact with each other over the distributed network to solve a bigger problem in reasonably less latency.
In many respects, this sounds like a generalization of the concepts of parallel computing that we discussed in
the previous section. The purpose of enabling distributed systems includes the ability to confront a problem
that is either bigger or longer to process by an individual computer.

Distributed computing, the latest trend, is performed on a distributed system, which is considered to be a
group of computers that do not share a common physical clock or a shared memory, that interact via
information exchanged over a communication (inter/intra) network, and where each computer has its own
memory and runs its own operating system. Usually, the computers are semi-autonomous, loosely coupled
and cooperate to address a problem collectively.



Examples of distributed systems include the Internet, an intranet, and a Network of Workstations (NOW),
which is a group of networked personal workstations connected to server machines represented in the
diagram above. Modern-day internet connections include a home hub with multiple devices connected and
operating on the network; search engines such as Google and Amazon services are famous distributed
systems. Three-dimensional animation movies from Pixar and DreamWorks are other trendy examples of
distributed computing.

Given the number of frames to render for a full-length feature (30 frames per second for a 2-hour movie,
which is a lot!), movie studios need to spread the full rendering job across many computers.



In the preceding image, we can observe a web application, another illustration of a distributed application
where multiple users connect to the web application over the Internet/intranet. In this architecture, the web
application is deployed in a web server, which interacts with a DB server for data persistence.

The other aspects of the application requiring a distributed system configuration are instant messaging and
video conferencing applications. Having the ability to solve such problems, along with improved
performance, is the reason for choosing distributed systems.

The devices that can take part in distributed computing include server machines, work stations, and personal
handheld devices.

Capabilities of distributed computing include integrating heterogeneous applications that are developed and
run on different technologies and operating systems, multiple applications sharing common resources, a
single instance service being reused by multiple clients, and having a common user interface for multiple
applications.

Compare Distributed vs parallel

Parallel computing is a type of computing architecture in which several processors simultaneously execute
multiple, smaller calculations broken down from an overall larger, complex problem.

Some examples of parallel computing include weather forecasting, movie special effects, and desktop
computer applications.

A distributed system, also known as distributed computing, is a system with multiple components located on
different machines that communicate and coordinate actions in order to appear as a single coherent system to
the end-user.

Why is distributed computing important?

Distributed computing helps improve the performance of large-scale projects by combining the power of
multiple machines. It is much more scalable and allows users to add computers according to growing
workload demands.

How does distributed computing work?

Distributed systems are groups of networked computers which share a common goal for their work. In
distributed computing, each processor has its own private memory (distributed memory). Information is
exchanged by passing messages between the processors.
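To illustrate message passing in the simplest possible terms, the sketch below sends a work request from one process to another over a TCP socket, standing in for two machines on a network. It uses only Python's standard socket and multiprocessing libraries; the port number and message format are invented for the example.

```python
# Minimal sketch: message passing between two "nodes" with no shared memory.
# A worker listens on a socket, receives a task as a message, computes a
# result, and sends the answer back.
import socket
import time
from multiprocessing import Process

PORT = 50007  # arbitrary port chosen for this example

def worker():
    with socket.socket() as srv:
        srv.bind(("127.0.0.1", PORT))
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            numbers = conn.recv(4096).decode().split(",")   # receive the task
            total = sum(int(n) for n in numbers)            # do the work locally
            conn.sendall(str(total).encode())               # reply with the result

if __name__ == "__main__":
    p = Process(target=worker)
    p.start()
    time.sleep(0.5)                        # give the worker time to start listening
    with socket.socket() as cli:
        cli.connect(("127.0.0.1", PORT))
        cli.sendall(b"1,2,3,4,5")          # send the task as a message
        print("Result from worker:", cli.recv(1024).decode())
    p.join()
```

Each side works only on its own private memory, and all coordination happens through the messages, which is the defining trait of distributed computing described above.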

Parallel versus distributed computing


While both distributed computing and parallel systems are widely available these days, the
main difference between these two is that a parallel computing system consists of multiple
processors that communicate with each other using a shared memory, whereas a distributed
computing system contains multiple processors connected by a communication network.

In parallel computing systems, as the number of processors increases and with enough parallelism
available in applications, such systems easily beat sequential systems in performance through
the shared memory. In such systems, the processors can also contain their own locally
allocated memory, which is not available to any other processor.
In distributed computing systems, multiple processors communicate with each other
using messages that are sent over the network. Such systems are increasingly available
these days because of the low price of computer processors and of the high-bandwidth
links that connect them.
The following reasons explain why a system should be built distributed, not just parallel:

Scalability: As distributed systems do not have the problems associated with shared memory,
with the increased number of processors, they are obviously regarded as more scalable than
parallel systems.

Reliability: The impact of the failure of any single subsystem or computer on the network of
computers defines the reliability of such a connected system. Distributed systems clearly
perform better in this respect than parallel systems.

Data sharing: Data sharing provided by distributed systems is similar to the data sharing
provided by distributed databases. Thus, multiple organizations can have distributed systems
with the integrated applications for data exchange.

Resource sharing: If there exists an expensive, special-purpose resource or processor which cannot be
dedicated to each processor in the system, such a resource can be easily shared across distributed systems.

Heterogeneity and modularity: A system should be flexible enough to accept a new


heterogeneous processor to be added into it and one of the processors to be replaced or
removed from the system without affecting the overall system processing capability.
Distributed systems are observed to be more flexible in this respect.

Geographic construction: The geographic placement of the different subsystems of an application may be
inherently distributed. Local processing may be forced by low communication bandwidth, more specifically
within a wireless network.

Economic: With the evolution of modern computers, high-bandwidth networks & low-cost commodity
hardware, a distributed system assembled from many inexpensive machines is often more economical than
an equivalent parallel system.
Characteristics of Cloud Computing


There are basically 5 essential characteristics of Cloud Computing.

1. On-demand self-service:
Cloud computing services do not require any human administrators; users themselves are able
to provision, monitor and manage computing resources as needed.

2. Broad network access:

The Computing services are generally provided over standard networks and
heterogeneous devices.
3. Rapid elasticity:

The computing services should have IT resources that are able to scale out and in quickly, on an
as-needed basis. Whenever the user requires additional capacity it is provided, and it is scaled
back in as soon as the requirement is over.



4. Resource pooling:

The IT resources (e.g., networks, servers, storage, applications, and services) present are shared
across multiple applications and tenants in an uncommitted manner. Multiple clients are
served from the same physical resource.

5. Measured service:
Resource utilization is tracked for each application and tenant; this provides both the
user and the resource provider with an account of what has been used. This is done for various
reasons, such as monitoring, billing and effective use of resources.

ELASTICITY IN CLOUD
Elasticity is the ability to grow or shrink infrastructure resources dynamically as
needed to adapt to workload changes in an autonomic manner, maximizing the use of
resources. This can result in savings in infrastructure costs overall. Not everyone can
benefit from elastic services though. Environments that do not experience sudden or
cyclical changes in demand may not benefit from the cost savings elastic services offer.
Use of “Elastic Services” generally implies that all resources in the infrastructure are elastic.
This includes, but is not limited to, hardware, software, QoS and other policies,
connectivity, and other resources that are used in elastic applications. This may become
a negative trait where certain applications must have guaranteed
performance. It depends on the environment.



Cloud elasticity is a popular feature associated with scale-out solutions (horizontal
scaling), which allows for resources to be dynamically added or removed when needed.
Elasticity is generally associated with public cloud resources and is more commonly
featured in pay-per-use or pay-as-you-grow services. This means IT managers are not
paying for more resources than they are consuming at any given time. In virtualized
environments cloud elasticity could include the ability to dynamically deploy new
virtual machines or shutdown inactive virtual machines.

A use case that could easily have the need for cloud elasticity would be in retail
with increased seasonal activity. For example, during the holiday season, with Black
Friday spikes and special sales, there can be a sudden increase in
demand on the system. Instead of spending budget on additional permanent
infrastructure capacity to handle a couple months of high load out of the year, this is a
good opportunity to use an elastic solution. The additional infrastructure to handle the
increased volume is only used in a pay-as-you-grow model and then “shrinks” back to a
lower capacity for the rest of the year. This also allows for additional sudden and
unanticipated sales activities throughout the year if needed without impacting
performance or availability. This can give IT managers the security of unlimited
headroom when needed. This can also be a big cost savings to retail companies looking
to optimize their IT spend if packaged well by the service provider.
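The scale-out/scale-in behaviour described above usually boils down to a simple control loop: measure load, compare it against thresholds, and add or remove capacity. The sketch below is a toy illustration of such a policy; the thresholds, the utilization metric, and the capacity limits are all invented for the example, and a real deployment would normally rely on the provider's auto-scaling service instead.

```python
# Toy sketch of an elasticity control loop (not a real autoscaler).
# Scale out when average utilization is high, scale back in when it is low.

SCALE_OUT_THRESHOLD = 0.75   # add a server above 75% average utilization
SCALE_IN_THRESHOLD = 0.30    # remove a server below 30% average utilization
MIN_SERVERS, MAX_SERVERS = 1, 10

def desired_capacity(current_servers: int, avg_utilization: float) -> int:
    """Return how many servers should be running for the observed load."""
    if avg_utilization > SCALE_OUT_THRESHOLD and current_servers < MAX_SERVERS:
        return current_servers + 1      # elastic growth during a spike
    if avg_utilization < SCALE_IN_THRESHOLD and current_servers > MIN_SERVERS:
        return current_servers - 1      # shrink back; stop paying for idle capacity
    return current_servers              # load is in the comfortable band

# Example: a Black Friday style spike followed by a quiet period.
servers = 2
for load in [0.50, 0.80, 0.90, 0.85, 0.40, 0.20, 0.15]:
    servers = desired_capacity(servers, load)
    print(f"avg utilization {load:.0%} -> run {servers} server(s)")
```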

On-demand Provisioning
On-demand computing is a delivery model in which computing resources are
made available to the user as needed. ... When the services are provided by a third
party, the term cloud computing is often used as a synonym for on-demand
computing.

NIST CLOUD COMPUTING REFERENCE ARCHITECTURE


 The NIST Reference Architecture model (see figure 1) defines five key cloud players: cloud consumer,
cloud provider, cloud carrier, cloud broker, and cloud auditor.
 Each cloud player (called "actors" by NIST) can be an individual or an organization that "participates in a
transaction or process and/or performs tasks in cloud computing."



Cloud Provider Function—Service Orchestration
NIST defined service orchestration as “the composition of system components to support the Cloud Providers
activities in arrangement, coordination and management of computing resources in order to provide cloud
services to Cloud Consumers.”
 Service Layer
 The Service Layer is where the cloud provider defines the interface between the cloud consumer and the
cloud services of the cloud provider.
 The interface points are grouped according to the three service models (SaaS, PaaS, and IaaS).
 A cloud provider may define interface points in all three service models or just a subset.

Resource Abstraction and Control Layer

 This layer consists of two distinct but related areas: resource abstraction and control layer.
 The Resource Abstraction Layer primarily deals with virtualization. The Virtualization Essentials course
defines the concept of virtualization as "a set of techniques for hiding hardware resources behind software
abstractions to simplify the way other software or end users interact with those resources."
 The manipulation of the software-abstracted resources enables greater functionality and easier
configuration.
 This is what enables the cloud elasticity and automation.
 The hypervisor and storage area networks (SAN) are two examples of this concept.



 The Control Layer provides the resources management capabilities that allow dynamic resource allocation,
scaling, dynamic reconfiguration, and dynamic access control. Commercial products such as vCloud from
VMware, and open source projects such as OpenStack are prime examples.

 Physical Resource Layer

 The Physical Resource Layer covers all of the traditional hardware resources that underpin the IT
infrastructure.
 This layer consists of physical servers (CPU, memory, bus architecture), disks and storage arrays, network
wiring, switches, and routers.
 This layer also covers the physical data center facility components such as heating, ventilation, air
conditioning (HVAC), electrical power, backup generators, and fuel; physical control of data centers by IT
staff and contractors; and cabling to outside cloud carriers, phone communication, etc.

Cloud Provider Function—Cloud Service Management

 Cloud Service Management is a set of processes and activities a cloud provider must perform in order to
satisfactorily deliver cloud service to consumers.
 These apply equally to a public cloud provider and a private cloud provider.
 NIST groups these processes and activities into three broad areas: Business Support, Provisioning and
Configuration, and Portability and Interoperability.



 Business Support

o The Business Support processes are business-oriented and focus on the business operations of a
cloud provider as they relate to the delivery of cloud services to cloud consumers. There are six
key functions.

Customer Management:
This area covers the activities necessary to manage and maintain the relationship with the cloud
consumer. It deals with items such as customer accounts, complaints and issues, customer contact
information, history of customer interactions, etc.

Contract Management:
This process focuses on the management of contracts between the cloud provider and consumer. This is
implemented via Service Level Agreements (SLAs). Consumers generally pick the level of SLA that
meets their requirements and budget.

Inventory Management:
This process manages the definitive set of cloud services offered to cloud consumers. It establishes a
service catalog and is the primary interface for the consumer to engage with the cloud provider.

Accounting and Billing:


This function handles the financial transactions between the provider and consumer. It generates the
invoices, sends them to the consumer, and collects the revenue. This function supports the “pay-as-you-go”
model as per NIST's cloud definition.

Reporting and Auditing:


This function monitors, tracks, and logs activities performed by the consumer, usually through the
management console. This helps to document what cloud resources the consumer requests, who
requested it, and when.

Pricing and Rating:


This process establishes the price points and tiering for the cloud services of the cloud provider. It
ensures that the cloud provider is competitive by monitoring the competition's pricing and making
adjustments as required. The cloud provider usually offers discounts or credits to the consumer based on
volume usage.

 Provisioning and Configuration


o The Provisioning and Configuration area deals with process activities that the cloud provider
must execute as part of its internal operations.

Rapid Provisioning:
A cloud provider must be able to quickly respond to varying workload demands. This includes scaling
up as well as scaling down. This must be fully automated and requires a scriptable, virtualized
infrastructure.

Resource Changing:
To support rapid elasticity, the provider must implement changes to its underlying resources effectively
and speedily, primarily through automation. These changes include replacing broken components,
upgrading components, adding greater capacity, and reconfiguring existing components.



Monitoring and Reporting:
Ongoing monitoring of the provider's operations and cloud infrastructure is critical to ensure effective
and optimal quality of service. The handling and resolution of events and incidents is ongoing 24 x 7 x
365.

SLA Management:
The cloud provider must ensure that it is meeting its contractual obligations to its customers. Ongoing
management of SLA targets and operational level targets are performed to maintain a high quality of
service.

 Portability and Interoperability


o Cloud consumers need to have a viable exit strategy and are more willing to engage with a
cloud provider that makes it easier to execute an exit strategy. Therefore it is advantageous for
cloud providers to offer maximum interoperability and portability.

Data Portability:
A cloud provider must provide a mechanism to move large amounts of data into and out of the
provider's cloud environment. For example, in a SaaS environment, the cloud consumer must be able to
upload, in bulk, existing HR records into an HR SaaS application.
The consumer must also be able to export in bulk from the HR SaaS application back to their own data
center. Failure to provide easy and reliable transfer mechanisms will discourage the adoption of cloud
services.

Service Interoperability:
When a cloud provider adheres to well-known and accepted technology standards, it is easier for
consumers to develop and deploy cloud solutions that span across more than one cloud provider's
environment.
For a cloud consumer, service interoperability delivers greater disaster recovery resiliency by removing
a single point of failure (i.e. the cloud provider) and greater resource capacity by spreading the
workload across several providers' IaaS resources.

System Portability:
This capability enables a consumer to move or migrate infrastructure resources, like virtual machines
and applications, easily from one cloud provider to another.
As in data portability, this enables a smoother exit strategy that protects a consumer from an
unexpected, long-term disruption of a cloud provider's services.

Cloud Provider Function—Security

The traditional confidentiality-integrity-availability (CIA) areas of security still need to be addressed in each of
the three service layers (IaaS, PaaS, SaaS). For example, an IaaS provider needs to ensure that the hypervisor is
secure and well-configured.

Authentication: Provide a multi-factor authentication by augmenting username/password credentials with a


hardware or software RSA token.



Identity management: Provide an effective identity management solution to manage the consumer usernames
and/or integrate with an in-house system such as Microsoft Active Directory.
Security monitoring: The provider must have strong Intrusion Detection System (IDS)/Intrusion Prevention
System (IPS) tools to track and identify any potential security issue.

Incident response: A well-structured security process to deal with breaches with strong communication
channels is necessary to minimize the impact of any security incident.

Cloud Provider Function—Privacy

A cloud provider must ensure that consumer data stored in the cloud environment is protected and private to
the consumer. If the cloud provider collects data about the consumer, or the consumer's activities and behavior
patterns, then they must ensure that the collected data is fully protected and remains private, and cannot be
accessed by anyone other than the consumer.
A cloud provider must explicitly guarantee that a consumer's data remains in a well-defined geographical
location with explicit acknowledgement of the consumer.

Cloud Broker Functions

A cloud broker is an optional cloud player in the delivery of cloud services. NIST defines a cloud broker as an
entity that acts as an intermediary between the consumer and provider.
A cloud broker is involved in a cloud service delivery when a consumer chooses not to directly manage or
operate the usage of a cloud service.
A cloud broker can function in one or more of the following scenarios.

 Service Intermediation

Service Intermediation is when a broker performs value-add service on behalf of the consumer. For example,
in following figure, the cloud broker performs some administrative or management function on behalf of the
consumer for a particular cloud service.
This value-add service may include activities such as invoice management, invoice and usage reconciliation,
and end-user account management, etc.

 Service Aggregation

Service Aggregation is when a broker integrates two or more cloud services to provide a complex cloud
solution to the consumer. Following figure illustrates a cloud service that is composed of three different cloud
provider's services.



Following figure illustrates a more complex cloud solution composed from several cloud services, each one
delivered through a unique cloud provider.

 Service Arbitrage

Service Arbitrage is when a broker dynamically selects the best cloud service provider in real time. Following
figure illustrates a broker checking for the best cloud service, for example online storage, from three cloud
providers.

Cloud Auditor Functions


A cloud auditor is an optional cloud player in the delivery of cloud services. They provide an independent
evaluation of a cloud provider's capabilities in terms of security, SLA performance, or adherence to industry
standards.
A cloud auditor is usually requested by a cloud consumer to evaluate a cloud provider.
In some cases, a cloud provider uses a cloud auditor to publicly demonstrate their adherence to industry
standards, such as SOX compliance, HIPAA, and PCI.
Depending on the business industry and regulatory environment, a cloud consumer must have audited
compliance records before they can utilize a cloud service.
 Security Audit
In a security audit, a cloud auditor evaluates whether there are sufficient security controls in place and whether
the cloud provider demonstrates adherence to best practice security processes. For example, a cloud auditor
may validate whether or not a cloud provider is compliant to security standard ISO 27001.
 Privacy Impact Audit
A privacy audit by a cloud auditor can provide assurance that personal information (PI) and personally
identifiable information (PII) are protected by a cloud provider.
 Performance Audit
Cloud providers are obliged to deliver the quality of service as agreed to in the SLA. A performance audit by a
cloud auditor can independently verify whether or not such targets are consistently met.
By using an independent third party, performance claims by either the cloud provider or cloud consumer can
be more objectively verified.

PUBLIC CLOUDS
A public cloud is built over the Internet and can be accessed by any user who has paid for the service. Public
clouds are owned by service providers and are accessible through a subscription. The callout box in top of
Figure 4.1 shows the architecture of a typical public cloud. Many public clouds are available, including Google
App Engine (GAE), Amazon Web Services (AWS), Microsoft Azure, IBM Blue Cloud, and Salesforce.com’s
Force.com. The providers of the aforementioned clouds are commercial providers that offer a publicly
accessible remote interface for creating and managing VM instances within their proprietary infrastructure. A
public cloud delivers a selected set of business processes. The application and infrastructure services are
offered on a flexible price-per-use basis.

PRIVATE CLOUDS
A private cloud is built within the domain of an intranet owned by a single organization. Therefore, it is client
owned and managed, and its access is limited to the owning clients and their partners. Its deployment was not
meant to sell capacity over the Internet through publicly accessible interfaces. Private clouds give local users a
flexible and agile private infrastructure to run service workloads within their administrative domains. A private
cloud is supposed to deliver more efficient and convenient cloud services. It may impact the cloud
standardization, while retaining greater customization and organizational control.
HYBRID CLOUDS
A hybrid cloud is built with both public and private clouds, as shown at the lower-left corner of Figure 4.1.
Private clouds can also support a hybrid cloud model by supplementing local infrastructure with computing
capacity from an external public cloud. For example, the Research Compute Cloud (RC2) is a private cloud,
built by IBM, that interconnects the computing and IT resources at eight IBM Research Centers scattered
throughout the United States, Europe, and Asia. A hybrid cloud provides access to clients, the partner network,
and third parties. In summary, public clouds promote standardization, preserve capital investment, and offer
application flexibility. Private clouds attempt to achieve customization and offer higher efficiency, resiliency,
security, and privacy. Hybrid clouds operate in the middle, with many compromises in terms of resource
sharing.



IAAS
This model allows users to use virtualized IT resources for computing, storage, and networking. In short, the
service is performed by rented cloud infrastructure. The user can deploy and run his applications over his
chosen OS environment. The user does not manage or control the underlying cloud infrastructure, but has
control over the OS, storage, deployed applications, and possibly select networking components. This IaaS
model encompasses storage as a service, compute instances as a service, and communication as a service. The
Virtual Private Cloud (VPC) in Example 4.1 shows how to provide Amazon EC2 clusters and S3 storage to
multiple users. Many startup cloud providers have appeared in recent years. GoGrid, FlexiScale, and Aneka
are good examples. Table 4.1 summarizes the IaaS offerings by five public cloud providers.



PAAS

To be able to develop, deploy, and manage the execution of applications using provisioned resources demands
a cloud platform with the proper software environment. Such a platform includes operating system and
runtime library support. This has triggered the creation of the PaaS model to enable users to develop and
deploy their user applications. Table 4.2 highlights cloud platform services offered by five PaaS services. The
platform cloud is an integrated computer system consisting of both hardware and software infrastructure. The
user application can be developed on this virtualized cloud platform using some programming languages and
software tools supported by the provider (e.g., Java, Python, .NET). The user does not manage the underlying
cloud infrastructure. The cloud provider supports user application development and testing on a well-defined
service platform. This PaaS model enables a collaborated software development platform for users from
different parts of the world. This model also encourages third parties to provide software management,
integration, and service monitoring solutions.



SAAS
This refers to browser-initiated application software over thousands of cloud customers. Services and tools
offered by PaaS are utilized in construction of applications and management of their deployment on resources
offered by IaaS providers. The SaaS model provides software applications as a service. As a result, on the
customer side, there is no upfront investment in servers or software licensing. On the provider side, costs are
kept rather low, compared with conventional hosting of user applications. Customer data is stored in the cloud
that is either vendor proprietary or publicly hosted to support PaaS and IaaS.
The best examples of SaaS services include Google Gmail and docs, Microsoft SharePoint, and the CRM
software from Salesforce.com. They are all very successful in promoting their own business or are used by
thousands of small businesses in their day-to-day operations. Providers such as Google and Microsoft offer
integrated IaaS and PaaS services, whereas others such as Amazon and GoGrid offer pure IaaS services and
expect third-party PaaS providers such as Manjrasoft to offer application development and deployment
services on top of their infrastructure services. To identify important cloud applications in enterprises, the
success stories of three real-life cloud applications are presented in Example 4.3 for HTC, news media, and
business transactions. The benefits of using cloud services are evident in these SaaS applications.



Cloud Models:- Characteristics

The essential characteristics can be elaborated as follows:

• On-demand self-service: Users are able to provision cloud computing resources without requiring human
interaction, mostly done through a web-based self-service portal (management console).

• Broad network access: Cloud computing resources are accessible over the network, supporting
heterogeneous client platforms such as mobile devices and workstations.

• Resource pooling: Serve multiple customers from the same physical resources, by securely separating the
resources on a logical level.

• Rapid elasticity: Resources are provisioned and released on-demand and/or automated based on triggers or
parameters. This will make sure your application will have exactly the capacity it needs at any point of time.

• Measured service: Resource usage is monitored, measured, and reported (billed) transparently based on
utilization. In short, pay for use (a small metering sketch follows below).
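Measured service in practice means metering usage and turning it into a bill. The short sketch below shows the arithmetic for a pay-per-use invoice; the rates and usage figures are made up purely for illustration and do not correspond to any real provider's pricing.

```python
# Toy sketch of measured service: metered usage turned into a pay-per-use bill.
# All rates and usage numbers below are illustrative, not real pricing.
rates = {
    "vm_hours": 0.05,           # cost per VM-hour
    "storage_gb_month": 0.02,   # cost per GB-month of storage
    "egress_gb": 0.09,          # cost per GB of outbound data transfer
}

usage = {
    "vm_hours": 720,            # one VM running for a 30-day month
    "storage_gb_month": 50,     # 50 GB stored for the month
    "egress_gb": 10,            # 10 GB transferred out
}

line_items = {item: usage[item] * rates[item] for item in rates}
total = sum(line_items.values())

for item, cost in line_items.items():
    print(f"{item:18s} {usage[item]:>6} x {rates[item]:.2f} = {cost:7.2f}")
print(f"{'total':18s} {total:33.2f}")
```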

Cloud computing is composed of five essential characteristics, four deployment models, and three
service models, as shown in the following figure:



The cloud can provide exactly the same technologies as “traditional” IT infrastructure – the main difference, as
mentioned previously, is that each of these technologies is provided as a service. This service can be accessible
over a cloud management interface layer, which provides access over REST/SOAP API or a management
console website.
As an example, let’s consider Amazon Web Services (AWS). AWS provides multiple cloud infrastructure
services (see Following figure):

Amazon Elastic Compute Cloud (EC2) is a key web service that provides a facility to create and manage
virtual machine instances with operating systems running inside them. There are three ways to pay for EC2
virtual machine instances, and businesses may choose the one that best fits their requirements. An on-demand
instance provides a virtual machine (VM) whenever you need it, and terminates it when you do not. A reserved
instance allows the user to purchase a VM and prepay for a certain period of time. A spot instance can be
purchased through bidding, and can be used only as long as the bid price remains higher than the current
spot price. Another convenient feature of Amazon’s cloud is that it allows for hosting services across multiple
geographical locations, helping to reduce network latency for a geographically-distributed customer base.

Amazon Relational Database Service (RDS) provides MySQL and Oracle database services in the cloud.

Amazon S3 is a redundant and fast cloud storage service that can provide public access to files over HTTP.
Amazon SimpleDB is a very fast, unstructured NoSQL database.

Amazon Simple Queuing Service (SQS) provides a reliable queuing mechanism with which application
developers can queue different tasks for background processing.
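As a concrete illustration of the storage and queuing services just listed, the sketch below uploads a file to S3 and pushes a background task onto an SQS queue using the boto3 Python SDK. The bucket name, queue URL, and file names are placeholders, and the code assumes AWS credentials and the named resources already exist.

```python
# Minimal sketch: using S3 for storage and SQS for background-task queuing.
# Bucket, queue URL, and file names are placeholders for this illustration.
import json
import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

BUCKET = "example-notes-bucket"   # placeholder bucket name
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue"  # placeholder

# 1. Store a file durably in object storage.
s3.upload_file("report.pdf", BUCKET, "reports/report.pdf")

# 2. Queue a task for a background worker to process later.
sqs.send_message(
    QueueUrl=QUEUE_URL,
    MessageBody=json.dumps({"task": "generate_thumbnail",
                            "object_key": "reports/report.pdf"}),
)

# A worker elsewhere would poll the queue and handle the task:
messages = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1)
for msg in messages.get("Messages", []):
    print("Worker received:", msg["Body"])
    sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```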

ARCHITECTURAL DESIGN CHALLENGES


Challenge 1—Service Availability and Data Lock-in Problem
 The management of a cloud service by a single company is often the source of single points of failure.
 To achieve HA, one can consider using multiple cloud providers.
 Using multiple cloud providers may provide more protection from failures.
 Another availability obstacle is distributed denial of service (DDoS) attacks.
 Some utility computing services offer SaaS providers the opportunity to defend against DDoS attacks
by using quick scale-ups.
Challenge 2—Data Privacy and Security Concerns
 Current cloud offerings are essentially public (rather than private) networks, exposing the system to
more attacks.
 Many obstacles can be overcome immediately with well-understood technologies such as encrypted
storage, virtual LANs, and network middleboxes (e.g., firewalls, packet filters).
 In a cloud environment, newer attacks may result from hypervisor malware, guest hopping and
hijacking, or VM rootkits.
 Another type of attack is the man-in-the-middle attack for VM migrations.



Challenge 3—Unpredictable Performance and Bottlenecks

 Multiple VMs can share CPUs and main memory in cloud computing, but I/O sharing is problematic.
 Internet applications continue to become more data-intensive. If we assume applications to be “pulled
apart” across the boundaries of clouds, this may complicate data placement and transport.
 Cloud users and providers have to think about the implications of placement and traffic at every level
of the system, if they want to minimize costs.
 This kind of reasoning can be seen in Amazon’s development of its new CloudFront service. Therefore,
data transfer bottlenecks must be removed, bottleneck links must be widened, and weak servers should
be removed.

Challenge 4—Distributed Storage and Widespread Software Bugs

 The database is always growing in cloud applications.


 The opportunity is to create a storage system that will not only meet this growth, but also combine it
with the cloud advantage of scaling arbitrarily up and down on demand. This demands the design of
efficient distributed SANs.
 Data centers must meet programmers’ expectations in terms of scalability, data durability, and HA.
Data consistency checking in SAN-connected data centers is a major challenge in cloud computing.
 Large-scale distributed bugs cannot be reproduced, so the debugging must occur at a scale in the
production data centers. No data center will provide such a convenience. One solution may be a
reliance on using VMs in cloud computing.

Challenge 5—Cloud Scalability, Interoperability, and Standardization

 The pay-as-you-go model applies to storage and network bandwidth; both are counted in terms of the
number of bytes used.
 Computation is different depending on virtualization level. GAE automatically scales in response to
load increases and decreases; users are charged by the cycles used.
 AWS charges by the hour for the number of VM instances used, even if the machine is idle. The
opportunity here is to scale quickly up and down in response to load variation, in order to save money,
but without violating SLAs.



 Open Virtualization Format (OVF) describes an open, secure, portable, efficient, and extensible format
for the packaging and distribution of VMs.

Challenge 6—Software Licensing and Reputation Sharing

 Many cloud computing providers originally relied on open source software because the licensing model
for commercial software is not ideal for utility computing.
 The primary opportunity is either for open source to remain popular or simply for commercial software
companies to change their licensing structure to better fit cloud computing.
 One can consider using both pay-for-use and bulk-use licensing schemes to widen the business
coverage.
 One customer’s bad behavior can affect the reputation of the entire cloud.
 Another legal issue concerns the transfer of legal liability. Cloud providers want legal liability to
remain with the customer, and vice versa.
 This problem must be solved at the SLA level. We will study reputation systems for protecting data
centers in the next section.

BASICS OF VIRTUALIZATION
 Virtualization is the creation of a virtual version of a hardware platform, OS, network, storage, etc.
 It allows multiple operating systems to run on a single physical machine, called the host machine.
 Each OS instance is called a virtual machine (VM).
 The machine on which the virtual machine is created is known as the host machine, and the virtual
machine is referred to as a guest machine. The virtual machine is managed by software or firmware
known as a hypervisor.



 In earlier days, industries used to keep separate physical servers for file storage, database, web hosting,
email, etc., in their server rooms.
 Each server required separate hardware, OS, application software and an administrator to manage it.
 Any failure in the server hardware could cause indefinite blocking of the services until it was restored,
and the whole system might collapse.
 Therefore, in search of a consolidation solution, the concept of virtualization came into the field.
 Virtualization allows multiple server OSes to run on a single physical machine.
 This greatly saves the cost of purchasing extra physical servers, power consumption, manpower,
licensing, etc.; it also reduces the number of physical servers required, as shown in the figure.
 A virtualization environment allows users to create a VM, delete a VM, copy a VM, migrate a VM, save
the state of a VM and roll back the execution of a VM (see the sketch below).
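The VM lifecycle operations listed above (create, copy, migrate, save, roll back) are typically driven through a hypervisor management API. The sketch below lists the virtual machines on a host and reports their state using the libvirt Python bindings; it is an illustrative fragment assuming a local KVM/QEMU host reachable at qemu:///system and the libvirt-python package installed.

```python
# Minimal sketch: querying a hypervisor for its virtual machines via libvirt.
# Assumes a local KVM/QEMU host ("qemu:///system") and libvirt-python installed.
import libvirt

STATE_NAMES = {
    libvirt.VIR_DOMAIN_RUNNING: "running",
    libvirt.VIR_DOMAIN_PAUSED: "paused",
    libvirt.VIR_DOMAIN_SHUTOFF: "shut off",
}

conn = libvirt.open("qemu:///system")   # connect to the hypervisor (VMM)
try:
    for domain in conn.listAllDomains():    # every defined guest VM
        state, _reason = domain.state()     # numeric state code and reason
        print(f"{domain.name():20s} {STATE_NAMES.get(state, 'other')}")
finally:
    conn.close()
```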

Purpose of virtualization:

 To enhance resource sharing among multiple users.
 To improve computer performance in terms of maximum resource utilization and application flexibility.
 To implement virtualization, a Virtual Machine Monitor (VMM) is required.
 A VMM or hypervisor is a piece of software that allows creating, running and managing multiple
instances of OS over the shared hardware of the host machine.
 A VMM runs one or more VMs on a machine called the host machine, which can be a desktop computer
as well as a server.
 The OS running inside a VM is called the guest OS.
 Each VM shares the hardware resources of the host machine (CPU, RAM, storage, network) to run an
independent operating system.

Virtualization is a technique that allows a single physical instance of an application or resource to be shared
among multiple organizations or tenants (customers). It does so by assigning a logical name to a physical
resource and providing a pointer to that physical resource on demand.

 Virtualization Concept
 Creating a virtual machine over an existing operating system and hardware is referred to as Hardware
Virtualization. Virtual machines provide an environment that is logically separated from the underlying
hardware.
 The machine on which the virtual machine is created is known as host machine and virtual machine is
referred as a guest machine. This virtual machine is managed by a software or firmware, which is
known as hypervisor.
 Hypervisor
 The hypervisor is a firmware or low-level program that acts as a Virtual Machine Manager. There are
two types of hypervisor:
 Type 1 hypervisor executes on bare system. LynxSecure, RTS Hypervisor, Oracle VM, Sun xVM
Server, VirtualLogic VLX are examples of Type 1 hypervisor. The following diagram shows the Type
1 hypervisor.



 The Type 1 hypervisor does not have any host operating system because it is installed on a bare
system.
 Type 2 hypervisor is a software interface that emulates the devices with which a system normally
interacts. Containers, KVM, Microsoft Hyper-V, VMware Fusion, Virtual Server 2005 R2, Windows
Virtual PC and VMware Workstation 6.0 are examples of Type 2 hypervisor.
 The following diagram shows the Type 2 hypervisor.

 Benefits of Virtualization

 Virtualization can increase IT agility, flexibility, and scalability while creating significant cost
savings. Workloads get deployed faster, performance and availability increases and operations become
automated, resulting in IT that's simpler to manage and less costly to own and operate.
 Reduce capital and operating costs.
 Deliver high application availability.
 Minimize or eliminate downtime.
 Increase IT productivity, efficiency, agility and responsiveness.
 Speed and simplify application and resource provisioning.
 Support business continuity and disaster recovery.
 Enable centralized management.
 Build a true Software-Defined Data Center.

Types of Hardware Virtualization

Here are the three types of hardware virtualization:


 Full Virtualization
 Emulation Virtualization
 Para virtualization
 Full Virtualization



 In full virtualization, the underlying hardware is completely simulated. Guest software does not
require any modification to run.

 Emulation Virtualization
 In Emulation, the virtual machine simulates the hardware and hence becomes independent of it. In
this, the guest operating system does not require modification.

 Para virtualization
 In para virtualization, the hardware is not simulated. The guest software runs in its own isolated
domain.

 VMware vSphere is a highly developed infrastructure that offers a management
framework for virtualization. It virtualizes the system, storage, and networking hardware.



TYPES OF VIRTUALIZATION

 Network virtualization. VLANs – virtual networks – have been around for a long time. A VLAN is a
group of systems that communicate in the same broadcast domain, regardless of the physical location
of each node. By creating and configuring VLANs on physical networking hardware, a network
administrator can place two hosts – one in New York City and one in Shanghai – on what appears to
these hosts to be the same physical network. The hosts will communicate with one another under this
scenario. This abstraction has made it easy for companies to move away from simply using physical
connections to define networks and be able to create less expensive networks that are flexible and
meet ongoing business needs.

 Application virtualization. Virtualization is all about abstraction. When it comes to application


virtualization, traditional applications are wrapped up inside a container that allows the application to
believe that it is running on an original supported platform. The application believes that it has access
to the resources that it needs to operate. Although virtualized applications are not really “installed”
in the traditional sense, they are still executed on systems as if they were.

 Desktop virtualization. Desktop and server virtualization are two sides of the same coin. Both
involve virtualization of entire systems, but there are some key differences. Server virtualization
involves abstracting server-based workloads from the underlying hardware, which are then delivered
to clients as normal. Clients don’t see any difference between a physical and virtual server. Desktop
virtualization, on the other hand, virtualizes the traditional desktop and moves the execution of that
client workload to the data center. Those workloads are then accessed via a number of different
methods, such as thin clients or other means.

IMPLEMENTATION LEVELS OF VIRTUALIZATION

 Virtualization is a computer architecture technology by which multiple virtual machines (VMs) are multiplexed
in the same hardware machine.
 The purpose of a VM is to enhance resource sharing by many users and improve computer performance in
terms of resource utilization and application flexibility.
 Hardware resources (CPU, memory, I/O devices, etc.) or software resources (operating system and software
libraries) can be virtualized in various functional layers.
 Levels of Virtualization Implementation

 A traditional computer runs with a host operating system specially tailored for its hardware architecture, as
shown in following (a).



 After virtualization, different user applications managed by their own operating systems (guest OS) can run on
the same hardware, independent of the host OS.
 This is done by adding additional software, called a virtualization layer as shown in following figure (b).

 This virtualization layer is known as hypervisor or virtual machine monitor.


 The VMs are shown in the upper boxes, where applications run with their own guest OS over the virtualized
CPU, memory, and I/O resources.
 The main function of the software layer for virtualization is to virtualize the physical hardware of a host
machine into virtual resources to be used by the VMs, exclusively.
 The virtualization software creates the abstraction of VMs by interposing a virtualization layer at various levels
of a computer system.
 Common virtualization layers include the instruction set architecture (ISA) level, hardware level,operating
system level, library support level, and application level.
 Virtual Machine A representation of a real machine using software that provides an operating
environment which can run or host a guest operating system.



 Guest Operating System An operating system running in a virtual machine environment that would
otherwise run directly on a separate physical system.
 Virtualization Layer or Virtual Machine Monitor The Virtualization layer is the middleware
between the underlying hardware and virtual machine represented in the system, also known as virtual
machine monitor (VMM)
Virtualization Ranging from Hardware to Applications in 5 Abstraction Levels

 Instruction Set Architecture Level

 At the ISA level, virtualization is performed by emulating a given ISA by the ISA of the host machine.
 For example, MIPS binary code can run on an x86-based host machine with the help of ISA emulation.
 Instruction set emulation leads to virtual ISAs created on any hardware machine.
 The basic emulation method is through code interpretation.
 Instruction set emulation requires binary translation and optimization.
 A virtual instruction set architecture (V-ISA) thus requires adding a processor-specific software
translation layer to the compiler.

 Virtualization at Hardware Abstraction Level


 Hardware-level virtualization is performed right on top of the bare hardware. On the one hand, this
approach generates a virtual hardware environment for a VM. On the other hand, the process manages
the underlying hardware through virtualization.
 The idea was implemented in the IBM VM/370 in the 1960s.



 More recently, the Xen hypervisor has been applied to virtualize x86-based machines to run Linux or
other guest OS applications.

Operating System Level


 Refers to an abstraction layer between traditional OS and user applications.
 OS-level virtualization creates isolated containers on a single physical server and the OS instances to
utilize the hardware and software in data centers.
 The containers behave like real servers.
 OS-level virtualization is mainly used in creating virtual hosting environments to allocate hardware
resources among a large number of mutually distrusting users.
 It is also used in consolidating server hardware by moving services on separate hosts into containers
or VMs on one server.

Library Support Level


 Virtualization with library interfaces is possible by controlling the communication link between
applications and the rest of a system through API hooks.
 The software tool WINE has implemented this approach to support Windows applications on top of
UNIX hosts.
 Another example is the vCUDA which allows applications executing within VMs to leverage GPU
hardware acceleration.
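As a loose, purely illustrative analogy for this kind of API interception (this is not how WINE is actually implemented), the Python sketch below hooks the built-in open() call and remaps Windows-style paths onto a UNIX directory. The C:\ prefix handling and the target directory are invented for the example.

    import builtins
    import os

    os.makedirs("/tmp/fake_c_drive", exist_ok=True)
    _real_open = builtins.open

    def hooked_open(path, *args, **kwargs):
        # Intercept the API call and remap a Windows-style path to a UNIX path,
        # loosely analogous to how WINE remaps Win32 calls onto POSIX calls.
        if isinstance(path, str) and path.startswith("C:\\"):
            path = os.path.join("/tmp/fake_c_drive", path[3:].replace("\\", "/"))
        return _real_open(path, *args, **kwargs)

    builtins.open = hooked_open            # install the hook

    with open("C:\\hello.txt", "w") as f:  # the application "thinks" it is on Windows
        f.write("redirected through the API hook")

WINE performs an analogous remapping at the Win32 API boundary, and vCUDA forwards intercepted GPU library calls to the host in a similar spirit.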
User-Application Level
 Virtualization at the application level virtualizes an application as a VM.
 Is also known as process-level virtualization.
 The most popular approach is to deploy high level language (HLL) VMs.
 The virtualization layer sits as an application program on top of the operating system.
 Any program written in the HLL and compiled for this VM will be able to run on it.
 The Microsoft .NET CLR and Java Virtual Machine (JVM) are two good examples of this class of
VM.
 Other forms of application-level virtualization are known as application isolation, application
sandboxing, or application streaming.
 The process involves wrapping the application in a layer that is isolated from the host OS and other
applications.
 An example is the LANDesk application virtualization platform which deploys software applications
as self-contained, executable files in an isolated environment without requiring installation, system
modifications, or elevated security privileges.

 VMM Design Requirements and Providers

 Hardware-level virtualization inserts a layer between real hardware and traditional operating systems.
This layer is commonly called the Virtual Machine Monitor (VMM) and it manages the hardware
resources of a computing system.
 The VMM acts as a traditional OS.
 One hardware component, such as the CPU, can be virtualized as several virtual copies.
 Therefore, several traditional operating systems which are the same or different can sit on the same set
of hardware simultaneously.
 Three requirements for a VMM
1. Environment for programs which is essentially identical to the original machine.
2. Programs run in this environment should experience only minor decreases in speed.
3. VMM should be in complete control of the system resources
 Complete control of these resources by a VMM includes the following aspects:
1. The VMM is responsible for allocating hardware resources for programs
2. It is not possible for a program to access any resource not explicitly allocated to it.
3. It is possible under certain circumstances for a VMM to regain control of resources already allocated.

 Virtualization Support at the OS Level


 Cloud computing is transforming the computing landscape by shifting the hardware and staffing costs
of managing a computational center to third parties, just like banks.
 Cloud computing has at least two challenges. The first is the ability to use a variable number of
physical machines and VM instances depending on the needs of a problem. For example, a task may
need only a single CPU during some phases of execution but may need hundreds of CPUs at other
times.
 The second challenge concerns the slow operation of instantiating new VMs.
 Therefore, to better support cloud computing, a large amount of research and development should be
done.
Why OS-Level Virtualization?
 It is slow to initialize a hardware-level VM because each VM creates its own image from scratch. In a
cloud computing environment, perhaps thousands of VMs need to be initialized simultaneously
 Storing the VM images also becomes an issue.
 Moreover, full virtualization at the hardware level also has the disadvantages of slow performance and
low density, and the need for para-virtualization to modify the guest OS.
 OS-level virtualization provides a feasible solution for these hardware-level virtualization issues.
 Operating system virtualization inserts a virtualization layer inside an operating system to partition a
machine’s physical resources.
 It enables multiple isolated VMs within a single operating system kernel. This kind of VM is often
called a virtual execution environment (VE), Virtual Private System (VPS), or simply container.
 From the user’s point of view, VEs look like real servers.
 Although VEs can be customized for different people, they share the same operating system kernel.
Therefore, OS-level virtualization is also called single-OS image virtualization.
 Following figure illustrates operating system virtualization from the point of view of a machine stack.



 Advantages of OS Extensions
1. VMs at the operating system level have minimal startup/shutdown costs, low resource requirements,
and high scalability.
2. For an OS-level VM, it is possible for a VM and its host environment to synchronize state changes
when necessary.
 Can be achieved through
1. All OS-level VMs on the same physical machine share a single operating system kernel.
2. The virtualization layer can be designed in a way that allows processes in VMs to access as many
resources of the host machine as possible, but never to modify them.

 Disadvantages of OS Extensions
1. All the VMs at operating system level on a single container must have the same kind of guest
operating system.
2. To implement OS-level virtualization, isolated execution environments (VMs) should be created based
on a single OS kernel.
3. The access requests from a VM need to be redirected to the VM’s local resource partition on the
physical machine.
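Containers are the most common form of OS-level virtualization today. A minimal sketch using the Docker SDK for Python, assuming Docker and the docker package are installed; the image name is just an example:

    import docker  # Docker SDK for Python: pip install docker

    client = docker.from_env()

    # Run a throwaway container: an isolated user-space instance (a "VE"/container)
    # that shares the host's OS kernel rather than booting its own guest OS.
    output = client.containers.run("alpine", "uname -a", remove=True)
    print(output.decode())

    # List containers currently running on this single shared kernel
    for c in client.containers.list():
        print(c.name, c.status)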

Virtualization on Linux or Windows Platforms


 The Linux kernel offers an abstraction layer to allow software processes to work with and operate on
resources without knowing the hardware details.
 New hardware may need a new Linux kernel to support it.
 Therefore, different Linux platforms use patched kernels to provide special support for extended
functionality.



 Two OS tools (Linux vServer and OpenVZ) support Linux platforms to run other platform-based
applications through virtualization.
 The third tool, FVM, is an attempt specifically developed for virtualization on the Windows NT
platform.
 Middleware Support for Virtualization
 Library-level virtualization is also known as user-level Application Binary Interface (ABI) or API
emulation.
 This type of virtualization can create execution environments for running alien programs on a
platform rather than creating a VM to run the entire operating system.
 API call interception and remapping are the key functions performed.



VIRTUALIZATION STRUCTURES/TOOLS AND MECHANISMS

 Hypervisor software provides two different structures of virtualization:


1. Hosted structure (Type 2 Virtualization)
2. Bare-metal structure (Type 1 Virtualization)
1. Hosted structure (Type 2 Virtualization)
 In the hosted structure, the guest OS and its applications run on top of the base OS with the help of a VMM
(called a hypervisor).
 The VMM sits between the base OS and the guest OS.
 This approach provides better hardware compatibility because the base OS, instead of the VMM, is responsible for
providing hardware drivers to the guest OS.

2. Bare-metal structure (Type 1 Virtualization)


 In the bare-metal structure, the VMM is installed directly on the hardware; therefore, no intermediate
OS is needed.
 The VMM can communicate directly with the hardware and does not rely on a host system for pass-through
permission, which results in better performance, scalability, and stability.



 Depending on the position of the virtualization layer, there are several classes of VM architectures,
namely the hypervisor architecture, para virtualization, and host-based virtualization. The hypervisor is
also known as the VMM (Virtual Machine Monitor). They both perform the same virtualization
operations.

Hypervisor and Xen Architecture

 Hypervisor (VMM)
 A hardware virtualization technique allowing multiple guest OSes to run on a host machine.
 Provides hypercalls for the guest OSes and applications.
 Depending on its functionality, a hypervisor can assume a micro-kernel architecture or a
monolithic hypervisor architecture (like VMware ESX for server virtualization).
Types of Hypervisor
 Type 1 Hypervisor: runs on the bare metal.



 Examples: IBM CP/CMS hypervisor, Microsoft Hyper-V.
 Type 2 (Hosted) Hypervisor: runs over a host OS; the hypervisor is the second layer over the
hardware.
 Example: FreeBSD.
 The Xen Architecture
 An open source hypervisor program developed by Cambridge University, http://xen.org
 Xen hypervisor (a micro kernel)
 Commercial Xen Hypervisors: Citrix XenServer, Oracle VM

 The core components of a Xen system are the hypervisor, kernel, and applications. The organization of
the three components is important.
 Like other virtualization systems, many guest OSes can run on top of the hypervisor. However, not all
guest OSes are created equal, and one in particular controls the others.
 The guest OS, which has control ability, is called Domain 0, and the others are called Domain U.
Domain 0 is a privileged guest OS of Xen.
 It is first loaded when Xen boots without any file system drivers being available. Domain 0 is
designed to access hardware directly and manage devices.
 Therefore, one of the responsibilities of Domain 0 is to allocate and map hardware resources for the
guest domains (the Domain U domains).



VIRTUALIZATION OF CPU, MEMORY, AND I/O DEVICES

Hardware Support for Virtualization


 To enhance protection all processors have at least two modes, user mode and supervisor mode, to
ensure controlled access of critical hardware.
 Instructions running in supervisor mode are called privileged instructions. Other instructions are
unprivileged instructions.
 In a virtualized environment, it is more difficult to make OSes and applications run correctly because
there are more layers in the machine stack.
 The VMware Workstation is a VM software suite for x86 and x86-64 computers.
 This software suite allows users to set up multiple x86 and x86-64 virtual computers and to use one or
more of these VMs simultaneously with the host operating system.
 The VMware Workstation assumes the host-based virtualization.
 Xen is a hypervisor for use in IA-32, x86-64, Itanium, and PowerPC 970 hosts.
 Actually, Xen modifies Linux as the lowest and most privileged layer, i.e., it acts as a hypervisor.
 One or more guest OS can run on top of the hypervisor.
 KVM (Kernel-based Virtual Machine) is a Linux kernel virtualization infrastructure.
 KVM can support hardware-assisted virtualization and paravirtualization by using the Intel VT-x or
AMD-v and VirtIO framework, respectively.
 The VirtIO framework includes a paravirtual Ethernet card, a disk I/O controller, a balloon device for
adjusting guest memory usage, and a VGA graphics interface using VMware drivers.

 CPU VIRTUALIZATION
 A VM is a duplicate of an existing computer system in which a majority of the VM instructions are
executed on the host processor in native mode. Thus, unprivileged instructions of VMs run directly on
the host machine for higher efficiency.
 Other critical instructions should be handled carefully for correctness and stability.
 The critical instructions are divided into three categories: privileged instructions, control sensitive
instructions, and behavior-sensitive instructions.
 CPU architecture is virtualizable if it supports the ability to run the VM’s privileged and unprivileged
instructions in the CPU’s user mode while the VMM runs in supervisor mode.
 When the privileged instructions including control- and behavior-sensitive instructions of a VM are
executed, they are trapped in the VMM. In this case, the VMM acts as a unified mediator for hardware
access from different VMs to guarantee the correctness and stability of the whole system.
 RISC CPU architectures can be naturally virtualized because all control- and behavior-sensitive
instructions are privileged instructions.
 On the contrary, x86 CPU architectures are not primarily designed to support virtualization. This is
because about 10 sensitive instructions, such as SGDT and SMSW, are not privileged instructions.
These instructions cannot be trapped in the VMM.



 On a native UNIX-like system, a system call triggers the 80h interrupt and passes control to the OS
kernel.
 The interrupt handler in the kernel is then invoked to process the system call. On a paravirtualization
system such as Xen, a system call in the guest OS first triggers the 80h interrupt normally.
 Almost at the same time, the 82h interrupt in the hypervisor is triggered.
 Incidentally, control is passed on to the hypervisor as well. When the hypervisor completes its task for
the guest OS system call, it passes control back to the guest OS kernel.
 Certainly, the guest OS kernel may also invoke the hypercall while it’s running.
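The trap-and-emulate behaviour described above can be pictured with a deliberately simplified toy model in Python (purely conceptual, with made-up instruction names): unprivileged instructions run "directly", while privileged or sensitive ones trap into the VMM, which emulates them in supervisor mode.

    # Toy model of trap-and-emulate (conceptual only, not real virtualization code)
    PRIVILEGED = {"HLT", "LGDT", "OUT"}      # pretend these are sensitive instructions

    class ToyVMM:
        def emulate(self, vm, instr):
            # The VMM, running in supervisor mode, emulates the trapped instruction
            print(f"[VMM] trapped '{instr}' from {vm}; emulating it safely")

    def run_guest(vm, program, vmm):
        for instr in program:
            if instr in PRIVILEGED:
                vmm.emulate(vm, instr)        # trap into the VMM
            else:
                print(f"[{vm}] '{instr}' executed directly on the CPU in user mode")

    run_guest("guest-1", ["ADD", "MOV", "OUT", "ADD", "HLT"], ToyVMM())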
CPU Virtualization

 CPU virtualization is related to a range of protection levels, called rings, in which code can execute.
 The Intel x86 CPU architecture offers four privilege levels, known as Rings 0, 1, 2, and 3, to manage access to the computer hardware.
 Ring 0, Ring 1, and Ring 2 are associated with the OS.
 Ring 3 is reserved for applications.
 Ring 0 is used by the kernel; because of that, Ring 0 has the highest privilege level.
 Ring 3 has the lowest privilege, as it belongs to user-level applications.

 User-level applications typically run in Ring 3, while the OS needs direct access to the memory
and hardware and must execute its privileged instructions in Ring 0.



CPU Virtualization techniques:

1. Binary translation with full virtualization

2. OS Assisted Virtualization or Para virtualization

3. Hardware assisted virtualization (HVM)

1. Binary Translation with Full Virtualization


 Depending on implementation technologies, hardware virtualization can be classified into two
categories: full virtualization and host-based virtualization.
 Full virtualization does not need to modify the host OS. It relies on binary translation to trap and to
virtualize the execution of certain sensitive, non virtualizable instructions.
 Full Virtualization
 With full virtualization, noncritical instructions run on the hardware directly while critical instructions
are discovered and replaced with traps into the VMM to be emulated by software.
 Only critical instructions trapped into the VMM because binary translation can incur a large
performance overhead.
 Noncritical instructions do not control hardware or threaten the security of the system
 Therefore, running noncritical instructions on hardware not only can promote efficiency, but also can
ensure system security.

 Binary Translation of Guest OS Requests Using a VMM

 This approach was implemented by VMware and many other software companies. As shown in
following figure VMware puts the VMM at Ring 0 and the guest OS at Ring 1.
 The VMM scans the instruction stream and identifies the privileged, control- and behavior-sensitive
instructions.
 The method used in this emulation is called binary translation. Therefore, full virtualization combines
binary translation and direct execution.
 The guest OS is completely decoupled from the underlying hardware. Consequently, the guest OS is
unaware that it is being virtualized.



 Binary translation employs a code cache to store translated hot instructions to improve performance,
but it increases the cost of memory usage.
 At the time of this writing, the performance of full virtualization on the x86 architecture is typically 80
percent to 97 percent that of the host machine.

 Host-Based Virtualization
 An alternative VM architecture is to install a virtualization layer on top of the host OS. This host OS is
still responsible for managing the hardware. The guest OSes are installed and run on top of the
virtualization layer.
 This host based architecture has some distinct advantages, as enumerated next. First, the user can
install this VM architecture without modifying the host OS.
 The virtualizing software can rely on the host OS to provide device drivers and other low-level
services.
 This will simplify the VM design and ease its deployment.
 Second, the host-based approach appeals to many host machine configurations.
 Compared to the hypervisor/VMM architecture, the performance of the host-based architecture may
also be low.
2. Para-Virtualization with Compiler Support

 A para-virtualized VM provides special APIs requiring substantial OS modifications in user


applications.
 Performance degradation is a critical issue of a virtualized system.
 The virtualization layer can be inserted at different positions in a machine software stack.
 Para-virtualization attempts to reduce the virtualization overhead, and thus improve performance by
modifying only the guest OS kernel.
 Following figure illustrates the concept of a para-virtualized VM architecture.



 The guest operating systems are para-virtualized.
 They are assisted by an intelligent compiler to replace the non virtualizable OS instructions by
hypercalls as illustrated in following figure.

 The traditional x86 processor offers four instruction execution rings: Rings 0, 1, 2, and 3.
 The lower the ring number, the higher the privilege of instruction being executed.
 The OS is responsible for managing the hardware and the privileged instructions to execute at Ring 0,
while user-level applications run at Ring 3.
 The best example of para-virtualization is the KVM to be described below.
 Para-Virtualization Architecture

 When the x86 processor is virtualized, a virtualization layer is inserted between the hardware and the
OS.
 According to the x86 ring definitions, the virtualization layer should also be installed at Ring 0.
Different instructions at Ring 0 may cause some problems.
 Para-virtualization replaces non virtualizable instructions with hypercalls that communicate directly
with the hypervisor or VMM.
 However, when the guest OS kernel is modified for virtualization, it can no longer run on the hardware
directly.
 The problems are compatibility and portability, the high cost of maintaining para-virtualized OSes, and the
fact that the performance advantage of para-virtualization varies greatly due to workload variations.



 Compared with full virtualization, para-virtualization is relatively easy and more practical. Therefore,
many virtualization products employ the para-virtualization architecture.
 Xen, KVM, and VMware ESX are good examples.
 KVM (Kernel-Based VM)

 In KVM, memory management and scheduling are carried out by the existing Linux kernel; KVM does the rest, which makes it simpler than a hypervisor that controls the entire machine.
 KVM is a hardware-assisted para-virtualization tool, which improves performance and supports
unmodified guest OSes such as Windows, Linux, Solaris, and other UNIX variants.

 Para-Virtualization with Compiler Support

 Para-virtualization handles these instructions at compile time.


 The guest OS kernel is modified to replace the privileged and sensitive instructions with hypercalls to
the hypervisor or VMM.
 The guest OS running in a guest domain may run at Ring 1 instead of at Ring 0.
 This implies that the guest OS may not be able to execute some privileged and sensitive instructions.
 The privileged instructions are implemented by hypercalls to the hypervisor. After replacing the
instructions with hypercalls, the modified guest OS emulates the behavior of the original guest OS.
 On an UNIX system, a system call involves an interrupt or service routine.
 The hypercalls apply a dedicated service routine in Xen.
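A toy Python model of the same idea (conceptual only; the class and call names are invented): the para-virtualized guest kernel has had its privileged operations replaced, at compile time, with explicit hypercalls to the hypervisor.

    # Conceptual sketch of para-virtualization: privileged work goes via hypercalls
    class Hypervisor:
        def hypercall(self, name, **args):
            print(f"[hypervisor] servicing hypercall '{name}' with {args}")
            return 0

    class ParaVirtGuestKernel:
        def __init__(self, hv):
            self.hv = hv

        def load_page_table(self, base_addr):
            # Instead of writing the page-table register directly (a privileged
            # instruction), the modified guest kernel asks the hypervisor to do it.
            return self.hv.hypercall("mmu_update", base=hex(base_addr))

    ParaVirtGuestKernel(Hypervisor()).load_page_table(0x1000)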
3. Hardware Assisted Virtualization (HVM)



Hardware-Assisted CPU Virtualization
 This technique attempts to simplify virtualization because full or paravirtualization is complicated.
 Intel and AMD add an additional mode called privilege mode level (some people call it Ring -1) to x86
processors.
 Therefore, operating systems can still run at Ring 0 and the hypervisor can run at Ring -1.
 This technique removes the difficulty of implementing binary translation of full virtualization.
 It also lets the operating system run in VMs without modification.
 Should have high efficiency.
 Para-virtualization and hardware-assisted virtualization can be combined to improve the performance
further.

2. Memory Virtualization
 Virtual memory virtualization is similar to the virtual memory support provided by modern operating
systems.
 All modern x86 CPUs include a memory management unit (MMU) and a translation look aside buffer
(TLB) to optimize virtual memory performance. However, in a virtual execution environment, virtual
memory virtualization involves sharing the physical system memory in RAM and dynamically
allocating it to the physical memory of the VMs.
 That means a two-stage mapping process should be maintained by the guest OS and the VMM,
respectively: virtual memory to physical memory and physical memory to machine memory.
 Furthermore, MMU virtualization should be supported, which is transparent to the guest OS.
 The guest OS continues to control the mapping of virtual addresses to the physical memory addresses
of VMs. But the guest OS cannot directly access the actual machine memory.
 The VMM is responsible for mapping the guest physical memory to the actual machine memory.
 Following figure shows the two-level memory mapping procedure.

 Each page table of the guest OSes has a separate page table in the VMM corresponding to it, the VMM
page table is called the shadow page table.



 The MMU already handles virtual-to-physical translations as defined by the OS. Then the physical
memory addresses are translated to machine addresses using another set of page tables defined by the
hypervisor.
 Since modern operating systems maintain a set of page tables for every process, the shadow page tables will
get flooded. Hence, the performance overhead and cost of memory will be very high.
 VMware uses shadow page tables to perform virtual-memory-to-machine-memory address translation.
 Processors use TLB hardware to map the virtual memory directly to the machine memory to avoid the
two levels of translation on every access.
 When the guest OS changes the virtual-memory-to-physical-memory mapping, the VMM updates the
shadow page tables to enable a direct lookup.
 The AMD Barcelona processor has featured hardware-assisted memory virtualization since 2007.
 It provides hardware assistance to the two-stage address translation in a virtual execution environment
by using a technology called nested paging.
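The two-stage mapping and the shadow page table can be illustrated with a tiny Python sketch (the addresses and table sizes are invented): the guest page table maps virtual to guest "physical" addresses, the VMM maps guest physical to machine addresses, and their composition is what the shadow page table caches.

    # Stage 1 (maintained by the guest OS): guest virtual -> guest "physical"
    guest_page_table = {0x0000: 0x1000, 0x1000: 0x2000}

    # Stage 2 (maintained by the VMM): guest "physical" -> actual machine address
    vmm_p2m_table = {0x1000: 0x9000, 0x2000: 0xA000}

    # Shadow page table: the composed mapping, guest virtual -> machine,
    # so the MMU can translate in a single lookup
    shadow_page_table = {va: vmm_p2m_table[pa] for va, pa in guest_page_table.items()}

    print(hex(shadow_page_table[0x1000]))   # prints 0xa000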
3. I/O Virtualization
 I/O virtualization involves managing the routing of I/O requests between virtual devices and the shared
physical hardware.

 There are three ways to implement I/O virtualization: full device emulation, para-virtualization, and
direct I/O.
 Full device emulation is the first approach for I/O virtualization. Generally, this approach emulates
well-known, real-world devices.



 The para-virtualization method of I/O virtualization is typically used in Xen. It is also known as the
split driver model consisting of a frontend driver and a backend driver.

 The frontend driver is running in Domain U and the backend driver is running in Domain 0. They
interact with each other via a block of shared memory.

 The frontend driver manages the I/O requests of the guest OSes and the backend driver is
responsible for managing the real I/O devices and multiplexing the I/O data of different VMs.

 Although para-I/O-virtualization achieves better device performance than full device emulation, it
comes with a higher CPU overhead.
 Direct I/O virtualization lets the VM access devices directly. It can achieve close-to-native
performance without high CPU costs.
 However, current direct I/O virtualization implementations focus on networking for mainframes.
There are a lot of challenges for commodity hardware devices.
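The split driver model can be pictured with a small Python analogy (conceptual only): the frontend, standing in for a Domain U driver, posts requests onto a shared ring, and the backend, standing in for the Domain 0 driver, multiplexes them onto the "real" device.

    from queue import Queue
    from threading import Thread

    shared_ring = Queue()    # stands in for the shared-memory ring between domains

    def frontend(vm_id, n_requests):
        # Domain U side: forwards guest I/O requests instead of touching hardware
        for i in range(n_requests):
            shared_ring.put((vm_id, f"read block {i}"))
        shared_ring.put((vm_id, None))       # end-of-stream marker

    def backend():
        # Domain 0 side: services requests from all guests on the real device
        while True:
            vm_id, req = shared_ring.get()
            if req is None:
                break
            print(f"[backend] servicing '{req}' for {vm_id} on the physical disk")

    worker = Thread(target=backend)
    worker.start()
    frontend("domU-1", 3)
    worker.join()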



Module 1 MANAGING CLOUD USING AWS
Introduction, Future of AWS, Services - AWS EC2, AWS S3 - Cloud storage, Types, Benefits. AWS
IAM - AWS Security, Working of IAM, Components. AWS CloudFront - Working, Benefits. Introduction,
Snapshots vs AMI, Different scaling plans. Introduction, Benefits, Algorithms used for load balancing.

Introduction:
 Amazon Web Services (AWS) is a comprehensive and widely-used cloud computing platform provided
by Amazon.com.
 It offers a vast array of on-demand cloud services that enable businesses and individuals to build,
deploy, and manage various applications and services in a highly scalable and cost-effective manner.
 AWS provides a wide range of services across numerous categories, including computing power,
storage, databases, networking, machine learning, analytics, security, and more. These services are
designed to meet the requirements of different use cases, from small startups to large enterprises.

Some key AWS services include:


1. Elastic Compute Cloud (EC2): It offers virtual servers in the cloud, allowing users to run applications
and services with scalable compute capacity.
2. Simple Storage Service (S3): It provides scalable object storage for storing and retrieving data. S3
offers durability, availability, and scalability for a wide range of applications.
3. Relational Database Service (RDS): It offers managed database services for popular database engines
like MySQL, PostgreSQL, Oracle, and Microsoft SQL Server, simplifying database administration
tasks.
4. Lambda: It is a serverless computing service that allows you to run your code without provisioning or
managing servers. You pay only for the compute time consumed by your code.
5. Amazon DynamoDB: It is a fully-managed NoSQL database service that delivers single-digit
millisecond latency at any scale. It is designed to provide fast and predictable performance for
applications that require low-latency data access.
6. Virtual Private Cloud (VPC): It enables you to provision a logically isolated section of the AWS
Cloud where you can launch AWS resources in a virtual network that you define.
7. Amazon S3 Glacier: It is a secure and durable cloud storage service designed for long-term backup
and archiving of data. It offers low-cost storage with retrieval options optimized for infrequent access.
8. Amazon CloudFront: It is a content delivery network (CDN) service that delivers data, videos,
applications, and APIs to users globally with low latency and high transfer speeds.
9. Amazon Simple Notification Service (SNS): It is a fully-managed messaging service that enables you
to send notifications from the cloud to various endpoints, including email, SMS, mobile push
notifications, and more.

10. Amazon Elastic Beanstalk: It simplifies the deployment and management of applications by providing
an easy-to-use platform for deploying and scaling web applications developed in various languages.



Future of AWS
The future of Amazon Web Services (AWS) looks promising, as it continues to be a dominant player in
the cloud computing industry. Here are a few key aspects that are likely to shape the future of AWS:

1. Continued Growth and Innovation: AWS has been consistently expanding its portfolio of
services and features, catering to a wide range of customer needs. As technology advances,
AWS is expected to introduce more innovative solutions, especially in areas like artificial
intelligence (AI), machine learning (ML), Internet of Things (IoT), and serverless computing.
The company's focus on research and development, coupled with its large customer base,
positions it to maintain its leadership in the market.

2. Hybrid and Multi-Cloud Solutions: Many organizations are adopting a hybrid cloud strategy,
combining on-premises infrastructure with public cloud services. AWS recognizes this trend
and offers solutions like AWS Outposts, which brings AWS infrastructure and services to
on-premises data centers. Additionally, AWS has partnerships with other major cloud providers,
enabling customers to deploy applications seamlessly across multiple clouds. Expect AWS to
continue expanding its hybrid and multi-cloud offerings to cater to evolving customer
requirements.

3. Edge Computing and IoT: As the number of connected devices increases, there is a growing
demand for processing data at the edge to reduce latency and improve real-time
decision-making. AWS has already made strides in this area with services like AWS Greengrass and
AWS IoT Core. In the future, AWS is likely to enhance its edge computing capabilities,
enabling organizations to deploy and manage applications at the edge efficiently.

4. Advanced Analytics and AI/ML: Data analytics, AI, and ML are transforming industries
across the board. AWS has a robust suite of analytics and AI/ML services, including Amazon
Redshift, Amazon Athena, Amazon SageMaker, and Amazon Rekognition. AWS is expected to
invest further in these areas, enabling customers to extract valuable insights from their data and
build sophisticated AI/ML models.

5. Focus on Security and Compliance: Security and compliance continue to be critical


considerations for organizations moving to the cloud. AWS has a strong security posture and
offers a wide range of security services, such as AWS Identity and Access Management (IAM),
AWS Key Management Service (KMS), and AWS Security Hub. As security threats evolve,
AWS will likely continue to enhance its security services and provide customers with robust
tools to protect their applications and data.

6. Sustainability and Environmental Initiatives: Environmental sustainability has become an


important focus for many organizations, and AWS is committed to reducing its carbon
footprint. AWS has pledged to power its global infrastructure with 100% renewable energy and
has introduced initiatives like the AWS Sustainability Data Initiative. In the future, expect AWS
to continue investing in renewable energy projects and sustainable practices to minimize its
environmental impact.



Services:
Amazon Web Services (AWS) offers a wide range of cloud services across various categories. Here are
some of the key services provided by AWS:

1. Computing Services:
 Amazon Elastic Compute Cloud (EC2): Virtual servers in the cloud for running applications.
 AWS Lambda: Serverless compute service for running code without provisioning or managing
servers.
 AWS Batch: Fully managed batch processing at any scale.
2. Storage Services:
 Amazon Simple Storage Service (S3): Scalable object storage for storing and retrieving data.
 Amazon Elastic Block Store (EBS): Persistent block-level storage volumes for EC2 instances.
 Amazon Glacier: Low-cost storage service for archiving and long-term backup.
3. Database Services:
 Amazon Relational Database Service (RDS): Managed database service for popular relational
databases.
 Amazon DynamoDB: Fully managed NoSQL database service.
 Amazon Aurora: MySQL and PostgreSQL-compatible relational database with high
performance and availability.

4. Networking Services:
 Amazon Virtual Private Cloud (VPC): Isolated virtual network to launch AWS resources.
 AWS Direct Connect: Dedicated network connection between on-premises infrastructure and
AWS.
 Amazon Route 53: Scalable domain name system (DNS) web service.

5. Analytics Services:
 Amazon Redshift: Fast, fully managed data warehousing service.
 Amazon Athena: Serverless query service for analyzing data in Amazon S3.
 Amazon Kinesis: Real-time streaming data processing service.

6. Machine Learning Services:


 Amazon SageMaker: Fully managed service for building, training, and deploying machine
learning models.
 Amazon Rekognition: Deep learning-based image and video analysis service.
 Amazon Comprehend: Natural language processing (NLP) service for extracting insights from
text.
7. Security Services:
 AWS Identity and Access Management (IAM): Securely manage user access to AWS resources.
 AWS Key Management Service (KMS): Managed service for creating and controlling
encryption keys.
 Amazon GuardDuty: Intelligent threat detection service.
8. Developer Tools:
 AWS CloudFormation: Infrastructure as Code service for automating resource provisioning.
 AWS CodeCommit: Fully managed source control service.
 AWS CodePipeline: Continuous integration and delivery service.



9. Management Tools:
 AWS CloudWatch: Monitor resources and applications, collect and track metrics, and set
alarms.
 AWS CloudTrail: Record AWS API calls for auditing and compliance purposes.
 AWS Systems Manager: Centralized management for AWS resources.

10. Internet of Things (IoT) Services:

 AWS IoT Core: Managed cloud platform for securely connecting and managing IoT devices.
 AWS IoT Analytics: Analytics service for IoT devices and data.

Amazon EC2
 Amazon Elastic Compute Cloud (EC2) is a core service offered by Amazon Web Services (AWS) that
provides resizable compute capacity in the cloud. It enables users to easily launch and manage virtual
servers, known as EC2 instances, to run their applications.

Key features and capabilities of EC2 include:

1. Virtual Servers: EC2 allows users to create virtual servers in the cloud, known as EC2
instances. Users can choose from a variety of instance types that vary in terms of computing
power, memory, storage, and networking capacity, allowing them to select the most suitable
instance for their workload.

2. Scalability: EC2 provides auto-scaling capabilities, allowing users to automatically adjust the
number of instances based on the workload demands. This enables applications to handle
fluctuations in traffic and ensures optimal performance and cost efficiency.

3. Flexible Pricing Options: EC2 offers various pricing models to match different usage patterns
and requirements. Users can choose from On-Demand Instances (pay-as-you-go), Reserved
Instances (upfront commitment for discounted pricing), and Spot Instances (bid-based pricing
for unused capacity). This flexibility allows users to optimize costs based on their specific
needs.

4. Multiple Operating Systems: EC2 supports a wide range of operating systems, including
popular Linux distributions, Windows Server, and other specialized operating systems. This
allows users to run their applications on the preferred operating system.

5. Security and Networking: EC2 provides robust security features, including virtual private
cloud (VPC) integration, security groups, network access control lists (ACLs), and the ability to
configure firewall settings. Users have control over network connectivity and can configure
private subnets, define access rules, and establish VPN connections.

6. Storage Options: EC2 offers various storage options to meet different application
requirements. Users can attach Elastic Block Store (EBS) volumes as persistent block-level
storage to their instances. Additionally, users can leverage Amazon S3 for object storage,
Amazon Elastic File System (EFS) for scalable file storage, and instance store volumes for
temporary storage.

7. Integration with Other AWS Services: EC2 integrates seamlessly with other AWS services,
enabling users to leverage the full capabilities of the AWS ecosystem. This includes services
like Amazon RDS for managed databases, Amazon S3 for object storage, AWS Lambda for
serverless computing, and more.

8. Monitoring and Management: EC2 provides monitoring and management tools to help users
monitor the health, performance, and utilization of their instances. Users can utilize Amazon
CloudWatch to collect and analyze metrics, set up alarms, and automate actions based on
predefined rules.

EC2 is widely used by organizations of all sizes, from startups to enterprises, to deploy a wide range of
applications, including web servers, databases, data processing, and machine learning workloads. Its
scalability, flexibility, and integration with other AWS services make it a popular choice for running
applications in the cloud.

EBS:
 EBS stands for Elastic Block Store.
 EC2 is a virtual server in a cloud while EBS is a virtual disk in a cloud.
 Amazon EBS allows you to create storage volumes and attach them to the EC2 instances.
 Once the storage volume is created, you can create a file system on the top of these volumes, and then
you can run a database, store the files, applications or you can even use them as a block device in some
other way.
 Amazon EBS volumes are placed in a specific availability zone, and they are automatically replicated
to protect you from the failure of a single component.
 An EBS volume does not exist on one disk; it is spread across the Availability Zone. An EBS volume is a disk
that is attached to an EC2 instance.
 The EBS volume attached to the EC2 instance on which Windows or Linux is installed is known as the root
device volume.
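For illustration, creating and attaching an EBS volume can also be done programmatically with boto3. This is a minimal sketch in which the region, Availability Zone, and instance ID are placeholders, and AWS credentials are assumed to be configured already:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Create a 10 GiB volume in the same Availability Zone as the target instance
    vol = ec2.create_volume(AvailabilityZone="us-east-1a", Size=10, VolumeType="gp3")
    ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])

    # Attach it to an existing EC2 instance (the instance ID is a placeholder)
    ec2.attach_volume(VolumeId=vol["VolumeId"],
                      InstanceId="i-0123456789abcdef0",
                      Device="/dev/sdf")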

SSD:
 SSD stands for solid-state Drives.
 In June 2014, SSD storage was introduced.
 It is a general purpose storage.
 It supports up to 4,000 IOPS, which is quite high.
 SSD storage is very high performing, but it is quite expensive as compared to HDD (Hard Disk Drive)
storage.
 SSD volume types are optimized for transactional workloads such as frequent read/write operations
with small I/O size, where the performance attribute is IOPS.



HDD:
 It stands for Hard Disk Drive.
 HDD based storage was introduced in 2008.
 The size of HDD-based storage could range from 1 GB to 1 TB.
 It can support up to 100 IOPS, which is very low.

Creating an EC2 instance


1. Sign in to the AWS Management Console.
2. Click on the EC2 service.
3. Click on the Launch Instance button to create a new instance.

 Now, we have different Amazon Machine Images (AMIs). These are pre-configured templates (snapshots) of virtual machines. We
will be using Amazon Linux AMI 2018.03.0 (HVM), as it has built-in tools such as Java, Python, Ruby, Perl,
and especially the AWS command line tools.



 Choose an Instance Type, and then click Next. Suppose we choose t2.micro as the instance type.

The main EC2 setup page, where we define the instance configuration, then appears.
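The same launch can be scripted with the AWS SDK for Python (boto3). A minimal sketch; the AMI ID, key pair name, and region are placeholders, and AWS credentials are assumed to be configured already:

    import boto3

    ec2 = boto3.resource("ec2", region_name="us-east-1")

    instances = ec2.create_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder: use an Amazon Linux AMI ID
        InstanceType="t2.micro",
        KeyName="my-key-pair",             # placeholder key pair name
        MinCount=1,
        MaxCount=1,
    )
    print("Launched instance:", instances[0].id)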

Amazon S3
Amazon S3:
 Amazon Simple Storage Service (S3) is a highly scalable and durable cloud storage service
provided by Amazon Web Services (AWS). S3 allows users to store and retrieve any amount of
data from anywhere on the web, making it a popular choice for a wide range of use cases.
 S3 is a safe place to store the files.
 It is Object-based storage, i.e., you can store the images, word files, pdf files, etc.
 The files which are stored in S3 can be from 0 Bytes to 5 TB.
 It has unlimited storage means that you can store the data as much you want.
 Files are stored in Bucket. A bucket is like a folder available in S3 that stores the files.
 S3 is a universal namespace, i.e., bucket names must be unique globally. A bucket name forms part of a DNS
address. Therefore, the bucket must have a unique name to generate a unique DNS address.
 If you create a bucket, the URL looks like https://<bucket-name>.s3.amazonaws.com (for example,
https://treeimage.s3.amazonaws.com).

Here are some key features and capabilities of AWS S3:


1. Object Storage: S3 is designed to store and retrieve objects, which can be files, images, videos,
documents, or any other type of data. Each object is identified by a unique key and can be up to 5
terabytes in size.

2. Scalability and Durability: S3 is built to scale and provides high durability for stored data. It
automatically replicates data across multiple devices and multiple geographically distributed data
centers, ensuring durability and availability.

3. Storage Classes: S3 offers multiple storage classes to meet different requirements and optimize
costs. These include:
Standard: The default storage class with high durability, availability, and low latency access.
Intelligent-Tiering: Automatically moves objects between frequent and infrequent access tiers
based on usage patterns.
Glacier: Suitable for long-term archival storage with lower costs but longer retrieval times.
Glacier Deep Archive: Designed for archival storage with the lowest cost and longer retrieval
times.
4. Data Transfer and Access Control: S3 provides secure data transfer over HTTPS and allows
granular access control. Access permissions can be managed using AWS Identity and Access
Management (IAM), bucket policies, and Access Control Lists (ACLs).

5. Versioning and Lifecycle Policies: S3 supports versioning, allowing users to preserve, retrieve,
and restore previous versions of objects. Lifecycle policies enable automated transitions of objects
between storage classes based on predefined rules, helping optimize costs.

6. Data Management and Analytics: S3 integrates with various AWS services for data management
and analytics purposes. This includes Amazon Athena for ad hoc querying of data using standard
SQL, Amazon Redshift for data warehousing, and Amazon Macie for data discovery and security.

7. Event Notifications and Triggers: S3 supports event notifications and triggers through Amazon
Simple Notification Service (SNS) and AWS Lambda. This enables users to respond to changes in
their S3 buckets, such as new object uploads or deletions, by triggering actions or workflows.



8. Cross-Region Replication: S3 provides the option to replicate data automatically across different
AWS regions, enhancing data protection, disaster recovery, and reducing latency for globally
distributed applications.

9. Security and Compliance: S3 incorporates robust security features, including encryption at rest
and in transit, access control mechanisms, and integration with AWS Identity and Access
Management (IAM). It also supports compliance standards and regulations such as HIPAA, GDPR,
and PCI DSS.
AWS S3 is widely used for a variety of use cases, including backup and restore, content storage and
distribution, data lakes and analytics, application hosting, and media hosting. Its scalability, durability,
cost-effectiveness, and rich set of features make it a fundamental component of many cloud-based
applications and architectures.
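As a small illustration of the lifecycle policies mentioned in item 5, the boto3 sketch below transitions objects under a prefix to Glacier after 90 days and expires them after a year; the bucket name is a placeholder and credentials are assumed to be configured:

    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_lifecycle_configuration(
        Bucket="my-example-bucket",          # placeholder bucket name
        LifecycleConfiguration={
            "Rules": [{
                "ID": "archive-logs",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365},
            }]
        },
    )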

Advantages of S3:

Amazon S3 Concepts:

1. Buckets
 A bucket is a container used for storing the objects.
 Every object is incorporated in a bucket.
 For example, if the object named photos/tree.jpg is stored in the treeimage bucket, then it can be
addressed by using the URL http://treeimage.s3.amazonaws.com/photos/tree.jpg.
 A bucket has no limit on the number of objects that it can store. No bucket can exist inside another
bucket.
 S3 performance remains the same regardless of how many buckets have been created.
 The AWS user that creates a bucket owns it, and no other AWS user can own it. Therefore, we can
say that the ownership of a bucket is not transferable.
 The AWS account that creates a bucket can delete a bucket, but no other AWS user can delete the
bucket.
2. Objects
 Objects are the entities which are stored in an S3 bucket.
 An object consists of object data and metadata where metadata is a set of name-value pair that
describes the data.
 An object consists of some default metadata such as date last modified, and standard HTTP metadata,
such as Content type. Custom metadata can also be specified at the time of storing an object.
 It is uniquely identified within a bucket by key and version ID.
3. Key
 A key is a unique identifier for an object.
 Every object in a bucket is associated with one key.
 An object can be uniquely identified by using a combination of bucket name, the key, and optionally
version ID.
 For example, in the URL http://jtp.s3.amazonaws.com/2019-01-31/Amazons3.wsdl where "jtp" is the
bucket name, and key is "2019-01-31/Amazons3.wsdl"
4. Regions
 You can choose a geographical region in which you want to store the buckets that you have created.
 A region is chosen in such a way that it optimizes the latency, minimize costs or address regulatory
requirements.
 Objects will not leave the region unless you explicitly transfer the objects to another region.
5. Data Consistency Model
 Amazon S3 replicates the data to multiple servers to achieve high availability.
 Two types of model:
1. Read-after-write consistency for PUTS of new objects.
 For a PUT request, S3 stores the data across multiple servers to achieve high availability.
 A process stores an object to S3 and will be immediately available to read the object.
 A process stores a new object to S3, it will immediately list the keys within the bucket.
 It does not take time for propagation, the changes are reflected immediately.
2. Eventual consistency for overwrite PUTS and DELETES
 For PUTS that overwrite existing objects and for DELETES, the changes are reflected eventually; they are not
available immediately.
 If a process replaces an existing object with a new object and you try to read it immediately,
S3 might return the prior data until the change is fully propagated.
 If a process deletes an existing object and you immediately try to read it, S3 might return
the deleted data until the change is fully propagated.
 If a process deletes an existing object and you immediately list all the keys within the bucket,
S3 might still include the deleted key in the list until the change is fully propagated.



Creating an S3 Bucket
 Sign in to the AWS Management Console. After signing in, the console home screen appears.



 To create an S3 bucket, click on "Create bucket". On clicking the "Create bucket" button,
the bucket creation screen appears.

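The same bucket can also be created and used from the command line. Below is a minimal sketch using the AWS CLI, assuming the CLI is installed and configured with valid credentials; the bucket name my-demo-bucket-2024 and the file photos/tree.jpg are placeholders (bucket names must be globally unique).

# Create a bucket (for regions other than us-east-1, also pass
# --create-bucket-configuration LocationConstraint=<region>)
aws s3api create-bucket --bucket my-demo-bucket-2024 --region us-east-1

# Upload a local file as an object under the key photos/tree.jpg
aws s3 cp photos/tree.jpg s3://my-demo-bucket-2024/photos/tree.jpg

# List the objects stored under the photos/ prefix
aws s3 ls s3://my-demo-bucket-2024/photos/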


Cloud Storage

Cloud Storage:
 Cloud storage is a cloud computing model that stores data on the Internet
through a cloud computing provider who manages and operates data storage
as a service.
 It’s delivered on demand with just-in-time capacity and costs, and eliminates
buying and managing your own data storage infrastructure. This gives you
agility, global scale and durability, with “anytime, anywhere” data access.

How Does Cloud Storage Work?

 Cloud storage is purchased from a third party cloud vendor who owns and operates
data storage capacity and delivers it over the Internet in a pay-as-you-go model.

 These cloud storage vendors manage capacity, security and durability to make data
accessible to your applications all around the world.
 Applications access cloud storage through traditional storage protocols or directly via
an API. Many vendors offer complementary services designed to help collect,
manage, secure and analyze data at massive scale.

Different Types of Cloud Storage?


There are primarily three types of cloud storage solutions:

1. Public Cloud Storage: Suitable for unstructured data, public cloud storage is offered by
third-party cloud storage providers over the open Internet. They may be available for free or
on a paid basis. Users are usually required to pay for only what they use.

2. Private Cloud Storage


A private cloud allows organizations to store data in their environment. The infrastructure is
hosted on-premises. It offers many benefits that come with a public cloud service such as
self-service and scalability; the dedicated in-house resources increase the scope for
customization and control. Internal hosting and company firewalls also make this the more
secure option.
3. Hybrid Cloud Storage

As the name suggests, hybrid cloud allows data and applications to be shared between a
public and a private cloud. Businesses that keep sensitive workloads in the private, on-premises
environment can seamlessly scale up to the public cloud to handle any short-term spikes or overflow in demand.

Cloud storage is also offered in three formats, depending on how applications access the data:

1. Object Storage - Applications developed in the cloud often take advantage of object
storage’s vast scalability and metadata characteristics. Object storage solutions like
Amazon Simple Storage Service (S3) are ideal for building modern applications from
scratch that require scale and flexibility, and can also be used to import existing data stores
for analytics, backup, or archive.
2. File Storage - Some applications need to access shared files and require a file system. This
type of storage is often supported with a Network Attached Storage (NAS) server. File
storage solutions like Amazon Elastic File System (EFS) are ideal for use cases like large
content repositories, development environments, media stores, or user home directories.
3. Block Storage - Other enterprise applications like databases or ERP systems often require
dedicated, low-latency storage for each host. This is analogous to direct-attached storage
(DAS) or a Storage Area Network (SAN). Block-based cloud storage solutions like
Amazon Elastic Block Store (EBS) are provisioned with each virtual server and offer the
ultra low latency required for high performance workloads.

Benefits of Cloud Storage


Storing data in the cloud lets IT departments transform three areas:
1. Total Cost of Ownership. With cloud storage, there is no hardware to purchase, storage to
provision, or capital being used for "someday" scenarios. You can add or remove capacity
on demand, quickly change performance and retention characteristics, and only pay for
storage that you actually use. Less frequently accessed data can even be automatically
moved to lower cost tiers in accordance with auditable rules, driving economies of scale.
2. Time to Deployment. When development teams are ready to execute, infrastructure
should never slow them down. Cloud storage allows IT to quickly deliver the exact
amount of storage needed, right when it's needed. This allows IT to focus on solving
complex application problems instead of having to manage storage systems.
3. Information Management. Centralizing storage in the cloud creates a tremendous
leverage point for new use cases. By using cloud storage lifecycle management policies,
you can perform powerful information management tasks including automated tiering or
locking down data in support of compliance requirements.

Cloud Storage Requirements


Ensuring your company's critical data is safe, secure, and available when needed is essential.
There are several fundamental requirements when considering storing data in the cloud.
1. Durability. Data should be redundantly stored, ideally across multiple facilities and
multiple devices in each facility. Natural disasters, human error, or mechanical faults
should not result in data loss.
2. Availability. All data should be available when needed, but there is a difference
between production data and archives. The ideal cloud storage will deliver the right
balance of retrieval times and cost.
3. Security. All data is ideally encrypted, both at rest and in transit. Permissions and
access controls should work just as well in the cloud as they do for on premises
storage.
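
As an illustration of the durability and security requirements above, the following AWS CLI sketch enables versioning and default server-side encryption on a bucket. The bucket name is a placeholder and the CLI is assumed to be configured; this is one possible configuration, not the only one.

# Keep multiple versions of every object, protecting against accidental overwrites and deletes
aws s3api put-bucket-versioning --bucket my-demo-bucket-2024 --versioning-configuration Status=Enabled

# Encrypt all new objects at rest with S3-managed keys (SSE-S3)
aws s3api put-bucket-encryption --bucket my-demo-bucket-2024 --server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'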

Five Ways to Use Cloud Storage

1. Backup and Recovery


 Backup and recovery is a critical part of ensuring data is protected and accessible, but
keeping up with increasing capacity requirements can be a constant challenge.
 Cloud storage brings low cost, high durability, and extreme scale to backup and
recovery solutions.
 Embedded data management policies like Amazon S3 Object Lifecycle Management
can automatically migrate data to lower-cost tiers based on frequency or timing
settings, and archival vaults can be created to help comply with legal or regulatory



requirements.
 These benefits allow for tremendous scale possibilities within industries such as
financial services, healthcare, and media that produce high volumes of data with long-
term retention needs.
2. Software Test and Development
 Software test and development environments often require separate, independent,
and duplicate storage environments to be built out, managed, and decommissioned.
 In addition to the time required, the up-front capital costs required can be extensive.
 Some of the largest and most valuable companies in the world have created
applications in record time by leveraging the flexibility, performance, and low cost of
cloud storage.
 Even the simplest static websites can be improved for an amazingly low cost.
Developers all over the world are turning to pay-as-you go storage options that
remove management and scale headaches.

3. Cloud Data Migration

 The availability, durability, and cost benefits of cloud storage can be very compelling
to business owners, but traditional IT functional owners like storage, backup,
networking, security, and compliance administrators may have concerns around the
realities of transferring large amounts of data to the cloud.
 Cloud data migration services such as AWS Import/Export Snowball can
simplify migrating storage into the cloud by addressing high network costs, long
transfer times, and security concerns.
4. Compliance
 Storing data in the cloud can raise concerns about regulation and compliance, especially
if this data is already stored in compliant storage systems.
 Cloud data compliance controls like Amazon Glacier Vault Lock are designed to ensure that
you can easily deploy and enforce compliance controls on individual data vaults via a lockable policy.
 You can specify controls such as Write Once Read Many (WORM) to lock the data
from future edits.
 Using audit log products like AWS CloudTrail can help you ensure compliance and
governance objectives for your cloud-based storage and archival systems are being
met.

5. Big Data and Data Lakes


 Traditional on-premises storage solutions can be inconsistent in their cost,
performance, and scalability — especially over time. Big data projects demand large-
scale, affordable, highly available, and secure storage pools that are commonly
referred to as data lakes.
 Data lakes built on object storage keep information in its native form, and include rich
metadata that allows selective extraction and use for analysis.
 Cloud-based data lakes can sit at the center of all kinds of data warehousing, processing,
big data and analytical engines, such as Amazon Redshift, Amazon RDS, Amazon
EMR and Amazon DynamoDB to help you accomplish your next project in less time
with more relevance.
AWS IAM

AWS IAM:
 AWS Identity and Access Management (IAM) is a service provided by Amazon Web Services
(AWS) that enables users to manage access and permissions to AWS resources. IAM allows
organizations to create and control multiple users, groups, and roles, and define fine-grained
permissions to access various AWS services and resources.

Here are some key features and concepts of AWS IAM:

 Users: IAM allows you to create individual users within your AWS account. Each user is assigned
a unique access key and secret access key that they can use to interact with AWS programmatically
or through the AWS Command Line Interface (CLI).

 Groups: IAM groups are collections of IAM users. By assigning permissions to groups, you can
manage permissions for multiple users collectively, simplifying access management.

 Roles: IAM roles are used to grant temporary access to AWS resources to entities like IAM users,
applications, or AWS services. Roles can be assumed by entities and inherit permissions associated
with the role. This approach improves security by reducing the need for long-term credentials.

 Policies: IAM policies are JSON documents that define permissions. They are attached to IAM
users, groups, and roles to specify what actions they can perform on which resources. Policies can
be custom-defined or leverage AWS-managed policies that cover common use cases.

 Access Control: IAM provides fine-grained access control through policies. Policies allow you to
specify the actions that are allowed or denied, the resources on which the actions can be performed,
and the conditions under which the permissions are granted.

 Multi-Factor Authentication (MFA): IAM supports MFA, which adds an extra layer of security
to user sign-ins. With MFA, users are required to provide an additional authentication factor, such
as a one-time password generated by a virtual or physical MFA device.

 Identity Federation: IAM supports identity federation, allowing you to grant temporary access to
AWS resources to users authenticated by external identity providers (such as Active Directory,
Facebook, or Google). This enables organizations to use existing identity systems and extend them
to AWS.

 AWS Organizations Integration: IAM can be integrated with AWS Organizations, a service that
allows you to manage multiple AWS accounts centrally. This integration enables you to apply and
enforce policies across multiple accounts, making it easier to manage permissions and access
control.

 Audit and Compliance: IAM provides features for monitoring and auditing user activity, including
AWS CloudTrail integration. CloudTrail records API calls made to IAM and other AWS services,
allowing you to track actions taken by users and meet compliance requirements.



IAM is a fundamental component of AWS security and access management. It allows organizations to
enforce the principle of least privilege, ensuring that users and entities have only the necessary
permissions to perform their tasks, thereby improving security and reducing the risk of unauthorized
access to AWS resources.
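
As a small illustration of users, groups, and managed policies, the sketch below uses the AWS CLI to create a user, place it in a group, and attach an AWS-managed policy to the group. All names are placeholders, and the CLI is assumed to be configured with administrative permissions.

aws iam create-user --user-name dev-user                        # individual identity
aws iam create-group --group-name developers                    # collection of users
aws iam add-user-to-group --user-name dev-user --group-name developers
# Attach an AWS-managed policy so every member of the group gets read-only S3 access
aws iam attach-group-policy --group-name developers --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess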

Creating IAM Roles:


o In the navigation pane of the console, click Roles and then click on "Create Role". The following
screen appears on clicking the "Create Role" button.

 Choose the service that you want to use with the role.
 Select the managed policy that attaches the permissions to the service.
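
A role can also be created from the command line. The sketch below is an illustration with placeholder names: it creates a role that EC2 instances can assume and attaches a managed policy to it.

# Trust policy: allow the EC2 service to assume this role
cat > trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Principal": { "Service": "ec2.amazonaws.com" }, "Action": "sts:AssumeRole" }
  ]
}
EOF

aws iam create-role --role-name ec2-s3-read-role --assume-role-policy-document file://trust.json
aws iam attach-role-policy --role-name ec2-s3-read-role --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess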



AWS Security:
AWS places a strong emphasis on security and provides a comprehensive set of tools and services to
help users secure their applications and data.

Here are some key aspects of AWS security:

1. Shared Responsibility Model: AWS follows a shared responsibility model, where AWS is responsible
for the security of the cloud infrastructure, while customers are responsible for securing their
applications and data running on AWS. This model ensures a collaborative approach to security.

2. Identity and Access Management (IAM): AWS IAM allows users to manage access to AWS
resources by creating and managing users, groups, and roles. IAM enables fine-grained access control
through policies and supports multi-factor authentication (MFA) for added security.

3. Encryption: AWS provides robust encryption options to protect data in transit and at rest. Transport
Layer Security (TLS) encryption is used for securing data in transit, while AWS Key Management
Service (KMS) enables customers to manage encryption keys for data at rest, including database
storage, EBS volumes, and S3 objects.

4. Network Security: Amazon Virtual Private Cloud (VPC) allows users to create isolated virtual
networks in the cloud. VPC provides granular control over network settings, including subnets,
security groups, and network access control lists (ACLs). AWS also offers Distributed Denial of
Service (DDoS) protection and AWS Shield for protecting applications against cyberattacks.

5. Monitoring and Logging: AWS offers various monitoring and logging services to help users track and
analyze security events. Amazon CloudWatch allows users to monitor resources and receive real-time
insights, while AWS CloudTrail records API calls made to AWS services for auditing and compliance
purposes.

6. Security Compliance and Certifications: AWS has numerous compliance certifications, including
SOC 1, SOC 2, ISO 27001, HIPAA, and PCI DSS, among others. These certifications ensure that
AWS meets industry-recognized security standards and can support customers in meeting their
compliance requirements.

7. Incident Response and Forensics: AWS provides services and features to help customers respond to
security incidents and perform forensic investigations. This includes AWS Incident Response, which
provides guidance and support during security incidents, and AWS Artifact, which provides access to
AWS compliance reports and documents.

8. Security Automation: AWS offers automation tools such as AWS Config, AWS CloudFormation, and
AWS Systems Manager to help users implement security best practices and automate security-related
tasks, ensuring consistent security configurations across environments.

9. Partner Ecosystem: AWS has a broad partner ecosystem that includes security vendors and solutions.
Customers can leverage these partners to enhance their security posture, implement advanced threat
detection and prevention, and achieve comprehensive security solutions.

It's important to note that while AWS provides a secure infrastructure, customers must implement
security best practices and configure their applications and resources correctly to ensure optimal
security. AWS provides extensive documentation, best practice guides, and security whitepapers to help
users understand and implement effective security measures.



Working of IAM:
The AWS Identity and Access Management (IAM) service enables you to manage access to AWS
resources securely. IAM works based on the concept of identities, policies, and permissions. Here's how
IAM works:

1. Identity Creation: You start by creating IAM identities, which can be users, groups, or roles.
2. Users: IAM users are individual identities associated with a person or an application that
interact with AWS resources. Users are assigned unique access credentials (access key ID and
secret access key) for programmatic access or can use the AWS Management Console with their
own username and password.

3. Groups: IAM groups are collections of IAM users. You can assign permissions to groups
instead of managing permissions individually for each user. Users can be added or removed
from groups as needed, and they inherit the permissions assigned to the group.

4. Roles: IAM roles are used to delegate access to entities like IAM users, AWS services, or even
external identity providers. Roles have policies attached to them, specifying the permissions that
can be assumed by entities assuming the role. Roles are often used for granting temporary
access or for cross-account access.

5. Policy Creation: IAM policies define what actions are allowed or denied on AWS resources.
Policies are JSON documents that specify the permissions and resources associated with
identities or roles. You can create custom policies or use AWS-managed policies that cover
common use cases. Policies can be attached to users, groups, or roles.

6. Access Control: IAM enables fine-grained access control based on policies. Policies define
permissions using AWS service actions, resource ARNs (Amazon Resource Names), conditions,
and more. You can grant or deny permissions at the service, resource, or even individual API
operation level. IAM policies can also include conditions that define additional constraints for
access.

7. Authentication and Authorization: IAM handles authentication and authorization for AWS
resources. When an IAM user or entity requests access to an AWS resource, IAM authenticates
the identity and verifies the permissions associated with that identity. IAM ensures that users
and entities have the necessary permissions to perform actions on resources based on the
policies assigned to them.

8. Integration with AWS Services: IAM integrates with various AWS services to enable secure
access and control. For example, IAM roles can be assumed by AWS services to grant them
permissions to perform actions on your behalf. IAM policies can also be used to grant access to
specific AWS resources like S3 buckets, EC2 instances, or RDS databases.

9. Monitoring and Auditing: IAM activity can be monitored and audited using AWS CloudTrail.
CloudTrail records API calls made to IAM and other AWS services, allowing you to track
changes, detect unauthorized access attempts, and comply with security and auditing
requirements.

IAM provides a centralized and secure way to manage access to AWS resources. It follows the principle
of least privilege, ensuring that users and entities have only the necessary permissions to perform their
tasks. By using IAM, you can effectively control and secure access to your AWS environment.



Components and Working of AWS CloudFront
AWS CloudFront is a content delivery network (CDN) service provided by Amazon Web Services
(AWS). It delivers content, such as web pages, videos, and images, to end users with low latency and
high transfer speeds by caching content in edge locations around the world. The key components and
working of AWS CloudFront are as follows:

1. Origin: The origin is the source of the content that CloudFront delivers to end users. It can be
an Amazon S3 bucket, an Elastic Load Balancer, an EC2 instance, or a custom HTTP server
outside of AWS. CloudFront retrieves the content from the origin server when it is not present
in its cache or when the content has expired.

2. Distribution: A distribution is a collection of edge locations that serve content to end users.
When you create a CloudFront distribution, you specify the origin(s) from which CloudFront
should retrieve content. There are two types of distributions: web distributions for delivering
web content and RTMP (Real-Time Messaging Protocol) distributions for streaming media
content.

3. Edge Locations: CloudFront uses a network of edge locations located around the world to
cache and deliver content. Edge locations are geographically distributed points of presence
(PoPs) that act as caches for frequently accessed content. When a user requests content,
CloudFront delivers it from the nearest edge location, reducing latency and improving
performance.

4. Cache: CloudFront caches content at edge locations based on configurable cache behaviors. The
cache behaviors define how CloudFront should handle specific requests, including whether to
cache the content and for how long. Cached content is stored in the edge location's cache until it
expires or is evicted due to space constraints.

5. Content Delivery: When a user requests content, the request is routed to the nearest edge
location. If the requested content is already cached and has not expired, CloudFront delivers it
directly from the cache, resulting in low latency. If the content is not in the cache or has expired,
CloudFront retrieves it from the origin server, caches it in the edge location, and then delivers it
to the user.

6. SSL/TLS Encryption: CloudFront supports SSL/TLS encryption for secure content delivery.
You can configure CloudFront to use HTTPS to encrypt content in transit between the edge
locations and end users. CloudFront also supports the use of custom SSL/TLS certificates or
integrates with AWS Certificate Manager for managing SSL/TLS certificates.

7. Access Control: CloudFront provides various mechanisms for access control and security. You
can use AWS Identity and Access Management (IAM) to control who can create, configure, and
manage CloudFront distributions. Additionally, you can use CloudFront signed URLs or signed
cookies to restrict access to specific content or limit access duration.

8. Monitoring and Reporting: CloudFront integrates with AWS CloudWatch, which allows you
to monitor and gain insights into the performance and behavior of your CloudFront
distributions. CloudFront provides metrics, logs, and real-time data that you can use for
monitoring, troubleshooting, and optimizing the delivery of your content.
By leveraging the components and capabilities of CloudFront, you can distribute content globally with
low latency, high availability, and improved performance for your end users, enhancing their browsing
or streaming experience.



Benefits:

AWS CloudFront offers several benefits for content delivery and performance optimization:

 Low Latency and High Performance: CloudFront uses a network of globally distributed edge
locations to deliver content from the nearest edge location to end users, reducing latency and
improving performance. This results in faster load times and a better user experience.

 Global Content Delivery: CloudFront's extensive network of edge locations spans across the
globe, allowing you to distribute your content to users worldwide. It helps reduce the distance
between users and your content, enabling faster delivery regardless of their geographical
location.

 Scalability: CloudFront is highly scalable and can handle traffic spikes and sudden increases in
demand. It automatically scales to accommodate varying levels of traffic, ensuring that your
content remains accessible and responsive during peak usage periods.

 Cost-Effective: CloudFront offers a pay-as-you-go pricing model, where you only pay for the
data transfer and requests made by your users. The pricing is based on the amount of data
transferred and the number of requests, allowing you to optimize costs based on your actual
usage.

 Caching and Content Optimization: CloudFront caches your content at edge locations,
reducing the load on your origin server and improving response times. By caching content
closer to end users, CloudFront minimizes the need to fetch content from the origin server,
resulting in faster delivery.

 Customizable Content Delivery: CloudFront provides configurable cache behaviors, allowing
you to control how your content is cached and delivered. You can set caching rules based on file
extensions, HTTP headers, query strings, or cookies, tailoring the caching behavior to match
your application's requirements.

 Security: CloudFront supports various security features, including SSL/TLS encryption for
secure content delivery over HTTPS. You can also control access to your content using
CloudFront signed URLs or signed cookies, ensuring that only authorized users can access your
protected content.

 Integration with AWS Services: CloudFront seamlessly integrates with other AWS services,
such as Amazon S3, Amazon EC2, and AWS Lambda. You can easily combine CloudFront
with these services to deliver dynamic, static, or streaming content, enabling a cohesive and
scalable architecture.

 Real-Time Monitoring and Analytics: CloudFront integrates with AWS CloudWatch,
providing detailed metrics and real-time monitoring of your distribution's performance. You can
gain insights into traffic patterns, cache utilization, and other key metrics, enabling you to
optimize your content delivery strategy.



Introduction to Snapshots vs AMI
Snapshots and Amazon Machine Images (AMIs) are both used in Amazon Web Services (AWS)
for managing and preserving data and instances, but they serve different purposes. Let's compare
snapshots and AMIs:

Snapshots:

 A snapshot is a point-in-time copy of the data stored in an Amazon Elastic Block Store
(EBS) volume. It captures the data and configuration of the volume at the time the
snapshot is taken.
 Snapshots are primarily used for backup, recovery, and data persistence. They allow you to
create a backup of your EBS volumes and protect against data loss.
 Snapshots are incremental, meaning that after the initial snapshot, subsequent snapshots
only capture the changes since the previous snapshot. This helps in efficient storage
utilization and faster backup operations.
 Snapshots are stored in Amazon S3, and you are charged based on the size of the data
stored in the snapshot.
 Here are some key details about AWS snapshots:
1. Point-in-time Copies
2. Incremental Backups
3. Fast and Easy Backups
4. Encryption
5. Cost-effective
6. Region-based
7. Flexible Recovery Options

Amazon Machine Images (AMIs):

 An AMI is a pre-configured image that contains the root file system, applications, and
configuration necessary to launch an EC2 instance. It captures the entire state of an EC2
instance at the time the AMI is created.
 AMIs are used for creating and launching new EC2 instances with the same configuration
as the original instance. They provide a convenient way to replicate instances or create
templates for consistent deployments.
 AMIs include the operating system, installed software, and any additional data or
configurations present on the instance's root volume.
 AMIs can be public, shared with other AWS accounts, or privately owned by your
account. You can also create custom AMIs from existing EC2 instances or from snapshots.
 AMIs are stored in Amazon S3 and are charged based on the storage size of the AMI and
any associated snapshots.
 Here are some key details about AWS AMIs:
1. Types of AMIs
2. Customization
3. Versioning
4. Storage
5. Security
6. Sharing
7. Licensing
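
The following AWS CLI sketch shows the difference in practice; the volume ID, instance ID, and names are placeholders.

# Snapshot: point-in-time copy of a single EBS volume
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "nightly backup"

# AMI: image of a whole instance (root volume, OS, installed software) for launching new instances
aws ec2 create-image --instance-id i-0123456789abcdef0 --name "web-server-v1"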



Different Scaling Plans:
When it comes to scaling resources in AWS, there are several scaling plans available that cater
to different types of workloads and requirements. The following are some of the common
scaling plans offered by AWS:

1. Amazon EC2 Auto Scaling: This scaling plan is used to automatically adjust the
number of Amazon EC2 instances in an Auto Scaling group based on predefined
scaling policies. It allows you to define rules for scaling in and out based on metrics
such as CPU utilization, network traffic, or custom metrics. EC2 Auto Scaling helps
maintain application availability, optimize performance, and manage costs by
automatically adding or removing instances as needed.

2. Application Auto Scaling: Application Auto Scaling is a service that allows you to
automatically scale other AWS resources beyond EC2 instances. It supports scaling for
various services such as Amazon ECS (Elastic Container Service), DynamoDB,
Amazon Aurora, Amazon AppStream 2.0, and more. With Application Auto Scaling,
you can define scaling policies based on specific metrics and conditions related to the
respective service, enabling automatic scaling of resources to handle changes in
demand.

3. AWS Auto Scaling: AWS Auto Scaling provides a unified scaling experience across
multiple AWS services. It combines the capabilities of EC2 Auto Scaling, Application
Auto Scaling, and other scaling features to offer a comprehensive scaling solution.
AWS Auto Scaling allows you to define scaling policies across different services,
ensuring that your application components scale in a coordinated manner to meet
demand while optimizing resource utilization.

4. AWS Elastic Beanstalk: Elastic Beanstalk is a fully managed platform-as-a-service
(PaaS) offering by AWS. It simplifies the deployment and management of applications
by automatically handling the underlying infrastructure. Elastic Beanstalk includes
built-in scaling capabilities that can automatically adjust the capacity of EC2 instances
or other resources based on defined thresholds or metrics. It supports both manual
scaling and automatic scaling based on average CPU utilization, network traffic, or
custom metrics.

5. Amazon RDS Auto Scaling: RDS Auto Scaling allows you to automatically adjust the
capacity of Amazon RDS (Relational Database Service) instances based on predefined
scaling policies. It helps maintain optimal performance and availability of your
database by adding or removing RDS instances in response to changes in demand.
RDS Auto Scaling supports scaling based on metrics such as CPU utilization or
database connections.

AWS provides various scaling plans that allow you to scale your applications and infrastructure
in response to changing demand. Here are some of the different scaling plans in AWS:
1. Horizontal Scaling: Horizontal scaling is the process of adding more instances to your
application to handle increased traffic. This can be achieved using AWS Auto Scaling,
which automatically adjusts the number of EC2 instances based on the load on your
application.
2. Vertical Scaling: Vertical scaling is the process of increasing the capacity of an individual
instance to handle more traffic. This can be achieved by upgrading the instance type or
increasing the size of the instance's resources, such as CPU, memory, and storage.



3. Scheduled Scaling: Scheduled scaling allows you to adjust the number of instances based
on a predetermined schedule. This is useful for applications that experience predictable
traffic patterns, such as an e-commerce site that experiences increased traffic during
holiday periods.
4. Manual Scaling: Manual scaling is the process of manually adjusting the number of
instances in response to changing traffic patterns. This can be done using the AWS
Management Console or the AWS CLI.
5. Predictive Scaling: Predictive scaling is a machine learning-based scaling approach that
uses historical data to predict future traffic patterns and automatically adjusts the number
of instances accordingly.
6. Application Load Balancer (ALB) Scaling: Application Load Balancer (ALB) Scaling
allows you to automatically scale your application based on the traffic to your ALB. The
ALB automatically distributes traffic to the appropriate instance based on the load on each
instance.
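
As a hedged example of EC2 Auto Scaling, the sketch below creates an Auto Scaling group from an existing launch template and adds a target-tracking policy that keeps average CPU utilization around 50%. All names, subnet IDs, and thresholds are placeholders chosen for illustration.

aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name web-asg \
  --launch-template LaunchTemplateName=web-template \
  --min-size 2 --max-size 10 --desired-capacity 2 \
  --vpc-zone-identifier "subnet-aaaa1111,subnet-bbbb2222"

# Target tracking: scale out/in automatically to keep average CPU near 50%
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name web-asg \
  --policy-name keep-cpu-at-50 \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":50.0}'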

Introduction to Load balancing


 Load balancing is the method of distributing network traffic equally across a pool of
resources that support an application.
 Modern applications must process millions of users simultaneously and return the correct
text, videos, images, and other data to each user in a fast and reliable manner.
 To handle such high volumes of traffic, most applications have many resource servers with
duplicate data between them.
 A load balancer is a device that sits between the user and the server group and acts as an
invisible facilitator, ensuring that all resource servers are used equally.

Load balancing in AWS is a crucial component for achieving high availability, scalability, and
fault tolerance in your applications or services. AWS offers multiple load balancing services that
can distribute traffic across multiple resources to ensure efficient resource utilization and optimal
performance. The main load balancing services provided by AWS are:

Elastic Load Balancing (ELB): Elastic Load Balancing is a fully managed load balancing
service that distributes incoming traffic across multiple EC2 instances, containers, IP
addresses, or Lambda functions within a specific region. ELB automatically scales the load
balancer as traffic patterns change, ensuring that your application can handle varying
levels of traffic. AWS offers three types of ELB:

a. Classic Load Balancer (CLB): This is the traditional load balancer provided by AWS. It
operates at the transport layer (Layer 4) and can distribute traffic across EC2 instances.

b. Application Load Balancer (ALB): ALB operates at the application layer (Layer 7) and
provides advanced features such as content-based routing, path-based routing, and support
for HTTP/HTTPS protocols. It is suitable for modern web applications.

c. Network Load Balancer (NLB): NLB operates at the network layer (Layer 4) and is
designed to handle high volumes of traffic with ultra-low latencies. It is ideal for TCP,
UDP, and TLS traffic.

AWS Global Accelerator: AWS Global Accelerator improves the availability and
performance of your applications for global users by routing traffic through the AWS
global network infrastructure. It uses the AWS edge locations to direct traffic to the nearest
application endpoint, reducing latency and improving global application responsiveness.
AWS Application Auto Scaling: While not a load balancer itself, AWS Application Auto
Scaling complements load balancing by automatically adjusting the capacity of other AWS
resources based on defined scaling policies. It supports scaling for services like Amazon
ECS, DynamoDB, Aurora, and more, ensuring that the resources scale in sync with the
load.
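
A minimal sketch of setting up an Application Load Balancer with the AWS CLI is shown below. The subnet, security group, and VPC IDs are placeholders, and the ARNs returned by the first two commands must be substituted into the third.

aws elbv2 create-load-balancer --name web-alb --type application \
  --subnets subnet-aaaa1111 subnet-bbbb2222 --security-groups sg-0123456789abcdef0

aws elbv2 create-target-group --name web-targets --protocol HTTP --port 80 \
  --vpc-id vpc-0123456789abcdef0 --target-type instance

# Forward all HTTP traffic on port 80 to the target group (replace the ARNs with real values)
aws elbv2 create-listener --load-balancer-arn <load-balancer-arn> --protocol HTTP --port 80 \
  --default-actions Type=forward,TargetGroupArn=<target-group-arn>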

Benefits
By leveraging these load balancing services in AWS, you can achieve benefits such as:

High availability: Load balancers distribute traffic across multiple resources, ensuring that if
one resource becomes unavailable, traffic is automatically routed to healthy resources,
minimizing downtime.
Scalability: Load balancers can dynamically scale the resources they distribute traffic to,
allowing your application to handle varying levels of traffic without manual intervention.
Fault tolerance: Load balancers monitor the health of resources and automatically route
traffic away from unhealthy resources, improving the fault tolerance of your application or
service.
Simplified management: AWS load balancing services are fully managed, meaning that
AWS handles the operational aspects such as capacity provisioning, scaling, and health
monitoring, allowing you to focus on your application logic.

How Does Load Balancing Work?

Load balancing algorithms


Load balancing algorithms determine how incoming network traffic is distributed across
multiple servers or resources. AWS load balancing services employ various algorithms to
achieve efficient distribution of traffic. Here are some commonly used load balancing
algorithms:



1. Round Robin: This is a simple and widely used algorithm where each new request is
routed to the next available server in a circular order. Each server receives an equal
share of the traffic, making it a fair distribution method. Round Robin works well
when the servers have similar capacities and there are no significant differences in
response times.

2. Least Connection: With the Least Connection algorithm, incoming requests are routed
to the server with the fewest active connections at the time the request is received. This
algorithm ensures that the load is distributed based on the current load on the servers,
aiming to achieve a more balanced distribution.

3. IP Hash: In this algorithm, the client's IP address is used to determine which server
will handle the request. The IP address is hashed, and the resulting value is used to
select the server. This ensures that requests from the same IP address are consistently
routed to the same server, which can be useful for maintaining session persistence.

4. Least Time: The Least Time algorithm considers the response time of each server and
routes requests to the server with the lowest response time. This approach aims to
minimize the overall response time for the client.

5. Weighted Round Robin: In this algorithm, each server is assigned a weight that
corresponds to its processing capacity or performance. Servers with higher weights
receive a larger proportion of the traffic, enabling load balancing that takes into
account the server's capabilities.

6. Least Bandwidth: The Least Bandwidth algorithm routes requests to the server with
the least amount of current network traffic. This approach aims to balance the network
load across servers based on their available bandwidth.

It's important to note that not all load balancing algorithms are available in every load
balancing service. AWS load balancing services, such as Elastic Load Balancing (ELB) and
Application Load Balancer (ALB), provide built-in load balancing algorithms specific to each
service.
The choice of load balancing algorithm depends on various factors, including the nature of the
workload, the capabilities of the servers or resources, and the desired performance and
behavior of the application. AWS load balancing services typically offer configurable options
to select the appropriate algorithm or provide a default behavior that is suitable for most use
cases.
Types of Load Balancing
 Application load balancing
 Network load balancing
 Global server load balancing
 DNS load balancing

Types Of Load Balancing Technology

 Hardware load balancers


 Software load balancers



Module 2 CONTAINERIZATION USING DOCKERS
Docker, Containers, Usage of containers, Terminology, Docker Run Static sites, Docker Images,
Docker File, Docker on AWS, Docker Network, Docker Compose, Development Workflow, AWS EC
Services.

Docker
 Docker is an open-source platform that allows you to automate the deployment, scaling, and
management of applications using containerization.
 Containers are lightweight and isolated environments that package an application and its
dependencies, providing consistency across different computing environments.

Here are some key concepts related to Docker:

1. Container: A container is an isolated and lightweight runtime environment that includes
everything needed to run an application, such as code, runtime, system tools, and system
libraries. Containers are based on container images.

2. Image: An image is a read-only template used to create containers. It contains the application
code, runtime, and dependencies required to run the application. Docker images are built from a
set of instructions called a Dockerfile.

3. Dockerfile: A Dockerfile is a text file that contains a set of instructions to build a Docker image.
It specifies the base image, application code, dependencies, and other configurations needed for
the container.

4. Docker Hub: Docker Hub is a public registry that hosts thousands of Docker images. It allows
users to share and download pre-built Docker images for various applications, frameworks, and
services.

5. Containerization: Containerization is the process of encapsulating an application and its
dependencies into a container. It provides a consistent and portable runtime environment,
making it easier to deploy and run applications across different systems.

6. Docker Compose: Docker Compose is a tool used to define and manage multi-container Docker
applications. It allows you to describe the services, networks, and volumes required for your
application using a YAML file.

7. Orchestration: Docker Swarm and Kubernetes are popular container orchestration platforms
that help manage and scale containerized applications across multiple hosts or nodes. They
provide features like service discovery, load balancing, scaling, and high availability.

Using Docker, developers can create consistent development and production environments, simplify
application deployment, and improve scalability. It also promotes collaboration and sharing of software
components through container images. Docker has gained significant popularity in the software
development industry due to its ease of use, portability, and resource efficiency.



1. Initially, the Docker container will be in the created state.

2. Then the Docker container goes into the running state when the Docker run command is used.

3. The Docker kill command is used to kill an existing Docker container.

4. The Docker pause command is used to pause an existing Docker container.

5. The Docker stop command is used to stop an existing Docker container.

6. The Docker start command is used to put a container back from a stopped state to a running state.
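
These lifecycle states map directly onto Docker CLI commands. The sketch below walks one container (named demo here, using the public nginx image as an example) through the states:

docker create --name demo nginx:latest   # container exists in the "created" state
docker start demo                        # created/stopped -> running
docker pause demo                        # running -> paused
docker unpause demo                      # paused -> running
docker stop demo                         # running -> stopped (graceful shutdown)
docker start demo                        # stopped -> running again
docker kill demo                         # force-stop the running container
docker rm demo                           # remove the stopped container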

Docker Architecture

The traditional virtualization architecture consists of the following layers:
 The server is the physical server that is used to host multiple virtual machines.
 The Host OS is the base machine such as Linux or Windows.
 The Hypervisor is either VMWare or Windows Hyper V that is used to host virtual
machines.
 You would then install multiple operating systems as virtual machines on top of the
existing hypervisor as Guest OS.
 You would then host your applications on top of each Guest OS.
The following image shows the new generation of virtualization that is enabled via Docker.
Let’s have a look at the various layers.

 The server is the physical server that is used to host multiple virtual machines. So this
layer remains the same.
 The Host OS is the base machine such as Linux or Windows. So this layer remains the
same.
 Now comes the new generation which is the Docker engine. This is used to run the
operating system which earlier used to be virtual machines as Docker containers.
 All of the Apps now run as Docker containers.
The clear advantage in this architecture is that you don’t need to have extra hardware for Guest
OS. Everything works as Docker containers.

Containers:
 Containers are lightweight, standalone executable packages that include
everything needed to run an application, including code, runtime, system tools,
libraries, and settings.
 Containers provide a consistent, reliable, and efficient way to run applications
across different environments, from development to production.
 Containers are similar to virtual machines in that they provide a way to isolate an
application and its dependencies from the host system.
 However, containers are more lightweight and efficient than virtual machines
because they share resources with the host system, rather than requiring their own
operating system and virtual hardware.



The key components of a container are:
1. Image: An image is a read-only template that contains the code, runtime, system tools,
libraries, and settings needed to run an application. Images are typically built from a
Dockerfile, which is a script that specifies the instructions for building the image.
2. Container: A container is a running instance of an image. Containers provide an isolated
environment for running an application and its dependencies, and can be started, stopped,
and managed independently of other containers on the same system.
3. Registry: A registry is a storage and distribution system for Docker images. Registries
can be public or private and are used to store and share images with other users and
systems.

Containers can be used for a wide range of applications and use cases, including:
1. Development and testing: Containers provide a consistent, reproducible environment for
developing and testing applications, making it easier to identify and fix bugs and
compatibility issues.
2. Deployment and scaling: Containers make it easy to deploy and scale applications by
providing a consistent environment across different systems and environments.

3. Microservices: Containers are well-suited for building and deploying microservices, which
are small, independent components that work together to form a larger application.
4. Legacy applications: Containers can be used to modernize and containerize legacy
applications, making them more portable, scalable, and efficient.

Usage of Containers:
Containers have gained widespread adoption in various areas of software development and
deployment. Here are some common use cases for containers:

Application Deployment:
 Containers provide a consistent and portable runtime environment for applications.
 Developers can package their applications along with all the necessary dependencies and
configurations into a container image.
 These images can be easily deployed across different environments, including development,
testing, staging, and production, ensuring that the application runs consistently across all
stages.

Microservices Architecture:
 Containers are well-suited for building and deploying microservices-based architectures.
 Each microservice can be packaged and deployed as a separate container, enabling
independent development, scalability, and deployment of individual components.
 Containers facilitate the decoupling of services, making it easier to manage and scale
complex distributed applications.

Continuous Integration and Continuous Deployment (CI/CD):


 Containers are an integral part of CI/CD pipelines.
 Developers can use containers to package and test applications in a consistent and
reproducible environment.
 Containers also enable the seamless deployment of applications to different environments,
ensuring that the application behaves the same in all stages of the pipeline.
Scaling and Load Balancing:
 Containers can be easily replicated and distributed across multiple hosts or nodes, allowing
applications to scale horizontally.
 Container orchestration platforms like Kubernetes and Docker Swarm provide built-in
features for managing scaling, load balancing, and auto-scaling of containerized
applications.

Development and Testing Environments:


 Containers provide developers with a lightweight and isolated environment to develop, test,
and debug applications.
 Developers can use containers to set up development environments that closely mirror the
production environment, ensuring consistent behavior across different stages of the
application lifecycle.

Legacy Application Modernization:


 Containers can be used to modernize legacy applications by containerizing them.
 This approach allows organizations to leverage the benefits of containers, such as
portability, scalability, and easier management, without completely re-architecting the
application.

Cloud and Hybrid Cloud Deployments:


 Containers are commonly used in cloud and hybrid cloud environments.
 Cloud platforms provide native support for containerization technologies, making it easy to
deploy and manage containers at scale.
 Containers enable the efficient utilization of cloud resources, facilitate application
portability, and simplify the deployment of multi-cloud or hybrid cloud architectures.

Big Data and Analytics:


 Containers are increasingly being used in big data and analytics workflows.
 Containers can package data processing frameworks, such as Apache Spark or Apache
Hadoop, along with the required dependencies.
 This enables developers and data scientists to create portable and reproducible data
processing pipelines, simplifying the development and deployment of analytics applications.

These are just a few examples of how containers are used in various aspects of software
development and deployment. The flexibility, portability, and scalability offered by containers make
them a powerful tool for modern application development and deployment practices.

Terminology:
Some common terminology related to containers:

1. Container: An isolated and lightweight runtime environment that encapsulates an application
and its dependencies.

2. Container Image: A read-only template used to create containers. It includes the application
code, runtime, libraries, and dependencies required to run the application.

3. Docker: An open-source platform that enables containerization, allowing you to build,
distribute, and run containers.



4. Dockerfile: A text file that contains instructions to build a Docker image. It specifies the base
image, dependencies, configurations, and steps to set up the container.

5. Container Registry: A repository that stores and distributes container images. Docker Hub is a
popular public container registry, but private registries like Amazon ECR, Google Container
Registry, or Azure Container Registry are also commonly used.

6. Container Orchestration: The process of managing, deploying, scaling, and networking
containers in a distributed environment. Tools like Kubernetes and Docker Swarm are used for
container orchestration.

7. Microservices: An architectural style where an application is composed of small, independent
services that communicate with each other through APIs. Containers are often used to deploy
and manage microservices.

8. Containerization: The process of encapsulating an application and its dependencies into a
container. It provides consistency and portability across different environments.

9. Volume: A mechanism in containers that allows data to persist beyond the lifecycle of the
container. Volumes can be used to store and share data between containers or between
containers and the host system.

10. Container Networking: The networking infrastructure that enables communication between
containers. Containers can be connected through virtual networks, allowing them to
communicate with each other or with external systems.

11. Container Orchestration Platform: Software platforms that automate the deployment, scaling,
and management of containers. Examples include Kubernetes, Docker Swarm, and Apache
Mesos.

12. Service Discovery: The process of automatically detecting and registering available services
within a container orchestration platform. It enables containers to find and communicate with
each other dynamically.

13. Load Balancing: The distribution of incoming network traffic across multiple containers to
optimize resource utilization and improve application performance. Load balancers can be
integrated with container orchestration platforms.

14. Auto-scaling: The capability of dynamically adjusting the number of containers based on
workload demand. Auto-scaling helps ensure that the application can handle increased traffic or
workload without manual intervention.

15. Immutable Infrastructure: An approach where containers are treated as disposable and are not
modified once deployed. Instead, any updates or changes are made by creating new container
images and deploying them.



Docker Run Static sites

Running a static website using Docker is a great way to create a portable and scalable web hosting
solution.
Here are the steps to run a static site using Docker:
1. Create a Dockerfile: The first step is to create a Dockerfile that specifies the base image,
copies the static files into the container, and exposes the container port.
Here's an example of a Dockerfile for a static site built with HTML and CSS:
# Specify the base image
FROM nginx:latest
# Copy the static files into the container
COPY . /usr/share/nginx/html
# Expose port 80
EXPOSE 80
2. Build the Docker image: Use the `docker build` command to build the Docker image from the
Dockerfile. The command should be run from the directory containing the Dockerfile:
docker build -t my-static-site .

The `-t` flag sets the image name and the `. ` specifies the build context.

3. Run the Docker container: Once the Docker image is built, run the container using the `docker
run` command:
docker run -d -p 8080:80 my-static-site

The `-d` flag runs the container in detached mode, `-p` maps the container port 80 to the host port
8080, and `my-static-site` is the name of the Docker image.

4. Access the website: Open a web browser and go to `http://localhost:8080` to view the static
website.
By following these simple steps, you can easily run a static website using Docker.

Docker Image
In Docker, images are the building blocks for containers. An image is a read-only template that
contains the necessary files, dependencies, and configurations to run a specific application or service
within a container.



Here are some key points about Docker images:

Image Layers: Docker images are composed of multiple layers. Each layer represents a specific
modification or addition to the base image. Layering allows for efficient image sharing and
reusability, as common layers can be shared among multiple images.

Base Image: A base image serves as the starting point for creating other images. It typically
contains the minimal operating system or runtime environment required for running a specific type
of application. Examples of base images include official language runtimes (e.g., Python, Node.js)
or distribution-specific images (e.g., Ubuntu, Alpine).

Dockerfile: A Dockerfile is a text file that contains a series of instructions for building a Docker
image. It specifies the base image, additional software installations, file copying, environment
variables, exposed ports, and more. Docker uses the instructions in the Dockerfile to build a new
image layer by layer.

Image Repository: Docker images can be stored and managed in image repositories. Docker Hub is
a popular public image repository that hosts thousands of pre-built images. Private repositories, such
as Amazon ECR, Google Container Registry, and Azure Container Registry, allow organizations to
securely store and distribute their own Docker images.

Image Tag: An image tag is a label attached to an image to distinguish different versions or variants
of the same image. Tags are typically used to represent different versions, such as "latest," "v1.0," or
specific release numbers. When pulling or running an image, you can specify the tag to retrieve the
desired version.

Image Pull: To use an image, it needs to be pulled from a registry to the local Docker environment.
The docker pull command is used to download an image from a specified repository. If the image is
not found locally, Docker will fetch it from the repository.

Image Build: Docker builds images using the docker build command, which reads the instructions
from a Dockerfile and creates a new image based on those instructions. The build process involves
downloading necessary layers, executing the instructions, and generating a new image.

Image Layers and Caching: Docker utilizes layer caching during the image build process. If a
Dockerfile instruction has not changed since a previous build, Docker can reuse the corresponding
layer from cache. This caching mechanism speeds up subsequent builds, as unchanged layers do not
need to be rebuilt.

Image Tagging and Pushing: Once an image is built, it can be tagged with a specific version or
variant and pushed to a repository. The docker tag command is used to assign a new tag to an image,
and the docker push command is used to upload the image to a repository, making it available for
others to pull and use.

Docker images are fundamental to the containerization process, allowing for the reproducibility,
portability, and sharing of containerized applications and services. They enable developers to
package and distribute their applications, ensuring consistency across different environments and
simplifying the deployment process.
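As a quick illustration of the pull, build, tag, and push steps described above, here is a minimal command-line sketch; the image name my-app, the tag v1.0, and the registry account myuser are placeholder examples, not values from an actual project.

# Pull a base image from Docker Hub
docker pull python:3.7-slim

# Build an image from the Dockerfile in the current directory
docker build -t my-app:v1.0 .

# List local images and their tags
docker images

# Tag the image for a registry account and push it
docker tag my-app:v1.0 myuser/my-app:v1.0
docker push myuser/my-app:v1.0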



Docker File

A Dockerfile is a text file that contains a set of instructions for building a Docker image. These
instructions define the steps to create the image, such as specifying the base image, copying files,
installing dependencies, setting environment variables, exposing ports, and executing commands.

Here's an overview of the Dockerfile syntax and some commonly used instructions:

Base image: The first line of a Dockerfile specifies the base image on which your image will be
built. It defines the starting point for your image. Example: FROM ubuntu:20.04.

Copy files: The COPY instruction is used to copy files and directories from the build context (the
directory containing the Dockerfile) to the image. Example: COPY app.py /app/.

Set working directory: The WORKDIR instruction sets the working directory for subsequent
instructions. Example: WORKDIR /app.

Install dependencies: Use RUN instruction to execute commands during the image build process.
You can install dependencies, run package managers, or perform any other necessary setup tasks.
Example: RUN apt-get update && apt-get install -y python3.

Expose ports: The EXPOSE instruction documents the ports that the container listens on at
runtime. It doesn't actually publish the ports. Example: EXPOSE 8080.

Set environment variables: The ENV instruction sets environment variables in the image.
Example: ENV MY_VAR=my_value.

Execute commands: The CMD instruction specifies the default command to run when a container
is created from the image. It can be overridden when starting the container. Example: CMD
["python3", "app.py"].

Build the image: Use the docker build command to build the image based on the Dockerfile.
Example: docker build -t my-image . (the trailing dot is the build context, i.e. the current directory).



Dockerfiles can be more complex, including multiple stages, conditionals, and other instructions to
accommodate specific application requirements. It's a best practice to keep Dockerfiles as concise as
possible, leveraging existing base images and minimizing the number of layers to optimize build
times and reduce image size.

Once the Dockerfile is ready, you can build the Docker image using the docker build command and
run containers based on that image using the docker run command.

Here is an example Dockerfile:

# Use an official Python runtime as a parent image
FROM python:3.7-slim

# Set the working directory to /app
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install --trusted-host pypi.python.org -r requirements.txt

# Expose port 80 for the Flask app
EXPOSE 80

# Define environment variable
ENV NAME World

# Run the command to start the Flask app
CMD ["python", "app.py"]
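Assuming this Dockerfile sits in a directory together with app.py and requirements.txt (hypothetical files for a small Flask app), the image could be built and run roughly as follows:

# Build the image from the Dockerfile in the current directory
docker build -t my-flask-app .

# Run the container, mapping host port 4000 to container port 80
docker run -d -p 4000:80 my-flask-app

# The app should now be reachable at http://localhost:4000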

Docker on AWS

Docker can be used on AWS (Amazon Web Services) to deploy and manage containers in the cloud.
AWS provides several services and features that integrate well with Docker, allowing you to build,
run, and scale containerized applications effectively.

Here are some key AWS services and features related to Docker:

1. Amazon Elastic Container Service (ECS): ECS is a fully managed container orchestration
service provided by AWS. It allows you to run Docker containers without managing the
underlying infrastructure. You can define task definitions that specify the container images,
resources, networking, and other configurations. ECS automatically handles container
deployment, scaling, and load balancing.

2. Amazon Elastic Kubernetes Service (EKS): EKS is a managed Kubernetes service on AWS. It
simplifies the deployment, management, and scaling of Kubernetes clusters. You can use EKS to
run Docker containers within Kubernetes pods, taking advantage of the rich ecosystem of
Kubernetes tools and features.



3. Amazon Fargate: Fargate is a serverless compute engine for containers. It allows you to run
containers without managing the underlying infrastructure. With Fargate, you can define your
container configurations, specify resources, networking, and other settings, and AWS takes care
of provisioning and managing the required infrastructure.

4. AWS Batch: AWS Batch is a service for running batch computing workloads, including
containerized applications. It provides a managed environment for executing jobs at scale, and
you can use Docker containers as the execution environment for your batch jobs.

5. AWS CloudFormation: CloudFormation is AWS's infrastructure-as-code service. You can use


CloudFormation templates to define and provision AWS resources, including ECS clusters, EKS
clusters, networking, load balancers, and more. This allows you to create and manage your
Docker-based infrastructure in a repeatable and automated manner.

6. Amazon Elastic Container Registry (ECR): ECR is a fully managed Docker container registry
provided by AWS. It allows you to store, manage, and deploy container images. You can push
your Docker images to ECR and use them in ECS, EKS, or other container orchestration
platforms.

7. AWS Cloud Development Kit (CDK): CDK is an open-source development framework that
allows you to define cloud infrastructure using familiar programming languages. You can use
CDK to define Docker containers, ECS clusters, networking, and other AWS resources in code,
providing a more programmatic and repeatable way of managing your Docker deployments on
AWS.

These are just a few examples of how Docker can be used on AWS. AWS provides a wide range of
services that can be integrated with Docker to build scalable, resilient, and cost-effective
containerized applications. The choice of services depends on your specific use case and
requirements.
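For example, pushing a local image to Amazon ECR so that ECS, EKS, or Fargate can use it typically looks like the following sketch; the region, the account ID 123456789012, and the repository name my-app are placeholders.

# Create an ECR repository (one-time step)
aws ecr create-repository --repository-name my-app --region us-east-1

# Authenticate the Docker CLI against the ECR registry
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# Tag and push the local image to ECR
docker tag my-app:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest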

Docker Network

Docker provides a networking feature that allows containers to communicate with each other and
with external networks. Docker networking enables containers to be connected together, isolated
from other containers or networks, and exposed to the host system or other containers.



Here are some key concepts related to Docker networking:

Default Network: When Docker is installed, it creates a default bridge network called bridge.
Containers that are started without specifying a network explicitly are connected to this network by
default. Containers on the same bridge network can communicate with each other using IP
addresses.

Container Network Interface (CNI): Docker uses CNI plugins to manage container networking.
These plugins are responsible for creating and configuring the network interfaces of containers.
Docker supports multiple CNI plugins, including bridge, overlay, macvlan, host, and more.

Bridge Network: A bridge network is a private network internal to the Docker host. It allows
containers to communicate with each other using IP addresses. Containers on the same bridge
network can discover each other using their container names or IP addresses. By default, Docker
creates a bridge network called bridge when it is installed.

Host Network: In the host network mode, a container shares the network namespace with the
Docker host. This means the container uses the host's network stack and can directly access the
host's network interfaces. Containers in host network mode have the same network configuration as
the host system and can use the host's IP address.

Overlay Network: Overlay networks enable communication between containers running on different Docker hosts. They are used in multi-host Docker deployments, such as Docker Swarm or Kubernetes. Overlay networks use a VXLAN-based overlay to encapsulate and transmit container network traffic across hosts.

User-defined Network: Docker allows you to create user-defined networks to isolate containers and
control their connectivity. User-defined networks provide network segmentation and firewall-like
rules to control communication between containers. Containers can be attached to multiple user-
defined networks, allowing for more complex network setups.

Service Discovery: Docker provides built-in service discovery mechanisms to facilitate communication between containers. Containers on the same network can discover each other using their container names as DNS names. This enables containers to connect to other containers using familiar DNS-based hostname resolution.



Port Mapping: Port mapping is used to expose container ports to the host system or external
networks. By specifying port mappings, you can make container services accessible from the host or
other systems. For example, you can map port 8080 on the host to port 80 in a container, allowing
external access to the container's web service.

These are some fundamental concepts related to Docker networking. Understanding Docker
networking allows you to configure communication between containers, connect containers to
external networks, and build complex network setups for your containerized applications.
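The user-defined network and DNS-based service discovery described above can be tried with a short sequence of commands; the network and container names here are arbitrary examples.

# Create a user-defined bridge network
docker network create my-net

# Start a web server and a helper container on that network
docker run -d --name web --network my-net nginx
docker run -d --name helper --network my-net busybox sleep 3600

# Containers on my-net can reach each other by name (built-in DNS)
docker exec helper ping -c 1 web

# Inspect the network to see connected containers and their IP addresses
docker network inspect my-net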

Docker Compose

Docker Compose is a tool that allows you to define and manage multi-container Docker
applications. It provides a simple way to define the services, networks, and volumes required for
your application using a YAML file. With Docker Compose, you can easily spin up and tear down
complex application stacks with just a few commands.

Here are the key features and concepts related to Docker Compose:

Compose File: A Compose file, usually named docker-compose.yml, is used to define the services,
networks, and volumes for your application. It is written in YAML format and specifies the
configuration options for each service in your application stack.

Services: A service in Docker Compose represents a containerized application component, such as a web server, database, or worker process. Each service is defined as a separate block in the Compose file and includes details like the image to use, environment variables, ports to expose, and volume mounts.

Networks: Docker Compose allows you to create custom networks for your services. Networks
provide a way for containers to communicate with each other. By default, Compose creates a bridge
network for your application stack, but you can define additional networks with specific
characteristics and attach services to them.

Volumes: Volumes in Docker Compose allow you to persist data generated by your containers.
Volumes can be created and attached to services, ensuring that data is stored outside of the
container's filesystem. This enables data to be retained even if containers are destroyed or recreated.

Environment Variables: Docker Compose allows you to specify environment variables for your
services. Environment variables can be used to configure your application's behavior, pass secrets,
or provide runtime parameters. Environment variables can be set directly in the Compose file or
loaded from external files.

Building Images: Docker Compose supports building custom images for your services using Dockerfiles. You can specify a build context and a Dockerfile for a service, and Compose will build the image before starting the container.

Service Dependencies: Docker Compose allows you to define dependencies between services. You
can specify that one service depends on another, and Compose will start the services in the correct
order, ensuring that dependencies are resolved before a service is started.

Scaling Services: Docker Compose makes it easy to scale services horizontally. You can define the
desired number of replicas for a service, and Compose will create and manage the specified number
of containers.

Command-line Interface: Docker Compose provides a command-line interface (CLI) for managing your application stack. You can use commands like up, down, start, stop, and logs to control the lifecycle of your containers.

Docker Compose simplifies the process of managing multi-container applications, allowing you to
define, configure, and orchestrate your application stack in a declarative manner. It's particularly
useful for development, testing, and staging environments, where you need to quickly spin up
consistent application stacks with multiple interconnected services.
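As a small illustration of these ideas, the shell snippet below writes a hypothetical two-service Compose file (a web application plus Redis) and brings the stack up; the service names, image names, ports, and volume are examples only, not taken from this document.

# Create a minimal docker-compose.yml with two services on a shared default network
cat > docker-compose.yml <<'EOF'
services:
  web:
    image: my-flask-app        # hypothetical application image
    ports:
      - "8000:80"              # host:container port mapping
    environment:
      - REDIS_HOST=redis       # the service name doubles as a DNS hostname
    depends_on:
      - redis
  redis:
    image: redis:7
    volumes:
      - redis-data:/data       # named volume for persistence
volumes:
  redis-data:
EOF

# Start the stack in the background, check status, then tear it down
docker compose up -d
docker compose ps
docker compose down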

Development Workflow in AWS

The development workflow in AWS (Amazon Web Services) typically involves several stages and
tools to facilitate the development, testing, and deployment of applications.

Here's a general overview of a development workflow in AWS:

Environment Setup: Set up your development environment by installing the necessary tools and
SDKs provided by AWS. This may include the AWS CLI (Command Line Interface), AWS SDKs
for your programming language, and any additional development tools or IDEs.

Code Development: Write your application code using your preferred programming language and
development environment. This may include writing serverless functions, building web applications,
or developing microservices.

Version Control: Use a version control system like Git to manage your codebase. Create a Git
repository and commit your code changes regularly. It's a best practice to use branches and pull
requests to manage feature development, bug fixes, and code reviews.

Continuous Integration/Continuous Deployment (CI/CD): Implement a CI/CD pipeline to automate the build, test, and deployment process. AWS offers several tools for CI/CD, including
AWS CodePipeline, AWS CodeBuild, and AWS CodeDeploy. These tools can be configured to
automatically build, test, and deploy your applications whenever changes are pushed to your code
repository.

Infrastructure as Code: Use infrastructure-as-code (IaC) tools like AWS CloudFormation or AWS
CDK (Cloud Development Kit) to define your application's infrastructure in code. This allows you
to provision and manage AWS resources, such as EC2 instances, databases, load balancers, and
security groups, using version-controlled templates or scripts.

Testing and Quality Assurance: Implement a testing strategy that includes unit tests, integration tests, and end-to-end tests. Use AWS testing services like AWS CodeBuild, AWS Device Farm, or
AWS Lambda for automated testing. Additionally, consider implementing code review processes
and code analysis tools to ensure code quality and adherence to best practices.

Deployment and Staging: Deploy your application to staging environments for further testing and
validation. Use AWS services like AWS Elastic Beanstalk, AWS App Runner, or AWS ECS
(Elastic Container Service) to deploy and manage your applications. You can also leverage services
like AWS CloudFront or Amazon S3 for content delivery and hosting static assets.

Monitoring and Logging: Implement monitoring and logging solutions to gain insights into the
health, performance, and behavior of your application. Use AWS services like AWS CloudWatch,
AWS X-Ray, or AWS CloudTrail for monitoring, logging, and tracing application activities.
Configure alarms and notifications to proactively detect and respond to issues.

Scaling and Optimization: Monitor the performance of your application and optimize it for
scalability and cost efficiency. Utilize AWS Auto Scaling to automatically adjust resource capacity
based on demand. Analyze metrics and logs to identify performance bottlenecks and optimize the
application's infrastructure and code accordingly.

Security and Compliance: Implement security measures to protect your application and data.
Follow AWS security best practices and leverage AWS services like AWS Identity and Access
Management (IAM), AWS Secrets Manager, AWS Key Management Service (KMS), and AWS
Certificate Manager to manage access, secrets, encryption, and SSL/TLS certificates.

This is a high-level overview of a typical development workflow in AWS. The specific tools and
services you use may vary based on your application requirements and development preferences.
AWS offers a wide range of services and features to support various development methodologies,
deployment models, and scalability needs.
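To make the workflow concrete, here is a minimal sketch of the kind of deploy script a CI/CD stage might run for a static front-end hosted on Amazon S3 behind CloudFront (as mentioned above); the use of npm, the bucket name, and the distribution ID are assumptions for illustration only.

# Hypothetical deploy step run after code is pushed and the pipeline is triggered
set -e                                   # stop on the first failing command

npm ci && npm run build                  # install dependencies and build the front-end
npm test                                 # run the automated test suite

aws s3 sync ./build s3://my-app-bucket --delete          # upload build artifacts
aws cloudfront create-invalidation \
  --distribution-id E1234567890ABC \
  --paths "/*"                           # invalidate cached content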

Amazon EC Services

AWS (Amazon Web Services) provides a comprehensive suite of EC (Elastic Compute) services
that offer scalable and flexible compute resources for running applications and workloads.

Here are some of the key EC services offered by AWS:


1. Amazon EC2 (Elastic Compute Cloud): Amazon EC2 is a foundational service that provides
resizable compute capacity in the cloud. It allows you to provision virtual servers, known as
EC2 instances, with a wide selection of instance types, operating systems, and configurations.
EC2 instances are highly customizable and can be used for a variety of purposes, including
hosting web applications, running batch processing jobs, and deploying containers.

2. Amazon EC2 Auto Scaling: EC2 Auto Scaling helps you maintain the desired number of EC2
instances in an EC2 Auto Scaling group automatically. It automatically scales the number of
instances based on predefined scaling policies, ensuring that your application can handle varying
levels of traffic and demand. EC2 Auto Scaling also integrates with other AWS services, such as
Amazon CloudWatch, to provide dynamic scaling capabilities.

3. AWS Lambda: AWS Lambda is a serverless compute service that allows you to run code
without provisioning or managing servers. You can write your code in various programming
languages and upload it to Lambda, which then handles the underlying infrastructure and
automatically scales your code in response to incoming requests. Lambda is commonly used for
executing short-lived functions and building event-driven architectures.

4. Amazon Elastic Container Service (ECS): ECS is a fully managed container orchestration
service that allows you to run Docker containers in the cloud. It provides a scalable and secure
platform for deploying, managing, and scaling containerized applications. ECS supports both
EC2 launch type, where containers run on EC2 instances, and Fargate launch type, which allows
you to run containers without managing the underlying infrastructure.

5. Amazon Elastic Kubernetes Service (EKS): EKS is a fully managed Kubernetes service
provided by AWS. It simplifies the deployment, management, and scaling of Kubernetes
clusters. With EKS, you can run containerized applications using Kubernetes and leverage the
rich ecosystem of Kubernetes tools and features. EKS integrates with other AWS services and
offers native integration with AWS Fargate for serverless container deployments.

6. AWS Batch: AWS Batch is a fully managed service for running batch computing workloads. It
allows you to execute jobs on EC2 instances and automatically provisions the necessary
resources based on your job's requirements. AWS Batch provides a managed environment for
scheduling, monitoring, and scaling batch jobs, making it suitable for a wide range of use cases,
including data processing, scientific simulations, and analytics.

7. AWS Outposts: AWS Outposts brings AWS infrastructure and services to your on-premises
data center or edge location. It extends the capabilities of AWS services, including EC2, EKS,
and ECS, to run locally on Outposts hardware. This allows you to leverage AWS services and
manage your on-premises and cloud workloads consistently.

These are just a few examples of the EC services offered by AWS. Each service provides specific
capabilities and features to cater to different use cases and workload requirements. AWS EC
services offer the flexibility, scalability, and reliability needed to build and run a wide range of
applications in the cloud.
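As a basic illustration of working with EC2 from the command line, the sketch below launches, lists, and terminates a single instance; the AMI ID, key pair, security group, and instance ID are placeholders.

# Launch a single t2.micro instance (AMI, key pair, and security group are placeholders)
aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type t2.micro \
  --count 1 \
  --key-name my-key-pair \
  --security-group-ids sg-0123456789abcdef0

# List instances with their public IP addresses and states
aws ec2 describe-instances \
  --query "Reservations[].Instances[].[InstanceId,PublicIpAddress,State.Name]" \
  --output table

# Terminate the instance when it is no longer needed
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0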



MODULE 3 DEVOPS

Introduction, Test Driven Development, Continuous Integration, Code coverage, Best Practices,
Virtual Machines vs Containers, Rolling Deployments, Continuous Deployment, Auto Scaling.
Case Study: Open Stack, Cloud based ML Solutions in Healthcare.

Introduction
 DevOps, short for Development and Operations, is a collaborative approach to software
development that emphasizes communication, collaboration, and integration between software
developers and IT operations teams.
 It aims to improve the efficiency and quality of software delivery by breaking down silos and
fostering a culture of shared responsibility and continuous improvement.
 In traditional software development processes, development and operations teams often work in
isolation, leading to issues such as slow and error-prone deployments, lack of visibility, and
frequent miscommunication.
 DevOps aims to address these challenges by promoting cross-functional collaboration and
automation.

Key principles and practices associated with DevOps include:

1. Collaboration: DevOps encourages collaboration and communication between developers, operations staff, quality assurance, and other stakeholders involved in the software development
lifecycle. This helps ensure that everyone is working towards common goals and enables faster
feedback loops.

2. Continuous Integration and Continuous Delivery (CI/CD): DevOps promotes the use of
automated tools and practices for integrating code changes frequently and delivering software in
small, incremental releases. CI/CD pipelines automate the building, testing, and deployment
processes, allowing teams to deliver software faster and with higher quality.

3. Infrastructure as Code (IaC): Infrastructure as Code is an approach where infrastructure provisioning, configuration, and management are treated as code. This means that infrastructure
components, such as servers, networks, and storage, are defined and managed using code and
version control systems. IaC enables reproducibility, scalability, and consistency in
infrastructure deployments.

4. Automation: Automation plays a crucial role in DevOps. By automating repetitive and manual
tasks, teams can reduce errors, improve efficiency, and focus on higher-value activities.
Automation can include tasks like testing, deployment, monitoring, and infrastructure
provisioning.

5. Monitoring and Feedback: DevOps emphasizes the importance of monitoring software and
infrastructure in production to gain insights into performance, reliability, and user experience.
Monitoring helps identify issues and provides feedback to guide improvements in future
development cycles.

6. Culture of Continuous Learning: DevOps promotes a culture of continuous learning and improvement. This includes fostering a blameless culture where mistakes are seen as learning
opportunities, encouraging knowledge sharing, and regularly reviewing processes to identify
areas for optimization.



DevOps practices are often supported by a variety of tools and technologies, such as version control
systems (e.g., Git), continuous integration servers (e.g., Jenkins), configuration management tools (e.g.,
Ansible), containerization platforms (e.g., Docker, Kubernetes), and monitoring solutions (e.g.,
Prometheus, Grafana).

Adopting DevOps practices can result in benefits such as faster time to market, improved software
quality, increased collaboration, better resource utilization, and more reliable and resilient systems.
However, implementing DevOps requires organizational buy-in, cultural shifts, and investment in
appropriate tools and training to be successful.

Why DevOps?
Before going further, we need to understand why we need DevOps rather than the earlier ways of working.

 The operations and development teams worked in complete isolation.
 After design and build, testing and deployment were performed as separate, sequential steps, which consumed more time than the actual build cycles.
 Without DevOps, team members spend a large amount of time on designing, testing, and deploying instead of building the project.
 Manual code deployment leads to human errors in production.
 Coding and operations teams have separate timelines and are not in sync, causing further delays.

DevOps History

 In 2009, the first conference named DevOpsdays was held in Ghent Belgium. Belgian consultant
and Patrick Debois founded the conference.
 In 2012, the state of DevOps report was launched and conceived by Alanna Brown at Puppet.
 In 2014, the annual State of DevOps report was published by Nicole Forsgren, Jez Humble,
Gene Kim, and others. They found DevOps adoption was accelerating in 2014 also.
 In 2015, Nicole Forsgren, Gene Kim, and Jez Humble founded DORA (DevOps Research and
Assignment).
 In 2017, Nicole Forsgren, Gene Kim, and Jez Humble published "Accelerate: Building and
Scaling High Performing Technology Organizations".

Features of DevOps

DevOps Architecture:

DevOps Life Cycle:

1) Continuous Development
This phase involves the planning and coding of the software. The vision of the project is decided during the planning phase, and the developers then begin developing the code for the application. There are no DevOps tools required for planning, but there are several tools for maintaining the code.

2) Continuous Integration
This stage is the heart of the entire DevOps lifecycle. It is a software development practice in which developers are required to commit changes to the source code frequently, which may be on a daily or weekly basis. Every commit is then built, and this allows early detection of problems if they are present. Building the code involves not only compilation but also unit testing, integration testing, code review, and packaging.

 The code supporting new functionality is continuously integrated with the existing code.
Therefore, there is continuous development of software. The updated code needs to be
integrated continuously and smoothly with the systems to reflect changes to the end-
users.

 Jenkins is a popular tool used in this phase. Whenever there is a change in the Git
repository, then Jenkins fetches the updated code and prepares a build of that code,
which is an executable file in the form of war or jar. Then this build is forwarded to the
test server or the production server.

3. Continuous Testing
In this phase, the developed software is continuously tested for bugs. For continuous testing, automation testing tools such as TestNG, JUnit, and Selenium are used. These tools allow QAs to test multiple code-bases thoroughly in parallel to ensure that there is no flaw in the functionality. In this phase, Docker containers can be used for simulating the test environment.

Selenium does the automation testing, and TestNG generates the reports. This entire testing phase can be automated with the help of a Continuous Integration tool called Jenkins.

Automation testing saves a lot of time and effort compared with executing the tests manually. Apart from that, report generation is a big plus: the task of evaluating the test cases that failed in a test suite gets simpler, and the execution of test cases can be scheduled at predefined times. After testing, the code is continuously integrated with the existing code.



4) Continuous Monitoring
Monitoring is a phase that involves all the operational factors of the entire DevOps process,
where important information about the use of the software is recorded and carefully processed
to find out trends and identify problem areas. Usually, the monitoring is integrated within the
operational capabilities of the software application.

Monitoring output may take the form of documentation files, or it may produce large-scale data about the application parameters while the application is in continuous use. System errors such as "server not reachable" or low memory are resolved in this phase. Continuous monitoring maintains the security and availability of the service.

5) Continuous Feedback
Application development is consistently improved by analyzing the results from the operation of the software. This is carried out by placing a constant-feedback phase between operations and the development of the next version of the application.
Continuity is the essential factor in DevOps, as it removes the unnecessary steps otherwise required to take a software application from development, use it to find its issues, and then produce a better version. Those extra steps reduce the efficiency the application could achieve and shrink the number of interested customers.

6) Continuous Deployment
In this phase, the code is deployed to the production servers. Also, it is essential to ensure that
the code is correctly used on all the servers.

 The new code is deployed continuously, and configuration management tools play an
essential role in executing tasks frequently and quickly.
 Here are some popular tools which are used in this phase, such as Chef, Puppet, Ansible,
and SaltStack.
 Containerization tools are also playing an essential role in the deployment phase.
Vagrant and Docker are popular tools that are used for this purpose. These tools help to
produce consistency across development, staging, testing, and production environment.
They also help in scaling up and scaling down instances softly.
 Containerization tools help to maintain consistency across the environments where the
application is tested, developed, and deployed. There is no chance of errors or failure in
the production environment as they package and replicate the same dependencies and
packages used in the testing, development, and staging environment. It makes the
application easy to run on different computers.



7) Continuous Operations
 All DevOps operations are based on continuity, with complete automation of the release process, allowing the organization to continually accelerate the overall time to market.

 It is clear from the discussion that continuity is the critical factor in DevOps: it removes the steps that distract development, lengthen the time needed to detect issues, and delay a better version of the product by several months.

 With DevOps, we can make any software product more efficient and increase the overall number of customers interested in the product.

DevOps Workflow:

DevOps Tools:



Advantages
 DevOps is an excellent approach for quick development and deployment of applications.
 It responds faster to the market changes to improve business growth.
 DevOps escalates business profit by decreasing software delivery time and transportation costs.
 DevOps streamlines the development process, which gives clarity on product development and delivery.
 It improves customer experience and satisfaction.
 DevOps simplifies collaboration and places all tools in the cloud for customers to access.
 DevOps means collective responsibility, which leads to better team engagement and
productivity.
Disadvantages
 DevOps professionals and expert developers are in short supply.
 Developing with DevOps is expensive.
 Adopting new DevOps technology in industry is hard to manage in a short time.
 Lack of DevOps knowledge can be a problem in the continuous integration of automation projects.

Test-Driven Development (TDD)


Test-Driven Development (TDD) is a software development practice that emphasizes writing tests
before writing the code. It follows a cycle of writing a failing test, writing the code to make the test
pass, and then refactoring the code to improve its design. TDD aims to improve code quality,
maintainability, and reliability by ensuring that code is thoroughly tested.

The TDD process typically involves the following steps:

1. Write a Test: In TDD, you start by writing a test that defines the desired behavior of a small
piece of functionality. This test is initially expected to fail since the corresponding code hasn't
been implemented yet.

2. Run the Test: The next step is to run the test and observe it fail. This step verifies that the test
is correctly detecting the absence of the desired functionality.

3. Write the Code: Now, you implement the minimum amount of code necessary to make the test
pass. The focus is on writing the simplest and most straightforward solution to fulfill the test's
requirements.

4. Run the Test Again: After writing the code, you rerun the test suite to verify that the new test
you wrote passes. At this point, you should have at least one passing test.

5. Refactor the Code: With passing tests, you can refactor the code to improve its design without
changing its behavior. This step ensures that the code remains clean, maintainable, and adheres
to best practices.

6. Repeat: The process is repeated for the next small piece of functionality. Each new test exposes
requirements for the code that you incrementally implement until all the desired functionality is
fulfilled.
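A minimal command-line sketch of this red-green-refactor loop, assuming a Python project with pytest installed and both files in the same directory; the calculator example and file names are hypothetical.

# Step 1: write a failing test first (red)
cat > test_calculator.py <<'EOF'
from calculator import add          # calculator.py does not exist yet

def test_add():
    assert add(2, 3) == 5
EOF
pytest test_calculator.py            # fails: the calculator module is missing

# Step 3: write the minimum code that makes the test pass (green)
cat > calculator.py <<'EOF'
def add(a, b):
    return a + b
EOF
pytest test_calculator.py            # the test now passes

# Step 5: refactor freely, re-running the suite after every change
pytest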



Benefits of Test-Driven Development:

Better Code Quality: TDD encourages developers to write code that is testable, modular, and loosely
coupled. The focus on writing tests first helps catch bugs early in the development process, resulting in
higher code quality.

Increased Confidence: By having comprehensive tests, developers gain confidence in making changes
to the codebase. Tests act as safety nets, allowing developers to refactor or add new features without
fear of breaking existing functionality.

Improved Design: TDD promotes good design principles such as separation of concerns and single
responsibility. By refactoring the code after each test passes, developers can continuously improve the
design, making it more maintainable and extensible.

Faster Debugging: When a test fails, it provides a clear indication of which specific functionality is
not working as expected. This speeds up the debugging process and helps pinpoint issues more
accurately.

Documentation and Specification: The tests act as living documentation of the codebase. They
provide examples of how the code should behave and serve as executable specifications for future
development and maintenance.

It's important to note that TDD is not a silver bullet and may not be suitable for all scenarios. Its
effectiveness can vary depending on the nature of the project, team dynamics, and other factors.
However, when used appropriately, TDD can be a valuable practice in software development, leading
to more robust and reliable code.

Continuous Integration
 Continuous Integration (CI) is a software development practice that involves frequently
integrating code changes from multiple developers into a shared repository.
 The main goal of CI is to detect integration issues and conflicts early by automatically building
and testing the software with each code commit.
 In a CI workflow, developers integrate their code changes into a central version control system,
such as Git, multiple times throughout the day.
 This triggers an automated build process, where the code is compiled, dependencies are
resolved, and the application or software is built. Following the build, a suite of automated tests
is executed to ensure that the integrated changes haven't introduced any regressions or defects.



Key elements and benefits of Continuous Integration include:

Automated Build and Test: CI relies on automated tools and scripts to build the software from source
code and run a suite of tests. This automation enables fast and reliable feedback on the health and
stability of the integrated code.

Early Detection of Issues: By integrating code frequently and running tests automatically, CI allows
for the early detection of integration issues, conflicts, and bugs. This helps identify and fix problems
when they are smaller and easier to address.

Collaboration and Communication: CI promotes collaboration among team members by encouraging frequent code integration and shared code repositories. It improves communication between developers,
as they need to coordinate their changes and resolve conflicts regularly.

Rapid Feedback Loop: CI provides developers with rapid feedback on the status of their code
changes. They receive immediate notifications if their changes break the build or fail any tests. This
short feedback loop enables quick identification and resolution of issues.

Code Quality and Stability: Continuous Integration helps maintain high code quality and stability. It
ensures that all code changes are built and tested in a consistent and repeatable manner, reducing the
risk of deploying faulty or unstable code to production.

Continuous Delivery Readiness: CI is often a prerequisite for Continuous Delivery (CD). By integrating code frequently and running automated tests, CI establishes a foundation of confidence and
quality required for seamless and frequent software deployments.

To implement Continuous Integration effectively, teams typically leverage CI/CD tools such as
Jenkins, Travis CI, CircleCI, or GitLab CI/CD. These tools automate the build and test processes,
provide reporting and notifications, and integrate with version control systems.

It's worth noting that while Continuous Integration focuses on code integration and automated testing,
Continuous Delivery (CD) and Continuous Deployment (CD) expand on CI by automating the entire
software delivery pipeline, including deployment to production environments. Together, CI/CD
practices contribute to a more streamlined and efficient software development process.
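A CI server essentially runs a scripted "check out, build, test" sequence on every push. Here is a minimal sketch of such a build script, assuming a Python project tested with pytest; the repository URL and file names are placeholders.

# Minimal CI build script: every push triggers a clean checkout, build, and test run
set -e                                         # fail the build on the first error

git clone https://github.com/example/my-app.git build-workspace
cd build-workspace

python3 -m venv .venv && . .venv/bin/activate  # isolated build environment
pip install -r requirements.txt                # resolve dependencies

pytest --junitxml=test-results.xml             # run the automated test suite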

How CI Can be Used?


Over the past few years, Continuous Integration has become one of the best practices for software
development. The goal is to detect the errors early on without having to wait until the end of the
project.

Here are some basic prerequisites for implementing Continuous Integration:

 Automating builds
 Automating testing
 A single source code repository
 Visibility of the entire process
 Real-time code access to everyone in the team



Importance of Continuous Integration
 Reduces Risk
 Better Communication
 Higher Product Quality
 Reduced Waiting Time

Benefits of Continuous Integration


 Risk Mitigation
 Quality Teams
 Increased Visibility

Challenges of Continuous Integration


 Change in Organizational Culture
 Difficult to Maintain
 Numerous Error Messages
Getting Started with Continuous Integration
Continuous Integration should be adopted gradually, and you may need to change the entire team culture in order to fully implement it. Here are five steps to get started with Continuous Integration:
1. Write tests for the most critical parts of the codebase
2. Run the tests automatically with a CI service on every push to the main repository
3. Make everyone in the team integrate their changes every day
4. As soon as the build is broken, fix it
5. For every new story that is implemented, write a test



Best Continuous Integration Practices
 Test-Driven Development
 Code Reviews and Pull Requests
 Optimized Pipeline Speed

Continuous Integration Tools and Services



Code coverage
Code coverage is a metric used in software testing to measure the extent to which the source code of
a software system is executed by the test suite. It quantifies the percentage of code lines, branches,
or statements that are covered by tests.

Code coverage is typically measured by running tests against the software and collecting
information on which parts of the code were executed during the test run. The collected data is then
used to calculate the coverage metrics.

There are different types of code coverage metrics:

1. Line Coverage: Line coverage measures the percentage of lines of code that are executed
during the test run. It indicates whether each line of code has been executed or not.

2. Branch Coverage: Branch coverage measures the percentage of branches or decision points
in the code that are executed during the test run. It checks if both the true and false branches
of each decision point have been executed.

3. Statement Coverage: Statement coverage measures the percentage of statements in the code that are executed during the test run. It determines whether each individual statement has been executed or not.

4. Function/Method Coverage: Function or method coverage measures the percentage of functions or methods that are executed during the test run. It checks if each function or method has been called at least once.

Code coverage helps assess the quality and thoroughness of the test suite. Higher code coverage
indicates that a larger portion of the code has been tested, potentially leading to the detection of
more bugs and ensuring that the code is exercised in various scenarios.

However, code coverage alone does not guarantee the absence of bugs or comprehensive
testing. It is possible to have high code coverage but still miss critical scenarios or edge cases.
Code coverage should be used as a tool to guide testing efforts, improve the effectiveness of test
suites, and identify areas that require more attention.

To measure code coverage, various tools and frameworks exist, such as JaCoCo, Cobertura,
Istanbul, and OpenCover, which provide insights into the coverage metrics of the codebase.
These tools can be integrated into the build process or test automation framework to generate
reports on code coverage.

 A code coverage tool works with a specific programming language. Apart from that, such tools can be integrated with:
 Build tools like Ant, Maven, and Gradle.
 CI tools like Jenkins.
 Project management tools like Jira, and more.



Benefits of Code Coverage
 Code Maintenance
 Bad Code Discovery
 Faster Product Building

Code Coverage Tools


 Default Visual Studio Code Coverage
 Cobertura
 Coverage.py
 SimpleCov
 Istanbul
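For instance, Coverage.py (listed above) is driven from the command line; a typical measurement run for a Python test suite looks like the following sketch, assuming pytest and coverage are installed.

# Run the test suite under coverage measurement
coverage run -m pytest

# Print a line-coverage summary per file, with missed line numbers
coverage report -m

# Generate a browsable HTML report in the htmlcov/ directory
coverage html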

Best practices
DevOps encompasses a wide range of practices aimed at improving collaboration, efficiency, and
automation in software development and operations.

Here are some best practices in DevOps:

Continuous Integration and Continuous Delivery (CI/CD): Implement automated CI/CD pipelines
to enable frequent and reliable software releases. Automate the build, test, and deployment processes to
ensure that code changes are integrated and delivered smoothly.

Infrastructure as Code (IaC): Use infrastructure as code principles and tools, such as Terraform or
CloudFormation, to define and manage infrastructure resources. This allows for reproducibility,
scalability, and version control of infrastructure deployments.

Configuration Management: Employ configuration management tools, like Ansible or Chef, to automate the provisioning and management of server configurations. Maintain consistency and enforce
desired configurations across different environments.



Version Control: Utilize version control systems, such as Git, for managing and tracking changes to
infrastructure code, scripts, and configurations. This enables collaboration, rollback capabilities, and a
clear audit trail.

Monitoring and Observability: Implement comprehensive monitoring and observability practices to gain insights into system performance, health, and user experience. Use tools like Prometheus, Grafana,
or ELK stack to collect and analyze metrics, logs, and traces.

Collaboration and Communication: Foster a culture of collaboration and communication between development, operations, and other stakeholders. Use collaboration tools like chat platforms, project
management systems, and video conferencing to facilitate effective communication.

Automated Testing: Implement automated testing practices at different levels, including unit tests,
integration tests, and system tests. Automated testing helps catch defects early, ensure code quality, and
improve overall software reliability.

Security and Compliance: Bake security and compliance practices into your DevOps processes.
Conduct regular security assessments, follow secure coding practices, and enforce compliance
requirements from the early stages of development.

Continuous Learning and Improvement: Encourage a culture of continuous learning and improvement. Conduct regular retrospectives to identify areas for enhancement, provide opportunities
for learning new technologies, and foster innovation within the team.

Cross-Functional Collaboration: Encourage cross-functional collaboration and knowledge sharing. Promote DevOps practices and principles across teams, breaking down silos, and fostering a shared
sense of ownership and responsibility.

Infrastructure Automation: Automate infrastructure provisioning, configuration management, and deployment processes using tools like Docker, Kubernetes, or serverless technologies. Infrastructure
automation reduces manual effort, improves consistency, and enables scalability.

Resilience and Disaster Recovery: Design systems with resilience in mind, implementing redundancy,
fault tolerance, and disaster recovery mechanisms. Conduct regular drills and testing to ensure
readiness for potential failures and minimize downtime.

Virtual Machines vs Containers


Virtual Machines (VMs) and containers are both technologies used for deploying and running software
applications, but they have different characteristics and use cases. Here's a comparison between VMs
and containers:

1. Virtual Machines:

Isolation: VMs provide strong isolation between applications and the host operating system. Each VM
has its own complete operating system and runs on a hypervisor that manages the hardware resources.
Resource Requirements: VMs are resource-intensive because each VM requires a separate operating
system and has its own memory, disk space, and CPU allocation. This can lead to higher resource
overhead and slower startup times.
Portability: VMs can be portable across different hypervisors and cloud platforms, allowing
applications to be migrated between environments. However, some level of configuration and



compatibility considerations may be required.
Management: Managing VMs involves managing the entire VM lifecycle, including provisioning,
patching, and updates of the operating system and applications within each VM.
Scalability: VMs can be individually scaled up or down, but scaling usually involves allocating
additional resources to the VM, including memory and CPU.
Use Cases: VMs are suitable for running applications that require complete isolation and run different
operating systems or versions. They are commonly used for running legacy applications, complex
software stacks, and applications with strict security requirements.

2. Containers:

Isolation: Containers provide lightweight application isolation by sharing the host operating system
kernel. Each container runs in its own isolated user space, but shares the host OS, which reduces
resource overhead and improves performance.
Resource Requirements: Containers have lower resource overhead compared to VMs because they
share the host operating system. Multiple containers can run on the same host with efficient resource
utilization.
Portability: Containers are highly portable and can run consistently across different environments,
including development, testing, and production. They provide application consistency and eliminate
potential environment-related issues.
Management: Container management platforms, such as Docker and Kubernetes, simplify the
management of containers, including deployment, scaling, orchestration, and automated lifecycle
management.
Scalability: Containers are designed for scalability and can be easily scaled horizontally by adding
more containers to handle increased workload. Container orchestration platforms provide dynamic
scaling based on demand.
Use Cases: Containers are well-suited for microservices architectures, modernizing applications, and
deploying cloud-native applications. They are used extensively in DevOps practices for building,
testing, and deploying applications in a fast and efficient manner.

Rolling Deployment
Rolling deployments, also known as rolling updates or rolling upgrades, are a deployment strategy used
in software development and operations to minimize downtime and ensure seamless updates of
applications or services. The rolling deployment approach involves gradually updating instances of an
application or service in a controlled manner while keeping the application available and responsive to
users. Here's how rolling deployments work:

1. Incremental Updates: Instead of updating all instances of an application or service simultaneously, rolling deployments update a subset of instances incrementally. This subset can be a fixed number or a percentage of the total instances.
2. Load Balancing: Rolling deployments leverage load balancers or similar mechanisms to
distribute traffic across both the existing and updated instances. This ensures that user requests
are evenly distributed and the application remains accessible throughout the deployment
process.
3. Update Process: The rolling deployment process typically starts by updating a small number of
instances, often one or a few at a time, with the new version or configuration changes. Once the
updated instances are verified to be running correctly and stable, the process continues with
updating the next batch of instances.



4. Monitoring and Validation: Monitoring tools and health checks are used to verify the health
and stability of each updated instance before moving on to the next batch. If any issues are
detected, the rolling deployment can be paused or rolled back to the previous version to
minimize the impact on users.

5. Gradual Rollout: The update process continues in a gradual and iterative manner until all
instances have been updated. This approach allows for monitoring the impact of the update and
quickly addressing any issues or regressions that may arise.

Benefits of Rolling Deployments:

1. Minimal Downtime: Rolling deployments minimize downtime as the application remains available throughout the update process. Users experience minimal disruptions or service interruptions.

2. Faster Rollback: If any issues are identified during the update, rolling deployments make it
easier to roll back to the previous version since only a subset of instances are updated at a time.

3. Risk Mitigation: Updating instances incrementally reduces the risk of widespread failures or
issues, as issues are isolated to the updated subset of instances, making it easier to identify and
address problems.

4. Continuous Availability: Rolling deployments enable continuous availability of the application or service, allowing users to access the application without interruption during the deployment process.

Considerations for Rolling Deployments:

1. Compatibility: Ensure backward compatibility between the existing and updated versions to
avoid compatibility issues or data inconsistencies during the rolling deployment process.

2. Monitoring and Health Checks: Implement robust monitoring and health check mechanisms
to detect issues promptly and validate the health of each updated instance.

3. Automation: Leverage automation tools and infrastructure-as-code practices to automate the rolling deployment process and minimize manual intervention.

4. Rollback Plan: Have a well-defined rollback plan in case issues arise during the deployment.
This includes having backups, restoring previous versions, and communicating the rollback
process to stakeholders.

Rolling deployments are commonly used in DevOps practices and are facilitated by containerization
technologies, orchestration platforms like Kubernetes, and continuous integration and deployment
(CI/CD) pipelines. They provide a way to update applications and services seamlessly, reduce
downtime, and improve overall deployment reliability.
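On Kubernetes, for example, a rolling update of a Deployment can be driven with a few kubectl commands; the deployment name my-app, the container name web, and the image tag are placeholders.

# Update the container image of a Deployment; Kubernetes replaces pods incrementally
kubectl set image deployment/my-app web=myuser/my-app:v2.0

# Watch the rollout progress and wait for it to complete
kubectl rollout status deployment/my-app

# If problems are detected, roll back to the previous revision
kubectl rollout undo deployment/my-app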



Continuous Deployment

Continuous Deployment is a software development practice where code changes are automatically
deployed to production environments as soon as they pass the necessary automated tests and quality
checks. It is an extension of continuous integration and continuous delivery (CI/CD) practices, enabling
a rapid and frequent release cycle.

Here's an overview of continuous deployment:

1. Automated Build and Test: Continuous Deployment relies on robust automation for building,
testing, and validating code changes. Automated processes ensure that code changes are
thoroughly tested to meet quality standards before deployment.

2. Integration with Version Control: Continuous Deployment is typically integrated with version
control systems, such as Git, to trigger deployment pipelines automatically when new code
changes are pushed or merged into the main branch.

3. Automated Deployment Pipelines: Deployment pipelines are set up to automate the entire
process from code commit to production deployment. These pipelines include stages for
building, testing, packaging, and deploying the application.

4. Continuous Testing: Continuous Deployment relies heavily on automated testing. Test suites,
including unit tests, integration tests, and end-to-end tests, are executed as part of the
deployment pipeline to ensure the quality and stability of the application.

5. Incremental Deployments: Continuous Deployment often employs strategies like rolling deployments or canary releases. These strategies allow gradual and controlled deployment of new code changes to production, minimizing the risk and impact of potential issues.

6. Monitoring and Feedback Loops: Continuous Deployment requires robust monitoring and
feedback mechanisms. Monitoring tools and real-time metrics help detect issues or anomalies in
the production environment, providing feedback to the development team for quick response
and resolution.

Four Activities of Continuous Deployment

1. Deploy – the practices necessary to deploy a solution to a production environment
2. Verify – the practices needed to ensure solution changes operate in production as intended before releasing them to customers
3. Monitor – the practices to monitor and report on any issues that may arise in production
4. Respond – the practices to address any problems rapidly which may occur during deployment



Enabling Continuous Deployment with DevOps

Benefits of Continuous Deployment:


1. Faster Time to Market: Continuous Deployment enables faster release cycles, allowing new
features and bug fixes to reach end-users more rapidly, improving the time to market.

2. Improved Collaboration: Continuous Deployment encourages collaboration among development, operations, and quality assurance teams. Automation and streamlined processes facilitate seamless coordination and feedback loops between teams.

3. Early Issue Detection: By continuously testing and validating code changes, issues or bugs can
be identified earlier in the development cycle, reducing the likelihood of major failures in
production.

4. Rapid Iteration and Innovation: Continuous Deployment fosters a culture of rapid iteration
and innovation. Developers can quickly receive user feedback and iterate on features,
incorporating improvements and new functionality in subsequent deployments.



Considerations for Continuous Deployment:

1. Test Coverage: Robust test suites are crucial for ensuring code stability and quality.
Comprehensive unit tests, integration tests, and end-to-end tests should be part of the automated
testing strategy.
2. Deployment Rollbacks: Continuous Deployment requires a well-defined rollback strategy in
case issues arise after a deployment. The ability to roll back to a previous known-good version
quickly is important to minimize any potential negative impact.
3. Monitoring and Alerting: Implementing effective monitoring and alerting systems is essential
to detect and respond to issues promptly. Real-time monitoring helps identify problems early
and trigger appropriate actions for resolution.
Continuous Deployment is a practice that requires careful planning, automated processes, and strong
collaboration between development, operations, and quality assurance teams. It empowers
organizations to release software faster, respond to user feedback quickly, and continuously deliver
value to end-users.

Auto Scaling
Auto Scaling is a key component of DevOps and cloud computing that enables automatic adjustment of
computing resources based on real-time demand. It allows applications to scale up or down
dynamically in response to changes in workload, ensuring optimal performance and cost efficiency.

Here's an overview of Auto Scaling in DevOps:

1. Scaling Policies: Auto Scaling uses predefined scaling policies to determine when and how to
scale resources. These policies define thresholds or metrics, such as CPU utilization or network
traffic, that trigger scaling actions.

2. Elasticity: Auto Scaling enables the automatic addition or removal of resources based on
demand. When the workload increases, additional instances or resources are provisioned to
handle the increased load. Conversely, when the demand decreases, excess resources are
automatically terminated or scaled down to save costs.

3. Load Balancing: Auto Scaling works in conjunction with load balancing mechanisms. As new
instances are added, load balancers distribute incoming traffic evenly across the instances,
ensuring efficient utilization of resources and high availability.

4. Monitoring and Metrics: Auto Scaling relies on real-time monitoring and metrics to make
scaling decisions. Metrics such as CPU utilization, network traffic, or application-specific
metrics are continuously monitored to determine when to scale up or down.

5. Integration with Orchestration: Auto Scaling is often integrated with container orchestration
platforms like Kubernetes or infrastructure management tools like AWS Auto Scaling. These
platforms provide automation capabilities and facilitate efficient scaling of resources.

6. Fault Tolerance: Auto Scaling enhances fault tolerance by ensuring that sufficient resources
are available to handle increased load or compensate for failed instances. It helps maintain high
availability and resilience of applications.
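
As a rough illustration of a threshold-based scaling policy (points 1, 2 and 4 above), the sketch below checks a CPU metric and adds or removes capacity. The helper functions get_average_cpu, add_instances and remove_instances are hypothetical stand-ins for a real monitoring and provisioning API such as AWS Auto Scaling or the OpenStack services discussed later.

    # Hypothetical threshold-based auto-scaling evaluation (illustrative sketch only).

    SCALE_OUT_THRESHOLD = 70.0   # % CPU above which we add capacity
    SCALE_IN_THRESHOLD = 30.0    # % CPU below which we remove capacity
    MIN_INSTANCES = 2
    MAX_INSTANCES = 10

    def evaluate_scaling(current_instances, get_average_cpu, add_instances, remove_instances):
        """Apply a simple scaling policy and return the new instance count."""
        cpu = get_average_cpu()
        if cpu > SCALE_OUT_THRESHOLD and current_instances < MAX_INSTANCES:
            add_instances(1)                   # scale out under load
            return current_instances + 1
        if cpu < SCALE_IN_THRESHOLD and current_instances > MIN_INSTANCES:
            remove_instances(1)                # scale in to save cost
            return current_instances - 1
        return current_instances               # workload within the target band

    # Example evaluation with stubbed metric and provisioning calls:
    new_count = evaluate_scaling(
        current_instances=3,
        get_average_cpu=lambda: 82.5,
        add_instances=lambda n: print(f"launching {n} instance(s)"),
        remove_instances=lambda n: print(f"terminating {n} instance(s)"),
    )
    print("instances now:", new_count)

In production, this evaluation would run continuously against real metrics, and the minimum and maximum bounds prevent runaway scaling in either direction.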



Benefits of Auto Scaling in DevOps:

1. Performance Optimization: Auto Scaling allows applications to dynamically scale resources
based on demand, ensuring optimal performance and responsiveness during peak periods while
minimizing resource waste during low-demand periods.
2. Cost Optimization: Auto Scaling helps optimize costs by automatically scaling resources based
on demand. It eliminates the need for manual adjustments and ensures that resources are
provisioned only when required, reducing unnecessary expenses.
3. Scalability and Elasticity: Auto Scaling provides scalability and elasticity by allowing
applications to handle increased load seamlessly. It ensures that resources are available to meet
user demand without overprovisioning.
4. High Availability: Auto Scaling, combined with load balancing, helps maintain high
availability by distributing traffic across multiple instances. If an instance fails, the load
balancer redirects traffic to healthy instances, minimizing service disruptions.
5. Automation and Efficiency: Auto Scaling automates the process of adjusting resources,
reducing manual effort and increasing operational efficiency. It frees up DevOps teams to focus
on other critical tasks and reduces the risk of human errors.

Considerations for Auto Scaling:

1. Monitoring and Alerting: Robust monitoring and alerting systems are essential to detect
changes in workload and trigger scaling actions promptly. Real-time metrics and proactive
monitoring help ensure timely resource adjustments.
2. Resource Constraints: Auto Scaling requires careful consideration of resource limits, such as
available compute resources, storage capacity, and network bandwidth. It is important to ensure
that sufficient resources are available to accommodate scaling demands.
3. Application Architecture: The application architecture should be designed to support Auto
Scaling. Applications should be stateless or capable of horizontal scaling, allowing new
instances to be added or removed seamlessly without impacting data consistency or user
experience.
4. Testing and Validation: Regular testing and validation of Auto Scaling configurations and
policies are crucial to ensure they function as expected. Load testing and performance testing
help identify any bottlenecks or limitations in the scaling capabilities.

Auto Scaling is a powerful tool in the DevOps toolbox that enables applications to scale dynamically
and efficiently based on workload fluctuations. It provides agility, cost optimization, and high
availability, allowing organizations to meet user demands effectively while optimizing resource
utilization.
OpenStack
OpenStack is an open-source cloud computing platform that provides infrastructure as a service (IaaS)
capabilities. It enables the creation and management of private and public clouds by providing a set of
software tools and components for building and managing cloud infrastructure.

Here are some key aspects of OpenStack:

1. Components: OpenStack is composed of multiple components that work together to deliver
cloud infrastructure services. Some of the core components include:
 Nova: Manages the compute resources and provides virtual machine (VM)
orchestration.
 Neutron: Handles the networking aspects, including virtual networks, routers, and
network security.
 Cinder: Offers block storage services, allowing the attachment of storage volumes to
VMs.
 Swift: Provides object storage for storing large amounts of unstructured data.
 Keystone: Manages identity and authentication services, enabling access control and
user management.
 Glance: Handles the management and retrieval of virtual machine images.
 Horizon: A web-based dashboard for managing and provisioning cloud resources.

2. Scalability and Flexibility: OpenStack is designed to be highly scalable and flexible, allowing
users to add and manage a large number of compute, storage, and networking resources. It can
be used to build private clouds within an organization's own data centers or public clouds for
providing services to external users.

3. Open Source Community: OpenStack is an open-source project with a large and active
community of contributors and users. The community-driven development model ensures
regular updates, bug fixes, and the addition of new features. It also promotes interoperability
and standards compliance across different OpenStack deployments.

4. APIs and Interoperability: OpenStack provides a rich set of APIs that allow users to interact
with and automate the management of cloud resources. These APIs enable integration with
other systems and tools, making it possible to build custom applications or leverage existing
cloud management tools.



5. Integration with Other Technologies: OpenStack can integrate with various complementary
technologies and frameworks, such as Kubernetes for container orchestration, software-defined
networking (SDN) solutions, and storage systems. This enables users to leverage OpenStack as
a foundation for building comprehensive cloud infrastructure platforms.

6. Use Cases: OpenStack is used by organizations of various sizes and across different industries.
It is particularly popular in industries such as telecommunications, academia, research, and
media, where there is a need for scalable, on-demand infrastructure. OpenStack can be used to
build private clouds for internal use or public clouds for offering cloud services to external
customers.
7. Ecosystem and Vendor Support: OpenStack has a wide ecosystem of vendors and service
providers offering commercial distributions, support, and professional services around
OpenStack deployments. This allows organizations to leverage expertise and get assistance in
implementing and managing OpenStack-based cloud infrastructure.
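
To give a feel for how the components and APIs described above (points 1 and 4) are consumed in practice, here is a short sketch using the openstacksdk Python library. It assumes a cloud entry named devstack is configured in clouds.yaml; Keystone handles authentication, while Nova, Glance and Neutron are reached through the same connection.

    # Minimal openstacksdk sketch: list resources from core OpenStack services.
    # Assumes a cloud named "devstack" is defined in clouds.yaml.
    import openstack

    conn = openstack.connect(cloud="devstack")   # Keystone handles authentication

    for server in conn.compute.servers():        # Nova: virtual machines
        print("server:", server.name, server.status)

    for image in conn.image.images():            # Glance: VM images
        print("image:", image.name)

    for network in conn.network.networks():      # Neutron: virtual networks
        print("network:", network.name)
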
How does OpenStack Work?
 At its core, OpenStack is a collection of scripts bundled into packages called projects, which
carry out the tasks that create and manage cloud environments. OpenStack relies on two other
forms of software to construct these environments:
 Virtualization, which provides a layer of virtual resources abstracted from the underlying hardware.
 A base operating system (OS) that executes the commands issued by the OpenStack scripts.
 In other words, all three technologies (virtualization, the base operating system, and OpenStack)
must work together.

Installation and Configuration of OpenStack

DevStack will install the following components:

 Compute Service - Nova
 Image Service - Glance
 Identity Service - Keystone
 Block Storage Service - Cinder
 OpenStack Dashboard - Horizon
 Network Service - Neutron
 Placement API - Placement
 Object Storage - Swift

Installation of OpenStack
Step 1: Update the Ubuntu System
Step 2: Create the Stack User
Step 3: Install Git
Step 4: Download OpenStack
Step 5: Create a DevStack Configuration File
Step 6: Install OpenStack with DevStack
Step 7: Access OpenStack in a Browser
Step 8: Create an Instance
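
Step 8 can also be performed programmatically. The following hedged sketch shows one way to create an instance with openstacksdk; the image, flavor and network names are assumptions and must match what your DevStack deployment actually provides.

    # Hypothetical example of Step 8 (create an instance) via openstacksdk.
    import openstack

    conn = openstack.connect(cloud="devstack")

    image = conn.compute.find_image("cirros-0.6.2-x86_64-disk")   # assumed image name
    flavor = conn.compute.find_flavor("m1.tiny")                  # assumed flavor name
    network = conn.network.find_network("private")                # assumed network name

    server = conn.compute.create_server(
        name="demo-instance",
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
    )
    server = conn.compute.wait_for_server(server)   # block until the instance is ACTIVE
    print("created:", server.name, server.status)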



OpenStack Architecture

Highlights of OpenStack

 OpenStack has made it possible for companies such as Bloomberg and Disney to handle their
private clouds at very manageable prices.
 OpenStack offers mixed hypervisor environments and bare metal server environments.
 RedHat, SUSE Linux, and Debian have all been active contributors and have been supporting
OpenStack since its inception.

Cloud based ML Solutions in Healthcare

Cloud-based machine learning (ML) solutions in healthcare have gained significant traction in recent
years, offering numerous benefits such as scalability, accessibility, cost-efficiency, and collaboration
opportunities. Here are some common applications of cloud-based ML solutions in the healthcare
industry:

1. Medical Imaging Analysis: Cloud-based ML solutions can be utilized to analyze medical
images, such as X-rays, CT scans, and MRIs. ML algorithms can assist in detecting anomalies,
identifying patterns, and aiding in the diagnosis of conditions like tumors, fractures, or other
abnormalities.

2. Predictive Analytics and Risk Stratification: By leveraging cloud-based ML models,
healthcare providers can analyze large volumes of patient data to identify risk factors, predict
disease progression, and stratify patients based on their likelihood of developing certain
conditions. This can support early intervention and personalized treatment plans.

3. Electronic Health Records (EHR) Analysis: Cloud-based ML platforms can analyze
structured and unstructured data from electronic health records to extract insights, identify
trends, and facilitate decision-making. ML algorithms can assist in automating tasks such as
data entry, anomaly detection, and predictive analytics for improved patient care.

4. Remote Patient Monitoring: Cloud-based ML solutions enable the analysis of data from
remote patient monitoring devices, wearables, and sensors. ML algorithms can detect patterns,
identify deviations, and provide real-time insights for proactive healthcare interventions,
especially for chronic disease management.

5. Drug Discovery and Personalized Medicine: Cloud-based ML platforms can analyze large-
scale genomic, proteomic, and metabolomic data to aid in drug discovery and development. ML
models can also facilitate precision medicine by predicting treatment responses based on an
individual's genetic profile, clinical data, and other relevant factors.

6. Health Chatbots and Virtual Assistants: Cloud-based ML-powered chatbots and virtual
assistants can provide personalized health information, answer queries, offer symptom
assessment, and provide recommendations for self-care. These solutions can enhance patient
engagement, provide 24/7 support, and triage healthcare resources effectively.

7. Healthcare Data Security and Privacy: Cloud-based ML solutions offer robust security
measures and compliance frameworks to protect sensitive patient data. Advanced encryption,
access controls, and data anonymization techniques help ensure data privacy and compliance
with regulations like HIPAA (Health Insurance Portability and Accountability Act).
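
As a small, hedged illustration of the risk-stratification idea in point 2, the sketch below trains a logistic regression model on entirely synthetic patient features using scikit-learn. The features, labels and thresholds are made up for demonstration; in a cloud setting the same workflow would typically run on a managed ML service rather than locally.

    # Illustrative risk-stratification model on synthetic data (scikit-learn).
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(42)
    n = 500
    # Synthetic features: age, BMI, systolic blood pressure (made-up data).
    X = np.column_stack([
        rng.normal(55, 12, n),    # age
        rng.normal(27, 4, n),     # BMI
        rng.normal(130, 15, n),   # systolic BP
    ])
    # Synthetic label: higher age/BMI/BP loosely increases risk of an adverse outcome.
    risk_score = 0.03 * X[:, 0] + 0.05 * X[:, 1] + 0.02 * X[:, 2] + rng.normal(0, 1, n)
    y = (risk_score > np.median(risk_score)).astype(int)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # Stratify new patients by predicted probability of the adverse outcome.
    probabilities = model.predict_proba(X_test)[:, 1]
    print("high-risk patients in test set:", int((probabilities > 0.7).sum()))
    print("test accuracy:", round(model.score(X_test, y_test), 3))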

Cloud providers like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform
offer a range of cloud-based ML services, tools, and infrastructure that healthcare organizations can
leverage. These services provide pre-built ML models, data storage and processing capabilities, and
scalable computing resources to support ML workflows in healthcare.

However, it's important to consider data governance, regulatory compliance, and ethical implications
when implementing cloud-based ML solutions in healthcare. Adhering to data protection regulations,
ensuring data privacy, and maintaining transparency in ML algorithms are critical for maintaining trust
and ethical standards in healthcare applications.

