Model driven provisioning in multi-tenant clouds
Atul Gohad
IBM India Software Lab, Bangalore
India.
agohad@in.ibm.com
Karthikeyan Ponnalagu
IBM Research India, Bangalore
India.
Nanjangud C. Narendra
IBM India Software Lab, Bangalore
India.
karthikeyan.ponnalagu@in.ibm.com
Abstract— In multi-tenant cloud systems today, provisioning
of resources for new tenancy is based on selection from a catalogue published by the cloud provider. The published images are
generally a stack of appliances with Infrastructure (IaaS) and
Platform (PaaS) layers and optionally Application layers (SaaS).
Such a ready-made model enables quicker and streamlined
resource provisioning to clients. However, this approach poses
certain challenges to clients in the short run and providers
in the long run. Unique tenancy requirements from each client
are forcibly generalized by selecting one of the available images
from the catalogue as the tenancy requirements are not modeled
or validated to start with. Moreover, resource provisioning is
mostly done towards addressing the peak load expectations in
the tenancy. Such a static approach does not help in adapting
to dynamically changing tenancy requirements, most often
leading to the tenants owning and subsequently paying for
more than what they need. In particular, provisioned resources
are expected to perform at the same level of quality without
accounting for their changing health.
In our paper, we propose an extensible dynamic provisioning
framework to address these challenges. We start with defining
a Tenancy Requirements Model (TRM) which helps map provisioned resources with tenants. The provisioned and candidate
resources are also modeled with their Quality of Service (QoS)
characteristics which we call Health Grading Model (HGM);
this helps in continuous monitoring and grading of resources
based on health parameters and enables health prediction for
future provisioning. Together, TRM and HGM allow dynamic
re-provisioning for existing tenants based on either changing
tenancy requirements or health grading predictions. We also
present algorithms for prediction based provisioning and tenancy requirement matching. We illustrate our ideas throughout
this paper with a running example, and present a proof-of-concept prototype implementation on IBM’s Rational Software
Architect modeling tool.
Keywords. multi tenant cloud, dynamic provisioning, predictive allocation
I. INTRODUCTION
Resource provisioning in the cloud is a challenging task, due
to the dynamism and heterogeneity inherent in cloud environments. This is caused by varying performance, workload and
application characteristics on the cloud [3]. A cloud provider
performs a crucial role in the selection, allocation and utilization
of cloud resources. This paper relates to the components of rapid provisioning, resource changing, monitoring
and reporting, metering, and Service Level Objective (SLO)
management, which form part of the overall cloud provider
function of provisioning/configuration¹. A cloud provider’s
computing resources are pooled to serve multiple consumers
using the multi-tenant model, with different physical and
virtual resources dynamically assigned and reassigned according to consumer demand. There is a sense of location
independence in that the customer generally has no control over or
knowledge of the exact location of the provided resources
but may be able to specify location at a higher level of
abstraction (e.g., country, state, or datacenter). Examples
of resources include storage, processing, memory, network
bandwidth, and virtual machines.
We would like to thank GR Gangadharan & Balaji Viswanathan for their early
feedback and Jinu M Airumalayil & Praveen S Rao for their help in
prototype implementation.
narendra@in.ibm.com
Traditional resource provisioning approaches help determine the number/type of resources required to meet SLOs.
These approaches typically involve the following: (a) Constructing an application performance model that predicts
the number/type of application instances required to handle
demand at each particular level, in order to satisfy Quality
of Service (QoS) requirements; (b) periodically predicting
future demand and determining resource requirements using
the performance model, and (c) automatically allocating resources using the predicted resource requirements. However,
the quality of chosen resources is assumed to remain constant
throughout the time period of provisioning. Also the status
of an entire system reflects the status of the component on
the system that has the most severe status. For example, if a
component within a system has a status of critical, the entire
system will have a status of critical, even if the critically
impacted component is not critical to the current tenancy
requirements within the provisioned system [6]. This gives
rise to the following limitations:
• Effect of health of the provisioned resources: The overall
system operations are affected due to continuous execution of processes, and broadly these effects can be
classified into the following: (i) slowing down of the
system due to operating system related issues such as
fragmentation, larger registries, additional process loads
and effects of thrashing [5], [1]; and (ii) the effect
of varying available memory (RAM, virtual memory,
persistent storage) for processing either due to increase
in the number of tenants or burst of load, or due to
the application performance and memory consumption
increasing over time.
• Effect of multi-tenancy dynamics: The overall system
operations are affected due to changes in resource requirements of existing or new tenants. These dynamics
can affect the agreed upon SLO of the provisioned
tenants. Typically, these can be classified as: (i) adding
another tenant (similar to increasing number of requests
for a service); (ii) non-availability of any service on
the provisioned resource due to maintenance and/or
upgrades; (iii) increased burst of load on a specific
multi-tenant application; and (iv) dynamically changing
tenancy requirements such as time bound ramp-up or
ramp-down in throughput requirements, e.g., increase
during peak hours (typically seasonal as in case of an
online retail website) and decrease during off peak/night
hours.
1 http://collaborate.nist.gov/twiki-cloud-computing/pub/CloudComputing/Meeting12AReferenceArchitectureMarch282011/NIST_CCRATWG_029.pdf
In this paper, we take a novel approach towards dynamic
resource provisioning and define a cloud provisioning system
based on what we define as Health Grading Model (HGM)
and Tenancy Requirement Model (TRM), and then provide
algorithms to dynamically provision the resources based on
tenancy matching to health metrics based on HGM, and
underpinned by tenancy requirements as specified in TRM.
The HGM helps in defining, quantifying and monitoring the
health of a resource, along with critical levels for each of
the monitored parameters. Similarly, TRM helps in defining
and quantifying the requirements of a specific tenancy. Our
dynamic resource provisioning algorithm ensures that the
set of provisioned resources are able to meet the changing
requirements of tenancy, without affecting the given tenancy
SLOs, using the best available set of healthy resources.
The health of a resource is constituted by a set of critical
parameters. Any change in the value of these parameters
affects the functioning of that resource, in turn affecting
the SLO of the tenancy provisioned on that resource. It
may be possible to augment a resource to increase certain
of its parameter values, either by tuning or by physically
choosing another similar resource with enhanced capability,
so as to increase its health. Likewise, any decrease in such
a parameter value would result in deterioration of the
resource health, and have a negative impact on the SLO
of the tenancy provisioned. For example, the health of a Server
System could be best attributed to the number of CPU cores
and CPU utilization over a period of time. Any variations in
these values will have an impact on the SLO of the provisioned
tenancy. We provide a detailed description of our HGM in
Section III-C.
The key contributions of our paper are:
1) Enable tenancy requirements to be captured and represented in a formalized model both for future reuse
considerations and also for continuous conformance of
provisioning
2) Enable certification and guarantee of accurate resource
provisioning based on TRM and HGM
3) Flexible switching between health monitoring for provisioned resources and health monitoring for the candidate resource pool, thus supporting both published
catalogue based tenancy provisioning, and made-to-order tenancy provisioning
4) Continuous monitoring and replacement of provisioned
systems based on changing health grading
5) Providing the best-fit resource for current tenancy
requirements and thus maximizing resource utilization
and reducing cost of hosting.
To the best of our knowledge, this is the first integrated
technique for dynamic resource provisioning based on resource health grading.
Our paper is organized as follows. Section II introduces
our running example, which we use throughout the rest of
our paper for illustration. The architecture and models used
to represent resource’s health and tenancy requirements are
explained in Section III. Our dynamic provisioning algorithm
is then described in Section IV, and is also illustrated via our
running example. Our prototype implementation is presented
in Section V, while related work is discussed in Section VI.
Finally, we present concluding remarks in Section VII.
II. RUNNING EXAMPLE
We consider an example cloud infrastructure as represented in Fig. 1 capable of hosting tenancy applications of
two types. First is the Online Travel Portal (OTP), which
is a content management system, wherein customer profiles
are maintained and which is used by customers to log in,
create and view travel requests. This application interacts
with the database to store the customer profiles, and also with
appropriate booking agencies to confirm travel bookings.
Second is the Mail Processing Application (MPA), a compute
intensive application that interacts with various mail server
accounts and filters feeds from received emails. The resource
types required to satisfy the functions of these applications
are depicted in Table I.
TABLE I
APPLICATION RESOURCE REQUIREMENTS IN AN EXAMPLE CLOUD
Resource Type ID   Resource Type     Required by application(s)
R1                 Database Server   OTP, MPA
R2                 File Server       OTP, MPA
R3                 Mail Server       MPA
R4                 Storage           OTP, MPA
R5                 Web Server        OTP
The tenancy requests ordered by the time at which they are
received and their requested functional and non-functional
requirements are depicted in Fig. 2. The functional and
non-functional requirements together determine the choice
of resources and that of the resource parameters to be
monitored, so as to satisfy the tenancy SLOs.
The tenancy instances satisfied by provisioning the allocated resources at time T1 are depicted in Table II. Note
that, from the resource pool some resources are used as
shared resources between different tenants, whereas some
are exclusively used for specific tenancy requests to satisfy
the requested tenancy requirements.
Fig. 1. Running example cloud infrastructure
Fig. 2. Tenancy Request Queue
TABLE II
SATISFIED TENANCY REQUESTS AT TIME T1
Tenancy instance id   Provisioned resource instances
TR1-OTP1              R1a, R2a, R4a, R5a
TR2-MPA1              R1b, R2b, R3a, R4b
TR3-OTP2              R1c, R2c, R4c, R5b
TR4-MPA2              R1c, R2c, R3a, R4c
The health of provisioned resources can be determined
using tools such as Collectd². The parameters that are of
special interest to meet the required SLOs as specified in
the tenancy requirements, are provided below. The mapping
between tenancy SLO requirements to the system parameters
to be monitored is currently manual, based on providers’
knowledge. In our future work, we plan to make this mapping
an automated function.
• CPUTime: The amount of time spent by the CPU in various states, most notably executing user code, executing
system code, waiting for IO-operations and being idle
• DiskAccess: The average time an I/O-operation took to
complete
• Memory: The amount of available free, used, cached and
buffered memory
• ServletRequests: Number of servlet requests served over
period of time, to determine the number of hits served
for an application
2 http://collectd.org
Some interesting research issues arise from this running
example. First, the currently available provisioning techniques would provide the estimated constant number of
resources required to be provisioned irrespective of changing
tenancy requirements at different times. Second, the state
of resources is assumed to be constant over the period of
provisioning.
TABLE III
SATISFIED TENANCY REQUESTS AT TIME T2
Tenancy instance id   Provisioned resource instances
TR1-OTP1              R1a, R2a, R4a, R5a
TR2-MPA1              R1b, R2b, R3a, R4b
TR3-OTP2              R1c, R2c, R4c, R5b
TR4-MPA2              R1c, R2c, R3a, R4c
TR5-OTP3              R1c, R2c, R3a, R5c
For example, at time instance T2 (see Table III), the
additional provisioning of tenancy TR5-OTP3 could make it
difficult to satisfy the SLO requirement on the
number of hits served, because resource R1c is now shared. Also, is
it really necessary to allocate a new Web Server instance (R5c)
to satisfy TR5-OTP3, or can the same request be satisfied by
using R5b without affecting the SLOs of both TR3-OTP2
and TR5-OTP3? Third, and more crucially, we would need
to determine the extent to which a given healthy resource can
be matched to suit varying requirements of the tenancy; for
instance, at time T2, application OTP is to be provisioned, for
which a healthy shareable resource of type database server
may not be available. The first two issues are addressed
by our work in modeling the TRM and HGM, and
the third issue is addressed by our dynamic provisioning
algorithm.
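The sharing concern raised for R1c can be checked mechanically from Tables II and III. The following is a minimal sketch that transcribes the two tables as Python dictionaries; the names `PROVISIONED_T1`, `tenants_per_resource` and `shared_resources` are illustrative and do not come from the paper.

```python
from collections import defaultdict

# Provisioned resource instances per tenancy, transcribed from Table II (time T1).
PROVISIONED_T1 = {
    "TR1-OTP1": ["R1a", "R2a", "R4a", "R5a"],
    "TR2-MPA1": ["R1b", "R2b", "R3a", "R4b"],
    "TR3-OTP2": ["R1c", "R2c", "R4c", "R5b"],
    "TR4-MPA2": ["R1c", "R2c", "R3a", "R4c"],
}
# Table III (time T2) adds one tenancy to the T1 allocation.
PROVISIONED_T2 = {**PROVISIONED_T1, "TR5-OTP3": ["R1c", "R2c", "R3a", "R5c"]}


def tenants_per_resource(provisioned):
    """Invert the tenancy -> resources mapping into resource -> tenancies."""
    usage = defaultdict(list)
    for tenancy, resources in provisioned.items():
        for resource in resources:
            usage[resource].append(tenancy)
    return dict(usage)


def shared_resources(provisioned):
    """Keep only the resource instances used by more than one tenancy."""
    usage = tenants_per_resource(provisioned)
    return {res: tenants for res, tenants in usage.items() if len(tenants) > 1}


shared_t1 = shared_resources(PROVISIONED_T1)
shared_t2 = shared_resources(PROVISIONED_T2)
print(sorted(shared_t1))      # instances shared at T1
print(shared_t2["R1c"])       # tenancies sharing the database server at T2
```

Under this reading, R1c moves from two sharing tenancies at T1 to three at T2, which is the load increase the text flags as a risk to the number-of-hits SLO.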
Fig. 3. System architecture.
III. ARCHITECTURE AND MODELS
A. System Architecture
The architecture of the proposed system is depicted in
Fig. 3. The cloud depicts the set of all available resources
in the cloud provider’s environment. These resources could
be of varying types, each with multiple instances.
The repository notation of Tenancy Requirements Model
(TRM) depicts multiple tenancy requirement models. Each
of these models provides details on specific tenancy such
as (a) the state (Created, Active, Paused) of the tenant, (b)
its functional and non-functional behaviour characteristics,
(c) constraints on inter-tenancy and intra-tenancy, and (d)
tenancy attributes. The repository notation of Catalogued
Tenancy Model (CTM) depicts a collection of past tenancy
requirements data along with the provisioned resource details, used to satisfy the tenancy behaviours. This repository
is typically helpful in determining the resource needs for a
new tenancy request, based on historical data. The repository
notation of Health Grading Model (HGM) consists of details
of health parameters that are of critical importance for each
provisioned resource satisfying the current tenancy. These
health grading models are continuously updated based on the
output of Continuous Health Grade Monitor. The changes
in resource health state (Good, Affected, Bad) are marked
based on the chosen monitoring policy for specific tenancy.
The resource health model defined in Section III-C explains
the details on resource health state and monitoring policy.
The provisioning plan is determined, validated and customized by the Provisioning Plan Identifier, Customizer and
Validator component. This component is primarily responsible for executing the algorithms that dynamically add
and remove resources, and for raising alerts and forecasting the state
of provisioned resources based on continuously changing
resource health grading and tenancy requirements.
The high level control flow of our proposed system is
shown in Fig. 4. First, the functional and non-functional
requirements are captured as provided by the business users.
These requirements are translated into a Tenancy Requirement Model (TRM). Second, a check is performed to determine a matching provisioning model from the catalogued
tenancy models, on the basis of the determined TRM. The
set of resources required to satisfy the tenancy is determined
based on the closest match found or, in case of no
matches, the TRMs are translated into resource requirements based on existing capacity planning tools [18]. Third,
from the pool of available resources, the required set of
resources is provisioned, and this set of resources is fed into
the Continuous Health Grade Monitor. The monitoring of
resource parameters can be done using specific monitoring
tools available for the resource; for example, on Linux, tools
such as Collectd can be used. Based on the criticality of
the resource health parameters involved and the monitoring
policy chosen, a tenancy is re-provisioned. At the end of
the tenancy period, the resources are released back to the
pool, ready to be used for the next tenancy request.
Fig. 4. Control Flow of Proposed System
B. Catalogued Tenancy Model (CTM)
The CTM, as represented in Fig. 5, has a set of provisioned
resources, along with their health grade models, used to
satisfy past tenancy requirement requests. This data set is
used to choose the appropriate set of resources to
be provisioned for satisfying a new tenancy. The mapping
between the TRM of a new tenancy request and the TRMs from the CTM
is used to determine this resource set. For example, in our
running example, the catalogued tenancy model has a dataset
corresponding to the satisfied request TR1-OTP1, which
has the provisioned resource set PR=(R1a,R2a,R4a,R5a); this
is used as a baseline to determine the required resource set for
TR3-OTP2. In cases where an appropriately
matching tenancy cannot be found in the CTM, the required
initial resource set is arrived at based on manual analysis
of the new TRM. In our future work we plan to address
various techniques that can be employed for such an
analysis.
Fig. 5. Representation of Catalogued Tenancy Model
C. Health Grading Model (HGM)
We define the health of a resource as a set of attributes
which govern the functioning of that particular resource. For
example, Health of CPU can be attributed to the parameters
of executing user code, executing system code, waiting for
I/O-operations and being idle. Likewise, the health of an
underlying infrastructure such as router or firewall can be
attributed to system uptime parameters of average running
time, or maximum system uptime reached over a certain time
duration.
In general we formulate the health of a resource as a
tuple $H = \{h_a, h_b, h_c, \ldots, h_n\}$ such that any change in the value
of these parameters results in a change in behaviour and
functioning of that resource. We also define the individual
threshold levels $T = \{t_a, t_b, t_c, \ldots, t_n\}$ corresponding to each
of the health parameters in $H$, and a critical flag for each health
parameter that is set whenever the parameter value
crosses above the threshold level. This helps in identifying
a resource that is critical for a given tenancy and needs to
be replaced.
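The pairing of health parameters, thresholds and critical flags can be illustrated with a small data structure. This is only a sketch under assumptions: the class name, the parameter names and the numeric values are hypothetical, and "crosses above" is read here as a strict comparison.

```python
from dataclasses import dataclass


@dataclass
class HealthParameter:
    name: str         # illustrative parameter name, e.g. "disk_access_ms"
    value: float      # current monitored value h_i
    threshold: float  # threshold level t_i

    @property
    def critical(self) -> bool:
        # The flag is set when the value crosses above its threshold.
        return self.value > self.threshold


def critical_parameters(health):
    """Names of the parameters whose critical flag is currently set."""
    return [p.name for p in health if p.critical]


# Hypothetical readings for the database server instance R1c.
r1c_health = [
    HealthParameter("disk_access_ms", value=12.0, threshold=10.0),
    HealthParameter("cpu_wait_pct", value=5.0, threshold=20.0),
]
print(critical_parameters(r1c_health))
```

A resource with a non-empty result would be a candidate for replacement for the tenancy that depends on those parameters.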
Also, we define a relative weighted score for resources
of the same type based on a segmented list of parameters
and their values. Each value is a relative
score in the range 0-100%. This helps in easily picking a healthier
resource from the given pool: the higher the score for a parameter, the healthier
the resource is compared to others. For example, the health
score of a system can be represented as the segmented list
CPU|Memory|Storage|N/W|Downtime with values of
80|94|100|72|95, respectively. A representation of the Health
Grading Model is shown in Fig. 6.
Fig. 6. Representation of Health Grading Model
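The segmented score lends itself to a simple parse-and-compare routine. The sketch below assumes the five segments named in the text; the pool contents are hypothetical (the first string reuses the example values from the text, the second is invented for comparison), as are the function names.

```python
SEGMENTS = ["CPU", "Memory", "Storage", "N/W", "Downtime"]


def parse_score(segmented):
    """Turn a string such as '80|94|100|72|95' into a per-parameter dict."""
    values = [float(v) for v in segmented.split("|")]
    if len(values) != len(SEGMENTS):
        raise ValueError("expected one value per segment")
    return dict(zip(SEGMENTS, values))


def healthiest(pool, parameter):
    """Pick the resource id with the highest relative score for a parameter."""
    return max(pool, key=lambda rid: pool[rid][parameter])


# Hypothetical pool of two database server instances.
pool = {
    "R1a": parse_score("80|94|100|72|95"),
    "R1b": parse_score("90|60|100|85|99"),
}
print(healthiest(pool, "CPU"))
print(healthiest(pool, "Memory"))
```

Comparing per parameter, rather than on a single total, keeps the choice tied to whichever parameter matters for the tenancy at hand.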
The selection of parameters that constitute resource health
determination, can be based on the monitoring policies defined below. The tenant can specify its choice of monitoring
policy at the time of request for each tenancy.
• Aggregation Policy: This policy makes use of any one
aggregate function such as (i) Sum of all parameters, (ii)
average of all parameters (iii) Only top m parameters,
m ≤ n.
• Non-Functional Requirements Policy: In this policy, the
user provides a list of non-functional requirements,
which feed into a parameter selection function. This
function chooses a set of all parameters that could in any
way affect the given set of non-functional requirements.
An average-of-all-parameters aggregation policy is then
applied to this selected set of parameters. The function
that chooses the set of parameters, given a set of non-functional requirements, could be based on past historical data of the tenancy and non-functional requirements,
or be performed manually by the cloud provider.
• Individual Policy: In this policy each resource parameter
is monitored for its health. The parameters that could be
monitored are limited in scope only by the resource type
and their capabilities in exposing the parameters that
could be monitored. The monitoring of these parameters
could be either done manually or based on available
tooling corresponding to a particular resource.
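The three monitoring policies can be sketched as small functions over a dictionary of per-parameter scores. Several details are assumptions made for illustration: "top m" is read as the m highest-scoring parameters, and the mapping from non-functional requirements to parameters (`NFR_TO_PARAMS`) is a hypothetical provider-supplied table.

```python
def sum_policy(scores):
    """Aggregation policy (i): sum of all parameters."""
    return sum(scores.values())


def average_policy(scores):
    """Aggregation policy (ii): average of all parameters."""
    return sum(scores.values()) / len(scores)


def top_m_policy(scores, m):
    """Aggregation policy (iii): keep only the top m parameters, m <= n."""
    if not 1 <= m <= len(scores):
        raise ValueError("m must satisfy 1 <= m <= n")
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:m])


def nfr_policy(scores, nfrs, nfr_to_params):
    """Select parameters affecting the given NFRs, then average them."""
    selected = set()
    for nfr in nfrs:
        selected |= set(nfr_to_params.get(nfr, ()))
    chosen = {p: v for p, v in scores.items() if p in selected}
    if not chosen:
        raise ValueError("no monitored parameter affects the given NFRs")
    return average_policy(chosen)


def individual_policy(scores):
    """Individual policy: every parameter is reported on its own."""
    return dict(scores)


scores = {"CPU": 80, "Memory": 94, "Storage": 100, "N/W": 72, "Downtime": 95}
# Hypothetical mapping maintained by the cloud provider.
NFR_TO_PARAMS = {"number_of_hits": {"CPU", "N/W"}, "required_uptime": {"Downtime"}}

print(average_policy(scores))
print(nfr_policy(scores, ["number_of_hits"], NFR_TO_PARAMS))
```

Keeping the parameter selection step separate from the aggregation step mirrors the text, where the selection function may be driven by historical data or set by hand.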
For example, in our running example, consider resource
R1C, a Database Server instance which is shared
by tenancy instances TR3-OTP2 and TR4-MPA2 at time T1,
and additionally by tenancy instance TR5-OTP3 at time T2. We
consider disk access to be one of the crucial parameters to be
monitored for a resource of this type. With ideal resource
behavior, without any fluctuations in the disk access health
parameter, the utilization and provisioning of tenancy
instances would be as shown in Fig. 7. The health of the disk
access parameter is assumed to remain constant at 100%
of the device capability over the period of tenancy, and the
resource utilization remains constant at 70% until time T2,
when, due to the additional tenancy of TR5-OTP3, the resource
utilization goes up to 90%.
Fig. 7. Ideal Occupancy for Resource instance R1C.
However, due to fluctuations in the disk access health
parameter, which varies across time instances,
the resource utilization as well as the number of tenancies
that can be provisioned on a specific resource could vary as
shown in Fig. 8.
Fig. 8. Effect of Disk Access Health Changes on Resource instance R1C.
Some interesting observations to be noted are:
1) As the health of a crucial parameter deteriorates,
the actual utilization of the resource increases so as to
maintain the tenancy SLO requirement.
2) Due to the increase in utilization, the resource allocation algorithm needs to make judgements so as to not
choose any resource with affected health for further
tenancy requests. For example, as seen at time instance
T2, tenancy TR5-OTP3 could not be provisioned on
R1C, since the current resource utilization does not allow
for any new allocations.
3) It may also happen that, due to the affected health (AH)
of a resource, some of the already provisioned tenants
need to be reallocated to another healthy system, so as
to maintain tenancy SLO requirements. For example,
in our running example, if the health of the disk access
parameter on resource R1C deteriorates to less than
70% at time T2 + D1, then one of the provisioned
tenants (TR3-OTP2 or TR4-MPA2) would need to be
deallocated from resource R1C. This is required because,
at 70% health, the resource utilization would reach
exactly 100%, which cannot be sustained
at all times in practice.
In our approach, we define such a point as the resource burn-out point, and the corresponding health parameter value(s)
as the health-breakdown point(s) of the resource with a given set
of tenants at a specific time. Our provisioning allocation
algorithm therefore allows the cloud provider to specify
the value of the resource burn-out threshold RBTh and also
the health-breakdown threshold HBTh for each health
parameter. Based on the values specified for these thresholds,
the reallocation algorithm ensures that tenancies are
shifted to other healthy systems, thereby maintaining tenancy
SLOs.
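The arithmetic behind the burn-out point can be made explicit. The sketch below assumes a simple inverse relation, effective utilization = nominal utilization / health fraction. The paper does not state this formula; it is chosen only because it reproduces the example in the text, where 70% nominal utilization at 70% health yields exactly 100%. The function names are illustrative.

```python
def effective_utilization(nominal_pct, health_pct):
    """Utilization needed to hold the SLO when health drops (assumed model)."""
    if health_pct <= 0:
        raise ValueError("health must be positive")
    return nominal_pct * 100.0 / health_pct


def health_breakdown_point(nominal_pct, rbth_pct=100.0):
    """Health level at which effective utilization reaches the burn-out threshold RBTh."""
    return nominal_pct * 100.0 / rbth_pct


def needs_reallocation(nominal_pct, health_pct, rbth_pct=100.0):
    """True when the resource has reached or passed its burn-out threshold."""
    return effective_utilization(nominal_pct, health_pct) >= rbth_pct


# R1C with two tenants: 70% nominal utilization, as in the running example.
print(effective_utilization(70, 100))   # healthy disk access
print(effective_utilization(70, 70))    # health degraded to 70%
print(needs_reallocation(70, 65))       # below the health-breakdown point
```

Setting RBTh below 100% in this model moves the health-breakdown point upward, so tenancies are shifted before the resource is fully saturated.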
D. Tenancy Requirement Model (TRM)
A Tenancy Requirements Model (TRM), as represented in
Fig. 9, comprises behavior, constraints and state information.
Tenant behavior, also referred to as tenant functionality, is
used to represent a tenant’s functional requirements and
contains inputs and outputs.
Tenant constraints help specify a tenant’s conditions. An
example of a tenant constraint is “Use shared resources’ allow = No”, where the tenant requires that it be hosted on exclusive resources only. Tenant state information identifies
the state a tenant could be in at any point during its hosted
period in a multi-tenant system. A tenant could be in one
of these states: created, active or paused. Initially, a tenant is
in the created state. A tenant’s RequiredUptime constraint is
compared with the current time to identify its present state.
If it is found to be active, it is allocated resources and then
set to the active state. If the tenant is found to be inactive at
a certain point, i.e., its RequiredUptime constraint evaluates to
false for the current time, its resources are reclaimed and it
is set to the paused state. For example, the RequiredUptime
constraint “RequiredUptime > 06:00 and RequiredUptime
< 18:00” implies that a tenant would be active from
6 AM to 6 PM only, and during any other hour the tenant
would be set to the paused state. Fig. 10 shows the states in
which a tenant can persist and the corresponding inter-state
transitions. Tenant factors include non-functional aspects of
a tenant. For example, in our running example the number
of hits and the required uptime are included in the tenancy factors.
The tenant factors value of a tenant represents the combined
effect of each of the above-mentioned factors. It provides
a measure of a tenant’s efficiency and helps determine the
factors governing the SLOs of a tenant.
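The tenant state transitions driven by RequiredUptime can be sketched as a small state machine. Two points are assumptions: the strict inequalities follow the example constraint literally, and a tenant that is outside its uptime window while still in the created state is left unchanged, since the text does not say how that case is handled. All names are illustrative.

```python
from datetime import time
from enum import Enum


class TenantState(Enum):
    CREATED = "created"
    ACTIVE = "active"
    PAUSED = "paused"


def within_required_uptime(now, start=time(6, 0), end=time(18, 0)):
    """Evaluate 'RequiredUptime > start and RequiredUptime < end' for the current time."""
    return start < now < end


def next_state(current, now, start=time(6, 0), end=time(18, 0)):
    """Return the tenant state after checking RequiredUptime at time `now`."""
    if within_required_uptime(now, start, end):
        return TenantState.ACTIVE   # resources would be allocated here
    if current is TenantState.ACTIVE:
        return TenantState.PAUSED   # resources would be reclaimed here
    return current                  # assumption: created/paused stay as they are


state = TenantState.CREATED
for clock in (time(9, 0), time(20, 0), time(7, 30)):
    state = next_state(state, clock)
    print(clock, state.value)
```

Resource allocation and reclamation are left as comments so that the sketch stays focused on the state logic.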
IV. DYNAMIC PROVISIONING ALGORITHMS
In this section, we discuss our dynamic provisioning
algorithms, viz., matching TRM to CTM (Algorithm 1),
health grade monitoring (Algorithm 2) and actual dynamic
provisioning based on TRM-CTM matching and health grade
monitoring (Algorithm 3).
Fig. 9. Representation of Tenancy Requirement Model
Fig. 10. Tenant-state information for a single tenant
Algorithm 1 is used to determine the set of resources
required to satisfy tenant requirements. This matching is
performed based on the data captured in TRM as depicted in
Fig. 9, and the catalogue of provisioned resources for similar
tenancy requirements captured in CTM as depicted in Fig. 5.
In our running example, for the multiple tenancy requests of
application OTP, viz., TR3-OTP2 and TR5-OTP3 the set of
required resources is obtained based on matching catalogued
tenancy model found from previously provisioned tenancy of
TR1-OTP1.
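The matching step of Algorithm 1 can be sketched in Python as follows. The repository layout and the requirement keys (`application`, `shared_resources`) are hypothetical; only the tenancy ids and resource sets are taken from Table II. The sketch returns the first catalogued model whose values agree with the request on every requirement parameter, or `None` for the "Not Found" case.

```python
def match_trm_to_ctm(trm_values, repository):
    """Return the first catalogued model matching the TRM values, else None."""
    for ctm in repository:
        ctm_values = ctm["requirements"]
        # Only consider catalogued models that cover every requested parameter.
        if not set(trm_values) <= set(ctm_values):
            continue
        if all(ctm_values[name] == value for name, value in trm_values.items()):
            return ctm
    return None


# Hypothetical repository built from previously satisfied requests.
CTM_REPOSITORY = [
    {
        "tenancy": "TR1-OTP1",
        "requirements": {"application": "OTP", "shared_resources": "allowed"},
        "resources": ["R1a", "R2a", "R4a", "R5a"],
    },
    {
        "tenancy": "TR2-MPA1",
        "requirements": {"application": "MPA", "shared_resources": "allowed"},
        "resources": ["R1b", "R2b", "R3a", "R4b"],
    },
]

# A new OTP request (such as TR3-OTP2) reuses the TR1-OTP1 baseline.
baseline = match_trm_to_ctm(
    {"application": "OTP", "shared_resources": "allowed"}, CTM_REPOSITORY
)
print(baseline["tenancy"], baseline["resources"])
```

The matched resource set gives resource types to provision; choosing concrete instances is left to the provisioning step.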
Algorithm 2 is used to periodically update the resource
health grading model. Changes in the monitored parameter
values are updated, and used to arrive at relative resource
health grading score. The set of chosen parameters to be
monitored is as explained in Section III-C. In our running
example, referring to Fig. 8, the HGM for resource R1C,
gets updated at various time instances of T1+D1, T1+D2,
T2 and T2+D1. Similarly, the monitoring and updating of
HGM is periodically performed for each resource in the set
of provisioned resources.
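The update step of Algorithm 2 can be sketched as a function that compares the stored health values with freshly monitored ones. The parameter names and values are hypothetical. One choice here goes beyond the listing: the updated model starts as a copy of the original, so unchanged parameters carry over explicitly.

```python
def update_health(original, monitored):
    """Return (updated_health, changed) after one monitoring pass."""
    updated = dict(original)  # unchanged parameters are carried over
    changed = False
    for name, value in monitored.items():
        if original.get(name) != value:
            updated[name] = value
            changed = True
    return updated, changed


# Hypothetical HGM entry for R1C and a later reading with degraded disk access.
hgm_r1c = {"disk_access_pct": 100, "memory_pct": 94}
reading = {"disk_access_pct": 80, "memory_pct": 94}

hgm_r1c, changed = update_health(hgm_r1c, reading)
print(hgm_r1c, changed)
```

Running this once per monitoring interval for each provisioned resource corresponds to the periodic update described above.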
Algorithm 3 is used to determine a critically affected
provisioned resource and re-provision a new resource instead. This algorithm is executed as part of the “Provisioning Plan Identifier, Customizer and Validator” component of the proposed system architecture depicted in Fig. 3.
The inputs to this algorithm are: ResourcePool, ProvisionedResourceSet, HealthParametersOfResource, ThresholdParametersOfResource, MonitoringPolicy. This algorithm
works as follows. First, the criticality of resource health parameters for the tenancy provisioning is determined for each
resource in ProvisionedResourceSet and each HealthParametersOfResource. This is based on the MonitoringPolicy and
a check to determine whether the current parameter value is above
the threshold range. Second, the resource with a critical
health parameter is marked for replacement, and a new
resource is allocated from the ResourcePool. Third, the set
of provisioned resources is updated by replacing a critically
affected resource with the newly provisioned resource. In our
running example, this algorithm helps in arriving at a new
set of provisioning resources, inclusive of resource R1D, to
be employed at time instance T2, since the resource R1C
is not suitable to satisfy the tenancy request of TR5-OTP3.
Algorithm 1 Tenancy Requirement Model matching Catalogued Tenancy Model Algorithm
CurrentTenancyRequirementModel = TRM;
CataloguedTenancyRequirementModel = CTM;
CataloguedTenancyModelRepository = CTMR;
CurrentTenancyRequirementModelValues of TRM: TRM(Val) = {RequirementParams(TRM, j), 1 ≤ j ≤ k};
CataloguedTenancyRequirementModelValues of CTM: CTM(Val) = {RequirementParams(CTM, j), 1 ≤ j ≤ k};
requirementMatch = FALSE;
for all CTM(CTMR) such that {TRM_i} ⊆ {CTM} do
    if TRM(Val) == CTM(Val) then
        requirementMatch = TRUE;
        Return(CTM)
    end if
end for
Return(Not Found)
Algorithm 2 Health Grade Monitoring Algorithm
ResourceMonitored = R;
Original HealthOfResource = H;
Updated HealthOfResource = H′;
Original HealthParamValues of H: HO = {HealthParamValues(H, j), 1 ≤ j ≤ k};
Monitored HealthParamValues of H: HM = {HealthParams(H, j), 1 ≤ j ≤ k};
Updated HealthParamValues of H′: HU = {HealthParamValues(H′, j), 1 ≤ j ≤ k};
paramVariant = FALSE;
for 1 ≤ j ≤ k do
    if HO(j) != HM(j) then
        HU(j) = HM(j);
        paramVariant = TRUE;
    end if
end for
if paramVariant = TRUE then
    HO = HU;
    Return(H′)
else
    Return(H)
end if
V. PROTOTYPE IMPLEMENTATION
In our experimental setup, the OTP and
MPA applications were hosted on servers with 3.1 GHz processor speed.
These machines had a varying number of processor cores per
resource instance, as follows: Database server: 8 cores,
Web Servers: 4 cores, Mail and File servers: 2 cores. Our
prototype implementation is built as a plug-in on IBM’s
Rational Software Architect tool³, as depicted in Fig. 11.
It has a Tenancy Modelling view to capture functional and
non-functional tenancy requirements. The Tenancy-Resource
Monitoring Parameters view lists the resource parameters from
which health is determined. The Cloud Resources Model
view can be used to model resources in the cloud. The health
status gauge for instances is shown in the Resource Health Status
view. The graph in the Relative Health Grade Monitor view
is a snapshot taken across time periods of a particular day.
The Non-Functional Requirements policy with the average of all
monitored health parameters was used to arrive at relative
health scores for each of the database instances R1a, R1b and
R1c. The threshold range for Good Health was specified
as (100%-85%), Minor Affected Health as (85% to 70%),
Severe Affected Health as (70% to 50%) and Bad Health as
(less than 50%). As can be seen, a reallocation of instance
R1c is required during the time period of 11:00 AM to 3:00 PM,
since it is in the Bad Health region.
3 http://www.ibm.com/developerworks/rational/products/rsa/
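The health bands used in the prototype can be expressed as a small classification function. The stated ranges share their boundary values (85% and 70% each appear in two bands), so this sketch assumes that a boundary score belongs to the healthier band; the function name is illustrative.

```python
def health_grade(score):
    """Map a relative health score (0-100) to the prototype's health bands."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 85:
        return "Good Health"
    if score >= 70:
        return "Minor Affected Health"
    if score >= 50:
        return "Severe Affected Health"
    return "Bad Health"


def requires_reallocation(score):
    """A resource in the Bad Health region is a candidate for reallocation."""
    return health_grade(score) == "Bad Health"


for value in (92, 85, 70, 50, 42):
    print(value, health_grade(value), requires_reallocation(value))
```

With these bands, an instance such as R1c would be flagged only while its averaged score stays below 50%.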
Algorithm 3 Dynamic Provisioning Algorithm
 1: ResourcePool = P;
 2: ProvisionedResourceSet = PR;
 3: ProvisionedResourceSet.Current = CPR;
 4: CPR.Health = CPR.h;
 5: CPR.threshold = CPR.t;
 6: CPR.predictedHealth = CPR.ph;
 7: ProvisioningResourceSet.New = PS.new;
 8: Resource values of CPR: CPRv = {Resources(R, j), 1 ≤ j ≤ k};
 9: Resource health of CPR.h: CPR.hv = {HealthParams(CPRv, j), 1 ≤ j ≤ k};
10: Resource health threshold of CPR.t: CPR.t = {ThresholdParams(CPRv, j), 1 ≤ j ≤ k};
11: Predicted health of CPR.phv: CPR.phv = {HealthParams(CPRv, j), 1 ≤ j ≤ k at t = Tn};
12: ReplaceResource = FALSE;
13: ParamCritical = FALSE;
14: MonitoringPolicy = MP.AGGR | MP.NFR | MP.IP;
15: for all Resources(CPR) such that {CPR.h_i} ⊆ {CPR.t} do
16:     for 1 ≤ j ≤ k do
17:         if CPR.h(j) ≥ CPR.t(j) then
18:             CPR.ph(j) = CPR.t(j);
19:             ParamCritical = TRUE;
20:         end if
21:         if ParamCritical = TRUE then
22:             PS.new(j) = (P.allocate CPRj);
23:             ReplaceResource = TRUE;
24:         else
25:             PS.new(j) = CPR(j);
26:         end if
27:     end for
28: end for
29: if ReplaceResource = TRUE then
30:     Return(PS.new)
31: else
32:     Return(CPR)
33: end if
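As an illustration only, the control flow of Algorithm 3 might be sketched in Python as follows. This is a simplified reading rather than the prototype's code. The `Resource` class, the `reprovision` function and the `allocate` callback are hypothetical names. The sketch assumes that a parameter is critical when its monitored value reaches or exceeds its threshold (the comparison used in the listing), and that the critical flag is evaluated separately for each resource, which the listing leaves implicit. The predicted-health update is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Resource:
    """Hypothetical stand-in for a provisioned resource (an element of CPR)."""
    name: str
    health: Dict[str, float]     # monitored value per parameter (CPR.h)
    threshold: Dict[str, float]  # threshold per parameter (CPR.t)


def reprovision(current: List[Resource],
                allocate: Callable[[Resource], Resource]) -> List[Resource]:
    """Return a new provisioning set if any resource is critical.

    `allocate` stands in for taking a replacement from the resource pool P.
    """
    new_set: List[Resource] = []
    replace_resource = False
    for res in current:
        # Assumption: the flag is computed per resource, not carried over.
        param_critical = any(
            res.health[p] >= res.threshold[p] for p in res.threshold
        )
        if param_critical:
            new_set.append(allocate(res))  # PS.new(j) = P.allocate(...)
            replace_resource = True
        else:
            new_set.append(res)            # PS.new(j) = CPR(j)
    return new_set if replace_resource else current


if __name__ == "__main__":
    # Hypothetical values; "cpu" is an illustrative parameter name.
    def from_pool(old: Resource) -> Resource:
        return Resource(old.name + "-new", {"cpu": 10.0}, old.threshold)

    r1a = Resource("R1a", {"cpu": 40.0}, {"cpu": 80.0})
    r1c = Resource("R1c", {"cpu": 95.0}, {"cpu": 80.0})
    print([r.name for r in reprovision([r1a, r1c], from_pool)])
```

Passing the pool allocation as a callback keeps the sketch independent of how the cloud provider actually selects a replacement instance.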
VI. RELATED WORK
Predictive Control: Recent work on predictive control for
dynamic resource allocation in data centers [22], [20] presents
and discusses several algorithms for dynamic resource allocation
that maintain SLOs. In particular, the authors of [22] evaluate
three prediction algorithms, based on a standard autoregressive
(AR) model, a combined ANOVA and AR model, and a multi-pulse
(MP) model. They show that the MP-based predictive controller
performed slightly better than the other two. They also found
that all three predictive methods performed much better when the
prediction error was corrected via a feedback mechanism.
Fig. 11. Prototype Implementation
In [20], the authors investigate the overhead involved in a
dynamic allocation approach in a data center, in both system
capacity and application-level performance, relative to static
allocation. Experiments were conducted on the Xen and OpenVZ
systems. The authors found that both performance and capacity
overheads are higher for Xen than for OpenVZ.
In [23] the authors present an automated capacity and workload
management system that integrates multiple resource controllers
at three different scopes and time scales. This was motivated by
the need for an integrated and tunable solution for dynamic
resource allocation that provides a range of solutions, which
can be combined to provide optimal allocations. Along similar
lines, [8] and [14] present an online dynamic prediction scheme
for application resource requirements in a cloud environment.
This scheme extracts fine-grained dynamic patterns in
application resource demands and adjusts resource allocations
automatically. While this scheme has demonstrated good resource
prediction accuracy with minimal over-estimation error and
near-zero under-estimation error, we believe that it, as well as
the approaches from [22], [20], [23], is complementary to our
health-based approach.
Dynamic Provisioning: In [16], [17] the authors present a novel
technique for dynamic resource provisioning in multi-tier
internet applications, built on the following: (i) a flexible
queueing model to determine how many resources to allocate, and
(ii) a combination of predictive and reactive methods that
determine when to provision these resources, at both long and
short time scales. Predictive provisioning is needed for longer
time scales, in order to plan for long-term workload
requirements, whereas reactive provisioning is needed for short
time scales, in response to sudden increases in workload
demands. The authors show that a combination of predictive and
reactive provisioning helps maintain SLOs even in the presence
of peak workload demands. Although this approach helps in
determining the number of resources needed, it does not account
for the effect of resource health variations, as our approach
does.
Autonomic Techniques: Another body of research focuses on
autonomic techniques [10] for dynamic resource allocation in
cloud computing environments. For example, [2] argues that
statistical machine learning techniques enable automatic control
of dynamic resource allocation in data centers. A similar method
for self-adaptive and self-configured CPU resource provisioning
has been presented in [9]. In [13] the authors show how modeling
utility functions for workload allocation makes explicit the
desirability of different workload allocation strategies, so
that optimization can be used to select among the alternatives.
We view these approaches as also being complementary to our
health-based approach.
SLA-based Provisioning: Of special relevance to our paper are
techniques for resource provisioning based on service level
agreements (SLAs) [15], [19], [7], [21], [4]. In [15] the
authors describe a set of policies for resource provisioning
that enhance IaaS providers’ profit in a federated cloud
environment. Along similar lines, [19] discusses a resource
allocation policy for running compute-intensive jobs using only
spot VM instances. The solution proposed in [19] relies on job
runtime estimation methods to decide on which VMs to run each
job. The work in [7] tackles the resource allocation problem
with different types of application workloads. That paper
proposes an admission control and scheduling mechanism that not
only maximizes revenue and profit for the provider, but also
ensures that SLA requirements are met. Similarly, [21] proposes
resource allocation algorithms for SaaS providers who want to
minimize infrastructure cost and SLA violations. In [4] the
authors use analytical performance models (such as queueing
network models) and workload information to supply intelligent
inputs about system requirements to an application provisioner
with limited knowledge of the physical infrastructure. These
techniques, although powerful in their approaches, do not
consider variations in system health parameters.
Capacity Planning: The work in [12] models capacity allocation
as a statistical optimization problem, and proposes a genetic
algorithm as the solution. Via experiments on a real-life cloud
environment, [12] shows that reserving a few servers for
addressing future workload variability gives better results, in
terms of fewer servers needed for resource allocation, than
packing resources based on peak workloads only. In [11] the
authors discuss how the advent of cloud computing has impacted
capacity planning in data centers, and how capacity planning in
the cloud can be addressed from the cloud users’ and cloud
providers’ points of view. The authors also present results of
demonstrations of their approach on the PlanetLab⁴
infrastructure.
In contrast, our health-based approach takes a more dynamic view
of the cloud environment, in that it adjusts resource
provisioning for tenants based on overall system health
considerations.
VII. CONCLUSIONS
In this paper we have addressed the crucial issue of
provisioning appropriate resources for multi-tenant cloud
environments. We showed that traditional provisioning techniques
are not sufficient to solve this problem, as provisioned
resources undergo dynamic changes, which can affect tenancy
SLOs. We presented our algorithm for grading and monitoring
resource health, and thereby identifying the affected
provisioned resources. This approach is based on mapping
functional and non-functional tenancy requirements to
appropriate resources, their parameters, and a health monitoring
policy. We demonstrated our approach via a proof-of-concept
prototype, using a realistic example of cloud resources and
applications.
In future work we will test our approach on larger case studies,
incorporating automated mapping of tenancy requirements to
resource(s), parameter(s) and monitoring policy. We will also
study the effects of SLA-based provisioning in conjunction with
resource health monitoring, including investigating the tandem
effects of resource health parameters and the QoS parameters
mentioned in [21].
⁴ https://www.planet-lab.org/db/pub/sites.php
REFERENCES
[1] A. Batat and D.G. Feitelson. Gang scheduling with memory
considerations. In Proceedings of the 14th International
Parallel and Distributed Processing Symposium (IPDPS 2000),
pages 109–114, 2000.
[2] Peter Bodík, Rean Griffith, Charles Sutton, Armando Fox,
Michael Jordan, and David Patterson. Statistical machine
learning makes automatic control practical for internet
datacenters. In Proceedings of the 2009 conference on Hot topics
in cloud computing, HotCloud’09, Berkeley, CA, USA, 2009. USENIX
Association.
[3] Rodrigo N. Calheiros, Rajiv Ranjan, Anton Beloglazov, Cesar
A. F. De Rose, and Rajkumar Buyya. CloudSim: a toolkit for
modelling and simulation of cloud computing environments and
evaluation of resource provisioning algorithms. 2010.
[4] Rodrigo N. Calheiros, Rajiv Ranjan, and Rajkumar Buyya.
Virtual machine provisioning based on analytical performance and
QoS in cloud computing environments. In ICPP, pages 295–304,
2011.
[5] C. Dabrowski and K. Mills. VM leakage and orphan control in
open-source clouds. In IEEE International Conference on Cloud
Computing Technology and Science, pages 554–559, 2011.
[6] IBM Director.
http://publib.boulder.ibm.com/infocenter/director/v6r2x/index.jsp?topic=/com.ibm.director.status.helps.doc/fqm0_c_viewing_health_and_status.html.
[7] Saurabh Kumar Garg, Srinivasa K. Gopalaiyengar, and Rajkumar
Buyya. SLA-based resource provisioning for heterogeneous
workloads in a virtualized cloud datacenter. In ICA3PP (1),
pages 371–384, 2011.
[8] Zhenhuan Gong, Xiaohui Gu, and John Wilkes. PRESS:
Predictive elastic resource scaling for cloud systems. In CNSM,
pages 9–16, 2010.
[9] Evangelia Kalyvianaki, Themistoklis Charalambous, and Steven
Hand. Self-adaptive and self-configured CPU resource
provisioning for virtualized servers using Kalman filters. In
ICAC, pages 117–126, 2009.
[10] Jeffrey O. Kephart and David M. Chess. The vision of
autonomic computing. IEEE Computer, 36(1):41–50, 2003.
[11] Daniel A. Menascé and Paul Ngo. Understanding cloud
computing: Experimentation and capacity planning.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.158.21.
[12] Swarna Mylavarapu, Vijay Sukthankar, and Pradipta Banerjee.
An optimized capacity planning approach for virtual
infrastructure exhibiting stochastic workload. In SAC, pages
386–390, 2010.
[13] Norman W. Paton, Marcelo A. T. Aragão, Kevin Lee, Alvaro A.
A. Fernandes, and Rizos Sakellariou. Optimizing utility in cloud
computing through autonomic workload execution. IEEE Data Eng.
Bull., 32(1):51–58, 2009.
[14] Zhiming Shen, Sethuraman Subbiah, Xiaohui Gu, and John
Wilkes. CloudScale: Elastic resource scaling for multi-tenant
cloud systems. In SoCC, 2011.
[15] Adel Nadjaran Toosi, Rodrigo N. Calheiros, Ruppa K.
Thulasiram, and Rajkumar Buyya. Resource provisioning policies
to increase IaaS provider’s profit in a federated cloud
environment. In HPCC, pages 279–287, 2011.
[16] Bhuvan Urgaonkar, Prashant J. Shenoy, Abhishek Chandra, and
Pawan Goyal. Dynamic provisioning of multi-tier internet
applications. In ICAC, pages 217–228, 2005.
[17] Bhuvan Urgaonkar, Prashant J. Shenoy, Abhishek Chandra,
Pawan Goyal, and Timothy Wood. Agile dynamic provisioning of
multi-tier internet applications. TAAS, 3(1), 2008.
[18] Jose Vargas and Clint Sherwood. Cloud success secret:
Flexible capacity planning. 2010.
[19] William Voorsluys, Saurabh Kumar Garg, and Rajkumar Buyya.
Provisioning spot market cloud resources to create
cost-effective virtual clusters. In ICA3PP (1), pages 395–408,
2011.
[20] Zhikui Wang, Xiaoyun Zhu, Pradeep Padala, and Sharad
Singhal. Capacity and performance overhead in dynamic resource
allocation to virtual containers. In Integrated Network
Management, pages 149–158, 2007.
[21] Linlin Wu, Saurabh Kumar Garg, and Rajkumar Buyya.
SLA-based resource allocation for software as a service provider
(SaaS) in cloud computing environments. In CCGRID, pages
195–204, 2011.
[22] Wei Xu, Xiaoyun Zhu, Sharad Singhal, and Zhikui Wang.
Predictive control for dynamic resource allocation in enterprise
data centers. In NOMS, pages 115–126, 2006.
[23] Xiaoyun Zhu, Donald Young, Brian J. Watson, Zhikui Wang,
Jerry Rolia, Sharad Singhal, Bret McKee, Chris Hyser, Daniel
Gmach, Rob Gardner, Tom Christian, and Ludmila Cherkasova. 1000
islands: Integrated capacity and workload management for the
next generation data center. In ICAC, pages 172–181, 2008.