Model driven provisioning in multi-tenant clouds

Atul Gohad, IBM India Software Lab, Bangalore, India
Karthikeyan Ponnalagu, IBM Research India, Bangalore, India
Nanjangud C. Narendra, IBM India Software Lab, Bangalore, India

Abstract. In multi-tenant cloud systems today, provisioning of resources for a new tenancy is based on selection from a catalogue published by the cloud provider. The published images are generally a stack of appliances with Infrastructure (IaaS) and Platform (PaaS) layers and, optionally, Application layers (SaaS). Such a ready-made model enables quicker and more streamlined resource provisioning to clients. However, this approach poses certain challenges to clients in the short run and to providers in the long run. The unique tenancy requirements of each client are forcibly generalized by selecting one of the available images from the catalogue, because the tenancy requirements are not modeled or validated to start with. Moreover, resource provisioning is mostly sized for the peak load expected in the tenancy. Such a static approach does not adapt to dynamically changing tenancy requirements, and most often leads to tenants owning, and subsequently paying for, more than what they need. In particular, provisioned resources are expected to perform at the same level of quality without accounting for their changing health. In this paper, we propose an extensible dynamic provisioning framework to address these challenges. We start by defining a Tenancy Requirements Model (TRM), which helps map provisioned resources to tenants. The provisioned and candidate resources are also modeled with their Quality of Service (QoS) characteristics in what we call the Health Grading Model (HGM); this helps in continuous monitoring and grading of resources based on health parameters, and enables health prediction for future provisioning.
Together, TRM and HGM allow dynamic re-provisioning for existing tenants based on either changing tenancy requirements or health grading predictions. We also present algorithms for prediction-based provisioning and tenancy requirement matching. We illustrate our ideas throughout this paper with a running example, and present a proof-of-concept prototype implementation on IBM’s Rational Software Architect modeling tool.

Keywords. multi-tenant cloud, dynamic provisioning, predictive allocation

Acknowledgment. We would like to thank GR Gangadharan and Balaji Viswanathan for their early feedback, and Jinu M Airumalayil and Praveen S Rao for their help in prototype implementation.

I. INTRODUCTION

Resource provisioning in the cloud is a challenging task, due to the dynamism and heterogeneity inherent in cloud environments, which stem from varying performance, workload and application characteristics on the cloud [3]. A cloud provider performs a crucial role in the selection, allocation and utilization of cloud resources. This paper is related to the components of rapid provisioning, resource changing, monitoring and reporting, metering and Service Level Objective (SLO) management, which form part of the overall cloud provider function of provisioning/configuration (see footnote 1).

A cloud provider’s computing resources are pooled to serve multiple consumers using the multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand. There is a sense of location independence, in that the customer generally has no control over or knowledge of the exact location of the provided resources, but may be able to specify location at a higher level of abstraction (e.g., country, state, or datacenter). Examples of resources include storage, processing, memory, network bandwidth, and virtual machines.

Traditional resource provisioning approaches help determine the number/type of resources required to meet SLOs.
These approaches typically involve the following: (a) constructing an application performance model that predicts the number/type of application instances required to handle demand at each particular level, in order to satisfy Quality of Service (QoS) requirements; (b) periodically predicting future demand and determining resource requirements using the performance model; and (c) automatically allocating resources using the predicted resource requirements. However, the quality of the chosen resources is assumed to remain constant throughout the provisioning period. Also, the status of an entire system reflects the status of the component on the system that has the most severe status. For example, if a component within a system has a status of critical, the entire system will have a status of critical, even if the critically impacted component is not critical to the current tenancy requirements within the provisioned system [6].

Footnote 1: twiki-cloud-computing/pub/CloudComputing/Meeting12AReferenceArchitectureMarch282011/NIST_CCRATWG_029.pdf

This gives rise to the following limitations:
• Effect of health of the provisioned resources: The overall system operations are affected due to continuous execution of processes. Broadly, these effects can be classified into the following: (i) slowing down of the system due to operating system related issues such as fragmentation, larger registries, additional process loads and effects of thrashing [1], [5]; and (ii) the effect of varying available memory (RAM, virtual memory, persistent storage) for processing, either due to an increase in the number of tenants or a burst of load, or due to the application performance and memory consumption increasing over time.
• Effect of multi-tenancy dynamics: The overall system operations are affected due to changes in resource requirements of existing or new tenants. These dynamics can affect the agreed upon SLO of the provisioned tenants.
Typically, these can be classified as: (i) adding another tenant (similar to increasing the number of requests for a service); (ii) non-availability of any service on the provisioned resource due to maintenance and/or upgrades; (iii) an increased burst of load on a specific multi-tenant application; and (iv) dynamically changing tenancy requirements such as time-bound ramp-up or ramp-down in throughput requirements, e.g., an increase during peak hours (typically seasonal, as in the case of an online retail website) and a decrease during off-peak/night hours.

In this paper, we take a novel approach towards dynamic resource provisioning. We build a cloud provisioning system on what we call the Health Grading Model (HGM) and the Tenancy Requirement Model (TRM), and then provide algorithms that dynamically provision resources by matching tenancies to health metrics from the HGM, underpinned by the tenancy requirements specified in the TRM. The HGM helps in defining, quantifying and monitoring the health of a resource, along with critical levels for each of the monitored parameters. Similarly, the TRM helps in defining and quantifying the requirements of a specific tenancy. Our dynamic resource provisioning algorithm ensures that the set of provisioned resources is able to meet the changing requirements of the tenancy, without affecting the given tenancy SLOs, using the best available set of healthy resources.

The health of a resource is constituted by a set of critical parameters. Any change in the value of these parameters affects the functioning of that resource, in turn affecting the SLO of the tenancy provisioned on that resource. It may be possible to augment a resource to increase some of its parameter values, either by tuning or by physically choosing another similar resource with enhanced capability, so as to increase its health.
Likewise, any decrease in such a parameter value would result in deterioration of the resource health, and have a negative impact on the SLO of the tenancy provisioned. For example, the health of a Server System could be best attributed to the number of CPU cores and the CPU utilization over a period of time. Any variations in these values will have an impact on the SLO of the provisioned tenancy. We provide a detailed description of our HGM in Section III-C.

The key contributions of our paper are:
1) Enable tenancy requirements to be captured and represented in a formalized model, both for future reuse considerations and for continuous conformance of provisioning
2) Enable certification and guarantee of accurate resource provisioning based on TRM and HGM
3) Flexible switching between health monitoring for provisioned resources and health monitoring for the candidate resource pool, thus supporting both published catalogue based tenancy provisioning and made-to-order tenancy provisioning
4) Continuous monitoring and replacement of provisioned systems based on changing health grading
5) Providing the best-fit resource for current tenancy requirements, thus maximizing resource utilization and reducing cost of hosting.

To the best of our knowledge, this is the first integrated technique for dynamic resource provisioning based on resource health grading.

Our paper is organized as follows. Section II introduces our running example, which we use throughout the rest of our paper for illustration. The architecture and the models used to represent resource health and tenancy requirements are explained in Section III. Our dynamic provisioning algorithm is then described in Section IV, and is also illustrated via our running example. Our prototype implementation is presented in Section V, while related work is discussed in Section VI. Finally, we present concluding remarks in Section VII.

II. RUNNING EXAMPLE

We consider an example cloud infrastructure as represented in Fig.
1 capable of hosting tenancy applications of two types. The first is the Online Travel Portal (OTP), a content management system in which customer profiles are maintained and which customers use to log in, create and view travel requests. This application interacts with the database to store the customer profiles, and also with appropriate booking agencies to confirm travel bookings. The second is the Mail Processing Application (MPA), a compute-intensive application that interacts with various mail server accounts and filters feeds from received emails. The resource types required to satisfy the functions of these applications are depicted in Table I.

TABLE I. APPLICATION RESOURCE REQUIREMENTS IN AN EXAMPLE CLOUD

Resource Type ID | Resource Type   | Required by application(s)
R1               | Database Server | OTP, MPA
R2               | File Server     | OTP, MPA
R3               | Mail Server     | MPA
R4               | Storage         | OTP, MPA
R5               | Web Server      | OTP

The tenancy requests, ordered by the time at which they are received, and their requested functional and non-functional requirements are depicted in Fig. 2. The functional and non-functional requirements together determine the choice of resources and that of the resource parameters to be monitored, so as to satisfy the tenancy SLOs. The tenancy instances satisfied by provisioning the allocated resources at time T1 are depicted in Table II. Note that some resources from the resource pool are shared between different tenants, whereas others are used exclusively for specific tenancy requests in order to satisfy the requested tenancy requirements.

Fig. 1. Running example cloud infrastructure

Fig. 2. Tenancy Request Queue

TABLE II. SATISFIED TENANCY REQUESTS AT TIME T1

Tenancy instance id | Provisioned resource instances
TR1-OTP1            | R1a, R2a, R4a, R5a
TR2-MPA1            | R1b, R2b, R3a, R4b
TR3-OTP2            | R1c, R2c, R4c, R5b
TR4-MPA2            | R1c, R2c, R3a, R4c

The health of provisioned resources can be determined using tools such as Collectd. The parameters that are of special interest for meeting the required SLOs, as specified in the tenancy requirements, are provided below. The mapping from tenancy SLO requirements to the system parameters to be monitored is currently manual, based on the provider’s knowledge. In our future work, we plan to make this mapping an automated function.
• CPUTime: The amount of time spent by the CPU in various states, most notably executing user code, executing system code, waiting for I/O operations and being idle
• DiskAccess: The average time an I/O operation took to complete
• Memory: The amount of available free, used, cached and buffered memory
• ServletRequests: The number of servlet requests served over a period of time, used to determine the number of hits served for an application

TABLE III. SATISFIED TENANCY REQUESTS AT TIME T2

Tenancy instance id | Provisioned resource instances
TR1-OTP1            | R1a, R2a, R4a, R5a
TR2-MPA1            | R1b, R2b, R3a, R4b
TR3-OTP2            | R1c, R2c, R4c, R5b
TR4-MPA2            | R1c, R2c, R3a, R4c
TR5-OTP3            | R1c, R2c, R3a, R5c

Some interesting research issues arise from this running example. First, the currently available provisioning techniques would provide the estimated constant number of resources required to be provisioned, irrespective of changing tenancy requirements at different times. Second, the state of resources is assumed to be constant over the period of provisioning. For example, at time instance T2 (see Table III), due to the additional provisioning of tenancy TR5-OTP3, there could be issues in satisfying the SLO requirement on the number of hits served, due to the shared usage of resource R1c.
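The sharing pattern behind this concern can be read off Table III mechanically. The following is a minimal Python sketch, not part of the system described in this paper: it transcribes the Table III allocations and lists which tenancy instances use each resource instance. The names `TABLE_III` and `tenants_per_resource` are illustrative assumptions.

```python
from collections import defaultdict

# Allocations at time T2, transcribed from Table III.
TABLE_III = {
    "TR1-OTP1": ["R1a", "R2a", "R4a", "R5a"],
    "TR2-MPA1": ["R1b", "R2b", "R3a", "R4b"],
    "TR3-OTP2": ["R1c", "R2c", "R4c", "R5b"],
    "TR4-MPA2": ["R1c", "R2c", "R3a", "R4c"],
    "TR5-OTP3": ["R1c", "R2c", "R3a", "R5c"],
}


def tenants_per_resource(allocations):
    """Map each resource instance to the sorted list of tenancies using it."""
    usage = defaultdict(list)
    for tenancy, resources in allocations.items():
        for resource in resources:
            usage[resource].append(tenancy)
    return {resource: sorted(tenancies) for resource, tenancies in usage.items()}


usage = tenants_per_resource(TABLE_III)
# Keep only the resource instances used by more than one tenancy.
shared = {r: t for r, t in usage.items() if len(t) > 1}
for resource in sorted(shared):
    print(resource, "shared by", ", ".join(shared[resource]))
```

Under this transcription, the database server instance R1c carries three tenancies at T2 (TR3-OTP2, TR4-MPA2 and TR5-OTP3), which is why its health has a direct bearing on the SLOs of all three.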
Also, is it really necessary to allocate a new Web Server instance (R5c) to satisfy TR5-OTP3, or can the same request be satisfied by using R5b without affecting the SLOs of both TR3-OTP2 and TR5-OTP3? Third, and more crucially, we need to determine the extent to which a given healthy resource can be matched to the varying requirements of the tenancy; for instance, at time T2, application OTP is to be provisioned, for which a healthy shareable resource of type database server may not be available. The first two issues are addressed by our modeling of the TRM and HGM; the third issue is addressed by our dynamic provisioning algorithm.

III. ARCHITECTURE AND MODELS

A. System Architecture

The architecture of the proposed system is depicted in Fig. 3.

Fig. 3. System architecture.

The cloud depicts the set of all available resources in the cloud provider’s environment. These resources could be of varying types, each with multiple instances. The repository notation of Tenancy Requirements Model (TRM) depicts multiple tenancy requirement models. Each of these models provides details on a specific tenancy such as (a) the state (Created, Active, Paused) of the tenant, (b) its functional and non-functional behaviour characteristics, (c) constraints on inter-tenancy and intra-tenancy, and (d) tenancy attributes. The repository notation of Catalogued Tenancy Model (CTM) depicts a collection of past tenancy requirements data along with the provisioned resource details used to satisfy the tenancy behaviours. This repository is typically helpful in determining the resource needs for a new tenancy request, based on historical data. The repository notation of Health Grading Model (HGM) consists of details of the health parameters that are of critical importance for each provisioned resource satisfying the current tenancy. These health grading models are continuously updated based on the output of the Continuous Health Grade Monitor.
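The three repositories described above can be pictured as plain record types. The sketch below is only one possible reading, written in Python for illustration; every class and field name is an assumption rather than the schema used in the prototype, and the example values are taken from the running example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class TenantState(Enum):
    CREATED = "Created"
    ACTIVE = "Active"
    PAUSED = "Paused"


@dataclass
class TenancyRequirementModel:
    """One TRM entry: state, behaviour, constraints and attributes."""
    tenancy_id: str
    state: TenantState = TenantState.CREATED
    functional: List[str] = field(default_factory=list)
    non_functional: List[str] = field(default_factory=list)
    constraints: Dict[str, str] = field(default_factory=dict)
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class CataloguedTenancy:
    """One CTM entry: a past TRM plus the resources that satisfied it."""
    requirements: TenancyRequirementModel
    provisioned_resources: List[str]


@dataclass
class HealthGradingModel:
    """One HGM entry: only the parameters critical to the current tenancy."""
    resource_id: str
    critical_parameters: Dict[str, float] = field(default_factory=dict)


# Illustrative entries based on the running example.
trm = TenancyRequirementModel(
    tenancy_id="TR1-OTP1",
    functional=["log in", "create travel request", "view travel request"],
    non_functional=["number of hits", "required uptime"],
)
ctm_repository = [CataloguedTenancy(trm, ["R1a", "R2a", "R4a", "R5a"])]
# Disk access health expressed as a percentage of device capability.
hgm = HealthGradingModel("R1c", {"DiskAccess": 100.0})
```

Keeping the CTM entry as a pair of requirements and provisioned resources reflects its stated role: a new request is compared against stored requirements, and the associated resource list becomes the baseline for provisioning.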
The changes in resource health state (Good, Affected, Bad) are marked based on the monitoring policy chosen for the specific tenancy. The resource health model defined in Section III-C explains the details of resource health state and monitoring policy. The provisioning plan is determined, validated and customized by the Provisioning Plan Identifier, Customizer and Validator component. This component is primarily responsible for executing the algorithms that dynamically add and remove resources, and that alert on and forecast the state of provisioned resources based on continuously changing resource health grading and tenancy requirements.

The high-level control flow of our proposed system is shown in Fig. 4. First, the functional and non-functional requirements are captured as provided by the business users. These requirements are translated into a Tenancy Requirement Model (TRM). Second, a check is performed to find a matching provisioning model among the catalogued tenancy models, on the basis of the determined TRM. The set of resources required to satisfy the tenancy is determined from the closest match found or, if there is no match, the TRM is translated into resource requirements using existing capacity planning tools [18]. Third, the required set of resources is provisioned from the pool of available resources, and this set of resources is fed into the Continuous Health Grade Monitor. The monitoring of resource parameters can be done using specific monitoring tools available for the resource; on Linux, for example, tools such as Collectd can be used. Based on the criticality of the resource health parameters involved and the monitoring policy chosen, a tenancy is re-provisioned. At the end of the tenancy period, the resources are released back to the pool, ready to be used for the next tenancy request.

Fig. 4. Control Flow of Proposed System

B. Catalogued Tenancy Model (CTM)

The CTM, as represented in Fig. 5, has a set of provisioned resources, along with their health grade models, that were used to satisfy past tenancy requirement requests. This data set is used to choose the appropriate set of resources to be provisioned for satisfying a new tenancy. The mapping between the TRM of the new tenancy request and the TRMs stored in the CTM is used to determine this resource set. For example, in our running example, the catalogued tenancy model has a dataset corresponding to the satisfied request TR1-OTP1, which has the provisioned resource set PR = (R1a, R2a, R4a, R5a); this is used as a baseline to determine the required resource set for TR3-OTP2. In cases where an appropriately matching tenancy cannot be found in the CTM, the required initial resource set is arrived at through manual analysis of the new TRM. In our future work, we plan to address various techniques that can be employed to arrive at such an analysis.

Fig. 5. Representation of Catalogued Tenancy Model

C. Health Grading Model (HGM)

We define the health of a resource as a set of attributes which govern the functioning of that particular resource. For example, the health of a CPU can be attributed to the time it spends executing user code, executing system code, waiting for I/O operations and being idle. Likewise, the health of underlying infrastructure such as a router or firewall can be attributed to system uptime parameters such as average running time, or maximum system uptime reached over a certain time duration. In general, we formulate the health of a resource as a tuple $H = \{h_a, h_b, h_c, \ldots, h_n\}$ such that any change in the value of these parameters results in a change in the behaviour and functioning of that resource. We also define individual threshold levels $T = \{t_a, t_b, t_c, \ldots, t_n\}$ corresponding to each of the health parameters in $H$, and a critical flag for each health parameter that is set whenever the parameter value crosses above its threshold level. This helps in identifying the resource that is critical for a given tenancy and needs to be replaced. Also, we define a relative weighted score for resources of the same type, based on a segmented list of parameters and their values. The values range over a relative score of 0-100%. This helps in easily picking a healthier resource from the given pool: the higher the score for a parameter, the healthier the resource is compared to others. For example, the health score of a system can be represented as the segmented list CPU | Memory | Storage | N/W | Downtime with values of 80 | 94 | 100 | 72 | 95, respectively. A representation of the Health Grading Model is shown in Fig. 6.

Fig. 6. Representation of Health Grading Model

The selection of parameters that determine resource health can be based on the monitoring policies defined below. The tenant can specify its choice of monitoring policy at the time of request for each tenancy.
• Aggregation Policy: This policy makes use of any one aggregate function such as (i) the sum of all parameters, (ii) the average of all parameters, or (iii) only the top m parameters, m ≤ n.
• Non-Functional Requirements Policy: In this policy, the user provides a list of non-functional requirements, which feed into a parameter selection function. This function chooses the set of all parameters that could in any way affect the given set of non-functional requirements. An average-of-all-parameters aggregate policy is then applied to this selected set of parameters. The function that chooses the set of parameters, given a set of non-functional requirements, could be based on historical data of the tenancy and non-functional requirements, or could be a purely manual job done by the cloud provider.
• Individual Policy: In this policy, each resource parameter is monitored for its health.

The parameters that could be monitored are limited in scope only by the resource type and its capabilities in exposing parameters for monitoring. The monitoring of these parameters could be done either manually or with the tooling available for a particular resource.

In our running example, consider resource R1c, a Database Server instance which is shared by tenancy instances TR3-OTP2 and TR4-MPA2 at time T1, and additionally by tenancy instance TR5-OTP3 at time T2. We consider disk access as one of the crucial parameters to be monitored for a resource of this type. Under ideal resource behavior, without any fluctuations in the disk access health parameter, the utilization and provisioning of tenancy instances would be as shown in Fig. 7. The health of the disk access parameter is assumed to remain constant at 100% of the device capability over the period of tenancy, and the resource utilization remains constant at 70% until time T2, when the additional tenancy TR5-OTP3 takes the resource utilization up to 90%.

Fig. 7. Ideal Occupancy for Resource instance R1c.

However, due to fluctuations in the disk access health parameter, which varies across time instances, the resource utilization as well as the number of tenancies that can be provisioned on a specific resource could vary as shown in Fig. 8.

Fig. 8. Effect of Disk Access Health Changes on Resource instance R1c.

Some interesting observations to be noted are:
1) As the health of a crucial parameter deteriorates, the actual utilization of the resource increases so as to maintain the tenancy SLO requirement.
2) Due to the increase in utilization, the resource allocation algorithm needs to make judgements so as to not choose any resource with affected health for further tenancy requests.
For example, as seen at time instance T2, tenancy TR5-OTP3 could not be provisioned on R1c, since the current resource utilization does not allow for any new allocations.
3) It may also happen that, due to the affected health (AH) of a resource, some of the already provisioned tenants need to be reallocated to another healthy system so as to maintain tenancy SLO requirements. For example, in our running example, if the health of the disk access parameter on resource R1c deteriorates to less than 70% at time T2 + D1, then one of the provisioned tenants (TR3-OTP2 or TR4-MPA2) would need to be deallocated from resource R1c. This is required because, at 70% of its health, the resource utilization would reach exactly 100% (the 70% baseline utilization divided by a health factor of 0.7, if utilization is taken to scale inversely with health), a level that cannot be sustained at all times in practice.

In our approach, we define such a point as the resource burnout point, and the corresponding health parameter value(s) as the health-breakdown point(s) of the resource with a given set of tenants at a specific time. Our provisioning allocation algorithm therefore allows the cloud provider to specify the value of the resource burnout threshold RBTh and also the health-breakdown threshold point HBTh for each health parameter. Based on the values specified for these thresholds, the reallocation algorithm ensures that tenancies are shifted to other healthy systems, and thereby that tenancy SLOs are maintained.

D. Tenancy Requirement Model (TRM)

A Tenancy Requirements Model (TRM), as represented in Fig. 9, constitutes behavior, constraints and state information. Tenant behavior, also referred to as tenant functionality, is used to represent a tenant’s functional requirements and contains inputs and outputs. Tenant constraints help specify a tenant’s conditions. An example of a tenant constraint is “‘Use shared resources’ allow = No”, where the tenant requires that it be hosted on exclusive resources only.
Tenant state information identifies the state a tenant could be in at any point during its hosted period in a multi-tenant system. A tenant could be in one of these states: created, active or paused. Initially, a tenant is in the created state. A tenant’s RequiredUptime constraint is compared with the current time to identify its present state. If it is found to be active, it is allocated resources and then set to the active state. If the tenant is found to be inactive at a certain point, i.e., its RequiredUptime constraint evaluates to false for the current time, its resources are reclaimed and it is set to the paused state. For example, the RequiredUptime constraint “RequiredUptime > 06:00 and RequiredUptime < 18:00” implies that a tenant would be active from 6 AM to 6 PM only, and during any other hour the tenant would be set to the paused state. Fig. 10 shows the states in which a tenant can persist and the corresponding inter-state transitions. Tenant factors include non-functional aspects of a tenant. For example, in our running example the number of hits and the required uptime are included in tenancy factors. The tenant factors value of a tenant represents the combined effect of each of the above mentioned factors. It provides a measure of a tenant’s efficiency and helps determine the factors governing the SLOs of a tenant.

Fig. 9. Representation of Tenancy Requirement Model

Fig. 10. Tenant-state information for a single tenant

IV. DYNAMIC PROVISIONING ALGORITHMS

In this section, we discuss our dynamic provisioning algorithms, viz., matching TRM to CTM (Algorithm 1), health grade monitoring (Algorithm 2) and actual dynamic provisioning based on TRM-CTM matching and health grade monitoring (Algorithm 3).

Algorithm 1 is used to determine the set of resources required to satisfy tenant requirements. This matching is performed based on the data captured in the TRM as depicted in Fig. 9, and the catalogue of provisioned resources for similar tenancy requirements captured in the CTM as depicted in Fig. 5. In our running example, for the multiple tenancy requests of application OTP, viz., TR3-OTP2 and TR5-OTP3, the set of required resources is obtained from the matching catalogued tenancy model of the previously provisioned tenancy TR1-OTP1.

Algorithm 1 Tenancy Requirement Model matching Catalogued Tenancy Model Algorithm
    CurrentTenancyRequirementModel = TRM
    CataloguedTenancyRequirementModel = CTM
    CataloguedTenancyModelRepository = CTMR
    CurrentTenancyRequirementModelValues of TRM: TRM(Val) = {RequirementParams(TRM, j), 1 ≤ j ≤ k}
    CataloguedTenancyRequirementModelValues of CTM: CTM(Val) = {RequirementParams(CTM, j), 1 ≤ j ≤ k}
    requirementMatch = FALSE
    for all CTM in CTMR such that {TRM_i} ⊆ {CTM} do
        if TRM(Val) == CTM(Val) then
            requirementMatch = TRUE
            Return(CTM)
        end if
    end for
    Return(Not Found)

Algorithm 2 is used to periodically update the resource health grading model. Changes in the monitored parameter values are recorded and used to arrive at the relative resource health grading score. The set of parameters chosen for monitoring is as explained in Section III-C. In our running example, referring to Fig. 8, the HGM for resource R1c gets updated at the time instances T1+D1, T1+D2, T2 and T2+D1. Similarly, the monitoring and updating of the HGM is periodically performed for each resource in the set of provisioned resources.

Algorithm 3 is used to determine a critically affected provisioned resource and re-provision a new resource instead. This algorithm is executed as part of the Provisioning Plan Identifier, Customizer and Validator component of the proposed system architecture depicted in Fig. 3. The inputs to this algorithm are: ResourcePool, ProvisionedResourceSet, HealthParametersOfResource, ThresholdParametersOfResource, MonitoringPolicy.
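A minimal Python sketch of Algorithm 3 over these inputs is given here for orientation. It is an illustrative reading rather than the prototype's code: it implements only the Individual monitoring policy, treats a parameter as critical when its value is at or above its threshold, and uses hypothetical names (`Resource`, `is_critical`, `reprovision`) and made-up parameter values.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Resource:
    resource_id: str
    resource_type: str
    health: Dict[str, float]      # monitored parameter values
    thresholds: Dict[str, float]  # per-parameter critical levels


def is_critical(resource: Resource) -> bool:
    """Individual policy: critical if any parameter is at or above its threshold."""
    return any(
        value >= resource.thresholds[name]
        for name, value in resource.health.items()
        if name in resource.thresholds
    )


def reprovision(pool: List[Resource], provisioned: List[Resource]) -> List[Resource]:
    """Return a new provisioned set with critical resources replaced from the pool.

    A critical resource stays in place when the pool holds no non-critical
    candidate of the same type. Allocated candidates are removed from the pool.
    """
    new_set = []
    for current in provisioned:
        if not is_critical(current):
            new_set.append(current)
            continue
        candidate = next(
            (r for r in pool
             if r.resource_type == current.resource_type and not is_critical(r)),
            None,
        )
        if candidate is None:
            new_set.append(current)
        else:
            pool.remove(candidate)
            new_set.append(candidate)
    return new_set


# Illustrative values only: DiskAccess is read as average I/O time, higher is worse.
r1c = Resource("R1c", "R1", {"DiskAccess": 12.0}, {"DiskAccess": 10.0})
r2c = Resource("R2c", "R2", {"DiskAccess": 3.0}, {"DiskAccess": 10.0})
r1d = Resource("R1d", "R1", {"DiskAccess": 4.0}, {"DiskAccess": 10.0})
pool = [r1d]
new_set = reprovision(pool, [r1c, r2c])
print([r.resource_id for r in new_set])  # ['R1d', 'R2c']
```

The sketch returns a new list instead of mutating the provisioned set, so the caller can compare the old and new allocations before committing the change.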
Algorithm 3 works as follows. First, the criticality of the resource health parameters for the tenancy provisioning is determined for each resource in ProvisionedResourceSet and each parameter in HealthParametersOfResource. This is based on the MonitoringPolicy and a check to determine whether the current parameter value is above the threshold range. Second, a resource with a critical health parameter is marked for replacement, and a new resource is allocated from the ResourcePool. Third, the set of provisioned resources is updated by replacing the critically affected resource with the newly provisioned resource. In our running example, this algorithm helps in arriving at a new set of provisioning resources, inclusive of resource R1d, to be employed at time instance T2, since resource R1c is not suitable to satisfy the tenancy request of TR5-OTP3.

Algorithm 2 Health Grade Monitoring Algorithm
    ResourceMonitored = R
    Original HealthOfResource = H
    Updated HealthOfResource = H′
    Original HealthParamValues of H: HO = {HealthParamValues(H, j), 1 ≤ j ≤ k}
    Monitored HealthParamValues of H: HM = {HealthParams(H, j), 1 ≤ j ≤ k}
    Updated HealthParamValues of H′: HU = {HealthParamValues(H′, j), 1 ≤ j ≤ k}
    paramVariant = FALSE
    for 1 ≤ j ≤ k do
        if HO(j) != HM(j) then
            HU(j) = HM(j)
            paramVariant = TRUE
        end if
    end for
    if paramVariant = TRUE then
        HO = HU
        Return(H′)
    else
        Return(H)
    end if

V. PROTOTYPE IMPLEMENTATION

In our experimentation setup, the OTP and MPA applications were hosted on servers with 3.1 GHz processor speed. These machines had a varying number of processors per resource instance, as follows: Database server: 8 cores, Web Servers: 4 cores, Mail and File servers: 2 cores. Our prototype implementation is built as a plug-in on IBM’s Rational Software Architect tool (see footnote 3), as depicted in Fig. 11. It has a Tenancy Modelling view to capture functional and non-functional tenancy requirements.
The Tenancy-Resource Monitoring Parameters view lists resource parameters using which the health is determined. The Cloud Resources Model view can be used to model resources in the cloud. The health status gauge for instances is shown in Resource Health Status view. The graph in Relative Health Grade Monitor view, is a snapshot taken across time periods of a particular day. The Non-Functional requirements policy with average of all monitored health parameters was used to arrive at relative health scores for each of database instances R1a, R1b and R1c. The threshold range for Good Health was specified as (100%-85%), Minor Affected Health as (85% to 70%), Severe Affected Health as (70% to 50%) and Bad Health as (less than 50%). As can be seen, a reallocation of instance R1c is required during time period of 11:00 AM to 3 PM, since it is in Bad Health region. 3 products/rsa/ Algorithm 3 Dynamic Provisioning Algorithm 1: ResourcePool = P ; 2: ProvisionedResourceSet = P R; 3: ProvisionedResourceSet.Current = CP R; 4: CPR.Health = CP R.h; 5: CPR.threshold = CP R.t; 6: CPR.predictedHealth = CP; 7: ProvisioningResourceSet.New = P; 8: Resource values of CP R CP Rv = {Resources(R, j}, 1 ≤ j ≤ k); 9: Resource Health of CP R.h CP R.hv = {HealthP arams(CP Rv, j}, 1 ≤ j ≤ k); 10: Resource Health of Threshold of CP R.t CP R.t = {T hresholdP arams(CP Rv, j}, 1 ≤ j ≤ k); 11: Predicted Health of CP R.phv CP R.phv = {HealthP arams(CP Rv, j}, 1 ≤ j ≤ k at t=Tn); 12: ReplaceResource = FALSE; 13: P aramCritical = FALSE; 14: M onitoringP olicy = MP.AGGR — MP.NFR — MP.IP; 15: for all Resources(CP R) such that {CP R.hi } ⊆ {CP R.t} do 16: for 1 ≤ j ≤ k do 17: if CP R.h(j) ≥ CP R.t(j) then 18: CP = CP R.t(j); 19: ParamCritical = TRUE; 20: end if 21: if P aramCritical = TRUE then 22: P = (P.allocate CPRj); 23: ReplaceResource = TRUE; 24: else 25: P = CPR(j); 26: end if 27: end for 28: end for 29: if ReplaceResource = TRUE then 30: Return( 31: else 32: Return(CPR) 33: end if VI. 
RELATED WORK

Predictive Control: Recent work on predictive control for dynamic resource allocation in data centers [22], [20] presents and discusses several algorithms for dynamic resource allocation while maintaining SLOs. In particular, [22] evaluates three prediction algorithms, based on a standard autoregressive (AR) model, combined ANOVA and AR models, and a multi-pulse (MP) model. The authors show that the MP-based predictive controller performed slightly better than the other two. They also found that all three predictive methods performed much better when the prediction error was corrected via a feedback mechanism. In [20], the authors investigate the overhead involved in a dynamic allocation approach in a data center, in both system capacity and application-level performance, relative to static allocation. Experiments were conducted on the Xen and OpenVZ systems. The authors found that both performance and capacity overheads are higher for Xen than for OpenVZ. In [23] the authors present an integrated automated capacity and workload management system that combines multiple resource controllers at three different scopes and time scales. This was motivated by the need for an integrated and tunable solution for dynamic resource allocation that provides a range of solutions, which can be combined to provide optimal allocations. On similar lines, [8], [14] present an online dynamic prediction scheme for application resource requirements in a cloud environment. This scheme extracts fine-grained dynamic patterns in application resource demands and adjusts their resource allocations automatically. While this scheme has demonstrated good resource prediction accuracy with minimal over-estimation error and near-zero under-estimation error, we believe that it, as well as the approaches from [22], [20], [23], is complementary to our health-based approach.

Fig. 11. Prototype Implementation
Dynamic Provisioning: The works in [16], [17] present a novel technique for dynamic resource provisioning in multi-tier internet applications, built on the following: (i) a flexible queueing model to determine how many resources to allocate, and (ii) a combination of predictive and reactive methods that determine when to provision these resources, at both long and short time scales. Predictive provisioning is needed for longer time scales, in order to plan for long-term workload requirements, whereas reactive provisioning is needed for short time scales, in response to sudden increases in workload demands. The authors show that a combination of predictive and reactive provisioning helps maintain SLOs even in the presence of peak workload demands. Although this approach helps in determining the number of resources needed, it does not account for the effect of resource health variations, which our approach addresses.

Autonomic Techniques: Another set of research works focuses on autonomic [10] techniques for dynamic resource allocation in cloud computing environments. For example, [2] argues that statistical machine learning techniques enable automatic control of dynamic resource allocation in data centers. A similar method for self-adaptive and self-configured CPU resource provisioning has been presented in [9]. In [13] the authors show how modeling utility functions for workload allocation makes explicit the desirability of different workload allocation strategies, so that optimization can be used to select among the alternatives. We view these approaches as also being complementary to our health-based approach.

SLA-based Provisioning: Of special relevance to our paper are techniques for resource provisioning based on service level agreements (SLAs) [15], [19], [7], [21], [4]. In [15] the authors describe a set of policies for resource provisioning that enhance IaaS providers' profit in a federated cloud environment.
On similar lines, [19] discusses a resource allocation policy for running compute-intensive jobs using only spot VM instances. The solution proposed in [19] relies on job runtime estimation methods to decide which VMs to run each job on. The work in [7] tackles the resource allocation problem for different types of application workloads. That paper proposes an admission control and scheduling mechanism that not only maximizes revenue and profit for the provider, but also ensures that SLA requirements are met. Similarly, [21] proposes resource allocation algorithms for SaaS providers who want to minimize infrastructure cost and SLA violations. In [4] the authors use analytical performance models (such as queueing network models) and workload information to supply intelligent inputs about system requirements to an application provisioner with limited knowledge of the physical infrastructure. These techniques, although powerful, do not consider variations in system health parameters.

Capacity Planning: The work in [12] models the capacity allocation problem as a statistical optimization problem, and proposes a genetic algorithm as the solution. Via experiments on a real-life cloud environment, [12] shows that reserving a few servers to address future workload variability requires fewer servers for resource allocation than packing resources based on peak workloads only. In [11] the authors discuss how the advent of cloud computing has impacted capacity planning in data centers, and how capacity planning in the cloud can be addressed from the cloud users' and cloud providers' points of view. The authors also present results of demonstrations of their approach on the PlanetLab infrastructure.
In contrast, our health-based approach takes a more dynamic view of the cloud environment, in that it adjusts resource provisioning for tenants based on overall system health considerations.

VII. CONCLUSIONS

In this paper we have addressed the crucial issue of provisioning appropriate resources for multi-tenant cloud environments. We showed that traditional provisioning techniques are not sufficient to solve this problem, as provisioned resources undergo dynamic changes that can affect tenancy SLOs. We presented our algorithm for grading and monitoring resource health, and thereby identifying affected provisioned resources. This approach is based on mapping functional and non-functional tenancy requirements to appropriate resources, their parameters, and a health monitoring policy. We demonstrated our approach via a proof-of-concept prototype, using a realistic example of cloud resources and applications. In future work we will test our approach on larger case studies, incorporating automated mapping of tenancy requirements to resource(s), parameter(s) and monitoring policy. We will also study the effects of SLA-based provisioning in conjunction with resource health monitoring, including investigating the tandem effects of resource health parameters and the QoS parameters mentioned in [21].

REFERENCES

[1] A. Batat and D.G. Feitelson. Gang scheduling with memory considerations. In Parallel and Distributed Processing Symposium, 2000. IPDPS 2000. Proceedings. 14th International, pages 109–114, 2000.
[2] Peter Bodík, Rean Griffith, Charles Sutton, Armando Fox, Michael Jordan, and David Patterson. Statistical machine learning makes automatic control practical for internet datacenters. In Proceedings of the 2009 conference on Hot topics in cloud computing, HotCloud'09, Berkeley, CA, USA, 2009. USENIX Association.
[3] Rodrigo N. Calheiros, Rajiv Ranjan, Anton Beloglazov, Cesar A. F. De Rose, and Rajkumar Buyya.
CloudSim: a toolkit for modelling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. 2010.
[4] Rodrigo N. Calheiros, Rajiv Ranjan, and Rajkumar Buyya. Virtual machine provisioning based on analytical performance and QoS in cloud computing environments. In ICPP, pages 295–304, 2011.
[5] C. Dabrowski and K. Mills. VM leakage and orphan control in open-source clouds. Cloud Computing Technology and Science, IEEE International Conference on, 0:554–559, 2011.
[6] IBM Director. infocenter/director/v6r2x/index.jsp?topic=/com.ibm.director.status.helps.doc/fqm0_c_viewing_health_and_status.html.
[7] Saurabh Kumar Garg, Srinivasa K. Gopalaiyengar, and Rajkumar Buyya. SLA-based resource provisioning for heterogeneous workloads in a virtualized cloud datacenter. In ICA3PP (1), pages 371–384, 2011.
[8] Zhenhuan Gong, Xiaohui Gu, and John Wilkes. PRESS: Predictive elastic resource scaling for cloud systems. In CNSM, pages 9–16, 2010.
[9] Evangelia Kalyvianaki, Themistoklis Charalambous, and Steven Hand. Self-adaptive and self-configured CPU resource provisioning for virtualized servers using Kalman filters. In ICAC, pages 117–126, 2009.
[10] Jeffrey O. Kephart and David M. Chess. The vision of autonomic computing. IEEE Computer, 36(1):41–50, 2003.
[11] Daniel A. Menascé and Paul Ngo. Understanding cloud computing: Experimentation and capacity planning;
[12] Swarna Mylavarapu, Vijay Sukthankar, and Pradipta Banerjee. An optimized capacity planning approach for virtual infrastructure exhibiting stochastic workload. In SAC, pages 386–390, 2010.
[13] Norman W. Paton, Marcelo A. T. Aragão, Kevin Lee, Alvaro A. A. Fernandes, and Rizos Sakellariou. Optimizing utility in cloud computing through autonomic workload execution. IEEE Data Eng. Bull., 32(1):51–58, 2009.
[14] Zhiming Shen, Sethuraman Subbiah, Xiaohui Gu, and John Wilkes. CloudScale: Elastic resource scaling for multi-tenant cloud systems. In SoCC, 2011.
[15] Adel Nadjaran Toosi, Rodrigo N. Calheiros, Ruppa K. Thulasiram, and Rajkumar Buyya. Resource provisioning policies to increase IaaS provider's profit in a federated cloud environment. In HPCC, pages 279–287, 2011.
[16] Bhuvan Urgaonkar, Prashant J. Shenoy, Abhishek Chandra, and Pawan Goyal. Dynamic provisioning of multi-tier internet applications. In ICAC, pages 217–228, 2005.
[17] Bhuvan Urgaonkar, Prashant J. Shenoy, Abhishek Chandra, Pawan Goyal, and Timothy Wood. Agile dynamic provisioning of multi-tier internet applications. TAAS, 3(1), 2008.
[18] Jose Vargas and Clint Sherwood. Cloud success secret: Flexible capacity planning. 2010.
[19] William Voorsluys, Saurabh Kumar Garg, and Rajkumar Buyya. Provisioning spot market cloud resources to create cost-effective virtual clusters. In ICA3PP (1), pages 395–408, 2011.
[20] Zhikui Wang, Xiaoyun Zhu, Pradeep Padala, and Sharad Singhal. Capacity and performance overhead in dynamic resource allocation to virtual containers. In Integrated Network Management, pages 149–158, 2007.
[21] Linlin Wu, Saurabh Kumar Garg, and Rajkumar Buyya. SLA-based resource allocation for software as a service provider (SaaS) in cloud computing environments. In CCGRID, pages 195–204, 2011.
[22] Wei Xu, Xiaoyun Zhu, Sharad Singhal, and Zhikui Wang. Predictive control for dynamic resource allocation in enterprise data centers. In NOMS, pages 115–126, 2006.
[23] Xiaoyun Zhu, Donald Young, Brian J. Watson, Zhikui Wang, Jerry Rolia, Sharad Singhal, Bret McKee, Chris Hyser, Daniel Gmach, Rob Gardner, Tom Christian, and Ludmila Cherkasova. 1000 islands: Integrated capacity and workload management for the next generation data center. In ICAC, pages 172–181, 2008.