Abstract
Recent technological advances in Grid computing enable the virtualization and dynamic delivery of computing services on demand to realize utility computing. In utility computing, computing services are always available to users whenever the need arises, much like real-world utilities such as electrical power, gas, and water. With this new outsourcing service model, users define their service needs through Service Level Agreements (SLAs) and pay only when they use the services. They do not have to invest in or maintain computing infrastructure themselves and are not constrained to specific computing service providers. Thus, a commercial computing service faces two new challenges: (i) what objectives it needs to achieve in order to support the utility computing model, and (ii) how to evaluate whether these objectives are achieved. To address these two challenges, this paper first identifies four essential objectives that are required to support the utility computing model: (i) manage wait time for SLA acceptance, (ii) meet SLA requests, (iii) ensure reliability of accepted SLAs, and (iv) attain profitability. It then describes two simple and intuitive evaluation methods, (i) separate and (ii) integrated risk analysis, for analyzing the effectiveness of resource management policies in achieving the objectives. Simulation-based evaluation results demonstrate the applicability of separate and integrated risk analysis to assess policies in terms of the objectives. These evaluation results, which constitute an a posteriori risk analysis of policies, can later be used to generate an a priori risk analysis of policies by identifying possible risks for future utility computing situations.
Cite this article
Yeo, C.S., Buyya, R. Integrated Risk Analysis for a Commercial Computing Service in Utility Computing. J Grid Computing 7, 1–24 (2009). https://doi.org/10.1007/s10723-008-9103-2
DOI: https://doi.org/10.1007/s10723-008-9103-2