Fault Tolerance Model For Reliable Cloud Computing
Volume: 1 Issue: 7
_________________________________________________________________________
A.S.Sambare
Dept.of Computer Technology
P.I.E.T, Nagpur
ashishsambare@yahoo.co.in
S.D.Zade
Dept.of Computer Sci. & Engg
P.I.E.T, Nagpur
cdzshrikant@gmail.com
Abstract: Cloud computing has emerged as a platform that grants users direct yet shared access to remote computing
resources and services. Because a cloud must serve many users at the same time, its scheduling strategy should be
designed for multiple tasks. In cloud computing, processing is done on remote computers, so errors are more likely
because of undetermined latency and loose control over the computing nodes. Remote computers should therefore be highly
reliable. This is why a cloud computing infrastructure should be fault tolerant as well as properly scheduled to
perform its tasks. This paper describes a fault tolerance model for cloud computing (FTMC). The FTMC model tolerates
faults on the basis of the reliability of each computing node. A computing node is selected for computation on the
basis of its reliability and can be removed if it does not perform well for applications.
General Terms:
Cloud Computing, Fault Tolerance
Keywords:
Cloud Computing, Fault Tolerance, and Reliability
______________________________________________________*****_________________________________________________
1. INTRODUCTION
Cloud computing is a model for enabling convenient,
on-demand network access to a shared pool of
configurable computing resources (e.g., networks,
servers, storage, applications, and services). Cloud
computing has emerged as a new computing paradigm
which aims to provide reliable, customized and QoS
(Quality of Service) guaranteed dynamic computing
environments for end users. Overall, cloud computing
brings new aspects to computing resource
management: seemingly infinite computing resources
available on demand from the perspective of the end
users; zero upfront commitment from the cloud users;
and short-term usage of any high-end computing
resources [11], [12]. The basic principle of cloud
computing is that user data is not stored locally but in
data centers on the internet. The companies that provide
cloud computing services manage and maintain the
operation of these data centers. Users can access the
stored data at any time, from any terminal equipment
connected to the internet, by using the Application
Programming Interface (API) provided by cloud
providers. Not only storage services but also hardware
and software services are available to the general public
and business markets. The services offered by providers
can be infrastructure, platform or software resources;
such services are called Infrastructure as a Service
(IaaS), Platform as a Service (PaaS) and Software as a
Service (SaaS), respectively. To give services to end
_________________________________________________________________________
to get maximum output from cloud. X. Kong et al. [1,
2] presented a model for virtual infrastructure
performance and fault tolerance, but it is not well suited
to the fault tolerance of real-time cloud applications.
J. Coenen and J. Hooman [3] proposed a model for
fault tolerance in distributed real-time systems, which
targets a non-cloud environment. Ravi Jhawar, Vincenzo
Piuri and Marco Santambrogio, in their paper "A
Comprehensive Conceptual System-Level Approach to
Fault Tolerance in Cloud Computing" [4], describe a
framework in which a user can specify and apply the
desired level of fault tolerance, without requiring any
knowledge about its implementation, with the help of a
dedicated user service layer. Another model is
time-stamped fault tolerance of distributed real-time
systems (RTS) [5], proposed by S. Malik and M. J.
Rehman, which attaches time stamps to the outputs.
All these models mainly deal with fault tolerance
without taking the reliability of computing nodes into
consideration. A fault-tolerant system is one that
continues to work in the presence of faults [4].
Techniques to build efficient and fault-tolerant
applications for Amazon's EC2 are provided in [6].
Another approach, using fault tolerance middleware
that follows a leader/follower replication approach to
tolerate crash faults, has been proposed in [7]. However,
all these techniques either tolerate only a specific kind
of fault or provide a single method of resilience. The
reliability of cloud systems is a major concern among
users. In their paper on an approach for constructing
modular fault-tolerant protocols, M. Hiltunen and R. D.
Schlichting proposed modular protocols and combined
them into a system using hierarchical techniques [10].
2. FAULT TOLERANCE MODEL IN CLOUD
COMPUTING (FTMC)
The proposed system provides a fault tolerance
mechanism. The model, named fault tolerance model
for cloud (FTMC), is based on reliability assessment of
the computing nodes, known as virtual machines (VMs),
in the cloud environment and on fault tolerance of the
real-time applications running on those VMs. A virtual
machine is selected for computation on the basis of its
reliability and can be removed if it does not perform
well for real-time applications. In this scheme, M virtual
machines run M variant algorithms: algorithm X1 runs
on virtual machine 1, X2 runs on virtual machine 2,
and so on up to Xm, which runs on virtual machine m.
The ACCEPTER is responsible for verifying the output
of each node. The outputs are then passed to the
TIMER, which checks the timing of each result. On the
basis of the timing, the RELIABILITY ASSESSOR
calculates and reassigns the reliability of each module.
All the results are then forwarded to the DECISION
MAKER, which selects the output of the node with the
highest reliability.
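One cycle of the ACCEPTER, TIMER, RELIABILITY ASSESSOR and DECISION MAKER chain described above might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names NodeResult and run_cycle, the fixed reward and penalty steps, and the choice to select only among nodes that passed both checks are all hypothetical, since the text does not give the reliability update rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class NodeResult:
    """Output of one variant algorithm on one VM (hypothetical type)."""
    vm_id: int
    value: float
    elapsed_ms: float


def run_cycle(
    results: List[NodeResult],
    reliability: Dict[int, float],
    is_acceptable: Callable[[float], bool],
    deadline_ms: float,
    reward: float = 0.05,   # assumed step; not given in the paper
    penalty: float = 0.05,  # assumed step; not given in the paper
) -> Optional[NodeResult]:
    """Run one cycle and return the selected result, or None if every node failed."""
    passed = []
    for r in results:
        correct = is_acceptable(r.value)        # ACCEPTER: verify the output
        on_time = r.elapsed_ms <= deadline_ms   # TIMER: check the timing
        # RELIABILITY ASSESSOR: reassign reliability (assumed additive rule)
        if correct and on_time:
            reliability[r.vm_id] = min(1.0, reliability[r.vm_id] + reward)
            passed.append(r)
        else:
            reliability[r.vm_id] = max(0.0, reliability[r.vm_id] - penalty)
    # DECISION MAKER: output of the passing node with the highest reliability
    if not passed:
        return None
    return max(passed, key=lambda r: reliability[r.vm_id])
```

For example, with two VMs at reliabilities 0.70 and 0.74 and a 100 ms deadline, a result from the second VM that arrives after 150 ms lowers that VM's reliability, and the output of the first VM is selected.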
2.1 Working of Model
The fault tolerance mechanism has M virtual
machines. Each node takes input data from the input
_________________________________________________________________________
also request to some responsible module (resource
manager or scheduler) to remove one node with
minimum reliability and add a new node.
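The replacement request described above could look roughly like the sketch below. It is an illustration only: the function name replace_weakest_node, the min_reliability threshold that triggers the request, the provision_vm callback standing in for the resource manager or scheduler, and the initial reliability given to a new node are all assumptions that do not come from the paper.

```python
from typing import Callable, Dict, Optional


def replace_weakest_node(
    reliability: Dict[int, float],
    min_reliability: float,           # assumed trigger threshold
    provision_vm: Callable[[], int],  # hypothetical resource-manager hook
    initial_reliability: float = 1.0, # assumed starting value for a new node
) -> Optional[int]:
    """Remove the node with minimum reliability and add a new one.

    Returns the id of the removed node, or None if no replacement was needed.
    """
    if not reliability:
        return None
    # Find the node with the lowest current reliability.
    weakest = min(reliability, key=reliability.get)
    if reliability[weakest] >= min_reliability:
        return None  # every node is still above the threshold
    del reliability[weakest]
    # Ask the resource manager (here: a callback) for a fresh node.
    new_id = provision_vm()
    reliability[new_id] = initial_reliability
    return weakest
```

With reliabilities of 0.62, 0.68 and 0.90 and a threshold of 0.65, the node at 0.62 is removed and a new node is added; a second call leaves the pool unchanged because the lowest remaining value, 0.68, is above the threshold.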
3. RESULTS & DISCUSSION
After applying the FTMC algorithm, the following results
were obtained.
Cycle | VM1  | Reliability | VM2  | Reliability
      | Fail | 0.69        | Pass | 0.74
      | Fail | 0.64        | Pass | 0.70
      | Fail | 0.62        | Fail | 0.68
      | Pass | 0.70        | Pass | 0.74
      | Fail | 0.65        | Pass | 0.70
5. REFERENCES
[1] X. Kong, J. Huang, C. Lin, P. D. Ungsunan,
Performance, Fault-tolerance and Scalability Analysis
of Virtual Infrastructure Management System, 2009
IEEE International Symposium on Parallel and
Distributed Processing with Applications, Chengdu,
China, August 9-12, 2009.
[2] X. Kong, J. Huang, C. Lin, Comprehensive
Analysis of Performance, Fault-tolerance and
Scalability in Grid Resource Management System,
2009 Eighth International Conference on Grid and
Cooperative Computing, Lanzhou, China.
[3] J. Coenen, J. Hooman, A Formal Approach to Fault
Tolerance in Distributed Real-Time Systems,
Department of Mathematics and Computing Science,
Eindhoven University of Technology, Netherlands.
[4] R. Jhawar, V. Piuri, M. Santambrogio, A
Comprehensive Conceptual System-Level Approach to
Fault Tolerance in Cloud Computing, 2012 IEEE, DOI
10.1109/SysCon.2012.6189503.
[5] S. Malik, M. J. Rehman, Time Stamped Fault
Tolerance in Distributed Real Time Systems, IEEE
International Multitopic Conference, Karachi, Pakistan,
2005.
[6] J. Barr, A. Narin, and J. Varia, Building
Fault-Tolerant Applications on AWS, October 2011.
[Online]. Available:
http://media.amazonwebservices.com/AWSBuilding-Fault-tolerant-application.pdf
[7] W. Zhao, P. M. Melliar-Smith, and L. E. Moser,
Fault Tolerance Middleware for Cloud Computing, in
Proceedings of the 2010 IEEE 3rd International
Conference on Cloud Computing, ser. CLOUD '10.
Washington, DC, USA: IEEE Computer Society, 2010,
pp. 67-74.
[8] K. V. Vishwanath and N. Nagappan,
Characterizing cloud computing hardware reliability,
in Proceedings of the 1st ACM symposium on Cloud
computing, ser. SoCC '10. New York, NY, USA:
ACM, 2010, pp. 193-204.
[9] Amazon elastic compute cloud.
http://aws.amazon.com/ec2/.
IJRITCC | JULY 2013, Available @ http://www.ijritcc.org