Cloud Computing Unit 3
UNIT - III
Data Storage and Cloud Computing: Data Storage – Cloud Storage – Cloud Storage from LANs
to WANs – Cloud Computing Services: Cloud Services – Cloud Computing at Work
1. DATA STORAGE
1.1 Introduction to Enterprise Data Storage
1.2 Data Storage Management
1.3 File Systems
1.4 Cloud Data Stores
1.5 Using Grids for Data Storage
What is Storage?
Storage is a resource that is allocated to organizations to add value. Data storage
management comprises a set of tools to configure storage, back it up and assign it to users
according to defined policies. Service level agreements (SLAs) support clear business
objectives, reduced risk and the mitigation of legal issues.
Maintaining storage devices is a tedious job for storage administrators. They adopt some
utilities to monitor and manage storage devices. Storage Resource Management (SRM) tools
include configuration tools, provisioning tools and measurement tools.
● Configuration tools handle the set-up of storage resources. These tools help to organize
and manage RAID devices by assigning groups, defining levels or assigning spare drives.
● Provisioning tools define and control access to storage resources, preventing a network
user from being able to use any other user’s storage.
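The provisioning idea above can be sketched as follows. This is only a minimal illustration, not the behaviour of any particular SRM product: the `Provisioner` class, its method names, the volume names and the quota figures are all assumptions made for the example.

```python
class Provisioner:
    """Toy provisioning table: each volume has one owner and a quota."""

    def __init__(self):
        # volume name -> {"owner": str, "quota_gb": int, "used_gb": int}
        self._volumes = {}

    def allocate(self, volume, owner, quota_gb):
        # Assign a volume to exactly one user with a fixed quota.
        if volume in self._volumes:
            raise ValueError(f"volume {volume!r} is already allocated")
        self._volumes[volume] = {"owner": owner, "quota_gb": quota_gb, "used_gb": 0}

    def can_access(self, user, volume):
        # A user may only touch storage that was provisioned for them.
        entry = self._volumes.get(volume)
        return entry is not None and entry["owner"] == user

    def write(self, user, volume, size_gb):
        # Reject writes from other users and writes that exceed the quota.
        if not self.can_access(user, volume):
            raise PermissionError(f"{user!r} has no access to {volume!r}")
        entry = self._volumes[volume]
        if entry["used_gb"] + size_gb > entry["quota_gb"]:
            raise RuntimeError(f"quota exceeded on {volume!r}")
        entry["used_gb"] += size_gb
        return entry["used_gb"]


# Hypothetical users and volumes, for illustration only.
prov = Provisioner()
prov.allocate("vol-alice", owner="alice", quota_gb=10)
prov.allocate("vol-bob", owner="bob", quota_gb=5)
print(prov.can_access("alice", "vol-alice"))  # alice owns this volume
print(prov.can_access("alice", "vol-bob"))    # alice does not own this one
```

Keeping ownership and quota in one table means a single check covers both policies before any write is accepted.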
Performance Barrier
Rapid growth in data has caused a parallel increase in the size of databases. With traditional
storage methods, the response time for queries is slow and needs to be reduced. Be it a
social networking site, an enterprise database or a web application, all require faster disk
access to read and write data.
1.3 FILE SYSTEMS
A file system is the structure a computer uses to store data on a hard disk. When we install a
new hard disk, we need to partition it and format it with a file system before storing data.
Three file systems are in use in Windows OS: NTFS, FAT32 and the rarely used FAT.
1.3.1 FAT
The FAT system was first devised in the early years of personal computing. It was planned
for systems with very little RAM and small disks, and it requires far fewer system resources
than other file systems, such as those used by UNIX.
The FAT system has since made a comeback. Thumb (flash) drives have become very common,
and their relatively small capacity makes the FAT system useful. The smallest drives are even
formatted in FAT16.
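A rough calculation shows why FAT fits small disks. The sketch below estimates the largest volume a FAT variant can address as the number of cluster indexes multiplied by the cluster size. It is a simplified upper bound: real FAT variants reserve a few index values, so actual limits are slightly lower. The function name and the example cluster sizes are assumptions made for the illustration.

```python
KIB = 1024


def max_volume_bytes(index_bits, cluster_bytes):
    """Upper bound on addressable space: cluster indexes times cluster size.

    Simplification: reserved index values are ignored, so real limits
    are slightly below this bound.
    """
    return (2 ** index_bits) * cluster_bytes


# Cluster-index widths: FAT16 uses 16 bits, FAT32 uses 28 of its 32 bits.
# The cluster sizes below are example values, not fixed by the format.
fat16_limit = max_volume_bytes(16, 32 * KIB)
fat32_limit = max_volume_bytes(28, 4 * KIB)

print(fat16_limit // 2**30, "GiB")  # FAT16 with 32 KiB clusters
print(fat32_limit // 2**40, "TiB")  # FAT32 with 4 KiB clusters
```

A narrow cluster index keeps the allocation table small, which suits low-capacity flash drives but caps the volume size.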
1.3.2 NTFS
In the 1990s, Microsoft recognized that DOS-based Windows was inadequate for the demands
of business and industry. The company began work on better software that could suit larger
systems. NTFS is much more flexible than FAT: while files are in use, the system areas can
be customized, enlarged or moved as required.
NTFS also has much more security built in. It is not suited to small disks.
XtreemFS is a distributed, replicated, open-source file system. It allows users to mount and
access files via the WWW. With XtreemFS, a user can replicate files across data centres
to reduce network congestion and latency and to increase data availability. Installing
XtreemFS is quite easy, but replicating files is somewhat more difficult.
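The benefit of replicating files across data centres can be illustrated with a small sketch. It does not use the XtreemFS API: the `Replica` class, the `pick_replica` function, the site names and the latency figures are all hypothetical. It only shows how a client with several copies to choose from can read from the nearest one and fall back to another if that site fails.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    site: str           # data centre holding a copy of the file
    latency_ms: float   # assumed round-trip time from the client
    online: bool = True


def pick_replica(replicas):
    """Return the reachable replica with the lowest latency."""
    candidates = [r for r in replicas if r.online]
    if not candidates:
        raise RuntimeError("no replica is currently available")
    return min(candidates, key=lambda r: r.latency_ms)


# Hypothetical sites and latencies.
replicas = [
    Replica("dc-europe", 18.0),
    Replica("dc-asia", 140.0),
    Replica("dc-us", 95.0),
]
print(pick_replica(replicas).site)

# If the nearest site fails, the file is still available elsewhere.
replicas[0].online = False
print(pick_replica(replicas).site)
```

Reading from the closest copy lowers latency, and the remaining copies keep the file available during an outage.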
Kosmos Distributed File System (KFS) offers high performance together with availability and
reliability. It is suited to applications such as search engines, data mining and grid computing.
It is implemented in C++ using standard system components such as the STL, Boost libraries,
aio and log4cpp. KFS is integrated with Hadoop and Hypertable.
CloudFS
1.4 CLOUD DATA STORES
A data store is a data repository in which data are stored as objects. Data stores include
data repositories and flat files that can store data. Data stores can be of different types:
A Distributed Data Store is like a distributed database in which users store information on
multiple nodes. Data stores of this kind are non-relational databases that search data quickly
across a large number of nodes. Examples of this kind of data storage are Google’s BigTable,
Amazon’s Dynamo and Windows Azure Storage.
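One common way to spread keys over many nodes is consistent hashing, the partitioning approach associated with Dynamo. The sketch below is a toy in-memory version written for illustration. It is not the API of BigTable, Dynamo or Windows Azure Storage, and the `HashRing` class, node names and sample key are assumptions.

```python
import bisect
import hashlib


def _hash(value):
    # Stable hash so that every client maps a key to the same ring position.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy key-value store that spreads keys over nodes on a hash ring."""

    def __init__(self, nodes):
        self._ring = sorted((_hash(n), n) for n in nodes)
        self._points = [point for point, _ in self._ring]
        self._data = {n: {} for n in nodes}  # per-node in-memory storage

    def node_for(self, key):
        # Walk clockwise to the first node at or after the key's position,
        # wrapping around to the start of the ring if necessary.
        index = bisect.bisect_left(self._points, _hash(key)) % len(self._ring)
        return self._ring[index][1]

    def put(self, key, value):
        self._data[self.node_for(key)][key] = value

    def get(self, key):
        return self._data[self.node_for(key)].get(key)


# Hypothetical node names and record.
store = HashRing(["node-a", "node-b", "node-c"])
store.put("user:42", {"name": "Asha"})
print(store.node_for("user:42"), store.get("user:42"))
```

Because the node is computed from the key itself, a lookup goes straight to one node instead of searching all of them.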
Established IT organizations have started using advanced technologies to manage the large
volumes of data that come from social computing and data analysis applications.
BigTable
1.5 USING GRIDS FOR DATA STORAGE
Demand for storage prevails in grid computing. Storage for grid computing requires a
common file system that presents a single storage space to all workloads. At present,
grid computing systems use NAS-type storage. NAS provides transparency but limits scale
and storage management capabilities.
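The idea of presenting several storage servers as a single storage space can be sketched with a simple mount table. This is an illustrative assumption about how such a namespace might be resolved, not a description of any specific grid or NAS product. The `UnifiedNamespace` class, the paths and the server names are hypothetical.

```python
class UnifiedNamespace:
    """Toy mount table that presents several storage servers as one tree."""

    def __init__(self):
        self._mounts = {}  # path prefix -> backend server name

    def mount(self, prefix, backend):
        # Normalize the prefix so "/data/" and "/data" are treated alike.
        self._mounts[prefix.rstrip("/") or "/"] = backend

    def resolve(self, path):
        # Choose the longest mounted prefix that contains the path.
        best = None
        for prefix in self._mounts:
            inside = prefix == "/" or path == prefix or path.startswith(prefix + "/")
            if inside and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise FileNotFoundError(path)
        return self._mounts[best]


# Hypothetical servers and paths.
ns = UnifiedNamespace()
ns.mount("/", "nas-1")
ns.mount("/grid/results", "nas-2")
print(ns.resolve("/grid/results/run7.dat"))
print(ns.resolve("/home/user/notes.txt"))
```

Workloads see one path tree, while the table decides which server actually holds each file.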