Lab 1 Cloud Computing Virtualization: Jinnah University For Women Instructor Engr S M Asim Ali
Cloud Computing
Virtualization
Jinnah University for Women
Instructor Engr S M Asim Ali
TASK LIST
What is Virtualization?
Show your understanding through two examples
LAB 2
Cloud Computing
Services
Jinnah University for Women
Instructor Engr S M Asim Ali
TASK LIST
Which operating system would you prefer for creating a virtual
environment?
List the services of the Microsoft operating system or
Linux that support virtualization
LAB 3
Cloud Computing
Hadoop as a tool for MapReduce
Jinnah University for Women
Instructor Engr S M Asim Ali
TASK LIST
Introduction
Data Grid vs. Computing Grid
Grid Computing
Cloud Computing
Data Grid (Hadoop File System)
Computing Grid (MapReduce)
Counting of Words
Conclusion
Motivation
Count how frequently each word appears
in the MEDLINE corpus (18 million texts)
Motivation
I want to extend my
research to another
corpus
Need more computing resources
Data Grid vs. Computing Grid
Data Grid:
distributed data storage
controlled sharing and management of large amounts of
distributed data.
Computing Grid:
Parallel execution
divide pieces of a program among several computers
Data Grid + Computing Grid
Grid Computing
Grid Computing
Figure: The Grid, with Master and Slaves
Grid Computing
Motivation: high performance, improved resource
utilization
Aims to create the illusion of a simple yet powerful computer
out of a large number of heterogeneous systems
Tasks are submitted to the grid and distributed across its nodes
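The master/slave idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: threads on one machine stand in for grid nodes, and the names run_on_grid and make_sum_task are hypothetical helpers, not part of any grid framework.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_grid(tasks, num_workers=4):
    """Master: hand each task to a free worker and collect results in order."""
    # Assumption: a thread pool stands in for the slave nodes of a real grid.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(lambda task: task(), tasks))


def make_sum_task(numbers):
    """Build one independent piece of work (hypothetical example task)."""
    return lambda: sum(numbers)


# Split one large job (summing 1..10) into pieces, one per task.
chunks = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]
partial_sums = run_on_grid([make_sum_task(c) for c in chunks])
total = sum(partial_sums)
print(partial_sums, total)
```

The caller sees a single function call, while the pieces of work run on several workers; that is the "illusion of one computer" in miniature.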
Cloud Computing
"The interesting thing about cloud
computing is that we've redefined
cloud computing to include everything
that we already do."
Larry Ellison,
during Oracle's Analyst Day
Cloud Computing
Pay-as-you-go
No initial investments
Reduced operation costs
Scalability
Availability
Cloud Computing - Open Issues
Bandwidth and latency
Lack of standards and portability
Black-box implementations
Security and lack of control
Immature tools and framework support
Legal issues (ownership, auditing, etc.)
Limited Service Level Agreements (SLAs)
Data Grid (Hadoop FS - Overview)
Figure: HDFS overview
Namenode (master node): metadata (Name, .., ..), index
Datanodes (slave nodes): block ops, replication, caching of data
Client: asks for a specific text
Data Grid (HDFS - Replication Data)
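As a rough illustration of how a Namenode index and block replication fit together, here is a toy Python model. Everything in it is an assumption made for the sketch: the class ToyNamenode, the round-robin placement rule, and the node names are hypothetical and do not reflect HDFS's real placement policy or API.

```python
class ToyNamenode:
    """Toy master node: keeps only metadata, never the data itself."""

    def __init__(self, datanodes, replication=3):
        if replication > len(datanodes):
            raise ValueError("need at least as many datanodes as replicas")
        self.datanodes = list(datanodes)
        self.replication = replication
        self.index = {}   # (file name, block number) -> datanodes holding a copy
        self._next = 0    # rotating start position for placement

    def add_file(self, name, num_blocks):
        """Record where each block of a file is stored."""
        for block in range(num_blocks):
            # Assumption: simple round-robin placement, not the real HDFS policy.
            replicas = [
                self.datanodes[(self._next + i) % len(self.datanodes)]
                for i in range(self.replication)
            ]
            self.index[(name, block)] = replicas
            self._next += 1

    def locate(self, name, block, failed=()):
        """Answer a client: which live datanodes hold this block?"""
        return [node for node in self.index[(name, block)] if node not in failed]


namenode = ToyNamenode(["dn1", "dn2", "dn3", "dn4"], replication=3)
namenode.add_file("medline.txt", num_blocks=2)
print(namenode.locate("medline.txt", 0))
print(namenode.locate("medline.txt", 0, failed={"dn1"}))
```

Because every block is stored on several datanodes, the client can still read block 0 after dn1 fails; the Namenode simply returns the remaining copies.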
Counting Words in Text Files
Split-Operation: the input is split into files
Map-Operation: countWords(File) is run on each file
       File 1   File 2   File 3   File 4
w1:    1        3        2        0
w2:    0        5        1        8
w3:    6        2        3        4
w4:    7        2        3        5
w5:    0        1        0        0
Reduce-Operation: the counts for each word are summed across files
w1: 6
w2: 14
w3: 15
w4: 17
w5: 1
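The split, map, and reduce steps of the word count can be sketched in plain Python. This is a single-machine illustration, not Hadoop code: count_words mirrors the countWords(File) step from the slide, merge_counts is a hypothetical helper, and the four short strings stand in for the split files.

```python
from collections import Counter
from functools import reduce


def count_words(text):
    """Map step: count the words of one split."""
    return Counter(text.lower().split())


def merge_counts(left, right):
    """Reduce step: add up the per-split counts word by word."""
    return left + right


# Split step: assume the corpus has already been cut into four pieces.
files = ["grid cloud grid", "cloud hadoop", "grid hadoop hadoop", "cloud"]

per_file = [count_words(text) for text in files]   # one Counter per split
totals = reduce(merge_counts, per_file, Counter())  # combined word counts
print(totals)
```

In a real grid each count_words call could run on a different node, since the splits do not depend on each other; only the final merge needs all partial results.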
Advantages of Hadoop
Written purely in Java; requires installation of Cygwin under
Windows
Available under the Apache 2.0 license
Usually offers only one implementation for the different
features of a grid framework
Can also use file systems other than Hadoop FS
Very flexible implementation of MapReduce
Supports only FileSplit for the split operation out of the box
Better suited for computations where
large data collections must be handled
the reduce operation is more than a simple aggregation of
the map output