Cloud Lab Manual


Ex.No. 1 CREATION OF VIRTUAL MACHINES

Date:

AIM:

To find a procedure to run virtual machines of different configurations and check how
many virtual machines can be run at a particular time.

PROCEDURE:

Installing Ubuntu using Oracle Virtual Box


Step 1:

Open the Oracle VirtualBox Manager and click Create New -> Virtual Machine.

Step 2:

Provide a name for the virtual machine and select the hard disk size for the virtual machine.

Select the storage type as Dynamically allocated and click OK. The
virtual machine will be created.


Step 3:

Select the iso file of the virtual OS Ubuntu and click Start.

Step 4:

The virtual OS Ubuntu starts successfully. Now type “gedit” in the search
box to open the text editor in Ubuntu.

Step 5:

Type your desired C program in the text editor and save it with the extension (.c).

Step 6:

Type “terminal” in the search box to open the command window.


Step 7:

Type the necessary commands to compile and run the C program.

Installing Windows 7 using Oracle Virtual Box

PROCEDURE:

Step 1:

Open the Oracle VirtualBox Manager and click Create New -> Virtual Machine. Provide a name
for the operating system and select the amount of memory to allocate to it.

Step 2:

Select the ISO file of the virtual OS Windows 7 and click Start.

Step 3:

Select the language to use in the operating system and click Install Now.

Step 4:

Select Custom as the installation type for a new installation and allocate disk space
as required. Click Next to start the installation.

Step 5:

After installation, the system restarts to complete the setup.

Step 6:

Provide a user name and password (optional) to gain access to the OS.

Step 7:

Set the time and date for the new operating system.

Step 8:

The new operating system, Windows 7, now opens as the virtual machine.

RESULT:

Thus the procedure to run different virtual machines on a single system using
Oracle Virtual Box is studied and implemented successfully.

Ex.No. 2 EXECUTION OF A SAMPLE PROGRAM IN A VIRTUAL MACHINE

Date:

AIM:

To find a procedure to use the C compiler in the virtual machine and execute a sample
program.

PROCEDURE:

Step 1:

Open the virtual machine in which you want to run the C program.

Step 2:

The text editor used by the Ubuntu operating system is gedit. Open it by
typing gedit in the search box.

The text editor opens.


Step 3:

Type your desired C program in the text editor and save it as a C file using the
extension (.c) for C programs.


Step 4:

Type “terminal” in the search box to open the command window.

Step 5:

Type the necessary commands to compile and run the C program.

(1). cc filename.c to compile the C program.

(2). ./a.out to run the last compiled program and display its output.

RESULT:

Thus the procedure to use the C compiler in the virtual machine and execute a sample program is
implemented successfully.

Ex.No. 3 Installing and Running the Google App Engine on Windows

Date:

AIM:

To install and run the Google App Engine on Windows.

PROCEDURE:

Step 1:

Prerequisites: Python 2.5.4

If you don't already have Python 2.5.4 installed on your computer, download and
install Python 2.5.4 from:

http://www.python.org/download/releases/2.5.4/

Download and Install

You can download the Google App Engine SDK by going to:

http://code.google.com/appengine/downloads.html

and download the appropriate install package.

Download the Windows installer; the simplest thing is to download it to your
Desktop or another folder that you remember.

Double-click the GoogleApplicationEngine installer.

Click through the installation wizard, and it should install the App Engine. If you do not
have Python 2.5, it will install Python 2.5 as well.

Once the install is complete you can discard the downloaded installer.

Making your First Application

Now you need to create a simple application. We could use the “+”
option to have the launcher make us an application – but instead we
will do it by hand to get a better sense of what is going on.

Make a folder for your Google App Engine applications. I am going
to make the folder on my Desktop called “apps”; the path to this
folder is:

C:\Documents and Settings\csev\Desktop\apps

And then make a sub-folder within apps called “ae-01-trivial”; the
path to this folder would be:

C:\Documents and Settings\csev\Desktop\apps\ae-01-trivial

Using a text editor such as JEdit (www.jedit.org), create a file called app.yaml in the
ae-01-trivial folder with the following contents:

application: ae-01-trivial
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
  script: index.py

Note: Please do not copy and paste these lines into your text editor –
you might end up with strange characters – simply type them into your
editor.

Then create a file in the ae-01-trivial folder called index.py with three lines in it:

print 'Content-Type: text/plain'
print ' '
print 'Hello there Chuck'

Then start the GoogleAppEngineLauncher program that can be
found under Applications. Use the File -> Add Existing
Application command and navigate into the apps directory and
select the ae-01-trivial folder. Once you have added the
application, select it so that you can control the application using the
launcher.

Once you have selected your application, press Run. After a few
moments your application will start and the launcher will show a little
green icon next to your application. Then press Browse to open a
browser pointing at your application, which is running at
http://localhost:8080/

Alternatively, paste http://localhost:8080 into your browser and you
should see your application.

Just for fun, edit the index.py to change the name “Chuck” to your
own name and press Refresh in the browser to verify your updates.

RESULT:

Thus the Google App Engine is installed and run on Windows successfully.


Ex.No : 4 Google App Engine Launch the Web Applications
Date :

Aim:
To use the GAE launcher to launch web applications.

Procedure:

Before you can host your website on Google App Engine:

• Create a new Cloud Console project or retrieve the project ID of an existing project to use.

• Install and then initialize the Google Cloud SDK.

Creating a website to host on Google App Engine

Basic structure for the project

This guide uses the following structure for the project:

• app.yaml: Configure the settings of your App Engine application.

• www/: Directory to store all of your static files, such as HTML, CSS, images, and
JavaScript.

  o css/: Directory to store stylesheets.

    ▪ style.css: Basic stylesheet that formats the look and feel of your site.

  o images/: Optional directory to store images.

  o index.html: An HTML file that displays content for your website.

  o js/: Optional directory to store JavaScript files.

  o Other asset directories.

Creating the app.yaml file:

The app.yaml file is a configuration file that tells App Engine how to map URLs to your
static files. In the following steps, you will add handlers that will load www/index.html when
someone visits your website, and all static files will be stored in and called from
the www directory.

Create the app.yaml file in your application's root directory:

• Create a directory that has the same name as your project ID. You can find
your project ID in the Console.

• In the directory that you just created, create a file named app.yaml.

• Edit the app.yaml file and add the following code to the file:

runtime: python27
api_version: 1
threadsafe: true

handlers:
- url: /
  static_files: www/index.html
  upload: www/index.html

- url: /(.*)
  static_files: www/\1
  upload: www/(.*)
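
The routing performed by these two handlers can be sketched in plain Java. This is only an illustration of the URL-to-file mapping, not App Engine's implementation: the class name StaticHandlerSketch and the method resolve are hypothetical, and the sketch assumes that handlers are tried in the order listed and that each pattern must match the whole URL path.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StaticHandlerSketch {
    // Hypothetical helper: returns the file a request path would map to under
    // the two handlers above, or null if neither pattern matches.
    public static String resolve(String urlPath) {
        // First handler: "/" serves www/index.html.
        if (Pattern.matches("/", urlPath)) {
            return "www/index.html";
        }
        // Second handler: "/(.*)" serves www/\1, where \1 is the captured group.
        Matcher m = Pattern.compile("/(.*)").matcher(urlPath);
        if (m.matches()) {
            return "www/" + m.group(1);
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(resolve("/"));
        System.out.println(resolve("/css/style.css"));
    }
}
```

Under these assumptions a request for /css/style.css is served from www/css/style.css, which is why the stylesheet link in index.html can use the path /css/style.css.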

Creating the index.html file

Create an HTML file that will be served when someone navigates to the root page of your
website. Store this file in your www directory.

<html>
<head>
<title>Hello, world!</title>
<link rel="stylesheet" type="text/css" href="/css/style.css">
</head>
<body>
<h1>Hello, world!</h1>
<p>
This is a simple static HTML file that will be served from Google App
Engine.
</p>
</body>
</html>

Deploying your application to App Engine

When you deploy your application files, your website will be uploaded to App Engine. To deploy
your app, run the following command from within the root directory of your application where
the app.yaml file is located:

gcloud app deploy

Viewing your application

To launch your browser and view the app at https://PROJECT_ID.REGION_ID.r.appspot.com,
run the following command:

gcloud app browse

Result :

Thus the GAE launcher is used to launch the web applications successfully.


Ex.No : 5 Simulate a Cloud Scenario Using CloudSim
Date :

Aim :

To simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not
present in CloudSim.

Introduction:

• CloudSim
  ▪ A framework for modeling and simulation of cloud computing
    infrastructures and services
  ▪ Originally built at the Cloud Computing and Distributed Systems (CLOUDS)
    Laboratory, The University of Melbourne, Australia
  ▪ Written entirely in Java
• Main features of CloudSim
  o Modeling and simulation
  o Data centre network topologies and message-passing applications
  o Dynamic insertion of simulation elements
  o Stop and resume of simulation
  o Policies for allocation of hosts and virtual machines
• CloudSim essentials
  ▪ JDK 1.6 or above: http://tinyurl.com/JNU-JAVA
  ▪ Eclipse 4.2 or above: http://tinyurl.com/JNU-Eclipse
  ▪ Alternatively, NetBeans: https://netbeans.org/downloads
  ▪ Up & Running with CloudSim guide: https://goo.gl/TPL7Zh
• CloudSim directory structure
  ▪ cloudsim/ -- top-level CloudSim directory
  ▪ docs/ -- CloudSim API documentation
  ▪ examples/ -- CloudSim examples
  ▪ jars/ -- CloudSim jar archives
  ▪ sources/ -- CloudSim source code
• CloudSim layered architecture

• CloudSim component model classes
  o CloudInformationService.java
  o Datacenter.java, Host.java, Pe.java
  o Vm.java, Cloudlet.java
  o DatacenterBroker.java
  o Storage.java, HarddriveStorage.java, SanStorage.java

• CloudSim major blocks/modules
  o org.cloudbus.cloudsim
  o org.cloudbus.cloudsim.core
  o org.cloudbus.cloudsim.core.predicates
  o org.cloudbus.cloudsim.distributions
  o org.cloudbus.cloudsim.lists
  o org.cloudbus.cloudsim.network
  o org.cloudbus.cloudsim.network.datacenter
  o org.cloudbus.cloudsim.power
  o org.cloudbus.cloudsim.power.lists
  o org.cloudbus.cloudsim.power.models
  o org.cloudbus.cloudsim.provisioners
  o org.cloudbus.cloudsim.util

• CloudSim key components
  o Datacenter
  o DatacenterCharacteristics
  o Host
  o DatacenterBroker
  o RamProvisioner
  o BwProvisioner
  o Storage
  o Vm
  o VmAllocationPolicy
  o VmScheduler
  o Cloudlet
  o CloudletScheduler
  o CloudInformationService
  o CloudSim
  o CloudSimTags
  o SimEvent
  o SimEntity
  o CloudSimShutdown
  o FutureQueue
  o DeferredQueue
  o Predicate and associated classes

CloudSim Elements/Components
Procedure to set up Eclipse and CloudSim on your system

Step 1: Download Eclipse for Windows 64-bit (the Windows x86_64 package) to your local
machine from the following link:

https://www.eclipse.org/downloads/packages/release/kepler/sr1/eclipse-ide-java-developers

Step 2: Download cloudsim-3.0.3 from the GitHub repository to your local machine:

https://github.com/Cloudslab/cloudsim/releases/tag/cloudsim-3.0.3

Step 3: Download Apache Commons Math 3.6.1 (commons-math3-3.6.1-bin.zip) to your local machine:

https://commons.apache.org/proper/commons-math/download_math.cgi

Step 4: Once Eclipse, CloudSim and Apache Commons Math 3.6.1 are downloaded to
your local machine, extract cloudsim-3.0.3 and Apache Commons Math 3.6.1.

Step 5: Navigate to the folder where you have unzipped Eclipse and
open eclipse.exe.

Step 6: Within the Eclipse window, navigate the menu File -> New -> Project to open the
new project wizard.

Step 7: A ‘New Project’ wizard should open. Of the options displayed,
find and select the ‘Java Project’ option, then click ‘Next’.

Step 8: A detailed new project window opens. Here you provide the project name
and the path of the CloudSim project source code, as follows:

Project Name: CloudSim.

Step 9: Unselect the ‘Use default location’ option and then click ‘Browse’ to open the path
where you have unzipped the CloudSim project.

Step 10: Make sure you navigate the path until you can see the bin, docs, examples, etc. folders in
the navigation pane.

Step 11: Once done, click ‘Next’ to go to the next step, i.e. setting up the project
settings.

Step 12: Open the ‘Libraries’ tab. If you do not find commons-math3-3.x.jar (here ‘x’
means the minor version release of the library, which could be 2 or greater) in the list, then
click ‘Add External JARs’ (commons-math3-3.x.jar will be included in the project
from this step).

Step 13: Once you have clicked ‘Add External JARs’, open the path where you have
unzipped the commons-math binaries, select ‘commons-math3-3.x.jar’ and click Open.

Step 14: Ensure the external JAR that you opened in the previous step is displayed in the list and
then click ‘Finish’ (your system may take 2-3 minutes to configure the project).

Step 15: Once the project is configured you can open the ‘Project Explorer’ and start exploring
the CloudSim project. The first time, Eclipse automatically starts building the workspace
for the newly configured CloudSim project, which may take some time depending on the
configuration of the computer system.

The following is the final screen which you will see after CloudSim is configured.

Step 16: To verify the setup, within the ‘Project Explorer’ navigate to the
‘examples’ folder, expand the package ‘org.cloudbus.cloudsim.examples’ and double-click
to open ‘CloudSimExample1.java’.

Step 17: Navigate to the Eclipse menu ‘Run -> Run’ or use the keyboard
shortcut ‘Ctrl + F11’ to execute ‘CloudSimExample1.java’.

Step 18: If it executes successfully, the simulation output is displayed in the
console window of the Eclipse IDE.

Result:

Thus the cloud scenario is simulated using CloudSim in the Eclipse environment successfully.
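
The aim mentions a scheduling algorithm that is not present in CloudSim, while the procedure above only runs a bundled example. As a minimal sketch of one such policy, the plain Java below orders tasks Shortest-Job-First and compares the average waiting time with first-come-first-served order. It deliberately uses no CloudSim classes: Task is a hypothetical stand-in for a cloudlet (an id and a length in million instructions), and the single VM with a fixed MIPS rating is an assumption made for the arithmetic. How the ordering would be wired into a CloudSim broker is not shown.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SjfSketch {
    // Hypothetical stand-in for a cloudlet: an id and a length in million instructions.
    static final class Task {
        final int id;
        final long length;

        Task(int id, long length) {
            this.id = id;
            this.length = length;
        }
    }

    // Returns a new list ordered shortest job first; the input list is left unchanged.
    public static List<Task> shortestJobFirst(List<Task> tasks) {
        List<Task> ordered = new ArrayList<>(tasks);
        ordered.sort(Comparator.comparingLong(t -> t.length));
        return ordered;
    }

    // Average waiting time when tasks run one after another on a single VM
    // with the given MIPS rating (an assumption of this sketch).
    public static double averageWait(List<Task> order, double mips) {
        double clock = 0;
        double totalWait = 0;
        for (Task t : order) {
            totalWait += clock;        // this task waited for everything before it
            clock += t.length / mips;  // then it runs to completion
        }
        return order.isEmpty() ? 0 : totalWait / order.size();
    }

    public static void main(String[] args) {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task(0, 40000));
        tasks.add(new Task(1, 10000));
        tasks.add(new Task(2, 20000));
        System.out.println("FCFS average wait: " + averageWait(tasks, 1000));
        System.out.println("SJF average wait:  " + averageWait(shortestJobFirst(tasks), 1000));
    }
}
```

With these three tasks, running the shortest job first lowers the average wait because the long task no longer delays the two short ones.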


Ex.No : 6 Procedure for File Transfer between Client & Server using a virtual machine
Date :

Aim:
To find a procedure for file transfer between a client and a server using a virtual machine.
Steps:
Step 1: Open a virtual machine to do the file transfer.
Step 2: Write the Java programs for the FTP client and the FTP server.
Step 3: Run the programs.
Source Code:
FTPClient.java
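import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.net.Socket;
import java.util.Scanner;

public class FTPClient
{
    public static void main(String[] args)
    {
        // Connect to the server on the local machine; try-with-resources closes the socket.
        try (Socket s = new Socket("127.0.0.1", 10087);
             Scanner sc = new Scanner(System.in))
        {
            System.out.println("Enter the file name:");
            String fn = sc.next();
            // Send the name of the file the server should write.
            DataOutputStream dos = new DataOutputStream(s.getOutputStream());
            dos.writeUTF(fn);
            // The server replies with the name of the file it has written.
            DataInputStream dis = new DataInputStream(s.getInputStream());
            String input = dis.readUTF();
            System.out.println("Even numbers in " + fn + " are:");
            // Client and server share a file system here, so the file is read directly.
            try (FileInputStream fis = new FileInputStream(input))
            {
                int i;
                while ((i = fis.read()) != -1)
                {
                    System.out.print((char) i);
                }
            }
        }
        catch (IOException e)
        {
            System.out.println("Connection or file error: " + e);
        }
    }
}
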
FTPServer.java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class FTPServer
{
    public static void main(String[] args)
    {
        // Listen on port 10087 and serve a single client.
        try (ServerSocket ss = new ServerSocket(10087))
        {
            System.out.println("Waiting...");
            try (Socket s = ss.accept())
            {
                // Read the output file name requested by the client.
                DataInputStream dis = new DataInputStream(s.getInputStream());
                String input = dis.readUTF();
                DataOutputStream dos = new DataOutputStream(s.getOutputStream());
                // Copy only the even-valued bytes of out.txt into the requested file.
                // The digits '2', '4', '6', '8' and the newline character have even codes.
                try (FileInputStream fis = new FileInputStream("out.txt");
                     FileOutputStream fos = new FileOutputStream(input))
                {
                    int num;
                    while ((num = fis.read()) != -1)
                    {
                        if (num % 2 == 0)
                        {
                            fos.write(num);
                        }
                    }
                }
                // Tell the client which file to read.
                dos.writeUTF(input);
                System.out.println("File is sent to client");
            }
        }
        catch (IOException e)
        {
            System.out.println("Server error: " + e);
        }
    }
}

out.txt
1
2
3
4
5
6
7
8
9
Output:

Result:
Thus the program for the file transfer operation using a virtual machine was successfully executed
and verified.
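
The server's filter tests the numeric value of each byte, not the number written on each line. The sketch below isolates that loop so it can be tried without sockets or files. The class name EvenByteFilterSketch and its methods are hypothetical, and it assumes ASCII text with "\n" line endings. Because the newline character has code 10, which is even, every newline passes the filter along with the digits 2, 4, 6 and 8.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class EvenByteFilterSketch {
    // Copies only the bytes whose numeric value is even, as the server loop does.
    public static void copyEvenBytes(InputStream in, OutputStream out) throws IOException {
        int b;
        while ((b = in.read()) != -1) {
            if (b % 2 == 0) {
                out.write(b);
            }
        }
    }

    // Convenience wrapper for in-memory ASCII text.
    public static String filter(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copyEvenBytes(new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII)), out);
        } catch (IOException e) {
            // In-memory streams do not fail in practice; rethrow unchecked if they do.
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Same content as out.txt, with "\n" line endings assumed.
        System.out.print(filter("1\n2\n3\n4\n5\n6\n7\n8\n9\n"));
    }
}
```

A stricter version would parse each line as an integer and test that value instead, which would also handle multi-digit numbers correctly.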

Ex.No : 7 Find a procedure to launch virtual machine using OpenStack
Date :

Aim :

To find a procedure to launch a virtual machine using TryStack (OpenStack).

Introduction:
• OpenStack was introduced by Rackspace and NASA in July 2010.
• It has a modular architecture
• Designed to easily scale out
• Based on a (growing) set of core services
• OpenStack is an Infrastructure as a Service platform, known as a Cloud Operating System, that takes resources such
as compute, storage, network and virtualization technologies and controls those resources at a data
center level
• The project is building an open source community to share resources and technologies with the goal of
creating a massively scalable and secure cloud infrastructure.
• The software is open source and offers open APIs, including APIs compatible with Amazon's.

The following figure shows the OpenStack architecture

OpenStack architecture
The major components are

1. Keystone
2. Nova
3. Glance
4. Swift
5. Neutron (formerly Quantum)
6. Cinder

• KEYSTONE
  o Identity service
  o Common authorization framework
  o Manages users, tenants and roles
  o Pluggable backends (SQL, PAM, LDAP, IDM, etc.)

• NOVA
  o Core compute service comprised of
    ▪ Compute nodes: hypervisors that run virtual machines
      - Supports multiple hypervisors: KVM, Xen, LXC, Hyper-V and ESX
    ▪ Distributed controllers that handle scheduling, API calls, etc.
      - Native OpenStack API and Amazon EC2 compatible API

• GLANCE
  o Image service
  o Stores and retrieves disk images (virtual machine templates)
  o Supports RAW, QCOW, VHD, ISO, OVF & AMI/AKI
  o Backend storage: file system, Swift, Gluster, Amazon S3

• SWIFT
  o Object storage service
  o Modeled after Amazon's S3 service
  o Provides a simple service for storing and retrieving arbitrary data
  o Native API and S3 compatible API

• NEUTRON
  o Network service
  o Provides a framework for Software Defined Networking
  o Plugin architecture
    ▪ Allows integration of hardware and software based network solutions
      - Open vSwitch, Cisco UCS, standard Linux Bridge, Nicira NVP

• CINDER
  o Block storage (volume) service
  o Provides block storage for virtual machines (persistent disks)
  o Similar to the Amazon EBS service
  o Plugin architecture for vendor extensions
    ▪ NetApp driver for Cinder

• HORIZON
  o Dashboard
  o Provides a simple self-service UI for end users
  o Basic cloud administrator functions
    ▪ Define users, tenants and quotas
    ▪ No infrastructure management

• HEAT: OpenStack Orchestration
  o Provides template-driven cloud application orchestration
  o Modeled after AWS CloudFormation
  o Targeted to provide advanced functionality such as high availability
    and auto scaling
  o Introduced by Red Hat

• CEILOMETER: OpenStack Monitoring and Metering
  o Goal: to provide a single infrastructure to collect measurements from
    an entire OpenStack infrastructure, eliminating the need for multiple agents
    attaching to multiple OpenStack projects
  o Primary targets: metering and monitoring, with extensibility provided

Steps in Installing OpenStack

Step 1:
Download and install the latest version of Oracle VirtualBox and its Extension Pack.
Link : https://virtualbox.org/wiki/downloads

Download the CentOS 7 OVA (Open Virtual Appliance) from
Link : https://linuxvmimages.com/images/centos-7

Import the CentOS 7 OVA (Open Virtual Appliance) into Oracle VirtualBox.

Step 3: Log in to CentOS 7

• Login details
  o User name : centos
  o Password : centos
• To switch to the root user in the terminal
  # sudo su -

Step 4: Installation steps for OpenStack

Step 5: Commands to disable and stop the firewall

# systemctl disable firewalld
# systemctl stop firewalld

Step 6: Commands to disable and stop NetworkManager

# systemctl disable NetworkManager
# systemctl stop NetworkManager

Step 7: Enable and start the network service

# systemctl enable network
# systemctl start network

Step 8: OpenStack will be deployed on your node with the help of the PackStack package provided by
the RDO repository (RPM Distribution of OpenStack). To enable the RDO repository on CentOS 7, run
the command below.

# yum install -y https://rdoproject.org/repos/rdo-release.rpm

Step 9: Update the current packages

# yum update -y

Step 10: Install the OpenStack PackStack package for CentOS

# yum install -y openstack-packstack

Step 11: Start PackStack to install OpenStack Newton

# packstack --allinone

Step 12: Note the user name and password from keystonerc_admin

# cat keystonerc_admin

Step 13: Open the OpenStack dashboard URL shown at the end of the PackStack run and enter the
user name and password to start OpenStack.

OpenStack is successfully launched on your machine.


Result:

Thus the OpenStack Installation is executed successfully.


Ex.No : 8 Install Hadoop single node cluster

Date :

Aim :

To install a Hadoop single-node cluster and run simple applications like word count.

Procedure:

Step 1:
Installing Java is the main prerequisite for Hadoop. Install Java 1.7.
$ sudo apt-get update
$ sudo apt-get install openjdk-7-jdk
$ sudo apt-get install openjdk-7-jre
$ java -version

java version "1.7.0_79"

OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-0ubuntu1.14.04.1)

OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode)

Step 2:

Hadoop needs an SSH server that accepts password authentication (at least during setup).

To install, run:

student@a4cse196:~$ su

Password:

root@a4cse196:/home/student# apt-get install openssh-server

Step 3:

Generate the ssh key

root@a4cse196:/home/student# ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa

Generating public/private rsa key pair.

Created directory '/root/.ssh'.

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

77:a1:20:bb:db:95:6d:89:ce:44:25:32:b6:81:5d:d5 root@a4cse196
The key's random art image is:

+--[ RSA 2048] ---+

| .... |

| o. E |

| oB.o |

| +*+. |

| .S+. |

| .o=. |

| .=+ |

| o=. |

| ..o |

+ +

Step 4:

If the master also acts as a slave, `ssh localhost` should work without a password. Append the public key to the authorized keys:

root@a4cse196:/home/student# cat $HOME/.ssh/id_rsa.pub >>$HOME/.ssh/authorized_keys

Step 5:

Create hadoop group and user:

Step 5.1

root@a4cse196:/home/student# sudo addgroup hadoop

Adding group `hadoop' (GID 1003) ...

Done.

Step 5.2

root@a4cse196:/home/student# sudo adduser --ingroup hadoop hadoop

Adding user `hadoop' ...

Adding new user `hadoop' (1003) with group `hadoop' ...

Creating home directory `/home/hadoop'

Copying files from `/etc/skel' ...

Enter new UNIX password:

Retype new UNIX password:

passwd: password updated successfully

Changing the user information for hadoop


Enter the new value, or press ENTER for the default

Full Name []:

Room Number []:

Work Phone []:

Home Phone []:

Other []:

Is the information correct? [Y/n] Y

root@a4cse196:/home/student#

Step 6:

Copy the Hadoop tar file (hadoop-2.7.0.tar.gz) to your home directory.

Step 7:

Extracting the tar file.

root@a4cse196:/home/student# sudo tar -xzvf hadoop-2.7.0.tar.gz -C /usr/local/lib/

Step 8:

Changing the Ownership

root@a4cse196:/home/student# sudo chown -R hadoop:hadoop /usr/local/lib/hadoop-2.7.0

Step 9:

Create HDFS directories:

root@a4cse196:/home/student# sudo mkdir -p /var/lib/hadoop/hdfs/namenode

root@a4cse196:/home/student# sudo mkdir -p /var/lib/hadoop/hdfs/datanode

root@a4cse196:/home/student# sudo chown -R hadoop /var/lib/hadoop

Step 10:

Check where your Java is installed:

root@a4cse196:/home/student# readlink -f /usr/bin/java

/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java

Step 11:

Open the ~/.bashrc file in gedit:

root@a4cse196:/home/student# gedit ~/.bashrc


Add to ~/.bashrc file:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

export HADOOP_INSTALL=/usr/local/lib/hadoop-2.7.0

export PATH=$PATH:$HADOOP_INSTALL/bin

export PATH=$PATH:$HADOOP_INSTALL/sbin

export HADOOP_MAPRED_HOME=$HADOOP_INSTALL

export HADOOP_COMMON_HOME=$HADOOP_INSTALL

export HADOOP_HDFS_HOME=$HADOOP_INSTALL

export YARN_HOME=$HADOOP_INSTALL

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"

Step 12:

Reload the ~/.bashrc file

root@a4cse196:/home/student# source ~/.bashrc

Step 13:

Modify JAVA_HOME in /usr/local/lib/hadoop-2.7.0/etc/hadoop/hadoop-env.sh:

root@a4cse196:/home/student# cd /usr/local/lib/hadoop-2.7.0/etc/hadoop

root@a4cse196:/usr/local/lib/hadoop-2.7.0/etc/hadoop# gedit hadoop-env.sh

export JAVA_HOME=${JAVA_HOME}

Change this line to the path below:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

Step 14:

Modify /usr/local/lib/hadoop-2.7.0/etc/hadoop/core-site.xml to have something like:

root@a4cse196:/usr/local/lib/hadoop-2.7.0/etc/hadoop# gedit core-site.xml

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:9000</value>
</property>

</configuration>

Step 15:

Modify /usr/local/lib/hadoop-2.7.0/etc/hadoop/yarn-site.xml to have something like:


root@a4cse196:/usr/local/lib/hadoop-2.7.0/etc/hadoop# gedit yarn-site.xml

<configuration>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

</configuration>

Step 16:

Create /usr/local/lib/hadoop-2.7.0/etc/hadoop/mapred-site.xml from template:


root@a4cse196:/usr/local/lib/hadoop-2.7.0/etc/hadoop# cp /usr/local/lib/hadoop-2.7.0/etc/hadoop/mapred-site.xml.template /usr/local/lib/hadoop-2.7.0/etc/hadoop/mapred-site.xml

Step 17:

Modify /usr/local/lib/hadoop-2.7.0/etc/hadoop/mapred-site.xml to have something like:


root@a4cse196:/usr/local/lib/hadoop-2.7.0/etc/hadoop# gedit mapred-site.xml

<configuration>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>
</configuration>

Step 18:

Modify /usr/local/lib/hadoop-2.7.0/etc/hadoop/hdfs-site.xml to have something like:

root@a4cse196:/usr/local/lib/hadoop-2.7.0/etc/hadoop# gedit hdfs-site.xml


<configuration>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>file:/var/lib/hadoop/hdfs/namenode</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>file:/var/lib/hadoop/hdfs/datanode</value>

</property>

</configuration>

Step 19:

Make changes in /etc/profile

$ gedit /etc/profile

JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

PATH=$PATH:$JAVA_HOME/bin

export JAVA_HOME

export PATH

$ source /etc/profile

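Step 20:

Format the HDFS NameNode:

root@a4cse196:/usr/local/lib/hadoop-2.7.0/etc/hadoop# hdfs namenode -format

Step 21:

Switch to the hadoop user and start the HDFS and YARN daemons. Answer yes to the SSH host-key prompts.

start-dfs.sh

yes

yes

start-yarn.sh

Run jps to verify that all the daemons are running:

root@a4cse196:/home/hadoop# jps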

6334 SecondaryNameNode

6498 ResourceManager

6927 Jps

6142 DataNode

5990 NameNode

6696 NodeManager

Step 22:

Browse the web interface for the Name Node; by default it is available at:

http://localhost:50070

Result:
Thus the procedure to set up the single-node Hadoop cluster was successfully done and verified.

Word Count Program Using Map And Reduce

Procedure:
1. Analyze the input file content
2. Develop the code
a. Writing a map function
b. Writing a reduce function
c. Writing the Driver class
3. Compiling the source
4. Building the JAR file
5. Starting the DFS
6. Creating Input path in HDFS and moving the data into Input path
7. Executing the program

Program: WordCount.java

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emits (word, 1) for every whitespace-separated token in a line.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context
                ) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as combiner): sums the counts emitted for each word.
    public static class IntSumReducer
            extends Reducer<Text,IntWritable,Text,IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
                Context context
                ) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    // Driver: args[0] is the HDFS input path, args[1] is the HDFS output path.
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Save the program as

WordCount.java

Step 1: Compile the java program

To compile the MapReduce program we need the hadoop-core-1.2.1.jar file, available at:

https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core/1.2.1

Assuming the jar and the Java file are in the same directory, run the following command to compile:

root@a4cseh160:/# javac -classpath hadoop-core-1.2.1.jar WordCount.java

Step 2: Create a jar file

Syntax:

jar cf jarfilename.jar MainClassName*.class

Output:

root@a4cseh160:/# jar cf wc.jar WordCount*.class

Step 3: Make directory in hadoop file system

Syntax:

hdfs dfs -mkdir directoryname

Output:

root@a4cseh160:/# hdfs dfs -mkdir /user

Step 4: Copy the input file into hdfs


Syntax:

hdfs dfs -put sourcefile destpath

Output:

root@a4cseh160:/# hdfs dfs -put /input.txt /user


Step 5: Run the program

Syntax:

hadoop jar jarfilename main_class_name inputfile outputpath

Output:

root@a4cseh160:/# hadoop jar wc.jar WordCount /user/input.txt /user/out

Input File: (input.txt)


Cloud and Grid Lab. Cloud and Grid Lab. Cloud Lab.

Output:

3 Cloud

3 Lab.

2 Grid

2 and
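
The counts above can be cross-checked without a Hadoop cluster. The following is a minimal local sketch, assuming whitespace tokenization exactly as in TokenizerMapper; the class name LocalWordCountSketch is hypothetical, and the map and reduce phases are collapsed into a single loop over a sorted map rather than run as separate tasks.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LocalWordCountSketch {
    // Tokenizes on whitespace (the "map" step) and sums a 1 per token (the "reduce" step).
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();  // keeps the words sorted by key
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String input = "Cloud and Grid Lab. Cloud and Grid Lab. Cloud Lab.";
        count(input).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Note that "Lab." keeps its trailing full stop because the tokenizer splits only on whitespace, which is why the job reports "Lab." rather than "Lab".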

Step 6: Check the output in the Web UI at http://localhost:50070.

In the Utilities tab, select Browse the file system and select the correct user.

The output is available inside the output folder under /user.

Step 7: To Delete an output folder

Syntax:

hdfs dfs -rm -R outputpath

Output:

root@a4cseh160:/# hdfs dfs -rm -R /user/out

Result:
Thus the number of occurrences of each word was counted successfully using Map and
Reduce tasks.
