Ravinder Big Data 4 PDF

This document outlines the steps to create a Java program implementing the Map Reduce Paradigm using Hadoop in a Cloudera Virtual Machine. It details the process of setting up a Java project in Eclipse, creating necessary Java files (WordCount.java, WordMapper.java, WordReducer.java), and adding required libraries. The document also includes code snippets for each Java file and instructions for exporting the project as a JAR file.


PRACTICAL NO:-04

AIM:- Run a Java program based on parallel programming to implement the concept of the MapReduce paradigm.

Step 1: Start the Cloudera Virtual Machine.
Step 2: Open Eclipse (its icon is on the Desktop of the VM).
Step 3: Go to (Menu) > New > Java Project.
Enter the project name as Anirudh, then click Finish.
Name: Anirudh Attri, Roll No: 11212586, Section: 5CSE(C2)

Step 4: Add a new Java file (WordCount.java), as displayed in the image.
(Once you click on the "Class" menu item, give the file name as "WordCount.java".)
Repeat the steps to add 2 more files:
WordMapper.java
WordReducer.java
Step 6: In this step we will add the Hadoop and MapReduce libraries to our project.
Follow the steps as given in the images.

Click on the "Add External JARs" button.

Repeat the same step and add the JAR files from the /usr/lib/hadoop-0.20-mapreduce directory.

Now we are ready for programming.
Step 7: Open WordCount.java (from the Package Explorer on the left) and write the following code:
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.Job;

public class WordCount {
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.printf("Usage: WordCount <input dir> <output dir>\n");
            System.exit(-1);
        }
        @SuppressWarnings("deprecation")
        Job job = new Job();
        job.setJarByClass(WordCount.class); // entry point
        job.setJobName("Word Count");
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(WordMapper.class);
        job.setReducerClass(WordReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        boolean success = false;
        try {
            success = job.waitForCompletion(true); // start the job
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.exit(success ? 0 : 1);
    }
}

Step 8: In the next step we will write the code for WordMapper.java:
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                context.write(new Text(word), new IntWritable(1)); // intermediate (word, 1) pair
            }
        }
    }
}
Open WordMapper.java and write the code shown above.
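To see what the mapper's split-and-filter step emits, the same loop can be run on its own in plain Java, independent of Hadoop (the class name MapperTokenizeDemo and the sample line are illustrative, not part of the practical):

```java
public class MapperTokenizeDemo {
    public static void main(String[] args) {
        String line = "to be  or not to be";
        // same loop as WordMapper.map: split on single spaces, drop empty tokens
        // (the double space above produces an empty token, which the filter skips)
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                System.out.println("(" + word + ", 1)"); // one intermediate (key, value) pair per word
            }
        }
    }
}
```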
Let's code WordReducer.java:
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int wordCount = 0;
        for (IntWritable value : values) {
            wordCount += value.get();
        }
        context.write(key, new IntWritable(wordCount));
    }
}
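Before running on the cluster, the map and reduce logic can be sanity-checked locally. The sketch below mimics the framework's shuffle with a plain HashMap (WordCountLocal is a hypothetical helper for testing the logic only; it is not part of the Hadoop job):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Local simulation of the word-count pipeline: the map step emits (word, 1)
// for each non-empty token, and the reduce step sums the 1s per word.
public class WordCountLocal {
    public static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            // same tokenization as WordMapper: split on spaces, skip empty tokens
            for (String word : line.split(" ")) {
                if (word.length() > 0) {
                    counts.merge(word, 1, Integer::sum); // the reducer's summing step
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> c = count(new String[] {"hello world", "hello hadoop"});
        System.out.println(c); // {hello=2, world=1, hadoop=1}
    }
}
```

The real job produces the same (word, count) pairs, but Hadoop performs the grouping across machines instead of in a single map.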
Let's export the JAR file of the project.
Then click OK and press Next.
Then press OK in the next dialog.
Finally, click on the Finish button.

Go to the Desktop and check whether ParwezH.jar has been generated.

If the file has been generated, let's prepare the input file.
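With the JAR on the Desktop, the job can then be run from the VM terminal. A minimal sketch of the remaining commands (the file name input.txt and the HDFS paths wordcount/input and wordcount/output are examples, not prescribed by the practical; the JAR name is the one from the export step):

```
# create a small local input file
echo "hello world hello hadoop" > input.txt

# copy it into HDFS (paths here are examples)
hadoop fs -mkdir wordcount/input
hadoop fs -put input.txt wordcount/input

# run the job: JAR, main class, input dir, output dir
hadoop jar ParwezH.jar WordCount wordcount/input wordcount/output

# inspect the result
hadoop fs -cat wordcount/output/part-r-00000
```

Note that the output directory must not already exist; Hadoop refuses to overwrite it and will fail the job if it does.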
