3 MapReduce Program Ex Code

The document explains the structure of a MapReduce program, which consists of three main parts: Mapper Phase Code, Reducer Phase Code, and Driver Code. It provides detailed Java code for both the Mapper and Reducer classes, illustrating how to tokenize input text and aggregate word counts. Additionally, it outlines the configuration and execution of the MapReduce job using Hadoop, including input/output paths and job settings.

Uploaded by

kajalyadav102703

MapReduce Program

Explanation of MapReduce Program

The entire MapReduce program can be fundamentally divided into three parts:

Mapper Phase Code

Reducer Phase Code

Driver Code
Mapper code:
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each call receives one line of the input file.
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            // Reuse the Text object as the output key and emit (word, 1).
            value.set(tokenizer.nextToken());
            context.write(value, new IntWritable(1));
        }
    }
}
• We have created a class Map that extends the class Mapper, which is already defined in the MapReduce framework.
• We define the data types of the input and output key/value pairs after the class declaration, using angle brackets.
• Both the input and the output of the Mapper are key/value pairs.
• Input:
  • The key is the byte offset of each line in the text file: LongWritable
  • The value is each individual line (as shown in the figure at the right): Text
• Output:
  • The key is a tokenized word: Text
  • The value is hardcoded to 1 in our case: IntWritable
  • Example: Dear 1, Bear 1, etc.
• The Java code tokenizes each line into words and assigns each word a hardcoded value of 1.
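The map step described above can be tried without a Hadoop cluster. The following is a minimal plain-Java sketch, not the Hadoop API: the class MapPhaseSketch and the method mapLine are hypothetical names introduced here for illustration, and ordinary String and Integer values stand in for Text and IntWritable.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java illustration of the map step. MapPhaseSketch and mapLine are
// hypothetical names for this sketch, not part of the Hadoop API.
class MapPhaseSketch {

    // Mirrors Map.map(): split one input line on whitespace and emit (word, 1).
    static List<Map.Entry<String, Integer>> mapLine(long offset, String line) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            // The offset key is ignored, just as in the word-count mapper.
            emitted.add(new SimpleEntry<>(tokenizer.nextToken(), 1));
        }
        return emitted;
    }

    public static void main(String[] args) {
        // One input record: offset 0, line "Dear Bear".
        for (Map.Entry<String, Integer> pair : mapLine(0L, "Dear Bear")) {
            System.out.println(pair.getKey() + " " + pair.getValue());
        }
    }
}
```

Running main prints one pair per token, matching the "Dear 1, Bear 1" example above. A line containing only whitespace produces no pairs at all.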
Reducer Code:
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Add up all the 1s that the mappers emitted for this word.
        int sum = 0;
        for (IntWritable x : values) {
            sum += x.get();
        }
        // Emit (word, total count).
        context.write(key, new IntWritable(sum));
    }
}
• We have created a class Reduce that extends the class Reducer, just as Map extends Mapper.
• We define the data types of the input and output key/value pairs after the class declaration, using angle brackets, as done for the Mapper.
• Both the input and the output of the Reducer are key/value pairs.
• Input:
  • The key is one of the unique words produced by the sorting and shuffling phase: Text
  • The value is a list of integers corresponding to each key: IntWritable
  • Example: Bear, [1, 1], etc.
• Output:
  • The key is each unique word present in the input text file: Text
  • The value is the number of occurrences of that word: IntWritable
  • Example: Bear, 2; Car, 3, etc.
• We have aggregated the values in the list corresponding to each key and produced the final answer.
• The reduce() method is called once for each unique word. The number of reducer tasks is a separate setting: it defaults to 1 and can be changed with the mapreduce.job.reduces property in mapred-site.xml or with job.setNumReduceTasks() in the driver.
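The grouping and summing described above can also be sketched in plain Java. This is an illustration under stated assumptions, not Hadoop code: ReducePhaseSketch, shuffle, and reduce are hypothetical names, a sorted TreeMap stands in for the framework's sort-and-shuffle phase, and the input words are chosen only to reproduce the "Bear, 2; Car, 3" example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of shuffle and reduce. All names here are
// hypothetical and exist only for this sketch.
class ReducePhaseSketch {

    // Stand-in for sort and shuffle: collect a 1 for every occurrence of a
    // word, grouped by word, with the keys kept in sorted order.
    static Map<String, List<Integer>> shuffle(List<String> mappedWords) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String word : mappedWords) {
            grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // Mirrors Reduce.reduce(): sum the list of values for one key.
    static int reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int x : values) {
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Bear", "Car", "Car", "Bear", "Car");
        // reduce() runs once per unique key, as in the framework.
        shuffle(words).forEach((word, ones) ->
                System.out.println(word + "\t" + reduce(word, ones)));
    }
}
```

Here shuffle turns the mapper output into Bear, [1, 1] and Car, [1, 1, 1], and reduce collapses each list into a single count.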
Driver Code:

Configuration conf = new Configuration();
// Job.getInstance replaces the new Job(conf, name) constructor, which is deprecated in Hadoop 2 and later.
Job job = Job.getInstance(conf, "My Word Count Program");
job.setJarByClass(WordCount.class);
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
Path outputPath = new Path(args[1]);
// Configure the input/output paths from the filesystem into the job.
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, outputPath);
• In the driver class, we set the configuration of our MapReduce job to run in Hadoop.
• We specify the name of the job and the output key/value data types. setOutputKeyClass() and setOutputValueClass() apply to both the mapper and the reducer unless the map output types are set separately.
• We also specify the names of the mapper and reducer classes.
• The paths of the input and output folders are also specified.
• The method setInputFormatClass() specifies how a Mapper reads the input data, that is, what the unit of work is. Here, we have chosen TextInputFormat so that the mapper reads a single line at a time from the input text file.
• The main() method is the entry point for the driver. In this method, we instantiate a new Configuration object for the job.
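To make the unit of work concrete, the sketch below shows the kind of (offset, line) records a line-oriented input format hands to the mapper. It is a simplified plain-Java illustration, not TextInputFormat itself: LineRecordSketch and toRecords are hypothetical names, and the sketch assumes UTF-8 text with "\n" line endings held entirely in memory.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration of line records keyed by byte offset.
// LineRecordSketch and toRecords are hypothetical names for this sketch.
class LineRecordSketch {

    // Assumption: UTF-8 content with "\n" line endings.
    static Map<Long, String> toRecords(String fileContents) {
        Map<Long, String> records = new LinkedHashMap<>();
        long offset = 0;
        for (String line : fileContents.split("\n")) {
            records.put(offset, line);
            // Advance past the line's bytes plus the one-byte newline.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        // Each entry corresponds to one map() call: key = offset, value = line.
        toRecords("Dear Bear\nCar Car\n").forEach((offset, line) ->
                System.out.println(offset + "\t" + line));
    }
}
```

For this two-line input the keys are 0 and 10, because the first line occupies nine bytes plus its newline.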
// Complete WordCount program
package co.edureka.mapreduce;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                value.set(tokenizer.nextToken());
                context.write(value, new IntWritable(1));
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable x : values) {
                sum += x.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Job.getInstance replaces the new Job(conf, name) constructor, which is deprecated in Hadoop 2 and later.
        Job job = Job.getInstance(conf, "My Word Count Program");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        Path outputPath = new Path(args[1]);
        // Configure the input/output paths from the filesystem into the job.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, outputPath);
        // Delete any existing output path from HDFS so that we don't have to delete it explicitly.
        outputPath.getFileSystem(conf).delete(outputPath, true);
        // Exit with status 0 if the job succeeds and 1 if it fails.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
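Run the MapReduce code:

• The command for running a MapReduce code is:

hadoop jar hadoop-mapreduce-example.jar WordCount /sample/input /sample/output

• Because WordCount is declared in the package co.edureka.mapreduce, the class argument may need to be the fully qualified name co.edureka.mapreduce.WordCount, unless the jar's manifest already names the main class.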
References

https://www.edureka.co/blog/mapreduce-tutorial/

https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html

https://www.tutorialspoint.com/hadoop/hadoop_mapreduce.htm

https://www.geeksforgeeks.org/mapreduce-understanding-with-real-life-example/
