Module10-BigData Guide v1.0

The document discusses developing and running a MapReduce program on AWS Elastic MapReduce. It provides steps to create a MapReduce Java project in Eclipse with Map and Reduce classes, package it into a JAR file, and then run the JAR on EMR by specifying the input and output locations.

AWS Development Certification Training www.edureka.co/aws-development

Module 10: Big Data and Analytics

Hands-on

© Brain4ce Education Solutions Pvt. Ltd.

Table of Contents
Installing and developing a MapReduce program
Running an Elastic MapReduce job

Installing and developing a MapReduce program

Step 1: Create a custom JAR as described below

• In Eclipse (or whichever IDE you use), create a simple Java project named "WordCount".
• Create a Java class named Map and override the map method as below:

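Map.java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Emits (word, 1) for every whitespace-separated token in each input line.
public class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, one);
        }
    }
}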
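To see what the mapper emits without a Hadoop installation, the tokenizing loop can be tried on its own. The following is only a sketch in plain Java: the class name MapPhaseSketch and the method mapLine are made up for illustration, and a list of pairs stands in for context.write.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map.Entry;
import java.util.StringTokenizer;

// Hypothetical stand-in for the map phase; not part of the WordCount project.
public class MapPhaseSketch {
    // Mirrors the body of Map.map(): one (word, 1) pair per whitespace-separated token.
    static List<Entry<String, Integer>> mapLine(String line) {
        List<Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Tokens are case-sensitive and keep punctuation, so "Hello" and "hello," stay distinct.
        for (Entry<String, Integer> pair : mapLine("Hello hello, world")) {
            System.out.println(pair.getKey() + " -> " + pair.getValue());
        }
    }
}
```

Because StringTokenizer splits only on whitespace, words that differ in case or trailing punctuation are counted separately unless the mapper normalizes them first.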

• Create a Java class named Reduce and override the reduce method as below:

Reduce.java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sums the counts emitted for each word and writes (word, total).
public class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
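The summing step can likewise be checked in isolation. This is a minimal sketch in plain Java, assuming the framework has already grouped the values for one key; ReducePhaseSketch and sumCounts are hypothetical names, and Integer stands in for IntWritable.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the reduce phase; not part of the WordCount project.
public class ReducePhaseSketch {
    // Mirrors the loop in Reduce.reduce(): add up every count received for one key.
    static int sumCounts(Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        // After the shuffle, the reducer would receive something like ("hello", [1, 1, 1]).
        List<Integer> countsForHello = Arrays.asList(1, 1, 1);
        System.out.println("hello\t" + sumCounts(countsForHello));
    }
}
```

Adding the values rather than counting them keeps the result correct even when some incoming values are partial sums, for example if a combiner has already merged counts on the map side.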

• Create a Java class named WordCount and define the main method as below:

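WordCount.java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {
    // args[0] is the input location, args[1] is the output location.
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Job.getInstance replaces the deprecated new Job(conf, name) in Hadoop 2 and later.
        Job job = Job.getInstance(conf, "wordcount");
        job.setJarByClass(WordCount.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1); // non-zero exit if the job fails
    }
}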
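Before packaging the job, it can help to know what output to expect. The sketch below imitates the whole pipeline in plain Java on two sample lines; it is an approximation for illustration only, with a sorted in-memory map standing in for the shuffle and the hypothetical class LocalWordCount standing in for the job. Each result line uses the key, a tab, and the value, which is how TextOutputFormat writes records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical local imitation of the WordCount job; it does not use Hadoop.
public class LocalWordCount {
    static List<String> run(List<String> lines) {
        // Map and shuffle: record a 1 for each token under its word; TreeMap keeps keys sorted.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                grouped.computeIfAbsent(tokenizer.nextToken(), k -> new ArrayList<>()).add(1);
            }
        }
        // Reduce: sum each group and format it as key<TAB>value.
        List<String> output = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            output.add(entry.getKey() + "\t" + sum);
        }
        return output;
    }

    public static void main(String[] args) {
        // Prints four lines: be 2, not 1, or 1, to 2 (tab-separated).
        run(Arrays.asList("to be or", "not to be")).forEach(System.out::println);
    }
}
```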

• Export the WordCount project as a JAR file (for example, WordCount.jar) using Eclipse and save it
to a location on disk. Make sure that you specify the main class (WordCount) when exporting.
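One way to confirm that the export recorded the main class is to read the Main-Class attribute from the JAR manifest. The sketch below uses the standard java.util.jar API on manifest text held in memory; the class name ManifestCheck, the method mainClassOf, and the sample manifest text are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper for inspecting a manifest; not part of the WordCount project.
public class ManifestCheck {
    // Returns the Main-Class value from manifest text, or null if it is not declared.
    static String mainClassOf(String manifestText) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample manifest text, similar to what an export with a main class would contain.
        String sample = "Manifest-Version: 1.0\nMain-Class: WordCount\n";
        System.out.println(mainClassOf(sample)); // prints WordCount
    }
}
```

For a JAR on disk, new java.util.jar.JarFile(path).getManifest() returns the same Manifest type, so the attribute lookup is unchanged.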

Running an Elastic MapReduce job

Step 2: Run a MapReduce job with a custom JAR

• Sign in to the AWS Management Console and open the Amazon Elastic MapReduce
console at https://console.aws.amazon.com/elasticmapreduce/
• Click Create New Job Flow.
• On the DEFINE JOB FLOW page, enter the following details:
» Job Flow Name = WordCountJob
» Select Run your own application
» Select Custom JAR in the drop-down list
» Click Continue
• On the SPECIFY PARAMETERS page, enter values in the boxes using the following
values as a guide, and then click Continue.
» JAR Location = bucketName/jarFileLocation
» JAR Arguments = s3n://bucketName/inputFileLocation s3n://bucketName/outputpath
(two space-separated arguments: the input location, then the output location, which
WordCount reads as args[0] and args[1]; the output location must not already exist)
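A mistyped argument only shows up after the job flow has started, so a small check at the top of main can fail early. The following is a sketch under stated assumptions: JobArgsSketch, bucketOf, and check are hypothetical names, and the check accepts only the s3 and s3n schemes because those are what this guide's arguments use.

```java
import java.net.URI;

// Hypothetical argument check; not part of the WordCount project.
public class JobArgsSketch {
    // Returns the bucket name of an s3:// or s3n:// location, or throws if it is not one.
    static String bucketOf(String location) {
        URI uri = URI.create(location);
        String scheme = uri.getScheme();
        if (scheme == null || !(scheme.equals("s3") || scheme.equals("s3n"))) {
            throw new IllegalArgumentException("Not an S3 location: " + location);
        }
        String bucket = uri.getAuthority();
        if (bucket == null || bucket.isEmpty()) {
            throw new IllegalArgumentException("Missing bucket name: " + location);
        }
        return bucket;
    }

    // Expects exactly two arguments: the input location, then the output location.
    static void check(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException("Usage: WordCount <input> <output>");
        }
        bucketOf(args[0]);
        bucketOf(args[1]);
    }

    public static void main(String[] args) {
        String[] sample = {"s3n://bucketName/inputFileLocation", "s3n://bucketName/outputpath"};
        check(sample);
        System.out.println(bucketOf(sample[0])); // prints bucketName
    }
}
```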
