BIGDATA
Execute Java programs (.JAR files) from MapReduce.
First, make sure Eclipse is working by running a simple program.
Open Eclipse.
To create a Java program in Eclipse, we need to create the following three objects:
1. Java Project
2. Package
3. Class (where we write the Java program)
Step 1:
Right-click in the Package Explorer (left-side pane)
New -> Java Project
Give the project name as "welcome" and click Finish.
Note: The project "welcome" now appears in the Package Explorer.
1 sairavi.bigdata@gmail.com
99520 29030
Step 2:
Right-click on the project "welcome" New -> Package
Give the package name as "welcome" in the Name field.
Make sure the Source folder is "welcome\src" and click Finish.
Note: The package "welcome" is now created in the project's source folder.
Step 3:
Right-click on the project "welcome" New -> Class
Give the class name as "welcome" in the Name field,
make sure the Source folder is "welcome\src" and
the Package is "welcome", and
make sure the "public static void main" box is checked, then click Finish.
Note: The class "welcome" is now created under the package.
The created class file looks like the listing below.
Note: Add the print statement (System.out.println) inside the main method.
package welcome;

public class welcome {
    /**
     * @param args
     */
    public static void main(String[] args)
    {
        System.out.println("Welcome");
    }
}
Make sure no errors are shown in the class editor.
To run the file, right-click in the class editor and choose Run As -> Java Application.
Note: If your class file is error free, you will see "Welcome" in the result pane at the bottom.
Create and execute a .JAR file from MapReduce.
Program description: Count how many times each word occurs in a file.
Prerequisites:
Copy the supplied files (wordcount.java and the "hadoop_jars" archive, both used in the steps below) into Cloudera's home folder, then unzip/extract the "hadoop_jars" archive.
Step 1:
Right-click in the Package Explorer (left-side pane)
New -> Java Project
Give the project name as "wordcount" and click Finish.
Note: The project "wordcount" now appears in the Package Explorer.
Step 2:
Right-click on the project "wordcount" New -> Package
Give the package name as "wordcount" in the Name field.
Make sure the Source folder is "wordcount\src" and click Finish.
Note: The package "wordcount" is now created in the project's source folder.
Step 3:
Right-click on the project "wordcount" New -> Class
Give the class name as "wordcount" in the Name field,
make sure the Source folder is "wordcount\src" and
the Package is "wordcount", and
make sure the "public static void main" box is checked, then click Finish.
Note: The class "wordcount" is now created under the package.
The created class file looks like the listing below.
package wordcount;

public class wordcount {
    /**
     * @param args
     */
    public static void main(String[] args) {
    }
}
Open the supplied file wordcount.java, copy all of its contents, and
paste them over the contents of the wordcount.java class file in Eclipse. Make sure the first line is
"package wordcount;" and save the file (Ctrl+S).
Note: Many red underlines now appear in the class source, because it references Hadoop classes whose JAR files have not yet been added to the project.
To add the referenced JAR files, right-click on the project "wordcount" in the
Package Explorer -> Properties ->
select Java Build Path in the left pane, select the Libraries tab on the right side, and click
Add External JARs. Browse to the folder "hadoop_jars", which we extracted/unzipped in Cloudera's
home path, select all 10 supporting .jar files, and click OK to finish.
Note: Make sure all the underlined errors are gone and the class is error free.
Now right-click on the project "wordcount" in the Package Explorer
-> Export -> Expand Java -> select JAR File
-> click Next -> select "wordcount" from the left pane and click "Browse"
-> Provide the File name as "wordcount.jar"
-> Provide the Folder path as "/home/cloudera"
-> Next -> Finish.
Note: The jar file "wordcount.jar" is now created in "/home/cloudera/".
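Before running the job, it can help to confirm that the exported JAR really contains the class that will be named on the command line. The following is a minimal sketch using only the standard java.util.jar API; the class name JarCheckSketch and its methods are hypothetical helpers for illustration, and the default path assumes the export location used above.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Illustration only: JarCheckSketch is a hypothetical helper, not part of Hadoop or Eclipse.
class JarCheckSketch {

    // Convert a fully qualified class name (wordcount.wordcount)
    // into the entry name used inside a JAR (wordcount/wordcount.class).
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Return true if the JAR at jarPath contains the compiled class.
    static boolean containsClass(String jarPath, String className) throws IOException {
        JarFile jar = new JarFile(jarPath);
        try {
            return jar.getEntry(entryNameFor(className)) != null;
        } finally {
            jar.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults assume the export location and class name used in this tutorial.
        String jarPath = args.length > 0 ? args[0] : "/home/cloudera/wordcount.jar";
        String className = args.length > 1 ? args[1] : "wordcount.wordcount";
        System.out.println(className + " found in " + jarPath + ": "
                + containsClass(jarPath, className));
    }
}
```

If the check reports false, the usual cause is that the package or class was not selected in the export dialog.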
Now we execute the program from MapReduce.
Go to the "/home/cloudera/" folder, then right-click and choose Open Terminal.
[cloudera@localhost ~]$ pwd
/home/cloudera -- present working directory
Create a test file
$ cat > test.txt
this is a test file
this
a
file
Press Ctrl+D to save.
Now the file test.txt is created with the four lines above.
$ hadoop fs -ls / -- lists the folders that are available in HDFS.
Now place the file into Hadoop:
$ hadoop fs -put test.txt /tmp
$ hadoop fs -ls /tmp -- Now you can see that the test.txt file you created has been copied into Hadoop.
Use the command below to execute the jar file in MapReduce.
$ hadoop jar wordcount.jar wordcount.wordcount /tmp/test.txt Target
The MapReduce program runs, counts the words in the test file, and stores the result in the
Target folder.
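The output argument "Target" is a relative path, and HDFS resolves relative paths against the user's HDFS home directory (/user/<username>). That is why the result is listed under /user/cloudera/Target in the next command. The sketch below only mimics that rule in plain Java so the behaviour is easy to see; HdfsPathSketch and resolve are hypothetical names, not Hadoop classes.

```java
// Illustration only: mimics how a relative HDFS path is resolved against
// the user's HDFS home directory. Not part of the Hadoop API.
class HdfsPathSketch {

    // Absolute paths are returned unchanged; relative paths are placed
    // under /user/<user>, the default HDFS home directory.
    static String resolve(String user, String path) {
        if (path.startsWith("/")) {
            return path;
        }
        return "/user/" + user + "/" + path;
    }

    public static void main(String[] args) {
        System.out.println(resolve("cloudera", "Target"));        // /user/cloudera/Target
        System.out.println(resolve("cloudera", "/tmp/test.txt")); // /tmp/test.txt
    }
}
```

Passing an absolute path such as /tmp/Target instead would place the output there rather than under the home directory.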
$ hadoop fs -ls /user/cloudera/Target
Now you can see the file part-00000, which holds the result.
$ hadoop fs -cat /user/cloudera/Target/part-00000
a 2
file 2
is 1
test 1
this 2
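To see how such a result is produced, the three phases of a word count job (map, shuffle, reduce) can be imitated in plain Java. This is a rough sketch for understanding only: it does not use the Hadoop API, it is not the supplied wordcount.java, and the names WordCountSketch, map, shuffle, reduce and count are hypothetical. It assumes words are separated by whitespace.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java imitation of the map, shuffle and reduce phases of a word count.
// Illustration only; this is not Hadoop code.
class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<Map.Entry<String, Integer>>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<String, Integer>(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group the emitted values by key. TreeMap keeps keys sorted.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (Map.Entry<String, Integer> pair : pairs) {
            List<Integer> values = grouped.get(pair.getKey());
            if (values == null) {
                values = new ArrayList<Integer>();
                grouped.put(pair.getKey(), values);
            }
            values.add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Run all three phases over the given lines and return word -> count.
    static Map<String, Integer> count(String[] lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<Map.Entry<String, Integer>>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<String, Integer>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // The four lines typed into test.txt earlier.
        String[] lines = {"this is a test file", "this", "a", "file"};
        for (Map.Entry<String, Integer> entry : count(lines).entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

Running the sketch prints one word per line with its count, in sorted key order. In a real job the map and reduce steps run in parallel across the cluster, and the framework performs the grouping between them.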
Note: Other JAR files can be executed with MapReduce in the same way.