
basic HDFS commands

The document provides a comprehensive guide on using basic HDFS commands, including how to check Hadoop version, list directories, manage files, and manipulate permissions. It also covers creating directories, moving files, and using commands like 'fsck' and 'stat' for file system checks and information retrieval. Additionally, it outlines steps for setting up a MapReduce program in Eclipse, including project creation and configuration of necessary JAR files.

Uploaded by

reena naaz
Copyright: © All Rights Reserved

Understanding and using basic HDFS commands

[cloudera@localhost ~]$ su root

Password: training

[root@localhost cloudera]# jps

[root@localhost cloudera]# su training

[cloudera@localhost ~]$ gedit comedy

Hi How are you

Hi Hello

Save and exit: Ctrl+S, then Ctrl+Q

1. Print the Hadoop version

[cloudera@localhost ~]$ hadoop version

Hadoop 0.20.2-cdh3u2
Subversion file:///tmp/topdir/BUILD/hadoop-0.20.2-cdh3u2 -r 95a824e4005b2a94fe1c11f1ef9db4c672ba43cb
Compiled by root on Thu Oct 13 21:51:41 PDT 2011
From source with checksum 644e5db6c59d45bca96cec7f220dda51

2. List the contents of the root directory in HDFS

[cloudera@localhost ~]$ hadoop fs -ls /

Found 6 items
drwxr-xr-x   - hbase    supergroup          0 2023-10-10 21:32 /hbase
drwxr-xr-x   - training supergroup          0 2014-05-05 00:09 /home
drwxrwxrwx   - hue      supergroup          0 2014-06-10 10:54 /tmp
drwxr-xr-x   - hue      supergroup          0 2023-10-10 03:00 /user
drwxr-xr-x   - training supergroup          0 2014-05-04 23:21 /usr
drwxr-xr-x   - mapred   supergroup          0 2023-10-08 23:04 /var

3. Report the amount of space used and available on the currently mounted filesystem

[cloudera@localhost ~]$ hadoop fs -df /

Filesystem         Size       Used        Avail  Use%
/           18611908608  224362496  11882389504    1%
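
The Use% column can be reproduced from the Size and Used columns with integer arithmetic. The sketch below uses the figures from the sample output above. The variable names are illustrative, and the remark about non-DFS space is an assumption about this machine rather than something the output states.

```shell
# Figures copied from the sample "hadoop fs -df /" output above (bytes).
size=18611908608
used=224362496
avail=11882389504

# Percentage of the filesystem in use, truncated to an integer.
use_pct=$(( used * 100 / size ))
echo "Use%: ${use_pct}%"

# Space that is neither used by HDFS nor available to it. On a single-node
# VM this is presumably taken by the local (non-DFS) filesystem.
other=$(( size - used - avail ))
echo "Not covered by Used + Avail: ${other} bytes"
```

Used and Avail do not add up to Size here, which is why the last line reports the difference separately.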

4. Count the number of directories, files and bytes under the paths that match the specified file pattern

[cloudera@localhost ~]$ hadoop fs -count /

2620 2777 199719447 hdfs://localhost/
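
The four columns printed by -count are, in order, the directory count, the file count, the content size in bytes, and the path. A minimal sketch of splitting such a line in a script follows; the variable names are made up for illustration, and the line is the sample output above rather than live output.

```shell
# Sample line copied from the "hadoop fs -count /" output above.
line="2620 2777 199719447 hdfs://localhost/"

# Split the line on whitespace into the positional parameters $1..$4.
set -- $line
dir_count=$1
file_count=$2
content_size=$3
path_name=$4

echo "directories: $dir_count"
echo "files:       $file_count"
echo "bytes:       $content_size"
echo "path:        $path_name"
```

On a machine with Hadoop installed, the first assignment could instead capture the command output, for example line=$(hadoop fs -count /).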

5. Create a new directory named “datascience” below the /user/cloudera directory in HDFS

[cloudera@localhost ~]$ hadoop fs -mkdir /user/cloudera/datascience

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera/datascience

6. Add a text file from the local directory to the new datascience directory in HDFS

[cloudera@localhost ~]$ hadoop fs -put comedy /user/cloudera/datascience

7. See how much space this directory occupies in HDFS

[cloudera@localhost ~]$ hadoop fs -du /user/cloudera/datascience

8. Delete the ‘datascience’ directory and its contents from /user/cloudera

[cloudera@localhost ~]$ hadoop fs -rmr datascience

9. Empty the trash

[cloudera@localhost ~]$ hadoop fs -expunge

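10. copyFromLocal: add a text file from the local directory to the datascience directory in HDFS. Step 8 deleted the directory, so recreate it first.

[cloudera@localhost ~]$ hadoop fs -mkdir /user/cloudera/datascience

[cloudera@localhost ~]$ hadoop fs -copyFromLocal comedy /user/cloudera/datascience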

11. View the contents of your text file

[cloudera@localhost ~]$ hadoop fs -cat /user/cloudera/datascience/comedy

Hi How are you
Hi Hello
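
Steps 5, 6 and 11 (create a directory, upload a file, read it back) can be collected into a small script. The sketch below is only an illustration: the run helper and the DRY_RUN switch are hypothetical names introduced here, and by default the script prints the commands instead of executing them, so it can be tried on a machine without Hadoop.

```shell
# Run a command, or only print it when DRY_RUN=1 (the default here).
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

hdfs_dir=/user/cloudera/datascience   # HDFS directory used in the steps above
local_file=comedy                     # local file created with gedit

run hadoop fs -mkdir "$hdfs_dir"                 # step 5
run hadoop fs -put "$local_file" "$hdfs_dir"     # step 6
run hadoop fs -cat "$hdfs_dir/$local_file"       # step 11
```

Setting DRY_RUN=0 before running the script executes the three commands for real.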

12. copyToLocal: copy the text file from the “datascience” directory in HDFS to the local directory

[cloudera@localhost ~]$ hadoop fs -copyToLocal /user/cloudera/datascience/comedy /home/cloudera/Sample

13. cp: copy files between directories in HDFS

[cloudera@localhost ~]$ hadoop fs -cp /user/cloudera/datascience/comedy Data

14. The ‘-get’ command can be used as an alternative to ‘-copyToLocal’

[cloudera@localhost ~]$ hadoop fs -get /user/cloudera/datascience/comedy /home/cloudera/datatemp

15. Display the last kilobyte of the file “comedy” on stdout

[cloudera@localhost ~]$ hadoop fs -tail /user/cloudera/datascience/comedy

Hi How are you
Hi Hello

16. Use ‘-chmod’ to change the permissions of a file

[cloudera@localhost ~]$ hadoop fs -ls datascience
Found 1 items
-rw-r--r--   1 cloudera supergroup      51553 2015-07-06 09:58 /user/cloudera/datascience/comedy
[cloudera@localhost ~]$ hadoop fs -chmod 600 datascience/comedy
[cloudera@localhost ~]$ hadoop fs -ls datascience
Found 1 items
-rw-------   1 cloudera supergroup      51553 2015-07-06 09:58 /user/cloudera/datascience/comedy
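
Each digit of the octal mode passed to -chmod stands for one rwx triplet (owner, group, others) in the -ls listing. A minimal sketch of that mapping follows; mode_to_string is a hypothetical helper written for this illustration and is not part of Hadoop.

```shell
# Translate a three-digit octal mode (as passed to -chmod) into the
# nine-character permission string shown by -ls.
mode_to_string() {
    out=""
    # Put a space after every character so the digits can be looped over.
    for digit in $(printf '%s' "$1" | sed 's/./& /g'); do
        case "$digit" in
            0) out="${out}---" ;;
            1) out="${out}--x" ;;
            2) out="${out}-w-" ;;
            3) out="${out}-wx" ;;
            4) out="${out}r--" ;;
            5) out="${out}r-x" ;;
            6) out="${out}rw-" ;;
            7) out="${out}rwx" ;;
        esac
    done
    printf '%s\n' "$out"
}

mode_to_string 644   # permissions before the chmod in this step
mode_to_string 600   # permissions after the chmod in this step
```

The two calls print rw-r--r-- and rw-------, matching the permission columns of the two listings (without the leading file-type character).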

17. Use ‘-chown’ to change the owner and group simultaneously. By default, a file is owned by the user who created it and takes the group of its parent directory.

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera/datascience/comedy

[cloudera@localhost ~]$ sudo -u hdfs hadoop fs -chown root:root /user/cloudera/datascience/comedy

18. Use ‘-chgrp’ to change only the group

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera/datascience/comedy

[cloudera@localhost ~]$ sudo -u hdfs hadoop fs -chgrp cloudera /user/cloudera/datascience/comedy

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera/datascience/comedy


19. Move a file from one directory to another

[cloudera@localhost ~]$ hadoop fs -mkdir datascience1

[cloudera@localhost ~]$ hadoop fs -mv /user/cloudera/datascience/comedy /user/cloudera/datascience1

20. Use ‘-setrep’ to change the replication factor of a file (the file was moved to datascience1 in step 19)

[cloudera@localhost ~]$ hadoop fs -setrep -w 2 /user/cloudera/datascience1/comedy

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera/datascience1/comedy

21. touchz: create an empty file in the specified directory

[cloudera@localhost ~]$ hadoop fs -touchz /user/cloudera/datascience/comedy1

22. fsck: check the health of files in the Hadoop file system

fsck accepts several reporting options. Its usage nests them as [-files [-blocks [-locations | -racks]]], so each option is given together with the ones before it:

[cloudera@localhost ~]$ hadoop fsck /user/cloudera/datascience

[cloudera@localhost ~]$ hadoop fsck /user/cloudera/datascience -files

[cloudera@localhost ~]$ hadoop fsck /user/cloudera/datascience -files -blocks

[cloudera@localhost ~]$ hadoop fsck /user/cloudera/datascience -files -blocks -locations

[cloudera@localhost ~]$ hadoop fsck /user/cloudera/datascience -files -blocks -racks

23. stat: print information about a file or directory. The format string selects which field is printed. The examples below use the file moved to datascience1 in step 19.

[cloudera@localhost ~]$ hadoop fs -stat %b /user/cloudera/datascience1/comedy

Prints the file size in bytes.

[cloudera@localhost ~]$ hadoop fs -stat %n /user/cloudera/datascience1/comedy

comedy

Prints the file name.

[cloudera@localhost ~]$ hadoop fs -stat %o /user/cloudera/datascience1/comedy

67108864

Prints the block size.

[cloudera@localhost ~]$ hadoop fs -stat %r /user/cloudera/datascience1/comedy

1

Prints the replication factor of the file.

[cloudera@localhost ~]$ hadoop fs -stat %y /user/cloudera/datascience1/comedy

2023-09-11 16:08:04

Prints the modification date.
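
The block size and file size reported by stat determine how many HDFS blocks a file occupies. The arithmetic can be sketched as follows, using the block size printed by %o above and, as an assumed example, the 51553-byte file size from the sample listing in step 16. The variable names are illustrative.

```shell
block_size=67108864    # bytes, from the %o output above
file_size=51553        # bytes, from the sample -ls listing in step 16

# Block size expressed in MiB.
block_mib=$(( block_size / 1024 / 1024 ))
echo "block size: ${block_mib} MiB"

# Ceiling division: number of blocks needed to hold the file.
blocks=$(( (file_size + block_size - 1) / block_size ))
echo "blocks needed: $blocks"
```

A file smaller than one block still occupies a single block entry, but that block only takes up as much disk space as the file's actual size.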

MapReduce program in Eclipse: steps

1. Create a Java project

File --> New --> Java Project

Ex: Vinay

Expand Vinay --> src and JRE System Library

2. Configure JAR files

Vinay --> src --> Build Path --> Configure Build Path

--> Libraries --> Add External JARs

3. Select the following JARs

/usr/lib/hadoop-0.20/hadoop-core.jar

/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar

4. Create a package

Vinay --> src --> New --> Package

Ex: analytics

5. Create a Java class

Vinay --> src --> analytics --> New --> Class

Ex: WriteToHdfs
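
For reference, the build path configured in steps 2 and 3 corresponds to a Java classpath on the command line. The sketch below only assembles and prints a javac command; it does not run it. The src/ and bin/ directory layout is an assumption made for this illustration and is not stated in the steps above.

```shell
# JAR locations taken from step 3 above.
hadoop_core=/usr/lib/hadoop-0.20/hadoop-core.jar
commons_cli=/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar

# Join the JARs with ':' to form a Java classpath.
classpath="$hadoop_core:$commons_cli"

# Hypothetical layout: source in src/analytics/WriteToHdfs.java, classes in bin/.
echo "javac -classpath $classpath -d bin src/analytics/WriteToHdfs.java"
```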
