Bda A2
Hadoop
A Hadoop cluster architecture consists of a data centre, racks, and the nodes that actually execute the jobs: the data centre consists of racks, and each rack consists of nodes. A medium to large cluster uses a two- or three-level Hadoop cluster architecture built with rack-mounted servers. The servers within a rack are interconnected through 1 gigabit Ethernet (1 GigE). Each rack-level switch in a Hadoop cluster is connected to a cluster-level switch; the cluster-level switches are in turn connected to other cluster-level switches or uplink to other switching infrastructure.
If a node fails, the replicated copy of the data present on another node in the cluster can be used for analysis.
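For example, once data has been loaded into HDFS, the replica locations of a file can be inspected with fsck (the file path here is only illustrative):
$ hdfs fsck /user/hadoop/sample.txt -files -blocks -locations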
Execution:
1. Download Hadoop 3.3.6 from this website: https://hadoop.apache.org/release/3.3.6.html
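For instance, the release tarball can be fetched directly with wget (the exact mirror URL may vary):
$ wget https://downloads.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz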
2. Extract the tar file using the tar command:
$ tar -xzf hadoop-3.3.6.tar.gz
3. Rename the extracted directory:
$ mv hadoop-3.3.6 hadoop
4. Open the hadoop-env.sh file in the nano editor and edit the JAVA_HOME variable.
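For example, on a machine with OpenJDK 8 installed (the exact path depends on the JDK in use):
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64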
7. Create a hadoop user
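On Ubuntu this can be done with, for example:
$ sudo adduser hadoop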
$ source ~/.bashrc
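This reloads the shell so the Hadoop environment variables take effect. A sketch of the entries typically exported in ~/.bashrc, assuming Hadoop lives under /usr/local/hadoop:
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin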
15. Now, open the core-site.xml file and add the following configurations as shown:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
16. Now, open the hdfs-site.xml file and add the following configurations
as shown:
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/data/nameNode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/data/dataNode</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
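The configured replication factor can be double-checked with:
$ hdfs getconf -confKey dfs.replication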
17. Now, open the mapred-site.xml file and add the following
configurations as shown:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop-master:9001</value>
  </property>
</configuration>
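Note that mapred.job.tracker is a Hadoop 1.x property; on a 3.x release such as 3.3.6, MapReduce normally runs on YARN instead, configured roughly as follows:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>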
18. Now, on the master, open the workers file:
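It should list the hostnames of all worker nodes, one per line; for the single slave used in the next step, for example:
slave1
If the master should also run a DataNode, add master on its own line as well.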
19. Use the following command to copy the files to the slave node:
$ scp -r /usr/local/hadoop/* slave1:/usr/local/hadoop
20. Now, on slave1, open the yarn-site.xml file and add the following configurations as shown:
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop-master</value>
  </property>
</configuration>
21. Now we need to format the HDFS file system. Run this command on the master node:
$ hdfs namenode -format
Conclusion:
In conclusion, this experiment helped me learn how to set up a multi-node
Hadoop cluster. By configuring different parts of Hadoop and
understanding how it handles errors, I gained valuable skills for managing
large amounts of data. Doing tasks like formatting the HDFS file system
also gave me hands-on experience in working with big data systems. This
experiment improved my understanding of Hadoop's structure and taught
me how to build and manage data processing setups effectively.