Hadoop Installation
https://www.vultr.com/docs/install-and-configure-apache-hadoop-on-ubuntu-20-04
1. Install Java
$ java -version
2. Configure Passwordless SSH for the Hadoop User
$ sudo su - hadoop
$ ssh-keygen -t rsa
$ ssh localhost
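For ssh localhost to log in without a password, the generated public key normally has to be listed in the hadoop user's authorized_keys file. A minimal sketch of that step follows; the helper name authorize_key is hypothetical, and the paths in the usage comment assume the ssh-keygen defaults.

```shell
# Append a public key to an authorized_keys file unless it is already listed.
# Arguments: $1 = public key file, $2 = authorized_keys file.
authorize_key() {
    pub_key="$1"
    auth_file="$2"
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -x matches whole lines only, -F treats the key as a fixed string
    if ! grep -qxF -- "$(cat "$pub_key")" "$auth_file"; then
        cat "$pub_key" >> "$auth_file"
    fi
    # sshd ignores authorized_keys files that other users can write to
    chmod 600 "$auth_file"
}

# Typical call for the hadoop user (assumed default key location):
# authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```

Running the helper twice leaves a single entry, so it is safe to repeat after regenerating nothing.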
3. Install Hadoop
$ sudo su - hadoop
Download the latest stable release of Hadoop from the official Apache Hadoop download
page.
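The download and unpack step could look roughly like the sketch below. The helper hadoop_tarball_url is hypothetical, the URL layout is an assumption based on how the Apache download mirror is usually organized, and HADOOP_VERSION is a placeholder you must set to the release shown on the download page. The target directory /usr/local/hadoop matches the HADOOP_HOME value used in the next step.

```shell
# Build the tarball URL for a given Hadoop release (assumed mirror layout).
hadoop_tarball_url() {
    version="$1"
    printf 'https://downloads.apache.org/hadoop/common/hadoop-%s/hadoop-%s.tar.gz\n' "$version" "$version"
}

# Example sequence, commented out so nothing is downloaded by accident.
# Set HADOOP_VERSION to the release listed on the download page first.
# wget "$(hadoop_tarball_url "$HADOOP_VERSION")"
# tar -xzf "hadoop-$HADOOP_VERSION.tar.gz"
# sudo mv "hadoop-$HADOOP_VERSION" /usr/local/hadoop
# sudo chown -R hadoop:hadoop /usr/local/hadoop
```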
4. Configure Hadoop
Add the following lines to the ~/.bashrc file. Save and close the file.
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
$ source ~/.bashrc
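After reloading ~/.bashrc it can help to confirm that the variables are actually set. The function below is a small sketch of such a check; the name check_hadoop_env is hypothetical, and it only tests a subset of the variables exported above.

```shell
# Print the name of every listed variable that is unset or empty.
# Returns non-zero if at least one variable is missing.
check_hadoop_env() {
    missing=0
    for name in HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME YARN_HOME; do
        # Indirect lookup: read the value of the variable whose name is in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: prints nothing when every variable is set.
# check_hadoop_env
```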
Hadoop consists of several components, including YARN, HDFS, and MapReduce. To configure
these components, along with Hadoop-related project settings, define the Java environment
variables in the hadoop-env.sh configuration file.
Sham Shul Shukri Mat
Find the Java path.
$ which javac
$ readlink -f /usr/bin/javac
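The readlink output ends in /bin/javac, while JAVA_HOME should point at the JDK directory two levels above it. A small sketch of that conversion follows; the function name java_home_from_javac is hypothetical.

```shell
# Derive the JDK directory from the resolved path of the javac binary.
java_home_from_javac() {
    # Strip the trailing /bin/javac by taking the parent directory twice
    dirname "$(dirname "$1")"
}

# Usage on a machine with a JDK installed:
# java_home_from_javac "$(readlink -f "$(which javac)")"
```

For a resolved path of /usr/lib/jvm/java-11-openjdk-amd64/bin/javac, the function prints /usr/lib/jvm/java-11-openjdk-amd64.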
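Add the following lines to the hadoop-env.sh file ($HADOOP_HOME/etc/hadoop/hadoop-env.sh),
using the Java path found above for JAVA_HOME. Then save and close the file.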
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export HADOOP_CLASSPATH+=" $HADOOP_HOME/lib/*.jar"
$ cd /usr/local/hadoop/lib
$ hadoop version
Edit the core-site.xml configuration file to specify the URL for your NameNode.
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://0.0.0.0:9000</value>
<description>The default file system URI</description>
</property>
</configuration>
Edit the hdfs-site.xml configuration file to define where the NameNode stores its metadata
(the fsimage file) and where the DataNode stores its data.
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hadoop/hdfs/datanode</value>
</property>
</configuration>
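The two storage directories named in hdfs-site.xml should exist and be owned by the hadoop user before the cluster starts. A minimal sketch, assuming the /home/hadoop/hdfs base path from the configuration above; the helper name make_hdfs_dirs is hypothetical.

```shell
# Create the NameNode and DataNode storage directories under a base path.
make_hdfs_dirs() {
    base="$1"
    # -p creates missing parents and does not fail if the directories exist
    mkdir -p "$base/namenode" "$base/datanode"
}

# Usage with the paths from hdfs-site.xml:
# make_hdfs_dirs /home/hadoop/hdfs
# sudo chown -R hadoop:hadoop /home/hadoop/hdfs
```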
Edit the mapred-site.xml configuration file to set YARN as the MapReduce framework.
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
5. Start the Hadoop Cluster
$ sudo su - hadoop
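Before the first start, the NameNode storage directory has to be formatted with hdfs namenode -format. Formatting again later wipes the existing HDFS metadata, so the sketch below guards the command with a check for current/VERSION, the file a formatted NameNode directory contains. The function name namenode_needs_format is hypothetical, and the directory in the usage comment assumes the NameNode path configured in hdfs-site.xml.

```shell
# Succeed (return 0) when the NameNode directory has not been formatted yet.
namenode_needs_format() {
    # A formatted NameNode directory contains current/VERSION
    [ ! -f "$1/current/VERSION" ]
}

# Usage as the hadoop user, before running start-dfs.sh for the first time:
# if namenode_needs_format /home/hadoop/hdfs/namenode; then
#     hdfs namenode -format
# fi
```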
$ start-dfs.sh
$ start-yarn.sh
$ jps
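On a single-node setup, jps should list NameNode, DataNode, and SecondaryNameNode (started by start-dfs.sh) plus ResourceManager and NodeManager (started by start-yarn.sh). The sketch below reads jps output from standard input and prints any of these that are missing; the function name missing_daemons is hypothetical.

```shell
# Read jps output on stdin and print each expected daemon that is not running.
missing_daemons() {
    jps_output="$(cat)"
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <name>"; compare the whole second field so that
        # SecondaryNameNode is not mistaken for NameNode
        if ! printf '%s\n' "$jps_output" | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Usage: prints nothing when all five daemons are up.
# jps | missing_daemons
```

If a daemon is reported missing, its log file under $HADOOP_HOME/logs is usually the first place to look.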
You can access the Hadoop NameNode web interface in your browser at http://server-IP:9870,
replacing server-IP with your server's IP address. For example:
http://192.0.2.11:9870