July 2, 2024
Tutorials

How to Install Hadoop on Ubuntu 22.04

Apache Hadoop is an open-source framework for storing and processing large datasets. It splits big datasets into smaller chunks and processes them in parallel across multiple machines. Hadoop is used in many fields for tasks such as data analysis and machine learning.

In this tutorial, you will learn how to set up Hadoop on a Linux-based system, specifically Ubuntu 22.04.

How to Install Java Development Kit (JDK) for Hadoop on Ubuntu 22.04?

First, follow the steps below to install Java for Hadoop on your Ubuntu 22.04 system.

Step 1: Update Ubuntu Repository

Initiate the installation process by updating your Ubuntu package list:

sudo apt update

Make sure the command finishes without errors before you continue.

Step 2: Install Java Package

Hadoop depends on Java, so install the default version of the Java Development Kit (JDK) on your system:

sudo apt install default-jdk

Step 3: Verify Java installation

The Java installation can be confirmed using the command:

java -version
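
Because Hadoop only supports certain Java releases, it can help to extract the major version number from that output instead of reading it by eye. The following is a minimal sketch; the helper name java_major_version is hypothetical, and it assumes the usual version "x.y.z" format on the first matching line:

```shell
# Extract the major Java version from `java -version` output.
# Handles both the legacy "1.8.0_392" scheme (major = 8) and the modern
# "11.0.22" scheme (major = 11). The function name is illustrative only.
java_major_version() {
    local version
    # Pull the quoted version string out of the first line that contains one.
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    # Legacy scheme: drop the leading "1." so that 1.8.0 becomes 8.0.
    case "$version" in
        1.*) version=${version#1.} ;;
    esac
    # Keep everything before the first ".", "_" or "-".
    printf '%s\n' "${version%%[._-]*}"
}

# Usage sketch: `java -version` writes to stderr, hence the 2>&1.
# java_major_version "$(java -version 2>&1)"
```

Compare the printed number with the Java versions listed in the Hadoop documentation for the release you are installing.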

How to Install Hadoop on Ubuntu 22.04?

In this section, run the following commands to install Hadoop on your Ubuntu 22.04 system.

Step 1: Access the Hadoop Binary File

Go to the official Apache Hadoop website and copy the download link for the stable release of the Hadoop binary (3.4.0 at the time of writing), which you will download and install on your system:

https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.4.0/hadoop-3.4.0.tar.gz

Step 2: Download Hadoop Binary File (.tar)

After copying the link, use the wget command to download the Hadoop binary as a .tar.gz archive:

wget https://dlcdn.apache.org/hadoop/common/hadoop-3.4.0/hadoop-3.4.0.tar.gz

The archive, named hadoop-3.4.0.tar.gz, is saved to your current directory.
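
Since the archive comes from a mirror, you may want to check its integrity before extracting it. The sketch below compares the file's SHA-512 digest with the one in a checksum file. The helper name verify_sha512 is hypothetical, and the checksum URL in the usage comment is an assumption (Apache normally publishes a .sha512 file next to each release artifact), so confirm it on the Hadoop download page:

```shell
# Compare a file's SHA-512 digest with the digest stored in a checksum file.
# The 128-hex-digit digest is extracted with grep, so the exact layout of the
# checksum file (GNU or BSD style) does not matter.
verify_sha512() {
    local file=$1 checksum_file=$2 expected actual
    expected=$(grep -oiE '[0-9a-f]{128}' "$checksum_file" | head -n 1 | tr 'A-F' 'a-f')
    actual=$(sha512sum "$file" | cut -d' ' -f1)
    # Fail if no digest was found or if the digests differ.
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage sketch (the URL is an assumption, verify it before use):
# wget https://downloads.apache.org/hadoop/common/hadoop-3.4.0/hadoop-3.4.0.tar.gz.sha512
# verify_sha512 hadoop-3.4.0.tar.gz hadoop-3.4.0.tar.gz.sha512 && echo "checksum OK"
```

If the function returns a non-zero status, download the archive again before continuing.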

Step 3: Extract Hadoop Binary File (.tar)

Before you can use Hadoop, extract the archive with the command:

tar -xzf hadoop-3.4.0.tar.gz
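
The name of the directory that the archive unpacks into can also be read from the tarball itself, which is handy in scripts. This is a small sketch; the helper name tar_top_level is hypothetical:

```shell
# Print the unique top-level entries of a gzip-compressed tar archive.
# -t lists the contents without extracting anything.
tar_top_level() {
    tar -tzf "$1" | cut -d/ -f1 | sort -u
}

# Usage sketch:
# tar_top_level hadoop-3.4.0.tar.gz
```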

Step 4: List Hadoop After Extraction

After the archive has been extracted, list the current directory to confirm the result:

ls

You will see the “hadoop-3.4.0” folder after extraction.

Step 5: Move the Hadoop to “/usr/local”

Now, move the extracted Hadoop directory to “/usr/local/hadoop”:

sudo mv hadoop-3.4.0 /usr/local/hadoop

Step 6: Locate Java Path

The following command prints the Java installation directory by resolving the /usr/bin/java symlink and stripping the trailing bin/java from the result:

readlink -f /usr/bin/java | sed "s:bin/java::"

This path is necessary for configuring the Hadoop package.
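
The sed expression above leaves a trailing slash on the path. If you prefer a path without it, the same lookup can be written with dirname. This is a sketch of an alternative, not a required step, and the helper name java_home_from is hypothetical:

```shell
# Resolve the JDK directory behind a `java` launcher path.
# readlink -f follows every symlink; the two dirname calls then drop
# "/bin/java", leaving the directory without a trailing slash.
java_home_from() {
    local real
    real=$(readlink -f "$1") || return 1
    dirname "$(dirname "$real")"
}

# Usage sketch:
# java_home_from /usr/bin/java
```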

Step 7: Modify the Hadoop Configuration File

Open the “hadoop-env.sh” file, located in “/usr/local/hadoop/etc/hadoop/”, with the Nano editor:

sudo nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Locate the “export JAVA_HOME=” line in the Hadoop configuration file, usually around line 52. Add the lines below and leave uncommented only the variant you need:

# For static Location: 
#export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64/ 
# For dynamic location: 
export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")

Note: To display line numbers in the Nano text editor, press “Alt + N”.
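
If you would rather not edit the file by hand, the same change can be scripted. The sketch below appends the export line only when no uncommented JAVA_HOME export exists yet. The helper name set_java_home is hypothetical, and the function assumes the file is writable by the current user (otherwise run it from a root shell):

```shell
# Append an `export JAVA_HOME=...` line to a hadoop-env.sh style file,
# unless an uncommented JAVA_HOME export is already present.
set_java_home() {
    local env_file=$1 java_home=$2
    if grep -qE '^[[:space:]]*export[[:space:]]+JAVA_HOME=' "$env_file"; then
        echo "JAVA_HOME already set in $env_file; leaving it unchanged" >&2
        return 0
    fi
    printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
}

# Usage sketch (static location variant):
# set_java_home /usr/local/hadoop/etc/hadoop/hadoop-env.sh /usr/lib/jvm/java-11-openjdk-amd64/
```

Running it a second time leaves the file unchanged, so it is safe to include in a setup script.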

Step 8: Verify Hadoop Installation

To check that Hadoop is installed correctly, run the command:

/usr/local/hadoop/bin/hadoop version

If Hadoop is installed on your Ubuntu 22.04 system, the output shows the Hadoop version details (e.g. Hadoop 3.4.0).
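
In a setup script, the same check can be automated by comparing the first line of that output with the release you expect. This is a minimal sketch that assumes the first line has the form "Hadoop 3.4.0"; the helper name hadoop_version_is is hypothetical:

```shell
# Succeed only if the first line of `hadoop version` output reports the
# expected release, e.g. "Hadoop 3.4.0".
hadoop_version_is() {
    local expected=$1 output=$2 first_line
    first_line=$(printf '%s\n' "$output" | head -n 1)
    [ "$first_line" = "Hadoop $expected" ]
}

# Usage sketch:
# hadoop_version_is 3.4.0 "$(/usr/local/hadoop/bin/hadoop version)" && echo "Hadoop 3.4.0 is installed"
```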

Conclusion

Hadoop can be installed on Linux-based systems, including Ubuntu 22.04. Hadoop depends on Java, so install the default JDK first. Then, download and extract the Hadoop binary archive and move it to “/usr/local/hadoop”. Finally, add the line export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::") to the “hadoop-env.sh” configuration file and verify the installation with the hadoop version command.
