Fading Coder

Hadoop 3.x High Availability (HA) Configuration: A Step-by-Step Guide

Prerequisites for Hadoop HA Configuration

  • JDK (version used: JDK 1.8; configure JDK environment variables independently)
  • ZooKeeper (version used: ZooKeeper 3.8.3)
  • Hadoop (version used: Hadoop 3.3.6)
  • Hadoop cluster configured with three nodes: master (primary), slave1 (secondary), slave2 (secondary)
  • Host IP mappings and passwordless SSH setup are omitted here
  • Modify paths in configuration files as needed
  • Critical: Disable firewalls on all nodes

ZooKeeper Configuration

(Configuration details omitted; refer to standard ZooKeeper deployment guides)
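
For reference, a minimal three-node ensemble configuration can be sketched as follows; the data directory and ports here are illustrative assumptions, not values from the original deployment:

# conf/zoo.cfg -- minimal 3-node ensemble (illustrative paths)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/data/zookeeper
clientPort=2181
# server.N=host:peerPort:electionPort; N must match the myid file on each node
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888

Each node also needs a myid file under dataDir containing its own N (e.g. the file on master contains just "1").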

Hadoop HA Configuration

1. Configure Environment Variables (All Nodes)

Edit /etc/profile.d/bigdata_env.sh:

# Hadoop Environment Variables
# Note: Hadoop's scripts look up these exact variable names (HADOOP_HOME, etc.)
export HADOOP_HOME=/opt/module/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

2. Hadoop HA Core Configuration ($HADOOP_HOME/etc/hadoop Directory)

2.1 Edit hadoop-env.sh

export JAVA_HOME=/opt/module/jdk
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_JOURNALNODE_USER=root
# In an HA deployment the SecondaryNameNode does not run; the ZKFC daemon needs a user instead
export HDFS_ZKFC_USER=root

2.2 Edit yarn-env.sh

export JAVA_HOME=/opt/module/jdk
export YARN_NODEMANAGER_USER=root
export YARN_RESOURCEMANAGER_USER=root

2.3 Edit core-site.xml

<!-- Logical nameservice URI for the HA cluster 'hdfscluster' -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hdfscluster</value>
</property>

<!-- Directory for Hadoop runtime generated files -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/data/hadoop/tmpdir</value>
</property>

<!-- ZooKeeper quorum address for ZKFC -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>master:2181,slave1:2181,slave2:2181</value>
</property>

<!-- Static user identity for Hadoop web services -->
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>root</value>
</property>

<!-- Hosts root can proxy (all hosts) -->
<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>*</value>
</property>

<!-- User groups root can proxy (all groups) -->
<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
</property>

2.4 Edit hdfs-site.xml

<!-- Replication factor configuration -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
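
The HA-specific hdfs-site.xml properties were not shown above; a typical sketch for a nameservice named 'hdfscluster' with NameNodes on master and slave1 looks like the following. The NameNode IDs (nn1, nn2), ports, and key path are assumptions chosen to illustrate the pattern; adjust them to your cluster:

<!-- Logical nameservice name; must match fs.defaultFS in core-site.xml -->
<property>
  <name>dfs.nameservices</name>
  <value>hdfscluster</value>
</property>

<!-- NameNode IDs within the nameservice -->
<property>
  <name>dfs.ha.namenodes.hdfscluster</name>
  <value>nn1,nn2</value>
</property>

<!-- RPC addresses of the two NameNodes -->
<property>
  <name>dfs.namenode.rpc-address.hdfscluster.nn1</name>
  <value>master:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdfscluster.nn2</name>
  <value>slave1:8020</value>
</property>

<!-- Shared edits directory on the JournalNode quorum -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://master:8485;slave1:8485;slave2:8485/hdfscluster</value>
</property>

<!-- Client-side provider that resolves the currently active NameNode -->
<property>
  <name>dfs.client.failover.proxy.provider.hdfscluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<!-- Fencing method used during failover -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>

<!-- Enable automatic failover via ZKFC -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>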
