System Requirements
This section outlines the steps to install a Hadoop cluster using Apache Ambari, which provides an end-to-end solution for managing and monitoring a Hadoop cluster. With Apache Ambari, you can deploy services, manage configuration changes, and monitor all the nodes in the cluster from a central location.
To install the platform, your system must meet the following minimum requirements:
Operating Systems Requirements
Browser Requirements
Software Requirements
JDK Requirements
Database Requirements
Memory Requirements
Recommended Maximum Open File Descriptors
Operating Systems Requirements
The following 64-bit operating systems are supported:
CentOS 7.5, 7.6, 7.7
RHEL 7.5, 7.6, 7.7
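To confirm that a host runs one of the supported releases, you can read the distribution's release file. A minimal sketch (the RHEL/CentOS-specific file is checked first, with the generic os-release file as a fallback):

```shell
# os_release: print the distribution name and version for this host.
# /etc/redhat-release exists on RHEL/CentOS; /etc/os-release is the
# generic fallback present on most modern Linux distributions.
os_release() {
  if [ -f /etc/redhat-release ]; then
    cat /etc/redhat-release
  else
    grep '^PRETTY_NAME=' /etc/os-release
  fi
}

os_release
```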
The installer pulls many packages from the base OS repositories. If you do not have a complete set of base OS repositories available to all your machines at the time of installation, you may run into issues.
If base OS repositories are unavailable, contact your system administrator to set them up.
Browser Requirements
The Ambari Install Wizard runs as a browser-based Web application. You must have a machine capable of running a graphical browser to use this tool. The minimum required browser versions are:
Microsoft Edge 41 or later
Firefox 61 or later
Google Chrome 67 or later
Software Requirements
On each of your hosts:
yum and rpm (RHEL/CentOS)
scp, curl, unzip, tar, and wget
OpenSSL
Python 2.7
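A quick way to confirm these tools are present is to probe each one with command -v. A sketch using a hypothetical check_cmds helper (adjust the command list to your distribution):

```shell
# check_cmds: print each command from the argument list that is not on
# the PATH of the current shell.
check_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "missing: $cmd"
  done
}

# Probe the required tools; no output means everything was found.
check_cmds yum rpm scp curl unzip tar wget openssl python
```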
JDK Requirements
The following Java runtime environments are supported:
Open JDK 8 64-bit
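To see which Java runtime a host will actually pick up, you can check the first line of java -version. A sketch (note that java -version writes to stderr, hence the redirect):

```shell
# java_check: print the installed Java version line, or a note if no
# java binary is on the PATH.
java_check() {
  if command -v java >/dev/null 2>&1; then
    java -version 2>&1 | head -n 1
  else
    echo "java not found on PATH"
  fi
}

java_check
```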
Database Requirements
Ambari requires a relational database to store information about the cluster configuration and topology. The following databases are supported:
PostgreSQL 10.2, 9.6, 9.5, 9.4
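You can check which PostgreSQL client version a host has with psql --version. A sketch (psql may live under a custom install prefix and not appear on the PATH):

```shell
# pg_version: print the PostgreSQL client version, or a note if psql
# is not on the PATH.
pg_version() {
  if command -v psql >/dev/null 2>&1; then
    psql --version
  else
    echo "psql not found on PATH"
  fi
}

pg_version
```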
Memory Requirements
The Ambari host should have at least 1 GB RAM, with 500 MB free. To check available memory on any host, run:
free -m
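The "available" column of free -m can be compared against the 500 MB floor directly. A sketch with a hypothetical check_mem helper (assumes the procps version of free, whose seventh field on the Mem: line is "available"):

```shell
# check_mem MB: report whether the given available-memory figure meets
# the suggested 500 MB free-memory floor.
check_mem() {
  if [ "$1" -lt 500 ]; then
    echo "LOW: only $1 MB available"
  else
    echo "OK: $1 MB available"
  fi
}

# Feed it the "available" column of free -m when the tool is present.
if command -v free >/dev/null 2>&1; then
  check_mem "$(free -m | awk '/^Mem:/ {print $7}')"
fi
```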
Check the Maximum Open File Descriptors
The recommended maximum number of open file descriptors is 10000 or more. To check the current value set for the maximum number of open file descriptors, execute the following shell commands on each host:
ulimit -Sn
ulimit -Hn
If the output is less than 10000, run the following command to set it to a suitable default:
ulimit -n 10000
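Note that ulimit -n only changes the limit for the current shell session. To make the setting persist across logins, a common approach is to add nofile entries to /etc/security/limits.conf (a sketch; the exact behavior depends on your PAM configuration):

```
# /etc/security/limits.conf -- raise the open-file limit for all users
*   soft   nofile   10000
*   hard   nofile   10000
```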
Collect Information
Before deploying, you should collect the following information:
The fully qualified domain name (FQDN) of each host in your system. The Ambari install wizard does not support using IP addresses. You can use hostname -f to verify the FQDN of a host.
List of components you want to set up on each host.
Base directories you want to use as mount points for storing:
NameNode data
DataNodes data
Secondary NameNode data
Oozie data
YARN data
ZooKeeper data
Various log, pid, and db files
Use base directories that provide persistent storage locations for your cluster components and your Hadoop data. Installing components in locations that may be removed from a host may result in cluster failure or data loss.
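The FQDN check mentioned above can be scripted before deployment. A sketch with a hypothetical is_fqdn helper (a name is treated as fully qualified when it contains a dot):

```shell
# is_fqdn NAME: succeed and report when NAME looks fully qualified
# (contains a dot); fail otherwise.
is_fqdn() {
  case "$1" in
    *.*) echo "FQDN OK: $1" ;;
    *)   echo "not fully qualified: $1"; return 1 ;;
  esac
}

# Check this host; hostname -f prints the FQDN when DNS or /etc/hosts
# is configured correctly.
is_fqdn "$(hostname -f)" || true
```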