System Requirements
This section outlines the steps to install a Hadoop cluster using Apache Ambari, which provides an end-to-end solution for managing and monitoring a Hadoop cluster. With Apache Ambari, you can deploy services, manage configuration changes, and monitor all the nodes in the cluster from a central location.
To install the platform, your system must meet the following minimum requirements:
Operating Systems Requirements
Browser Requirements
Software Requirements
JDK Requirements
Database Requirements
Memory Requirements
Recommended Maximum Open File Descriptors
Operating Systems Requirements
The following 64-bit operating systems are supported:
CentOS 7.5, 7.6, 7.7
RHEL 7.5, 7.6, 7.7
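To confirm that a host runs one of the supported releases, you can read the distribution's release file. A minimal sketch (the RHEL/CentOS-specific file is checked first, with the generic os-release file as a fallback):

```shell
# os_release: print the distribution name and version for this host.
# /etc/redhat-release exists on RHEL/CentOS; /etc/os-release is the
# generic fallback present on most modern Linux distributions.
os_release() {
  if [ -f /etc/redhat-release ]; then
    cat /etc/redhat-release
  else
    grep '^PRETTY_NAME=' /etc/os-release
  fi
}

os_release
```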
The installer pulls many packages from the base OS repositories. If you do not have a complete set of base OS repositories available to all your machines at the time of installation, you may run into issues.
If base OS repositories are unavailable, contact your system administrator to set them up.
Browser Requirements
The Ambari Install Wizard runs as a browser-based Web application. You must have a machine capable of running a graphical browser to use this tool. The minimum required browser versions are:
Microsoft Edge 41 or later
Firefox 61 or later
Google Chrome 67 or later
Software Requirements
On each of your hosts:
yum and rpm (RHEL/CentOS)
scp, curl, unzip, tar, and wget
OpenSSL
Python 2.7
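A quick way to confirm these tools are present is to probe each one with command -v. A sketch using a hypothetical check_cmds helper (adjust the command list to your distribution):

```shell
# check_cmds: print each command from the argument list that is not on
# the PATH of the current shell.
check_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "missing: $cmd"
  done
}

# Probe the required tools; no output means everything was found.
check_cmds yum rpm scp curl unzip tar wget openssl python
```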
JDK Requirements
The following Java runtime environments are supported:
Open JDK 8 64-bit
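To see which Java runtime a host will actually pick up, you can check the first line of java -version. A sketch (note that java -version writes to stderr, hence the redirect):

```shell
# java_check: print the installed Java version line, or a note if no
# java binary is on the PATH.
java_check() {
  if command -v java >/dev/null 2>&1; then
    java -version 2>&1 | head -n 1
  else
    echo "java not found on PATH"
  fi
}

java_check
```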
Database Requirements
Ambari requires a relational database to store information about the cluster configuration and topology. The following databases are supported:
PostgreSQL 10.2, 9.6, 9.5, 9.4
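You can check which PostgreSQL client version a host has with psql --version. A sketch (psql may live under a custom install prefix and not appear on the PATH):

```shell
# pg_version: print the PostgreSQL client version, or a note if psql
# is not on the PATH.
pg_version() {
  if command -v psql >/dev/null 2>&1; then
    psql --version
  else
    echo "psql not found on PATH"
  fi
}

pg_version
```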
Memory Requirements
The Ambari host should have at least 1 GB RAM, with 500 MB free. To check available memory on any host, run:
free -m
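The "available" column of free -m can be compared against the 500 MB floor directly. A sketch with a hypothetical check_mem helper (assumes the procps version of free, whose seventh field on the Mem: line is "available"):

```shell
# check_mem MB: report whether the given available-memory figure meets
# the suggested 500 MB free-memory floor.
check_mem() {
  if [ "$1" -lt 500 ]; then
    echo "LOW: only $1 MB available"
  else
    echo "OK: $1 MB available"
  fi
}

# Feed it the "available" column of free -m when the tool is present.
if command -v free >/dev/null 2>&1; then
  check_mem "$(free -m | awk '/^Mem:/ {print $7}')"
fi
```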
Check the Maximum Open File Descriptors
The recommended maximum number of open file descriptors is 10000 or more. To check the current value set for the maximum number of open file descriptors, execute the following shell commands on each host:
ulimit -Sn
ulimit -Hn
If the output is less than 10000, run the following command to set it to a suitable default:
ulimit -n 10000
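Note that ulimit -n only changes the limit for the current shell session. To make the setting persist across logins, a common approach is to add nofile entries to /etc/security/limits.conf (a sketch; the exact behavior depends on your PAM configuration):

```
# /etc/security/limits.conf -- raise the open-file limit for all users
*   soft   nofile   10000
*   hard   nofile   10000
```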
Collect Information
Before deploying, you should collect the following information:
The fully qualified domain name (FQDN) of each host in your system. The Ambari install wizard does not support using IP addresses. You can use hostname -f to verify the FQDN of a host.
List of components you want to set up on each host.
Base directories you want to use as mount points for storing:
NameNode data
DataNodes data
Secondary NameNode data
Oozie data
YARN data
ZooKeeper data
Various log, pid, and db files
Use base directories that provide persistent storage locations for your cluster components and your Hadoop data. Installing components in locations that may be removed from a host may result in cluster failure or data loss.
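The FQDN check mentioned above can be scripted before deployment. A sketch with a hypothetical is_fqdn helper (a name is treated as fully qualified when it contains a dot):

```shell
# is_fqdn NAME: succeed and report when NAME looks fully qualified
# (contains a dot); fail otherwise.
is_fqdn() {
  case "$1" in
    *.*) echo "FQDN OK: $1" ;;
    *)   echo "not fully qualified: $1"; return 1 ;;
  esac
}

# Check this host; hostname -f prints the FQDN when DNS or /etc/hosts
# is configured correctly.
is_fqdn "$(hostname -f)" || true
```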