Install Overview

A Hadoop cluster installation involves unpacking the software on all the machines in the cluster or installing it via a packaging system as appropriate for your operating system.
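The unpack-and-configure step can be sketched as shell commands; the release version, install path, and JDK location below are assumptions to adapt to your environment.

```shell
# Minimal sketch of unpacking a Hadoop release on one machine.
# The version number and paths are placeholders, not prescriptions.
HADOOP_VERSION=3.3.6
tar -xzf hadoop-${HADOOP_VERSION}.tar.gz -C /opt
export HADOOP_HOME=/opt/hadoop-${HADOOP_VERSION}
export PATH=${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
# Hadoop requires Java; set JAVA_HOME in hadoop-env.sh (path is an assumption).
echo "export JAVA_HOME=/usr/lib/jvm/java-11" >> ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh
```

The same steps are repeated on every machine (or handled by your OS packaging system), which is why configuration is usually distributed from a single prepared copy.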

One machine in the cluster should be designated as the NameNode and another as the ResourceManager. These servers are also known as the masters.


Other services (such as the Web App Proxy Server and the MapReduce Job History server) are usually run either on dedicated hardware or on shared infrastructure, depending on the load.

The remaining machines in the cluster act as both DataNode and NodeManager. These are referred to as the workers (historically, slaves).

To get started, use the Single Node Setup.
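A quick way to sanity-check a single-node install is to run one of the bundled MapReduce examples in standalone (local) mode; the jar version below is an assumption that must match your release.

```shell
# Assumption: HADOOP_HOME points at an unpacked Hadoop installation,
# and the examples jar version matches that release.
mkdir input
cp ${HADOOP_HOME}/etc/hadoop/*.xml input
# Run the bundled 'grep' example over the copied config files.
${HADOOP_HOME}/bin/hadoop jar \
  ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar \
  grep input output 'dfs[a-z.]+'
# Results land in the output directory as part-* files.
cat output/part-r-00000
```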

For development and production workloads, use the Cluster Setup for a multi-node Hadoop installation.
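In a multi-node setup, the master addresses mentioned earlier go into the site configuration files. A minimal sketch, assuming `HADOOP_HOME` is set and using placeholder hostnames and the default NameNode port:

```shell
# Assumption: HADOOP_HOME points at an unpacked Hadoop installation.
# core-site.xml: fs.defaultFS points clients at the NameNode (placeholder host).
cat > ${HADOOP_HOME}/etc/hadoop/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>
EOF
# yarn-site.xml: NodeManagers find the ResourceManager via this hostname.
cat > ${HADOOP_HOME}/etc/hadoop/yarn-site.xml <<'EOF'
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>resourcemanager.example.com</value>
  </property>
</configuration>
EOF
```

These two properties are the minimum that ties the workers back to the two master machines; a production cluster will set many more.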

If you want to use Apache Ambari to set up the cluster, follow the cluster setup steps outlined in the next section.
