Cluster Install from Ambari
Installing the Hadoop cluster
Set up YUM repository server
Use the YUM repository server that was set up when Ambari was installed.
Download and extract Hadoop stack tarballs
Download the Hadoop data platform stack tarballs, then copy them to the machine that hosts the YUM server and extract them there. Unless you’re using a dedicated machine for the YUM repository server, this will be the same admin host you used for installing the Ambari Server.
Set up local YUM repositories
The stacks are shipped as archived YUM repositories. These should be deployed on the YUM repository server so that they are accessible to the Ambari Server and all cluster hosts.
Each stack repository contains a setup_repo.sh script that assumes:
YUM repository server is accessible by all hosts in the cluster
Document root of your YUM server is /var/www/html/
Each stack’s script creates a symbolic link in the YUM repository server document root that points to the location of the extracted stack tarball, and creates a repo definition file in the /etc/yum.repos.d/ directory so that your local yum command can find the repository. It is essential that the hostnames in the repo definition files use the Fully Qualified Domain Name (FQDN) of the YUM server host, and that this name is reachable from all cluster hosts.
For each stack, run the local repo setup script:
/staging/{stack}/setup_repo.sh
If the repository setup was successful, the script prints the repository URL. Write down the URL, as you will need it later when installing the cluster using the Ambari Server UI.
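When several stacks have been extracted, the per-stack command above can be looped. The following is a minimal sketch, assuming each stack sits in its own directory under /staging (the path used above) and contains an executable setup_repo.sh. The function name run_all_setup_repos is hypothetical and not part of the stack tooling.

```shell
#!/bin/sh
# Hypothetical helper: run setup_repo.sh for every stack under a staging
# directory (defaults to /staging, as in the command above).
run_all_setup_repos() (
  staging_dir=${1:-/staging}
  for script in "$staging_dir"/*/setup_repo.sh; do
    # Skip the unexpanded glob and any script that is not executable.
    [ -x "$script" ] || continue
    echo "== Running $script"
    # Keep going on failure so one broken stack does not hide the others.
    "$script" || echo "FAILED: $script" >&2
  done
)

# Usage (on the YUM repository server):
#   run_all_setup_repos /staging
```

Each script's output, including the repository URL it prints, appears under its own "== Running" line, which makes the URLs easy to collect for the install wizard.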
Note: If your YUM repository server runs on a different host than the admin host where the Ambari Server is installed, copy the generated repository definition files from /etc/yum.repos.d/ on the YUM repository server to /etc/yum.repos.d/ on the admin host.
Test that the repositories are properly configured – run the following command from the admin host:
yum repolist
You should see the repositories for the stacks listed.
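Because the repo definition files must reference the YUM server by FQDN, a quick scan of their baseurl entries can catch short hostnames early. This is a rough sketch, assuming standard baseurl=http://host/path lines. The function name find_non_fqdn_baseurls and the "host contains a dot" heuristic are illustrative assumptions, not part of the stack tooling.

```shell
#!/bin/sh
# Hypothetical helper: list baseurl entries whose host part contains no dot,
# which usually means a short hostname was used instead of an FQDN.
# Heuristic only: IP addresses pass, and file:// URLs are reported.
find_non_fqdn_baseurls() (
  repo_dir=${1:-/etc/yum.repos.d}
  for f in "$repo_dir"/*.repo; do
    [ -f "$f" ] || continue
    # Extract the value of every baseurl line in this repo file.
    sed -n 's/^baseurl[[:space:]]*=[[:space:]]*//p' "$f" |
    while IFS= read -r url; do
      host=${url#*://}    # drop the scheme
      host=${host%%/*}    # drop the path
      host=${host%%:*}    # drop an optional port
      case $host in
        *.*) ;;                                # looks fully qualified
        *) printf '%s: %s\n' "$f" "$url" ;;    # report suspicious entry
      esac
    done
  done
)

# Usage (on the admin host):
#   find_non_fqdn_baseurls /etc/yum.repos.d
```

No output means every baseurl host looked fully qualified. Any line printed names the file and the URL to review.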
Log in to Ambari Server
Once the Ambari Server is started:
Open http://{ambari.server.host}:8080 in a web browser
Log in to the server as user admin with the password admin. These credentials can be changed later.
Launch Install Wizard
Once logged in to Ambari, click the “Launch Install Wizard” button to enter the cluster creation wizard. The wizard is self-explanatory and guides you through the steps necessary to provision a new INVARIANT cluster. A few actions requiring particular attention are listed below:
Modify YUM repository URLs
In the Select Stack section, select version 3.0 and click Advanced Repository Options to reveal the list of YUM repositories that Ambari will search for the INVARIANT stack RPMs. The default values provided here need to be replaced with the URLs of the stack repositories you installed previously: for each stack, use the repository URL printed when you ran its setup_repo.sh script. If you don’t have the URLs handy, you can get them from the /etc/yum.repos.d/-.repo file.
Note: After you deploy the cluster, you can update repositories via the Ambari UI (Admin > Repositories).
Specify host names and SSH key
In the Install Options section, you need to provide the FQDNs of the hosts that will comprise your cluster. You can use a range expression in square brackets; for example, host[01-10].domain describes 10 hosts.
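To preview which hosts a range expression covers before entering it in the wizard, the expansion can be reproduced locally. This is a minimal sketch, assuming a single [start-end] range and zero padding taken from the width of the start value. expand_host_range is a hypothetical helper, not an Ambari command.

```shell
#!/bin/sh
# Hypothetical helper: expand one bracketed numeric range in a host pattern.
expand_host_range() (
  pattern=$1
  case $pattern in
    *\[*-*\]*)
      prefix=${pattern%%\[*}   # text before "["
      rest=${pattern#*\[}      # text after the first "["
      range=${rest%%\]*}       # the "start-end" part
      suffix=${rest#*\]}       # text after "]"
      start=${range%-*}
      end=${range#*-}
      # Pad every number to the width of the start value (01, 02, ...).
      awk -v p="$prefix" -v s="$suffix" -v a="$start" -v b="$end" \
          -v w="${#start}" 'BEGIN {
            fmt = "%s%0" w "d%s\n"
            for (i = a + 0; i <= b + 0; i++) printf fmt, p, i, s
          }'
      ;;
    *)
      # No range expression: print the name unchanged.
      printf '%s\n' "$pattern"
      ;;
  esac
)

# Prints one host name per line for the example used above.
expand_host_range 'host[01-10].domain'
```

The resulting list is also convenient for checking DNS resolution or SSH access on each host before starting the wizard.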
If you want Ambari to automatically provision and register Ambari Agents on the cluster hosts, you will need to provide the private key that you used to set up password-less SSH on your cluster. You can either select the key file or copy and paste its contents into the form.
Ambari Agents Manual Install
Note: If you do not want to provide the private key or set up password-less SSH, you will have to provision and configure the Ambari Agents manually. In this case, do the following on each cluster host:
Set up the Ambari repository by copying the /etc/yum.repos.d/ambari.repo file from the YUM repository server
Install the Ambari Agent:
yum install ambari-agent
Edit the Ambari Agent configuration (/etc/ambari-agent/conf/ambari-agent.ini) to point it to the Ambari Server:
[server]
hostname={ambari.server.hostname}
url_port=8440
secured_url_port=8441
Start the agent:
ambari-agent start
The agent registers itself with the server after starting.
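When agents are installed by hand on many hosts, the edit to ambari-agent.ini can be scripted. This is a minimal sketch, assuming the file already contains a [server] section with a hostname= line, as in the snippet above. set_agent_server_host is a hypothetical helper, and the host name in the usage comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: point an Ambari Agent config at the Ambari Server by
# rewriting the hostname= line inside the [server] section only.
set_agent_server_host() (
  ini_file=$1
  server_host=$2
  tmp_file=$(mktemp)
  # The sed address range starts at "[server]" and ends at the next section
  # header, so hostname= lines in other sections are left untouched.
  sed "/^\[server\]/,/^\[/ s/^hostname=.*/hostname=${server_host}/" \
      "$ini_file" > "$tmp_file" &&
    cat "$tmp_file" > "$ini_file"   # overwrite in place, keep permissions
  rm -f "$tmp_file"
)

# Usage (as root on each cluster host, before "ambari-agent start"):
#   set_agent_server_host /etc/ambari-agent/conf/ambari-agent.ini ambari.example.com
```

Writing through a temporary file avoids relying on sed -i, whose syntax differs between sed implementations.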
Choose Services
You must choose the services that you want to install. Before assigning components to the master and slave nodes, it’s important to understand the different components of a Hadoop cluster.
A master node maintains the knowledge about the distributed file system. It hosts two daemons:
NameNode: manages the distributed file system and knows where the blocks are stored inside the cluster.
ResourceManager: manages the YARN jobs and takes care of scheduling and executing processes on slave nodes.
The slave nodes store the actual data and provide processing power to run the jobs. Each hosts two daemons:
DataNode: manages the actual data physically stored on the node.
NodeManager: manages execution of tasks on the node.
Assign Masters
You need to assign “master” service components to your cluster hosts.
Assign Slaves and Clients
You need to assign “slave” and “client” service components to your cluster hosts.
Note: The panel displays the list of services to provision on each host. Make sure to configure all the components.
Install, Start and Test
As the install proceeds, the screen shows the cluster deployment progress on each host. Each component mapped to a host is installed and started, and a test is run to validate it.
When the “Successfully installed and started the services” message appears, click Next.
On the Summary page, review the list of completed tasks. Select Complete to navigate to the cluster Dashboard.
Cluster Dashboard
The Dashboard is the central place that displays the deployed services and their status. From here you can add new services or hosts, stop and start services and components, explore monitoring metrics, and perform service-specific actions. The cluster is now ready for use.