Setting up a Cloudera Cluster on AWS EC2 - Part 1

In this demo we will configure a Cloudera cluster on the Amazon Web Services (AWS) EC2 platform.


    -   An AWS account (sign up if you don’t have one)
    -   RHEL 7.5 operating system
    -   AWS EC2 instances with at least a 2-core processor and 8 GB RAM

Server Details

Step #1: Configure Network and Network Security Group

Configure Network
Go to the AWS Console >> click on Services >> under Networking & Content Delivery, select VPC >> Launch VPC

Click on Create VPC then click OK

Configure Network Security Group
Click on Security Groups >> click on Create Security Group >> provide a name tag and group name along with a description. Select the ID of the VPC created earlier in this step, as shown below

In Network Security Group, select the security group as shown below

Configure Inbound and Outbound Rules
Create inbound and outbound rules for the security group and click Save
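
The console steps above can also be scripted with the AWS CLI. The following is a minimal sketch rather than the method used in this demo: the CIDR ranges, the group name cloudera-sg, and the vpc-.../sg-... IDs are placeholders, and only an SSH (port 22) inbound rule is shown. With DRY_RUN=1 (the default) the functions print the commands instead of calling AWS.

```shell
#!/usr/bin/env bash
# Sketch: AWS CLI equivalents of the console steps. All IDs, names and CIDR
# ranges passed to these functions are placeholders chosen by the caller.

# Print the command when DRY_RUN=1 (default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# create_vpc CIDR
create_vpc() {
  run aws ec2 create-vpc --cidr-block "$1"
}

# create_security_group NAME DESCRIPTION VPC_ID
create_security_group() {
  run aws ec2 create-security-group \
    --group-name "$1" --description "$2" --vpc-id "$3"
}

# allow_ssh GROUP_ID SOURCE_CIDR: inbound rule for SSH (TCP port 22)
allow_ssh() {
  run aws ec2 authorize-security-group-ingress \
    --group-id "$1" --protocol tcp --port 22 --cidr "$2"
}

# Example (each ID comes from the output of the previous command):
#   create_vpc 10.0.0.0/16
#   create_security_group cloudera-sg "Cloudera demo" vpc-0123456789abcdef0
#   allow_ssh sg-0123456789abcdef0 203.0.113.10/32
```

Set DRY_RUN=0 only after reviewing the printed commands. Restricting the SSH source range to your own IP address is safer than opening port 22 to everyone.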

Step #2: Create a key pair to connect to AWS instances using PuTTY

Create Key Pair
To connect to an AWS instance from Windows using PuTTY:
Go to AWS Console >> EC2 >> Network & Security >> Key Pairs >> Create Key Pair

This not only creates the key pair but also downloads a .pem file containing the private key, which the client machine uses to authenticate to the instances.

Download Putty and Puttygen
Download and install putty and puttygen from
Open PuTTYgen >> click on Load >> browse to the downloaded key file (demokp.pem) >> click on Save private key to save it as demokp.ppk

Step #3: Create RHEL instances on Amazon EC2

Go to your AWS Console >> click on Launch Instance >> select Red Hat Enterprise Linux 7.5

Instance Type
In the instance type dialog box, choose t2.large and click Next.

Click Next to configure the instance details: set the number of instances to 3, select the network, and enable auto-assign public IP.

Increase the disk size to 30 GB, then click Next to add tags.

Tag Instance
Click Next to Add Tags to your instances.
Click Add Tag and select our VPC subnet. This tag will be used to re-label our instances as “namenode”, “datanode” and so on.

Security Group
Click Next to configure Security Group for the instances.

Launch Instances
Finally we get to the Launch screen.

Go to the instances page and check on the status of the instances.

Naming the Instances
On the instances page, let us set up the names of the instances. These are not DNS names, but names we assign to help us distinguish between them.
Click the pencil icon next to the name and set the names as shown below

Copy the public / private DNS name and IP address of each instance into Notepad as shown below

Save configuration in PuTTY
Open PuTTY and save the connection details as shown below

In the Category tree go to Connection >> SSH >> Auth >> Browse and select the demokp.ppk key from your host machine

Step #4: Common setup on all nodes

Some setup steps are common to all the nodes.

Install Java
Note: Perform this step on all the servers
Install the package: java-1.8.0-openjdk-devel on all the nodes.

yum install java-1.8.0-openjdk-devel

Install wget
Note: Perform this step on all the servers
wget is used to download files over a network.

yum install -y wget

Install Network Time Protocol
Note: Perform this step on all the servers
Download and install the NTP (Network Time Protocol) package. This keeps the clocks of all the machines synchronized with each other.

yum install -y ntp

Disable SELinux
Note: Perform this step on all the servers
SELinux is a security enhancement to Linux that gives administrators finer-grained control over access. Open its configuration file and set SELINUX=disabled; the change takes effect after a reboot.

vi /etc/selinux/config
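
On RHEL 7 the SELinux mode is read from the SELINUX= line in /etc/selinux/config. As an alternative to editing the file by hand on every node, the change can be sketched as a small function. The function name is made up for this example, and it assumes GNU sed (the default on RHEL).

```shell
#!/usr/bin/env bash
# set_selinux_disabled FILE: rewrite the SELINUX= line in FILE to "disabled".
# The SELINUXTYPE= line is left alone because the pattern requires "=" right
# after "SELINUX".
set_selinux_disabled() {
  local file="$1"
  sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$file"
}

# Usage on a node (as root):
#   set_selinux_disabled /etc/selinux/config
#   reboot
```

Running it against a copy of the file first is a cheap way to check the result before touching the real configuration.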

Disable Firewall
Note: Perform this step on all the servers
Turn off the firewall and reboot the machine. The second command starts the NTP daemon installed earlier.

systemctl disable firewalld
service ntpd start
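
Because the package and service steps in this section are repeated on every node, they can be collected into one script and pushed to each node over SSH. This is only a sketch under several assumptions: the node names m2 and m3 are hypothetical (use the DNS names you noted earlier), the login user is ec2-user with the demokp.pem key, and that user can run sudo without a terminal. SELinux is not handled here. With DRY_RUN=1 (the default) nothing is executed.

```shell
#!/usr/bin/env bash
# Hypothetical node list and key file; override via the environment.
NODES="${NODES:-m1 m2 m3}"
KEY="${KEY:-demokp.pem}"

# The per-node package and service commands from this section, as one script.
common_setup_script() {
  cat <<'EOF'
yum install -y java-1.8.0-openjdk-devel wget ntp
systemctl enable ntpd
systemctl start ntpd
systemctl stop firewalld
systemctl disable firewalld
EOF
}

# Run the script on every node over SSH; only print it when DRY_RUN=1.
setup_all_nodes() {
  local node
  for node in $NODES; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would run on $node:"
      common_setup_script
    else
      common_setup_script | ssh -i "$KEY" "ec2-user@$node" 'sudo bash -s'
    fi
  done
}

# Example: DRY_RUN=0 NODES="host1 host2 host3" setup_all_nodes
```

Keeping the commands in one place means every node gets the same setup, and the dry run shows exactly what would be sent before anything changes.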

Configure Passwordless SSH
Please refer to the post "Setting up SSH Passwordless Login Using SSH Keygen" to configure passwordless SSH.
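
For orientation, the general idea of passwordless SSH can be sketched as follows: generate a key pair on one node and append its public key to authorized_keys on the others. This is not taken from the referenced post. The key path, the ec2-user login, and the use of demokp.pem for the first connection are assumptions. With DRY_RUN=1 (the default) the second function only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch: generate a key on one node and authorize it on another.
# KEY_FILE, AWS_KEY and the ec2-user login are assumptions.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_rsa}"
AWS_KEY="${AWS_KEY:-demokp.pem}"

# Create an RSA key pair with an empty passphrase unless one already exists.
generate_key() {
  [ -f "$KEY_FILE" ] || ssh-keygen -t rsa -b 4096 -N "" -f "$KEY_FILE" -q
}

# authorize_on HOST: append the public key to HOST's authorized_keys,
# logging in with the AWS key pair. Only prints the target when DRY_RUN=1.
authorize_on() {
  local host="$1"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would append ${KEY_FILE}.pub to ec2-user@${host}:~/.ssh/authorized_keys"
  else
    ssh -i "$AWS_KEY" "ec2-user@${host}" \
      'mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys' \
      < "${KEY_FILE}.pub"
  fi
}

# Example: generate_key && DRY_RUN=0 authorize_on <private-dns-of-node>
```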

Step #5: Download and Install Cloudera

Download Cloudera
Note: Perform this step on machine m1
Go to >> Right click on cloudera-manager-installer.bin and copy link address

Paste the URL into the web browser on m1 to download the Cloudera installer, as shown below
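
If m1 has no graphical browser, the copied link can be fetched with the wget package installed in Step #4 instead. A minimal sketch, with the URL deliberately left as a parameter (pass the link address you copied); the function name is made up for this example. With DRY_RUN=1 (the default) it prints the commands without downloading anything.

```shell
#!/usr/bin/env bash
# Sketch: download the installer on m1 with wget instead of a browser.
# The URL is NOT filled in here; pass the link address copied earlier.
fetch_installer() {
  local url="${1:-}"
  local out="cloudera-manager-installer.bin"
  if [ -z "$url" ]; then
    echo "usage: fetch_installer URL" >&2
    return 1
  fi
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ wget -O $out $url"
    echo "+ chmod u+x $out"
  else
    # Download, then mark the file executable so it can be run later.
    wget -O "$out" "$url" && chmod u+x "$out"
  fi
}

# Example: DRY_RUN=0 fetch_installer "<copied link address>"
```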
