Installing Cloudera CDH4 on Ubuntu 12.04 LTS

These are short notes on installing Cloudera CDH4 on Ubuntu 12.04 LTS running as a guest OS on Oracle’s VirtualBox. For those unfamiliar with Cloudera and CDH, CDH is Cloudera’s 100% open source Hadoop distribution. This is not a complete tutorial, but rather a set of tips to be used alongside the product’s documentation to make installing Cloudera on Ubuntu easier.

Prerequisites

Creating the VM

Create a new virtual machine using the new VM wizard and the downloaded Ubuntu ISO. It is important to use the 64-bit LTS ISO, or Cloudera Manager will not start.

VM Settings:

  • 4GB RAM (minimum)
  • 2 CPUs
  • 128MB Display Memory
  • 25GB Dynamic Disk
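The same settings can also be applied from the host's command line instead of the wizard. What follows is only a sketch: the VM name `cdh4-ubuntu` is a made-up placeholder, attaching the disk and the Ubuntu ISO is left to the VirtualBox UI, and newer VirtualBox releases spell `createhd` as `createmedium disk`. By default the script only prints the commands; run it with `RUN=""` to execute them.

```shell
#!/bin/sh
# Hypothetical VM name; pick your own.
VM_NAME="cdh4-ubuntu"
MEM_MB=$((4 * 1024))     # 4GB RAM (minimum)
CPUS=2                   # 2 CPUs
VRAM_MB=128              # 128MB display memory
DISK_MB=$((25 * 1024))   # 25GB dynamically allocated disk

# Dry run by default: commands are echoed, not executed.
# Invoke with RUN="" to really call VBoxManage.
RUN=${RUN-echo}

$RUN VBoxManage createvm --name "$VM_NAME" --ostype Ubuntu_64 --register
$RUN VBoxManage modifyvm "$VM_NAME" --memory "$MEM_MB" --cpus "$CPUS" --vram "$VRAM_MB"
$RUN VBoxManage createhd --filename "$VM_NAME.vdi" --size "$DISK_MB" --variant Standard
```

Keeping the sizes in variables makes it easy to give the VM more than the 4GB minimum if the host can spare it.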

When finished, you should see something similar to the following:

Configuring Ubuntu 12.04 LTS

Once you have started the Ubuntu VM and logged in, set a password for root. Cloudera Manager needs the password to install the cluster. Alternatively, you can use a passwordless sudo setup.


sudo passwd root
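For the passwordless sudo alternative, one common approach is a drop-in file under /etc/sudoers.d. The sketch below only prints the rule; the install steps are left commented out, and the file name `cdh-install` is a placeholder rather than anything Cloudera requires.

```shell
#!/bin/sh
# Build a sudoers rule that grants one user passwordless sudo.
sudoers_rule() {
  printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$1"
}

# Preview the rule for the current user; nothing is written to disk here.
sudoers_rule "$(id -un)"

# To install it, uncomment the lines below (file name is a placeholder):
#   sudoers_rule "$(id -un)" | sudo tee /etc/sudoers.d/cdh-install >/dev/null
#   sudo chmod 0440 /etc/sudoers.d/cdh-install
#   sudo visudo -c    # syntax-check the sudoers files before logging out
```

Running the syntax check before closing the session matters, because a broken sudoers file can lock you out of sudo entirely.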

Next, install the SSH server and client. Cloudera Manager needs them for cluster installation:


sudo apt-get install openssh-client
sudo apt-get install openssh-server

Make the following changes to /etc/hosts, replacing KRDAVIS-CLOUDERA with your VM's hostname. If you leave the file unmodified, you will see a number of cluster startup errors, such as HBase failing to start or a number of default directories not being created:


127.0.0.1 KRDAVIS-CLOUDERA localhost
#127.0.0.1 localhost
#127.0.1.1 KRDAVIS-CLOUDERA
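To confirm the edit took effect, it can help to check which address the hostname maps to in the hosts file. The helper below, `hosts_ip_for`, is a hypothetical name introduced for this sketch; it reads a hosts-format file, skips commented-out entries, and prints the first address listed for a given name. The demo runs against a temporary copy of the entries above so it does not depend on your real /etc/hosts.

```shell
#!/bin/sh
# Print the first IP that a hosts-format file maps to the given name.
# $1 = hostname to look up, $2 = hosts file (defaults to /etc/hosts)
hosts_ip_for() {
  awk -v name="$1" '
    $1 ~ /^#/ { next }                 # skip commented-out entries
    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break           # ignore trailing comments
        if (tolower($i) == tolower(name)) { print $1; exit }
      }
    }' "${2:-/etc/hosts}"
}

# Demo against a temporary file holding the entries from this post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
127.0.0.1 KRDAVIS-CLOUDERA localhost
#127.0.0.1 localhost
#127.0.1.1 KRDAVIS-CLOUDERA
EOF
hosts_ip_for KRDAVIS-CLOUDERA "$sample"   # prints 127.0.0.1
rm -f "$sample"
```

On the VM itself, `hosts_ip_for "$(hostname)"` should print 127.0.0.1 once the file matches the entries above.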

Install the GNOME session fallback package:


sudo apt-get install gnome-session-fallback

Log out and select “Gnome Classic (no effects)” for your session. This avoids the display glitches that come with running Compiz inside the VM. You can now log back in.

Install Cloudera CDH4

Start the cluster installation by running the Cloudera installation manager:


chmod 755 cloudera-manager-installer.bin
sudo ./cloudera-manager-installer.bin

Follow the instructions and accept the default values. When you are done, you should have a single node CDH4 cluster running in your VM!
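If the installer refuses to start, one thing worth ruling out is a 32-bit guest, since the 64-bit ISO is a hard requirement. A quick check, assuming `uname -m` reports `x86_64` on a 64-bit Ubuntu guest (the function name `is_64bit_arch` is made up for this sketch):

```shell
#!/bin/sh
# Succeed only when the given machine string is 64-bit x86.
is_64bit_arch() {
  [ "$1" = "x86_64" ]
}

arch=$(uname -m)
if is_64bit_arch "$arch"; then
  echo "64-bit guest detected ($arch)"
else
  echo "Guest reports $arch; the installer needs a 64-bit OS" >&2
fi
```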

Shutting down the cluster and VM

When it comes time to shut down the VM, I found I have fewer problems if I first shut down the cluster by logging into the Cloudera management web app. Select “All Services” from the “Services” menu. For the cluster, select “Stop…” from the Actions dropdown menu. Wait for all services to come to a stop.

After verifying that all cluster services are stopped, shut down the VM by opening a terminal and running the following command:


sudo /sbin/shutdown -h now

Selecting shutdown from the Ubuntu UI appears to only log out of the system without shutting it down. That is a problem for another day.


11 Replies to “Installing Cloudera CDH4 on Ubuntu 12.04 LTS”

  1. Hello Keith, thanks a lot for this post. Can you please provide more details on the installation steps?

    And one question: assuming I have installed CDH4 successfully on Ubuntu, how can I start my nodes?

  2. Hi Davis,
    I have installed CDH4-YARN. While updating, some settings got corrupted. I am unable to start the NN and SNN. The error is:

    sush@sush-desktop:~$ for svc in /etc/init.d/hadoop-hdfs-* ; do sudo $svc start ; done
    * Starting Hadoop datanode:
    starting datanode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-datanode-sush-desktop.out
    * Starting Hadoop namenode:
    bash: line 0: cd: /var/lib/hdfs/: No such file or directory
    * Starting Hadoop secondarynamenode:
    bash: line 0: cd: /var/lib/hdfs/: No such file or directory

    Please help me resolve the issue. I have deleted /tmp/ and formatted the NN. Other nodes are starting, but the NN and SNN are not.

  3. Shan – it looks like /var/lib/hdfs was not created during the install process. I would suggest that you wipe your cluster nodes and retry the install process. Failing that, I would search the excellent resources on the Cloudera site to help you troubleshoot your problems. That is where I found some of the answers to create this post.

  4. ./cloudera-manager-installer.bin: 1: ./cloudera-manager-installer.bin: Syntax error: ")" unexpected

    This is the error I get when I try to run the bin file. Please let me know what to do.

  5. Please go to the Cloudera site and see if you can find an answer in their excellent online resources. If you don't find an answer to your problem, open a support request with Cloudera.

  6. Thanks for this post, Keith. It was really useful.

    I successfully installed Cloudera and logged into it on my system (localhost:7150). During the cluster installation,
    I searched for the host by typing localhost before installing. During the installation I get the following error.

    Installation failed. Failed to receive heartbeat from agent.
    Ensure that the host's hostname is configured properly.
    Ensure that port 7182 is accessible on the Cloudera Manager server (check firewall rules).
    Ensure that ports 9000 and 9001 are free on the host being added.
    Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details).

    Can you tell me what changes to make so that it runs properly?

    /etc/hosts contains the following details:

    127.0.0.1 PDURAI-CLOUDERA localhost

    #127.0.0.1 localhost

  7. Please go to the Cloudera site and see if you can find an answer in their excellent online resources. If you don't find an answer to your problem, open a support request with Cloudera.

    I haven't experienced any of the problems posted by others here. I am not trying to put anyone off by not answering questions to errors. I think the software publisher is the place to go.

    Best of luck!
