Here is a short presentation that I gave as a guest speaker to a group of graduate-level statisticians at the University of Utah. It is a very high-level view of what Hadoop is, who is using it, when you should use it, and a bit on how it works.
Here are some short notes on installing Cloudera CDH4 on Ubuntu 12.04 LTS running as a guest OS on Oracle’s VirtualBox. For those unfamiliar with Cloudera and CDH, CDH is Cloudera’s 100% open source Hadoop distribution. This is not a complete tutorial, but rather a collection of tips to be used in conjunction with the product’s documentation. Use them to make installing Cloudera CDH4 on Ubuntu easier.