Free Online Hadoop Training by Cloudera

Update: Some of the links are no longer valid. Clicking those takes you to the Cloudera videos page instead.

I was looking for some online training on Hadoop. Turns out Cloudera has some excellent training available online, and best of all it is free.

For easy reference, I decided to list all the training material along with the time required to go through each item.

Training                          Type            Time
Introduction to HBase             Video           15 min
Download and set up training VM   VMware VM       ~
Thinking at Scale                 Video           23 min
MapReduce and HDFS                Video           50 min
Getting Started with Hadoop       PDF, exercises  30 min
Hadoop Ecosystem Tour             Video           20 min
Programming with Hadoop           Video           50 min
MapReduce Algorithms              Video           35 min
Writing MapReduce Programs        PDF, exercises  90 min
Introduction to Hive              Video           50 min
Hive Tutorial                     Video           16 min
Introduction to Pig               Video           50 min
Pig Tutorial                      Video           15 min

So far I have gone through the first couple of videos and set up the VM. The videos are really good. Hopefully, Cloudera will offer more of its training online (free or paid), or start to offer it more frequently in Europe.

Security against malicious websites

Most people know about computer viruses, but very few people are aware of the danger posed by malicious websites.
If you are not careful, malicious websites can steal your personal data by exploiting vulnerabilities in other websites you use. These kinds of attacks are generally referred to as cross-site scripting, or XSS, and in general it is very hard to be sure that a website you visit is not vulnerable to such attacks.

Logo designer David Airey lost his domain as a result of an XSS attack, and a while ago Friendster suffered from a similar attack.
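To make the idea a bit more concrete, here is a minimal sketch of how a site can end up vulnerable. It is only an illustration: the function names and the `steal(...)` payload are made up for this example and do not come from any real site. The page builds HTML from a visitor's comment; the unsafe version pastes the text in as-is, while the safer version escapes it first using `html.escape` from Python's standard library.

```python
import html

def render_comment_unsafe(comment: str) -> str:
    # Vulnerable: the visitor's text is pasted into the page unchanged,
    # so any <script> tag in it would be run by the next reader's browser.
    return "<p>" + comment + "</p>"

def render_comment_safe(comment: str) -> str:
    # Safer: special characters such as < and > are turned into
    # harmless entities, so the browser shows them as plain text.
    return "<p>" + html.escape(comment) + "</p>"

# Hypothetical attack payload; steal() is an invented name.
payload = "<script>steal(document.cookie)</script>"

print(render_comment_unsafe(payload))
print(render_comment_safe(payload))
```

As a visitor you cannot tell which of the two versions a site is running, which is why the precautions below are about limiting the damage rather than spotting vulnerable sites.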

One precaution you can take against such attacks is to have multiple browsers on your computer, and use separate browsers to access sites with different trust levels. I divide the sites I visit into three trust levels and use a different browser for each. The first is for my primary email, banking and the like. The second is for my secondary email (no personal stuff), blogs and other known sites. The third is for sites I reach from search engine results or other untrusted sources. This might sound paranoid, but when it comes to computer security, a certain amount of paranoia is essential, especially if you use your computer for business or professional work.

Another common mistake is to log in to your mail and other accounts from internet cafes while traveling. You can never be sure whether a computer at an internet cafe has a key logger or other malicious software on it, installed by an unscrupulous employee, by another user, or by a downloaded virus. Set up a temporary email account for use while you are traveling, and have your other email accounts forward their mail to it. This does not guarantee safety, but if the temporary account is compromised, your main accounts and their passwords are not exposed.

For higher security, you really need to dig deeper into computer security issues. This article at Wikipedia is a good start. But for most people, following a few reasonable precautions like the ones mentioned above can offer a good enough safety net.