Free Online Hadoop Training by Cloudera

Update: Some of the links are no longer valid. Clicking those takes you to the Cloudera videos page instead.

I was looking for some online training on Hadoop. Turns out Cloudera has some excellent training available online, and best of all it is free.

For easy reference, I decided to create a list of all the training material, along with the time required to go through each item.

| Training | Type | Time |
| --- | --- | --- |
| Introduction to HBase | Video | 15 min |
| Download and set up training VM | VMware VM | ~ |
| Thinking at Scale | Video | 23 min |
| MapReduce and HDFS | Video | 50 min |
| Getting Started with Hadoop | PDF, exercises | 30 min |
| Hadoop Ecosystem Tour | Video | 20 min |
| Programming with Hadoop | Video | 50 min |
| MapReduce Algorithms | Video | 35 min |
| Writing MapReduce Programs | PDF, exercises | 90 min |
| Introduction to Hive | Video | 50 min |
| Hive Tutorial | Video | 16 min |
| Introduction to Pig | Video | 50 min |
| Pig Tutorial | Video | 15 min |

So far I have gone through the first couple of videos and set up the VM. The videos are really good. Hopefully, they will offer more of their training online (free, or paid) or start to offer it more frequently in Europe.

Set timeout on Web Service proxy generated by Flex Builder

Flex Builder has a pretty nifty WSDL tool to generate support classes to access a Web Service.
Some of the generated code looks rather sloppy, but then that is the case with most code generation tools.

One glaring omission is that there is no way to set a request timeout when using the generated Service class.
What adds to the confusion is that the underlying Service class extends mx.rpc.soap.AbstractWebService, and hence mx.rpc.AbstractService, which has a requestTimeout property. Setting that property has no effect, however.

To set a timeout on the request, you need to modify the Flex Builder generated class.

Find the method that actually makes the call to the remote server, and set the requestTimeout value on the AsyncRequest being created:

private function call(operation:WSDLOperation, args:Object, token:AsyncToken, headers:Array=null):void
{
  var enc:SOAPEncoder = new SOAPEncoder();
  var inv:AsyncRequest = new AsyncRequest();
  // Added: apply the configured timeout to this request
  inv.requestTimeout = _asyncRequestTimeout;
  // ... rest of the generated method unchanged
}

And add an asyncRequestTimeout property to the service class:

// Timeout (in seconds) copied onto each AsyncRequest created in call()
private var _asyncRequestTimeout:int;
public function set asyncRequestTimeout(asyncRequestTimeout:int):void
{
  _asyncRequestTimeout = asyncRequestTimeout;
}

You could also use the asyncRequest property in AbstractService instead of creating your own property, but this way if somebody overwrites the generated class without adding these changes, the compiler will flag it as an error.
And a compile time error is always better than a runtime bug, isn’t it?

Using Maven with IntelliJ IDEA on Mac

Running or debugging a Maven target in IntelliJ IDEA results in a big scary error message:

Error running […]: No valid Maven installation found. Either set the home directory in the configuration dialog or set the M2_HOME environment variable on your system.

To fix this, you have to add the M2_HOME variable the Mac-specific way.

Create or modify the file ~/.MacOSX/environment.plist, and add an entry for M2_HOME.

If that file is already there, you may find that you can’t edit it in a text editor because it is a binary file. First convert it to plain text (XML), then edit it, and finally convert it back to the binary format.

$ plutil -convert xml1 environment.plist

$ vim environment.plist

$ plutil -convert binary1 environment.plist
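The same edit can also be scripted. What follows is only a minimal sketch, assuming Python 3 and its standard plistlib module; the helper name set_plist_env and the Maven path in the usage comment are made up for illustration. plistlib reads both the binary and the XML form, so no manual conversion step is needed in this variant. The sketch writes the file back as XML rather than binary.

```python
import plistlib
from pathlib import Path


def set_plist_env(plist_path, name, value):
    """Add or update one entry in a plist file, creating the file if absent.

    Hypothetical helper for illustration; not part of any Apple or Maven tool.
    """
    path = Path(plist_path)
    env = {}
    if path.exists():
        # plistlib.load detects binary vs. XML plists on its own
        with path.open("rb") as fp:
            env = plistlib.load(fp)
    env[name] = value
    path.parent.mkdir(parents=True, exist_ok=True)
    # Written back as XML so the file stays editable in a text editor
    with path.open("wb") as fp:
        plistlib.dump(env, fp, fmt=plistlib.FMT_XML)
    return env


# Example call (the Maven location is an assumption; use your own install path):
# set_plist_env(Path.home() / ".MacOSX" / "environment.plist",
#               "M2_HOME", "/usr/share/maven")
```

Existing entries in the file are kept; only the named key is added or overwritten.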



Developer communities in the Netherlands

In July last year, I moved to the Netherlands to join Xebia Nederland after working at Xebia India for a year.

There were a few different reasons for the move, but one thing that I was personally looking forward to was a more vibrant developer community. In spite of the huge number of developers in India, there are not that many hard-core developer events or communities. The events that do exist are geared towards marketing or towards spreading the latest buzzwords, be they Agile, Lean, Cloud or something else. And Malaysia before that wasn’t any better.

In contrast, in the Netherlands, I have been to a whole lot of programming groups and events in just six months – Devoxx 2010, and a fair few NL NoSQL group meetups. Not to mention such events hosted by Xebia itself – many excellent Xebia knowledge exchange sessions and tech rallies. A few months ago, we had James Coplien come in for two days to talk about lean architecture and organizational patterns. During one of the last tech rallies, Dan North dropped by for a while and talked about deliberate discovery and his other (post agile?) ideas. In short, the developer in me has been having a rocking time. Let us see what the year 2011 brings!

On being a programmer in India

Early in 2009, we moved to India for a while after having lived in Malaysia for over 6 years. This was also the first time I really got to work in software development in India (not counting my very first job which I no longer count in my experience).

Now, I worked at a great place and learnt a lot in the year I was there, not to mention the many new friends I made. But there was one thing I realized while there that will make me think twice about moving to India again. You see, if you are working in a services company and most of the time you are working for clients in Europe or the US, the main reason they are coming to India is to save costs.

What that means for you as a programmer is that there is an upper bound on how much the client will pay for you, be it USD 25, 35 or 50. Past that limit, it does not matter how good you are; your company cannot bill you at a higher rate. The reason is that most of the time, for the client, you are basically a guy with 6 years of experience in technology X (or 8 years, or whatever).

Of course, there are developers in India who are getting paid the same as developers in the US. But not in a services company. They are probably working in certain product companies, or through their existing network, or on their own startups. You can get what you believe you are worth only if the client is hiring “You” and not just an “X years of experience” guy. This is something I will have to think really hard about if I ever move to India again.

Pivotal Tracker – agile project planning

A mention from a colleague made me look up Pivotal Tracker. It looks like a really nice tool for planning Scrum sprints. We are using VersionOne on our current project, and I hate that beast. In comparison, Pivotal Tracker’s clean, usable interface looks really great.


I created a small test project to check it out. It is somewhat opinionated software, and may not do some of the things that you expect it to do. For example, by default, there is no option to break down a story into tasks, or to assign points to bugs and technical tasks, but you can change that setting on a per-project basis if you want to. Check out their help section for other things it can or cannot do: Pivotal Tracker help.
And it is free with no limitations! What more can you ask for? Go check it out today.

Another good write up on Pivotal Tracker –

Helping web developers and operations bridge the deployment gap
Thoughtworks Mingle vs. Pivotal Labs Tracker.

WordPress Post-Notification – Show the author in email subject

I was trying to fix the WordPress Post Notification plugin so that it includes the post author in the email subject. Instead of inserting the author name, it was leaving the literal @@author placeholder in the subject.

A little digging around found me the solution –

In sendmail.php->post_notification_create_email(), find –

$subject = get_option('post_notification_subject');

and add a new line after that –

$subject = str_replace('@@author', $post_author, $subject); 

Ref – Post Notification forum

Moving a WordPress blog to new domain

The WordPress documentation does a pretty good job of explaining the steps you have to take if you are moving your WordPress blog to a new domain.

See Changing The Site URL – Domain Name Change

Even after you update the guid in your WordPress database’s posts table by following the instructions there, you may find that some of the images do not show up, and the image and other links are broken.

Run this query to see if the problem is in the post contents in the WordPress database’s posts table:

select id, post_title from wp_posts where post_content like '%exampleoldsiteurl%';

(Replace exampleoldsiteurl with the old url of your blog)

If this shows posts that are using the old url, you need to replace that url in the post_content just as you did for guid –

UPDATE wp_posts SET post_content = replace(post_content, 'exampleoldsiteurl','examplenewsiteurl');

Make sure to carefully read the instructions on the WordPress Codex link above, and make absolutely certain that you have a backup of your database before you run these SQL commands on it.
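If you would like to rehearse the check-then-replace sequence before touching the real database, it can be tried on a throwaway one first. This is only a sketch, assuming Python's built-in sqlite3 module as a stand-in for the MySQL database WordPress normally uses. The two sample rows are made up, and the URLs are the same placeholders as in the queries above.

```python
import sqlite3

OLD_URL = "exampleoldsiteurl"  # placeholder, as in the queries above
NEW_URL = "examplenewsiteurl"  # placeholder, as in the queries above

# In-memory stand-in for the real database, with only the columns used here
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wp_posts (id INTEGER PRIMARY KEY, post_title TEXT, post_content TEXT)"
)
conn.executemany(
    "INSERT INTO wp_posts (post_title, post_content) VALUES (?, ?)",
    [
        ("Hello", '<img src="http://exampleoldsiteurl/a.png">'),  # made-up sample row
        ("Other", "no links here"),                               # made-up sample row
    ],
)


def posts_with_old_url(connection):
    """Return (id, post_title) for posts whose content still mentions OLD_URL."""
    return connection.execute(
        "SELECT id, post_title FROM wp_posts WHERE post_content LIKE ?",
        (f"%{OLD_URL}%",),
    ).fetchall()


before = posts_with_old_url(conn)

# Same replace() call as in the UPDATE above, with the URLs passed as parameters
conn.execute(
    "UPDATE wp_posts SET post_content = replace(post_content, ?, ?)",
    (OLD_URL, NEW_URL),
)
conn.commit()

after = posts_with_old_url(conn)
print("posts with old URL before:", before)
print("posts with old URL after:", after)
```

Running the SELECT both before and after the UPDATE is the useful habit here: the second result should come back empty once every occurrence has been replaced.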