Building Java projects on RunCodeRun

26 01 2010

RunCodeRun is a hosted continuous integration environment developed by Relevance. Thanks to the Relevance team for providing this service free of charge to open source projects.

I wanted to build rapa on RunCodeRun, and I set up my project using the instructions in a blog post. The post is very useful and easy to follow, but I had trouble getting the code to compile on the RunCodeRun box. Here are the problems and their solutions.

I guess there is no real JDK on RunCodeRun, so the compile task in my Ant build would not work.

I checked in tools.jar as part of my project libraries, then added it to the classpath when invoking Ant from rake.

jars = %w[ant tools junit ant-junit ant-launcher]
classpath = jars.map { |jar| File.join(".", "lib", "#{jar}.jar") }.join(File::PATH_SEPARATOR)
# org.apache.tools.ant.Main is Ant's command-line entry point; "dist" is the build target
system "java -cp #{classpath} org.apache.tools.ant.Main -emacs dist"

Even after I fixed that, the build would always go green, even when Ant reported a failure.

I guess the exit status returned by rake was always zero, irrespective of what Ant returned. I changed the Rakefile as shown below.

system "java -cp #{classpath} org.apache.tools.ant.Main -emacs dist"
exit $?.exitstatus  # pass Ant's exit status on to whoever called rake
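The idea can be tried outside a real build. Below is a minimal sketch, not the project's actual Rakefile: the helper name run_and_capture_status is made up for illustration, and a child Ruby process that exits with status 3 stands in for a failing Ant run.

```ruby
require "rbconfig"

# Runs a command and returns the exit status of the child process.
# (Hypothetical helper name, used only for this illustration.)
def run_and_capture_status(*command)
  system(*command)
  $?.exitstatus
end

# Stand-in for a failing Ant build: a child Ruby process that exits with status 3.
failing_build = [RbConfig.ruby, "-e", "exit 3"]
status = run_and_capture_status(*failing_build)
puts "build exited with status #{status}"

# In a Rakefile the task would then finish with `exit status`,
# so that the CI server sees the failure instead of a zero from rake.
```

Without the explicit exit, rake only sees that `system` returned, and the build is reported as green.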

This seemed to fix all the issues with the build.

Along with these issues, I also ran into a problem with Ant's junit task and JUnit 4. The standard distribution of Ant did not understand JUnit 4 annotations and required test classes to extend TestCase. I had to check out the latest source from the Ant Subversion trunk and build it. I am currently using these jars.

Have a look at the project on GitHub.

Continuous integration on.


Mnesia Quickstart

5 01 2010

A basic introduction to Mnesia

Mnesia is the distributed DBMS that ships with Erlang/OTP. Its data model is relational in flavour, but it belongs in the NoSQL camp because the query language is Erlang, not SQL. Erlang terms are stored and retrieved as they are, with no object-relational mapping layer in between. In that sense Mnesia can be described as a relational/object hybrid.

Why and where would one want to use Mnesia?

Erlang in general is used to program highly distributed and fault tolerant systems. Even though it has its roots in the telecommunication industry, it has proven useful in several other systems such as ejabberd, Facebook chat etc. Mnesia is a part of Erlang and is itself built with Erlang.

Hence it gives you a configurable degree of fault tolerance (by means of replication).

Another important feature of Mnesia is the dirty interface: it is possible to read, write and search Mnesia tables without wrapping the operation in a transaction.


The steps below should get you started with Mnesia if you are using it for the first time.

Required Software – Erlang

I use Ubuntu as my OS, but the steps should not differ much on any other OS.

Now let's start with some code. It is useful to install the table viewer (erlang-tv).

The goal of this exercise is to create a table called person, insert a few records and read them.

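Create a file called person.hrl

-record(person, {name,       %% atomic, unique key
                 age,        %% age
                 married_to, %% name of partner or undefined
                 children}). %% list of children

Create a file called person.erl

-module(person).
-export([init/0, insert/0, read/1]).
-include("person.hrl").
init() -> mnesia:create_table(person, [{attributes, record_info(fields, person)}]).
insert() ->
    T = fun() ->
        X = #person{name=john}, %% fields not set here default to undefined
        mnesia:write(X) end,
    mnesia:transaction(T).
read(Name) ->
    R = fun() -> mnesia:read({person, Name}) end,
    mnesia:transaction(R).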

Start the Erlang shell by running the command below from the directory that contains the two files above.

erl -mnesia dir '"."'

The above command tells Mnesia to store its files in the current directory.

Compile person.erl

c(person).

Start Mnesia

mnesia:start().

Create the person table.

person:init().

Insert a record.

person:insert().

Use the table viewer to check if the record has been inserted.

tv:start().

This launches the table viewer application. By default the table viewer shows the ETS tables. To look at the table we just created, go to the View menu and select Mnesia tables.

Read the record using mnesia:read()

person:read(john).


In my next post I will cover Mnesia queries written as list comprehensions.

Jumping through hoops to represent trees in Database

29 12 2009

Recently I have been working on a project where we have to represent hierarchical data in a database. Unfortunately we do not have much choice with the database. We are using a relational database.

If you have done this, you will agree with me that it is not a very enjoyable experience.

First, we need to choose between several models for representing trees in a database:

a. Adjacency (self referential tables)

b. Materialized path (lineage)

Shortcomings of adjacency model

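Tree traversal is costly in the adjacency model. Each row knows only its direct parent, so finding the children and grandchildren of a node takes one more self-join or query per level, and fetching a whole subtree can get quite complex.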
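To make the cost concrete, here is a small in-memory sketch of the adjacency model. The table contents and the names PARENT_OF, children_index and descendants are invented for illustration. The loop advances one level per iteration, which is where the extra query per level would come from against a real database.

```ruby
# Rows of a hypothetical self-referential table: id => parent_id (nil marks the root).
PARENT_OF = { 1 => nil, 2 => 1, 3 => 1, 4 => 2, 5 => 4 }

# Index children by parent once, so each lookup is a hash access instead of a scan.
def children_index(parent_of)
  index = Hash.new { |hash, key| hash[key] = [] }
  parent_of.each { |id, parent| index[parent] << id unless parent.nil? }
  index
end

# Collects all descendants of root, one tree level per loop iteration.
def descendants(parent_of, root)
  index = children_index(parent_of)
  found = []
  frontier = [root]
  until frontier.empty?
    frontier = frontier.flat_map { |id| index[id] }  # children of the current level
    found.concat(frontier)
  end
  found
end

p descendants(PARENT_OF, 1)
```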

Shortcomings of materialized path

A materialized path requires you to build the lineage information at some point in time. If you have a million records for which you need to build materialized paths, then I suggest you start now, because no one knows when it will end. If someone knows of an efficient way of doing this, please let me know. If you get past this stage, there is still the issue of updating the data to handle moves and deletions.
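One possible way to build the paths in a single pass is to memoize each node's path, so that every row is visited once. The sketch below works on an in-memory hash and assumes the whole id => parent_id mapping fits in memory and the tree is not extremely deep (the helper recurses once per level). The names materialized_paths and descendants_by_path and the "1/2/4" path format are illustrative choices, not something a particular database prescribes.

```ruby
# Builds "1/2/4"-style paths for every row of an id => parent_id mapping.
# Each path is computed once and cached, so the work is linear in the number of rows.
def materialized_paths(parent_of)
  paths = {}
  build = lambda do |id|
    paths[id] ||= begin
      parent = parent_of[id]
      parent.nil? ? id.to_s : "#{build.call(parent)}/#{id}"
    end
  end
  parent_of.each_key { |id| build.call(id) }
  paths
end

# With paths in place, a subtree lookup is a prefix match
# (the equivalent of a LIKE '1/2/%' condition in SQL).
def descendants_by_path(paths, root)
  prefix = "#{paths[root]}/"
  paths.select { |_id, path| path.start_with?(prefix) }.keys
end

parent_of = { 1 => nil, 2 => 1, 3 => 1, 4 => 2, 5 => 4 }
paths = materialized_paths(parent_of)
p paths[5]
p descendants_by_path(paths, 2)
```

The trailing slash in the prefix keeps node 2 from matching a path such as "1/20".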

Static and Dynamic Data

The choice we make is mostly driven by how many changes we can expect. If we are never going to modify the data, materialized path, or any other approach that stores the lineage information alongside each row, is probably useful. But this is rarely the case.

Some vendor specific help

The folks at Microsoft and Oracle seem to have seen this issue and suggest the techniques below.

SQL Server

1. Common table expression: Popularly known as CTE, this is a way to run recursive queries on a self-referential table.

2. HierarchyID: This is a data type available from SQL Server 2008. It uses a materialized path.

Oracle

1. Start with and connect by: This is similar to the CTE approach above. It works on a self-referential table.

Object modeling trees

Imagine a scenario where you need to model a huge family. I guess we start with a Person class. Each person has zero or more children, and children is nothing but a collection of Persons. Mapping this to the data in a database is a pain.

1. Lazy loading: Most probably you will have to lazy load the children as and when you need them. Otherwise you may have to wait a generation for the complete tree to load.

2. Updates: If we want to implement operations like delete or reassignment, saving the data back to the database will not be easy.
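The lazy loading point can be sketched as follows. This is an illustration under assumptions, not code from the project: the loader lambda is a stand-in for a database query, and the FAMILY hash plays the role of the table.

```ruby
# A Person that fetches its children only when they are first asked for.
# `loader` is a hypothetical stand-in for a database query:
# it takes a name and returns the names of that person's children.
class Person
  attr_reader :name

  def initialize(name, loader)
    @name = name
    @loader = loader
    @children = nil
  end

  # Lazy load: the query runs on the first call and the result is cached.
  def children
    @children ||= @loader.call(name).map { |child| Person.new(child, @loader) }
  end
end

FAMILY = { "grandpa" => ["dad", "aunt"], "dad" => ["me"] }
queries = []
loader = lambda do |name|
  queries << name            # record each simulated database hit
  FAMILY.fetch(name, [])
end

root = Person.new("grandpa", loader)
puts root.children.map(&:name).inspect
puts queries.inspect
```

Only the branches that are actually walked cause a query; the rest of the tree stays in the database.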

Better ways to store hierarchical data

Hierarchies are graphs, so a graph database is a more natural fit for them. Neo4j is a popular example.

Coroutines – back to basics

27 12 2009

Ruby 1.9 Fibers have got me reading about coroutines.
I thought I should write down my understanding somewhere as I read about coroutines in more depth.

Most of the content in this post is just an aggregation of various sources.

Coroutines are program components that allow multiple entry points and can return any number of times. They are closely related to continuations, which capture the rest of a computation so that it can be resumed later.

All programming languages have one way or another to handle control flow, and every flow of control has associated state, such as the values of local variables. The call stack is the most common place to store this state. Every method call gets its own stack frame, and the frame is discarded once the method returns, either normally or through an exception.

In a coroutine this is not the case. We can suspend and resume execution without losing the stack.

Types of Coroutines:

1. Symmetric coroutines: There is a single transfer operation, and a coroutine uses it to hand control directly to any other coroutine it names. Example: Fiber#transfer in Ruby 1.9

2. Asymmetric coroutines: They are also called semi-coroutines. There are two operations, one to resume a coroutine and one to yield, and a yield can only transfer control back to the caller that resumed the coroutine. Examples: Lua coroutines, Ruby 1.9 Fibers used with resume and Fiber.yield


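Producer and consumer

def producer
  Fiber.new do
    value = 0
    loop do
      Fiber.yield value   # hand the value to whoever resumed this fiber
      value += 1
    end
  end
end

def consumer(source)
  Fiber.new do
    9.times do
      value = source.resume   # run the producer until its next yield
      puts value
    end
  end
end

consumer(producer).resume   # prints 0 through 8

Fibonacci series

fib = Fiber.new do
  x, y = 0, 1
  loop do
    Fiber.yield y
    x, y = y, x + y
  end
end

20.times { puts fib.resume }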

Why are coroutines important?

The main reason coroutines are back in the limelight is concurrency. In my humble opinion, concurrency is reviving many well-known but forgotten programming concepts.

To take the example of Ruby, most of us are aware of the Global Interpreter Lock. Threads in Ruby do not buy you parallelism: Ruby 1.8 runs all of them inside a single OS thread, and in Ruby 1.9 the lock lets only one thread execute Ruby code at a time. Fibers are similar to threads but much lighter, and they are scheduled cooperatively: they are suspended and resumed at points the programmer chooses.
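As a rough illustration of programmer-controlled scheduling, the sketch below runs two fibers round-robin on a single thread. The names make_worker and run_round_robin and the :more/:done markers are made up for this example; Fiber.new, Fiber.yield and resume are the standard Fiber calls.

```ruby
# Builds a fiber that performs `steps` units of work and yields after each one.
def make_worker(name, steps, log)
  Fiber.new do
    steps.times do |i|
      log << "#{name}#{i + 1}"
      Fiber.yield :more   # suspend here; locals and loop position are preserved
    end
    :done                 # value returned by the final resume
  end
end

# Resumes each fiber in turn until every one of them has reported :done.
def run_round_robin(workers)
  queue = workers.dup
  until queue.empty?
    worker = queue.shift
    queue.push(worker) unless worker.resume == :done
  end
end

log = []
run_round_robin([make_worker("a", 2, log), make_worker("b", 3, log)])
puts log.join(" ")
```

Nothing here runs in parallel. The interleaving comes purely from where each fiber chooses to yield.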

Coroutines can be used to construct the actor model of concurrency. This is the same model used by Erlang. Revactor is a very nice implementation of the actor model in ruby.

I will add code here when time permits.

Remote inception – An Experience Report on an inception over phone

9 12 2009

Before I start, I would like to state that this article does not advocate for or against running an agile inception over phone. It is more of an experience report. Please feel free to post your comments.


Inception is at the heart of a successful agile engagement. In an agile project we work with the client and not just for the client. Inception starts the process where the team, client and consultants, start thinking alike and working together.
It is so much easier to work with a colleague once you have synchronized your frequencies/wavelengths. Okay, enough blabber about wave theory.

This article is about my experience, learning and rant about remote inceptions. I intend to keep it more like a free flowing conversation.


Projects are set to fail if the initial understanding, the basis for all further development, is flawed. Some key questions immediately come to mind:

Does the client know what he wants?

Does he really need to build it, or can he buy something that already exists (COTS)?

The answers to these and many other questions would become apparent in an inception.

Ideally inception is where we start by setting a vision for the project, break it down into achievable milestones and further down into playable stories. But all this mandates that you have the client right in front of you.

Most inception exercises require face to face interaction to make communication as clear as possible. It is necessary to use tools (simple and sophisticated) to make mental models explicit, elicit the requirements and clear any doubts. Inception is a fun and effective way to interact with the client and bring everyone on the team on board with the project’s goals.

Most inceptions should have the activities below on the agenda.

1. Team introductions – May seem simple. But simple activities like playing a small team game act as the all important ice breaker.

2. Collaborative Modeling sessions – As many or as few as the project requires. A good inception would have several of these sessions on project specific topics as well as general discussions on Non Functional Requirements.

3. Prioritization – Lay out the options in front of the client as story cards. Let the client move the cards to prioritize them. In some cases this exercise leads to a rough release plan.

4. Inception showcase

The above activities are a small subset of an inception. But these are the ones which bring out the facts most necessary for the project's success. They are also the ones which require the most face to face interaction and team effort.

At the end of an inception the team must be able to decide if they should go ahead with the project.


Let me now explain a little bit about the scenario we were faced with. Our clients had a very limited budget and could not afford travel expenses, either for them to travel to our location (India) or for us to travel to theirs (Chicago). It may seem sensible not to start the project until sufficient budget is available. But the client could not get more budget unless something was built, and built soon. So we had to do an inception with them over the phone, with a 12 hour time difference (sadly, the video conference equipment on their side was broken).

All this forced us to be more resourceful and improvise with what we had. The only way ahead was to address all the risks as best as we could.

Managing risk

A remote inception is very risky business. The probability of success is quite small. Always communicate this to the customer and try and push for a face to face inception. Remember this is not for your benefit, but it is in the best interest of the customer. It is a good idea to maintain a shared risks log with the customer.

Below are some risks that we faced.

Risk: Our understanding of some features may not be completely correct


  • Client was made aware that there may be minor misunderstanding despite best efforts.
  • In our case the application functionality was quite closely associated with the UI. So we came up with early mockups that were as close as possible to what the client wanted. We let them edit the mockups and maintained them for future reference.
  • Rather than plainly documenting technical understanding, we built very crude prototypes. Most of the time, code is the best documentation and communication mechanism.
  • The client was made aware that our initial estimates would be bumped up by a certain risk factor to accommodate any issues with understanding. It is better to promise less and deliver more.

Risk: 12 hour time lag. It was evident from the beginning that we would have to work outside our usual hours to spend enough time on the inception.


We scheduled calls of 3 to 4 hours every day. A face to face inception can have a day-long agenda, but it is better to keep the sessions shorter on remote inceptions. Short 15 minute breaks were included.

Inception Agenda

We made sure that all stakeholders were in a position to dedicate time to the inception. Instant messenger proved very useful. We also sent out links to tools like webex (a desktop sharing tool). Initially we were confident about our own superhuman capability to spend late hours at the office on longer conference calls. But a senior member of our team rightly pointed out the flaw and reduced it to an optimal 3 hour call. This suited us well. After three hours on the phone it is extremely tiring to do any other productive work.

We prepared the agenda to ensure that we had time to cover all topics that we considered necessary. But it was not something that was set in stone. Some sessions finish ahead of time while others may reveal unknown areas, which require fresh slots to be included. We revised the agenda from time to time.

A typical day would start with a recap of the previous day's meeting notes. This was a quick 15 minute exercise which warmed up the team for the long call ahead. We also used this time to follow up on each other’s progress.

Communication and meeting notes

As in the case of any normal inception, never go without a good scribe. We took turns at this role and noted down all the key points. Though it may have sounded silly sometimes, we tried to paraphrase the client’s sentences and validated our understanding. The client was informed at the very beginning of the inception that we might have to repeat some lines to confirm our understanding. In our case one team member from the customer side volunteered to take notes as well. At the end of the day we would share notes, and if there were any differences in understanding we would resolve them in the following day’s meeting. Once all differences were cleared, we would put the notes in a place where everyone had access.

Try to learn how each person sounds, so that you can associate a voice and/or accent with a person. Also suggest that your customer do the same. This helps a lot in keeping the conversation easy.

Sometimes you will not know when the person on the other side of the phone has lost interest in what you are saying. It is better to speak slowly and clearly. While talking to a person face to face it is very easy to detect when he/she is losing interest. On the phone, one possible way to do this is to include small questions while you speak. This way you know the person on the other side is listening.


We used low tech tools to simulate a virtual card wall where clients could move cards. You could use an online card wall for this. Screen sharing tools like webex are extremely important. There are quite a lot of free tools available.

Start using a project management tool early in the cycle. Start adding stories to the project management tool as early as possible. A spreadsheet may be easy to start with. We used mingle for project management.

In Retrospect

If I had to do this all over again, I would still consider it extremely risky business. A few things that I might do differently are listed below.

  1. Get the video conference equipment worked out early. In our case the team size was small, so this did not become a great issue. I would strongly recommend having a video conference for bigger teams.
  2. If there is not enough budget for the entire team to travel, try to have at least one representative from your side at the customer’s location. He could facilitate the activities.
  3. Capture the client's mood over the period of the inception using tools like a Niko-niko Calendar.


Remote inceptions are tough, if not impossible. Try to avoid them as much as you can. But if you have to do one, know that you are not alone. In the end it is our goal to help the customer, no matter what the constraint. Fortunately, in our case the exercise was a success and the customer was happy.

I hate ORM

9 12 2009

The title is not meant to start a war over the concept of ORM. I appreciate the effort that has gone into mappers. But let's take a look at why I hate ORMs. (Don't hate me because I hate ORM 🙂 )


I am beginning to wonder how many applications that we build really need a relational database.

Some terms become synonymous with their usage. For instance, Xerox has become synonymous with copiers.

Relational databases have almost become synonymous with Databases. As a developer or anyone involved in system design it is very important to know the options that are available to store data. The choice of persistence technology governs application scaling and performance in a very big way.

Now, Why do I hate ORM

ORMs hide the inconvenience that comes with using RDBMS with object oriented code.

When I learned relational modeling, I really liked it. I still like making relational models. But how long have relational databases been in existence? They were around long before the widespread usage of object oriented programming. Back then code was procedural. The relationships between data had to exist somewhere, and it made sense to have them in the persistent store. Querying became easier.

But it was rather hard to replace the older persistent stores with other technologies when we moved to object oriented code. The reasons were many, for example: availability of skilled database developers, strong trust in RDBMS, good vendor support etc. But the move towards newer languages like C++, Java and C# was inevitable. ORMs were a win-win solution to this problem.

Before ORM, all of us were used to writing a mapping layer ourselves. ORM was such a relief when it hit the market. It set us free after years of wrangling with ugly mappers. But in the revelry we seem to have forgotten that it was the database that needed a second look, not the codebase.

Now we have duplication of relationships, in the data as well as in the code. It is surprising that this duplication has not struck us as a problem.
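A tiny hand-written mapper makes the duplication visible. The sketch below is purely illustrative: the Author and Book structs, the row hashes and the load_authors helper are invented, and an ORM would generate the equivalent of this code from configuration rather than have us write it by hand.

```ruby
Author = Struct.new(:id, :name, :books)
Book   = Struct.new(:id, :title)

# Hypothetical rows as a relational store would return them.
# Here the relationship is expressed once, as the author_id foreign key.
AUTHOR_ROWS = [{ id: 1, name: "Ann" }]
BOOK_ROWS   = [
  { id: 10, title: "Trees",  author_id: 1 },
  { id: 11, title: "Graphs", author_id: 1 }
]

# Hand-written mapper: the same relationship is expressed a second time,
# now as an object reference from Author to its Books.
def load_authors(author_rows, book_rows)
  books_by_author = book_rows.group_by { |row| row[:author_id] }
  author_rows.map do |row|
    books = books_by_author.fetch(row[:id], []).map { |b| Book.new(b[:id], b[:title]) }
    Author.new(row[:id], row[:name], books)
  end
end

authors = load_authors(AUTHOR_ROWS, BOOK_ROWS)
puts authors.first.books.map(&:title).inspect
```

Any change to the relationship now has to be made in two places: the schema and the object model.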

Even frameworks like rails give us an impression that the standard way to build a web application is to use an RDBMS as a backend.

I simply cannot grasp the amount of effort we put into mapping objects to a schema. Another annoying issue is having to complete the database design before starting development. Using Hibernate or Active Record on top of an existing schema is nothing less than tying oneself up in knots.

There is no point in great object oriented code if the system design is not appropriate. It is my humble opinion that ORM should not be used as an excuse to choose relational databases over other options. As with anything, use it with discretion.

Let me know what you think.

Testability Explorer and rapa

5 09 2009

After a hard day at work my mind was in no state to churn out quality open source code. But I really wanted to get something done on rapa, so I thought it might be a good idea to integrate some metrics into the build and see what they had to say.

Misko has created this really cool tool called Testability Explorer. It tells you how testable your classes are. But why do we need such a tool?

Rapa was mainly designed and developed the TDD way. There is always more to learn and improve in the code. Certain parts of Rapa became increasingly difficult to develop the TDD way, so we needed a tool that could flag some basic testability issues.


Let's start with some basic testability issues that we all may have faced at some point in time.

A very common issue that people new to TDD and mocks face is mocks returning mocks. Below is an example.

MethodExecutor executor = new MethodExecutor(mockMethodProvider);

In the first line we are setting an expectation on mockMethodProvider to return a mockMethod. The mockMethodProvider is injected into the methodExecutor. Then we call execute on executor. In the last line we are verifying that execute was called on the mockMethod by the executor.

If we look at this code closely we can observe that MethodExecutor is being injected with mockMethodProvider when all it needs is the mockMethod.
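The same design issue can be sketched in Ruby with hand-rolled test doubles. This is an illustration only, not Rapa's code: the class names mirror the description above, while FakeMethod, FakeProvider and ProviderBackedExecutor are invented for the example.

```ruby
# Test double that records whether execute was called on it.
class FakeMethod
  attr_reader :executed

  def initialize
    @executed = false
  end

  def execute
    @executed = true
  end
end

# Test double for the provider: a fake whose only job is to return another fake.
FakeProvider = Struct.new(:method_to_return) do
  def get_method
    method_to_return
  end
end

# Before: the executor digs the method out of a provider,
# so the test has to set up a double that returns a double.
class ProviderBackedExecutor
  def initialize(method_provider)
    @method_provider = method_provider
  end

  def execute
    @method_provider.get_method.execute
  end
end

# After: the executor is handed the one collaborator it actually uses.
class MethodExecutor
  def initialize(target_method)
    @target_method = target_method
  end

  def execute
    @target_method.execute
  end
end

first = FakeMethod.new
ProviderBackedExecutor.new(FakeProvider.new(first)).execute

second = FakeMethod.new
MethodExecutor.new(second).execute

puts first.executed, second.executed
```

Both versions pass the same check, but the second test needs one double instead of two.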

It is a simple design issue. You can find this and many more examples of such common issues with testability in Misko’s blog.

Rapa and Testability Explorer

Integrating testability explorer itself into the build is an easy process. I used the instructions in this link.

Below is the report.

Testability Explorer Report for Rapa release 0.8

Looks like the code is not too bad. 🙂