Debugging apache mod_proxy_balancer

14 10 2013

Below are some notes that I made while debugging mod_proxy_balancer. I had to set it up in a hurry when I realized that the Amazon Elastic Load Balancer I was using is only capable of sticky sessions using cookies. I needed a load balancer that could use a URL parameter to maintain sticky sessions. Thankfully a friend suggested that we use mod_proxy_balancer.

There is a lot of material about mod_proxy_balancer, but it is hardly simple to get working. There are some less known details without which you cannot get it running at all. I would suggest you take a look at Nginx or other alternatives before choosing mod_proxy_balancer.

Tech stack summary

I had to serve a NodeJS-based JavaScript API through a load balancer capable of sticky sessions. The reason for sticky sessions is beyond the scope of this blog :). The entire setup is on Amazon EC2 instances running CentOS.

The whole #!

Since I had only two servers to load balance, I assigned them the ids s.1 and s.2. It is very important that the routes are named with an alphanumeric prefix, a dot and then a number, e.g. server.1, t.2 etc. The mod_proxy_balancer code splits this route name at the dot and uses the second value as the route number. So s.1 maps to “route=1”.

<Proxy balancer://mycluster/>
 BalancerMember http://<ip-address-1>:80 route=1
 BalancerMember http://<ip-address-2>:80 route=2
</Proxy>

The first request coming to mod_proxy_balancer is routed to one of the BalancerMembers by the load balancing method. Let's say this request is received by the server with route id s.1. The server then serves the request along with its route id (routeId=s.1). All further requests from that browser should now carry the URL parameter “routeId=s.1”. The configuration below tells mod_proxy_balancer to read this URL parameter and use it to route the request to server 1.

ProxyPass / balancer://mycluster/ lbmethod=byrequests stickysession=routeId
ProxyPassReverse / balancer://mycluster/
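
One less obvious prerequisite is that the proxy modules are actually loaded. On the CentOS builds of Apache the following LoadModule lines are usually already present in httpd.conf, but it is worth checking (module file paths vary by distribution):

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so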

That should get things working. But how do we know the above setup is actually working and sending requests to the appropriate servers?

LogLevel warn
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{BALANCER_SESSION_STICKY}e\" \"%{BALANCER_SESSION_ROUTE}e\" \"%{BALANCER_WORKER_ROUTE}e\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

BALANCER_SESSION_STICKY – This is assigned the stickysession value used for the current request. It is the name of the cookie or request parameter used for sticky sessions.

BALANCER_SESSION_ROUTE – This is assigned the route parsed from the current request.

BALANCER_WORKER_ROUTE – This is assigned the route of the worker that will be used for the current request.

I have taken the above information from the mod_proxy_balancer documentation.

To begin with, BALANCER_SESSION_STICKY should be the same as the “stickysession” parameter in the ProxyPass configuration. BALANCER_SESSION_ROUTE will not be set for the first request from the browser. BALANCER_WORKER_ROUTE will be chosen based on the load balancing algorithm.

After the first request is served by one of the servers, all further requests sent to the load balancer should carry the routeId URL parameter. BALANCER_SESSION_ROUTE should show the route parsed from that parameter: it should be “1” when the URL parameter is “routeId=s.1”. BALANCER_WORKER_ROUTE will be the same as BALANCER_SESSION_ROUTE, which shows that the requests are sticky.
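
A quick way to exercise this by hand is to hit the balancer once without the parameter and once with it, and watch the three BALANCER_* values in the access log. The host and path below are illustrative:

curl "http://<load-balancer-host>/some/path"
curl "http://<load-balancer-host>/some/path?routeId=s.1"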





Subdomains, pretty urls and some config

15 01 2011

This post sort of collates information about using subdomains to make your URLs look much nicer. Suppose you are building a Tumblr-like service; you would then also think of providing subdomain-based URLs for each of your customers. Let's say, for the sake of explanation, that you have a website called sconesandtea.com and you want several URLs under this domain, like cream.sconesandtea.com, jam.sconesandtea.com etc. It's not rocket science, but it is rather painful to search for all the information yourself if you are new to this. This is more of a write-up for myself, so excuse the free-form writing style.

To start with, you should configure subdomains with your domain registrar. Some basics are here. In short, you have to make sure the intended subdomain-based URL reaches your server in addition to your domain-based URLs.

Once you are done with that, take stock of the problem you have at hand.

  1. If you just want the URL to redirect, that can be handled at the nginx or apache level. For example, if you just want cream.sconesandtea.com to redirect to sconesandtea.com/addons/cream, you can achieve this with a URL rewrite in nginx or apache. The point to bear in mind is that this setup results in an HTTP redirect, and the URL in your browser will not be cream.sconesandtea.com but sconesandtea.com/addons/cream after the redirect. We will go into this in detail in a bit.
  2. But if you do not want http://sconesandtea.com/addons/cream to be exposed to the outside world and the public URL should be http://cream.sconesandtea.com, then some logic needs to be built into the app.

Simple Redirection

Below is a snippet using the nginx rewrite module.

set $subdomain "";
set $subdomain_root "";
if ($host ~* "^(.+)\.sconesandtea\.com$") {
    set $subdomain $1;
    rewrite ^(.*)$ http://sconesandtea.com/addons/$subdomain;
    break;
}

This will return an HTTP 302 redirect. If you want the status code to be 301, append the permanent keyword to the rewrite line.

rewrite ^(.*)$ http://sconesandtea.com/addons/$subdomain permanent;

More on this here.

Handling subdomains at application level

The first thing to get past is simulating the production scenario on a dev box. As most of you would know, add the below entry in your /etc/hosts to simulate a domain-based URL locally.

127.0.1.1    sconesandtea.com

But /etc/hosts does not support wildcard subdomains. So for testing purposes, add each subdomain explicitly.

127.0.1.1    cream.sconesandtea.com

Depending on the framework you are using, there may be several ways of using the subdomain to render specific pages. For Django you could use the middleware available here. It is quite a useful snippet: it makes the subdomain available on the request object, which you can then use elsewhere in your code. The snippet does not support subdomain-based URLs starting with www, so you may have to tweak it as per your application's needs.
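
If you end up rolling your own, the core of such a middleware is quite small. Below is a minimal sketch (old-style Django middleware; the class name and the request.subdomain attribute are my own choices, not something from the linked snippet):

class SubdomainMiddleware(object):
    """Attach the subdomain of the current host to the request."""
    def process_request(self, request):
        host = request.get_host().split(':')[0]  # drop the port, if any
        parts = host.split('.')
        # cream.sconesandtea.com -> ['cream', 'sconesandtea', 'com']
        request.subdomain = parts[0] if len(parts) > 2 else None

Add the class to MIDDLEWARE_CLASSES in settings.py and request.subdomain becomes available in your views.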

Please feel free to add or correct any information here.





Sqlserver Non-clustered indexes and deadlocks

5 07 2010

ORM tools and other abstractions over relational databases have become ubiquitous. But there is no substitute for understanding the basics of a database. This opinion of mine was only reinforced by a recent issue I was fixing with a colleague.

Bug: The error log showed the exception below.
System.Data.SqlClient.SqlException: Transaction (Process ID 53) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

Tech stack: .Net 3.5, sqlserver 2005, nhibernate

The exception stack trace pointed to the table that was being deadlocked.

Could not execute command: UPDATE Email SET PersonId = @p0 WHERE Id = @p1

Recreating deadlock issues is not a trivial thing. But thankfully in our case the deadlock was so severe that when I ran my tests in parallel, almost 50% of the transactions failed at a concurrent load of just 2. That was a decent first step since we were consistently able to reproduce the issue.

SQL Server Management Studio comes with some tools that are quite useful in this situation. To see what was causing the deadlock, all I had to do was run Profiler on the database. To launch a trace, follow the steps below.

Tools > SQL Server Profiler > File > New Trace > enter the database connection details

The trace properties window should open up. Open the Events Selection tab and check Show all events. Under the Locks section, select the events that may be useful to you, such as Deadlock graph.

Start the trace, run your tests in parallel, sit back with some popcorn and enjoy the action-packed adventure. Run a find for deadlock events and you should be presented with a nice picture of what is happening.

Let's zoom in on the action.

Inferences:

  • The deadlock is not at the object level, because the object ids are the same. This is something we had also guessed from the query in the exception log: UPDATE Email SET PersonId = @p0 WHERE Id = @p1
  • But the page ids are different.

Quite puzzled, we looked at the table design to see if something was wrong there. And yes, we saw what the problem was: the table did not have a primary key.

Even though that may look like a harmless issue, there are consequences to creating a table without a primary key in SQL Server. When you define a primary key, a unique clustered index is created by default. But this table only had a unique constraint on the Id column, which creates a unique non-clustered index. Non-clustered secondary indexes may introduce deadlocks. More details in this link (see Non-Clustered indexes). You will also find it very useful to know how clustered and non-clustered indexes work.
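
A quick way to check what kind of indexes a table actually has (Email being the table from the query above) is the sys.indexes catalog view:

SELECT name, type_desc, is_unique
FROM sys.indexes
WHERE object_id = OBJECT_ID('Email');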

In this case the primary key, and thereby the clustered index, was missing. We introduced a primary key constraint on the Id column and ran the tests again. Even at a much higher concurrent user count, the deadlocks did not happen again.
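
The fix itself was roughly the following; the constraint name is illustrative, and an existing unique constraint on Id may have to be dropped first:

ALTER TABLE Email
ADD CONSTRAINT PK_Email PRIMARY KEY CLUSTERED (Id);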





Customize gradle directory structure

4 06 2010

I started using Gradle very recently. It is so much easier to understand than Maven; I guess I am not intelligent enough for Maven. Gradle follows a directory structure very similar to Maven's by default, but I wanted to change it to match my project's directory structure.

project
project/src
project/test

All I had to do was customize the sourceSets.

sourceSets {
    main {
        java {
            srcDir 'src'
        }
    }
    test {
        java {
            srcDir 'test'
        }
    }
}

I could not find this information straight away (especially the test sources location). So I am posting it here for future reference.





CI – Have we forgotten the integration bit?

22 05 2010

Almost all software that has ever been built had to go through an integration stage in one way or another. It is quite odd that many software projects, even to this day, treat integration as a separate phase of the project. But things are changing quite fast and many projects are adopting Continuous Integration. This is good news because we are saving ourselves a lot of the time wasted on so-called integration bugs.

So what is the issue?

Let's cut to the chase. Organizations mostly tend to try out CI on a not-so-important team before it is adopted across all teams. But in such a scenario CI is only building a smaller system. Even organizations that have been using CI for quite some time seem to have CI builds per team. But usually the software that these smaller teams are building is just a small piece of a bigger system. Integration bugs still have a long feedback cycle.

It is not enough to have a CI infrastructure that only verifies the subsystem. CI must deploy the smaller module onto an environment with the rest of the system and run tests against the whole. That is what gives feedback on the integration.

So we get the point. What's the big deal?

Actually setting up good CI is not so simple. Unless the build system is well thought out, CI does not come for free. While writing the build scripts and code, one must keep the bigger picture in mind. The effort involved is very similar to a traditional production release. It is also important to revise the build from time to time. While a build that fails for several unknown reasons is a pain, a build that is not proving anything is even worse.

CI should also not be an activity done solely by the configuration management team. The people responsible for the application design also have a big role to play. One size does not fit all, and concepts that proved successful in one project may not work so well for another. CI also has to evolve with the system it is integrating; unless it is up to date with the latest design changes it is not doing much.

If your CI is only testing a subsystem, it can only be called an automated build system for that module. CI needs to verify key integration points and, to some extent, even performance. Writing build systems that truly integrate is an interesting activity. It gets you thinking about how two systems integrate. Questions that you would not have thought about previously suddenly become more obvious.

A Continuous Integration build must be run on a production-like environment. If not, the application must be deployed to production as often as possible. A well-written CI setup also inspires the confidence to move to a continuous deployment model.

I do not claim to be an expert on CI, but I think it is important not to forget the principles behind the practices. I would place more importance on the integration bit of CI than on smaller checks that verify code style etc.

Also I would recommend reading Sai’s blog on CI.

Let me know your thoughts on this.





Continuous deployment and flexibility

16 03 2010

Say a client asks you to build an application that he can change between releases. He wants to be able to do this so that he can react to business needs quickly. But most of the time this is interpreted as a requirement to build an application that is highly configurable.

One of the applications I was working on had a very high level of complexity because of the infrastructure needed to make it configurable. The client was using a very expensive rule engine which had been sold to him with the promise that the business could deploy rules without developer help.

Clearly what the client wanted here was to be able to react to business needs. But what he got was an application carrying a baggage of infrastructure to make it highly configurable. In this day and age, where many projects use continuous deployment, responding to a business need is not much of an issue. A release becomes a non-event.

Even during the maintenance phase, we can continue to keep the continuous deployment setup in place in some cases.

It is just a thought and I would love to see what people have to say about this.





Agile Bengaluru 2010

31 01 2010

It's been a week since Agile Bengaluru 2010. It was a really nice experience to meet J. B. Rainsberger, Jeff Patton, David Hussman, Naresh Jain and many more people all in one place. I liked the venue even though there was no wifi :P. More than anything else I really appreciated the “go green” theme. This is not a complete experience report; apologies for my laziness when it comes to writing long posts.

Being a “Post Agile Conference”, most topics were aimed at reflecting on how agile has helped us in the past and where we are going. Below are some of the sessions that I enjoyed.

Discovery and Delivery – Redesigning agility – Keynote by David Hussman. A great start to the conference.

Monkey See Monkey Do – by Naresh Jain and Sandeep Shetty – A very interesting talk that looked at some of the agile practices that have become dogma. It would have been great if we had some more time.

Outside the code – Using agile practices to drive product success – by Jeff Patton – It was nice to hear about the “Discovery” part of the agile software development. Go check out the slides.

Using Theory of Constraint and Just in time approach to coach agile teams – a workshop by J. B. Rainsberger and Naresh Jain.

Stop it or I will Bury you alive in a box – by J. B. Rainsberger – J.B. spoke about the 10 things we should stop doing in 2010.

Captain Planet (Saurabh Arora) – A very inspiring talk on global warming. Keep up the good work, dude.

Apart from this, I got an opportunity to talk to Jeff Patton between sessions. We spoke about how words like requirements, customers etc. do not sit well with software development.

I had to rush out before all the lightning talks were over. I managed to listen to some talks like “Agile Deployment”. Just before I left, I spoke about “Developer + Tester + Operations = DevTestOps”.

Programming with the stars – A very entertaining session. The participants had to impress the audience to get selected to the next round. The winners would pair with four accomplished developers (stars) and come up with a five-minute coding exercise that they had to present to a panel of judges (J. B. Rainsberger, Jeff Patton). Really appreciate the participants and the stars for live coding in front of a sizable audience. The winners got a lifetime e-learning license from Industrial Logic.

Last but not the least, I enjoyed speaking about Breaking the Monotony with Sai Venkat, and I liked the way the talk went. The agility we seek comes from the code we write and the systems we build, not just from the processes and practices we follow; this was the theme of our talk. We got great constructive feedback from the audience.

The conference ended with an open Q and A session.

Kudos to the organizers. The slides are available and the videos should be available soon.

Looking forward to seeing if we can get the dogma out of agile and build great software. Feel free to add comments about any topics that I have missed out.





Building java projects on runcoderun.com

26 01 2010

RunCodeRun is a hosted continuous integration environment developed by Relevance. Thanks to the Relevance team for their effort to provide this service free for open-source projects.

I wanted to build rapa on RunCodeRun. I was trying to set up my project using the instructions in a blog post. It is a very useful post and easy to follow, but I was having trouble getting the code to compile on the RunCodeRun box. Here are the problems and their solutions.

I guess there is no real JDK on RunCodeRun, so my compile task in ant would not work.

I checked in tools.jar as part of my project libraries. Then I added it to the classpath when invoking ant from rake.

classpath = [File.join(".", "lib", "ant.jar"),
             File.join(".", "lib", "tools.jar"),
             File.join(".", "lib", "junit.jar"),
             File.join(".", "lib", "ant-junit.jar"),
             File.join(".", "lib", "ant-launcher.jar")].join(File::PATH_SEPARATOR)
system "java -cp #{classpath} org.apache.tools.ant.Main -emacs dist"

Even after I fixed that, the build would always go green even when there was a failure from ant.

I guess the exit status returned by rake was always zero irrespective of what ant returned. I changed the Rakefile as shown below.

system "java -cp #{classpath} org.apache.tools.ant.Main -emacs dist"
exit $?.exitstatus

This seemed to fix all the issues with the build.
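
Put together, the relevant part of the Rakefile looked roughly like the sketch below; the task name and the glob over lib are illustrative rather than the exact code:

task :dist do
  # build the classpath from the jars checked into lib
  classpath = Dir[File.join(".", "lib", "*.jar")].join(File::PATH_SEPARATOR)
  system "java -cp #{classpath} org.apache.tools.ant.Main -emacs dist"
  # propagate ant's exit status so a failed build shows up as a failure on RunCodeRun
  exit $?.exitstatus
end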

Along with these issues, I was also facing a problem with the ant-junit task and JUnit 4. Basically the standard distribution of ant does not understand JUnit 4 annotations and requires the test class to extend TestCase. I had to check out the latest source from the ant subversion trunk and build it. I am currently using those jars.

Have a look at the project on GitHub:

http://github.com/harikrishnan83/rapa

Continuous integration runs at:

http://runcoderun.com/harikrishnan83/rapa





Mnesia Quickstart

5 01 2010

A basic introduction to Mnesia

Mnesia is a DBMS, but it belongs in the NoSQL category: the query language is not SQL but Erlang itself. That makes it very easy to store and retrieve records without having to go through object-relational mapping. So we can actually call Mnesia an object-relational database.

Why and where would one want to use Mnesia?

Erlang in general is used to program highly distributed and fault-tolerant systems. Even though it has its roots in the telecommunications industry, it has proven useful in several other systems like ejabberd, Facebook Chat etc. Mnesia is part of the Erlang distribution and is built with Erlang.

Hence it gives you a configurable degree of fault tolerance (by means of replication).

Another important feature of Mnesia is the dirty interface: it is possible to read, write and search Mnesia tables without protecting the operation inside a transaction.

Quickstart

The steps below should get you started with Mnesia if you are using it for the first time.

Required Software – Erlang

I use Ubuntu as my OS, but the steps should not be much different on any other OS.

Now let's start with some code. It is also useful to install the table viewer (erlang-tv).

The goal of this exercise is to create a table called person, insert a few records and read them.

Create a file called person.hrl:

-record(person, {name,       %% atomic, unique key
                 age,        %% age
                 married_to, %% name of partner or undefined
                 children}). %% list of children

Create a file called person.erl:

-module(person).
-include("<path_to_person.hrl>/person.hrl").
-export([init/0]).
-export([insert/0]).
-export([read/1]).

init() ->
    mnesia:create_table(person, [{attributes, record_info(fields, person)}]).

insert() ->
    T = fun() ->
            X = #person{name=john,
                        age=36,
                        married_to=ana,
                        children=[josh, kelly, samantha]},
            mnesia:write(X)
        end,
    mnesia:transaction(T).

read(Name) ->
    R = fun() ->
            mnesia:read(person, Name, write)
        end,
    mnesia:transaction(R).

Start command-line Erlang. Type the below command from the directory which contains the above two files:

erl -mnesia dir '"."'

The above command conveys that the current directory will be used to store Mnesia files.

Compile person.erl.

>c(person).

{ok,person}

Start mnesia

>mnesia:start().

Create person table.

>person:init().

Insert a record.

>person:insert().

Use table viewer to check if the record has been inserted.

>tv:start().

This will launch table view application. By default the table viewer shows the ETS tables. To look at the table we just created go to view menu and select Mnesia tables.

Read the record using mnesia:read.

>person:read(john).
{atomic,[{person,john,36,ana,[josh,kelly,samantha]}]}
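
The dirty interface mentioned earlier skips the transaction wrapper. A rough equivalent of the read above is mnesia:dirty_read, which should return the matching records directly instead of an {atomic, Result} tuple:

>mnesia:dirty_read(person, john).
[{person,john,36,ana,[josh,kelly,samantha]}]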

In my next post I will cover Mnesia queries using list comprehensions.





Jumping through hoops to represent trees in Database

29 12 2009

Recently I have been working on a project where we have to represent hierarchical data in a database. Unfortunately we do not have much choice of database; we are using a relational one.

If you have done this, you will agree with me that it is not a very enjoyable experience.

Firstly, we need to choose between several models for representing trees in a database:

a. Adjacency list (self-referential tables)

b. Materialized path (lineage)

Shortcomings of adjacency model

Tree traversal is costly in the adjacency model. Finding all the children and grandchildren of a parent can be quite complex.

Shortcomings of materialized path

Materialized path requires you to build the lineage information at some point in time. If you have a million records for which you need to build the materialized path, then I suggest you start now, because no one knows when it will end. If someone knows of an efficient way of doing this, please let me know. If you get past this stage, there is still the issue of updating the data to handle moves and deletions.

Static and Dynamic Data

The choice we make is mostly driven by how many changes we can expect. If we are never going to modify the data, then materialized path, or any other approach that stores the lineage information alongside each row, is probably useful. But this is rarely the case.

Some vendor specific help

The folks at Microsoft and Oracle seem to have seen this issue and suggest the techniques below.

Sql Server

1. Common table expression: Popularly known as CTE, this is a way to run recursive queries on a self-referential table (see the sketch after this list).

2. HierarchyID: This is a datatype available in SQL Server 2008. It uses a materialized path internally.
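
A minimal sketch of a recursive CTE, assuming an illustrative self-referential table Person(Id, Name, ParentId):

WITH Descendants AS (
    -- anchor member: the root we start from
    SELECT Id, Name, ParentId
    FROM Person
    WHERE Id = 1
    UNION ALL
    -- recursive member: children of the rows found so far
    SELECT p.Id, p.Name, p.ParentId
    FROM Person p
    INNER JOIN Descendants d ON p.ParentId = d.Id
)
SELECT * FROM Descendants;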

Oracle

1. START WITH and CONNECT BY: This is similar to the above method. It works on a self-referential table (see the sketch below).
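
A sketch of the Oracle equivalent over the same illustrative Person table:

SELECT Id, Name, LEVEL
FROM Person
START WITH ParentId IS NULL
CONNECT BY PRIOR Id = ParentId;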

Object modeling trees

Imagine a scenario where you need to model a huge family. I guess we start by having a Person class. Each person has zero or more children, and the children are nothing but a collection of Persons. Mapping this to the data in the database is a pain.

1. Lazy loading: Most probably you will have to lazy-load the children as and when you need them, else you may have to wait a generation for the complete tree to load.

2. If we want to implement things like deletion or reassignment, saving the data back to the database will not be easy.

Better ways to store hierarchical data

Hierarchies are graphs, so it is better to use a database like Neo4j, which has become a very popular graph DB.