Friday, July 5, 2013

Scale Me Up -- Musings of a scalability researcher

Scalability. What does it imply?

More connections? Yes
More traffic? Yes
More concurrency issues? Yes
More performance issues? Yes

My work at EnterpriseDB leads me down a path less trodden, the intricate and fine details of scalability.
My musings lie here, in this blog post, which you are reading currently.

Having recently completed the connection throttler project in PEM, I moved on to research optimal architectures and protocols to establish a connection between a client and host databases running on different servers, with PEM server as the intermediary. The architecture had to be scalable. This was a new experience for me, as I had to research network protocols which may work well for our use cases. I was a bit intimated at first, but then remembered a quote I had read somewhere:

"If you are getting comfortable, move out."

This is what lead me to get down and start working on it. A lot of interesting points came up, and I learnt some things which are invaluable.

I was aware of the standard protocols, SSH, TLS for security etc. Dave advised me to go and research IPSec tunnel as well, which I did.

Some points here:

1) When looking at protocols for scalability, look at the performance. See how much time computations for the protocol would take, and how much CPU may be used for the same. To take an example, in my research, I found that TLS's authentication through PKI is time consuming and CPU consuming, compared to IPSec's pre shared key, which is much faster.

2) Look at the protocols your protocol is dependent upon. For e.g. IPSec depends on UDP, which does not support fragmentation, and hence cannot support large data packets. This is a disadvantage over TLS and TCP.

So, again, you have a plus on each side, and a corresponding minus on each side as well.

The above points are pretty specific to a part of scalability components, but demonstrate an important point, which I am trying to highlight here:

There is no clear answer.

Wow, where did that come from?

Well, I feel that this is the answer to most of the things we do, decision we make. Every component, every option has its own pros and cons, and we need to evaluate everything on the same level, with an open mind.

All solutions are applicable, we just need to find the one that best suits our use cases and work loads.

I will keep you all updated.