We are growing quickly here at Junction Networks, and part of this growth means we need to be able to handle more users. This is something we are all happy about! We have been tracking and forecasting our growth for some time so we can plan for our future capacity needs. To support this capacity expansion, one of the main projects we've been working on is scaling our SIP proxy.
For those of you who don't know what a SIP proxy is, you can think of it as the "Grand Central Terminal" of the network. SIP traffic is sent there and then gets sent out on its way to a different place. When someone calls you, a message is sent to the SIP proxy. The SIP proxy knows where your phone is so it relays the message down to your phone, then acts as an intermediary between you and the other caller until one of you hangs up. SIP proxies can do a lot more; in fact, our SIP proxy does quite a bit more than what has just been described, but you get the idea.
SIP proxy servers come in two general types: stateless and stateful. Stateless SIP proxies don't know anything about the messages they are receiving; they just forward things along without thinking about it. Stateful SIP proxies keep track of what has happened in a call and use that information to make decisions about what to do throughout the call. The difference becomes important when you start talking about scaling.
Scaling generally happens one of two ways in the software world: vertically or horizontally. Vertical scaling means you buy bigger and bigger hardware to run the same software, thus giving you more resources to use. Horizontal scaling means you buy many smaller machines of the same type and distribute the load across them. Each has its advantages, but in the end, you can only vertically scale so far: up to whatever the biggest computer in the world is. Generally speaking, horizontal scaling, while requiring more planning and initial time investment, can bring you nearly infinite growth capacity at a far lower cost than vertical scaling.
So, how do you horizontally scale a SIP proxy? Scaling stateful SIP proxies requires a lot more work than scaling stateless ones. This is because, in a horizontally scaled architecture, it is very possible that the messages for any one phone call will be exchanged between multiple SIP proxy servers. This requires that each server know the exact same state information throughout the call.
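To make the problem concrete, here is a minimal sketch in Python. It is not our production code: the `Proxy` class, its `handle_invite` and `handle_in_dialog` methods, and the plain dictionaries standing in for state storage are all hypothetical names invented for illustration.

```python
class Proxy:
    """Toy stateful proxy (hypothetical): remembers the calls it has seen."""

    def __init__(self, name, store=None):
        self.name = name
        # With no shared store, each proxy keeps its own private state.
        self.calls = store if store is not None else {}

    def handle_invite(self, call_id, caller, callee):
        # Record state when the call is set up.
        self.calls[call_id] = {"caller": caller, "callee": callee}

    def handle_in_dialog(self, call_id):
        # A later message (hold, transfer, hang up) needs that state.
        state = self.calls.get(call_id)
        if state is None:
            return f"{self.name}: unknown call {call_id}"
        return f"{self.name}: routing between {state['caller']} and {state['callee']}"


# Private state: the call is set up on proxy A, but a later message lands on B.
a, b = Proxy("A"), Proxy("B")
a.handle_invite("call-1", "alice", "bob")
local_result = b.handle_in_dialog("call-1")

# Shared state: both proxies read and write the same store.
shared = {}
c, d = Proxy("C", shared), Proxy("D", shared)
c.handle_invite("call-1", "alice", "bob")
shared_result = d.handle_in_dialog("call-1")

print(local_result)
print(shared_result)
```

With private state, proxy B has never heard of the call. Once the state is shared, proxy D can pick up where proxy C left off.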
Here are the main pieces of state that need to be distributed:
1. User Location: What phones does a user have, and where are they located?
2. NAT/Firewall traversal: What path must be followed to get messages through a NAT/Firewall to a phone?
3. Sequential Request Routing: If the call changes (transfer, hold, etc.), where do those messages need to be sent in order to route back the same way they came?
We have implemented several Requests for Comments (RFCs) that allow this information to be shared among any number of stateful SIP proxy servers.
- Challenge (1) is solved by using a global registrar so that every SIP proxy has the same view of what phones are where.
- Challenge (2) is solved by using Path headers (RFC 3327). Path headers record the first server that received a request from a phone, which is the same server that will be keeping the NAT/Firewall open. We save that path so that when we look up a user's location, we know how to route the call back so that it will traverse the NAT/Firewall.
- Challenge (3) is solved by using GRUUs (RFC 5627), or Globally Routable User Agent URIs. During call setup, each phone exchanges information on how it can be contacted for future requests. We change this contact and encode information in it so that the call can be globally routed from anywhere, which avoids a recurrence of challenge (2).
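The three ideas above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not our implementation: `GlobalRegistrar`, `Binding`, `route_for`, `encode_contact`, and `decode_contact` are hypothetical names, the hostnames and addresses are placeholders, and the contact encoding is a made-up scheme that only mimics the idea of a GRUU (it is not the RFC 5627 format).

```python
import base64
import json
from dataclasses import dataclass, field


@dataclass
class Binding:
    contact: str                               # where the phone says it can be reached
    path: list = field(default_factory=list)   # edge proxies recorded at registration


class GlobalRegistrar:
    """Hypothetical shared registrar: every proxy queries the same bindings."""

    def __init__(self):
        self._bindings = {}

    def register(self, user, contact, path):
        self._bindings.setdefault(user, []).append(Binding(contact, list(path)))

    def lookup(self, user):
        return list(self._bindings.get(user, []))


def route_for(binding):
    """Hops an incoming call must visit: the recorded path, then the phone."""
    return binding.path + [binding.contact]


def encode_contact(contact, path, domain="proxy.example.net"):
    """Pack the real contact and its path into one opaque URI (illustrative only)."""
    payload = json.dumps({"contact": contact, "path": path}).encode()
    token = base64.urlsafe_b64encode(payload).decode()
    return f"sip:{token}@{domain}"


def decode_contact(uri):
    """Recover the hop list from a URI produced by encode_contact."""
    token = uri[len("sip:"):].split("@", 1)[0]
    data = json.loads(base64.urlsafe_b64decode(token))
    return data["path"] + [data["contact"]]


registrar = GlobalRegistrar()
# The phone registered through edge proxy "edge-1", which keeps its NAT pinhole open.
registrar.register("bob", "sip:bob@192.0.2.10:5060", path=["sip:edge-1.example.net;lr"])

# (1) and (2): any proxy can ask the registrar and learn the same route.
routes = [route_for(b) for b in registrar.lookup("bob")]

# (3): the rewritten contact carries enough information to route from anywhere.
rewritten = encode_contact("sip:bob@192.0.2.10:5060", ["sip:edge-1.example.net;lr"])
print(routes)
print(decode_contact(rewritten))
```

The design point is that no proxy needs private memory of the call: the route either comes from the shared registrar or travels inside the contact itself.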
There is a lot going on that we just can't cover in a blog post, but these are some of the general problems that we have solved in the process of creating a horizontally scalable, redundant, and fault-tolerant network of SIP proxy servers.