Tuesday, February 24, 2009

Solving Cross Data Center Redundancy

First things first. Yes, I realize it's been almost 3 months since my last post... shame on me! The good news is that we've been quite busy working on lots of new things, so I have plenty of material to keep me writing for a while!

I'd like to start with a topic I've been thinking about a lot lately (today in particular) that I think many people are interested in: how do you provide automatic, transparent failover between servers located in different data centers? Ever since the I-Light fiber between Indianapolis and Bloomington and the ICTC building were completed, we've been receiving requests to enable server redundancy between the two campuses. Seems easy enough, so why haven't we done this yet?

There are really 3 main options available:

(1) Microsoft Network Load-Balancing or similar solutions. These solutions require the 2 servers to be connected to the same broadcast domain. They usually work by assigning a "shared" MAC or IP address to the two servers, along with various tricks for getting the router to direct traffic destined for a single IP address to 2 different servers. Some of these packages also include software that handles the server synchronization (e.g. synchronizing databases, config files, etc.).

(2) Global Server Load Balancing (GSLB). These are DNS-based solutions whereby the GSLB DNS server is the authoritative DNS server for a domain and returns different A records based on the IP address of the client (or rather the client's recursive DNS server) and the "health" of the servers. In many cases, the "servers" are actually virtual IP addresses on a local load-balancing appliance.

(3) Route Health Injection. These solutions involve a local load-balancing appliance that "injects" a /32 route via BGP or OSPF into the network for the virtual IP address of a server. Typically you have a load-balancing appliance in each data center that injects a /32 for the server's virtual IP address. The key is that the virtual IP addresses are the *SAME* IP address in both data centers. It's NOT the same broadcast domain, just the same IP address, and the actual servers are typically on private IP addresses "behind" the load-balancing appliances. You can achieve an active-passive configuration by setting the routing metrics so that the announcement at one data center is preferred over the other. *OR* you can set equal route metrics and clients will follow the path to the "closest" data center based on network paths -- this is referred to as "anycast".
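
To make the active-passive vs. anycast distinction in option 3 a bit more concrete, here is a rough Python sketch of the selection logic. It is only a toy model, not how any particular router or load-balancer implements it: the Announcement class, the best_sites function and the metric values are all made up for illustration, and I'm assuming "lower metric wins" and that an appliance withdraws its /32 when its health checks fail.

```python
from dataclasses import dataclass

@dataclass
class Announcement:
    site: str      # data center whose appliance injects the /32
    metric: int    # hypothetical routing metric (assumption: lower is preferred)
    healthy: bool  # assumption: the route is withdrawn when health checks fail

def best_sites(announcements):
    """Return the site(s) whose /32 announcement would be preferred."""
    # Only appliances with healthy servers keep announcing the route.
    live = [a for a in announcements if a.healthy]
    if not live:
        return []
    best = min(a.metric for a in live)
    # More than one site here means equal metrics, i.e. the anycast case,
    # where each client's own network path decides which site it reaches.
    return sorted(a.site for a in live if a.metric == best)

# Active-passive: Indianapolis is preferred until its route is withdrawn.
active_passive = [
    Announcement("Indianapolis", metric=10, healthy=True),
    Announcement("Bloomington", metric=20, healthy=True),
]
print(best_sites(active_passive))
```

With the metrics above only Indianapolis is selected; mark it unhealthy and Bloomington takes over. Give both sites the same metric and both come back, which is the anycast case.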

So you're thinking "these all sound like good options, surely there must be some gotchas?"....

The issue with option #1 is that you have to extend a broadcast domain between the two data centers - in our case between Indianapolis and Bloomington. As I think I covered in an earlier post, "broadcast domain" == "failure domain". Many types of failures are contained within a single broadcast domain, and by extending broadcast domains across multiple facilities you increase the risk of a single failure bringing down multiple systems. This can become very problematic, especially in a university environment where management of servers is very decentralized. I can recount numerous occasions where someone made a change (i.e. did something bad) that created a failure (e.g. a loop, a broadcast storm, etc.) and all the users in multiple buildings were affected because a VLAN had been "plumbed" through multiple buildings for whatever reason. However, these solutions are typically very inexpensive (often free), so they are very attractive to system owners/administrators.

There are 2 main issues with option #2. First, in order to provide reasonably fast failover, you have to reduce the TTL on the DNS records to a relatively small value (e.g. 60 seconds). If you have a very large number of clients querying a small set of recursive DNS servers, you may significantly increase the load on your recursive DNS servers. The other issue is with clients that ignore the DNS TTL and cache the records for an extended period of time. GSLB solutions are also significantly more expensive than option #1 solutions. One big advantage of GSLB is that the servers can literally be anywhere on the Internet.
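
For what it's worth, here is a small Python sketch of the kind of decision a GSLB server makes for each query, including the short TTL discussed above. It's an illustration under my own assumptions, not any vendor's implementation: the Site class, the answer function, the "prefers" lists and all of the addresses (taken from the reserved documentation ranges) are hypothetical.

```python
import ipaddress
from dataclasses import dataclass

TTL_SECONDS = 60  # short TTL so clients re-query soon after a failure

@dataclass
class Site:
    name: str
    vip: str        # virtual IP returned in the A record (made-up address)
    healthy: bool   # result of the most recent health check
    prefers: list   # hypothetical resolver networks to send here first

def answer(resolver_ip, sites):
    """Pick the (A record, TTL) for a query arriving from resolver_ip."""
    addr = ipaddress.ip_address(resolver_ip)
    live = [s for s in sites if s.healthy]
    if not live:
        return None  # nothing healthy to hand out
    # Prefer a healthy site that claims the resolver's network...
    for s in live:
        if any(addr in ipaddress.ip_network(net) for net in s.prefers):
            return (s.vip, TTL_SECONDS)
    # ...otherwise fall back to the first healthy site.
    return (live[0].vip, TTL_SECONDS)

sites = [
    Site("Indianapolis", "192.0.2.10", True, ["198.51.100.0/24"]),
    Site("Bloomington", "192.0.2.20", True, ["203.0.113.0/24"]),
]
print(answer("203.0.113.5", sites))
```

Note that the decision is keyed on the recursive server's address, not the real client's, and that a client which ignores the TTL keeps using whatever answer it cached no matter what this function returns later.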

Option #3 is actually quite attractive in many ways. One downside is that the servers must reside behind a local load-balancing appliance. Strictly speaking, that's not entirely true: you could install routing software on the servers themselves, but with many different groups managing servers this raises concerns about who is injecting routes into your routing protocols. The need for load-balancing appliances significantly increases the cost of the solution and limits where the servers can be located. In order to reduce costs you could place multiple systems behind a single load-balancing appliance (assuming there's sufficient capacity), but that raises the issue of who manages the appliance. Some load-balancers offer virtualization options that allow different groups to manage different portions of the configuration, so there are some solutions to this.

We are currently exploring both the Global Server Load-Balancing and Route Health Injection options in the hope of developing a service that provides automatic, transparent (to the clients) failover between at least the two UITS data centers and possibly (with GSLB) between any two locations.
