High Availability – What is the real cost?
Today a lot of people responded with real-world experiences of moving to HA, some of them doing it on a budget. Many also wrote about non-Microsoft solutions as well as SQL Azure.
One reader brought out a great point: true High Availability cannot be implemented in a single location. If you really want true High Availability you will probably have two different goals: a failover goal on a single site to a different machine, and a failover goal to a remote site in case the entire site becomes unavailable.
You may have a cluster at Site A, a cluster at Site B, and use replication or log shipping from Site A to Site B. Another technique is to have a cluster at each of two sites and use SAN replication across the sites. Now you are really starting to talk about a serious financial commitment.
I looked into SQL Azure and was quite impressed. It has a few issues that make it a poor fit for some situations. Only full database backups are possible. You can take a full backup of the database and write the output to very inexpensive JBOD in the cloud. You can transport the backup to other locations and restore it into a different instance. But you can’t do transaction log backups or point-in-time restoration. So, if your data gets corrupted (another reason for backups), you have nothing other than the last complete backup. There are a lot of other restrictions as well.
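To put a number on that exposure, here is a minimal sketch in Python. The function name, the dates, and the 15-minute log backup schedule are illustrative assumptions, not anything SQL Azure provides; the code only does the arithmetic of how much work is lost when the newest restorable point is the last full backup.

```python
from datetime import datetime, timedelta
from typing import Optional


def data_loss_window(
    failure: datetime,
    last_full_backup: datetime,
    last_log_backup: Optional[datetime] = None,
) -> timedelta:
    """Return how much work is lost if we restore to the newest backup.

    With full backups only, the restore point is the last full backup.
    If log backups exist (hypothetical here), the restore point moves
    forward to the most recent log backup.
    """
    restore_point = last_full_backup
    if last_log_backup is not None and last_log_backup > last_full_backup:
        restore_point = last_log_backup
    if failure < restore_point:
        raise ValueError("failure time is earlier than the restore point")
    return failure - restore_point


# Illustrative numbers: nightly full backup, corruption found at 23:00.
last_full = datetime(2024, 1, 1, 0, 0)
failure = datetime(2024, 1, 1, 23, 0)

full_only = data_loss_window(failure, last_full)
with_logs = data_loss_window(failure, last_full, datetime(2024, 1, 1, 22, 45))

print(full_only)  # 23:00:00
print(with_logs)  # 0:15:00
```

With these made-up times, a full-backup-only strategy loses almost a full day of changes, while a log backup every 15 minutes would have limited the loss to a quarter of an hour.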
Your SQL Azure database is hosted in a single location, including all of the cloud failover hardware. So multi-site colocation is not built into SQL Azure; you have to wire that up yourself. While the actual cost of an extremely redundant database instance is quite low, you still have to figure out how to replicate to your remote site for colocation implementations. If the only thing you can transport is the full backup, well, you get the point.
There are third-party products that sit in front of your SQL Server and pretend to be a SQL Server instance. Having received the TDS (Tabular Data Stream) traffic, such a product branches the stream to multiple instances of SQL Server, each acting independently. I don’t know if that works with SQL Azure, but it could be a solution for colocation.
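The idea of branching one stream to several independent instances can be sketched roughly as follows. This is not how any particular product works internally: real products operate at the TDS protocol level, while the hypothetical FanOutWriter class below uses Python's built-in sqlite3 in-memory databases as stand-ins for two SQL Server instances.

```python
import sqlite3


class FanOutWriter:
    """Hypothetical helper: apply each statement to every backend in turn."""

    def __init__(self, connections):
        self.connections = list(connections)

    def execute(self, sql, params=()):
        """Run the statement on all backends and return a list of failures.

        Each failure is a (backend_index, exception) pair. A real product
        would also need a policy for backends that fall out of sync.
        """
        failures = []
        for index, conn in enumerate(self.connections):
            try:
                with conn:  # commits on success, rolls back on error
                    conn.execute(sql, params)
            except sqlite3.Error as exc:
                failures.append((index, exc))
        return failures


# Two in-memory databases stand in for instances at Site A and Site B.
site_a = sqlite3.connect(":memory:")
site_b = sqlite3.connect(":memory:")
writer = FanOutWriter([site_a, site_b])

writer.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
writer.execute("INSERT INTO orders (item) VALUES (?)", ("widget",))

rows_a = site_a.execute("SELECT id, item FROM orders").fetchall()
rows_b = site_b.execute("SELECT id, item FROM orders").fetchall()
print(rows_a == rows_b)  # True
```

The hard part that this sketch leaves out is what to do when one backend accepts a statement and another rejects it. Here the failure is only reported back to the caller, and the instances would then diverge until someone resynchronizes them.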
Here are some comments from our readers:
Jim:
There is a second part not mentioned in your column regarding the high cost of high availability: disaster recovery. The various DR implementations I have seen require another duplication of the infrastructure at the DR site. So if you have duplication with failover clustering, you will double the duplication you already have.
Basically, you can think of it similarly to car insurance. If you absolutely know you will never get into an accident, then don’t bother with insurance (or just get the minimum insurance your country/state requires by law). But then, who can say they absolutely understand the future – especially when it is beyond our control or influence? As a result, we take as many practical measures as we can afford – until someone can improve upon it.
Damian:
Excellent article on HA vs Cost HA. We are going through this pain barrier right now, figuring out how to scale our customer base while delivering HA at the same time, without breaking the bank or moving away from MS SQL. After looking hard at Amazon RDS and SQL Azure we have opted for Azure for two reasons: (1) HA out of the box with nothing to do, and (2) geo HA (limited to your local region, i.e. failover from Dublin to London). These are pretty good options when budgets are tight and traditional HA setups aren’t within the budget.
Cloud servers from any provider also offer an excellent way to deliver HA using VMs if you want to get your hands dirty. Rackspace, Amazon, etc. are offering SSD storage options on VMs these days, with which, given the right know-how, you could deliver a decent HA option on a low-to-medium budget.
A good point to add is that cloud servers are backed by highly available hardware these days, and sometimes HA can be avoided if you can afford some downtime, which from my experience is pretty low with cloud servers. I guess it’ll always come down to project requirements. On day one our project was all singing, all dancing until I presented the costs ;-). We settled for some downtime and manual disaster recovery if worst came to worst. That allowed us to grow our customer base from day one without blowing a load of cash from day dot.
I can’t wait for Azure and Amazon RDS to really go head to head so we can get some really decent cloud-based services for MS SQL. Amazon RDS is pretty good, bar HA, which isn’t quite there just yet. If I didn’t want an easy life, Amazon RDS would be the choice, if only they would make the pricing easier ;-).
One final bit. Check out http://www.sanbolic.com/products.htm . I’ve had a trial of this and it rocks. My budget doesn’t 😉 but I’d definitely implement this if I could afford it. HA, scalability, and load balancing for MS SQL Server. Surely not! But yes, it does work.
Join the discussion; send your contribution to btaylor@sswug.org.
Cheers,
Ben