Editorials

Extending SQL Azure Database Size Limits

One of the problems I have had with SQL Azure are the native restrictions for scaling data. The databases are limited in size. The only way to access large volumes of data, (a single table that is bigger than the allotted database size), from an SQL Azure database is through federation. This is always the case if you are accessing your database by connecting to your SQL Azure from your application directly, most likely in an Azure hosted application.

While reading a newsletter I was made aware that you can now use the old Federated View technique, (the precursor to partitioned tables), in a traditional SQL Server instance, having linked servers to SQL Azure databases, in order to create some very large table sizes. A federated view consists of a series of SELECT statements to each server, resulting in a single data set using UNION all for each additional select statement. This federated view must reside in a non-SQL Azure database instance. SQL Azure does not support linked servers.

If you create a constraint that is unique in each table instance, SQL Server will remember what instance contains data specific to a where clause, and only inquire on the appropriate database(s).

So, what do you now have? You can now connect to a single database, query through a view, and access data from over a hundred different databases, having the data sharded across many instances of SQL Azure.

Although you can create this federated view on a database hosted on premises, you probably wouldn’t go that route because of financial reasons. Azure does not charge for database network access inside the hosting site. However, they do charge for data access over the internet. So, if you are using a partitioned view, it is most likely because you have a lot of data.

If you chose to create a federated view, and wish to keep costs down, you can create an Azure Windows Server, using Infrastructure as a Service, install traditional SQL Server, and create your federated view in that environment. Now when you call your view, it is located on the same internal network as your SQL Azure databases, and you pay no external bandwidth fees for usage. If your applications are also hosted in an Azure environment, now your network fees are reduced to those necessary for the client to access your application.

One thing to note before I close. You can use the Linked Server technique to spread data across multiple SQL Azure instances without using a federated view. You can place different data on different instances as another technique to meet the limited database size restriction. Be aware that joining or filtering data using heterogeneous data queries can have some performance implications.

For more information on using linked servers against SQL Azure, and specifically using a federated view, take a look at this MSDN blog.

Are you sharding your SQL Azure data? Have you Linked SQL Azure databases using Linked Servers? Leave a comment to share your insight.

Cheers,

Ben