Utilizing sharded data can be a complicated process. By now you probably know that sharded data is the distribution of data into multiple tables or even multiple databases. The problem then becomes how to retrieve that data from the distributed shards if you don’t have an already built in method such as partitioned tables in SQL Server?
MySql has a number of vendors providing a sharding data broker allowing a client to attach to the broker as if it was a single database, and routing the request to the appropriate shard for processing. Many shops have developed their own in house data broker to perform this task with differing degrees of success. It’s for this reason you see a number of individuals such as Brent Ozar encouraging caution when choosing sharding as a technique for application optimization. It is very easy to get a data broker wrong, and introduce more issues than if you had a single database from the start.
One of the easiest methods of sharding is to have completely separate databases that are basically self-contained. Each database has their own independent sharding data, and also has replicated data shared by all. In this fashion, any query may be implemented within a single database and database connection.
I did some searching for a data broker technology to be used with SQL Server for an hour or so and didn’t find much. Microsoft has an interesting sharding implementation for the SQL Azure service, which makes sense do to some of the size and performance limitations on SQL Azure. Sharding complete databases as described above makes querying all the databases quite seamless. It is up to the client to union the results crossing shards (there aspects of this union that may be done in SQL Azure as well).
Given the need to have autonomous databases and request data from only one, what options are available today for routing to the appropriate shard when using something other than MySql? Have you written your own? Do you have an open system or commercial product to fit into this space? Share your experience with us by joining the conversation.
Cheers,
Ben
