Sharding
Relational database architects have used the concepts behind sharding for years. Sharding is a popular term for taking an object and breaking it down into its component properties. For example, SQL Server uses a sharding technique to index an instance of an XML document (Microsoft’s documentation calls this shredding). The XML document is broken down into shards, one per XML node or property, and decomposed into a common set of tags or attributes. Indexes are then created on the resulting shards.
However, there is another form of sharding that SQL architects use without being aware of it. In NoSQL data storage, sharding is a term for partitioning data. While reading about MongoDB in “The Definitive Guide to MongoDB” from Apress, I found that sharding sounded very, very familiar. The book describes vertical sharding and horizontal sharding.
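To make the two terms concrete, here is a minimal sketch in Python. It assumes the common usage of the words (horizontal means splitting rows by a key value, vertical means splitting the fields of each row into separate sets); the book may define them with a different emphasis. The sample data and function names are hypothetical.

```python
# Hypothetical sample data; the field names are illustrative only.
customers = [
    {"id": 1, "name": "Ann", "region": "east", "notes": "long text"},
    {"id": 2, "name": "Bob", "region": "west", "notes": "more text"},
    {"id": 3, "name": "Cy", "region": "east", "notes": "even more"},
]


def horizontal_split(rows, key):
    """Group whole rows into shards by the value of one field."""
    shards = {}
    for row in rows:
        shards.setdefault(row[key], []).append(row)
    return shards


def vertical_split(rows, pk, hot_fields):
    """Split each row into two narrower rows; both halves keep the key."""
    hot, cold = [], []
    for row in rows:
        hot.append({f: v for f, v in row.items() if f == pk or f in hot_fields})
        cold.append({f: v for f, v in row.items() if f == pk or f not in hot_fields})
    return hot, cold


by_region = horizontal_split(customers, "region")
hot, cold = vertical_split(customers, "id", {"name", "region"})
```

In relational terms, the first function behaves like a table partitioned by region, and the second like moving rarely read columns into a side table that shares the primary key.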
Guess what? The reasons for sharding and the techniques for implementing it are no different from what we do today in relational databases. The intention of sharding is to spread work out across files, disks, or even servers. In Microsoft’s Parallel Data Warehouse, sharding is a core concept, scaling work out across multiple databases in parallel.
In Oracle and SQL Server, partitioned tables are key tools for allowing your database to scale. The key to MongoDB sharding is that the load is separated not only across disks but also across servers, so the work is spread over more CPUs.
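As a rough illustration of spreading load across servers, the sketch below routes each key to one of several shards by hashing it. This is plain hash routing chosen for brevity, not necessarily how MongoDB or any other product places data. The shard names and the `shard_for` and `distribute` helpers are hypothetical.

```python
import hashlib

# Hypothetical shard names; in practice these would be server addresses.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for(key, shards=SHARDS):
    """Hash the key so the same key always maps to the same shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def distribute(keys, shards=SHARDS):
    """Return a mapping of shard name to the keys routed to it."""
    placement = {name: [] for name in shards}
    for key in keys:
        placement[shard_for(key, shards)].append(key)
    return placement


placement = distribute(f"customer-{i}" for i in range(1000))
```

Each server then holds and processes only its own slice of the keys, which is the same goal a partitioned table pursues across files or disks.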
The result of sharding is often called a federation. Hey, now that’s familiar too. SQL Server had the concept of federated (partitioned) views before it released partitioned tables. If you wanted to break your data up into multiple files, you would create separate tables on the different files, and then federate them with a view that used UNION ALL to combine the separate sets into a single central model.
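The view-over-several-tables pattern can be sketched as follows. To keep the example self-contained and runnable, it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table and column names are hypothetical. In SQL Server, each member table would typically sit on its own filegroup and carry a CHECK constraint on the partitioning column. The view uses UNION ALL so that rows are simply concatenated and no duplicate removal is performed.

```python
import sqlite3

# In-memory database; SQLite stands in for SQL Server in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One member table per year (hypothetical schema).
    CREATE TABLE orders_2010 (order_id INTEGER PRIMARY KEY, order_year INTEGER, amount REAL);
    CREATE TABLE orders_2011 (order_id INTEGER PRIMARY KEY, order_year INTEGER, amount REAL);

    INSERT INTO orders_2010 VALUES (1, 2010, 10.0);
    INSERT INTO orders_2010 VALUES (2, 2010, 20.0);
    INSERT INTO orders_2011 VALUES (3, 2011, 30.0);

    -- The view presents the member tables as a single logical table.
    CREATE VIEW orders AS
        SELECT order_id, order_year, amount FROM orders_2010
        UNION ALL
        SELECT order_id, order_year, amount FROM orders_2011;
""")

# Callers query the view and never need to know how the data is split.
rows = conn.execute(
    "SELECT order_id, order_year, amount FROM orders ORDER BY order_id"
).fetchall()
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

Queries against `orders` see all three rows, even though no single table holds them all.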
Now you know a little more about sharding if it is a new concept for you. I’m sure many of you are already using sharding techniques for different purposes. Why not share your experience, or even clarify the concept? Drop me an email at btaylor@sswug.org.
Cheers,
Ben