Editorials

Column Stores: Outstanding Performance

Column stores are a newer data structure in SQL Server. They store dense volumes of data in Binary Large Objects within the SQL Server database. Instead of storing data in a B-tree structure, where each row is a separate object processed one at a time, a column store data object can process a large number of records for roughly the same amount of work.
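
To make the contrast concrete, here is a toy sketch in Python of row-wise versus column-wise layouts. It is only an illustration of the general idea, not how SQL Server implements column stores internally. The table, column names, and helper functions are all made up for the example.

```python
# Illustrative only: a toy contrast between row-wise and column-wise layouts.
# The data and function names are hypothetical, not SQL Server internals.

rows = [
    {"order_id": 1, "region": "East", "amount": 120.0},
    {"order_id": 2, "region": "West", "amount": 75.5},
    {"order_id": 3, "region": "East", "amount": 42.25},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_by_rows(rows, column):
    """Row layout: each whole row object is visited to read one field."""
    return sum(row[column] for row in rows)


def total_by_columns(columns, column):
    """Column layout: only the one column needed is scanned."""
    return sum(columns[column])


columns = to_columns(rows)
row_total = total_by_rows(rows, "amount")
column_total = total_by_columns(columns, "amount")
```

Both totals come out the same. The difference is that the column layout reads one contiguous list, while the row layout touches every row object. A real column store also compresses each column and processes it in batches, which this sketch leaves out.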

The net result of the column store is performance. Microsoft has reported up to 100x faster performance using column stores for data storage and retrieval.

Some neat features of column stores are that they can be indexed, updated in real time in a transactional form, or even loaded in large batches. The reduced blocking activity makes them a prime candidate for data warehousing.

You might think of this as combining NoSQL storage with a relational engine, giving you data mining performance along with the join capabilities of a relational database. So, is this the direction we need to be moving? Microsoft has demonstrated great performance and will continue to grow the platform. But how does this scale? How will it fit into the growing SQL Azure platform? Will it allow distributed storage through sharding, or will we have to consider purchasing bigger hardware or appliances rather than using the commodity-priced products based on Hadoop?

Share your thoughts or experience with column stores here online or by email to btaylor@sswug.org.

Cheers,

Ben