Distributed Data

I’d like to follow up today on Steve’s editorial from yesterday, "Are You Working With New Database Platforms?". There are a lot of factors driving the trend toward new storage paradigms. I alluded to this a while back when talking about distributed databases.

Take Google, for instance. They are the hands-down example of economical, scalable storage and retrieval. Imagine trying to develop a SQL Server database capable of storing the volume of data Google maintains, keeping it current, appending new data, deleting obsolete data, and sharing it with the world at speeds that users are starting to demand as the standard.

I don’t know if SQL Server 2008 R2 Parallel Data Warehouse could perform as well as Google’s storage system (Bigtable), at least not at the same cost.

Companies like Amazon, Facebook, Twitter, or eBay require massively scalable, fault-tolerant storage with ACID characteristics. However, most companies don’t have the volume of data of an Amazon or eBay.

There are open systems distributed data storage tools available today (do a Google search on "distributed data storage"). This raises a few questions for me (groan if you like, I like lists):

  • Do these cost-effective tools have the performance and failover characteristics we have come to expect?
  • Is this a technology that is coming to small and medium-sized businesses, or is it only for massively data-intensive uses?
  • Are they secure?
  • How do they compare in cost per unit of performance to other storage methods?
  • What kind of skills are going to be needed to support new storage media?
  • What tools are available to mine data from these kinds of storage? Is it all custom code?
  • Are ETL tools available, or is this custom code too?
  • Can I live without my relational database system or cubes?
  • How does all of this interact with cloud implementations?

Are you looking into any of these new platforms? Do you have any lessons you’d be willing to share with our readers?
Drop me a note at btaylor@sswug.org.

Cheers,

Ben