Editorials

Data Separation Techniques

Virtual Workshop: SQL Server 2008 Indexes – Internals and Best Practices
Feb. 25, 2011

Along with a good database design I would say that Indexes are a cornerstone to good database performance. There are a lot of other skills you can develop in relational databases; but without these two skills, you’re going to find yourself in a heap of trouble. The SSWUG.org’s virtual workshop, conducted by Microsoft SQL Server MVP Kalen Delaney, will give you the skills you need to understand indexes, evaluate their effectiveness and make the necessary adjustments for performance. Register today to save your spot.

Online Course: Introduction to SQL Server Replication or SQL Server Integration Services (SSIS)
On-Demand Sessions Start on MARCH 1, 2011 and Ends MARCH 31, 2011

Have you taken advantage of any of these on-demand courses taught by Eric Johnson yet. The response has been awesome. You have access to all the course content, in this case 12 sessions for either Replication or 12 sessions for SSIS, for an entire month. Study when you have time, learn at your own pace, and establish an expertise in areas where you have not had enough experience. Register for Replication and/or Register for SSIS in this March series. Registration ends February 28th.

DBTechCon – Spring 2011
The SSWUG Spring 2011 virtual conference is quickly coming together. You can go to the registration page and see profiles for presenters, and summaries for the sessions to be presented. Without a doubt, this will be the largest virtual conference in the information technology industry. Access more than 70 sessions from well seasoned professionals with in-depth instruction on technologies such as SQL Server, SharePoint, .Net, Business Intelligence and much more. Register any time from now to April 19th. There are early registration discounts for those who register early. Go to the Registration page to get signed up right away.

$$SWYNK$$

Featured Article(s)
Troubleshooting SQL Server 2008 Transactional Replication
In this article, Alexander Chigrik explains some problems that you can have when you work with SQL Server 2008 transactional replication. He also tells how you can resolve these problems.

Featured White Paper(s)
Essential Performance Tools for SQL Server DBAS
Optimizing SQL Server performance can be a daunting task. Especially so for an increasing number of reluctant DBAs faced with… (read more)

Data Separation Techniques
As I mentioned yesterday there are a number of reasons for separating data into physically different databases when you provide software as a service. Often the separation of the data is easily implemented with some form of a data broker, directing the access of the persistence layer to data stores physically separate from each other.

Some implementations will replicated data that is common to all customers across all separate customer databases resulting in the ability to enforce referential integrity since all data is self contained.

Walden provides a good example of this approach:

We have an application for managing research ethics oversight that we sell as a SaaS solution (IRBManager). Compounding the common requirements of general privacy is the fact that HIPAA protected information often makes its way into our application. When we started architecting this system years ago (2003) we decided to put each customer’s data in its own database and not have any shared common database.

Where we have non-customer specific data needed to make the system run we duplicate that data in each customer’s database. This of course has issues of data redundancy and maintenance, but provides us with a perfectly clean break between customers, and the ability to simply ship them "last night’s backup" should they ever want a copy of their data. Simple, clean and effective.

This technique is actually one of the optimization techniques of the Parallel Data Warehouse (PDW) strategy in SQL Server 2008 R2. Data is partitioned into separate databases (not necessarily per regulations in this case, but for performance) and common data is replicated across the different partitions. PDW makes this separation for performance reasons allowing access to multiple databases concurrently. In our case, the separation is to assure that for some security or regulatory reason, no sensitive data is shared.

Tomorrow we’ll talk about not replicating the data common to all. Do you have an implementation using a centralized "shared data" repository? Why not write me a quick note with your experience and share it with our readers?

Cheers,

Ben