Editorials

A Case for Normalization

A Case for Normalization
Recently I came across a process that was causing indexes on a customer’s table to become extremely fragmented throughout the day. The table was used for the purposes of knowing if a document had been updated each day. There was no need to know when, just had it been updated.

In this case, the customer chose to use a Bit column on the document table. Each night, a scheduled job would update the bit to 0 (false). Then as documents were processed throughout the day, the bit flag would be set to true.

For performance reasons, an index was necessary on the Bit column in order to determine documents that could be used if they were current today. You can imagine how quickly the index became fragmented as the documents were updated throughout the day.

One solution for this problem is to use a one to zero or one relationship in a table that simply tracks Document not Processed. In this case I used a negative test. Instead of tracking documents processed, I track the documents not processed today in a table called DocumentNotProcessed.

Each night, the ID from all documents is entered into the DocumentNotProcessed table. Then, as the documents are processed throughout the day, the corresponding entry is DELETED from DocumentsNotProcessed. Creating a clustered index on the DocumentsNotProcessed.DocumentID column results in an index that does not fragment or decrease in performance throughout the day.

If I need to know the Documents having been processed today I can outer join DocumentsNotProcessed and find records with a null value. If I want to know stale documents, I can inner join DocumentsNotProcessed. Either way, my clustered index is always up to date, and effective.

There are many other solutions for this kind of problem. This one seemed to work best with the fact that nearly all documents were replaced daily.

If you have other normalization war stories feel free to share them with us by writing to btaylor@sswug.org.

Cheers,

Ben

$$SWYNK$$

Featured Article(s)
UNIX for DBAs (Part 5)
UNIX is one of the most popular operating system for running database management systems. Of course, Windows is popular, too, as are z/OS and even OS/400, not to mention Linux (but that is a variant of UNIX). At any rate, UNIX is ubiquitous in the realm of database processing and it is important that DBAs know the basics of UNIX in order to effectively manage the databases on that platform. This article offers up a short tutorial on the basic UNIX operating system commands and features that are most important for DBAs to understand.

Featured White Paper(s)
Encryption & Key Management for Microsoft SQL Server 2008/2012
Written by Townsend Security

Simplify encryption and key management on … (read more)