Editorials

Symptoms of a Database in Need of Normalization

Webcast: Understanding the new culture of SharePoint
You have a sponsor for SharePoint development, you have a budget, now where are all the users? Online Collaboration powered by SharePoint is not only a new technology, but a new culture of doing work. In this session we’ll explore change management (the fluffy kind) strategies that can be tailored to your user base. Attract users, and keep your budget!

Presented by: Adam Levithan
> Register Now
> Live date: 7/28/2010 at 12:00 Pacific

Symptoms of a Database in Need of Normalization

As I was writing the editorial for yesterday I started thinking about different symptoms of database problems that may be resolved by normalizing your data better. If you review the rules for normalization they provide great direction. But for me, I tend to remember database normalization more through experience (often bad).

For example, one of the rules in normalization is, "no repeating columns". Symptoms of this are often seen when you have columns named the same thing followed by a numeric suffix. Like phone1, phone2, phone(n).

I can look at a table and see Phone1-Phone(n) and recognize a simple opportunity for database refactoring. For the life of me, I can’t remember the phrasing of the normalization rule against repeating columns, nor am I positive if it is a rule for 2nd normal form; and I don’t feel like looking it up right now. But I know for a fact that this simply doesn’t work well. Here are some reasons why:

  1. It requires schema change when you want to add another instance. So, if you find a need to have 4 phone numbers but you only have three columns, you have to modify your schema instead of adding another row in a phone number table.
  2. It requires a static number of phone numbers. That means every record must have the same amount of phone numbers. Or you have columns wasting space or introduce null issues.
  3. It doesn’t index well for searches. For example: what if you want to find a purchase order by phone number? If you have three phone number columns you have to search each one for the desired number. You would need three indexes to help the search be effective, or one big composite index.
  4. Validation can be complicated. You could put the same phone number in all phone fields. If the fields represent the capability of the phone (Fax, Modem, Voice) then that may not be bad. But if you desired only one instance of a phone number, you would have an interesting check constraint instead of a unique constraint on a normalized table.

So, that’s where I’m going with this. I’d like to get a dialog going, primarily for those who are new to relational databases, providing our SSWUG community to with real world examples of poor database design techniques, and how they expose themselves with problems that are often complicated to solve. Send me your thoughts at btaylor@sswug.org.

I’d be interested in real world problems you have experienced which can be resolved with normalization principles. Instead, you may have an issue you are not sure how to solve; feel free to send that in as well for community review.

While you’re thinking about things to share about database normalization, take a look at the upcoming workshop from Kalen Delaney on July 23rd. That’s next Friday. It will be a great virtual workshop to help you update or increase your knowledge of database indexing, especially as applied to SQL Server 2008. Go to www.vconferenceonline.com/shows/workshops/indexing.asp for more information or to register.

Cheers,

Ben

Featured Article(s)
DTS Conversion to SSIS (Part 4 of 4)
How do I migrate the ActiveX scripts? Completing the migration process.