Editorials

A Little Design Controversy

I’m working on an application that uses a Dot Net FlagsAttribute class implementation for enumerated values.

It works like this. You create an Enumeration using values with details having values assigned as sequential powers of 2.

Enum colors
{
   Black = 1,
   Blue = 2,
   Green = 4,
}

With this enumeration each item is unique. The difference is that it can be used as a bit mask, so that multiple enumerated values may be true at the same time. Because of the bitmask behavior, and the fact that you are using powers of 2 for your base enumerations, you can create additional enumerations for combined base optons demonstrated below as BlackBlueGreen.

Enum colors
{
   Black = 1,
   Blue = 2,
   Green = 4,
   BlackBlueGreen = 7,
   Red = 8,
   Yellow = 16,
}

I’m going to save data from a FlagsAttribute implementation into an SQL Server database for one instance of a class. So, here is the question, “What is the best relational design for this data?” In my experience, the best storage is a many to many relationship between an entity (I’m going to call Thingy) and colors.

CREATE TABLE Colors
(
   Id INT NOT NULL
   ,EnumValue INT NOT NULL
   ,Color VARCHAR(32) NOT NULL
   ,CONSTRAINT PK_Colors 
    PRIMARY KEY CLUSTERED (Id)
)

CREATE TABLE Thingy
(
  Id INT NOT NULL
   ,Name VARCHAR(128) NOT NULL
   ,CONSTRAINT PK_Thingy
    PRIMARY KEY CLUSTERED (Id)
)

CREATE TABLE Thingy_Colors
(
   ThingyId INT NOT NULL
   ,ColorId INT NOT NULL
   ,CONSTRATINT PK_Thingy_Colors
    PRIMARY KEY CLUSTERED (ThingyId, ColorId)
)

Now you have a table, Thingy, that can be colored. You have a table, Colors, that can color something. Because more than one Color can color Thingy, we have a many to many relationship implemented by the join table Thingy_Colors. Now any Thingy may have zero, one or many colors assigned to it, just like the FlagsAttribute enumeration implementation.

Some developers suggested we simply use an integer value to store the enumerations as a bitmask. We could simply store the FlagsAttribute Enum value in the Thingy table directly. We could still use the Colors table, but would not require the many to many join. This option works also. In fact, it doesn’t ruin referential integrity. So why does this implementation feel wrong?

It goes back to the rules of normalization, against which you wish to build a solid database design. The issue here is that you have an ID, or Enumerated Value representing combinations of data from other records within the Enumeration. Yes, the keys are unique, and it doesn’t break down referential integrity. However, if you implement the rules of normalization, only one color name is allowed per identifier. Using normalized table data, there would not be a record with a composite color of BackBlueGreen, and the enumerated value of 7.

Think about it from another angle. How about when you want to consume the data? You have been asked to get a list of Thingy records having the color Blue, because that color is being discontinued. How do you get all of the Thingy records if you have multiple Enums using Blue?

One way is that you could join the Colors table to Thingy, which is the relationship you would have if you didn’t normalize the data, and filter on every color record where the Colors.Color LIKE ‘%Blue%’. Ugh…that query is horrible, even if it works well. You could do the same join and do a bitwise AND against the EnumValue in the Colors table. WHERE Colors.EnumValue & 2 = 2. This query is no better than the first. You cannot use an index effectively on either query. Not only that, you had to know the value 2, or look it up in the table with a subquery.

If you normalize the tables using a many to many join table you can use a simple query for which indexes can be created for optimization:

SELECT  Thingy.*
FROM    Colors
JOIN    Thingy_Colors ON Thingy_Colors.ColorId = Colors.Id
JOIN    Thingy ON Thingy.Id = Thingy_Colors.ThingyId
WHERE   Colors.Color = 'Blue'

While EntityFramework may want you to use a simple implementation for a FlagsAttribute enumeration, it’s worth the time to hand craft the necessary adaption to convert the data to a many to many relationship. Don’t let your Framework or your GUI drive a good database design.

Cheers,

Ben