Editorials

Should You Use Null With Bit Columns?

Nulls in a database have been an area of contention since they were first available and implemented. The primary purpose of null as originally designed was as a placeholder where no value was available. In the case of true/false null provided a tertiary value of unknown, or undefined.

You will get a lot of helpful information about null and how it works, especially in a relational database, by reading this Wikipedia article, http://en.wikipedia.org/wiki/Null_(SQL).

I have found that null is often used in boolean situations because the designer wishes to have the three value option of a boolean data type while only consuming one bit of data. Personally, I have found this to cause a lot more trouble than it is worth. Here are some of my reasons:

  1. You have to use an OR in your where clause to test for the condition NULL or False, or the condition NULL or True. This can result in degraded query performance as one optimization technique is to remove the OR operation from your query
  2. But data types do not index well, and indexes of values opposed to NULL perform better. You can’t index UNKNOWN.
  3. Using a byte data type with three values provides excellent performance and index utilization
  4. You don’t have to manually test for NULL or replace them with a default value using a function like ISNULL(value, default)
  5. You don’t have to handle null when converting Dot Net DataTable records into values, and are not forced to use nullable data types in Dot Net
  6. You are able to extend your allowed values beyond three options.

E. F. Codd redefined null into two different kinds of null. Null where a value is not available and NULL where value doesn’t apply to the current record. For example, you could have a table with persons. In the table have column for gender, and a column with the date of the last breast xray. For a male, the date for the xray would be null because it doesn’t apply. The date for a female would be null because the last date is unknown.

Instead of using NULL for a true/false condition I prefer to use a byte with non-nullable options.

1-True

2-False

3-Unknown

You could add the fourth Codd differentiation as

4-Doesn’t Apply

This data type may be indexed and used efficiently. The following table compares a where clause for a bit value using null and a byte value not using null: It assumes you are writing a WHERE clause against a table with a column named Test.

Bit Column With Null Tiny Int Column Without NULL
WHERE Test = True WHERE Test = 1
WHERE Test IS NULL
OR Test = True

WHERE ISNULL(Test, True) = True

WHERE Test != 2
WHERE Test = False WHERE Test = 2
WHERE Test IS NULL
OR Test = False

WHERE ISNULL(Test, False) = False

WHERE Test > 1
WHERE Test IS NULL WHERE Test > 2

There are a couple performance things to notice in the comparison of the where clauses.

1) Using ISNULL(value, default) against the data in a table disables the use of an Index. All of the queries on the right hand side do not require ISNULL or the OR operation

2) If you descide to split the meaning of NULL into two distinct purposes, the option to the right supports both without any drop in performance. Using a bit column with null there is no differentiation between does not apply and no value available.

What is your position on the practice of using NULL enabling a third option in a binary field? Should the practice be continued? Is it a small thing when it comes to performance? Share your thoughts here or by email to btaylor@sswug.org.

Cheers,

Ben