Editorials

When Identity Columns Hurt

Developers following Microsoft standard development methods are likely to utilize an Identity column as a primary key in their lookup tables. This method is simple, and guaranteed to be unique. The practice has a number of pitfalls.

One pitfall is what happens when you have multiple databases containing the same lookup table, and you want the assigned key value to be the same in all instances. In order to accomplish this, the records must be inserted in the exact same order in every database so that all databases have the same key. All that has to happen is for one insert to fail, and your key assignment is off unless you know how to reset the identity seed.

There are a few things you can do to resolve this scenario. The easiest is to have a single database as the standard of truth. If you want to use identity keys, use this separate database to assign the key values, and then use a different technique to replicate the Identity value to the other databases.

I am going to assume that there is no database replication in place between the different databases. The new records are deployed to the other databases through a change control process, usually a database change script.

If the tables do not have identity columns, then a record may be inserted simply. However, if the tables have been created with an IDENTITY column, you cannot remove the IDENTITY table assignment without replacing the old table with a new one not using an IDENTITY column. However, you can override an IDENTITY assignment in your insert statement in TSQL using the SET IDENTITY_INSERT command. Here is a code snippet to override an IDENTITY value in my fake lookup table named myLookup which has ID and Description columns.

SET IDENTITY INSERT myLookup ON

INSERT INTO myLookup (ID, Description)

VALUES 23, ‘Descriptive text’

SET IDENTITY INSERT myLookup OFF

Executing these commands result in creating a row in myLookup with the ID value of 23, and the Description of ‘Descriptive text’. You can also perform updates statements to rows, including modifying the IDENTITY column value.

IDENTITY_INSERT can only be set to ON for one table at a time in a database. There is no limit to the number of rows you may insert or update. All columns must be specified in the table declaration of the insert statement. So, this becomes more painful as the number of columns in the table increases.

This technique works nicely when you wish to keep the key values assigned synchronized in multiple databases or multiple environments such as DEV, QA and Production without fear of data entry in an incorrect order.

In summary, in my experience, it is a lot less work to manually assign key values for lookup tables than to utilize an IDENTITY column, unless you are having users dynamically generate the entries, in which case you are probably not trying to keep them syncronized in the first place. When you have a set of databases that must synchronize key assignments, maintain them externally from the destination databases, and assign them when deploying database changes. If your tables already have an IDENTITY column, you can use IDENTITY INSERT to override the table definition instead of dropping and re-creating the table.

How do you manage your lookup tables? Share your experience here online, or drop an email to btaylor@sswug.org.

Cheers,

Ben