Editorials

Primary Key Selection

Primary Key Selection
Recently I was talking with a colleague about defining the schema for a table…what columns to include and utilize for primary and foreign keys. The parent table contained a natural key that would be defined as requiring a unique value, and could therefore be used as a primary key.

My colleague also has a standardized policy for maintaining a system assigned sequential key. The preferred database is Microsoft SQL Server. MS SQL Server benefits from having a sequential value as the clustered index. It also works nice with Entity Framework in ORM generation.

A unique sequential value on every table also enables a simple generic logging mechanism where you can use the database, table, column and key value as a unique identifier for values changing in any table.

So, we now have two candidates for a primary key on the table. How does one decide on the best key? Here are some questions I asked to help make the decision:

  • Is the data being used as a foreign key on any other table?
  • If there is a join table, how is the join table utilized? If it is often queried only needing the data from the foreign key, and not needing to join the table containing the primary key, you can increase performance by using the natural key because you won’t have to join to the parent table for the value.
  • If, however, you utilize more than the primary key data from the parent table when working with the join table, it may benefit more by having the sequential value as the relationship column for join performance reasons.
  • Can the value of the externally applied key ever change? If yes, it is not a great candidate as a foreign key in another table because of referential integrity.
  • Are you going to utilize an ORM generator to work with the tables…these tend to be based on sequential columns with the suffix “Id”. Note: I don’t like tools that define my schema…but sometimes, reality and ease of use take precedence.

Those are some thoughts to share today. How about you? Do you have something to add or change? Drop a note to btaylor@sswug.org to get into the conversation.

Cheers,

Ben

$$SWYNK$$

Featured Article(s)
Determining High-Load SQL Statements with ADDM
High-Load SQL Statements may cause serious performance problem, reducing significantly throughput of your database system. So, it�s a good thing to monitor your database, identifying high-load SQL statements. To simplify this task, Oracle Database offers Automatic Database Diagnostic Monitor (ADDM) that you can use to automate the task of finding potential high-load SQL statements and receiving recommendations on how to fix the problem identified.

Featured White Paper(s)
Extending the Value of SCOM with Spotlight on SQL Server
read more)

Featured Script
admin db – number of objects on server monitor of tracking
Identify the number of SQL objects on a server and keep track of the change in the number of SQL objects on a server over tim… (read more)