Editorials

Database Refactoring, An Example

In response to yesterday’s posting, the question was asked, “How does a different interface deal with the fact that a schema change was made to a shared table?” This question goes right to the core issue. The database needs to be able to change without breaking existing applications that are not being modified.

One way to describe this is to walk through a scenario. Let’s consider an HR database tracking employees. The original employee table has a hire date, and a termination date in the employee table. Over time it has been determined that an employee may leave the company and return at a later time. In fact, that may happen multiple times. It is desired to maintain only one employee record, and to normalized the data by putting the employment hire and termination dates into a related table.

All version 1 applications expect to find a hire date and termination date in the employee table. If we make this change, then all the existing applications must be modified to accommodate the change. If you’ve read Scott Amblers book on database refactoring there are a lot of solutions you can do in the database alone. You can create the new table, and have a trigger updating the employee table start and termination dates based on updates to the related employment time segments. This is necessary if all your applications access you database directly.

If, instead, all your applications access your database through a data layer, then you can present the data in different forms more easily. So, the original interface specification has an employee with a hire and termination date. The version 2 interface specification has an employee record with a list of hire and termination dates. Both may be fulfilled from the new database design, without the need for triggers. The list is transformed , returning only the last time segment, and combined with the Employee data as a single object for the implementation of the version 1 interface.

The beauty of this design is that a single objects implementation may be cast to different interfaces. So, when your version 1 application retrieves the object it looks one way. When your new application retrieves the same object, it will be cast to the new version specification. The end results, both versions may live in the same code base, side by side.

The elegant solution is to not handle different requirements at the database level. Instead, do the adaptation as needed at the data access level. Using interfaces greatly simplifies the process when multiple applications all share any data source.

Cheers,

Ben