Editorials

Database Code and Data Separation

Ed Writes:

In a different slant on cross-database operations, we are looking for the best ways to split a database into two where some of the stored procedures use tables from both databases and there is the need for referential integrity with tables in the other database. The reason we would do this? To split the application configuration away from the data. Then we would be able to have multiple configuration databases and have them work on the same data/share the same data. Ideas on being able to do that seamlessly, easily?

I have actually worked on an application built across two different databases that worked together. We shipped a monthly database to each customer containing their historical, read-only, data from the previous period. This data was in a read only database hosted on DVD.

The system also had a second database maintaining custom configuration data the users defined, so that the customizations were not overwritten each month. Most of the stored procedures were in the second, application and customizations database which would remotely attach and retrieve the read only data.

This technique worked very well for that specific scenario. There is no method of built in referential integrity. You can roll your own referential integrity with triggers. Any data maintained in the user configurable database was taken from the read only database, and therefore had a sort of referential integrity. If the user used a read only database from a previous period not having some records that were added in a newer month then the data from the configuration database would simply ignore that data through the join process.

There is not going to be an easy solution to enforce referential integrity should this be a requirement. It is a lot easier to have a separation of code from data, such as having a database with all your stored procedures, user defined functions and types, and maybe even views, and then have a second database containing the data itself. Then the data database is self-sufficient and able to use declarative referential integrity. This is actually how
Microsoft Access applications were optimized in the old days.

Be aware that this technique of using heterogeneous queries may make the deployment of your application easier, even if the difference is not just code but also tables, because the query optimizer does not have available statistics to put together the best query plan for data hosted in the remote database.

You should be able to put together a test rather easily to see how this will work for you in your scenario. Thanks for sharing this interesting requirement, Ed.

Get into the conversation by adding your comments for Ed online. Or, if you have a question of your own, drop me an email at btaylor@sswug.org.

Cheers,

Ben