Compensating for Bad Schema – Views
Views are another tool used to compensate for a bad schema. I have been using the term Bad Schema for a couple days now…and want to qualify the statement. The same schema may be bad in one situation and not bad in a different one.
In a reporting database you consistently find a highly denormalized set of tables called Data Marts. Data Marts are designed to be generally self sufficient. Few or no joins to other tables are required in order to retrieve desired results.
In an Online Transaction Processing database Data Marts are generally not appropriate. Normalized data is the de-facto standard. The degree of normalization depends on the patterns of data usage.
Views are often used as tools to transform a database from one form to another. In a readers comment today you will see a data mart transformed into normalized data through the use of a view.
A view may also be used to transform highly normalized data into results that look like data marts. The resulting view is easy to work with from a consumer perspective, but performance is often poor. This is due to the fact that a view is a "Virtual" object that must be materialized each time it is used.
SQL Server has the ability to create indexes on a view. So, rather than transform the data into a Data Mart, a view is created, and then indexes are created on the view. The creation of indexes on the view results in a materialized object. The net effect is that you now have a data mart that is maintained whenever the underlying normalized data is modified.
Depending on the situation, this may be a good or a bad problem. If the view consists of a many different tables, modifying a record in a single table may create locks on the other tables as well until the resulting rows in the view are modified as well.
Think of a view that summarizes customer purchases by month which is a common data mart design. Now any time a new purchase is created, that purchase record must be rolled up in the appropriate month of the view. Unlike a data mart where you could simply increase the total in the appropriate month for the purchase detail record, now it must process the totals based on the definition of the view; SUM(PurchaseAmount) Group by Month, CustomerID.
In this case you are using a materialized view, through the use of indexes, to maintain what is in reality a data mart. However, the modification of your purchase table is now slowed by the additional weight of maintaining the indexed view. Reporting performance may be increased, but performance when modifying the underlying data is decreased.
The point here is that once size does not fit all. An indexed view is a great thing in many circumstances. For the scenario described here, it could be a database killer. I have seen instances where the use of Indexed Views has literally brought a database to its knees.
Robert writes:
Here’s one incident that I ran into recently when having to create a web-app that involved synchronizing data from a legacy structure for consumption by a new ASP.NET web-app :
One of the tables had the classic type of thing you see often, where newbie data-designers model a single-table after a complex legacy data-entry form, rather than use a relational one-to-many model for repeating elements. In this case, the mortal sin committed involved a list of assignments for commercial Photographer sporting event schedules… where the form specifies 1-3 photographers… so the legacy data had repeating fields for Photog1, 2, 3…. in one table. Other sub-sins included poorly validated data mucking the works.
Rather than have to analyze/rewrite lots of code in production that already accessed its poorly-normalized structure, I used a SQL View with a series of ‘child-creating-unions’ to present a ‘normalized view’ to the new code where needed.
The repeating ‘ID’ field in the legacy table was ‘Photog(1..3), def’d as CHAR… with a ‘0’ stored if that ‘slot’ was not used. Another related ‘repeating’ field was ‘pSits(1…3)’; all biz-rules required exclusion of Photog data w/out ‘pSits’, but the legacy data ‘unclean’, never validated, so it was easy to fix a long-standing ‘lack-of validation’ by excluding child-rows with a filter on their respective UNION’d selection queries.
‘DtEntry’ was purposely denorm’ed, placed in all ‘child rows’… to allow new functionality (primarily reporting).. to easily access all it needed thru the view. ‘Nconfirm’ was another poorly-normalized element, some down-stream processes required it, but it could, by biz-rule definition, only be associated with ‘Photog1’… so another logic switch at the child-row-creating –UNION saved the day.
This view creates ‘PhotogWorkCHILD’… which has saved me enormous time as I write new code; while still allowing the legacy code to function.
CREATE view [dbo].[photogworkchild] as
-- flatten the de-normalized structure into one more workable, w/ 1-row per PhotogX for the fields we need...
-- note2: dups in this created child-rowset, caused by bad prior biz-rule enforcement in the legacy data auto-eliminated; -- UNION default logic does the trick for us; we do not do UNION ALL on purpose).
select photog1 as photog,psits1 as psits,dtentry, nconfirm
from photogwork where photog1<>'0'
union
select photog2 as photog,psits2 as psits,dtentry,0 as nconfirm
from photogwork where photog2<>'0' and psits2>0
union
select photog3 as photog,psits3 as psits,dtentry,0 as nconfirm
from photogwork where photog3<>'0' and psits3>0
I have tried to emphasize that Views are not bad, nor are techniques where data is normalized in one form or another. The emphasis is on the appropriateness of the schema to the maintenance and the consumption of the data. The the schema is not appropriate to the purpose, compensating with views, or worse yet, indexed views, may cause more problems than it resolves.
Feel free to send comments regarding your experience, success or failures when dealing with an inappropriate schema. Send your comments to btaylor@sswug.org.
Tomorrow I’ll be digging into some of the opportunities and challenges of applications running on the newer portable operating systems. Are you writing applications for your business taking advantage of the Android or IPhone OS? Share your challenges with us.
Cheers,
Ben
$$SWYNK$$
Featured Article(s)
Database Snapshots (Part 1 of 3)
Database Snapshots offer a read-only, static view of a source database at a specific point in time. Attend this session to learn how Database Snapshots can be used for reporting purposes, change management, and data recovery.
Featured White Paper(s)
Enterprise Content Management
Written by AvePoint
The goal of an enterprise content management (ECM) system… (read more)
Featured Script
admin db – identify and track tables with no clustered indexes
In most cases you want a clustered index on all tables. The table maintained by this proc is a useful reference when working … (read more)