Editorials

Final Thoughts on Schema Management – OLTP to OLAP

Last Day to Take the Survey
(And be entered in the drawings!)
Please take a moment now and let us know how we’re doing…
[ Take the survey here ]

Featured Article(s)
Oracle Change Data Capture Step By Step ASync — Part 3
Oracle Change Data Capture Step By Step ASync — Part 3.

New SelectViews Show Now Available
(Part 2) Stephen Wynkoop and Chris Shaw talk about top SQL Server Performance Tips. Part 2 of 2 – things you can try, things you should know about, tips for your SQL Server.
[Watch the Show Here]

Previously:
[Watch Here] (Part 1) Stephen Wynkoop and Chris Shaw talk about top SQL Server Performance Tips
[Watch Here] vConference Session Sample: Thomas LaRock SQL Server 2008 Maintenance Plans. Learn about the ins and outs of maintenance plans in SQL Server 2008 – what do you really need to know?
[Watch Here] The show today features RAID and Defrag tips, backup optimization with filegroups, baseline stats and industry news. Also, find out about virtual learning opportunities, the SQL tip, the developer/AJAX thoughts and much more.

Thank You – and a Question
I just wanted to say thanks. We’ve had an overwhelming response to the DBA School class (already more than 2/3 full) and the surveys this week and… well, it’s been great getting all the feedback. I appreciate your help and trust in SSWUG and I can’t wait to meet up for our first class. My question? What other topics would you like to see classes about? We WILL keep the costs low, we’ll make sure the content is amazing. What would you be interested in seeing? Please drop me a note here.

Final Thoughts on Schema Management – OLTP to OLAP
Hank writes "Personally, I like building BI/DW that aggregate all the information down to DASHBOARd style metrics such as KPI graphics with supporting summary tables… I DO NOT GIVE ACCESS TO THE DW to just any person in the organization for they are quite capable of writing ad hoc queries that will cripple a process/process and server resources in general ( cartesian product). This means that I develop highly optioned driven Stored Procedures that will give a measured response within the scope of qualified parameters. The end user reporting is then controlled and manageable. Also allows good OO programming standards. If there is a specific request/modification of a a SP I have create d from True Specs., I will investigate, code, qualify and release if there is a true business need. This required the requester to work with their assigned SME who then will work with my coding standards. In this situation it is very easy to determine if the “Need of the one, benefits the need of the Many”. If this sound like I do not allow Ad hoc reporting, well there is the little thing called SOX, that we all must live by and that is a good thing. SOO Yes, I hate AD-HOCers.

As for your proposed option #1 or #2, It depends on the OLTP activity or load… if light I might allow qualified SQL processes for reporting to be done against #1. However, since I am a DW/BI senior level type guy, I will choose #2 (Replication with Log Shipping) so as to run ETL the data to the DW.

Lastly, you have to consider the real world scenario of ever changing business needs requiring a development database that is properly (timely) refreshed from the production. If you choose #1 then explain to me more of the benefits of Initial Feedback on Mixed Schemas with SQL Server in a Agile environment VS a more controlled requirements based environment… Let us all build a decision matrix that we can all use in our jobs.. Therefore, please take all the emails and filter the good, bad and ugly of both into a matrix and share with all."

and finally, Roland wrote in to say "In almost all cases, using separate schemas for OLTP and OLAP/Reporting is the right thing to do.

This is primarily a logical choice – using a separate database, RDBMS instance or even host machine to achieve even more separation can certainly make sense. However, creating a schema that is optimized for OLAP/reporting is key. It really is not too hard to see why you need to. In OLTP, the schema should be optimized for data maintenance.

Focus for that design should be avoiding and eliminating redundancy (ie. normalize) to enable efficient, but more importantly, consistent management of addition, modification, and removal of data.

There is a school of thought that dictates that all the data you need to do reporting is already in your OLTP schema, and that it is in principle possible to reconstruct all data you ever want to see in your reports by simply querying the OLTP schema. From this perspective, the choice to create a separate (denormalized) schema for reporting is a necessary evil to make it more convenient to create report, or to achieve better performance by using pre-aggregated data, or avoiding excessive joins etc. Tempting as this may seem esp. from the point of view of data maintenance, this is an oversimplification.

First of all, reporting is mostly useless if you can’t compare status over time. In other words, you need to support history, and this is typically not supported by OLTP systems.

Second, there is often a need to combine data from your systems with external information that is not directly related to the logic of your OLTP schema. For example if I have a CD/DVD retail schema, events from the news ("…Micheal Jackson dead at 50…") don’t tie in directly with the OLTP application logic, but can offer plausible explanations for business performance. So basically, reporting/OLAP is simply another type of application, and it is simply good practice to design a schema according to those needs.

Third, sooner or later reports need to include data from multiple source systems – so you need to integrate data from multiple sources and it makes most sense to do all of that in one separate environment (data warehouse) (instead of arbitrarily choosing one of the OLTP schemas).

Fourth, reporting/OLAP needs and requirements change at a pace that is quite different from that of OLTP systems. Typically reporting requirements change at a faster pace.

Compromising availability and stability by constantly doing schema changes to the OLTP schema to support new reporting needs is not conducive to the purpose of typical OLTP schemas, which is consistent and reliable maintenance of operational, day to day data.

The list goes on and on."

Featured White Paper(s)
Real-Time Data Integration for the SQL Server Data Warehouse
The Microsoft SQL Server database is becoming an increasingly popular platform for hosting data warehouse and operational rep… (read more)