Editorials

Reporting Database Designs

Keith provides some really thought provoking comments about Data Marts, prompting me to respond with my take on the whole topic of reporting databases. Keith writes:

I saw the heading on this article and before I read the article, I got stuck… What is a data mart anyway? I’m not talking about the surface definition. I’m talking about in our data centric industry where we have millions of dba’s, BI analysts, data architects, etc, and we have dozens of seminars every where, dozens of data centric books… Is there a datamart/data warehouse/operational data store, star schema, etc that has a consistent definition amongst us? Have our terms become so debated by professionals, and misused by the non data related trained management class (managers who are not database people) that the terms are effectively meaningless? I just saw a job posting for a “DBA” to manage reports on a relatively small Access database (No kidding, do you see my point?). I appreciate the actual article here and I liked it, but before I read it, my mind wandered a bit and I just had to share

When it comes to reporting there are a number of implementation strategies: I’ll list them in order of Ascending complexity and value. The simpler the implementation is, the more difficult it is to consume the data. This list only considers taking data from a single OLTP data store. There is no ETL process bringing data from multiple data stores.

  • Report from your OLTP database directly
  • Clone (replicate) your OLTP database into a Reporting database using the same data structure
  • Transform your OLTP data into data marts in a reporting database
  • Transform your OLTP data into a data warehouse using a star schema or snowflake schema

Ok, now we have used some terms we need to define, as Keith states.

DataMart – A de-normalized table, often with repeating columns, optimized for specific reports. Often you will have multiple data marts, with different levels of aggregation (Annual, Quarterly, Monthly, Daily, and Detail) with the appropriate keys according to the level of aggregation. Data marts will have lots of indexes to make reporting fast and simple.

Star Schema – this is a data warehouse database design methodology. A Star consists of a central fact table. This table looks like a pin cushion because the fact records are integrated with many different dimension table foreign keys. A data warehouse can have more than one star schema. The design depends on how many fact tables exist with multiple dimensions.

Snowflake Schema – Combine a Star Schema with a OLTP database and you have a concept of a snowflake schema. A snowflake consists of a star schema with multiple fact tables. It becomes a snowflake schema when you add additional relationships to a fact table.

In upcoming editorials we’ll decompose these different designs with a little more detail about how it looks and how it is consumed.

Cheers,

Ben