Editorials

Master Data Confusion

Maurice has been a regular contributor to my column with insightful responses and helpful tips. Today Maurice responds to the topic of Master Data Management. He writes:

Duplicate data is expensive in development costs. If the data had the same structure everywhere, a replication mechanism would reduce that cost, but often you see duplicated attributes tied together in several different ways, each with its own business rules. In that context, keeping data integrity becomes more complex, and the result is technical debt that lies around forever. Database normalization doesn't just target data duplication because of disk space, which is the least of the problems associated with duplicate data; it is about putting the right attributes together so the duplication never happens. Good practice is to express the business rules in code, whether as a SQL query or by some other means, to bring the data back together.
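To make Maurice's point concrete, here is a minimal T-SQL sketch, assuming hypothetical Customer and CustomerAddress tables: the address attributes are stored exactly once, and a view plays the role of the "business rule in code" that brings them back together for consumers.

-- Hypothetical normalized tables: each address attribute lives in one place.
CREATE TABLE dbo.Customer (
    CustomerID   INT IDENTITY PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL
);

CREATE TABLE dbo.CustomerAddress (
    CustomerID  INT NOT NULL
        REFERENCES dbo.Customer (CustomerID),
    AddressType CHAR(1) NOT NULL,          -- 'B' = billing, 'S' = shipping
    Street      NVARCHAR(200) NOT NULL,
    City        NVARCHAR(100) NOT NULL,
    PRIMARY KEY (CustomerID, AddressType)
);
GO

-- The business rule, expressed once as a query that reassembles the data.
CREATE VIEW dbo.CustomerBilling AS
SELECT c.CustomerID,
       c.CustomerName,
       a.Street,
       a.City
FROM dbo.Customer c
JOIN dbo.CustomerAddress a
  ON a.CustomerID = c.CustomerID
 AND a.AddressType = 'B';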

After exchanging emails we cleared up the confusion I had caused, and we both came to the same conclusion. Data normalization is the preferred technique when you own all of the data and build all of the data persistence for your applications. It is much simpler to maintain a single instance of a data attribute than to update the same data in multiple places.
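Continuing the hypothetical schema above, the payoff of a single instance is that a change touches exactly one row, and every query or view that joins to it sees the new value immediately; the customer ID here is just for illustration.

-- The billing street is stored once, so a change is a single-row update.
UPDATE dbo.CustomerAddress
SET    Street = N'500 New Office Park'
WHERE  CustomerID  = 42
  AND  AddressType = 'B';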

Master Data Management is a process we apply when we don't own or control all of the sources of our data. This is likely the case whenever you use off-the-shelf software packages and integrate them with other packages or with home-grown applications. At that point you have more than one potential origin for a specific entity, and you need to ensure the same values are shared by all systems. Master Data Management tools help you achieve that goal.
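Here is a rough sketch of the problem those tools address, using two hypothetical staging tables fed from separate systems (a packaged ERP and a home-grown CRM). Matching on email and letting one system's value "win" stands in for the much richer matching and survivorship rules a real MDM product provides.

-- Hypothetical staging tables, one per source system.
CREATE TABLE ErpCustomer (
    ErpCustomerID INT PRIMARY KEY,
    Email         NVARCHAR(320) NOT NULL,
    CustomerName  NVARCHAR(100) NOT NULL
);

CREATE TABLE CrmCustomer (
    CrmCustomerID INT PRIMARY KEY,
    Email         NVARCHAR(320) NOT NULL,
    CustomerName  NVARCHAR(100) NOT NULL
);
GO

-- One master row per customer: the ERP name wins when the systems disagree,
-- and both source keys are kept so the agreed value can flow back to each system.
SELECT COALESCE(e.Email, c.Email)               AS MasterEmail,
       COALESCE(e.CustomerName, c.CustomerName) AS MasterName,
       e.ErpCustomerID,
       c.CrmCustomerID
FROM ErpCustomer e
FULL OUTER JOIN CrmCustomer c
  ON c.Email = e.Email;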

For an example, consider Microsoft Master Data Services; there are many other packages in the same space. Normalize when you can (OLTP). De-normalize when you must (data warehousing). Use Master Data Management when you are not in control.

Cheers,

Ben