The Most Costly IT-Related Problem Ever Encountered?

April 02, 2013

In an age where the mining of actionable business intelligence is paramount to developing a competitive advantage in a crowded marketplace, there's one concept that is of utmost importance - data quality.  According to quality management pioneer Joseph M. Juran {1}, "data are of high quality if they are fit for their intended uses in operations, decision making and planning".

Bad data quality is a problem that can seemingly lie dormant for years until it rears its ugly head when a major acquisition or other organization-changing event occurs and new systems are introduced into the mix.  Because very little pain is felt in the beginning, companies most often fail to address it at that time.

There's one absolute certainty about bad data quality - it is the least expensive it will ever be to address "right now".  Ideally, "right now" refers to a time when an organization is still relatively small in terms of IT assets and organizational growth.  More often than not, however, "right now" simply means "right now".  The point is that this is a problem that grows exponentially as an organization matures over time.

Let's look at a simple example of how the quality of data decreases over time without specific intervention to prevent it from happening.

Say XYZ Corp buys an ERP system where employee records are kept (Name, Address, Emergency Contact Information, etc.).  This ERP system is to be THE system of record when it comes to employee data.  When a new employee is hired, the appropriate data is entered into the ERP system.  In addition, that new employee will receive an account in Active Directory so that they can access the company network. 

Inevitably, there will be overlap (i.e. redundant fields) between Active Directory and the ERP system.  Already, we have the same data living in two different systems.

Time goes by, and at some point the company needs to implement yet another application to manage health, safety, and environmental (HSE) records out in the field.  As a company policy, employees in the field are required to periodically update their personal information (Address, Emergency Contact Information, etc.).  These updates are captured in the HSE system.

Now we have the same information in 3 different places!

So which system holds the most accurate employee data?  The answer to this question might seem simple on the surface, but in reality it's probably a little more complex once you dig into your business processes and find when and where data is actually being entered.

Here's the most likely answer for this simple example:

For employees out in the field, the HSE system contains the most up-to-date information.  But for employees at headquarters, the ERP system (or Active Directory) probably contains the most current information.
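To make the problem concrete, here's a minimal Python sketch of what "the same record in three systems" looks like in practice. The system names follow the example above, but every field name, value, and date is invented for illustration - real ERP, Active Directory, and HSE schemas will differ.

```python
from datetime import date

# Hypothetical snapshots of the same employee's record in three systems.
# All values and dates are made up for this example.
erp = {"name": "Pat Doe", "address": "12 Oak St", "updated": date(2011, 3, 1)}
active_directory = {"name": "Pat Doe", "address": "12 Oak St", "updated": date(2011, 3, 1)}
hse = {"name": "Pat Doe", "address": "98 Elm Ave", "updated": date(2013, 1, 15)}

systems = {"ERP": erp, "AD": active_directory, "HSE": hse}

def find_conflicts(records, fields=("name", "address")):
    """Report each field whose values disagree across the source systems."""
    conflicts = {}
    for field in fields:
        values = {name: rec[field] for name, rec in records.items()}
        if len(set(values.values())) > 1:  # more than one distinct value
            conflicts[field] = values
    return conflicts

print(find_conflicts(systems))
```

Running this flags only the address: the field employee updated it in the HSE system, while the ERP system - the official system of record - still holds the stale value.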

Now let's pretend that somewhere down the road, XYZ Corp invests hundreds of thousands, if not millions, of dollars in the development of a business intelligence solution that relies heavily on employee data.  Since the ERP system was deemed THE system of record for employee data, the BI solution relies on it alone.  Now that data can be sliced, diced, and analyzed until the cows come home, but at the end of the day it isn't as trustworthy as it could or should be.

This, in fact, is why I make the claim that bad data quality could be the single most costly IT-related problem a company ever encounters - making decisions based upon incomplete or inaccurate data can be devastating.

That's the bad news.

The good news is that the problem can be addressed through the introduction of a set of processes, governance, policies, and tools that work together to maintain a "single version of the truth" as it pertains to a company's "master data" (i.e. data that is key to operational decision making).  Collectively, this is referred to as "Master Data Management".

At a high level, implementing a Master Data Management solution will consist of the following general steps {2}: 

1.  Identify sources of master data.
2.  Identify the producers and consumers of the master data.
3.  Collect and analyze metadata for your master data.
4.  Appoint data stewards.
5.  Implement a data-governance program and data-governance council.
6.  Develop the master-data model.
7.  Choose a toolset for cleansing, transforming, and merging source data.
8.  Design the infrastructure.
9.  Generate and test the master data.
10. Modify the producing and consuming systems.
11. Implement the maintenance processes.
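Step 7 - cleansing, transforming, and merging source data - is where conflicting records finally get reconciled. One common approach is a "survivorship" rule that decides which source's value wins for each field. The sketch below uses a simple most-recently-updated rule; the records, field names, and dates are invented, and real MDM tools offer far richer rules (trust scores per source, field-level precedence, and so on).

```python
from datetime import date

# The same field as reported by several source systems.
# All values here are illustrative, not from any real system.
sources = [
    {"system": "ERP", "address": "12 Oak St", "updated": date(2011, 3, 1)},
    {"system": "AD",  "address": "12 Oak St", "updated": date(2011, 3, 1)},
    {"system": "HSE", "address": "98 Elm Ave", "updated": date(2013, 1, 15)},
]

def merge_field(records, field):
    """Survivorship rule: the most recently updated record's value wins."""
    winner = max(records, key=lambda rec: rec["updated"])
    return winner[field], winner["system"]

value, origin = merge_field(sources, "address")
print(f"master value: {value!r} (taken from {origin})")
```

Under this rule the master record takes the address from the HSE system, since the field employee updated it there most recently - which is exactly the answer the "which system is most accurate?" discussion above would predict for field employees.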

If you look at those high-level steps, it's easy to understand why I say that addressing the problem of bad data quality is never less expensive than it is "right now" - as time goes on, there are more sources of master data, the amount of metadata increases, the master data model becomes larger, more testing is required, and more producing and consuming systems must be modified to participate.
For more information on this and related topics, please refer to the following white papers, or give us a call.

Master Data Management from a Business Perspective
Master Data Management from a Technical Perspective
Bringing Master Data Management to the Stakeholders
Implementing a Phased Approach to Master Data Management
{1} Joseph M. Juran, Wikipedia.
{2} "The What, Why, and How of Master Data Management", Microsoft MSDN.


